FIX: varchar columnsize does not account for utf8 conversion (#392)
### Work Item / Issue Reference
> GitHub Issue: #391
-------------------------------------------------------------------
### Summary
This PR enlarges the fetch buffer for char/varchar columns by a
factor of 4 to account for characters that take up more bytes in UTF-8
than in whichever encoding the database is using.
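
To illustrate why a buffer of `columnSize` bytes can be too small, here is a minimal Python sketch. It is not code from this PR: `encoded_sizes` and `fetch_buffer_bytes` are hypothetical helpers, and cp1252 is only an assumed example of a single-byte database encoding.

```python
def encoded_sizes(text: str, db_encoding: str) -> tuple[int, int]:
    """Return (bytes in the database encoding, bytes in UTF-8) for `text`."""
    return len(text.encode(db_encoding)), len(text.encode("utf-8"))


def fetch_buffer_bytes(column_size: int, factor: int = 4) -> int:
    """Hypothetical helper: UTF-8 byte budget for a char/varchar column."""
    return column_size * factor


# A varchar(10) value made entirely of the euro sign, assuming cp1252 storage.
value = "€" * 10
db_bytes, utf8_bytes = encoded_sizes(value, "cp1252")

# The euro sign is 1 byte in cp1252 but 3 bytes in UTF-8, so the converted
# value no longer fits in a buffer sized by the column width alone.
fits_unscaled = utf8_bytes <= db_bytes
# With the factor of 4 applied, the converted value fits.
fits_scaled = utf8_bytes <= fetch_buffer_bytes(db_bytes)
```

In this sketch the value occupies 10 bytes in cp1252 and 30 bytes in UTF-8, so it overflows an unscaled 10-byte buffer but fits in the scaled 40-byte one.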
It also adds a test, which passes at each commit and therefore tracks
how the interface changes. I removed some of the fallback mechanisms and
I hope that the evolution of the error messages over the different
iterations of the test shows why that's preferable.
For the SQLGetData path for wchars, a `columnSize == 0` check is added
to avoid needing the fallback branch for `nvarchar(max)`. Previously,
SQLGetData was first called with a buffer of length 0, and the real
attempt to fetch data started only after entering the fallback
branch. `columnSize == 0` was also the only case where the
fallback branch behaved correctly; in every other case it discarded the
first `columnSize` characters.
A test that covers all of latin1 and documents behavior with unmapped
characters is added as well.
Note that if the database is using a UTF-8 collation, a buffer of
size `columnSize` would be enough, because `columnSize` gives the number
of bytes and not the number of characters (this distinction only matters
for UTF-8).
---------
Co-authored-by: gargsaumya <saumyagarg.100@gmail.com>
Co-authored-by: Jahnvi Thakkar <61936179+jahnvi480@users.noreply.github.com>
Co-authored-by: Gaurav Sharma <sharmag@microsoft.com>