
Commit 249cfdf

Authored by subrata-ms and gargsaumya
FIX: VARCHAR fetch fails when data length equals column size with non-ASCII CP1252 characters (#444)
### Work Item / Issue Reference

> [AB#42604](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/42604)
> GitHub Issue: #435

-------------------------------------------------------------------

### Summary

This pull request improves the handling of character encoding and buffer sizing for SQL `CHAR`/`VARCHAR` data in the ODBC Python bindings, especially for cross-platform compatibility between Linux/macOS and Windows. The changes ensure that character data is decoded correctly and that buffer sizes are large enough to prevent corruption or truncation when the ODBC driver returns multi-byte UTF-8 data on non-Windows systems.

**Character Encoding Handling:**

- Introduced the `GetEffectiveCharDecoding` function to determine the correct decoding for SQL `CHAR` data: always UTF-8 on Linux/macOS (since the ODBC driver returns UTF-8 there), and the user-specified encoding on Windows. This function is now used consistently throughout the codebase to select the decoding method.
**Buffer Sizing for UTF-8:**

- Updated buffer allocation logic to use `columnSize * 4 + 1` for SQL `CHAR`/`VARCHAR` columns on Linux/macOS, accounting for worst-case UTF-8 expansion (up to 4 bytes per character) and preventing data truncation when multi-byte characters fall at the column boundary.

**Decoding and Data Fetching:**

- Modified all data fetching and decoding paths (`FetchLobColumnData`, `SQLGetData_wrap`, `ProcessChar`, and the batch fetch functions) to use the effective character encoding and the correct buffer sizes, ensuring consistent and correct decoding regardless of platform.
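As a rough sketch of the arithmetic behind this sizing rule (the helper name `fetch_buffer_size` is illustrative, not part of the driver):

```python
def fetch_buffer_size(column_size: int) -> int:
    # Worst case for UTF-8: each character may need up to 4 bytes,
    # plus one byte for the null terminator the driver appends.
    return column_size * 4 + 1

# VARCHAR(10): the old columnSize + 1 sizing gave 11 bytes, while
# "café René!" occupies 12 bytes once the driver converts it to UTF-8.
assert len("café René!".encode("utf-8")) == 12
assert fetch_buffer_size(10) == 41  # always enough for 10 characters
```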
**API Changes:**

- Updated the `FetchBatchData` function and its callers to accept and propagate the character encoding parameter, so the encoding context is preserved throughout the data-fetching stack.

**Minor Fixes:**

- Minor formatting and logging improvements for error messages and function signatures.

These changes collectively improve correctness and reliability when handling string data from SQL databases, especially in multi-platform environments.

**CP1252 VARCHAR Boundary Fix — Summary**

**Problem:** VARCHAR columns containing CP1252 non-ASCII characters (e.g., é, ñ, ö) returned corrupted data when the string length exactly equaled the column size. Inserting "café René!" into VARCHAR(10) returned "©!".
**Root Cause:** Three bugs in `ddbc_bindings.cpp`:

1. **Undersized buffer** — `SQLGetData` / `SQLBindCol` allocated `columnSize + 1` bytes, but on Linux/macOS the ODBC driver converts server data to UTF-8, where CP1252 é (1 byte) becomes 0xC3 0xA9 (2 bytes). A 10-char string with 2 accented characters needs 12 bytes, exceeding the 11-byte buffer → truncation → LOB fallback re-reads consumed data → corruption.
2. **Wrong decode encoding** — After fixing the buffer, data arrived intact but was decoded with the user's `charEncoding` (CP1252) instead of UTF-8. Since ODBC on Linux/macOS already converts to UTF-8, double-interpreting as CP1252 produced mojibake ("cafÃ© RenÃ©!").
3. **`ProcessChar` assumed UTF-8 on all platforms** — The batch/fetchall hot path used `PyUnicode_FromStringAndSize`, which assumes UTF-8 input. That is correct on Linux (ODBC returns UTF-8), but wrong on Windows (ODBC returns the server's native encoding, e.g. CP1252).

---------

Co-authored-by: subrata-ms <subrata@microsoft.com>
Co-authored-by: gargsaumya <saumyagarg.100@gmail.com>
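The failure chain above can be reproduced with plain Python codecs, independent of the driver (a minimal sketch of the encoding math only):

```python
s = "café René!"                 # 10 characters, so it fits VARCHAR(10)
utf8 = s.encode("utf-8")

# Bug 1: UTF-8 expansion. Each é becomes the two bytes 0xC3 0xA9, so the
# 10-character string needs 12 bytes, more than the old 11-byte
# (columnSize + 1) buffer, triggering truncation and the LOB fallback.
assert "é".encode("utf-8") == b"\xc3\xa9"
assert len(s) == 10 and len(utf8) == 12

# Bug 2: wrong decode encoding. Interpreting the driver's UTF-8 bytes
# with the user's CP1252 setting yields mojibake instead of the original.
assert utf8.decode("cp1252") == "cafÃ© RenÃ©!"
assert utf8.decode("utf-8") == s     # the fix: decode as UTF-8
```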
1 parent 7ebef4f commit 249cfdf

4 files changed

Lines changed: 849 additions & 26 deletions


mssql_python/pybind/ddbc_bindings.cpp

Lines changed: 79 additions & 17 deletions
```diff
@@ -40,6 +40,19 @@
 #define DAE_CHUNK_SIZE 8192
 #define SQL_MAX_LOB_SIZE 8000
 
+// Returns the effective character decoding encoding for SQL_C_CHAR data.
+// On Linux/macOS, the ODBC driver always returns UTF-8 for SQL_C_CHAR,
+// having already converted from the server's encoding (e.g., CP1252).
+// On Windows, the driver returns bytes in the server's native encoding.
+inline std::string GetEffectiveCharDecoding(const std::string& userEncoding) {
+#if defined(__APPLE__) || defined(__linux__)
+    (void)userEncoding;
+    return "utf-8";
+#else
+    return userEncoding;
+#endif
+}
+
 //-------------------------------------------------------------------------------------------------
 //-------------------------------------------------------------------------------------------------
 // Logging Infrastructure:
```
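For readers more comfortable in Python, the platform switch in `GetEffectiveCharDecoding` behaves like this sketch (the function and its `platform` parameter are illustrative, not driver API):

```python
import sys

def effective_char_decoding(user_encoding: str, platform: str = sys.platform) -> str:
    # Linux/macOS: the ODBC driver has already converted the data to UTF-8,
    # so the user's configured encoding must be ignored when decoding.
    if platform == "darwin" or platform.startswith("linux"):
        return "utf-8"
    # Windows: bytes arrive in the server's native encoding.
    return user_encoding

assert effective_char_decoding("cp1252", platform="linux") == "utf-8"
assert effective_char_decoding("cp1252", platform="win32") == "cp1252"
```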
```diff
@@ -1154,7 +1167,8 @@ void SqlHandle::markImplicitlyFreed() {
         // Log error but don't throw - we're likely in cleanup/destructor path
         LOG_ERROR("SAFETY VIOLATION: Attempted to mark non-STMT handle as implicitly freed. "
                   "Handle type=%d. This will cause handle leak. Only STMT handles are "
-                  "automatically freed by parent DBC handles.", _type);
+                  "automatically freed by parent DBC handles.",
+                  _type);
         return;  // Refuse to mark - let normal free() handle it
     }
     _implicitly_freed = true;
```
```diff
@@ -2876,17 +2890,18 @@ py::object FetchLobColumnData(SQLHSTMT hStmt, SQLUSMALLINT colIndex, SQLSMALLINT
         return py::bytes(buffer.data(), buffer.size());
     }
 
-    // For SQL_C_CHAR data, decode using the specified encoding
+    // For SQL_C_CHAR data, decode using the appropriate encoding.
+    const std::string effectiveCharEncoding = GetEffectiveCharDecoding(charEncoding);
     py::bytes raw_bytes(buffer.data(), buffer.size());
     try {
-        py::object decoded = raw_bytes.attr("decode")(charEncoding, "strict");
+        py::object decoded = raw_bytes.attr("decode")(effectiveCharEncoding, "strict");
         LOG("FetchLobColumnData: Decoded narrow string with '%s' - %zu bytes -> %zu chars for "
             "column %d",
-            charEncoding.c_str(), buffer.size(), py::len(decoded), colIndex);
+            effectiveCharEncoding.c_str(), buffer.size(), py::len(decoded), colIndex);
         return decoded;
     } catch (const py::error_already_set& e) {
         LOG_ERROR("FetchLobColumnData: Failed to decode with '%s' for column %d: %s",
-                  charEncoding.c_str(), colIndex, e.what());
+                  effectiveCharEncoding.c_str(), colIndex, e.what());
         // Return raw bytes as fallback
         return raw_bytes;
     }
```
```diff
@@ -2942,7 +2957,23 @@ SQLRETURN SQLGetData_wrap(SqlHandlePtr StatementHandle, SQLUSMALLINT colCount, p
                     row.append(
                         FetchLobColumnData(hStmt, i, SQL_C_CHAR, false, false, charEncoding));
                 } else {
-                    uint64_t fetchBufferSize = columnSize + 1 /* null-termination */;
+                    // Allocate columnSize * 4 + 1 on ALL platforms (no #if guard).
+                    //
+                    // Why this differs from SQLBindColums / FetchBatchData:
+                    // Those two functions use #if to apply *4 only on Linux/macOS,
+                    // because on Windows with a non-UTF-8 collation (e.g. CP1252)
+                    // each character occupies exactly 1 byte, so *1 suffices and
+                    // saves memory across the entire batch (fetchSize × numCols
+                    // buffers).
+                    //
+                    // SQLGetData_wrap allocates a single temporary buffer per
+                    // column per row, so the over-allocation cost is negligible.
+                    // Using *4 unconditionally here keeps the code simple and
+                    // correct on every platform—including Windows with a UTF-8
+                    // collation where multi-byte chars could otherwise cause
+                    // truncation at the exact column boundary (e.g. CP1252 é in
+                    // VARCHAR(10)).
+                    uint64_t fetchBufferSize = columnSize * 4 + 1 /* null-termination */;
                     std::vector<SQLCHAR> dataBuffer(fetchBufferSize);
                     SQLLEN dataLen;
                     ret = SQLGetData_ptr(hStmt, i, SQL_C_CHAR, dataBuffer.data(), dataBuffer.size(),
```
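To see why the unconditional `columnSize * 4 + 1` allocation removes the boundary failure, here is a small Python check (hypothetical helper, not driver code):

```python
def fits_in_one_fetch(value: str, column_size: int, multiplier: int) -> bool:
    # The fetch completes without truncation when the UTF-8 bytes plus the
    # null terminator fit in a buffer of column_size * multiplier + 1 bytes.
    buffer_size = column_size * multiplier + 1
    return len(value.encode("utf-8")) + 1 <= buffer_size

# Old sizing (multiplier 1): 12 data bytes + terminator exceed 11 bytes.
assert not fits_in_one_fetch("café René!", 10, multiplier=1)
# New sizing (multiplier 4): 13 <= 41, so no LOB fallback is triggered.
assert fits_in_one_fetch("café René!", 10, multiplier=4)
```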
```diff
@@ -2953,20 +2984,23 @@
                     uint64_t numCharsInData = dataLen / sizeof(SQLCHAR);
                     if (numCharsInData < dataBuffer.size()) {
                         // SQLGetData will null-terminate the data
-                        // Use Python's codec system to decode bytes with specified encoding
+                        // Use Python's codec system to decode bytes.
+                        const std::string decodeEncoding =
+                            GetEffectiveCharDecoding(charEncoding);
                         py::bytes raw_bytes(reinterpret_cast<char*>(dataBuffer.data()),
                                             static_cast<size_t>(dataLen));
                         try {
                             py::object decoded =
-                                raw_bytes.attr("decode")(charEncoding, "strict");
+                                raw_bytes.attr("decode")(decodeEncoding, "strict");
                             row.append(decoded);
                             LOG("SQLGetData: CHAR column %d decoded with '%s', %zu bytes "
                                 "-> %zu chars",
-                                i, charEncoding.c_str(), (size_t)dataLen, py::len(decoded));
+                                i, decodeEncoding.c_str(), (size_t)dataLen,
+                                py::len(decoded));
                         } catch (const py::error_already_set& e) {
                             LOG_ERROR(
                                 "SQLGetData: Failed to decode CHAR column %d with '%s': %s",
-                                i, charEncoding.c_str(), e.what());
+                                i, decodeEncoding.c_str(), e.what());
                             // Return raw bytes as fallback
                             row.append(raw_bytes);
                         }
```
```diff
@@ -3453,7 +3487,14 @@ SQLRETURN SQLBindColums(SQLHSTMT hStmt, ColumnBuffers& buffers, py::list& column
                 // TODO: handle variable length data correctly. This logic wont
                 // suffice
                 HandleZeroColumnSizeAtFetch(columnSize);
+                // Use columnSize * 4 + 1 on Linux/macOS to accommodate UTF-8
+                // expansion. The ODBC driver returns UTF-8 for SQL_C_CHAR where
+                // each character can be up to 4 bytes.
+#if defined(__APPLE__) || defined(__linux__)
+                uint64_t fetchBufferSize = columnSize * 4 + 1 /*null-terminator*/;
+#else
                 uint64_t fetchBufferSize = columnSize + 1 /*null-terminator*/;
+#endif
                 // TODO: For LONGVARCHAR/BINARY types, columnSize is returned as
                 // 2GB-1 by SQLDescribeCol. So fetchBufferSize = 2GB.
                 // fetchSize=1 if columnSize>1GB. So we'll allocate a vector of
```
```diff
@@ -3601,7 +3642,8 @@ SQLRETURN SQLBindColums(SQLHSTMT hStmt, ColumnBuffers& buffers, py::list& column
 // TODO: Move to anonymous namespace, since it is not used outside this file
 SQLRETURN FetchBatchData(SQLHSTMT hStmt, ColumnBuffers& buffers, py::list& columnNames,
                          py::list& rows, SQLUSMALLINT numCols, SQLULEN& numRowsFetched,
-                         const std::vector<SQLUSMALLINT>& lobColumns) {
+                         const std::vector<SQLUSMALLINT>& lobColumns,
+                         const std::string& charEncoding = "utf-8") {
     LOG("FetchBatchData: Fetching data in batches");
     SQLRETURN ret = SQLFetchScroll_ptr(hStmt, SQL_FETCH_NEXT, 0);
     if (ret == SQL_NO_DATA) {
```
```diff
@@ -3631,8 +3673,22 @@ SQLRETURN FetchBatchData(SQLHSTMT hStmt, ColumnBuffers& buffers, py::list& colum
             std::find(lobColumns.begin(), lobColumns.end(), col + 1) != lobColumns.end();
         columnInfos[col].processedColumnSize = columnInfos[col].columnSize;
         HandleZeroColumnSizeAtFetch(columnInfos[col].processedColumnSize);
+        // On Linux/macOS, the ODBC driver returns UTF-8 for SQL_C_CHAR where
+        // each character can be up to 4 bytes. Must match SQLBindColums buffer.
+#if defined(__APPLE__) || defined(__linux__)
+        SQLSMALLINT dt = columnInfos[col].dataType;
+        bool isCharType = (dt == SQL_CHAR || dt == SQL_VARCHAR || dt == SQL_LONGVARCHAR);
+        if (isCharType) {
+            columnInfos[col].fetchBufferSize = columnInfos[col].processedColumnSize * 4 +
+                                               1;  // *4 for UTF-8, +1 for null terminator
+        } else {
+            columnInfos[col].fetchBufferSize =
+                columnInfos[col].processedColumnSize + 1;  // +1 for null terminator
+        }
+#else
         columnInfos[col].fetchBufferSize =
             columnInfos[col].processedColumnSize + 1;  // +1 for null terminator
+#endif
     }
 
     // Performance: Build function pointer dispatch table (once per batch)
```
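The per-column branch above can be mirrored in Python (the ODBC type codes are the standard values; `batch_fetch_buffer_size` is a hypothetical illustration of the sizing rule, not driver API):

```python
SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR = 1, 12, -1  # standard ODBC type codes

def batch_fetch_buffer_size(data_type: int, column_size: int, posix: bool) -> int:
    # Only character columns on Linux/macOS get the *4 UTF-8 headroom; other
    # types keep the tight columnSize + 1 allocation, which matters here
    # because batch fetch allocates fetchSize buffers for every column.
    if posix and data_type in (SQL_CHAR, SQL_VARCHAR, SQL_LONGVARCHAR):
        return column_size * 4 + 1
    return column_size + 1

assert batch_fetch_buffer_size(SQL_VARCHAR, 10, posix=True) == 41
assert batch_fetch_buffer_size(SQL_VARCHAR, 10, posix=False) == 11
assert batch_fetch_buffer_size(4, 10, posix=True) == 11  # e.g. SQL_INTEGER
```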
```diff
@@ -3642,13 +3698,18 @@ SQLRETURN FetchBatchData(SQLHSTMT hStmt, ColumnBuffers& buffers, py::list& colum
     std::vector<ColumnProcessor> columnProcessors(numCols);
     std::vector<ColumnInfoExt> columnInfosExt(numCols);
 
+    // Compute effective char encoding once for the batch (same for all columns)
+    const std::string effectiveCharEnc = GetEffectiveCharDecoding(charEncoding);
+
     for (SQLUSMALLINT col = 0; col < numCols; col++) {
         // Populate extended column info for processors that need it
         columnInfosExt[col].dataType = columnInfos[col].dataType;
         columnInfosExt[col].columnSize = columnInfos[col].columnSize;
         columnInfosExt[col].processedColumnSize = columnInfos[col].processedColumnSize;
         columnInfosExt[col].fetchBufferSize = columnInfos[col].fetchBufferSize;
         columnInfosExt[col].isLob = columnInfos[col].isLob;
+        columnInfosExt[col].charEncoding = effectiveCharEnc;
+        columnInfosExt[col].isUtf8 = (effectiveCharEnc == "utf-8");
 
         // Map data type to processor function (switch executed once per column,
         // not per cell)
```
```diff
@@ -4094,7 +4155,8 @@ SQLRETURN FetchMany_wrap(SqlHandlePtr StatementHandle, py::list& rows, int fetch
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROW_ARRAY_SIZE, (SQLPOINTER)(intptr_t)fetchSize, 0);
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROWS_FETCHED_PTR, &numRowsFetched, 0);
 
-    ret = FetchBatchData(hStmt, buffers, columnNames, rows, numCols, numRowsFetched, lobColumns);
+    ret = FetchBatchData(hStmt, buffers, columnNames, rows, numCols, numRowsFetched, lobColumns,
+                         charEncoding);
     if (!SQL_SUCCEEDED(ret) && ret != SQL_NO_DATA) {
         LOG("FetchMany_wrap: Error when fetching data - SQLRETURN=%d", ret);
         return ret;
@@ -4103,10 +4165,10 @@
     // Reset attributes before returning to avoid using stack pointers later
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROW_ARRAY_SIZE, (SQLPOINTER)1, 0);
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROWS_FETCHED_PTR, NULL, 0);
-
+
     // Unbind columns to allow subsequent fetchone() calls to use SQLGetData
     SQLFreeStmt_ptr(hStmt, SQL_UNBIND);
-
+
     return ret;
 }
```
```diff
@@ -4231,8 +4293,8 @@ SQLRETURN FetchAll_wrap(SqlHandlePtr StatementHandle, py::list& rows,
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROWS_FETCHED_PTR, &numRowsFetched, 0);
 
     while (ret != SQL_NO_DATA) {
-        ret =
-            FetchBatchData(hStmt, buffers, columnNames, rows, numCols, numRowsFetched, lobColumns);
+        ret = FetchBatchData(hStmt, buffers, columnNames, rows, numCols, numRowsFetched, lobColumns,
+                             charEncoding);
         if (!SQL_SUCCEEDED(ret) && ret != SQL_NO_DATA) {
             LOG("FetchAll_wrap: Error when fetching data - SQLRETURN=%d", ret);
             return ret;
@@ -4242,7 +4304,7 @@
     // Reset attributes before returning to avoid using stack pointers later
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROW_ARRAY_SIZE, (SQLPOINTER)1, 0);
     SQLSetStmtAttr_ptr(hStmt, SQL_ATTR_ROWS_FETCHED_PTR, NULL, 0);
-
+
     // Unbind columns to allow subsequent fetchone() calls to use SQLGetData
     SQLFreeStmt_ptr(hStmt, SQL_UNBIND);
```

mssql_python/pybind/ddbc_bindings.h

Lines changed: 38 additions & 9 deletions
```diff
@@ -667,6 +667,8 @@ struct ColumnInfoExt {
     SQLULEN processedColumnSize;
     uint64_t fetchBufferSize;
     bool isLob;
+    bool isUtf8;               // Pre-computed from charEncoding (avoids string compare per cell)
+    std::string charEncoding;  // Effective decoding encoding for SQL_C_CHAR data
 };
 
 // Forward declare FetchLobColumnData (defined in ddbc_bindings.cpp) - MUST be
```
```diff
@@ -811,21 +813,48 @@ inline void ProcessChar(PyObject* row, ColumnBuffers& buffers, const void* colIn
     // fetchBufferSize includes null-terminator, numCharsInData doesn't. Hence
     // '<'
     if (!colInfo->isLob && numCharsInData < colInfo->fetchBufferSize) {
-        // Performance: Direct Python C API call - create string from buffer
-        PyObject* pyStr = PyUnicode_FromStringAndSize(
-            reinterpret_cast<char*>(
-                &buffers.charBuffers[col - 1][rowIdx * colInfo->fetchBufferSize]),
-            numCharsInData);
+        const char* dataPtr = reinterpret_cast<char*>(
+            &buffers.charBuffers[col - 1][rowIdx * colInfo->fetchBufferSize]);
+        PyObject* pyStr = nullptr;
+#if defined(__APPLE__) || defined(__linux__)
+        // On Linux/macOS, ODBC driver returns UTF-8 — PyUnicode_FromStringAndSize
+        // expects UTF-8, so this is correct and fast.
+        pyStr = PyUnicode_FromStringAndSize(dataPtr, numCharsInData);
+#else
+        // On Windows, ODBC driver returns bytes in the server's native encoding.
+        // For UTF-8, use the direct C API (PyUnicode_FromStringAndSize) which
+        // bypasses the codec registry for maximum reliability. For non-UTF-8
+        // encodings (e.g., CP1252), use PyUnicode_Decode with the codec registry.
+        if (colInfo->isUtf8) {
+            pyStr = PyUnicode_FromStringAndSize(dataPtr, numCharsInData);
+        } else {
+            pyStr =
+                PyUnicode_Decode(dataPtr, numCharsInData, colInfo->charEncoding.c_str(), "strict");
+        }
+#endif
         if (!pyStr) {
-            Py_INCREF(Py_None);
-            PyList_SET_ITEM(row, col - 1, Py_None);
+            // Decode failed — fall back to returning raw bytes (consistent with
+            // FetchLobColumnData and SQLGetData_wrap which also return raw bytes
+            // on decode failure instead of silently converting to None).
+            PyErr_Clear();
+            PyObject* pyBytes = PyBytes_FromStringAndSize(dataPtr, numCharsInData);
+            if (pyBytes) {
+                PyList_SET_ITEM(row, col - 1, pyBytes);
+            } else {
+                PyErr_Clear();
+                Py_INCREF(Py_None);
+                PyList_SET_ITEM(row, col - 1, Py_None);
+            }
         } else {
             PyList_SET_ITEM(row, col - 1, pyStr);
         }
     } else {
         // Slow path: LOB data requires separate fetch call
-        PyList_SET_ITEM(row, col - 1,
-                        FetchLobColumnData(hStmt, col, SQL_C_CHAR, false, false).release().ptr());
+        PyList_SET_ITEM(
+            row, col - 1,
+            FetchLobColumnData(hStmt, col, SQL_C_CHAR, false, false, colInfo->charEncoding)
+                .release()
+                .ptr());
     }
 }
```
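The new fallback in `ProcessChar` (try a strict decode, then return raw bytes rather than `None` on failure) behaves like this Python sketch (`decode_or_raw` is a hypothetical stand-in for the C++ logic):

```python
def decode_or_raw(raw: bytes, encoding: str):
    # Mirrors the C++ fallback: a failed strict decode yields the raw bytes,
    # matching FetchLobColumnData and SQLGetData_wrap, instead of silently
    # turning the cell into None.
    try:
        return raw.decode(encoding, "strict")
    except (UnicodeDecodeError, LookupError):
        return raw

assert decode_or_raw(b"\xc3\xa9", "utf-8") == "é"
assert decode_or_raw(b"\xc3\xa9", "ascii") == b"\xc3\xa9"  # decode failed
```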

tests/test_013_encoding_decoding.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -5697,6 +5697,7 @@ def test_default_encoding_behavior_validation(conn_str):
 def test_encoding_with_bytes_and_bytearray_parameters(db_connection):
     """Test encoding with bytes and bytearray parameters (SQL_C_CHAR path)."""
     db_connection.setencoding(encoding="utf-8", ctype=mssql_python.SQL_CHAR)
+    db_connection.setdecoding(mssql_python.SQL_CHAR, encoding="utf-8")
 
     cursor = db_connection.cursor()
     try:
```
