Commit 590af57

FEAT: Support for Complex Data Type- sql_variant (#446)
### Work Item / Issue Reference

> [AB#42724](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/42724)

> GitHub Issue: #<ISSUE_NUMBER>

-------------------------------------------------------------------

### Summary

This pull request adds support for the `sql_variant` SQL Server type to the Python MSSQL driver, ensuring that `sql_variant` columns are fetched and mapped correctly to their underlying types. The changes introduce new constants, update type mappings, and enhance the fetch logic to detect and process `sql_variant` columns using their native types, improving compatibility and correctness when handling complex data.

**Support for sql_variant type**

* Added the `SQL_SS_VARIANT` constant and related attribute constants in both `mssql_python/constants.py` and the C++ binding layer to enable recognition and handling of the `sql_variant` type. [[1]](diffhunk://#diff-e6d80f1000af6fd5afca05f435b11fd82df7f5c3e75ecf5763f85d3aacdbe758R120) [[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R30-R34)
* Included `SQL_SS_VARIANT` in the set of valid types for type validation logic.

**Type mapping and fetch logic**

* Updated the C type mapping in `cursor.py` so that `SQL_SS_VARIANT` is mapped to `SQL_C_BINARY`, allowing binary transfer of the variant data.
* Implemented a helper function in the C++ layer to map a `sql_variant`'s underlying C type to the appropriate SQL data type, enabling the fetch logic to reuse existing code for each possible underlying type.
* Enhanced the fetch routines (`SQLGetData_wrap`, `FetchMany_wrap`, and `FetchAll_wrap`) to detect `sql_variant` columns, determine their true data types at runtime, and handle them with the correct logic. This includes always using the streaming fetch path for `sql_variant` columns to preserve native type fidelity. [[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2932-R3020) [[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L4044-R4144) [[3]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R4229-R4274)

**Other improvements**

* Improved error logging and debug output for easier troubleshooting and visibility into how `sql_variant` columns are processed. [[1]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L1156-R1162) [[2]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L2932-R3020) [[3]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1L4044-R4144) [[4]](diffhunk://#diff-dde2297345718ec449a14e7dff91b7bb2342b008ecc071f562233646d71144a1R4229-R4274)

These changes collectively ensure that the driver can now fully support reading `sql_variant` columns, mapping them to their actual types, and handling them efficiently.
1 parent 78ddbbb commit 590af57

4 files changed

Lines changed: 755 additions & 39 deletions

mssql_python/constants.py

Lines changed: 2 additions & 0 deletions
@@ -118,6 +118,7 @@ class ConstantsDDBC(Enum):
     SQL_DATETIMEOFFSET = -155
     SQL_SS_TIME2 = -154
     SQL_SS_XML = -152
+    SQL_SS_VARIANT = -150
     SQL_C_SS_TIMESTAMPOFFSET = 0x4001
     SQL_SCOPE_CURROW = 0
     SQL_BEST_ROWID = 1
@@ -376,6 +377,7 @@ def get_valid_types(cls) -> set:
             ConstantsDDBC.SQL_SS_XML.value,
             ConstantsDDBC.SQL_GUID.value,
             ConstantsDDBC.SQL_SS_UDT.value,
+            ConstantsDDBC.SQL_SS_VARIANT.value,
         }

     # Could also add category methods for convenience
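
The effect of the two added lines can be sketched in isolation. The snippet below is a minimal illustration, not the real module: `MiniConstants` is a hypothetical, cut-down stand-in for `ConstantsDDBC`, and `is_valid_type` is a hypothetical helper. The numeric values for `SQL_SS_XML`, `SQL_SS_UDT`, and `SQL_SS_VARIANT` are the ones that appear in this commit.

```python
from enum import Enum


class MiniConstants(Enum):
    """Hypothetical, cut-down stand-in for ConstantsDDBC (values from this commit)."""

    SQL_SS_XML = -152
    SQL_SS_UDT = -151
    SQL_SS_VARIANT = -150


def get_valid_types() -> set:
    """Collect the raw integer codes that type validation accepts."""
    return {
        MiniConstants.SQL_SS_XML.value,
        MiniConstants.SQL_SS_UDT.value,
        MiniConstants.SQL_SS_VARIANT.value,
    }


def is_valid_type(sql_type: int) -> bool:
    """Hypothetical helper: True when a driver-reported type code is recognised."""
    return sql_type in get_valid_types()
```

Under these assumptions, a column reported with type code `-150` would pass validation, while an unlisted code would not.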

mssql_python/cursor.py

Lines changed: 1 addition & 0 deletions
@@ -883,6 +883,7 @@ def _get_c_type_for_sql_type(self, sql_type: int) -> int:
             # Other types
             ddbc_sql_const.SQL_GUID.value: ddbc_sql_const.SQL_C_GUID.value,
             ddbc_sql_const.SQL_SS_XML.value: ddbc_sql_const.SQL_C_WCHAR.value,
+            ddbc_sql_const.SQL_SS_VARIANT.value: ddbc_sql_const.SQL_C_BINARY.value,
         }
         return sql_to_c_type.get(sql_type, ddbc_sql_const.SQL_C_DEFAULT.value)

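
The lookup-with-fallback pattern in `_get_c_type_for_sql_type` can be shown as a standalone sketch. The function below is a simplified, free-standing version written for illustration; it is not the method from `cursor.py`. The `SQL_SS_*` codes come from this commit, and the remaining constants are the values I understand the standard ODBC headers to define, so treat them as assumptions.

```python
# Type codes: SQL_SS_XML and SQL_SS_VARIANT are taken from this commit;
# the others are assumed to match the standard ODBC headers.
SQL_GUID = -11
SQL_SS_XML = -152
SQL_SS_VARIANT = -150

SQL_C_GUID = -11
SQL_C_WCHAR = -8
SQL_C_BINARY = -2
SQL_C_DEFAULT = 99

# Illustrative subset of the SQL-type -> C-type table.
SQL_TO_C_TYPE = {
    SQL_GUID: SQL_C_GUID,
    SQL_SS_XML: SQL_C_WCHAR,
    SQL_SS_VARIANT: SQL_C_BINARY,  # the entry added by this commit
}


def get_c_type_for_sql_type(sql_type: int) -> int:
    """Return the C type for a SQL type, falling back to SQL_C_DEFAULT."""
    return SQL_TO_C_TYPE.get(sql_type, SQL_C_DEFAULT)
```

With this table, `sql_variant` resolves to the binary C type, and any code missing from the table falls through to the default.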
mssql_python/pybind/ddbc_bindings.cpp

Lines changed: 145 additions & 39 deletions
@@ -28,6 +28,19 @@
 #define SQL_MAX_NUMERIC_LEN 16
 #define SQL_SS_XML (-152)
 #define SQL_SS_UDT (-151)
+#define SQL_SS_VARIANT (-150)
+#define SQL_CA_SS_VARIANT_TYPE (1215)
+#ifndef SQL_C_DATE
+#define SQL_C_DATE (9)
+#endif
+#ifndef SQL_C_TIME
+#define SQL_C_TIME (10)
+#endif
+#ifndef SQL_C_TIMESTAMP
+#define SQL_C_TIMESTAMP (11)
+#endif
+// SQL Server-specific variant TIME type code
+#define SQL_SS_VARIANT_TIME (16384)

 #define STRINGIFY_FOR_CASE(x) \
     case x: \
@@ -2914,6 +2927,67 @@ py::object FetchLobColumnData(SQLHSTMT hStmt, SQLUSMALLINT colIndex, SQLSMALLINT
     }
 }

+// Helper function to map sql_variant's underlying C type to SQL data type
+// This allows sql_variant to reuse existing fetch logic for each data type
+SQLSMALLINT MapVariantCTypeToSQLType(SQLLEN variantCType) {
+    switch (variantCType) {
+        case SQL_C_SLONG:
+        case SQL_C_LONG:
+            return SQL_INTEGER;
+        case SQL_C_SSHORT:
+        case SQL_C_SHORT:
+            return SQL_SMALLINT;
+        case SQL_C_SBIGINT:
+            return SQL_BIGINT;
+        case SQL_C_FLOAT:
+            return SQL_REAL;
+        case SQL_C_DOUBLE:
+            return SQL_DOUBLE;
+        case SQL_C_BIT:
+            return SQL_BIT;
+        case SQL_C_CHAR:
+            return SQL_VARCHAR;
+        case SQL_C_WCHAR:
+            return SQL_WVARCHAR;
+        case SQL_C_DATE:
+        case SQL_C_TYPE_DATE:
+            return SQL_TYPE_DATE;
+        case SQL_C_TIME:
+        case SQL_C_TYPE_TIME:
+        case SQL_SS_VARIANT_TIME:
+            return SQL_TYPE_TIME;
+        case SQL_C_TIMESTAMP:
+        case SQL_C_TYPE_TIMESTAMP:
+            return SQL_TYPE_TIMESTAMP;
+        case SQL_C_BINARY:
+            return SQL_VARBINARY;
+        case SQL_C_GUID:
+            return SQL_GUID;
+        case SQL_C_NUMERIC:
+            return SQL_NUMERIC;
+        case SQL_C_TINYINT:
+        case SQL_C_UTINYINT:
+        case SQL_C_STINYINT:
+            return SQL_TINYINT;
+        default:
+            // Unknown C type code - fallback to WVARCHAR for string conversion
+            // Note: SQL Server enforces sql_variant restrictions at INSERT time, preventing
+            // invalid types (text, ntext, image, timestamp, xml, MAX types, nested variants,
+            // spatial types, hierarchyid, UDTs) from being stored. By the time we fetch data,
+            // only valid base types exist. This default handles unmapped/future type codes.
+            return SQL_WVARCHAR;
+    }
+}
+
+// Helper function to check if a column requires SQLGetData streaming (LOB or sql_variant)
+static inline bool IsLobOrVariantColumn(SQLSMALLINT dataType, SQLULEN columnSize) {
+    return dataType == SQL_SS_VARIANT ||
+           ((dataType == SQL_WVARCHAR || dataType == SQL_WLONGVARCHAR || dataType == SQL_VARCHAR ||
+             dataType == SQL_LONGVARCHAR || dataType == SQL_VARBINARY ||
+             dataType == SQL_LONGVARBINARY || dataType == SQL_SS_XML || dataType == SQL_SS_UDT) &&
+            (columnSize == 0 || columnSize == SQL_NO_TOTAL || columnSize > SQL_MAX_LOB_SIZE));
+}
+
 // Helper function to retrieve column data
 SQLRETURN SQLGetData_wrap(SqlHandlePtr StatementHandle, SQLUSMALLINT colCount, py::list& row,
                           const std::string& charEncoding = "utf-8",
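
To make the shape of `MapVariantCTypeToSQLType` easier to see at a glance, here is a table-driven sketch of the same idea in Python. It is an illustration only, not code from the driver. `SQL_SS_VARIANT_TIME` (16384) and the legacy date/time C codes (9, 10, 11) appear in this commit; every other numeric value is what I understand the standard ODBC headers to define and should be read as an assumption.

```python
# SQL type codes (assumed standard ODBC values).
SQL_INTEGER, SQL_SMALLINT, SQL_BIGINT, SQL_TINYINT = 4, 5, -5, -6
SQL_REAL, SQL_DOUBLE, SQL_BIT, SQL_NUMERIC = 7, 8, -7, 2
SQL_VARCHAR, SQL_WVARCHAR, SQL_VARBINARY, SQL_GUID = 12, -9, -3, -11
SQL_TYPE_DATE, SQL_TYPE_TIME, SQL_TYPE_TIMESTAMP = 91, 92, 93

# Variant C-type code -> SQL type. Several C codes share one SQL type,
# mirroring the grouped `case` labels in the C++ helper.
VARIANT_C_TO_SQL = {
    -16: SQL_INTEGER, 4: SQL_INTEGER,            # SQL_C_SLONG, SQL_C_LONG
    -15: SQL_SMALLINT, 5: SQL_SMALLINT,          # SQL_C_SSHORT, SQL_C_SHORT
    -25: SQL_BIGINT,                             # SQL_C_SBIGINT
    7: SQL_REAL,                                 # SQL_C_FLOAT
    8: SQL_DOUBLE,                               # SQL_C_DOUBLE
    -7: SQL_BIT,                                 # SQL_C_BIT
    1: SQL_VARCHAR,                              # SQL_C_CHAR
    -8: SQL_WVARCHAR,                            # SQL_C_WCHAR
    9: SQL_TYPE_DATE, 91: SQL_TYPE_DATE,         # SQL_C_DATE, SQL_C_TYPE_DATE
    10: SQL_TYPE_TIME, 92: SQL_TYPE_TIME,        # SQL_C_TIME, SQL_C_TYPE_TIME
    16384: SQL_TYPE_TIME,                        # SQL_SS_VARIANT_TIME (from this commit)
    11: SQL_TYPE_TIMESTAMP, 93: SQL_TYPE_TIMESTAMP,
    -2: SQL_VARBINARY,                           # SQL_C_BINARY
    -11: SQL_GUID,                               # SQL_C_GUID
    2: SQL_NUMERIC,                              # SQL_C_NUMERIC
    -6: SQL_TINYINT, -28: SQL_TINYINT, -26: SQL_TINYINT,
}


def map_variant_c_type_to_sql_type(variant_c_type: int) -> int:
    """Map a variant's C type to a SQL type; unknown codes fall back to SQL_WVARCHAR."""
    return VARIANT_C_TO_SQL.get(variant_c_type, SQL_WVARCHAR)
```

The design point worth noting is the fallback: an unrecognised code is routed to the wide-string path rather than raising, which matches the `default:` branch above.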
@@ -2952,7 +3026,42 @@ SQLRETURN SQLGetData_wrap(SqlHandlePtr StatementHandle, SQLUSMALLINT colCount, p
             continue;
         }

-        switch (dataType) {
+        // Preprocess sql_variant: detect underlying type to route to correct conversion logic
+        SQLSMALLINT effectiveDataType = dataType;
+        if (dataType == SQL_SS_VARIANT) {
+            // For sql_variant, we MUST call SQLGetData with SQL_C_BINARY (NULL buffer, len=0)
+            // first. This serves two purposes:
+            // 1. Detects NULL values via the indicator parameter
+            // 2. Initializes the variant metadata in the ODBC driver, which is required for
+            //    SQLColAttribute(SQL_CA_SS_VARIANT_TYPE) to return the correct underlying C type.
+            //    Without this probe call, SQLColAttribute returns incorrect type codes.
+            SQLLEN indicator;
+            ret = SQLGetData_ptr(hStmt, i, SQL_C_BINARY, NULL, 0, &indicator);
+            if (!SQL_SUCCEEDED(ret)) {
+                LOG_ERROR("SQLGetData: Failed to probe sql_variant column %d - SQLRETURN=%d", i,
+                          ret);
+                row.append(py::none());
+                continue;
+            }
+            if (indicator == SQL_NULL_DATA) {
+                row.append(py::none());
+                continue;
+            }
+            // Now retrieve the underlying C type
+            SQLLEN variantCType = 0;
+            ret =
+                SQLColAttribute_ptr(hStmt, i, SQL_CA_SS_VARIANT_TYPE, NULL, 0, NULL, &variantCType);
+            if (!SQL_SUCCEEDED(ret)) {
+                LOG_ERROR("SQLGetData: Failed to get sql_variant underlying type for column %d", i);
+                row.append(py::none());
+                continue;
+            }
+            effectiveDataType = MapVariantCTypeToSQLType(variantCType);
+            LOG("SQLGetData: sql_variant column %d has variantCType=%ld, mapped to SQL type %d", i,
+                (long)variantCType, effectiveDataType);
+        }
+
+        switch (effectiveDataType) {
             case SQL_CHAR:
             case SQL_VARCHAR:
             case SQL_LONGVARCHAR: {
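
The ordering in the hunk above (probe first, then ask for the underlying type, then map and dispatch) can be sketched without an ODBC driver. Everything here is hypothetical scaffolding: `FakeStatement` only imitates the behaviour the code comments describe, `effective_sql_type` is an illustrative name, and the two-entry mapping table is a deliberately tiny subset. `SQL_SS_VARIANT` comes from this commit; the other constants are assumed standard ODBC values.

```python
SQL_NULL_DATA = -1      # assumed standard ODBC NULL indicator
SQL_SS_VARIANT = -150   # from this commit
SQL_WVARCHAR = -9       # fallback type used by the default branch

# Tiny illustrative subset: SQL_C_SLONG -> SQL_INTEGER, SQL_C_WCHAR -> SQL_WVARCHAR.
C_TO_SQL = {-16: 4, -8: -9}


class FakeStatement:
    """Hypothetical stand-in for a statement handle positioned on one row."""

    def __init__(self, cells):
        # cells: 1-based column index -> (variant C type, payload bytes), or None for NULL
        self._cells = cells
        self._probed = set()

    def probe(self, col):
        """Imitates a zero-length SQL_C_BINARY read: returns only the indicator."""
        self._probed.add(col)
        cell = self._cells[col]
        return SQL_NULL_DATA if cell is None else len(cell[1])

    def variant_c_type(self, col):
        """Imitates the variant-type attribute query; refuses to answer before a probe."""
        if col not in self._probed:
            raise RuntimeError("variant metadata not initialised; probe first")
        return self._cells[col][0]


def effective_sql_type(stmt, col, data_type):
    """Return the SQL type to dispatch on, or None when the variant cell is NULL."""
    if data_type != SQL_SS_VARIANT:
        return data_type                        # ordinary column: nothing to resolve
    if stmt.probe(col) == SQL_NULL_DATA:        # step 1: probe, which also detects NULL
        return None
    c_type = stmt.variant_c_type(col)           # step 2: underlying C type
    return C_TO_SQL.get(c_type, SQL_WVARCHAR)   # step 3: map, with the same fallback
```

In this sketch a NULL variant short-circuits before the type query, which is the same early `continue` the C++ code takes after appending `py::none()`.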
@@ -4118,10 +4227,7 @@ SQLRETURN FetchMany_wrap(SqlHandlePtr StatementHandle, py::list& rows, int fetch
         SQLSMALLINT dataType = colMeta["DataType"].cast<SQLSMALLINT>();
         SQLULEN columnSize = colMeta["ColumnSize"].cast<SQLULEN>();

-        if ((dataType == SQL_WVARCHAR || dataType == SQL_WLONGVARCHAR || dataType == SQL_VARCHAR ||
-             dataType == SQL_LONGVARCHAR || dataType == SQL_VARBINARY ||
-             dataType == SQL_LONGVARBINARY || dataType == SQL_SS_XML || dataType == SQL_SS_UDT) &&
-            (columnSize == 0 || columnSize == SQL_NO_TOTAL || columnSize > SQL_MAX_LOB_SIZE)) {
+        if (IsLobOrVariantColumn(dataType, columnSize)) {
             lobColumns.push_back(i + 1);  // 1-based
         }
     }
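
The predicate that replaces the inlined condition can be restated as a short Python sketch. This is an illustration of the logic, not driver code. The value of `SQL_MAX_LOB_SIZE` is not shown in this commit, so the `max_lob_size` default of 8000 is purely an assumption; the `SQL_SS_*` codes come from this commit and the other type codes are assumed standard ODBC values. The sketch also treats `column_size` as a plain signed integer, whereas the C++ helper takes an unsigned `SQLULEN`.

```python
SQL_SS_VARIANT = -150
SQL_NO_TOTAL = -4  # assumed standard ODBC value

# Types that become "LOB" columns only when their reported size is unbounded or large.
SIZE_DEPENDENT_TYPES = {
    -9,    # SQL_WVARCHAR
    -10,   # SQL_WLONGVARCHAR
    12,    # SQL_VARCHAR
    -1,    # SQL_LONGVARCHAR
    -3,    # SQL_VARBINARY
    -4,    # SQL_LONGVARBINARY
    -152,  # SQL_SS_XML
    -151,  # SQL_SS_UDT
}


def is_lob_or_variant_column(data_type: int, column_size: int, max_lob_size: int = 8000) -> bool:
    """True when the column must be read through the per-cell streaming path."""
    if data_type == SQL_SS_VARIANT:
        return True  # sql_variant always streams, regardless of reported size
    oversized = column_size == 0 or column_size == SQL_NO_TOTAL or column_size > max_lob_size
    return data_type in SIZE_DEPENDENT_TYPES and oversized
```

The asymmetry is the point: size matters for the string, binary, XML, and UDT types, but a `sql_variant` column takes the streaming path unconditionally.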
@@ -4211,6 +4317,40 @@ SQLRETURN FetchAll_wrap(SqlHandlePtr StatementHandle, py::list& rows,
         return ret;
     }

+    std::vector<SQLUSMALLINT> lobColumns;
+    for (SQLSMALLINT i = 0; i < numCols; i++) {
+        auto colMeta = columnNames[i].cast<py::dict>();
+        SQLSMALLINT dataType = colMeta["DataType"].cast<SQLSMALLINT>();
+        SQLULEN columnSize = colMeta["ColumnSize"].cast<SQLULEN>();
+
+        // Detect LOB columns that need SQLGetData streaming
+        // sql_variant always uses SQLGetData for native type preservation
+        if (IsLobOrVariantColumn(dataType, columnSize)) {
+            lobColumns.push_back(i + 1);  // 1-based
+        }
+    }
+
+    // If we have LOBs → fall back to row-by-row fetch + SQLGetData_wrap
+    if (!lobColumns.empty()) {
+        LOG("FetchAll_wrap: LOB columns detected (%zu columns), using per-row "
+            "SQLGetData path",
+            lobColumns.size());
+        while (true) {
+            ret = SQLFetch_ptr(hStmt);
+            if (ret == SQL_NO_DATA)
+                break;
+            if (!SQL_SUCCEEDED(ret))
+                return ret;
+
+            py::list row;
+            SQLGetData_wrap(StatementHandle, numCols, row, charEncoding,
+                            wcharEncoding);  // <-- streams LOBs correctly
+            rows.append(row);
+        }
+        return SQL_SUCCESS;
+    }
+
+    // No LOBs detected - use binding path with batch fetching
     // Define a memory limit (1 GB)
     const size_t memoryLimit = 1ULL * 1024 * 1024 * 1024;
     size_t totalRowSize = calculateRowSize(columnNames, numCols);
@@ -4251,40 +4391,6 @@ SQLRETURN FetchAll_wrap(SqlHandlePtr StatementHandle, py::list& rows,
     }
     LOG("FetchAll_wrap: Fetching data in batch sizes of %d", fetchSize);

-    std::vector<SQLUSMALLINT> lobColumns;
-    for (SQLSMALLINT i = 0; i < numCols; i++) {
-        auto colMeta = columnNames[i].cast<py::dict>();
-        SQLSMALLINT dataType = colMeta["DataType"].cast<SQLSMALLINT>();
-        SQLULEN columnSize = colMeta["ColumnSize"].cast<SQLULEN>();
-
-        if ((dataType == SQL_WVARCHAR || dataType == SQL_WLONGVARCHAR || dataType == SQL_VARCHAR ||
-             dataType == SQL_LONGVARCHAR || dataType == SQL_VARBINARY ||
-             dataType == SQL_LONGVARBINARY || dataType == SQL_SS_XML || dataType == SQL_SS_UDT) &&
-            (columnSize == 0 || columnSize == SQL_NO_TOTAL || columnSize > SQL_MAX_LOB_SIZE)) {
-            lobColumns.push_back(i + 1);  // 1-based
-        }
-    }
-
-    // If we have LOBs → fall back to row-by-row fetch + SQLGetData_wrap
-    if (!lobColumns.empty()) {
-        LOG("FetchAll_wrap: LOB columns detected (%zu columns), using per-row "
-            "SQLGetData path",
-            lobColumns.size());
-        while (true) {
-            ret = SQLFetch_ptr(hStmt);
-            if (ret == SQL_NO_DATA)
-                break;
-            if (!SQL_SUCCEEDED(ret))
-                return ret;
-
-            py::list row;
-            SQLGetData_wrap(StatementHandle, numCols, row, charEncoding,
-                            wcharEncoding);  // <-- streams LOBs correctly
-            rows.append(row);
-        }
-        return SQL_SUCCESS;
-    }
-
     ColumnBuffers buffers(numCols, fetchSize);

     // Bind columns
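
Taken together, the two `FetchAll_wrap` hunks move the streaming-column check ahead of the batch-size computation, so the per-row path returns before any batch setup happens. A minimal sketch of that routing decision follows. The function name `fetch_all`, its parameters, and the returned path label are all hypothetical and exist only to make the control flow visible; the rows come from a plain iterable rather than a statement handle.

```python
from itertools import islice
from typing import Callable, Iterable, List, Sequence, Tuple


def fetch_all(
    cursor_rows: Iterable[Sequence],
    column_meta: Sequence[dict],
    needs_streaming: Callable[[dict], bool],
    batch_size: int = 1000,
) -> Tuple[str, List[list]]:
    """Pick the fetch path first, then read rows; returns (path label, rows)."""
    # 1-based indexes of columns that must use the per-cell streaming path.
    streaming_cols = [i + 1 for i, meta in enumerate(column_meta) if needs_streaming(meta)]

    rows: List[list] = []
    if streaming_cols:
        # Early exit: one row at a time, no batch sizing or buffer binding.
        for row in cursor_rows:
            rows.append(list(row))
        return "per-row", rows

    # No streaming columns: pull rows in fixed-size batches.
    it = iter(cursor_rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        rows.extend(list(r) for r in batch)
    return "batched", rows
```

A result set containing even one `sql_variant` column would take the `"per-row"` branch here, which mirrors why the commit describes such columns as always using the streaming path.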
