-
Notifications
You must be signed in to change notification settings - Fork 27
FEAT: streaming support in executemany #251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces robust streaming support for large values (NVARCHAR(MAX), VARCHAR(MAX), VARBINARY(MAX)) with the executemany method. The implementation automatically detects Data-At-Execution (DAE) parameters and falls back to row-by-row execution when necessary to handle streaming operations correctly.
Key changes:
- Enhanced
executemanymethod to detect DAE parameters and fall back to row-by-row execution with streaming support - Updated C++ binding to handle DAE parameters with
SQLPutDatafor both text and binary data streaming - Added comprehensive tests for streaming inserts and fetches across all MAX column types
Reviewed Changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/test_004_cursor.py | Adds comprehensive test coverage for streaming operations with NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX) using executemany |
| mssql_python/pybind/ddbc_bindings.cpp | Implements DAE parameter detection and streaming logic in SQLExecuteMany_wrap, plus minor log message improvements |
| mssql_python/cursor.py | Updates executemany to detect DAE parameters and fall back to row-by-row execution for streaming support |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
merging main here to check validity of code coverage |
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/pybind/ddbc_bindings.cppLines 2012-2022 2012
2013 bool hasDAE = false;
2014 for (const auto& p : paramInfos) {
2015 if (p.isDAE) {
! 2016 hasDAE = true;
! 2017 break;
! 2018 }
2019 }
2020 if (!hasDAE) {
2021 std::vector<std::shared_ptr<void>> paramBuffers;
2022 rc = BindParameterArray(hStmt, columnwise_params, paramInfos, paramSetSize, paramBuffers);Lines 2027-2069 2027
2028 rc = SQLExecute_ptr(hStmt);
2029 return rc;
2030 } else {
! 2031 size_t rowCount = columnwise_params.size();
! 2032 for (size_t rowIndex = 0; rowIndex < rowCount; ++rowIndex) {
! 2033 py::list rowParams = columnwise_params[rowIndex];
2034
! 2035 std::vector<std::shared_ptr<void>> paramBuffers;
! 2036 rc = BindParameters(hStmt, rowParams, const_cast<std::vector<ParamInfo>&>(paramInfos), paramBuffers);
! 2037 if (!SQL_SUCCEEDED(rc)) return rc;
2038
! 2039 rc = SQLExecute_ptr(hStmt);
! 2040 while (rc == SQL_NEED_DATA) {
! 2041 SQLPOINTER token;
! 2042 rc = SQLParamData_ptr(hStmt, &token);
! 2043 if (!SQL_SUCCEEDED(rc) && rc != SQL_NEED_DATA) return rc;
2044
! 2045 py::object* py_obj_ptr = reinterpret_cast<py::object*>(token);
! 2046 if (!py_obj_ptr) return SQL_ERROR;
2047
! 2048 if (py::isinstance<py::str>(*py_obj_ptr)) {
! 2049 std::string data = py_obj_ptr->cast<std::string>();
! 2050 SQLLEN data_len = static_cast<SQLLEN>(data.size());
! 2051 rc = SQLPutData_ptr(hStmt, (SQLPOINTER)data.c_str(), data_len);
! 2052 } else if (py::isinstance<py::bytes>(*py_obj_ptr) || py::isinstance<py::bytearray>(*py_obj_ptr)) {
! 2053 std::string data = py_obj_ptr->cast<std::string>();
! 2054 SQLLEN data_len = static_cast<SQLLEN>(data.size());
! 2055 rc = SQLPutData_ptr(hStmt, (SQLPOINTER)data.c_str(), data_len);
! 2056 } else {
! 2057 LOG("Unsupported DAE parameter type in row {}", rowIndex);
! 2058 return SQL_ERROR;
! 2059 }
! 2060 }
2061
! 2062 if (!SQL_SUCCEEDED(rc)) return rc;
! 2063 }
! 2064 return SQL_SUCCESS;
! 2065 }
2066 }
2067
2068
2069 // Wrap SQLNumResultCols📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.ddbc_bindings.cpp: 66.9%
mssql_python.pybind.connection.connection.cpp: 68.3%
mssql_python.ddbc_bindings.py: 68.5%
mssql_python.pybind.connection.connection_pool.cpp: 78.9%
mssql_python.connection.py: 81.7%
mssql_python.helpers.py: 84.7%
mssql_python.auth.py: 85.3%
mssql_python.type.py: 86.8%
mssql_python.pooling.py: 88.8%
mssql_python.exceptions.py: 90.4%🔗 Quick Links
|
Work Item / Issue Reference
Summary
This pull request introduces robust support for streaming large values (such as NVARCHAR(MAX), VARCHAR(MAX), VARBINARY(MAX)) with the
executemanymethod, ensuring correct handling of parameters that require Data-At-Execution (DAE) or streaming. It also adds comprehensive tests for these scenarios and improves logging and code clarity throughout the codebase.Streaming and DAE parameter handling:
executemanymethod inmssql_python/cursor.pynow automatically detects DAE/streaming parameters and falls back to row-by-row execution when necessary, ensuring correct behavior for large values and streaming inserts. [1] [2] [3]mssql_python/pybind/ddbc_bindings.cpp), theSQLExecuteMany_wrapfunction checks for DAE parameters and either performs fast array execution or falls back to row-by-row execution with streaming, supporting both text and binary data.Testing improvements:
tests/test_004_cursor.pyto verify streaming inserts and fetches for NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX) columns usingexecutemanyand all fetch modes.Sample usage and demonstration:
main.pyto include a sample script that demonstrates streaming inserts and fetches of large NVARCHAR(MAX) values usingexecutemanyand different fetch modes.Logging and code clarity: