Skip to content

Conversation

@Alex-PLACET
Copy link
Member

No description provided.

@Alex-PLACET Alex-PLACET self-assigned this Nov 7, 2025
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From sparrow

@Alex-PLACET Alex-PLACET requested a review from Copilot November 7, 2025 13:02
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces the sparrow-pycapsule library, which provides C++ utilities for interacting with Python's C API capsule system to export and import Apache Arrow arrays. The library enables seamless data exchange between C++ (using the Sparrow library) and Python through the Arrow C Data Interface.

  • Implements bidirectional conversion functions between Sparrow arrays and Python capsules
  • Adds comprehensive test suite covering export, import, and round-trip operations
  • Sets up complete build infrastructure with CMake, CI workflows, and development tooling

Reviewed Changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 19 comments.

Show a summary per file
File Description
include/sparrow-pycapsule/pycapsule.hpp Defines public API for capsule export/import operations
src/pycapsule.cpp Implements capsule conversion functions and destructors
test/test_pycapsule.cpp Comprehensive test suite for all pycapsule functionality
CMakeLists.txt Main build configuration defining the sparrow-pycapsule library target
test/CMakeLists.txt Test build configuration and targets
cmake/external_dependencies.cmake Dependency management for Sparrow, doctest, and Python
cmake/compile_options.cmake Compiler flags and warnings configuration
.github/workflows/*.yml CI workflows for Linux, macOS, and Windows builds
environment-dev.yml Conda environment specification for development
sparrow-pycapsuleConfig.cmake.in CMake package configuration template

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


- name: Run tests
working-directory: build
run: cmake --build . --target run_tests_with_junit_report
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target run_tests_with_junit_report is referenced in the workflow but is never defined in the test CMakeLists.txt. The actual test target defined is run_pycapsule_tests (line 47 in test/CMakeLists.txt). This will cause the workflow to fail when trying to run tests.

Copilot uses AI. Check for mistakes.
if(LIBCPP)
message(STATUS "Using libc++")

# Allow the use of not visible yet availabile features, such
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment: "availabile" should be "available".

Suggested change
# Allow the use of not visible yet availabile features, such
# Allow the use of not visible yet available features, such

Copilot uses AI. Check for mistakes.
endif()

if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU" AND CMAKE_CXX_COMPILER_VERSION VERSION_LESS 11.3)
set(compile_options ${compile_optoins} PRIVATE "-Wno-error=shift-negative-value")
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: compile_optoins should be compile_options. This will cause the variable to not be set correctly.

Suggested change
set(compile_options ${compile_optoins} PRIVATE "-Wno-error=shift-negative-value")
set(compile_options ${compile_options} PRIVATE "-Wno-error=shift-negative-value")

Copilot uses AI. Check for mistakes.

- name: Run tests
working-directory: build
run: cmake --build . --target run_tests_with_junit_report
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target run_tests_with_junit_report is referenced in the workflow but is never defined in the test CMakeLists.txt. The actual test target defined is run_pycapsule_tests (line 47 in test/CMakeLists.txt). This will cause the workflow to fail when trying to run tests.

Copilot uses AI. Check for mistakes.

- name: Run tests
working-directory: build
run: cmake --build . --target run_tests_with_junit_report
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target run_tests_with_junit_report is referenced in the workflow but is never defined in the test CMakeLists.txt. The actual test target defined is run_pycapsule_tests (line 47 in test/CMakeLists.txt). This will cause the workflow to fail when trying to run tests.

Copilot uses AI. Check for mistakes.
@Alex-PLACET Alex-PLACET requested a review from Copilot November 7, 2025 13:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 19 out of 20 changed files in this pull request and generated 10 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

message(STATUS "Using C++ compiler launcher: ${CMAKE_CXX_COMPILER_LAUNCHER}")
endif()

# Versionning
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: Versionning should be Versioning (one 'n').

Copilot uses AI. Check for mistakes.
message(STATUS "Building sparrow v${CMAKE_PROJECT_VERSION}")

# Binary version
# See the following URL for explanations about the binary versionning
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: versionning should be versioning (one 'n').

Suggested change
# See the following URL for explanations about the binary versionning
# See the following URL for explanations about the binary versioning

Copilot uses AI. Check for mistakes.
Comment on lines +104 to +105
PyObject* schema_capsule = PyCapsule_New(schema_ptr, arrow_schema_str.data(), ReleaseArrowSchemaPyCapsule);
PyObject* array_capsule = PyCapsule_New(array_ptr, arrow_array_str.data(), ReleaseArrowArrayPyCapsule);
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Potential resource leak if PyCapsule_New fails. If the first PyCapsule_New call (line 104) fails and returns nullptr, the allocated schema_ptr would leak. Similarly, if the second PyCapsule_New call (line 105) fails, both schema_ptr and array_ptr could leak.

Consider checking for nullptr after each PyCapsule_New call and cleaning up allocated resources before returning on failure, or using RAII wrappers to ensure cleanup.

Suggested change
PyObject* schema_capsule = PyCapsule_New(schema_ptr, arrow_schema_str.data(), ReleaseArrowSchemaPyCapsule);
PyObject* array_capsule = PyCapsule_New(array_ptr, arrow_array_str.data(), ReleaseArrowArrayPyCapsule);
PyObject* schema_capsule = PyCapsule_New(schema_ptr, arrow_schema_str.data(), ReleaseArrowSchemaPyCapsule);
if (!schema_capsule) {
delete schema_ptr;
delete array_ptr;
return {nullptr, nullptr};
}
PyObject* array_capsule = PyCapsule_New(array_ptr, arrow_array_str.data(), ReleaseArrowArrayPyCapsule);
if (!array_capsule) {
delete schema_ptr;
delete array_ptr;
Py_DECREF(schema_capsule);
return {nullptr, nullptr};
}

Copilot uses AI. Check for mistakes.
ArrowArray* arrow_array_ptr = new ArrowArray();
*arrow_array_ptr = extract_arrow_array(std::move(arr));

return PyCapsule_New(arrow_array_ptr, arrow_array_str.data(), ReleaseArrowArrayPyCapsule);
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Potential resource leak if PyCapsule_New fails. If PyCapsule_New returns nullptr (e.g., due to memory allocation failure), the allocated arrow_array_ptr would leak. Consider checking for nullptr and cleaning up the allocated array before returning, or document that callers must ensure Python is properly initialized.

Copilot uses AI. Check for mistakes.
Comment on lines +443 to +444
PyObject* schema_capsule = ExportArrowSchemaPyCapsule(arr);
PyObject* array_capsule = ExportArrowArrayPyCapsule(arr);
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Using a moved-from object. ExportArrowSchemaPyCapsule(arr) on line 443 moves from arr, making it invalid. The subsequent call to ExportArrowArrayPyCapsule(arr) on line 444 is then operating on a moved-from object, which leads to undefined behavior.

The correct approach is to use export_array_to_capsules(arr) which exports both capsules from a single array object, as shown in other test cases.

Suggested change
PyObject* schema_capsule = ExportArrowSchemaPyCapsule(arr);
PyObject* array_capsule = ExportArrowArrayPyCapsule(arr);
auto [schema_capsule, array_capsule] = export_array_to_capsules(arr);

Copilot uses AI. Check for mistakes.
SUBCASE("destructor_handles_already_released_schema")
{
// Create a schema with null release callback
ArrowSchema* schema = new ArrowSchema();
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Uninitialized memory. The ArrowSchema is allocated but not fully initialized. While release is set to nullptr, other fields of the ArrowSchema structure remain uninitialized. The destructor ReleaseArrowSchemaPyCapsule may attempt to read these fields. Use new ArrowSchema{} to zero-initialize all fields or explicitly initialize the structure.

Suggested change
ArrowSchema* schema = new ArrowSchema();
ArrowSchema* schema = new ArrowSchema{};

Copilot uses AI. Check for mistakes.
SUBCASE("destructor_handles_already_released_array")
{
// Create an array with null release callback
ArrowArray* array = new ArrowArray();
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Uninitialized memory. The ArrowArray is allocated but not fully initialized. While release is set to nullptr, other fields of the ArrowArray structure remain uninitialized. The destructor ReleaseArrowArrayPyCapsule may attempt to read these fields. Use new ArrowArray{} to zero-initialize all fields or explicitly initialize the structure.

Suggested change
ArrowArray* array = new ArrowArray();
ArrowArray* array = new ArrowArray{};

Copilot uses AI. Check for mistakes.
ArrowSchema* arrow_schema_ptr = new ArrowSchema();
*arrow_schema_ptr = extract_arrow_schema(std::move(arr));

return PyCapsule_New(arrow_schema_ptr, arrow_schema_str.data(), ReleaseArrowSchemaPyCapsule);
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Potential resource leak if PyCapsule_New fails. If PyCapsule_New returns nullptr (e.g., due to memory allocation failure), the allocated arrow_schema_ptr would leak. Consider checking for nullptr and cleaning up the allocated schema before returning, or document that callers must ensure Python is properly initialized.

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +25
auto schema = static_cast<ArrowSchema*>(PyCapsule_GetPointer(capsule, arrow_schema_str.data()));
if (schema->release != nullptr)
{
schema->release(schema);
}
delete schema;
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Missing error handling. If PyCapsule_GetPointer fails and returns nullptr (which can happen if the capsule is invalid), dereferencing schema on line 21 would cause undefined behavior. The function should check if schema is nullptr before accessing its members.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +49
auto array = static_cast<ArrowArray*>(PyCapsule_GetPointer(capsule, arrow_array_str.data()));
if (array->release != nullptr)
{
array->release(array);
}
delete array;
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Missing error handling. If PyCapsule_GetPointer fails and returns nullptr (which can happen if the capsule is invalid), dereferencing array on line 45 would cause undefined behavior. The function should check if array is nullptr before accessing its members.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant