Conversation

@VGSML

@VGSML VGSML commented Nov 4, 2025

Rewrite the Arrow.QueryContext method to use the new DuckDB Arrow C-API
Closes #5

@VGSML
Author

VGSML commented Nov 4, 2025

Merge after duckdb/duckdb-go-bindings#39 and after the bindings versions are bumped.

@VGSML
Author

VGSML commented Nov 4, 2025

@taniabogatsch FYI

@taniabogatsch
Collaborator

Thanks! Will have a look once we've released the bindings with the latest changes. 👍

@VGSML VGSML changed the base branch from main to v2.6.0-preview November 16, 2025 13:22
@VGSML VGSML changed the base branch from v2.6.0-preview to main November 17, 2025 10:01
@VGSML
Author

VGSML commented Nov 17, 2025

@taniabogatsch Could you take a look at this PR? What is the sequence for merging it: should I bump the arrowmapping version in the duckdb package before merging, or split the PR in two (the arrowmapping changes and the changes in the duckdb package)?

@taniabogatsch
Collaborator

I think your duckdb-go-bindings changes should already be on main! :)

@taniabogatsch
Collaborator

We pushed a release today.

@taniabogatsch taniabogatsch self-requested a review November 17, 2025 18:59
@VGSML
Author

VGSML commented Nov 17, 2025

Yes, I saw. I mean the arrowmapping package in this repo.

@taniabogatsch
Collaborator

Ah, I see now! We likely need to bump it in a separate PR, i.e., first one PR with the new functions in the arrowmapping package, then I can bump that version. 👍

@VGSML
Author

VGSML commented Nov 21, 2025

@taniabogatsch Something strange is happening with the tests: when Go version 1.24 is specified in go.mod, the tests fail. I have created an issue (#63) to update the Go version to 1.25 in the CI and packages.

@VGSML
Author

VGSML commented Nov 21, 2025

@taniabogatsch I reverted the transitive dependency versions. The failing tests are fixed now.
You can merge this PR.

@VGSML
Author

VGSML commented Nov 21, 2025

@taniabogatsch Will you merge it?

@taniabogatsch
Collaborator

Yes, but I'll need to review it first. 👍 I haven't found the time to take a look yet; I will do so next week.

Collaborator

@taniabogatsch taniabogatsch left a comment

Thanks for the PR / implementation - I saw that it's not a huge PR, so I squeezed in a review already ;)

Collaborator

Is this a breaking change? I.e., can people still use rdr.Record() and rdr.(*arrowStreamReader) in their implementations? Or would we break compatibility with this change as it currently is?

Author

Hi, no, there are no breaking changes; the public methods are still the same.

Comment on lines +104 to +106
	if err != nil {
		return nil, err
	}
Collaborator

Should we call rows.Close() here before returning the error?

Author

It's still in use by the record reader and will be closed by the last recordReader.Release, so it should remain open at this point.
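
The ownership model described in this reply can be illustrated with a small reference-counted reader. This is only a sketch under stated assumptions: `closer`, `refReader`, and `newRefReader` are hypothetical stand-ins and not the types used in this PR; the only thing taken from the thread is the idea that the underlying rows are closed by the last `Release`.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// closer is a hypothetical stand-in for the rows handle owned by the reader.
type closer struct{ closed bool }

func (c *closer) Close() { c.closed = true }

// refReader is a hypothetical reference-counted reader (not the PR's type).
type refReader struct {
	refs atomic.Int64
	rows *closer
}

// newRefReader hands ownership of rows to the reader; the creator holds
// the first reference.
func newRefReader(rows *closer) *refReader {
	r := &refReader{rows: rows}
	r.refs.Store(1)
	return r
}

// Retain adds one reference.
func (r *refReader) Retain() { r.refs.Add(1) }

// Release drops one reference and closes the rows once no references remain.
func (r *refReader) Release() {
	if r.refs.Add(-1) == 0 {
		r.rows.Close()
	}
}

func main() {
	rows := &closer{}
	rdr := newRefReader(rows)

	rdr.Retain()
	rdr.Release()
	fmt.Println("closed after first Release:", rows.closed)

	rdr.Release()
	fmt.Println("closed after last Release:", rows.closed)
}
```

With this design, a caller must not close the rows directly while any reader reference is alive; whether that also covers every early-return error path is exactly what the review question is probing.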

Collaborator

We're removing a bunch of exported functionality here - let's instead tag them as deprecated? And add the new functionality on top?

Author

@VGSML VGSML Nov 23, 2025

No, I just moved the public method to the top of the file. There are no breaking changes.
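
For reference, the deprecation approach suggested in the review comment usually looks like the following in Go: the old exported function stays, delegates to the new one, and carries a `Deprecated:` paragraph in its doc comment, which tooling such as gopls and staticcheck picks up. `OldThing` and `NewThing` are hypothetical names used only for illustration; they are not part of this package.

```go
package main

import "fmt"

// NewThing is a hypothetical replacement API.
func NewThing(name string) string {
	return "thing:" + name
}

// OldThing is a hypothetical exported function kept for compatibility.
//
// Deprecated: use NewThing instead.
func OldThing(name string) string {
	// Delegating keeps the behavior of both entry points identical.
	return NewThing(name)
}

func main() {
	fmt.Println(OldThing("a") == NewThing("a"))
}
```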

arrow.go Outdated
	}
	if arrowmapping.ArrowScan(a.conn.conn, name, arrowStream) == mapping.StateError {
		release()
		return nil, errors.New("duckdb_arrow_scan")
Collaborator

Can we turn this into a more elaborate error, i.e., one that says something went wrong on the DuckDB side? Also, can we trigger this in a test? If not, should we tell people to file a bug report if they run into it?

Author

OK, but this part of the code is unchanged. There is no way to get the real error from ArrowScan (I think it should be deprecated in the future).

I want to add a new type of table function, ArrowTableUDF, that will use the new C-API, and deprecate this method. With this new UDF and a replacement scan we can provide the same functionality of having an Arrow view.

The main difference will be in the options for deleting the view definition after use. Do you know whether the C-API will include a method to deregister replacement scans and UDFs?
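
As a sketch of what a more descriptive error could look like when the C call only reports a state and no message: a sentinel error wrapped with context lets callers match it with `errors.Is` while still reading a helpful message. The names `errArrowScan`, `wrapScanError`, and `isScanError`, as well as the message wording, are assumptions made for illustration and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errArrowScan is a hypothetical sentinel for a failed scan registration.
var errArrowScan = errors.New("duckdb_arrow_scan failed")

// wrapScanError adds the view name and a hint to the sentinel error.
// No underlying DuckDB message is included because, per the thread,
// the call does not expose one.
func wrapScanError(viewName string) error {
	return fmt.Errorf(
		"could not register arrow view %q: %w (DuckDB reported no details; please file a bug report if this is reproducible)",
		viewName, errArrowScan)
}

// isScanError reports whether err wraps the sentinel.
func isScanError(err error) bool {
	return errors.Is(err, errArrowScan)
}

func main() {
	err := wrapScanError("my_view")
	fmt.Println(err)
	fmt.Println(isScanError(err))
}
```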

Comment on lines 193 to 197
	defer mapping.DestroyErrorData(&ed)
	if mapping.ErrorDataHasError(ed) {
		return nil, fmt.Errorf("failed to create arrow schema: %w",
			getDuckDBError(mapping.ErrorDataMessage(ed)))
	}
Collaborator

Use func errorDataError(errorData mapping.ErrorData) error here.

Author

done


	schema, ed := arrowmapping.NewArrowSchema(arrowOptions, types, names)
	defer mapping.DestroyErrorData(&ed)
	if mapping.ErrorDataHasError(ed) {
Collaborator

If we return with an error here, then we also need to destroy the arrow options, no?

Author

done
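
One common way to make sure a resource is destroyed on every error return, while still handing it over on success, is a deferred cleanup guarded by the named error result. The sketch below uses hypothetical stand-ins (`options`, `destroy`, `newReader`) rather than the real arrowmapping types, and it is not necessarily how the fix was implemented in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical stand-in for a C-allocated options handle.
type options struct{ destroyed bool }

func (o *options) destroy() { o.destroyed = true }

// newReader is hypothetical: on success the returned value keeps using opts,
// on failure opts is destroyed before returning.
func newReader(opts *options, failSchema bool) (reader *options, err error) {
	defer func() {
		// err is the named result, so it is already set when this runs.
		if err != nil {
			opts.destroy()
		}
	}()

	if failSchema {
		return nil, errors.New("failed to create arrow schema")
	}
	return opts, nil
}

func main() {
	bad := &options{}
	_, err := newReader(bad, true)
	fmt.Println(err, bad.destroyed)

	good := &options{}
	_, err = newReader(good, false)
	fmt.Println(err, good.destroyed)
}
```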

arrow.go Outdated
type duckdbArrowReader struct {
	ctx  context.Context
	res  mapping.Result
	opts *arrowmapping.ArrowOptions
Collaborator

I don't think we need a pointer (heap allocation) here - arrowmapping.ArrowOptions already wraps a pointer, which we can just copy.

Author

done
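
The point about storing the wrapper by value can be shown with a small sketch: when a struct only wraps a pointer, copying the struct copies the pointer, so both copies refer to the same underlying object and no extra indirection is needed. `handle`, `newHandle`, and `readerByValue` are hypothetical stand-ins; the actual layout of `arrowmapping.ArrowOptions` is assumed here based on the review comment, not verified.

```go
package main

import (
	"fmt"
	"unsafe"
)

// handle is a hypothetical stand-in for a binding type that wraps a C pointer.
type handle struct{ ptr unsafe.Pointer }

// newHandle wraps the address of v; it exists only for this illustration.
func newHandle(v *int) handle { return handle{ptr: unsafe.Pointer(v)} }

// readerByValue stores the wrapper itself instead of a pointer to it.
type readerByValue struct{ opts handle }

// sameUnderlying reports whether two wrappers refer to the same object.
func sameUnderlying(a, b handle) bool { return a.ptr == b.ptr }

func main() {
	x := 42
	h := newHandle(&x)
	r := readerByValue{opts: h} // copies only the pointer-sized wrapper
	fmt.Println(sameUnderlying(h, r.opts))
}
```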

	r.currentRec = nil
	defer mapping.DestroyDataChunk(&chunk)
	rec, ed := arrowmapping.DataChunkToArrowArray(*r.opts, r.schema, chunk)
	defer mapping.DestroyErrorData(&ed)
Collaborator

Same: func errorDataError(errorData mapping.ErrorData) error.

Author

done

@taniabogatsch taniabogatsch added the feature / enhancement (Code improvements or a new feature) and dependencies (Updates a dependency file) labels Nov 22, 2025
@taniabogatsch taniabogatsch added the changes requested (Changes have been requested to a PR or issue) label Nov 22, 2025
@VGSML
Author

VGSML commented Nov 23, 2025

The PR only replaces the internal implementation of the Arrow.QueryContext method and of the record reader.

I also renamed the reader to make it simpler.

@VGSML VGSML requested a review from taniabogatsch November 23, 2025 13:26

Labels

changes requested: Changes have been requested to a PR or issue
dependencies: Updates a dependency file
feature / enhancement: Code improvements or a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move to new Arrow C-API

2 participants