Support Query RPC for fast first page results #49

Open
scotthart opened this issue Jan 27, 2025 · 2 comments
Labels
priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@scotthart
Member

BigQuery supports both the InsertJob RPC (with type QueryJob) and a Query RPC (https://github.com/googleapis/googleapis/blob/8798ceff3f6fbcdce3186b67ce9339df337569d5/google/cloud/bigquery/v2/job.proto#L117). InsertJob is an asynchronous RPC that stores the query results in a BigQuery table, while Query is a synchronous RPC that returns the first page of results immediately and executes a QueryJob asynchronously for the rest of the results.

Add a Query method that returns the first page of results as a collection of std::tuples and a JobReference that can be used to poll for the remainder of the results.
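
A rough sketch of what the proposed return shape might look like, for illustration only. `QueryFirstPage`, the local `JobReference` struct, and `FakeQuery` are hypothetical names, not part of this library or of google-cloud-cpp, and the stub returns canned data instead of calling BigQuery.

```cpp
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the JobReference message. The field names follow
// the BigQuery v2 API, but this struct is illustrative only.
struct JobReference {
  std::string project_id;
  std::string job_id;
  std::string location;
};

// Hypothetical result type: the first page of rows as std::tuples, plus the
// job handle a caller would use to fetch the remainder of the results.
template <typename... Columns>
struct QueryFirstPage {
  std::vector<std::tuple<Columns...>> rows;
  JobReference job;
  bool job_complete = false;  // true if the first page is the whole result
};

// Stub standing in for the proposed Query method. It ignores the SQL text and
// returns canned rows so the shape of the result can be exercised offline.
inline QueryFirstPage<std::string, std::int64_t> FakeQuery(
    std::string const& /*sql*/) {
  QueryFirstPage<std::string, std::int64_t> page;
  page.rows.emplace_back("alice", 1);
  page.rows.emplace_back("bob", 2);
  page.job = JobReference{"my-project", "job-123", "US"};
  page.job_complete = false;
  return page;
}
```

A caller would read `page.rows` right away and keep `page.job` around to retrieve the rest once the job finishes.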

@scotthart scotthart added priority: p3 Desirable enhancement or fix. May not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Jan 27, 2025
@cuiy0006
Contributor

cuiy0006 commented Jan 27, 2025

while executing a QueryJob asynchronous for the rest of the results.

Is this done by GetQueryResults?

Can GetQueryResults be executed in parallel, or must it be executed in sequence? I ask because the request for the next page needs a page_token, and that token is returned in the response for the previous page.

@scotthart
Member Author

GetQueryResults is available in https://github.com/googleapis/google-cloud-cpp/blob/6055c7025d1c98625fbcbbe7a7eb4a74577a1c9b/google/cloud/bigquerycontrol/v2/job_client.h#L291 but there are no current plans to support it in this library.

GetQueryResults transfers data as JSON via a paginated mechanism over REST. The ReadArrow method is a streaming read that can be parallelized, offering superior throughput.
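
To illustrate why token-based pagination is inherently sequential, here is a minimal sketch using an in-memory stand-in. `Page`, `FakePagedSource`, and `ReadAllPages` are hypothetical names and do not correspond to the GetQueryResults API; the only point is that each request needs the token from the previous response, so the calls cannot be issued in parallel.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical page type: some rows plus the token for the next page.
struct Page {
  std::vector<std::string> rows;
  std::string next_page_token;  // empty when there are no more pages
};

// Hypothetical in-memory stand-in for a paginated endpoint. The token is just
// the starting row offset encoded as a decimal string.
class FakePagedSource {
 public:
  FakePagedSource(std::vector<std::string> rows, std::size_t page_size)
      : rows_(std::move(rows)), page_size_(page_size) {}

  Page Fetch(std::string const& page_token) const {
    std::size_t begin = page_token.empty() ? 0 : std::stoul(page_token);
    begin = std::min(begin, rows_.size());
    std::size_t const end = std::min(begin + page_size_, rows_.size());
    Page page;
    page.rows.assign(rows_.begin() + static_cast<std::ptrdiff_t>(begin),
                     rows_.begin() + static_cast<std::ptrdiff_t>(end));
    if (end < rows_.size()) page.next_page_token = std::to_string(end);
    return page;
  }

 private:
  std::vector<std::string> rows_;
  std::size_t page_size_;
};

// Each iteration must wait for the previous response, because that response
// carries the token required by the next request.
inline std::vector<std::string> ReadAllPages(FakePagedSource const& source) {
  std::vector<std::string> all;
  std::string token;
  do {
    Page page = source.Fetch(token);
    all.insert(all.end(), page.rows.begin(), page.rows.end());
    token = page.next_page_token;
  } while (!token.empty());
  return all;
}
```

A streaming read split across several independent streams has no such dependency between requests, which is what makes it parallelizable.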
