feat: Improve to_pyarrow_batches for PostgreSQL backend #10938

Closed
1 task done
ronif opened this issue Mar 5, 2025 · 1 comment · Fixed by #10954
Labels
feature Features or general enhancements

Comments

@ronif (Contributor)

ronif commented Mar 5, 2025

Is your feature request related to a problem?

Hi,

It seems that `to_pyarrow_batches` is implemented somewhat naively in many backends. In many cases (including the SQL backends) all the data is first materialized in the client-side cursor (or as a pandas DataFrame) and then partitioned into batches. This means that something like

`remote_con.table('huge_table').to_pyarrow_batches(...)`

tries to allocate the whole table in memory.

What is the motivation behind your request?

No response

Describe the solution you'd like

PostgreSQL (and maybe other backends) has a mechanism to batch results server-side, via server-side cursors.

I can make a PR for this.

What version of ibis are you running?

10.2.0

What backend(s) are you using, if any?

PostgreSQL

Code of Conduct

  • I agree to follow this project's Code of Conduct
@ronif ronif added the feature Features or general enhancements label Mar 5, 2025
@cpcloud (Member)

cpcloud commented Mar 6, 2025

@ronif Thanks for the issue!

Would definitely review a PR to improve to_pyarrow_batches() for Postgres.

cpcloud pushed a commit that referenced this issue Mar 9, 2025
…sors (#10954)

This adds a specific `to_pyarrow_batches` implementation to the
PostgreSQL backend, which uses server-side cursors. This allows ibis to
allocate only the memory needed for `chunk_size` results of the query
instead of the whole set.

Resolves #10938
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Mar 9, 2025
Projects
Status: done

2 participants