Contributor

@ahoppen ahoppen commented Jun 24, 2025

This implements support for COPY operations using `COPY … FROM STDIN` queries for fast data transfer from the client to the backend.

Performance

A quick note on local performance measurements: inserting the numbers from 0 to 1,000,000 into a table that has two columns (INT and VARCHAR) takes ~150ms. Depending on the exact implementation, the majority of the active CPU cycles are spent converting the numbers to strings, inside string interpolation, or inside ByteBuffer._setBytes. If I remove the code that sends the CopyData messages to the backend (but keep all other logic that might incur thread hops), the test described above takes ~50ms and utilizes the CPU at ~200%, so the real bottleneck here is the Postgres backend handling the data. For comparison, psycopg2 takes 210ms with the data to be written already prepared in a StringIO object. So, performance-wise, this PR should be good to go.
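
To make the client-side cost concrete, here is a minimal sketch of the kind of row encoding such a benchmark has to do. It is not the code from this PR: `appendCopyTextField` and `appendRow` are hypothetical helpers, a plain `[UInt8]` stands in for `ByteBuffer`, and the sketch assumes PostgreSQL's default text COPY format (tab-separated columns, newline-terminated rows, backslash escapes).

```swift
/// Hypothetical helper: appends one field in text COPY format, escaping the
/// bytes that would otherwise be read as delimiters or escape characters.
func appendCopyTextField(_ field: String, to buffer: inout [UInt8]) {
    for byte in field.utf8 {
        switch byte {
        case UInt8(ascii: "\\"): buffer.append(contentsOf: Array("\\\\".utf8))
        case UInt8(ascii: "\t"): buffer.append(contentsOf: Array("\\t".utf8))
        case UInt8(ascii: "\n"): buffer.append(contentsOf: Array("\\n".utf8))
        case UInt8(ascii: "\r"): buffer.append(contentsOf: Array("\\r".utf8))
        default: buffer.append(byte)
        }
    }
}

/// Hypothetical helper: appends one (INT, VARCHAR) row. The integer-to-string
/// conversion here is the kind of work the measurements above attribute most
/// of the client-side CPU time to.
func appendRow(id: Int, name: String, to buffer: inout [UInt8]) {
    buffer.append(contentsOf: String(id).utf8)
    buffer.append(UInt8(ascii: "\t"))
    appendCopyTextField(name, to: &buffer)
    buffer.append(UInt8(ascii: "\n"))
}
```

Each call to `appendRow` produces one line of COPY data; how many rows go into a single write is left to the caller.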

Ideas for follow-up PRs

  • Check if we should buffer data sent through the PostgresCopyFromWriter to reduce the number of CopyData messages we need to send (and thus the protocol overhead). Alternatively, we can leave that kind of optimization to the client.
  • Add an API that allows binary transfer of data.
  • Implement remaining options that can be passed to COPY FROM.
  • Allow concurrently generating the data to be written and flushing a buffer to the backend.
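
The buffering idea from the first bullet could look roughly like the sketch below. Everything in it is an assumption for illustration: `BufferedCopyWriter` is a hypothetical type, the `flush` closure stands in for whatever sends one CopyData message, and the sketch is synchronous for brevity even though the writer in this PR is `async`.

```swift
/// Hypothetical buffering layer: collects small writes and forwards them in
/// larger chunks, so fewer CopyData messages are needed.
struct BufferedCopyWriter {
    private var pending: [UInt8] = []
    private let flushThreshold: Int
    /// Stand-in for "send one CopyData message"; not a PostgresNIO API.
    private let flush: ([UInt8]) throws -> Void

    init(flushThreshold: Int, flush: @escaping ([UInt8]) throws -> Void) {
        self.flushThreshold = flushThreshold
        self.flush = flush
    }

    /// Buffers `bytes` and flushes once the threshold is reached.
    mutating func write(_ bytes: [UInt8]) throws {
        pending.append(contentsOf: bytes)
        if pending.count >= flushThreshold {
            try flushPending()
        }
    }

    /// Sends whatever is still buffered; call once after the last write.
    mutating func finish() throws {
        try flushPending()
    }

    private mutating func flushPending() throws {
        guard !pending.isEmpty else { return }
        try flush(pending)
        pending.removeAll(keepingCapacity: true)
    }
}
```

A suitable threshold would have to be measured; the trade-off is fewer protocol messages against more memory held on the client and later error reporting.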

Fixes #290

Collaborator

@fabianfett fabianfett left a comment

Another review round. Thanks so much for pushing through this!

@ahoppen ahoppen changed the title from "WIP: Implement COPY … FROM STDIN" to "Implement COPY … FROM STDIN queries" Jul 7, 2025
@ahoppen ahoppen marked this pull request as ready for review July 7, 2025 22:46
@ahoppen ahoppen requested a review from gwynne as a code owner July 7, 2025 22:46
@ahoppen ahoppen force-pushed the copy-from branch 2 times, most recently from e9aa0bd to c5f2928 July 8, 2025 08:36
@fabianfett fabianfett added the semver-minor label ("Adds new public API.") Jul 22, 2025
@ahoppen ahoppen force-pushed the copy-from branch 2 times, most recently from 0e79469 to 1602d85 July 24, 2025 17:39
Collaborator

@fabianfett fabianfett left a comment

Another round. Please also make sure that merge conflicts are resolved and tests pass.

///
/// - Throws: If an error occurs during the write or if the backend sent an `ErrorResponse` during the copy
/// operation, e.g. to indicate that a **previous** `write` call had an invalid format.
public func write(_ byteBuffer: ByteBuffer, isolation: isolated (any Actor)? = #isolation) async throws {
Collaborator

Let's remove the isolation parameter here and instead adopt nonisolated(nonsending) for the whole package.

Suggested change
- public func write(_ byteBuffer: ByteBuffer, isolation: isolated (any Actor)? = #isolation) async throws {
+ public func write(_ byteBuffer: ByteBuffer) async throws {

Contributor Author

nonisolated(nonsending) is only available in Swift 6.2, so I’m not quite sure what you’re proposing.

  • Make these functions only available in Swift 6.2 behind #if compiler(>=6.2) and explicitly mark them as nonisolated(nonsending)?
  • Make these functions nonisolated(nonsending) when using a 6.2 compiler and let them run on the nonisolated executor context in older compilers? I’d be very worried about the subtle difference in behavior here.
  • Adopt the NonisolatedNonsendingByDefault upcoming language feature when the host compiler is 6.2+? But that just expands the problem mentioned above to all functions.

So, I’d just stick with the isolation for now.
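
For readers following the thread, here is a rough sketch of how the current approach and the first option above differ in source. `SketchCopyWriter` and `writeNonsending` are hypothetical names with placeholder bodies, not the PR's code; the sketch assumes a Swift 6.0+ toolchain for `#isolation`, and the second declaration is only compiled on 6.2+.

```swift
/// Hypothetical stand-in for the writer type under review.
struct SketchCopyWriter {
    /// Approach kept in the PR: the caller's isolation is passed as an explicit
    /// parameter, so the function runs on the caller's actor instead of hopping
    /// to the global concurrent executor.
    func write(_ bytes: [UInt8], isolation: isolated (any Actor)? = #isolation) async throws {
        try Task.checkCancellation()
        // ... sending `bytes` to the backend would go here ...
    }
}

#if compiler(>=6.2)
extension SketchCopyWriter {
    /// First option from the list: a 6.2-only declaration that expresses the
    /// same "stay on the caller's actor" behavior without an extra parameter.
    nonisolated(nonsending) func writeNonsending(_ bytes: [UInt8]) async throws {
        try Task.checkCancellation()
    }
}
#endif
```

The second and third options would drop the `#if` and accept that older compilers run the function on the global concurrent executor, which is the behavioral difference raised above.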

columns: [String] = [],
format: PostgresCopyFromFormat = .text(.init()),
logger: Logger,
isolation: isolated (any Actor)? = #isolation,
Collaborator

Suggested change
- isolation: isolated (any Actor)? = #isolation,

Contributor Author

ahoppen commented Nov 28, 2025

Thanks for the review, Fabian 🙂

Rebased, updated tests to use Swift Testing and addressed/replied to the review comments.


codecov bot commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 84.61538% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.70%. Comparing base (db1eae1) to head (125c70a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...tion State Machine/ExtendedQueryStateMachine.swift | 78.99% | 25 Missing ⚠️ |
| ...nection State Machine/ConnectionStateMachine.swift | 76.56% | 15 Missing ⚠️ |
| ...urces/PostgresNIO/New/PostgresChannelHandler.swift | 86.95% | 9 Missing ⚠️ |
| ...esNIO/Connection/PostgresConnection+CopyFrom.swift | 93.10% | 8 Missing ⚠️ |
| Sources/PostgresNIO/New/PSQLTask.swift | 80.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #566      +/-   ##
==========================================
+ Coverage   75.11%   75.70%   +0.58%     
==========================================
  Files         132      134       +2     
  Lines        9532     9885     +353     
==========================================
+ Hits         7160     7483     +323     
- Misses       2372     2402      +30     

| Files with missing lines | Coverage Δ |
|---|---|
| ...tgresNIO/New/Extensions/AnyErrorContinuation.swift | 100.00% <100.00%> (ø) |
| Sources/PostgresNIO/New/PSQLTask.swift | 87.50% <80.00%> (-1.08%) ⬇️ |
| ...esNIO/Connection/PostgresConnection+CopyFrom.swift | 93.10% <93.10%> (ø) |
| ...urces/PostgresNIO/New/PostgresChannelHandler.swift | 90.72% <86.95%> (-0.33%) ⬇️ |
| ...nection State Machine/ConnectionStateMachine.swift | 68.75% <76.56%> (+0.86%) ⬆️ |
| ...tion State Machine/ExtendedQueryStateMachine.swift | 79.11% <78.99%> (+0.79%) ⬆️ |

... and 3 files with indirect coverage changes


Development

Successfully merging this pull request may close these issues.

Support Copy In Mode
