@tjgreen42
Collaborator

Summary

  • Fix data corruption in bm25query type when Postgres uses short (1-byte) varlena headers
  • Functions tpquery_out, tpquery_send, bm25_text_bm25query_score, and tpquery_eq now properly detoast varlena arguments
  • Fix tpquery_send to use pq_begintypsend() for proper binary protocol formatting
  • Fix tpquery_recv to use pq_getmsgbyte() for version field
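
Why the width of the version read matters can be shown with a small standalone model. This is only a sketch: it does not use the Postgres pqformat API, and the message layout (a one-byte version followed by a four-byte big-endian OID), the `Reader` type, and all function names are assumptions made for illustration, not the actual bm25query wire format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a received message (not a Postgres StringInfo). */
typedef struct Reader {
    const uint8_t *data;
    size_t len;
    size_t pos;
} Reader;

/* Write an assumed layout: 1-byte version, then a 4-byte big-endian OID. */
static size_t write_message(uint8_t *buf, uint8_t version, uint32_t index_oid)
{
    buf[0] = version;
    buf[1] = (uint8_t)(index_oid >> 24);
    buf[2] = (uint8_t)(index_oid >> 16);
    buf[3] = (uint8_t)(index_oid >> 8);
    buf[4] = (uint8_t)index_oid;
    return 5;
}

/* Read a single byte; returns -1 when the input is exhausted. */
static int read_byte(Reader *r)
{
    if (r->pos >= r->len)
        return -1;
    return r->data[r->pos++];
}

/* Read a 4-byte big-endian integer; returns 0 on success, -1 if too short. */
static int read_u32_be(Reader *r, uint32_t *out)
{
    const uint8_t *p;

    if (r->len - r->pos < 4)
        return -1;
    p = r->data + r->pos;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    r->pos += 4;
    return 0;
}
```

In this model, reading the version with `read_byte` leaves the cursor on the first OID byte, so the following four-byte read returns the OID that was written. Reading the version with `read_u32_be` instead consumes three OID bytes, returns a wrong version, and leaves too little input for the OID.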

Background

When Postgres stores a small varlena value, it may use a 1-byte header instead of the standard 4-byte header. Functions that fetched the argument with PG_GETARG_POINTER instead of detoasting it with PG_DETOAST_DATUM assumed a 4-byte header and therefore read struct fields at incorrect offsets, causing data corruption.

This manifested as:

  • Truncated query text in tpquery_out
  • Wrong index OID values
  • Binary I/O failures with COPY BINARY
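
The offset problem described above can be reproduced outside Postgres with a simplified model of the two header shapes. This is a hedged sketch, not the extension's code: the bit layout loosely follows the little-endian varlena convention, the payload (a 4-byte OID followed by 4 more bytes) stands in for the real TpQuery struct, and every function name is hypothetical. Note also that the real PG_DETOAST_DATUM returns a copy with a 4-byte header rather than adjusting a pointer; the model only illustrates why a fixed offset is wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Short form: 1 header byte, low bit set, total length in the upper 7 bits. */
static size_t pack_short(uint8_t *buf, const uint8_t *payload, size_t n)
{
    size_t total = n + 1;

    assert(total <= 127);               /* must fit in 7 bits */
    buf[0] = (uint8_t)((total << 1) | 0x01);
    memcpy(buf + 1, payload, n);
    return total;
}

/* Long form: 4 header bytes, low two bits clear, total length above them. */
static size_t pack_long(uint8_t *buf, const uint8_t *payload, size_t n)
{
    uint32_t total = (uint32_t)n + 4;
    uint32_t hdr = total << 2;

    buf[0] = (uint8_t)(hdr & 0xFF);
    buf[1] = (uint8_t)((hdr >> 8) & 0xFF);
    buf[2] = (uint8_t)((hdr >> 16) & 0xFF);
    buf[3] = (uint8_t)((hdr >> 24) & 0xFF);
    memcpy(buf + 4, payload, n);
    return total;
}

/* Buggy pattern: always assumes a 4-byte header. */
static const uint8_t *payload_naive(const uint8_t *buf)
{
    return buf + 4;
}

/* Header-aware pattern: inspects the low bit to pick the right offset. */
static const uint8_t *payload_checked(const uint8_t *buf)
{
    return (buf[0] & 0x01) ? buf + 1 : buf + 4;
}

/* Read a native-endian uint32 without alignment assumptions. */
static uint32_t read_u32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

With a long header both accessors return the same pointer, which is why the bug stays hidden for larger values. With a short header, `payload_naive` lands three bytes into the payload, so a field such as the OID is read from the wrong bytes.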

Testing

  • Added binary_io.sql test that verifies COPY BINARY round-trip for bm25query type

@tjgreen42 tjgreen42 force-pushed the fix-bm25query-binary-io branch from a6eb348 to 8ac399b Compare January 9, 2026 02:25
@tjgreen42
Collaborator Author

Addressed: reverted pq_getmsgint(buf, 4) back to pq_getmsgint(buf, sizeof(Oid)) and pq_getmsgint(buf, sizeof(int32)). You're right that the original was more self-documenting.

Several functions accessed TpQuery struct fields without properly
detoasting the varlena, which caused data corruption when Postgres
used short (1-byte) varlena headers instead of standard 4-byte headers.

Fixed functions:
- tpquery_out: use PG_DETOAST_DATUM for text output
- tpquery_send: use PG_DETOAST_DATUM and proper pq_begintypsend
- tpquery_recv: use pq_getmsgbyte for version (1 byte)
- bm25_text_bm25query_score: use PG_DETOAST_DATUM for query
- tpquery_eq: use PG_DETOAST_DATUM for both arguments

Also fixed tpquery_send to use pq_begintypsend() instead of
makeStringInfo() for proper binary protocol formatting.

Testing: Added binary_io.sql test for COPY BINARY round-trip.
@tjgreen42 tjgreen42 force-pushed the fix-bm25query-binary-io branch from 8ac399b to 7213a78 Compare January 9, 2026 02:31
@tjgreen42 tjgreen42 merged commit 71ceb7e into main Jan 9, 2026
12 checks passed
@tjgreen42 tjgreen42 deleted the fix-bm25query-binary-io branch January 9, 2026 02:54
