Skip to content

Conversation

@pyramation
Copy link
Collaborator

@pyramation pyramation commented Dec 22, 2025

Summary

This PR centralizes and improves identifier quoting logic by:

  1. Generated keyword list: Added scripts/keywords.ts to generate kwlist.ts from PostgreSQL's kwlist.h, providing accurate keyword categorization (RESERVED_KEYWORD, UNRESERVED_KEYWORD, TYPE_FUNC_NAME_KEYWORD, COL_NAME_KEYWORD).

  2. New QuoteUtils methods:

    • quoteIdentifier() - TypeScript port of PostgreSQL's quote_identifier() from ruleutils.c, with proper escaping of embedded double quotes as ""
    • quoteQualifiedIdentifier() - TypeScript port of PostgreSQL's quote_qualified_identifier()
  3. Deparser refactor: Replaced inline quoting logic, hardcoded RESERVED_WORDS, and all legacy quoting methods with centralized quoteIdentifier().

Updates since last revision

  • Completed Phase 4 migration: Replaced all 130+ QuoteUtils.quote() usages with QuoteUtils.quoteIdentifier() throughout deparser.ts
  • Removed all deprecated methods: quote(), needsQuotes(), quoteString(), and needsQuotesForString() have been removed from quote-utils.ts
  • Fixed type name quoting: Type names (e.g., int, timestamp) are no longer quoted even though they are TYPE_FUNC_NAME_KEYWORDS. Type names have their own grammar rules in PostgreSQL and should not be treated as identifiers.
  • Refactored schema.name patterns: Replaced 11 instances of manual quoteIdentifier(schema) + '.' + quoteIdentifier(name) concatenation with quoteQualifiedIdentifier() for cleaner, more maintainable code (RangeVar, DropStmt, AlterObjectSchemaStmt contexts).

Review & Testing Checklist for Human

  • Verify quoting behavior change is acceptable: quoteIdentifier() quotes all keywords except UNRESERVED_KEYWORD (e.g., join will now be quoted as "join"), whereas the old logic only quoted RESERVED_KEYWORDS. This is more correct per PostgreSQL semantics but may change output.
  • Verify type name handling: Type names like int, timestamp, interval should NOT be quoted in CREATE TABLE statements. The test dynamic creation of tables validates this.
  • Test identifiers with embedded quotes: quoteIdentifier() properly escapes " as "". Verify this works correctly in roundtrip tests (e.g., identifier a"b should become "a""b").
  • Test roundtrips with edge cases: Parse→deparse→parse with reserved words as identifiers, type_func_name keywords (like join, left, right), mixed-case names, and special characters.

Test Plan

  1. Run existing test suite to catch regressions
  2. Test specific SQL statements that use quoted identifiers (CREATE DATABASE, CREATE PUBLICATION, ALTER SUBSCRIPTION, SET/RESET, etc.)
  3. Test CREATE TABLE with type keywords as column types (int, timestamp, interval, etc.)
  4. Test identifiers containing ", $, ., -, or uppercase letters

Notes

pyramation and others added 7 commits December 21, 2025 20:43
- Remove hardcoded RESERVED_WORDS from quote-utils.ts
- Import RESERVED_KEYWORDS and TYPE_FUNC_NAME_KEYWORDS from kwlist.ts
- Add explicit Set<string> type annotations to kwlist.ts exports for proper TypeScript compatibility
- QuoteUtils.needsQuotes() now uses the centralized keyword sets

Co-Authored-By: Dan Lynch <[email protected]>
…ions

The generated kwlist.ts needs explicit Set<string> type annotations for
TypeScript compatibility when calling .has() with string arguments.

Co-Authored-By: Dan Lynch <[email protected]>
…Utils

- Remove Deparser.RESERVED_WORDS and Deparser.needsQuotes from deparser.ts
- Add QuoteUtils.needsQuotesForString() and QuoteUtils.quoteString() methods
- Update all call sites to use QuoteUtils instead of Deparser methods
- Both QuoteUtils.needsQuotes and QuoteUtils.needsQuotesForString now use
  RESERVED_KEYWORDS from kwlist.ts as the single source of truth
- Add Set<string> type annotations to kwlist.ts exports for TypeScript compatibility

Co-Authored-By: Dan Lynch <[email protected]>
…L ruleutils.c

Port PostgreSQL's quote_identifier() and quote_qualified_identifier() functions
to TypeScript. These functions properly:
- Quote identifiers only when needed (lowercase letters, digits, underscores)
- Escape embedded double quotes as ""
- Check against keyword categories (quote all except UNRESERVED_KEYWORD)

References:
- https://github.com/postgres/postgres/blob/fab5cd3dd1323f9e66efeb676c4bb212ff340204/src/backend/utils/adt/ruleutils.c#L13055-L13137
- https://github.com/postgres/postgres/blob/fab5cd3dd1323f9e66efeb676c4bb212ff340204/src/backend/utils/adt/ruleutils.c#L13139-L13156

Co-Authored-By: Dan Lynch <[email protected]>
@devin-ai-integration devin-ai-integration bot changed the title Feat/quotes keywords feat: centralize quoting logic with PostgreSQL-compatible utilities Dec 22, 2025
devin-ai-integration bot and others added 4 commits December 22, 2025 05:13
…precated methods

- Replace all quoteString usages in deparser.ts with quoteIdentifier
- Replace all needsQuotesForString usages with quoteIdentifier checks
- Remove deprecated quoteString and needsQuotesForString methods from quote-utils.ts
- Update quoteIfNeeded() and String() methods to use quoteIdentifier
- Migrate DefElem contexts and VariableSetStmt to use quoteIdentifier

The quoteIdentifier function is more correct because it:
- Properly escapes embedded double quotes as ""
- Checks all keyword categories (not just RESERVED_KEYWORDS)
- Follows PostgreSQL's exact quote_identifier() logic from ruleutils.c

Co-Authored-By: Dan Lynch <[email protected]>
…precated methods

- Replace all QuoteUtils.quote() usages with QuoteUtils.quoteIdentifier()
- Remove deprecated needsQuotes() and quote() methods from quote-utils.ts
- QuoteUtils now only exports: escape, escapeEString, formatEString, needsEscapePrefix, quoteIdentifier, quoteQualifiedIdentifier

This completes the migration to PostgreSQL-accurate identifier quoting.

Co-Authored-By: Dan Lynch <[email protected]>
Type names like 'int' and 'timestamp' should not be quoted even though
they are TYPE_FUNC_NAME_KEYWORDS. The quoteIdentifier function is meant
for SQL identifiers (table names, column names, etc.), not for type names
which have their own grammar rules in PostgreSQL.

Co-Authored-By: Dan Lynch <[email protected]>
Replaced 11 instances of manual schema.name concatenation with
quoteQualifiedIdentifier() for cleaner, more maintainable code:
- RangeVar: schema.table patterns
- DropStmt: operator class/family with schema
- AlterObjectSchemaStmt: domain, type, conversion, parser, config,
  template, dictionary, operator class/family patterns

Co-Authored-By: Dan Lynch <[email protected]>
@pyramation pyramation closed this Dec 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants