Skip to content

Conversation

@devdavidkarlsson
Copy link
Contributor

Description

Fixes #2650

This PR adds support for multi-dot notation in database names without requiring quotes, enabling table references like a.b.c where a.b is the database name and c is the table name.

Changes

  • Grammar: Updated ANTLR databaseIdentifier rule to support multiple dot-separated identifiers: identifier (DOT identifier)*
  • Parser Logic: Added extractTableName() helper method in SqlParserFacade to properly unquote each identifier part separately and join them correctly
  • Tests: Added testMultiDotNotation() test method with 3 test cases:
    • Unquoted multi-dot: SELECT * FROM a.b.c WHERE id = ?
    • Quoted identifiers: SELECT * FROM `db.part1`.`table` WHERE id = ?
    • INSERT statement: INSERT INTO a.b.c (col1, col2) VALUES (?, ?)

Testing

  • ✅ All 367 existing tests continue to pass
  • ✅ New test validates parsing of multi-dot database names
  • ✅ Verified against ClickHouse server with databases containing dots in their names

Example Usage

-- Database with dot in name (must be quoted when creating)
CREATE DATABASE `test.db`;
CREATE TABLE `test.db`.users (id UInt32, name String) ENGINE = MergeTree ORDER BY id;

-- Can now be referenced in JDBC with proper parsing
SELECT * FROM `test.db`.users WHERE id = ?;
INSERT INTO `test.db`.users VALUES (?, ?);

The JDBC driver now correctly parses these statements and extracts the full table name test.db.users for metadata operations.

Fixes ClickHouse#2650

Changes:
- Updated ANTLR grammar databaseIdentifier rule to support multiple
  dot-separated identifiers: identifier (DOT identifier)*
- Added extractTableName() helper method in SqlParserFacade to properly
  handle multi-part table identifiers by unquoting each part separately
- Added testMultiDotNotation() test with 3 test cases covering SELECT,
  INSERT, and quoted identifiers
- All 367 existing tests continue to pass

This allows database names like 'a.b' in table references such as
'a.b.c' or '`db.part1`.`table`' to be parsed correctly.
Copy link
Contributor

@windsurf-bot windsurf-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 To request another review, post a new comment with "/windsurf-review".

Addresses PR feedback: Only append dot after database identifier
if table identifier is not null to avoid trailing dots.
@devdavidkarlsson
Copy link
Contributor Author

/windsurf-review

Copy link
Contributor

@windsurf-bot windsurf-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 🤙

💡 To request another review, post a new comment with "/windsurf-review".

@chernser
Copy link
Contributor

Good day, @devdavidkarlsson !
Thank you for the contribution!

I've left comments.
Build fails with:

Error:  Tests run: 1182, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.501 s <<< FAILURE! -- in TestSuite
Error:  com.clickhouse.jdbc.internal.JavaCCParserTest.testMultiDotNotation -- Time elapsed: 0.010 s <<< FAILURE!
java.lang.AssertionError: expected [false] but found [true]
	at org.testng.Assert.fail(Assert.java:110)
	at org.testng.Assert.failNotEquals(Assert.java:1413)
	at org.testng.Assert.assertFalse(Assert.java:78)
	at org.testng.Assert.assertFalse(Assert.java:88)
	at com.clickhouse.jdbc.internal.BaseSqlParserFacadeTest.testMultiDotNotation(BaseSqlParserFacadeTest.java:169)

Address PR feedback: Handle multi-dot notation directly in the parser
grammar rather than in post-processing Java code.

Changes:
- ANTLR: Simplified to use getText() directly from parser, removed
  extractTableName() method that was walking the parse tree
- JavaCC: Modified tableIdentifier rule to parse all dot-separated
  identifiers and split database/table within the grammar action
- Both parsers now handle the logic in the grammar/parser itself
@devdavidkarlsson
Copy link
Contributor Author

@chernser new attempt: I added the JavaCC (.jj) and got rid of extractTableName, tests should be green now.
Thank you 🙏

@devdavidkarlsson
Copy link
Contributor Author

/windsurf-review

1 similar comment
@devdavidkarlsson
Copy link
Contributor Author

/windsurf-review

Copy link
Contributor

@windsurf-bot windsurf-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 To request another review, post a new comment with "/windsurf-review".

Address bot feedback about quoted identifiers containing dots.

Changes:
- ANTLR: Use ClickHouseSqlUtils.unescape() instead of
  SQLUtils.unquoteIdentifier() to properly handle escaped backticks
- JavaCC: Already working correctly (no changes needed)
- Added comprehensive test suite with 15 test cases covering all
  ClickHouse-relevant scenarios for quoted identifiers with dots

Test coverage:

Case  SQL Pattern                           Database          Table
----  ------------------------------------  ----------------  ---------------
1     db.table                              db                table
2     `db`.`table`                          db                table
3     db.`table.name`                       db                table.name
4     `db.part1`.`table`                    db.part1          table
5     db.`table.name`                       db                table.name
6     `db.part1`.table                      db.part1          table
7     db.`tab``le`                          db                tab`le
8     `my db`.`table name!@#`               my db             table name!@#
9     `db.part1`.`table.name` AS t          db.part1          table.name
10    db.`a.b.c.d`                          db                a.b.c.d
11    `db.part1.part2`.`table`              db.part1.part2    table
12    db.part1.table2                       db.part1          table2
13    `db.part1`.`part2`.`table`            db.part1.part2    table
14    db.part1.`table.name`                 db.part1          table.name
15    `db.part1`.part2.table3               db.part1.part2    table3

All 1104 tests passing (367 base tests × 3 parsers + 1 new test × 3).
@devdavidkarlsson devdavidkarlsson force-pushed the fix/jdbc-v2-multi-dot-notation branch from ba2f40a to 823b35d Compare November 25, 2025 09:37
@devdavidkarlsson
Copy link
Contributor Author

/windsurf-review

Copy link
Contributor

@windsurf-bot windsurf-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 To request another review, post a new comment with "/windsurf-review".

@chernser
Copy link
Contributor

Good day, @devdavidkarlsson !
Will review today.

super.enterColumnExprPrecedence3(ctx);
}

private String unquoteTableIdentifier(String rawTableId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strange - I thought I've left the comment that it should be actually handle by parser.
Because one of the points to have parser is to avoid writing this kind of methods.

{
if (record && t != null && token_source.table == null) {
token_source.table = ClickHouseSqlUtils.unescape(t.image);
if (record && token_source.table == null && parts.size() > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why parser cannot do it for us?

Copy link
Contributor

@chernser chernser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to get clean table and database names using parser only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[jdbc-v2] Support dot notation without quotes

2 participants