-
Notifications
You must be signed in to change notification settings - Fork 82
feat(ibis): postgreSQL, PostGIS support for IBIS Server. #1188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
""" WalkthroughThe changes introduce support for PostgreSQL extensions, specifically PostGIS, in the metadata extraction process. This is achieved by extending the column type enum to include geometry types and adding an Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant PostgresMetadata
participant ExtensionHandler
participant Database
Client->>PostgresMetadata: get_table_list()
PostgresMetadata->>Database: Query standard table metadata
Database-->>PostgresMetadata: Return tables and columns
PostgresMetadata->>ExtensionHandler: augment(tables)
ExtensionHandler->>Database: get_extensions()
Database-->>ExtensionHandler: Return installed extensions
ExtensionHandler->>Database: Query PostGIS columns (if present)
Database-->>ExtensionHandler: Return geometry/geography columns
ExtensionHandler->>PostgresMetadata: Return augmented tables
PostgresMetadata-->>Client: Return final table list with extension types
Poem
Note ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. Note ⚡️ Faster reviews with cachingCodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
ibis-server/app/model/metadata/postgres.py (2)
53-86
: Consider parameterizing SQL query and improving table name handlingWhile the implementation is functional, there are a few improvements that could be made:
The SQL query uses string formatting with
f-string
. Although theschema_name
comes from the database itself and is less likely to be a security risk, it's generally better to use parameterized queries.As noted in your comment on line 76, the table name formatting appears in multiple places. Consider refactoring to use the existing
_format_postgres_compact_table_name
method from thePostgresMetadata
class.- table_name = f"{row['f_table_schema']}.{row['f_table_name']}" # ? Might want to use a global `_format_postgres_compact_table_name` function. + table_name = self._format_postgres_compact_table_name(row['f_table_schema'], row['f_table_name']) # Add the method to the ExtensionHandler class: + def _format_postgres_compact_table_name(self, schema: str, table: str): + return f"{schema}.{table}"
87-98
: Consider consolidating type transformation methodsThere are now two methods with similar functionality:
ExtensionHandler._transform_postgres_column_type
(new)PostgresMetadata._transform_postgres_column_type
(existing)Consider refactoring to avoid duplication, perhaps by:
- Moving the extension type mapping to the main transformation method, or
- Creating a shared utility function that both classes can use.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ibis-server/app/model/metadata/dto.py
(1 hunks)ibis-server/app/model/metadata/postgres.py
(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
ibis-server/app/model/metadata/postgres.py (1)
ibis-server/app/model/metadata/dto.py (2)
Table
(79-84)RustWrenEngineColumnType
(13-57)
🔇 Additional comments (4)
ibis-server/app/model/metadata/dto.py (1)
54-57
: Well-organized extension of RustWrenEngineColumnType enumThe addition of
GEOMETRY
andGEOGRAPHY
types under a clear "Extension types > PostGIS" comment block is clean and follows the established pattern. This structure allows for future extension types to be added in an organized manner.ibis-server/app/model/metadata/postgres.py (3)
14-21
: Well-designed ExtensionHandler class structureThe class is designed with extensibility in mind using a handler dictionary that maps extension names to methods. This approach makes it easy to add support for additional PostgreSQL extensions in the future.
22-36
: Clear extension handling flowThe
augment
method has a clean implementation that retrieves installed extensions and applies the appropriate handlers. The method correctly returns the modified tables list.
37-52
: Effective extension detection queryThe SQL query to retrieve extensions and their schemas is well-structured. The error handling for empty dataframes is a good practice.
extension_handler = ExtensionHandler(self.connection) | ||
unique_tables = extension_handler.augment(unique_tables) | ||
return list(unique_tables.values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify extension handler augmentation behavior
The augment
method signature returns a list of tables, but it's being used here to modify the unique_tables
dictionary. Since Python dictionaries are passed by reference, this likely works as intended, but it would be clearer if the method returned the modified dictionary or the variable assignment was removed.
🏁 Script executed:
#!/bin/bash
# Check if the augment method modifies the tables in place
echo "Examining ExtensionHandler.augment method:"
rg -A 5 "def augment" --type py
echo "------------------------------------------------"
echo "Checking reference to unique_tables before and after augment call:"
rg -A 5 -B 2 "extension_handler.augment" --type py
Length of output: 1566
🏁 Script executed:
#!/bin/bash
# Show full implementation of the augment method including its return behavior
rg -n "def augment" -A 50 ibis-server/app/model/metadata/postgres.py
# Show context around unique_tables before and after its definition and usage
rg -n "unique_tables" -C 5 ibis-server/app/model/metadata/postgres.py
Length of output: 3870
🏁 Script executed:
#!/bin/bash
# Show complete ExtensionHandler class, including handler registrations
rg -n "class ExtensionHandler" -A 200 ibis-server/app/model/metadata/postgres.py
echo "---- handlers mapping ----"
rg -n "self.handlers" -A 20 ibis-server/app/model/metadata/postgres.py
echo "---- postgis_handler implementation ----"
rg -n "def postgis_handler" -A 200 ibis-server/app/model/metadata/postgres.py
Length of output: 19056
Fix type annotations for ExtensionHandler.augment and handlers
The augment
and related handler methods actually accept and return a mapping of table names to Table
objects (a dict[str, Table]
), not a list[Table]
. Update the type signatures to reflect that, or else change the implementation to work with lists:
• In ibis-server/app/model/metadata/postgres.py, adjust:
- class ExtensionHandler:
- def augment(self, tables: list[Table]) -> list[Table]:
+ class ExtensionHandler:
+ def augment(self, tables: dict[str, Table]) -> dict[str, Table]:
# …
- def postgis_handler(self, tables: list[Table], schema_name: str) -> list[Table]:
+ def postgis_handler(self, tables: dict[str, Table], schema_name: str) -> dict[str, Table]:
# …
• Optionally remove the assignment
Since the handlers mutate the dict in place and then return it, you could also call:
extension_handler.augment(unique_tables)
return list(unique_tables.values())
without re-assigning unique_tables
.
These changes will make the code’s behavior and its annotations consistent.
This is very interesting use case! @wwwy3y3 @andreashimin take a closer look! |
Thanks @eren-ture for the contribution. It's awesome. 👍
Maybe add a test case called
I expected you might implement another method called |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eren-ture, I think this PR is pretty close. I left some comments for it. As what I mentioned previously, it's good to add the test case for it. We can use the Postgis image to build the test cases. I think I can help you do this if you don't mind I push some commits to your branch.
column.type = self._transform_postgres_column_type( | ||
row["column_type"] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
column.type = self._transform_postgres_column_type( | |
row["column_type"] | |
) | |
column.type = str(self._transform_postgres_column_type( | |
row["column_type"] | |
)) |
I got the following message when testing:
Expected `str` but got `RustWrenEngineColumnType` with value `<RustWrenEngineColumnType.GEOMETRY: 'GEOMETRY'>` - serialized value may not be as expected
I think it's better to cast to str
explicitly to avoid the warning.
|
||
# Update tables | ||
for row in response: | ||
table_name = f"{row['f_table_schema']}.{row['f_table_name']}" # ? Might want to use a global `_format_postgres_compact_table_name` function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
table_name = f"{row['f_table_schema']}.{row['f_table_name']}" # ? Might want to use a global `_format_postgres_compact_table_name` function. | |
# TODO: Might want to use a global `_format_postgres_compact_table_name` function. | |
table_name = f"{row['f_table_schema']}.{row['f_table_name']}" |
The comment is like a to-do enhancement. It's better to add the TODO word for it.
Hi @goldmedal, |
Thanks. I'm looking forward to them. 👍 |
… the distance functionality of the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ibis-server/tests/routers/v2/connector/test_postgres.py (1)
1074-1078
: Consider adding more comprehensive PostGIS tests.While this test validates basic spatial query functionality, consider adding more tests that cover other PostGIS operations like:
- Spatial joins
- Area/perimeter calculations
- Spatial containment queries
- Testing both geometry and geography column types
Adding more comprehensive tests would ensure that all aspects of the PostGIS integration work correctly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
ibis-server/tests/resource/tpch/data/cities_geometry.parquet
is excluded by!**/*.parquet
📒 Files selected for processing (1)
ibis-server/tests/routers/v2/connector/test_postgres.py
(3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
ibis-server/tests/conftest.py (2)
file_path
(10-11)client
(18-23)ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
manifest_str
(29-30)
🔇 Additional comments (3)
ibis-server/tests/routers/v2/connector/test_postgres.py (3)
4-4
: Appropriate geopandas import for PostGIS tests.The addition of the
geopandas
library as a dependency is necessary for handling geospatial data and loading it into PostGIS tables.
153-163
: Well-structured PostGIS fixture.The
postgis
fixture correctly follows the same pattern as other fixtures in the file, with appropriate scope and cleanup handling. The fixture properly:
- Creates a PostGIS-enabled container using the correct image
- Initializes the PostGIS extension
- Loads geospatial data from a Parquet file into a table for testing
This provides a good foundation for testing spatial functionality.
159-161
:✅ Verification successful
Verify test data file exists.
The test loads data from a Parquet file at
resource/tpch/data/cities_geometry.parquet
. Let's verify that this file exists in the repository.
🏁 Script executed:
#!/bin/bash # Check if the cities_geometry.parquet file exists if [ -f "tests/resource/tpch/data/cities_geometry.parquet" ]; then echo "File exists" # Preview first few rows of the file structure if possible python -c "import pyarrow.parquet as pq; print(pq.read_metadata('tests/resource/tpch/data/cities_geometry.parquet'))" else echo "File not found: tests/resource/tpch/data/cities_geometry.parquet" fiLength of output: 200
🏁 Script executed:
#!/bin/bash # Search for the cities_geometry.parquet file anywhere in the repository echo "Searching for cities_geometry.parquet via ripgrep..." rg -l "cities_geometry\.parquet" || echo "No matches found with rg" echo "" echo "Searching for cities_geometry.parquet via find..." find . -type f -name "cities_geometry.parquet" || echo "No file found with find"Length of output: 435
File Verified: cities_geometry.parquet Present
The test data file
ibis-server/tests/resource/tpch/data/cities_geometry.parquet
exists at the expected location. No changes are needed here.
Inclusions
Added an
ExtensionHandler
class to handle database extensions. Created thepostgis_handler
method to define columns asgeometry
orgeography
.Added new column types
GEOMETRY
andGEOGRAPHY
to theRustWrenEngineColumnType
enum to represent PostGIS-specific data types.Proof of Work
Issue
When the frontend is asked to display the values, it obviously fails. I unfortunately don't know much about frontend development, ad cannot fix this issue.

Summary by CodeRabbit