-
Couldn't load subscription status.
- Fork 41
feat(catalog): add catalog API #62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
crates/paimon/src/catalog/mod.rs
Outdated
| /// Impl References: <https://github.com/apache/paimon/blob/release-0.8.2/paimon-core/src/main/java/org/apache/paimon/catalog/Catalog.java#L42> | ||
| #[async_trait] | ||
| pub trait Catalog: Send + Sync { | ||
| const DEFAULT_DATABASE: &'static str = "default"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using const will make this trait non-object safe, preventing users from invoking Catalog through Arc<dyn Catalog>. How about introducing a new API, such as info, for this purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will do it later.
| fn options(&self) -> &HashMap<String, String>; | ||
|
|
||
| /// Returns the FileIO instance. | ||
| fn file_io(&self) -> &FileIO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The catalog is integrated with file I/O, which is somewhat surprising to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The catalog is integrated with file I/O, which is somewhat surprising to me.
The FileIO fileIO() function can be found at here. Do you have some good suggestions to deal this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need some input from @JingsongLi, @SteNicholas, and @Aitozi: Should a catalog be coupled with a specific file I/O?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the catalog is bind with a warehouse path which determines the fileIO. So these two things are coupled in current shape, and the table's schema are also retrieved from the filesystem. Also like to see some inputs from @JingsongLi @SteNicholas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QuakeWang, @Xuanwo, @Aitozi, the warehoused path is required to determine which FileIO to use. This interface makes sense to me.
crates/paimon/src/catalog/mod.rs
Outdated
| ) -> Result<()>; | ||
|
|
||
| /// Returns a Table instance for the specified identifier. | ||
| async fn get_table(&self, identifier: &Identifier) -> Result<impl Table>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing Table makes this trait non-object safe. Perhaps we could return a Table struct instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing
Tablemakes this trait non-object safe. Perhaps we could return aTablestruct instead.
Ok, I will create a Table struct instead of trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the java version,Table interface represents various specialized table types. Is it possible to define a simple Table struct in rust ?
What about returning Box<dyn Table> instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the java version,
Tableinterface represents various specialized table types. Is it possible to define a simpleTablestruct in rust ? What about returningBox<dyn Table>instead?
Maybe we will not use dyn, when returning Box<dyn Table> has some error like error[E0404]: expected trait, found struct Table. In Rust, dyn is used for dynamic dispatch with trait objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the java version,
Tableinterface represents various specialized table types. Is it possible to define a simpleTablestruct in rust ? What about returningBox<dyn Table>instead?Maybe we will not use
dyn, when returningBox<dyn Table>has some error likeerror[E0404]: expected trait, found struct Table. In Rust,dynis used for dynamic dispatch with trait objects.
Why Box<dyn Table> would give error[E0404] ? It should not when using properly.
I think in many cases, the return value of get_table needs to be downcast to a concrete type, so it makes sense to use dyn Table.
Alternatively, maybe using enum Table is feasible
| async fn list_tables(&self, database_name: &str) -> Result<Vec<String>>; | ||
|
|
||
| /// Checks if a table exists. | ||
| async fn table_exists(&self, identifier: &Identifier) -> Result<bool> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit surprising that we use Identifier in some places but not in others. Shouldn't we be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit surprising that we use
Identifierin some places but not in others. Shouldn't we be consistent?
In the Java version, Identifier is usually used for table-related methods.
| ) -> Result<()>; | ||
|
|
||
| /// Returns whether this catalog is case-sensitive. | ||
| fn case_sensitive(&self) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can consolidate those metadata-related items into a single API.
crates/paimon/src/catalog/mod.rs
Outdated
| /// A table provides basic abstraction for a table type and table scan, and table read. | ||
| /// | ||
| /// Impl Reference: <https://github.com/apache/paimon/blob/release-0.8.2/paimon-core/src/main/java/org/apache/paimon/table/Table.java#L41> | ||
| pub trait Table { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like Table should be a struct instead of a trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create a simple Table struct as the return value of the get_table method. With more design details about Table we can create a new issue for discussion. What do you think of it?
|
Also invite @Aitozi for a review. |
|
|
||
| /// Returns the warehouse root path containing all database directories in this catalog. | ||
| fn warehouse(&self) -> &str; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do APIs like lockFactory need to be added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do APIs like
lockFactoryneed to be added?
I think this PR will add the struct and definition first, maybe we can add some APIs like lockFactory in the further PRs.
|
Hi teams, any update about the PR? Can we merge it and do something based on it? |
I believe there some unresolved comments left: #62 (comment) |
|
@QuakeWang Hi, do you mind if I continue your job? |
Sorry I've been busy recently and didn't check back sooner. Please feel free to do this with this as planned. |
fix: #52