-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Kernel]Generalize the actions after commit(like checkpoint) by introducing post commit action to kernel #4115
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left some comments!
kernel/kernel-api/src/main/java/io/delta/kernel/PostCommitAction.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/PostCommitAction.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/PostCommitAction.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/TransactionImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
import io.delta.kernel.engine.Engine; | ||
import java.io.IOException; | ||
|
||
public interface PostCommitHook { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add more docs here declaring what type of work is considered a PostCommitHook? And how an engine should treat them? i.e. are they required, how do they relate to the commit, etc
Alternatively, maybe more thorough docs for each type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some doc, PTAL. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add what sort of operations and latency wise. So that the connector can choose to run it async.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the documentation to indicate supported operations and latency indication for checkpoint in below section.
kernel/kernel-api/src/main/java/io/delta/kernel/hook/PostCommitHookType.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/hook/CheckpointHook.java
Outdated
Show resolved
Hide resolved
kernel/kernel-defaults/src/test/scala/io/delta/kernel/defaults/DeltaTableWritesSuite.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor comments then lgtm
public interface PostCommitHook { | ||
|
||
enum PostCommitHookType { | ||
// Write a new checkpoint at the version committed by the txn if required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is maybe a little ambiguous as to when it is present (i.e. is it always present and only sometime checkpoints?)
Maybe something like "Write a new checkpoint at the version committed by the txn. This hook is present when the table is ready for checkpoint according to its configured checkpoint interval"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do /* */
comment for the enum
import io.delta.kernel.internal.fs.Path; | ||
import java.io.IOException; | ||
|
||
/** Write a new checkpoint at the version committed by the txn if required. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/** Write a new checkpoint at the version committed by the txn if required. */ | |
/** Write a new checkpoint at the version committed by the txn. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this hook is created, it is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall.
*/ | ||
public boolean isReadyForCheckpoint() { | ||
return isReadyForCheckpoint; | ||
/** @return list of operations to trigger after commit. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more details such as how the connector can choose to run and how it may affect the query performance etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a usage section to cover these. "query performance" means the commit latency in this context?
import io.delta.kernel.engine.Engine; | ||
import java.io.IOException; | ||
|
||
public interface PostCommitHook { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add what sort of operations and latency wise. So that the connector can choose to run it async.
Which Delta project/connector is this regarding?
Description
This PR doesn't make any functional changes, but abstract checkpoint into post commit action. This is prepared adding more post commit actions such as CRC write (#4116)
How was this patch tested?
Existing unit test, manual test using delta/kernel/examples/run-kernel-examples.py --use-local
Does this PR introduce any user-facing changes?
No