SNOW-1882075,SNOW-1990301,SNOW-1990302,SNOW-1821290: AST finalizations for stateless batches, and DataframeWriter,DataframeReader new APIs #3180
Snyk checks have passed: the security/snyk and license/snyk checks are complete, and no issues have been found.
Thanks Hemit!
@@ -250,6 +250,7 @@ class Column:
# For example, running: df.filter(col("A").isin(1, 2, 3) & col("B")) would fail since the boolean operator
# '&' would try to construct an AST using that of the new col("A").isin(1, 2, 3) column (which we currently
# don't fill if the only argument provided in the Column constructor is 'expr1' of type Expression)
I recall there used to be some issue adding this to the constructor. Do we have tests covering on/off for using Column objects (i.e. calling this API, and also col)?
Could you clarify, do you mean on/off in terms of our global AST enabled flag?
Oh, I mean with the AST flag enabled/disabled.
Oh, yes we do have tests in both cases then (public calls made by tests, and internal calls to the constructor) if I'm understanding correctly.
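To make the concern in this thread concrete, here is a toy sketch of why filling the AST in the constructor matters for operators such as '&', and what an on/off check might look like. Everything in it (ToyColumn, isin_internal, the ast_enabled argument, the dict-shaped AST) is hypothetical and invented for illustration; it is not the Snowpark Column class or its real AST format.

```python
from typing import Optional


class ToyColumn:
    """Toy model only; NOT snowflake.snowpark.Column."""

    def __init__(self, expr: str, ast: Optional[dict] = None, ast_enabled: bool = True):
        self.expr = expr
        self.ast_enabled = ast_enabled
        # Fill a default AST in the constructor when the flag is on, so that
        # columns built internally from a bare expression still carry one.
        if ast_enabled and ast is None:
            ast = {"kind": "expr", "sql": expr}
        self.ast = ast if ast_enabled else None

    def __and__(self, other: "ToyColumn") -> "ToyColumn":
        sql = f"({self.expr} AND {other.expr})"
        if not self.ast_enabled:
            # Flag off: no AST is built at all.
            return ToyColumn(sql, ast_enabled=False)
        # Flag on: the combined AST is assembled from both operands' ASTs,
        # which is why each operand must already have one.
        node = {"kind": "and", "lhs": self.ast, "rhs": other.ast}
        return ToyColumn(sql, ast=node, ast_enabled=True)


def isin_internal(name: str, values, ast_enabled: bool = True) -> ToyColumn:
    """Models an internal constructor call that passes only an expression."""
    sql = f"{name} IN ({', '.join(map(str, values))})"
    return ToyColumn(sql, ast_enabled=ast_enabled)


# Exercise the same expression with the flag enabled and disabled.
for flag in (True, False):
    combined = isin_internal("A", [1, 2, 3], ast_enabled=flag) & ToyColumn("B", ast_enabled=flag)
    print(flag, combined.expr, combined.ast is not None)
```

In this sketch the flag is passed explicitly so both modes can be checked in one loop; the real code base uses a global flag, as mentioned above.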
No questions re the minor changes, but I'm confused about _flush_ast. What's its role? Why do we need it? Could we get away with not adding yet another parameter?
Only using […]. Using the combination means that internal callers can set […]. It's a unique case: public APIs, eager (in terms of actually performing the execution within the function itself rather than in a helper), and used internally by other public APIs. I don't think this combination exists anywhere else in the Snowpark codebase, unless I'm mistaken. Elsewhere we seem to pass an […]
Hm, internal use of […]
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
This PR includes all changes required to complete the following tickets.
It also includes changes in support of SNOW-1997136: Ensure AST batch is stateless.
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
AST generation for DataframeReader APIs which return the same DataframeReader instance has been updated to follow the same pattern as DataframeWriter. Since all such methods are setters, it is only necessary to have the last set attributes in the AST sent to the server. This is in support of the new format API in both Snowpark classes.
AST generation was added for the new APIs DataframeReader.{format,load}, and DataframeWriter.{format,read,insert_into} along with expectation test updates.
Changes were also made in support of AST batch statelessness in the IR, by changing fields typed VarId to be Expr typed. This allows more consistent usage of _set_ast_ref as a helper method across all Snowpark APIs.
This PR is accompanied by a server-side PR.
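
The "only the last set attributes" idea for setter-style methods can be sketched as below. This is a toy illustration under stated assumptions, not the code in this PR: ToyReader, its dict-shaped AST entry, and the stage path are all hypothetical, and whether the real implementation snapshots state exactly at load is an assumption.

```python
from typing import Any, Dict, Optional


class ToyReader:
    """Toy model only; NOT the real DataframeReader."""

    def __init__(self) -> None:
        self._format: Optional[str] = None
        self._options: Dict[str, Any] = {}

    def format(self, fmt: str) -> "ToyReader":
        # Setter: overwrite the attribute and return the same instance.
        self._format = fmt
        return self

    def option(self, key: str, value: Any) -> "ToyReader":
        # Setter: a later value for the same key replaces the earlier one.
        self._options[key] = value
        return self

    def load(self, path: str) -> dict:
        # Only the final attribute values are recorded; intermediate setter
        # calls leave no trace in the emitted entry.
        return {
            "op": "load",
            "format": self._format,
            "options": dict(self._options),
            "path": path,
        }


reader = ToyReader()
entry = (
    reader.format("csv")   # superseded by the next call
    .format("json")
    .option("x", 1)        # superseded by the next call
    .option("x", 2)
    .load("@my_stage/data")  # hypothetical path
)
print(entry)
```

Because each setter returns the same object, replaying only the final attribute values is enough to reconstruct the reader's state on the receiving side, which is the property the description relies on.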