-
Notifications
You must be signed in to change notification settings - Fork 0
refactor: logging original query in metaflow card #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ql_with_comment` and logging original query in metaflow card
Wiz Scan Summary
To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors how SQL queries are logged in Metaflow cards to display the original, unannotated query instead of the query with select.dev cost-tracking tags. The main change renames add_comment_to_each_sql_statement to annotate_sql_with_comment and updates it to return a tuple of (original_query, annotated_query), allowing the original query to be logged while the annotated version is executed.
- Renamed function to better reflect its purpose and return both original and annotated queries
- Modified query logging to show clean queries in Metaflow cards before annotation with tracking tags
- Added defensive
hasattrcheck forcurrent.cardinpublish()function
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
tests/functional_tests/metaflow/test__add_comment_to_each_sql_statement.py |
Updated test to handle new tuple return signature from annotate_sql_with_comment |
src/ds_platform_utils/metaflow/write_audit_publish.py |
Renamed and refactored main function to return both original and annotated queries; moved function definitions; added card existence check |
src/ds_platform_utils/metaflow/pandas.py |
Updated to log original query before annotation and execute annotated query with tracking tags |
src/ds_platform_utils/_snowflake/write_audit_publish.py |
Updated function call to use explicit named parameters for clarity |
pyproject.toml |
Bumped version from 0.3.0 to 0.3.1 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Separate statements with a blank line for readability | ||
| annotated_query = "\n\n".join(annotated) + "\n" | ||
|
|
||
| # NOTE: We are returing both original and annotated query because in the publish function, |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "returing" should be "returning".
| # NOTE: We are returing both original and annotated query because in the publish function, | |
| # NOTE: We are returning both original and annotated query because in the publish function, |
| from src.ds_platform_utils.metaflow.write_audit_publish import annotate_sql_with_comment | ||
|
|
||
|
|
||
| def test_add_comment_to_each_sql_statement(): |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test function name test_add_comment_to_each_sql_statement should be updated to test_annotate_sql_with_comment to match the renamed function it's testing.
| def test_add_comment_to_each_sql_statement(): | |
| def test_annotate_sql_with_comment(): |
|
|
||
| statements = [s.strip() for s in sqlparse.split(query) if s.strip()] | ||
| if not statements: | ||
| return "", query |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the SQL query has no statements (empty or whitespace-only), the function returns ("", query). This is inconsistent - it should return (query, query) since both the original and annotated versions should be the same when there are no statements to annotate.
| return "", query | |
| return query, query |
| original_query, annotated_query = annotate_sql_with_comment(input_sql, comment) | ||
|
|
||
| assert output.strip() == expected_output.strip() | ||
| assert annotated_query.strip() == expected_output.strip() |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test does not verify the original_query return value from annotate_sql_with_comment. Since the function now returns a tuple of (original_query, annotated_query), the test should assert that original_query equals the input SQL to ensure both return values are correct.
| if current.card: | ||
| if hasattr(current, "card"): | ||
| # Replace the annotated query with the original query for logging in the Metaflow card | ||
| operation.query = original_query |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original query is being set for all operation types (write, clone, audit, publish), which is incorrect. The write_audit_publish generator yields multiple operation types, each with their own specific queries (clone queries, audit queries, publish/swap queries). Only the "write" operation should have its query replaced with the original query for logging purposes. Consider checking operation.operation_type == "write" before replacing the query.
| operation.query = original_query | |
| if operation.operation_type == "write": | |
| operation.query = original_query |
| current.card.append(Markdown(f"## Using Snowflake Warehouse: `{warehouse}`")) | ||
| _execute_sql(conn, f"USE WAREHOUSE {warehouse};") | ||
|
|
||
| # Log the original query in the Metaflow card before tagging them for select.dev | ||
| current.card.append(Markdown("## Querying Snowflake Table")) | ||
| current.card.append(Markdown(f"```sql\n{query}\n```")) |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent card existence checking: In write_audit_publish.py at line 105, the code uses hasattr(current, "card") to check if the card exists before accessing it. However, in this file, current.card is accessed directly without such checks (lines 193, 197, 198, 215, 216). For consistency and to avoid potential AttributeErrors, consider adding similar hasattr(current, "card") checks here as well.
abhishek-pattern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.

No description provided.