-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Created new exception, that provides details of each step of a job ru… #48466
base: main
Are you sure you want to change the base?
Conversation
…n in the logs when a job fails (apache#46923)
…ncelled job status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. I left some minor suggestions for modularization.
By the way, is the traceback linked in the PR copied from the local log files? Just want to confirm that the logging display will render correctly.
raise DbtCloudJobRunException(f"Job run {self.run_id} has failed or has been cancelled.") | ||
# Gather job run details (including run steps), so we can output the logs from each step | ||
run_details = self.hook.get_job_run( | ||
run_id=self.run_id, account_id=self.account_id, include_related=["run_steps"] | ||
).json()["data"] | ||
|
||
raise DbtCloudJobRunDetailsException(self.run_id, run_details) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can modularize this part as _raise_dbt_job_run_detail_exception method.
raise DbtCloudJobRunException(f"Job run {self.run_id} has failed or has been cancelled.") | ||
# Gather job run details (including run steps), so we can output the logs from each step | ||
run_details = self.hook.get_job_run( | ||
run_id=self.run_id, account_id=self.account_id, include_related=["run_steps"] | ||
).json()["data"] | ||
|
||
raise DbtCloudJobRunDetailsException(self.run_id, run_details) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the method can be reused here and several parts below.
# For extracting only a subset of the higher level job details | ||
# id is the run_id | ||
run_data_keys = [ | ||
"id", | ||
"job_id", | ||
"is_error", | ||
"dbt_version", | ||
"finished_at", | ||
"finished_at_humanized", | ||
"run_duration", | ||
"run_duration_humanized", | ||
] | ||
# For extracting only a subset of the run step details | ||
# id is the step id | ||
run_steps_keys = [ | ||
"run_id", | ||
"id", | ||
"index", | ||
"name", | ||
"status", | ||
"status_humanized", | ||
"duration", | ||
"duration_humanized", | ||
"logs", | ||
"debug_logs", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: How about make these list for filtering keys as properties of the exception class?
@jason810496 Thank you for the review. I'm AFK until next Tuesday. I will address your comments then :) |
closes: #46923
Created new exception
DbtCloudJobRunDetailsException
, which surfaces the job run logs from dbt Cloud, whenever a job run fails or is cancelled.Side-Note: I did notice that when the operator is
deferred
, it finishes pre-maturely (i.e. the trigger fires an event before the job is in a terminal status). This wasn't an issue with the operator code in branchv2-10-stable
, but is a current issue inmain
. I can try to sneak the change in here, or open a new issue to resolve.Current Task Log Output
... File "/opt/airflow/airflow/providers/dbt/cloud/operators/dbt.py", line 182, in execute raise DbtCloudJobRunException(f"Job run {self.run_id} has failed or has been cancelled.") airflow.providers.dbt.cloud.hooks.dbt.DbtCloudJobRunException: Job run 70471831332476 has failed or has been cancelled.
New Task Log Output
EDIT: Used chatgpt to format the above traceback: