Explore and add static checks for DAGs for early detection of common issues #43176

omkar-foss · 2024-10-18T20:03:00Z

Description

As per users' feedback in the Airflow Debugging Survey 2024, 48.3% of respondents chose early issue detection during execution as one of their top 2 choices.

Use case/motivation

Goal of this issue:

Enhance early detection of DAG issues to minimize dev time
Some sort of static analysis similar to Ruff checks for DAGs
Runtime analysis of DAGs if feasible can also be done to save dev time

Related issues

Parent Issue: #40975

Are you willing to submit a PR?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct

omkar-foss · 2024-10-29T12:29:23Z

In the context of this issue, DAG linting as mentioned by this article (shared by @potiuk on slack) can also be explored:

https://medium.com/@snir.isl/mastering-airflow-dag-standardization-with-pythons-ast-a-deep-dive-into-linting-at-scale-1396771a9b90

Inspired by above article, may be we can have commands like airflow dag lint <dagname>.py for checking common DAGs issues and airflow dag grade <dagname>.py for grading DAGs based on their code quality.

potiuk · 2024-10-29T19:54:07Z

Love that idea. We could even had some way of checking for "best practices" - like not using DB while parsing etc. This might also be then used as part of the upgrade-check mechanism that we are planning for Airflow 2-> 3 migration - see #41641 cc: @Lee-W

Lee-W · 2024-10-30T01:11:02Z

I feel it's a bit different. 🤔 But for not using DB while parsing, that's something we should check in the upgrade check. But the "best practices" thing would probably be something else. Probably integrating with ruff or building our own linter would be better.

Dev-iL · 2025-01-05T09:26:49Z

Good to see new airflow-specific Ruff rules being added (AIR3##). Nice work @Lee-W!

zach-overflow · 2025-02-11T19:48:10Z

This is definitely something I'd find immensely useful. In addition to the article linked in the earlier comments, there are some decent prior attempts at doing something similar with static DAG code checks (see links below); however, I think the major benefit for making this feature native to Airflow is that it can rely directly on the built-in DAG parsing logic.

Some of the examples I was referring to:

pylint-airflow: which is not maintained, but defines some reasonable lint checks as custom pylint checkers.
airflint: this might be useful as a reference, but it is worth noting that this is an auto-formatter rather than a simple validation tool.

Lee-W · 2025-02-12T11:29:24Z

We could probably start discussing what the best practice is after Airflow 3.0 is released (best practice now might not reflect the latest best practice we want then)

omkar-foss · 2025-02-14T06:18:24Z

We could probably start discussing what the best practice is after Airflow 3.0 is released (best practice now might not reflect the latest best practice we want then)

Yes that makes sense, let's add support for best practices linting for Airflow 3.x onwards. I just took a quick opinion and got a good concensus for 3.x onwards on slack.

omkar-foss · 2025-02-14T06:25:03Z

@Lee-W would you be interested in picking up this issue? Because I suppose you have good context on Airflow static checks by adding Ruff rules for it, among other things.

Lee-W · 2025-02-14T08:49:46Z

@Lee-W would you be interested in picking up this issue? Because I suppose you have good context on Airflow static checks by adding Ruff rules for it, among other things.

Yep, implementing them is interesting, and I would definitely be interested in adding more rules haha

Lee-W · 2025-02-14T10:28:36Z

I'll probably focus most on implementing rules, but will need everyone's suggestion on what the best practices we want 🙂

Dev-iL · 2025-02-14T13:06:42Z

Perhaps we should start a shared doc to raise ideas?

Personally, I'd love some checks related to XCOMs, and in particular in the context of task groups, to ensure they're unpacked/deserialized properly. Also, if possible, a check that custom classes returned by dags are included in the allowed_deserialization_classes.

Lee-W · 2025-02-15T02:09:18Z

we could probably just discuss here and after we have some rough idea, we can organize them to something like #41641

Dev-iL · 2025-02-19T06:18:03Z

Idea:
If a task.branch has at least two returns, and exactly one of them is not [], suggest that the user replaces it with task.short_circuit, which is simpler.

Dev-iL · 2025-02-19T07:53:58Z

Idea:
When a task/operator has a templated input that the user specifies as a string, and it contains a template to pull the xcom of exactly one task, suggest replacing it with an object representation of the upstream output. An example with operators:

# Before:
some_task = SomeOperator(task_id="some_task", ...)
some_other_task = SomeOtherOperator(
    templated_field="{{ ti.xcom_pull(task_ids='some_task') }}",
)

# After:
some_task = SomeOperator(...)  # Note the task id no longer has to be specified explicitly.
some_other_task = SomeOtherOperator(
    templated_field=some_task.output,
)

...and equivalently for tasks.

Dev-iL · 2025-02-19T08:19:48Z

Some ideas from this blog post:

The DAG must not have top-level calls to expensive classes/services.
The DAG object should be instantiated using a context manager.
The DAG file name should match its dag_id.

zach-overflow · 2025-02-19T22:37:44Z

Some additional best practices rule suggestions:

All dag_id's should be unique (AKA prevent name collisions and detect them at validation time).
+1 to @Dev-iL's suggestion on top-level code, specifically I can think of two preliminary checks for that category:
1. Ensure function-level imports are used in DAG files when possible (again, an example of this kind of rule may be found in the airflint repo here)
2. Require Variables passed to operator constructors are only passed as Jinja templates (per the official best practice section here).
Lastly -- and this might be a larger scope suggestion -- it would be really useful if the Airflow linter could validate the user's DAG code against any user-defined task policy or dag policy (as defined in their cluster policy).

zach-overflow · 2025-02-20T17:55:05Z

One other lint check idea came to mind:

Optionally check whether DAG definitions are serializable or not

Dev-iL · 2025-02-23T08:00:48Z

@zach-overflow

* All `dag_id`'s should be unique (AKA prevent name collisions and detect them at validation time).

I'm not sure Ruff supports multi-file checks, so I suspect this might only be applicable to situations where multiple DAGs are defined in a single script. IIRC this can be detected by a DagBag unit test.

* Lastly -- and this might be a larger scope suggestion -- it would be _really_ useful if the Airflow linter could validate the user's DAG code against any user-defined task policy or dag policy (as defined in their [cluster policy](https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/cluster-policies.html#cluster-policies)).

I think this too belongs in the realm of checks to be added to DagBag and/or unit tests.

* Optionally check whether DAG definitions are serializable or not

Could you please elaborate (mainly out of curiosity):

What common pattern might make a DAG definition non-serializable?
In what contexts does this matter?
Is this supposed to be a static or a DagBag-related test? If static - how can this be done without trying to serialize the DAG and seeing if it works?

All in all I these are some good ideas to be included in DagBag, some variation of DagBag that's used for DAG validation, or at the very least - unit test examples to be added to the docs.

snirisl · 2025-02-26T08:17:32Z

Some ideas from this blog post:

The DAG must not have top-level calls to expensive classes/services.

The DAG object should be instantiated using a context manager.

The DAG file name should match its dag_id.

Thanks @Dev-iL for referencing my blog post 🙏 Another thing we can add is consider implementing a rule, adding to the third rule above, that dag id should be all lower case characters. Since DAG IDs should match the file names, enforcing this one is making sure DAG files to be compatible with the Python convention for file names.

Dev-iL · 2025-03-03T17:21:01Z

Check that paths to temporary files/folders created in a task aren't returned (in XCOM) - since it cannot be guaranteed that downstream tasks will run on the same worker, and therefore will have access to the temporary path.

zach-overflow · 2025-03-07T11:25:05Z

@Dev-iL Thank you for pointing me towards the DagBag unit testing page, I wasn't aware that was an available approach. You're correct that the DagBag unit testing would cover unique DAG ID enforcement.

As for checking against user-defined cluster policies, I think the DagBag unit testing could cover that, but the cluster policy docs are not clear if that is evaluated during the DagBag loading, or during the actual DAG/task scheduling.

TL;DR: I think the most broadly useful static check would be the checks for problematic top-level code (e.g. network calls, variable.get() calls, expensive top-level imports, etc).

What common pattern might make a DAG definition non-serializable?

After my previous post, I leared that Airflow enforces DAG serialization, so I suppose this is less of a concern unless users are relying on some extended/custom serialization functionality.

In what contexts does this matter?

Likely not many except for some non-standard serialization, so I think this is less of a concern for me now.

Is this supposed to be a static or a DagBag-related test? If static - how can this be done without trying to serialize the DAG and seeing if it works?

I guess the best way would to simply run static unit tests against any custom serialization methods

omkar-foss added kind:feature Feature Requests needs-triage label for new issues that we didn't triage yet labels Oct 18, 2024

omkar-foss added this to Debugging Improvements - Airflow 3 Oct 18, 2024

omkar-foss moved this to Todo in Debugging Improvements - Airflow 3 Oct 18, 2024

dosubot bot added the area:DAG-processing label Oct 18, 2024

omkar-foss changed the title ~~Explore and add static checks for DAGs for early detect of common issues~~ Explore and add static checks for DAGs for early detection of common issues Oct 18, 2024

nathadfield removed the needs-triage label for new issues that we didn't triage yet label Oct 23, 2024

omkar-foss moved this from Planning to Todo in Debugging Improvements - Airflow 3 Feb 14, 2025

omkar-foss assigned Lee-W Feb 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Explore and add static checks for DAGs for early detection of common issues #43176

Explore and add static checks for DAGs for early detection of common issues #43176

omkar-foss commented Oct 18, 2024

omkar-foss commented Oct 29, 2024

potiuk commented Oct 29, 2024

Lee-W commented Oct 30, 2024

Dev-iL commented Jan 5, 2025

zach-overflow commented Feb 11, 2025

Lee-W commented Feb 12, 2025

omkar-foss commented Feb 14, 2025 •

edited

Loading

omkar-foss commented Feb 14, 2025

Lee-W commented Feb 14, 2025

Lee-W commented Feb 14, 2025

Dev-iL commented Feb 14, 2025

Lee-W commented Feb 15, 2025

Dev-iL commented Feb 19, 2025

Dev-iL commented Feb 19, 2025 •

edited

Loading

Dev-iL commented Feb 19, 2025

zach-overflow commented Feb 19, 2025

zach-overflow commented Feb 20, 2025

Dev-iL commented Feb 23, 2025

snirisl commented Feb 26, 2025

Dev-iL commented Mar 3, 2025

zach-overflow commented Mar 7, 2025

Explore and add static checks for DAGs for early detection of common issues #43176

Explore and add static checks for DAGs for early detection of common issues #43176

Comments

omkar-foss commented Oct 18, 2024

Description

Use case/motivation

Related issues

Are you willing to submit a PR?

Code of Conduct

omkar-foss commented Oct 29, 2024

potiuk commented Oct 29, 2024

Lee-W commented Oct 30, 2024

Dev-iL commented Jan 5, 2025

zach-overflow commented Feb 11, 2025

Lee-W commented Feb 12, 2025

omkar-foss commented Feb 14, 2025 • edited Loading

omkar-foss commented Feb 14, 2025

Lee-W commented Feb 14, 2025

Lee-W commented Feb 14, 2025

Dev-iL commented Feb 14, 2025

Lee-W commented Feb 15, 2025

Dev-iL commented Feb 19, 2025

Dev-iL commented Feb 19, 2025 • edited Loading

Dev-iL commented Feb 19, 2025

zach-overflow commented Feb 19, 2025

zach-overflow commented Feb 20, 2025

Dev-iL commented Feb 23, 2025

snirisl commented Feb 26, 2025

Dev-iL commented Mar 3, 2025

zach-overflow commented Mar 7, 2025

omkar-foss commented Feb 14, 2025 •

edited

Loading

Dev-iL commented Feb 19, 2025 •

edited

Loading