Best practices around versioning (or not) DAGs?

Does anyone have thoughts on how to best maintain DAG release and versioning?

This Stackoverflow post looks like a reasonable approach, but it involves bumping the version of the DAG manually by updating the dag_id in the DAG file. If you version-control the code using git or something similar, this approach may lead to inconsistencies between git versions and DAG versions.
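For illustration, a minimal sketch of that dag_id-versioning approach (the names, version string, and helper are hypothetical, not from the post):

```python
# Hypothetical sketch: embed a release version in the dag_id, so each
# deploy registers as a brand-new DAG instead of mutating the old one.
DAG_VERSION = "1.2.0"  # assumption: bumped manually (or by a build tool) per release

def versioned_dag_id(base_name: str, version: str = DAG_VERSION) -> str:
    """Build a DAG id that encodes its release version."""
    return f"{base_name}_v{version.replace('.', '_')}"

# In a real DAG file you would then write something like:
#   with DAG(dag_id=versioned_dag_id("my_pipeline"), ...) as dag:
#       ...
print(versioned_dag_id("my_pipeline"))  # my_pipeline_v1_2_0
```

The downside is exactly the one described above: the version lives in the code itself, so it can drift from whatever git tag the file was deployed from.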

Keeping DAGs version-less is a good option, but we’ve noticed that deploying a DAG file to our system’s airflow/dags/ folder will impact both the contents and the shape of currently running DAGs. In an ideal world, currently running DAGs would not be modified once they started.

As this likely affects most of us in the community, would love to hear any thoughts, opinions or best practices you’ve learned.

Thank you!


We follow the approach suggested on Stackoverflow, using Maven and the resources plugin to replace the ${project.version} property in the dag_id parameter of the DAG code before packaging the DAGs into zip files, and it is working really well.
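A small sketch of what that build step does, simulated in plain Python (the file name and version are made up; in the real setup the Maven resources plugin performs the substitution at package time):

```python
# The DAG file as committed contains a Maven property placeholder inside
# a string literal, so it is still valid Python before filtering:
TEMPLATE = 'dag_id = "my_pipeline_${project.version}"'

def filter_properties(text: str, version: str) -> str:
    # Maven resource filtering replaces ${project.version} with the
    # project's version string when the zip is built.
    return text.replace("${project.version}", version)

print(filter_properties(TEMPLATE, "1.4.2"))
# dag_id = "my_pipeline_1.4.2"
```

Because the version comes from the build, the deployed dag_id always matches the packaged release, avoiding the manual-bump drift mentioned above.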

However, we face another problem when using Python packages. Since the packages have the same names across different versions of the DAGs, they conflict, and only the first loaded package is used for every version of the DAG. If you have a solution for that, I would be grateful.
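My assumption is that this is ordinary Python module caching rather than anything Airflow-specific; a minimal illustration (using a stdlib module as a stand-in for a shared package name):

```python
# Python caches imported modules by name in sys.modules, so two DAG zips
# that both ship a package with the same name cannot coexist in one
# interpreter -- whichever is imported first wins for everybody.
import sys

sys.modules.pop("json", None)  # ensure a clean first import for the demo
import json                    # first import: loaded and cached

cached = sys.modules["json"]
import json as json_again      # "second" import: served from the cache

print(cached is json_again)  # True -- the first loaded module is reused
```

If that is the cause, the usual workarounds are renaming the package per version or isolating versions in separate worker environments, though I have not verified either against this exact setup.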

I think it depends entirely on your situation. I have worked in a scenario where we didn’t version, but we were able to deploy changes when DAGs weren’t running. We’d push updates to the same DAG daily with no issues.

Versioning can make sense in some scenarios, but there are things to keep in mind. If you deploy a new version, you will want to set the start_date of the new version and the end_date of the prior version so that both versions don’t run at the same time.
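A hedged sketch of that cut-over (the dag_ids and dates are illustrative only): the old version stops scheduling before the new version begins, so no execution date is covered twice.

```python
# Illustrative cut-over between two versions of the same pipeline.
from datetime import datetime

# In the old DAG file (e.g. dag_id="my_pipeline_v1"):
v1_args = {
    "start_date": datetime(2019, 1, 1),
    "end_date": datetime(2019, 5, 31),  # last date v1 may schedule
}

# In the new DAG file (e.g. dag_id="my_pipeline_v2"):
v2_args = {
    "start_date": datetime(2019, 6, 1),  # v2 picks up where v1 stopped
}

# The scheduler only creates runs between start_date and end_date, so as
# long as the windows don't overlap, the versions never double-run.
print(v1_args["end_date"] < v2_args["start_date"])  # True
```

Note that Airflow treats end_date as inclusive, so leaving a gap between v1's end_date and v2's start_date (as above) is the safe choice.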

There has been some discussion among Airflow committers about improving the way Airflow handles all of this, but if it changes, it won’t be until 2.0.0, which is probably slated for the end of 2019.