Running dbt DAGs with Airflow

Has anyone set up their Airflow ETL DAGs to run dbt transforms? The challenge I see is that there are now two levels of DAGs: the Airflow DAG and the dbt DAG.

I think the naive approach would be to have the Airflow DAG consist of just one task: BashOperator(dbt run). But this seems like it would have several drawbacks:

  1. There would be no visual DAG representation of the dbt tasks that are running/completed.
  2. The only way to see the progress of the dbt run would be to tail the Airflow logs for that task.
  3. Parallelism of the dbt DAG could potentially be limited by running in a single task instance. I would think --threads would alleviate this, but I haven’t tested yet how well that works with Airflow task instances.
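One way around the first two drawbacks is to give each dbt model its own Airflow task, using the dependency graph dbt writes to target/manifest.json. The following is only a rough sketch of that idea, not a tested setup: the function names are made up for illustration, it assumes the manifest has already been generated (for example by dbt compile), and it assumes a dbt version that accepts --select (older releases used --models).

```python
import json
from pathlib import Path


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Read the manifest dbt writes into the project's target/ directory."""
    return json.loads(Path(path).read_text())


def plan_model_tasks(manifest: dict) -> tuple[dict[str, str], list[tuple[str, str]]]:
    """Return ({model_id: bash_command}, [(upstream_id, downstream_id), ...]).

    Hypothetical helper: one command per dbt model, plus the model-to-model
    edges needed to wire up Airflow task dependencies.
    """
    # Keep only models; seeds, tests and snapshots also live under "nodes".
    models = {
        node_id: node
        for node_id, node in manifest["nodes"].items()
        if node.get("resource_type") == "model"
    }
    # Each Airflow task would run exactly one model.
    commands = {
        node_id: f"dbt run --select {node['name']}"
        for node_id, node in models.items()
    }
    # Parents that are not models (e.g. sources) are skipped.
    edges = [
        (parent_id, node_id)
        for node_id, node in models.items()
        for parent_id in node.get("depends_on", {}).get("nodes", [])
        if parent_id in models
    ]
    return commands, edges
```

Each command could then become the bash_command of a BashOperator, with each (upstream, downstream) pair set as a task dependency. The trade-off is that every task pays dbt's startup cost, so this may be slower overall than a single dbt run with --threads.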

There’s also the option of airflow calling dbt-cloud-plugin. Has anyone found success with that method and prefers it?

I have implemented a DAG that contains one task each for dbt seed, dbt run, and dbt test. Each is a BashOperator. As far as I can tell, parallelism is not much of an issue unless you want multiple dbt run commands running simultaneously. However, I’ve never actually implemented this in Astronomer: I usually set up the server with two venvs, one for Airflow and one for dbt, and the BashOperator activates the dbt venv and then runs dbt in that environment.
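For anyone who wants to try the same layout, here is a minimal sketch of it, assuming Airflow 2.x (for the airflow.operators.bash import path and schedule_interval). The venv and project paths, the DAG id, and the helper names are placeholders I made up, not anything from a real deployment.

```python
import shlex
from datetime import datetime

# Order matters: seed the reference data, build the models, then test them.
DBT_STEPS = ("seed", "run", "test")


def dbt_command(step: str, venv_dir: str, project_dir: str) -> str:
    """Build a bash command that activates the dbt venv and runs one dbt step."""
    activate = shlex.quote(f"{venv_dir}/bin/activate")
    return f"source {activate} && cd {shlex.quote(project_dir)} && dbt {step}"


def build_dag(venv_dir: str = "/opt/venvs/dbt",            # hypothetical path
              project_dir: str = "/opt/dbt/my_project"):   # hypothetical path
    """Create a three-task DAG: dbt_seed >> dbt_run >> dbt_test."""
    # Imported lazily so dbt_command can be used without Airflow installed.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="dbt_seed_run_test",      # placeholder name
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,          # trigger manually; adjust as needed
        catchup=False,
    ) as dag:
        previous = None
        for step in DBT_STEPS:
            task = BashOperator(
                task_id=f"dbt_{step}",
                bash_command=dbt_command(step, venv_dir, project_dir),
            )
            if previous is not None:
                previous >> task  # run the steps strictly in sequence
            previous = task
    return dag
```

In the DAG file itself you would assign the result at module level (dag = build_dag()) so the scheduler can discover it. Keeping dbt in its own venv avoids dependency conflicts between the Airflow and dbt packages.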

The dbt cloud plugin seems to trigger runs in dbt Cloud, which seems unnecessary to me if you can just run dbt locally.

I’m new to Astronomer, and I’d be very interested in a good solution for running dbt on Astronomer!
