Has anyone set up their Airflow ETL dags to run DBT transforms? The challenge I see is that there are now two levels of DAGs, the airflow dag and the dbt dag.
I think the naive approach would be to have the airflow dag consist of just one task: BashOperator(dbt run). But this seems like it would have several drawbacks:
- There would be no visual DAG representation of the DBT tasks that are running/completed
- The only way to see the progress of the DBT run would be tail the airflow logs for that task
- Parallelism of the dbt dag could potentially be impacted by running on a single task instance. I would think --threads would alleviate this, but I haven’t tested yet how well that works with airflow task instances.
There’s also the option of airflow calling dbt-cloud-plugin. Has anyone found success with that method and prefers it?