We generally recommend using an ExternalTaskSensor in this scenario. It lets a first task in the second DAG wait for the Redshift load in the first DAG to finish, and only then trigger that DAG's downstream tasks.
A few notes:
- Both DAGs must be on the same schedule
- The execution date of the sensor's DAG run MUST match the execution date of the task it's sensing; that's why the two schedules have to match
- Consider upgrading to 1.10.2 (details below)
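A minimal sketch of the setup above, assuming a hypothetical upstream DAG `redshift_load_dag` whose final task is `load_to_redshift` (both names are placeholders, as are the dates and schedule), using the Airflow 1.10.x import paths:

```python
# Sketch only: the upstream DAG id, task id, dates, and schedule are assumptions.
# Both DAGs must share the same schedule_interval so execution dates line up.
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

dag = DAG(
    dag_id="downstream_dag",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",  # must match the upstream DAG's schedule
)

# Waits until the run of "load_to_redshift" with the SAME execution date
# as this DAG run has succeeded in "redshift_load_dag".
wait_for_load = ExternalTaskSensor(
    task_id="wait_for_redshift_load",
    external_dag_id="redshift_load_dag",  # hypothetical upstream DAG id
    external_task_id="load_to_redshift",  # hypothetical upstream task id
    dag=dag,
)

process = DummyOperator(task_id="process_loaded_data", dag=dag)

wait_for_load >> process
```

If the two schedules can't be identical, `execution_delta` (or `execution_date_fn`) on the sensor can offset which upstream run it looks at, but keeping the schedules aligned is the simpler path.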
Sensors in Airflow 1.10.2
Sensors occupy a worker slot for their entire runtime, which has historically led to a "deadlock": if running sensors fill all of your worker slots, nothing else can be picked up for execution. Airflow 1.10.2 adds a "reschedule" mode (https://github.com/apache/airflow/blob/1.10.2/airflow/sensors/base_sensor_operator.py#L46-L56) that addresses this, so the upgrade may be worth it.
In reschedule mode, a sensor whose condition isn't met yet is put into a new
up_for_reschedule state and releases its worker slot until the next poke, so sensors can no longer starve out other tasks. If you can upgrade to 1.10.2, all the better.
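Enabling it is a one-argument change on the sensor. A sketch, reusing the same hypothetical upstream DAG and task names as before:

```python
# Sketch only: requires Airflow >= 1.10.2; DAG/task ids and intervals are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.sensors.external_task_sensor import ExternalTaskSensor

dag = DAG(
    dag_id="downstream_dag",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
)

wait_for_load = ExternalTaskSensor(
    task_id="wait_for_redshift_load",
    external_dag_id="redshift_load_dag",  # hypothetical upstream DAG id
    external_task_id="load_to_redshift",  # hypothetical upstream task id
    mode="reschedule",  # free the worker slot between pokes (up_for_reschedule)
    poke_interval=300,  # re-check every 5 minutes
    dag=dag,
)
```

The default `mode="poke"` keeps the slot for the sensor's whole lifetime; `mode="reschedule"` trades a bit of scheduler overhead per poke for not blocking a slot while waiting.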