Celery Executor & tasks sequencing

Hi,

I am using airflow with celery executor. As an example, I have a dag with 3 tasks.
task1 >> task2 >> task3

When my worker nodes = 1, I see that the tasks execute fine and in the right sequence

However, when I increase the worker nodes to 2, I see that task1 is picked by worker node 1 and task2 by worker node 2. In my use case, task2 must execute only after task1. I believed that the celery executor will understand this. My dag fails because of this

Can you please me understand what’s wrong and how do I fix this issue?

Thanks
Rathi

If you want your tasks to be executed by the same Celery worker, you will need to set queues them.

Take a look at the Airflow documentation for Celery Queues.


Though I would also consider writing your task to be not reliant on executing on a specific host due to the result of the previous tasks. If the previous task saves a file on the worker and the next task needs to inspect it, my suggestion is either:

  • combine those tasks
  • upload at the end of task 1 and download at the beginning of task 2