When should I use the Kubernetes Executor over the Celery and Local Executors?

For which cases should I use the Kubernetes Executor over the Celery and Local Executors?

Out of the box, Astronomer supports the Local, Celery, and Kubernetes executors.

Local:

  • With the Local executor, everything runs in the same pod as the scheduler (a “single box” approach).
  • Because of that, the Local executor is not very resource intensive and is a great fit for dev environments and other lightly used environments.

Celery:

  • Using the Celery executor, you can run dedicated worker pods for your tasks.
  • You can add or remove worker pods, and you can modify the resources allotted to each one.
  • Within a given deployment, every worker on Astronomer is identically sized.
  • Deploys are also handled gracefully: when you push code on the Celery executor, running tasks continue until the worker termination grace period expires (after which they are marked as zombies).
  • The Celery executor also gives your worker pods access to ephemeral storage.
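For example, Celery workers can be split across queues, and a task can be routed to a specific queue with the standard queue argument available on any operator. Below is a minimal sketch; the high_memory queue name is an assumption, and a worker has to be started against that queue (airflow worker -q high_memory on Airflow 1.10) for the task to be picked up.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    "celery_queue_example",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
)

# Runs on whichever worker listens on the default queue.
light_task = BashOperator(
    task_id="light_task",
    bash_command="echo 'default queue'",
    dag=dag,
)

# Routed only to workers listening on the (hypothetical) high_memory queue.
heavy_task = BashOperator(
    task_id="heavy_task",
    bash_command="echo 'high-memory queue'",
    queue="high_memory",
    dag=dag,
)

light_task >> heavy_task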

Kubernetes:

  • Each task on the Kubernetes executor gets its own pod, which allows you to pass an executor_config in your task params and assign resources at the task level:
from airflow.operators.python_operator import PythonOperator

# Sample config: note that CPU is measured in cores (or millicores,
# e.g. "500m"); only memory takes the Gi suffix.
test_config = {
    "KubernetesExecutor": {
        "request_memory": "8Gi",
        "limit_memory": "8Gi",
        "request_cpu": "10",
        "limit_cpu": "10",
    }
}
...

# Pass the config into the task
run_compute = PythonOperator(
    task_id='run_model',
    provide_context=True,
    executor_config=test_config,
    python_callable=h.jira_functions.jira_completed_tickets,
)


  • Since each task runs in its own pod, tasks are managed independently of code deploys. This is great for longer-running tasks, as users can push new code without fear of interrupting them.
  • However, because each task is its own pod, tasks may take a little time to start (the pod has to be scheduled and, possibly, the image pulled).
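As a side note, the dictionary form of executor_config shown above is the Airflow 1.10 style. If you are on Airflow 2.x, the same per-task resource override is expressed through a pod_override built from the kubernetes client models; a sketch, assuming the kubernetes Python package is installed:

from kubernetes.client import models as k8s

# Airflow 2.x style: override the task pod's resources via pod_override.
# "base" is the name of the main task container in the pod template.
test_config = {
    "pod_override": k8s.V1Pod(
        spec=k8s.V1PodSpec(
            containers=[
                k8s.V1Container(
                    name="base",
                    resources=k8s.V1ResourceRequirements(
                        requests={"cpu": "1", "memory": "8Gi"},
                        limits={"cpu": "1", "memory": "8Gi"},
                    ),
                )
            ]
        )
    )
}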

In summary, the Celery executor is a great fit for any environment where the tasks are “similar” enough that a single worker configuration fits all of them, or for tasks that need to start quickly (since the workers are “always on”).

The Kubernetes executor is great for DAGs whose tasks have very different requirements (e.g., the first task may be a sensor that needs only minimal resources, while the downstream tasks have to run on your GPU node pool with a much higher CPU request). It’s also great for environments with long-running tasks where users push code while jobs are running (since task pods survive deploys, there is no grace period to worry about).
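As a concrete sketch of that pattern, here is a DAG with one tiny task and one heavy one. The task names, resource numbers, and the pool=gpu node label are all assumptions for illustration, using the 1.10-style dict config:

from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

dag = DAG(
    "mixed_resources",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
)

# Lightweight upstream check: runs in a tiny pod.
wait_for_file = PythonOperator(
    task_id="wait_for_file",
    python_callable=lambda: True,  # stand-in for a real check
    executor_config={
        "KubernetesExecutor": {
            "request_memory": "256Mi",
            "request_cpu": "250m",
        }
    },
    dag=dag,
)

# Heavy downstream task: large pod pinned to a (hypothetical) GPU node pool.
train_model = PythonOperator(
    task_id="train_model",
    python_callable=lambda: None,  # stand-in for the real training code
    executor_config={
        "KubernetesExecutor": {
            "request_memory": "16Gi",
            "limit_memory": "16Gi",
            "request_cpu": "4",
            "limit_cpu": "4",
            "node_selectors": {"pool": "gpu"},  # label is an assumption
        }
    },
    dag=dag,
)

wait_for_file >> train_model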

In the near future, Astronomer will have an option for KEDA autoscaling on Celery, combining many of the best features of the Kubernetes and Celery executors.


Hi virajparekh,
do you already know when KEDA will be available to Astronomer users? I would really like to use this feature, because I need several workers to process my Airflow tasks at night, while during the day the workers have little to do.

Hi @Jonnyblacklabel - we are doing some final testing around it. Which customer are you with? We can make sure to keep you in the loop around beta testing and timelines.

Hi virajparekh, thanks for your answer. I’m with get:traction.
I registered here in the forum with my private GitHub account.


Hey @virajparekh,
I just wanted to ask if there’s been any news on the KEDA Operator?
Thank you 🙂