I’m having issues scaling Airflow beyond 700 task instances using the LocalExecutor and MySQL. The PIDs are getting killed with no other message. I’m now trying the DaskExecutor, and everything seems to run without errors, but now the DAG runs aren’t being scheduled. A run just sits there after being triggered.
I followed this:
I also had to set the queue to None on the DAG’s tasks and run the scheduler with the --do_pickle option.
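For context, switching executors is just config. This is roughly what my airflow.cfg looks like (the scheduler address below is an example value, not my real one):

```ini
[core]
executor = DaskExecutor

[dask]
# Address of an already-running Dask scheduler (example value).
cluster_address = 127.0.0.1:8786
```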
Our team at Astronomer doesn’t have much, if any, experience with the DaskExecutor or LSFCluster.
Have you considered using the CeleryExecutor with Kubernetes and KEDA? We have a lot of experience scaling that setup, and the Celery workers can scale to zero.
Also, Postgres with PgBouncer is a much better database setup than MySQL. With Airflow 2.0 the difference may become even more pronounced, as Postgres has some features that the scheduler upgrades will take advantage of.
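Pointing Airflow at PgBouncer instead of at Postgres directly is just a connection-string change; a sketch (host, port, and database names here are placeholders for whatever your deployment uses):

```ini
[core]
# Route the metadata DB connection through PgBouncer so the scheduler
# and workers share a pooled set of Postgres connections.
# (Host "pgbouncer", port 6543, and the user/db names are examples.)
sql_alchemy_conn = postgresql+psycopg2://airflow:<password>@pgbouncer:6543/airflow
```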
If you use the KEDA option, the number of Celery workers will autoscale depending on how many tasks are waiting for work. If you’d like, we could jump on a call to demo this for you.
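The scaling logic is roughly the following sketch: KEDA polls the metadata DB for running and queued task counts and sizes the worker deployment from them. The worker_concurrency of 16 is Airflow's default, and the exact query our setup uses may differ slightly:

```python
import math


def desired_workers(running: int, queued: int, worker_concurrency: int = 16) -> int:
    """Number of Celery workers needed for the current task load.

    KEDA scales the worker deployment to roughly
    ceil((running + queued) / worker_concurrency), going all the
    way down to zero when nothing is waiting for work.
    """
    return math.ceil((running + queued) / worker_concurrency)


print(desired_workers(0, 0))    # no load: scale to zero -> 0
print(desired_workers(10, 30))  # 40 tasks / 16 per worker -> 3
```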