Tasks stop working after 5 minutes

Hey guys,

No matter what I do, after a few minutes nothing works anymore. At first everything starts as usual, but then no more tasks start.
I have already re-deployed via the CLI and changed the value of AIRFLOW__SCHEDULER__RUN_DURATION. Unfortunately, nothing helps.
All tasks are stuck.

Can you help me with this?

Cheers, Johannes

If this is on Astronomer Cloud and still an issue, send a support ticket to support@astronomer.io and we can investigate.

Hey, unfortunately, it looks like the problem is back.

For a while, things were good in my Astronomer deployment. Now that the number of DAGs (dynamically created) is growing, I seem to be having the same problem again. The scheduler is stuck, and the logs constantly show the message “{scheduler_job.py:214} WARNING - Killing PID 5866”. I have already increased the scheduler to 20 AU and also set “AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT” to 60, but it doesn't seem to help.
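One way to check whether DAG file imports are getting close to that 60-second timeout is to time them outside the scheduler. This is only a rough sketch: the function names `time_dag_file_import` and `slow_files` are made up for illustration, it assumes the DAG files can be imported in a plain Python process (with Airflow installed in that environment), and it measures import time only, not anything else the scheduler does.

```python
import importlib.util
import time
from pathlib import Path


def time_dag_file_import(path):
    """Import a single Python file and return how many seconds that took.

    Hypothetical helper: importing the file executes its top-level code,
    which is where dynamically generated DAGs get built.
    """
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    start = time.perf_counter()
    spec.loader.exec_module(module)
    return time.perf_counter() - start


def slow_files(folder, limit_seconds):
    """Return {file name: seconds} for every *.py file slower than the limit."""
    results = {}
    for dag_file in sorted(Path(folder).glob("*.py")):
        elapsed = time_dag_file_import(dag_file)
        if elapsed > limit_seconds:
            results[dag_file.name] = elapsed
    return results
```

Calling `slow_files("dags", 30)` (the folder name is an assumption) would list the files that take more than half of the configured timeout to import, which gives a hint about which generator file to look at first.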

I really hope you can help me!


Sorry to hear this. Have you filed a support issue with support@astronomer.io? We can dig in and see what’s going on.

Thanks for your answer. I haven’t filed an issue yet. But I hope I have already solved the problem.
My deployment was still running on 1.10.5 and I use some Google BigQuery operators. In this version there was logging inside the operators' __init__ function. This seems to have been the problem.
I have now updated to version 1.10.7 and everything works again.
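A rough sketch of why logging in the constructor can add up with many dynamically created DAGs: operators are instantiated every time a DAG file is parsed, so anything in the constructor runs on each parse, not only when a task runs. The classes below are plain-Python stand-ins I made up for illustration (no Airflow import, and not the actual BigQuery operator code).

```python
import logging

log = logging.getLogger("dag_parse_demo")


class ChattyOperator:
    """Stand-in for an operator that logs inside __init__."""

    def __init__(self, task_id):
        self.task_id = task_id
        log.info("Setting up %s", task_id)  # runs on every parse

    def execute(self):
        return self.task_id


class QuietOperator:
    """Stand-in for an operator that only logs when the task runs."""

    def __init__(self, task_id):
        self.task_id = task_id  # cheap: just stores arguments

    def execute(self):
        log.info("Running %s", self.task_id)
        return self.task_id


class CountingHandler(logging.Handler):
    """Counts log records instead of printing them."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def emit(self, record):
        self.count += 1


def build_dags(operator_cls, n_dags, tasks_per_dag):
    """Mimic one parse of a file that generates DAGs dynamically."""
    return {
        f"dag_{d}": [operator_cls(f"dag_{d}.task_{t}") for t in range(tasks_per_dag)]
        for d in range(n_dags)
    }


def count_parse_log_records(operator_cls, n_dags, tasks_per_dag):
    """Return how many log records a single parse produces."""
    handler = CountingHandler()
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    try:
        build_dags(operator_cls, n_dags, tasks_per_dag)
    finally:
        log.removeHandler(handler)
    return handler.count
```

With 10 DAGs of 5 tasks each, one parse with `ChattyOperator` emits 50 log records and one with `QuietOperator` emits none. Since the scheduler re-parses DAG files continuously, the per-parse cost is multiplied again.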

Cheers :slight_smile:

I had a similar issue on Airflow 1.10 on top of Kubernetes.

Restarting all the management and worker nodes solved the issue. They had been running for a year without a reboot. It seems regular maintenance reboots of all Kubernetes nodes are needed to prevent such issues.