Airflow takes longer to execute

I am using Airflow version 2.1.3. My use case is to schedule tasks one after another, and I need an orchestrator to manage them. I pass input to Airflow's DAG through its REST API. My use case requires answers in real time (an external API call), and batch processing must also run daily (a REST API call from one DAG triggering another DAG).
For batch processing, time is not a constraint; it doesn't matter if it takes longer. But for real time, the execution time matters, and it is slow (about 6 seconds for simple Python code).
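For reference, triggering one DAG from another over the stable REST API is a POST to /api/v1/dags/{dag_id}/dagRuns (available since Airflow 2.0). A minimal sketch using only the standard library; the base URL, the batch_processing DAG id, and the basic-auth credentials are placeholders for your own setup:

```python
import base64
import json
import urllib.request

def build_dag_run_request(base_url, dag_id, conf, username, password):
    """Build a POST request that triggers a run of `dag_id`,
    passing `conf` as the DAG run's input."""
    url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
    payload = json.dumps({"conf": conf}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

# Example: trigger a hypothetical batch_processing DAG with some input.
req = build_dag_run_request(
    "http://localhost:8080", "batch_processing",
    {"date": "2021-09-01"}, "airflow", "airflow",
)
# Sending it would be: urllib.request.urlopen(req)
```

The same endpoint serves both cases: an external caller triggers the real-time DAG, and the real-time DAG can trigger the batch DAG.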

I have divided the Airflow DAG into 4 tasks. The individual tasks execute quickly, but the scheduling between them is very slow. When I instead put all the code into one monolithic block, the total time drops to about 1 second. But that doesn't let me take full advantage of Airflow.
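For what it's worth, the monolithic approach doesn't have to mean unstructured code: the four steps can stay as separate Python functions chained inside a single callable, so only one task goes through the scheduler. A minimal sketch, where the four step functions are placeholders for the actual logic:

```python
def fetch(payload):
    """Step 1: placeholder for reading the input passed via the REST API."""
    return payload["value"]

def transform(value):
    """Step 2: placeholder for the actual processing."""
    return value * 2

def validate(value):
    """Step 3: placeholder for sanity checks on the result."""
    assert value >= 0
    return value

def publish(value):
    """Step 4: placeholder for writing the result out."""
    return {"result": value}

def run_pipeline(payload):
    """Single callable for one PythonOperator task: chains all four steps,
    avoiding inter-task scheduling latency while keeping the code modular."""
    return publish(validate(transform(fetch(payload))))

out = run_pipeline({"value": 21})
```

This trades Airflow's per-task retries and visibility for latency, which is exactly the trade-off described above.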

I want to know how I can reduce the overall execution time of the DAG while keeping it divided into tasks. The configuration parameters I am using are:

AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__SCHEDULER__MAX_THREADS: 4
AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC: 1
AIRFLOW__LOGGING__LOGGING_LEVEL: DEBUG
AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 60
AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 60
AIRFLOW__OPERATORS__DEFAULT_CPUS: 2
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'

Hello, the LocalExecutor is not recommended for production. It runs every task on the same machine as the scheduler, so its throughput is limited to that single host (it is the SequentialExecutor, not the LocalExecutor, that cannot run tasks in parallel). For time-sensitive tasks the CeleryExecutor is recommended: it distributes tasks across a pool of dedicated workers and can run many tasks in parallel. This guide gives a good overview of the different executors available. This airflow doc gives an overview on how to use the Celery Executor.
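For reference, switching executors in this kind of environment-variable setup would look roughly like the following; the Redis broker URL, the Celery result backend, and the concurrency value are assumptions that must match your own infrastructure:

```yaml
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CELERY__BROKER_URL: redis://redis:6379/0
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__WORKER_CONCURRENCY: 16
```

You would also need to start at least one `airflow celery worker` process alongside the scheduler and webserver.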