Airflow takes longer to execute

I am using Airflow version 2.1.3. My use case is to schedule tasks one after another, and I need an orchestrator to manage them. I pass input to Airflow's DAG through its REST API. My use case requires answers in real time (an external API call), and batch processing must also run daily (a REST API call from one DAG triggering another DAG).
For batch processing, time is not a constraint; it doesn't matter if it takes longer. But for real time, the execution time matters, and it is slow (about 6 seconds for simple Python code).
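For reference, triggering one DAG from another over the stable REST API is a POST to /api/v1/dags/{dag_id}/dagRuns (available since Airflow 2.0). A minimal sketch using only the standard library; the base URL, the batch_processing DAG id, and the basic-auth credentials are placeholders for your own setup:

```python
import base64
import json
import urllib.request

def build_dag_run_request(base_url, dag_id, conf, username, password):
    """Build a POST request that triggers a run of `dag_id`,
    passing `conf` as the DAG run's input."""
    url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
    payload = json.dumps({"conf": conf}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )

# Example: trigger a hypothetical batch_processing DAG with some input.
req = build_dag_run_request(
    "http://localhost:8080", "batch_processing",
    {"date": "2021-09-01"}, "airflow", "airflow",
)
# Sending it would be: urllib.request.urlopen(req)
```

The same endpoint serves both cases: an external caller triggers the real-time DAG, and the real-time DAG can trigger the batch DAG.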

I have divided the Airflow DAG into 4 tasks. The individual tasks execute quickly, but the scheduling between them is very slow. When I instead put all the code into one monolithic block, the total time drops to about 1 second. But that doesn't let me take full advantage of Airflow.
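For what it's worth, the monolithic approach doesn't have to mean unstructured code: the four steps can stay as separate Python functions chained inside a single callable, so only one task goes through the scheduler. A minimal sketch, where the four step functions are placeholders for the actual logic:

```python
def fetch(payload):
    """Step 1: placeholder for reading the input passed via the REST API."""
    return payload["value"]

def transform(value):
    """Step 2: placeholder for the actual processing."""
    return value * 2

def validate(value):
    """Step 3: placeholder for sanity checks on the result."""
    assert value >= 0
    return value

def publish(value):
    """Step 4: placeholder for writing the result out."""
    return {"result": value}

def run_pipeline(payload):
    """Single callable for one PythonOperator task: chains all four steps,
    avoiding inter-task scheduling latency while keeping the code modular."""
    return publish(validate(transform(fetch(payload))))

out = run_pipeline({"value": 21})
```

This trades Airflow's per-task retries and visibility for latency, which is exactly the trade-off described above.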

I want to know how I can reduce the overall execution time of the DAG while keeping it divided into tasks. The configuration parameters I am using are:

AIRFLOW__CORE__EXECUTOR: LocalExecutor
AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__CORE__FERNET_KEY: ''
AIRFLOW__SCHEDULER__MAX_THREADS: 4
AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC: 1
AIRFLOW__LOGGING__LOGGING_LEVEL: DEBUG
AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 60
AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 60
AIRFLOW__OPERATORS__DEFAULT_CPUS: 2
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'

Hello, the LocalExecutor is not recommended for production. It runs every task on the same machine as the scheduler, so its throughput is limited to that single host (it is the SequentialExecutor, not the LocalExecutor, that cannot run tasks in parallel). For time-sensitive tasks the CeleryExecutor is recommended: it distributes tasks across a pool of dedicated workers and can run many tasks in parallel. This guide gives a good overview of the different executors available. This airflow doc gives an overview on how to use the Celery Executor.
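For reference, switching executors in this kind of environment-variable setup would look roughly like the following; the Redis broker URL, the Celery result backend, and the concurrency value are assumptions that must match your own infrastructure:

```yaml
AIRFLOW__CORE__EXECUTOR: CeleryExecutor
AIRFLOW__CELERY__BROKER_URL: redis://redis:6379/0
AIRFLOW__CELERY__RESULT_BACKEND: db+postgresql://airflow:airflow@postgres/airflow
AIRFLOW__CELERY__WORKER_CONCURRENCY: 16
```

You would also need to start at least one `airflow celery worker` process alongside the scheduler and webserver.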