Airflow opening a lot of connections

I’m having problems with Apache Airflow.

When it starts up, it logs a lot of connections, such as:

[2019-12-04 17:26:20,578] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,584] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,600] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,612] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,613] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,612] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,637] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,638] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,641] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,652] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,661] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,667] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,674] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,683] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,691] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,702] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,712] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,713] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,720] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,749] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,758] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,771] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,787] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,812] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,838] {{base_hook.py:83}} INFO - Using connection to: my_host
[2019-12-04 17:26:20,839] {{base_hook.py:83}} INFO - Using connection to: my_host

My database also shows more than 50 to 60 open connections, even though I have only 12 DAGs. I already checked my code to see whether I was failing to close connections, but I am closing them.

Here is a picture of the DAGs:

And here are the many open connections:

I’m using PostgresHook to make the connections; here is an example of how I create them:

import os

# Airflow 1.10 import path; in Airflow 2 the hook lives in airflow.providers.postgres.hooks.postgres
from airflow.hooks.postgres_hook import PostgresHook

# Read environment variables
src_con_id = os.getenv('A_ID')
database_src = os.getenv('A_DATABASE')

dest_con_id = os.getenv('B_ID')
database_redshift = os.getenv('B_DATABASE')

# ALWAYS SET THE SCHEMA CORRECTLY: SRC IS WHERE THE DATA COMES FROM, DEST IS WHERE IT GOES
table_schema_src = os.getenv('A_SCHEMA_DOMINIO')
table_schema_dest = os.getenv('B_SCHEMA_A_DOMINIO')

# Source and destination connections for creating the tables
src_conn_teste = PostgresHook(postgres_conn_id=src_con_id, schema=database_src).get_conn()
dest_conn_teste = PostgresHook(postgres_conn_id=dest_con_id, schema=database_redshift).get_conn()

# Cursors for creating the tables
src_cursor_teste = src_conn_teste.cursor()
dest_cursor_teste = dest_conn_teste.cursor()

Is this code in your DAG file or in an operator? The scheduler parses the DAG file about every 5 seconds, so if this code sits at the top level of that file, it could be executed, and the connections opened, on every parse. If the code is inside an operator, or inside a function called by a PythonOperator, it should only be executed when the task runs.
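
To make the difference concrete, here is a small plain-Python illustration that does not use Airflow at all. FakeHook and parse_dag_file are made-up stand-ins (assumptions for this sketch, not Airflow APIs): the first only counts how many connections get opened, the second mimics one scheduler parse by executing a file's top-level code.

```python
class FakeHook:
    """Stand-in for a database hook; it only counts opened connections."""
    opened = 0

    def get_conn(self):
        FakeHook.opened += 1
        return object()


def parse_dag_file(source):
    """Mimic one scheduler parse: execute the file's top-level code."""
    namespace = {"FakeHook": FakeHook}
    exec(source, namespace)
    return namespace


# Variant 1: the connection is opened at the top level of the "DAG file".
TOP_LEVEL = "conn = FakeHook().get_conn()\n"

# Variant 2: the connection is opened inside a function the task would call.
IN_CALLABLE = (
    "def task_callable():\n"
    "    return FakeHook().get_conn()\n"
)

# Ten parses of variant 1 open ten connections.
for _ in range(10):
    parse_dag_file(TOP_LEVEL)
opened_at_top_level = FakeHook.opened

# Ten parses of variant 2 only define the function and open nothing.
FakeHook.opened = 0
for _ in range(10):
    namespace = parse_dag_file(IN_CALLABLE)
opened_while_parsing = FakeHook.opened

# The connection is opened only when the task body actually runs.
namespace["task_callable"]()
opened_after_one_run = FakeHook.opened

print(opened_at_top_level, opened_while_parsing, opened_after_one_run)  # 10 0 1
```

With 12 DAG files each opening two connections at the top level, repeated parsing like this would be one plausible way to end up with dozens of open connections.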

Ohhh, do you have an example of how I can open connections using an operator?
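
A minimal sketch of what that could look like, assuming Airflow 1.10 import paths (the 2019 logs suggest that version) and reusing the environment variable names from the snippet above. The names get_hook, create_tables, build_dag, the DAG id, the dates, and the SELECT 1 queries are placeholders invented for illustration, not part of the original code. The Airflow imports are kept inside the functions so the task body can be tried with a stand-in hook even where Airflow is not installed.

```python
import os
from contextlib import closing
from datetime import datetime


def get_hook(conn_id, database):
    """Build a PostgresHook for the given connection id and database."""
    # Airflow 1.10 import path (assumption); Airflow 2 moved the class to
    # airflow.providers.postgres.hooks.postgres.
    from airflow.hooks.postgres_hook import PostgresHook
    return PostgresHook(postgres_conn_id=conn_id, schema=database)


def create_tables(hook_factory=get_hook):
    """Task body: runs only when the task executes, not on every DAG parse."""
    src_hook = hook_factory(os.getenv('A_ID'), os.getenv('A_DATABASE'))
    dest_hook = hook_factory(os.getenv('B_ID'), os.getenv('B_DATABASE'))
    # closing() calls close() on exit, even if a query raises.
    with closing(src_hook.get_conn()) as src_conn, \
            closing(dest_hook.get_conn()) as dest_conn:
        with closing(src_conn.cursor()) as src_cursor, \
                closing(dest_conn.cursor()) as dest_cursor:
            src_cursor.execute('SELECT 1')   # placeholder for the real source query
            dest_cursor.execute('SELECT 1')  # placeholder for the real table creation
        dest_conn.commit()


def build_dag():
    """Wire the callable into a DAG; only this cheap wiring runs at parse time."""
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    dag = DAG(
        dag_id='create_tables_example',  # hypothetical DAG id
        start_date=datetime(2019, 12, 1),
        schedule_interval='@daily',
    )
    PythonOperator(
        task_id='create_tables',
        python_callable=create_tables,
        dag=dag,
    )
    return dag
```

In an actual DAG file you would add dag = build_dag() at module level so the scheduler can discover the DAG. The point of the layout is that parsing the file only builds the DAG object, while get_conn() is reached only inside create_tables, when the task runs.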