Best practice: best place to define variables and functions

Hi,
I usually set/define my variables, connections and functions before I instantiate the tasks (outside "with DAG"). But in some DAGs I notice that it is sometimes the other way around. So what is the best practice between:

myvar = Variable.get("var1")
with DAG(...) as dag:
    task1

vs

with DAG(...) as dag:
    myvar = Variable.get("var1")
    task1

Hi @gto,

For using Variables in your DAGs it is best practice to either use Jinja templating or to confine the Variable.get() call to inside an Operator's execute() method. This is related to reducing, and ideally avoiding, top-level code in your DAG file (i.e. your DAG file should act like a config file and solely define the DAG, tasks, and task dependencies, not contain function calls). More on best practices when using Variables here.

Accessing Variables outside of an Operator's execute() method without Jinja templating creates a session to query the metadata database each time Airflow parses the DAG file. This can slow down DAG parsing and put unnecessary load on the metadata database, causing a number of performance issues.
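For illustration, here is a minimal sketch of the templated approach (the DAG name, start_date, and task_id are just placeholders): because bash_command is a templated field, the {{ var.value.var1 }} reference is only rendered when the task runs, so parsing the file never touches the metadata database.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG("example_templated_variable", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    # Rendered at task run time, not at DAG parse time
    print_var = BashOperator(
        task_id="print_var",
        bash_command="echo {{ var.value.var1 }}",
    )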

For the sake of example, let’s assume task1 is the Task that uses the var1 Variable and task1 is a PythonOperator that executes a function called foo(). Using the two options outlined above:

  1. Access var1 inside the task1 Operator's callable (a sketch of the corresponding DAG wiring follows after the examples)
from airflow.models import Variable


def foo():
    # Variable.get() runs only when the task executes foo(), not at parse time
    print(Variable.get("var1"))
  2. Use Jinja templating where op_kwargs is a templated field (see source)
# Function defined outside of the DAG file, in foo.py
def foo(bar):
    print(bar)

# In the DAG file
from foo import foo
...

with DAG(...) as dag:
    task1 = PythonOperator(
        task_id="task1",
        python_callable=foo,
        # op_kwargs is templated, so the Jinja expression is passed as a string
        op_kwargs=dict(bar="{{ var.value.var1 }}"),
    )
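
For completeness, here is a rough sketch of the DAG wiring for option 1, assuming foo() lives in foo.py just like in option 2 (the DAG name, start_date, and other arguments are placeholders):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

from foo import foo  # foo() calls Variable.get("var1") inside the callable

with DAG("example_variable_in_callable", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    # The Variable is only read when task1 actually executes
    task1 = PythonOperator(
        task_id="task1",
        python_callable=foo,
    )

Either way, the metadata database is only queried when task1 runs. Note that in option 2 the Jinja expression must be passed as a string so the templating engine can render it before foo() receives the value.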

I hope that helps!


Thanks @josh-fell for taking the time to explain.
It's much more practical to use templating and leverage the task's execute() method.