Can I use the Airflow REST API to externally trigger a DAG?

We’re trying to trigger a DAG from an external source. I know that Airflow has an API that we can use to do so, but is there a best practice around this that you know of other than using a sensor?

August 2022: Our most up-to-date documentation, including examples for triggering a DAG run, is here: Make requests to the Airflow REST API.

Airflow does have a REST API that can be used for external triggering, but it's still considered to be in the "experimental" stage (as defined by the core Airflow contributors). For that reason, we wouldn't recommend it as a production solution at the moment.

We'd suggest creating a DAG that runs at a more frequent interval (possibly matching what the poke interval would be) and skips downstream tasks if no file is found. If you need absolute real time, it might be a job better suited to a Lambda function.
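
As a minimal sketch of that pattern (assuming Airflow 2.x import paths; the DAG ID, schedule, and file path are illustrative, not from the original post), a ShortCircuitOperator can skip everything downstream when the file isn't there:

import os
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import ShortCircuitOperator

FILE_PATH = "/tmp/incoming/data.csv"  # illustrative path

with DAG(
    dag_id="check_for_file",          # hypothetical DAG ID
    start_date=datetime(2021, 1, 1),
    schedule_interval="*/5 * * * *",  # poll every 5 minutes
    catchup=False,
) as dag:
    # Returns False when the file is missing, which skips all downstream tasks.
    check_file = ShortCircuitOperator(
        task_id="check_file",
        python_callable=lambda: os.path.exists(FILE_PATH),
    )

    process_file = BashOperator(
        task_id="process_file",
        bash_command=f"echo processing {FILE_PATH}",
    )

    check_file >> process_file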

UPDATE: As of its momentous 2.0 release in December 2020, the Apache Airflow project now supports an official and more robust Stable REST API. Instructions on how to make requests have been added to Astronomer’s “Airflow API” doc :slightly_smiling_face:

Here's what I wrote in our own Confluence page about using the REST API. It works pretty well.

Airflow exposes what they call an “experimental” REST API, which allows you to make HTTP requests to an Airflow endpoint to do things like trigger a DAG run. With Astronomer, you just need to create a service account to use as the token passed in for authentication. Here are the steps:

Create a service account in your Astronomer deployment. Navigate to https://app.astronomer.cloud/deployments and choose your deployment. Click "Service Accounts" and then "New Service Account":

  • Note down the token, because you won't be able to view it again.
  • Decide which endpoint to use. In this example, we'll be using the dag_runs endpoint, which lets you trigger a DAG to run. For more information, the docs are available here: https://airflow.apache.org/api.html
  • Using an entry in the Astronomer Forums (https://forum.astronomer.io/t/hitting-the-airflow-api-through-astronomer/44), we start out with a generic cURL request:
  • curl -v -X POST https://AIRFLOW_DOMAIN/api/experimental/dags/airflow_test_basics_hello_world/dag_runs -H 'Authorization: <token>' -H 'Cache-Control: no-cache' -H 'content-type: application/json' -d '{}'
  • Replace the "AIRFLOW_DOMAIN" value with the value matching your deployment.
  • Replace "<token>" with the Service Account token obtained in step 1.
  • Replace "airflow_test_basics_hello_world" with the name of the DAG you want to trigger a run for. This is case sensitive. In our case, we'll be using "customer_health_score".
  • Our new cURL command is now: curl -v -X POST https://AIRFLOW_DOMAIN/api/experimental/dags/customer_health_score/dag_runs -H 'Authorization: XXXX' -H 'Cache-Control: no-cache' -H 'content-type: application/json' -d '{}'
  • Please note the token is hidden for security.
  • This will successfully kick off a DAG run for the customer_health_score DAG with an execution_date value of NOW(), which is equivalent to clicking the "Play" symbol from the DAG main screen:
  • If you would like to choose a specific execution_date value to kick off the DAG for, you can pass it in the JSON value of the -d data parameter ("-d '{}'").
  • The string needs to be in "YYYY-mm-DDTHH:MM:SS" format, for example: "2016-11-16T11:34:15".
  • More information in this StackOverflow article: https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run
  • More information in the source code here: https://github.com/apache/airflow/blob/v1-9-stable/airflow/www/api/experimental/endpoints.py
  • Our new command becomes: curl -v -X POST https://AIRFLOW_DOMAIN/api/experimental/dags/customer_health_score/dag_runs -H 'Authorization: XXXX' -H 'Cache-Control: no-cache' -H 'content-type: application/json' -d '{"execution_date":"2019-03-05T08:30:00"}'

Because this is just a cURL command executed through the command line, this functionality can also be replicated in a scripting language like Python using the Requests library.
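
For instance, here's a minimal Python sketch of the same request using Requests (the domain, DAG ID, and token values below are placeholders to fill in, mirroring the cURL examples above):

import requests

AIRFLOW_DOMAIN = "AIRFLOW_DOMAIN"  # placeholder: your deployment's domain
DAG_ID = "customer_health_score"
TOKEN = "XXXX"                     # placeholder: your Service Account token

response = requests.post(
    f"https://{AIRFLOW_DOMAIN}/api/experimental/dags/{DAG_ID}/dag_runs",
    headers={
        "Authorization": TOKEN,
        "Cache-Control": "no-cache",
        "Content-Type": "application/json",
    },
    # Optional: pin the run to a specific execution_date.
    json={"execution_date": "2019-03-05T08:30:00"},
)
response.raise_for_status()  # raises if the trigger request failed
print(response.json())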


This is great. Thanks @joshuamoore-procore!

Note: this no longer works with more recent versions of Astronomer, but it's a feature we are looking to add again soon.

Hi Andrew, I'm pretty sure this is still supported in Astronomer, since we have a DAG in an R&D Astronomer deployment that kicks off a DAG in an IT Astronomer deployment using this exact method. If this breaks, it will cause downtime for a data set that has a lot of visibility. I haven't seen any information about ceasing to support this. Can you clarify which versions don't support this? We're running Astronomer Cloud with Airflow 1.10.1 at the moment.

Hi @joshuamoore-procore! As you said, making a request to the Airflow API using an Astronomer Service Account works on Astronomer Cloud v0.7.5 (which is what Procore is on).

On Astronomer v0.10, that feature fell through the cracks and was not included, but it's already built out to be re-incorporated in Astronomer v0.11, which will be released to "New" Astronomer Cloud in early January.

Josh, by the time you're ready to migrate to "New" Cloud, v0.11 should more than likely be in place, but let's make sure that's the case before you move over. You'll get an email from us once v0.11 is out, too. The Airflow version you're on won't affect this functionality (whether on your current or "New" Cloud).

Ok, good to know! This is exactly what I was looking for. Looks like we’ll just need to upgrade straight to v0.11 sometime in January and we’ll be all set.


Does the external DAG trigger work on the new Cloud?
I'm running version 1.10.7+astro.8.

What would the AIRFLOW_DOMAIN look like in this case?

When I tried, it redirected me back to the login page, just like in this post: Hitting the Airflow API through Astronomer.

Hi @papanash - sorry for the delay.

You can use this doc:

Thank you, that worked!

Thank you @virajparekh!
Now that I know how to trigger a DAG externally, my goal is to trigger certain DAGs immediately after a deployment to Astronomer. Is there a way to enable this through Astronomer?

If not, is there a way to get notified when the deployment is done?

@paola: Related to my previous question.

I'm able to trigger the DAG through the REST API from my machine. However, when I do this from GitHub, the request gets redirected.
What could be the reason for this?

https://app.gcp0001.us-east4.astronomer.io:443 GET /login?rd=https://deployments.gcp0001.us-east4.astronomer.io

PS: I'm trying the same request with the Authorization header.

Hi all, quick update here that we now have an official doc for how to call Airflow’s experimental REST API on Astronomer:

An official REST API for Airflow is coming in the Airflow 2.0 release scheduled for Winter 2020 (!). For more information, check out AIP-32 or reach out to us to chat about it.

Hey Paola, just wondering where I can find instructions to trigger a DAG run via the Airflow REST API in Airflow 2.0 on Astronomer. Thanks!


Actually, I figured it out. I just needed to modify the URL to reflect the new endpoint (https://AIRFLOW_DOMAIN/airflow/api/v1/dags/{dag_id}/dagRuns) when upgrading to 2.0.

So far we do not have any documentation on how to use the new API (coming soon).

The process itself is straightforward! Update the URL to use the endpoint specified in the Airflow documentation on the Stable REST API.

Here are two examples for anyone wondering.

curl -X GET \
https://<host>/<release>/airflow/api/v1/config \
-H 'Authorization: <service_account_token>' \
-H 'Cache-Control: no-cache'

curl -X POST \
https://<host>/<release>/airflow/api/v1/dags/test/dagRuns \
-H 'Authorization: <service_account_token>' \
-H 'Cache-Control: no-cache' \
-H 'content-type: application/json' \
-d '{"execution_date": "2020-01-01T00:00:00"}'

Quick update here that full instructions on how to make requests to the Airflow 2.0 stable REST API have been added to our “Airflow API” doc (linked above too) :slightly_smiling_face:

Thanks for bringing this up, @yathor, and glad you got it working on your end!



Hi @paola,
Can you advise on the best/preferred approach: in what circumstances would you use an ExternalTaskSensor rather than an API trigger? Can you point me to some documentation that compares ExternalTaskSensor vs. a REST API trigger?

Hey @amaheshwari

Welcome to the Astronomer forum for Airflow!

ExternalTaskSensor is used mainly for cross-DAG dependencies within the same Airflow environment, based on task states. One important factor when using this sensor is that the schedule_interval of the parent DAG and the child DAG should be the same. If they aren't, you can use either execution_delta or execution_date_fn, but not both.

When using execution_delta, the value you give (e.g., timedelta(hours=1)) is subtracted from the child DAG's execution_date; the result should equal the parent DAG's execution_date for the sensor condition to be satisfied. For example, say you want DAG B (schedule_interval of 0 1 * * *) to run only on the success of Task 1 of DAG A (schedule_interval of 0 0 * * *): DAG B's 01:00 run minus one hour lines up with DAG A's 00:00 run. Then your ExternalTaskSensor should look like:

from datetime import timedelta

from airflow.sensors.external_task import ExternalTaskSensor

# Waits for task_1 of dag_a at this DAG's execution_date minus one hour.
check_dag_a_task_1 = ExternalTaskSensor(
    task_id="check_dag_a_task_1",
    external_dag_id="dag_a",
    external_task_id="task_1",
    execution_delta=timedelta(hours=1),
)

You can also verify the execution_date that the sensor is poking for in the logs. Another useful parameter of this sensor is check_for_existence, which, when set to True, avoids waiting unnecessarily if the external DAG or task doesn't exist. See the Airflow docs for other parameters and the Astronomer docs for how to use this sensor.

The Airflow REST API trigger is particularly useful for setting up dependencies across different Airflow environments. ExternalTaskSensor does not work across Airflow environments, so the REST API is a good choice if your use case requires that. Note, however, that the REST API's "create a DAG run" endpoint behaves much like TriggerDagRunOperator.

Based on the above, you can see that ExternalTaskSensor is a conditional sensor task, whereas the REST API trigger creates a new DAG run without any conditional check. TriggerDagRunOperator, even though it functions like the REST API's "create a DAG run" endpoint, can implement a conditional check, such as waiting if a DAG run is already active for the child DAG.
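
As a rough sketch of that operator (assuming Airflow 2.x import paths; the task and DAG IDs are illustrative):

from airflow.operators.trigger_dagrun import TriggerDagRunOperator

# Triggers a run of the child DAG from within the parent DAG and,
# with wait_for_completion=True, polls until that run finishes.
trigger_dag_b = TriggerDagRunOperator(
    task_id="trigger_dag_b",
    trigger_dag_id="dag_b",  # illustrative child DAG ID
    wait_for_completion=True,
)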

For other methods, see Managing cross-DAG dependencies.

Hope this helps!

Thanks
Manmeet