We’re trying to trigger a DAG from an external source. I know that Airflow has an API that we can use to do so, but is there a best practice around this that you know of other than using a sensor?
Airflow does have a REST API being developed for external triggering, but it’s still considered to be in the “experimental” stage (as defined by the core Airflow contributors). For that reason, we wouldn’t recommend it as a production solution at the moment.
We’d suggest creating a DAG that runs at a more frequent interval (possibly matching what the poke interval is set at) and skips downstream tasks if no file is found. If you need true real-time behavior, the job might be better suited to a Lambda function.
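As a sketch of that pattern (the file path and the ShortCircuitOperator wiring here are illustrative assumptions, not from the original post):

```python
import os

# Hypothetical poke-style check: run the DAG on a tight schedule and
# short-circuit downstream tasks when the file has not landed yet.
WATCH_PATH = "/tmp/incoming/data.csv"  # hypothetical landing path

def file_has_landed(path=WATCH_PATH):
    """Return True when the expected file exists; downstream tasks run only then."""
    return os.path.exists(path)

# In a DAG this callable would back a ShortCircuitOperator, e.g.:
# check = ShortCircuitOperator(task_id="check_file", python_callable=file_has_landed)
print(file_has_landed())
```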
UPDATE: As of its momentous 2.0 release in December 2020, the Apache Airflow project now supports an official and more robust Stable REST API. Instructions on how to make requests have been added to Astronomer’s “Airflow API” doc
Here’s what I wrote in our own Confluence page around using the REST API. It works pretty well. The links are broken because new users can only put 2 links in a post apparently.
Airflow exposes what they call an “experimental” REST API, which allows you to make HTTP requests to an Airflow endpoint to do things like trigger a DAG run. With Astronomer, you just need to create a service account to use as the token passed in for authentication. Here are the steps:
Create service account in Astronomer deployment. Navigate to https://app.astronomer.cloud/deployments and choose your deployment. Click “Service Accounts” and click “New Service Account”:
Note down the token, because you won’t be able to view it again.
Decide which endpoint to use. In this example, we’ll be using the dag_runs endpoint, which lets you trigger a DAG to run. For more information, the docs are available here: https://airflow.apache.org/api.html
Using an entry in the Astronomer Forums (https://forum.astronomer.io/t/hitting-the-airflow-api-through-astronomer/44), we start out with a generic cURL request:
Replace the “AIRFLOW_DOMAIN” value with the value matching your deployment.
Replace the “XXXX” value in the Authorization header with the Service Account token obtained in step 1.
Replace “airflow_test_basiscs_hello_world” with the name of the DAG you want to trigger a run for. This is case sensitive. In our case, we’ll be using “customer_health_score”.
Our new cURL command is now: curl -v -X POST https://AIRFLOW_DOMAIN/api/experimental/dags/customer_health_score/dag_runs -H 'Authorization: XXXX' -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' -d '{}'
Please note the token is hidden for security.
This will successfully kick off a DAG run for the customer_health_score DAG with an execution_date value of NOW(), which is equivalent to clicking the “Play” symbol from the DAG main screen:
If you would like to kick off the DAG for a specific execution_date value, you can pass it in via the data parameter’s JSON value (the -d '{}' argument).
The string needs to be in the “YYYY-mm-DDTHH:MM:SS” format, for example: “2016-11-16T11:34:15”.
*More information in this StackOverflow article: https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run
*More information in the source code here: https://github.com/apache/airflow/blob/v1-9-stable/airflow/www/api/experimental/endpoints.py
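The date string and the -d payload can be produced with a few lines of Python (a sketch; the datetime value is just the example date from above):

```python
from datetime import datetime
import json

# Format an execution_date as "YYYY-mm-DDTHH:MM:SS" and build the
# JSON body that gets passed via the cURL -d parameter.
execution_date = datetime(2016, 11, 16, 11, 34, 15)
payload = json.dumps({"execution_date": execution_date.strftime("%Y-%m-%dT%H:%M:%S")})
print(payload)  # {"execution_date": "2016-11-16T11:34:15"}
```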
Because this is just a cURL command executed through the command line, this functionality can also be replicated in a scripting language like Python using the Requests library.
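For example, a minimal Python sketch of the same request (the domain, token, and DAG name are placeholders, and the actual POST is left commented out so nothing fires accidentally):

```python
# Equivalent of the cURL command above, built for the Requests library.
# AIRFLOW_DOMAIN and the token are placeholders -- substitute your own values.
AIRFLOW_DOMAIN = "AIRFLOW_DOMAIN"
SERVICE_ACCOUNT_TOKEN = "XXXX"  # from step 1; keep this secret
dag_id = "customer_health_score"

url = f"https://{AIRFLOW_DOMAIN}/api/experimental/dags/{dag_id}/dag_runs"
headers = {
    "Authorization": SERVICE_ACCOUNT_TOKEN,
    "Cache-Control": "no-cache",
    "Content-Type": "application/json",
}

# import requests
# response = requests.post(url, headers=headers, json={})
# response.raise_for_status()
print(url)
```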
Hi Andrew, I’m pretty sure this is still supported in Astronomer, since we have a DAG in an R&D Astronomer deployment kick off a DAG in an IT Astronomer deployment using this exact method. If this breaks, it will cause downtime for a data set that has a lot of visibility. I haven’t seen any information about ceasing to support this. Can you clarify which versions don’t support this? We’re running Astronomer Cloud with Airflow 1.10.1 at the moment.
Hi @joshuamoore-procore! As you said, making a request to the Airflow API using an Astronomer Service Account works on Astronomer Cloud v0.7.5 (which is what Procore is on).
On Astronomer v0.10, that feature fell through the cracks and was not included, but it has already been built out to be re-incorporated in Astronomer v0.11, which will be released to “New” Astronomer Cloud in early January.
Josh, by the time you’re ready to migrate to “New” Cloud, v0.11 will more than likely be in place, but let’s make sure that’s the case before you move over. You’ll get an email from us once v0.11 is out, too. The Airflow version you’re on won’t affect this functionality (whether on your current or “New” Cloud).
Ok, good to know! This is exactly what I was looking for. Looks like we’ll just need to upgrade straight to v0.11 sometime in January and we’ll be all set.
Thank you @virajparekh
Now that I know how to trigger a DAG externally:
My goal is to trigger certain DAGs immediately after a deployment to Astronomer. Is there a way to enable this setting through Astronomer?
If not, is there a way to get notified when the deployment is done?
I’m able to trigger the DAG through the REST API from my machine. However, when I do this from GitHub, the request gets redirected.
What could be the reason for this?
An official REST API for Airflow is coming in the Airflow 2.0 release scheduled for Winter 2020 (!). For more information, check out AIP-32 or reach out to us to chat about it.
Actually, I figured it out. I just needed to modify the URL to reflect the new endpoint (https://AIRFLOW_DOMAIN/airflow/api/v1/dags/{dag_id}/dagRuns) when upgrading to 2.0.
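For reference, a sketch of the same call against the 2.0 stable API (the domain and credentials are placeholders, and basic auth is just one of the configurable auth backends; the POST itself is commented out):

```python
import base64

# Stable REST API (Airflow 2.0+): trigger a DAG run via /api/v1.
AIRFLOW_DOMAIN = "AIRFLOW_DOMAIN"  # placeholder
dag_id = "customer_health_score"

url = f"https://{AIRFLOW_DOMAIN}/airflow/api/v1/dags/{dag_id}/dagRuns"
credentials = base64.b64encode(b"username:password").decode()  # placeholder creds
headers = {
    "Authorization": f"Basic {credentials}",
    "Content-Type": "application/json",
}
body = {"conf": {}}  # optional run configuration

# import requests
# requests.post(url, headers=headers, json=body).raise_for_status()
print(url)
```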
Quick update here that full instructions on how to make requests to the Airflow 2.0 stable REST API have been added to our “Airflow API” doc (linked above too)
Thanks for bringing this up, @yathor, and glad you got it working on your end!
Hi @paola
Can you advise on the best practice (i.e., the preferred approach): in what circumstances would you use an ExternalTaskSensor rather than an API trigger? Can you point me to some documentation that compares ExternalTaskSensor vs. a REST API trigger?
ExternalTaskSensor is used mainly for cross-DAG dependencies within the same Airflow environment, based on task states. One important factor when using this sensor is that the schedule_interval of the parent DAG and the child DAG should be the same. If they aren’t, you can use either execution_delta or execution_date_fn, but not both.
When using execution_delta, the value you give (e.g., timedelta(hours=1)) is subtracted from the child DAG’s execution_date, and the resulting date must line up with a parent DAG run for the sensor condition to be satisfied. For example, say you want to trigger DAG B (schedule_interval = 0 1 * * *) only on the success of Task 1 of DAG A (schedule_interval = 0 0 * * *). Then your ExternalTaskSensor should look like:
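A sketch of how the delta lines up the two schedules (the DAG and task ids are hypothetical, and the sensor call itself is shown in a comment since the surrounding DAG wiring isn’t part of the post):

```python
from datetime import datetime, timedelta

# DAG A runs daily at 00:00, DAG B daily at 01:00. The sensor in DAG B
# pokes DAG A at (child execution_date - execution_delta).
child_execution_date = datetime(2021, 6, 1, 1, 0)  # a DAG B run
execution_delta = timedelta(hours=1)
poked_parent_date = child_execution_date - execution_delta
print(poked_parent_date)  # 2021-06-01 00:00:00 -- lines up with a DAG A run

# The sensor in DAG B would look roughly like (import path per Airflow 2.x):
#
# from airflow.sensors.external_task import ExternalTaskSensor
# wait_for_task_1 = ExternalTaskSensor(
#     task_id="wait_for_dag_a_task_1",
#     external_dag_id="dag_a",          # hypothetical DAG A id
#     external_task_id="task_1",        # hypothetical task id
#     execution_delta=timedelta(hours=1),
#     check_existence=True,
# )
```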
You can also verify the execution_date that the sensor is poking in the logs. Another useful parameter of this sensor is check_existence, which, when set to True, avoids waiting unnecessarily if the external DAG or task doesn’t exist. See the Airflow docs for other parameters and the Astronomer docs for how to use this sensor.
The Airflow REST API trigger is particularly useful for setting up dependencies across different Airflow environments. ExternalTaskSensor does not work across Airflow environments, so the REST API is a good choice if your use case requires that. Note, however, that the REST API’s “create a DAG run” endpoint behaves much like TriggerDagRunOperator.
Based on the above, you can see that ExternalTaskSensor is a conditional sensor task, whereas triggering via the REST API creates a new DAG run without any conditional check. TriggerDagRunOperator, although it functions like the REST API’s “create a DAG run” endpoint, can implement a conditional check, for example to wait if a DAG run is already active for the child DAG.