Is there a way to trigger a DAG run in the Airflow UI with a custom execution date specified also through the UI?
No, there isn’t. If you trigger a DAG run manually via the UI, its execution date will always be the moment you click the button.
Hi @kevinc! Correction here that you CAN trigger a DAG run in the past and specify an execution date via the Airflow Web UI on Airflow versions 1.10.3+ (on Astronomer Cloud).
You should see something like this:
If you’re running vanilla Airflow, keep the following bugs native to Airflow and specific to this feature in mind:
- https://issues.apache.org/jira/browse/AIRFLOW-3191
- https://issues.apache.org/jira/browse/AIRFLOW-4871
Upgrading Versions
To upgrade to 1.10.3 on Astronomer, throw the following in your Dockerfile:
FROM astronomerinc/ap-airflow:0.9.2-1.10.3-onbuild
Backfilling Guidelines
If you’re looking for wider backfilling guidelines, go here.
Hi @paola,
I tried backfilling using the DAG Runs -> Create option with a previous execution date, but the DAG run showed as finished in the UI without executing any tasks.
Airflow ver: 1.10.5
Astronomer: EE
Hi @paola, I am also battling to do this, with the same experience as @jatinderGG. I also note that the two issues you linked are now resolved, so perhaps the post can be updated? Thanks
Hi @jatinderGG. I apologize for the late response.
Can you provide some information about the DAG? I need the following data to understand your situation.
- Start date for the DAG
- The latest dagrun of the DAG
- The execution date you created
- The existing execution dates
As for @robmarkcole, I’m sorry that you are experiencing the same issue. Can you also provide the same information so I can determine the root cause?
Hi @Alan
In my case the date I wanted to trigger was earlier than the start date, as it was a historical backfill. It’s only been 48 hours but already I cannot find the specifics of what I wanted to run, my apologies! I ended up forking my DAG and editing it to process all historical data.
Cheers
@jatinderGG, since you are an enterprise customer, I would highly recommend submitting a ticket through our Zendesk portal so we can efficiently communicate and resolve your issue.
@robmarkcole, it is not possible to trigger dagruns before the start_date of the DAG. If you do not care about the backfill being in the same DAG then the approach you took works great. On the other hand, if you do want the dagruns to be in the same DAG, you need to do a little workaround since the backfill begins from the latest dagrun.
If you are interested, I can outline that for you.
@Alan I am interested to hear the workaround, thanks!
This workaround is only needed if
- you do not have a way to run the backfill command
- you want to be able to rerun backfilled tasks
My experience with tasks of dagruns created by the backfill command is that they will not be scheduled again if cleared, regardless of the state of the backfill process. On the other hand, if the tasks belong to a dagrun that is naturally scheduled or triggered, they will rerun after being cleared, which you have probably already done plenty of times.
So without the backfill command, we can easily imitate its behaviour by leveraging the catchup feature. Catchup works by scheduling every missed dagrun after the latest run, one interval at a time, until it reaches the present.
The question you might be asking now is: “My dagrun is already at the present. How can I catch up?” This is the workaround that I mentioned previously. We can trick Airflow into thinking that your latest dagrun is right where you want your backfill to start. To do that, delete all dagruns after your backfill start date from the “DAG Runs” view under Browse. This does not, I repeat, DOES NOT, delete the task instances. They are safely kept in another table in the metadata database. We are only deleting the dagrun data. When the dagruns are recreated, the task instances of those dagruns are loaded back as if they were never gone!
From Airflow’s perspective, the DAG has only made dagruns up to the start of the desired backfill period. With catchup turned on for the DAG, Airflow will start scheduling from that date as it normally does, but will only run the tasks that do not already have a state. In the end, you have all the “missing” dagruns created and their tasks run, without affecting existing dagruns and tasks.
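To make the idea a bit more concrete, here is a rough model of it in plain Python. This is not Airflow's scheduler code, just a sketch under simplifying assumptions: a fixed schedule interval, a dagrun for an execution date only being created once its interval has fully passed, and task state being looked up by (execution date, task id). The names `missed_execution_dates` and `tasks_to_run` are made up for illustration.

```python
from datetime import datetime, timedelta


def missed_execution_dates(latest_run, interval, now):
    """Execution dates catchup would fill in after `latest_run` (simplified model).

    A run for execution date D is only created once its interval has
    closed, i.e. when D + interval <= now.
    """
    dates = []
    next_date = latest_run + interval
    while next_date + interval <= now:
        dates.append(next_date)
        next_date += interval
    return dates


def tasks_to_run(execution_dates, task_ids, existing_states):
    """Task instances that still need to run in the recreated dagruns.

    `existing_states` holds (execution_date, task_id) pairs that survived
    the dagrun deletion. Those already have a state, so they are skipped.
    """
    return [
        (date, task_id)
        for date in execution_dates
        for task_id in task_ids
        if (date, task_id) not in existing_states
    ]


if __name__ == "__main__":
    # Pretend the dagruns after 2019-01-01 were deleted, so the latest one is Jan 1.
    dates = missed_execution_dates(
        latest_run=datetime(2019, 1, 1),
        interval=timedelta(days=1),
        now=datetime(2019, 1, 5),
    )
    # One task instance from a deleted dagrun is still in the metadata database.
    kept = {(datetime(2019, 1, 3), "load")}
    for date, task_id in tasks_to_run(dates, ["extract", "load"], kept):
        print(date.date(), task_id)
```

The point of the second function is the part people tend to worry about: the task instances that were kept are simply picked up again by the recreated dagruns and are not rerun.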
Hopefully that makes sense! Please give that a go and let me know if there’s anything I can clarify!
Now, if you just want to have the backfill command but don’t want users to have access to the host terminal, I found this nifty backfill plugin that will open up that command through a view. Looks like it is triggering the backfill command in the background on the webserver host (though that’s probably not ideal).
Regardless, this just shows you that there are many ways to perform backfills because of how customizable Airflow is.
@Alan thanks for your advice. Next time I have a backfill to perform I will try it out. As we are on Astronomer we cannot access the backfill command, correct? The backfill plugin looks nifty and I did stumble across it before. It would be nice to see it implemented on Astronomer also. Cheers!
That is correct. It is not possible right now for Cloud customers to access the Airflow CLI. There have been talks about opening the Airflow CLI commands through the Astro CLI, but traction has been slow on that front.
However, with the plugin feature Airflow provides, you can expose the Airflow CLI through either an operator (Python subprocesses) or a view, which is how the linked plugin did it.
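As a rough sketch of the operator route, the callable below builds and runs the backfill command as a subprocess. It assumes the Airflow 1.10 CLI syntax (`airflow backfill -s START -e END dag_id`) and that the `airflow` executable is on the PATH of whatever host runs the task. The function names are hypothetical, and wiring `run_backfill` into something like a PythonOperator is left out here.

```python
import subprocess
from datetime import date


def build_backfill_command(dag_id, start, end):
    """Build the argv list for an Airflow 1.10 style backfill call."""
    if end < start:
        raise ValueError("end date must not be before start date")
    return [
        "airflow", "backfill",
        "-s", start.strftime("%Y-%m-%d"),
        "-e", end.strftime("%Y-%m-%d"),
        dag_id,
    ]


def run_backfill(dag_id, start, end):
    """Run the backfill as a subprocess; raises CalledProcessError on failure.

    Intended to be used as the callable of a task. The command is passed
    as a list (no shell), so the DAG id is not interpreted by a shell.
    """
    command = build_backfill_command(dag_id, start, end)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_backfill(...) to actually execute it.
    print(build_backfill_command("my_dag", date(2019, 1, 1), date(2019, 1, 7)))
```

Keep in mind that the backfill then runs on whichever host executes that task, so the same caveat about the webserver host applies to a worker.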