Backfilling guidelines

If you want to trigger DAG runs for dates in the past, you’re probably considering a backfill. There are three main ways to do it:

1. Airflow CLI backfill command

If you’re developing locally on Astronomer Cloud, or running Astronomer Enterprise or native Airflow in any environment, you can use the Airflow CLI’s backfill command, which runs a DAG (or a subsection of it) for a specified date range.

Note: Access to the Airflow CLI on Astronomer Cloud is limited to local development, but it’s on our roadmap to support using it for a remote deployment as well. More info here.
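For reference, here is a minimal sketch of what that backfill call can look like, wrapped in Python so the arguments are explicit. It assumes the Airflow 1.10-style syntax (`airflow backfill -s START -e END dag_id`, with `-t` as an optional task regex); the DAG id, dates, and helper names are hypothetical, and newer Airflow versions use a different subcommand layout, so check `airflow backfill --help` on your install.

```python
import subprocess
from typing import List, Optional


def build_backfill_command(dag_id: str, start: str, end: str,
                           task_regex: Optional[str] = None) -> List[str]:
    """Build the argument list for an Airflow 1.10-style backfill call.

    Assumed CLI shape: airflow backfill -s START -e END [-t REGEX] DAG_ID
    """
    cmd = ["airflow", "backfill", "-s", start, "-e", end]
    if task_regex:
        # Restrict the backfill to tasks whose ids match the regex,
        # i.e. a subsection of the DAG.
        cmd += ["-t", task_regex]
    cmd.append(dag_id)
    return cmd


def run_backfill(dag_id: str, start: str, end: str,
                 task_regex: Optional[str] = None) -> None:
    """Run the backfill; requires the airflow CLI to be on PATH."""
    subprocess.run(build_backfill_command(dag_id, start, end, task_regex),
                   check=True)


if __name__ == "__main__":
    # "example_dag" and the dates are placeholders, not values from this post.
    print(" ".join(build_backfill_command("example_dag",
                                          "2019-01-01", "2019-01-07")))
```

Building the command as a list (rather than one string) avoids shell quoting problems if the task regex contains special characters.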

2. Set Past Execution Date in the Airflow UI

As of Airflow 1.10.3, you can set a specific execution date in the Airflow UI by going to Browse -> DAG Runs.

Guidelines here.

3. Manually Trigger Runs

If you’re running Airflow 1.10.2 or earlier and don’t have access to the Airflow CLI, you’ll have to resort to changing your DAG’s start_date and then manually triggering the runs.

The start_date is only inspected by Airflow when it creates the very first DAG run. Once a run exists, the scheduler just looks at the latest dag_run and adds schedule_interval to it.
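To make that behavior concrete, here is a simplified model of the rule described above. It is a sketch, not Airflow’s actual scheduler code: the function name is made up, and it assumes a fixed timedelta schedule_interval and ignores cron expressions, catchup, and end_date.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_execution_date(start_date: datetime,
                        schedule_interval: timedelta,
                        latest_execution_date: Optional[datetime] = None) -> datetime:
    """Simplified model of how the next run's execution date is chosen."""
    if latest_execution_date is None:
        # No DAG run exists yet: this is the only case where start_date matters.
        return start_date
    # A run already exists: start_date is ignored from here on.
    return latest_execution_date + schedule_interval


if __name__ == "__main__":
    daily = timedelta(days=1)
    # First run ever: taken from start_date.
    print(next_execution_date(datetime(2019, 3, 1), daily))
    # After moving start_date back a year, scheduling still continues
    # from the latest existing run.
    print(next_execution_date(datetime(2018, 3, 1), daily,
                              latest_execution_date=datetime(2019, 3, 10)))
```

In this model, moving start_date back after a run exists does not change what gets scheduled next, which is what the steps below rely on.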

Generally, we recommend the following:

  • Create the DAG with the desired (real) start_date
  • Enable it (turn it on) so that Airflow starts scheduling DAG runs
  • Wait until at least one scheduled run has been created
  • Re-deploy with the start_date set to the earliest date you want to backfill
  • Use the Airflow API to trigger DAG runs with specific execution dates (guidelines here)
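
The last step might look roughly like the sketch below. It assumes the experimental REST API that ships with Airflow 1.10 (`POST /api/experimental/dags/<dag_id>/dag_runs` with an `execution_date` field in the JSON body); the base URL, DAG id, and helper names are placeholders, and authentication is left out because it depends on your deployment.

```python
import json
import urllib.request
from datetime import datetime, timedelta
from typing import Iterator


def execution_dates(start: datetime, end: datetime,
                    step: timedelta) -> Iterator[datetime]:
    """Yield every execution date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += step


def build_trigger_request(base_url: str, dag_id: str,
                          execution_date: datetime) -> urllib.request.Request:
    """Build (but do not send) a request that triggers one DAG run.

    Assumes the Airflow 1.10 experimental API endpoint; adjust the path
    and add auth headers for your own deployment.
    """
    url = f"{base_url.rstrip('/')}/api/experimental/dags/{dag_id}/dag_runs"
    body = json.dumps({"execution_date": execution_date.isoformat()})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and DAG id; nothing is sent unless you uncomment urlopen.
    for date in execution_dates(datetime(2019, 1, 1), datetime(2019, 1, 3),
                                timedelta(days=1)):
        request = build_trigger_request("http://localhost:8080",
                                        "example_dag", date)
        print(request.get_method(), request.full_url, request.data.decode())
        # urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to print and review the full list of runs before triggering anything.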