The Airflow UI isn't updating after I deploy, "503 Service Temporarily Unavailable"

If you’ve recently pushed up a deploy but aren’t seeing the Airflow UI properly render your changes, it’s very likely an indicator that your Airflow Webserver (responsible for rendering task state and task execution logs in the Airflow UI) is having some trouble.

If you’ve recently upgraded, note that Airflow 1.10 also puts more load on the Webserver than 1.9 did.

Symptoms

  • Recurring 503 errors when accessing the Airflow UI
  • You pushed up a new DAG and it’s not showing up
  • Your DAG files are correctly in your dags folder, but you’re getting an “i” icon in the Airflow UI that reads, “This DAG isn’t available in the webserver DagBag object”
  • Your About > Version page shows the wrong version (reads 1.9 when it should read 1.10)
  • You’re seeing a spinning wheel in the UI to the right of Recent Tasks or DAG Runs, or the Links menu isn’t rendering properly
  • You’re seeing a Mushroom Cloud and AttributeError: 'NoneType' object has no attribute 'create_dagrun'

Fixes

1. Bump up the number of AUs allocated to your Webserver.

If your Webserver is at the default of 2 AU, try bumping it up to ~4-5 AU. Changing the allocation triggers a restart, so refresh your Airflow UI afterward to check for changes. This is especially important if your deployment has a lot of DAGs to parse through.

To do this, go to: app.astronomer.cloud > deployment > Configure

2. Make sure you’re not running a ton of top-level code.

If you’re making API calls, JSON requests, or database requests outside of an operator at a high frequency, your Webserver is much more likely to time out.

When Airflow interprets a file to look for any valid DAGs, it first runs all code at the top level (i.e. outside of operators) immediately. Even though an operator’s own logic only runs at execution time, everything called outside of an operator is called on every heartbeat, which can be quite taxing.

We’d recommend taking the logic you currently have running outside of an operator and moving it inside a PythonOperator.
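
As a rough illustration of that recommendation, here is a minimal sketch of a DAG file where the expensive call lives inside the callable handed to a PythonOperator rather than at the top level. The names fetch_table_names, load_tables, FETCH_CALLS, and example_deferred_fetch are all made up for this example, and the import path assumes Airflow 1.10. The import is guarded only so the snippet also runs on a machine without Airflow installed.

```python
from datetime import datetime

# Guarded import so the sketch also runs where Airflow is not installed.
# The operator module path below is the Airflow 1.10 one (an assumption
# about your version).
try:
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator
except ImportError:
    DAG = PythonOperator = None

# Counter used only to show when the expensive call actually happens.
FETCH_CALLS = {"count": 0}


def fetch_table_names():
    """Hypothetical stand-in for a slow API call or database query."""
    FETCH_CALLS["count"] += 1
    return ["orders", "customers"]


# Anti-pattern: a top-level call like this would run on every parse of the file.
# TABLES = fetch_table_names()


def load_tables():
    """Runs only when the task executes, not when the file is parsed."""
    tables = fetch_table_names()
    return len(tables)


if DAG is not None:
    dag = DAG(
        dag_id="example_deferred_fetch",  # hypothetical DAG name
        start_date=datetime(2019, 1, 1),
        schedule_interval="@daily",
    )
    load_tables_task = PythonOperator(
        task_id="load_tables",
        python_callable=load_tables,
        dag=dag,
    )
```

With this layout, parsing the file only defines functions and builds the DAG object; the fetch itself is deferred until the load_tables task actually runs.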

Hi Paola,
We were running 10 AU with 5 DAGs.
After deploying code, it was not updated.

Are the pods recreated after every deployment?

I bumped up to 20 AU, and when the pods came back, it was updated.
I have raised a support request for this.

Are you indicating that every time we do a deploy, we need to change the extra capacity values so that DAG changes are picked up?