How do I connect Astronomer Cloud to Databricks?

Many folks using Airflow decide to integrate with Databricks (a managed Apache Spark service) to offload execution of jobs Airflow is responsible for orchestrating. A few notes on doing this with Astronomer Cloud:

1. Create a native Airflow Databricks Connection

You should be able to rely on native Airflow connections and operators to connect to Databricks from Astronomer Cloud.
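One common way to supply that native connection is an AIRFLOW_CONN_<CONN_ID> environment variable holding a connection URI, where extras become query parameters. The sketch below (stdlib only) builds and inspects such a URI; the workspace host and token are hypothetical placeholders, and the exact fields the Databricks hook expects (host vs. token-in-extra) should be confirmed against the Airflow provider docs for your version.

```python
from urllib.parse import quote, urlparse, parse_qs

# Hypothetical values -- substitute your real workspace URL and personal
# access token (and never commit a real token to source control).
host = "my-workspace.cloud.databricks.com"
token = "dapi1234567890abcdef"

# Airflow can read connections from AIRFLOW_CONN_<CONN_ID> env vars as a URI;
# query parameters land in the connection's "extra" field.
conn_uri = f"databricks://{host}?token={quote(token)}"

parsed = urlparse(conn_uri)
print(parsed.scheme)           # scheme identifies the connection type
print(parsed.netloc)           # workspace host
print(parse_qs(parsed.query))  # extras, including the token
```

You would export this as, e.g., AIRFLOW_CONN_DATABRICKS_DEFAULT in your deployment's environment rather than hard-coding credentials in DAG files.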

In Airflow, operators offload authentication and connection handling to a hook object. So the databricks_operator simply delegates to the databricks_hook, which resolves credentials from the Airflow connection you configured.
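The operator-to-hook delegation described above can be sketched with a toy example. These are not the real Airflow classes (the class names, methods, and return values here are illustrative only): the point is that the operator holds just a connection ID, while the hook is the piece that would resolve credentials and call the external service.

```python
# Toy sketch of Airflow's operator/hook split -- not the real Airflow API.

class DatabricksHookSketch:
    """Stands in for a hook: owns auth lookup and the API call."""

    def __init__(self, conn_id):
        self.conn_id = conn_id  # where credentials would be looked up

    def submit_run(self, job_json):
        # The real hook would POST to the Databricks jobs API here,
        # using credentials pulled from the Airflow connection.
        return {"run_id": 1, "conn_id": self.conn_id, "job": job_json}


class DatabricksOperatorSketch:
    """Stands in for an operator: no auth logic, just task parameters."""

    def __init__(self, conn_id, job_json):
        self.conn_id = conn_id
        self.job_json = job_json

    def execute(self):
        # The operator never touches credentials itself -- it builds a
        # hook from the connection ID and defers the work to it.
        hook = DatabricksHookSketch(self.conn_id)
        return hook.submit_run(self.job_json)


op = DatabricksOperatorSketch(
    "databricks_default",
    {"notebook_task": {"notebook_path": "/Demo"}},
)
result = op.execute()
print(result["run_id"])
```

This split is why creating the connection (step 1) is all the auth setup your DAG code needs: the operator only ever references the connection ID.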

2. Whitelist Astronomer Cloud’s Static IP

We route all Astronomer Cloud traffic through a single NAT gateway, so you'll have to whitelist Astronomer's static IP address on the Databricks side.

Note: If you’re using Databricks, Astronomer Enterprise might be a compelling solution to consider down the line. With a self-hosted setup, you’d be able to attach the right AWS IAM roles to your nodes or peer the corresponding VPCs as explained here, instead of whitelisting a public IP.