Many teams using Airflow integrate with Databricks (a managed Apache Spark service) to offload execution of the jobs that Airflow orchestrates. A few notes on doing this with Astronomer Cloud:
1. Create a native Airflow Databricks Connection
You should be able to rely on Airflow's native Databricks connection type and operators (e.g. DatabricksSubmitRunOperator) to connect to Databricks from Astronomer Cloud.
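As a sketch, the work you hand to DatabricksSubmitRunOperator is described with the same JSON shape as the Databricks Runs Submit API and passed via the operator's `json` parameter. The cluster settings, notebook path, and connection ID below are placeholders, not values from this article:

```python
# Sketch of a job spec for Airflow's DatabricksSubmitRunOperator.
# The dict mirrors the Databricks Runs Submit API payload; the cluster
# settings and notebook path are placeholder values.
notebook_run_spec = {
    "new_cluster": {
        "spark_version": "7.3.x-scala2.12",  # placeholder runtime version
        "node_type_id": "i3.xlarge",         # placeholder AWS node type
        "num_workers": 2,
    },
    "notebook_task": {
        "notebook_path": "/Users/you@example.com/my-notebook",  # placeholder
    },
}

# Inside a DAG file, this spec would be passed to the operator, e.g.:
#
#   from airflow.providers.databricks.operators.databricks import (
#       DatabricksSubmitRunOperator,
#   )
#
#   run_job = DatabricksSubmitRunOperator(
#       task_id="run_databricks_job",
#       databricks_conn_id="databricks_default",  # the connection from step 1
#       json=notebook_run_spec,
#   )
```

The operator only needs the connection ID; the host and API token live in the Airflow Connection itself, so no credentials end up in DAG code.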
2. Whitelist Astronomer Cloud’s Static IP
We route all Astronomer Cloud traffic through a single NAT gateway, so you'll need to whitelist Astronomer's static IP address on the Databricks side. For reference:
- “Integrating Apache Airflow with Databricks” by Databricks
- “Whitelist IP Addresses” by Databricks
- VPC Access Doc on Astronomer
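If your workspace has IP access lists enabled, the whitelist entry can also be managed programmatically through Databricks' IP access list REST API. The workspace URL, token, IP address, and endpoint details below are assumptions and placeholders, so verify them against Databricks' documentation for your deployment:

```python
import json
import urllib.request

# Placeholder values -- substitute your workspace URL, a valid API token,
# and the static IP published for your Astronomer Cloud deployment.
WORKSPACE_URL = "https://dbc-example.cloud.databricks.com"
API_TOKEN = "<personal-access-token>"
ASTRONOMER_STATIC_IP = "203.0.113.10"  # placeholder (TEST-NET address)

# Payload for Databricks' IP access list endpoint (assumed to be
# POST /api/2.0/ip-access-lists on workspaces with the feature enabled).
payload = {
    "label": "astronomer-cloud",
    "list_type": "ALLOW",
    "ip_addresses": [f"{ASTRONOMER_STATIC_IP}/32"],
}

request = urllib.request.Request(
    url=f"{WORKSPACE_URL}/api/2.0/ip-access-lists",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Sending is left commented out so this sketch stays side-effect free:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```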
Note: If you’re using Databricks, Astronomer Enterprise might be a compelling solution to consider down the line. With a self-hosted setup, you’d be able to attach the appropriate AWS IAM roles to your nodes or peer the corresponding VPCs, instead of whitelisting a public IP.