How do I whitelist Astronomer Cloud on AWS Redshift?

There are two requirements to allow Astronomer Cloud to execute queries on your Redshift cluster, followed by a quick test to confirm everything works.

Step 1: Make sure your Redshift Cluster is publicly accessible

If you didn’t enable this when you created the cluster, it’s easy to change: go to the Redshift section of your AWS Console, choose the relevant cluster, and click “Modify Cluster”.

From there, toggle the “Publicly Accessible” option to “Yes” and click Modify.
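
If you would rather script this change than click through the console, the following is a minimal sketch. It assumes you use boto3 and its Redshift client's modify_cluster call with the PubliclyAccessible flag; the helper name make_publicly_accessible and the cluster identifier are hypothetical placeholders, not part of Astronomer or AWS.

```python
def make_publicly_accessible(redshift_client, cluster_id):
    """Ask Redshift to make the given cluster publicly accessible.

    `redshift_client` is assumed to be a boto3 Redshift client, e.g.
    boto3.client("redshift"); `cluster_id` is your own cluster identifier.
    """
    return redshift_client.modify_cluster(
        ClusterIdentifier=cluster_id,
        PubliclyAccessible=True,
    )


# Example (needs boto3 and AWS credentials; "my-cluster" is a placeholder):
#   import boto3
#   make_publicly_accessible(boto3.client("redshift"), "my-cluster")
```

The client is passed in rather than created inside the function, so you can choose the region and credentials yourself.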

Step 2: Whitelist the Cloud IP address
Even though you’ve set up your Redshift cluster to be publicly accessible, you’ll still want to limit where statements can be executed from. With Astronomer, all queries will come from the same IP address:

To whitelist that address for your Redshift cluster, go to “Security” and, depending on the specifics of your AWS account, click “Go to the EC2 Console”.

From there, click into the “Inbound” section of the relevant Security Group. You can confirm which group that is in the “VPC security groups” section of the Cluster Profile page you were on previously.

Open up the Inbound rules by clicking “Edit”, add the Cloud IP address and click Save.
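
The inbound rule can also be expressed in code. This is a rough sketch, assuming boto3's EC2 client and its authorize_security_group_ingress call. The helper names are hypothetical, and the IP in the example comment is a documentation placeholder, not the actual Cloud IP address. The rule admits a single address (a /32 range) on the Redshift port 5439.

```python
import ipaddress

REDSHIFT_PORT = 5439  # Redshift's default port


def build_ingress_rule(ip, port=REDSHIFT_PORT, description="Astronomer Cloud"):
    """Build an inbound TCP rule that admits exactly one IPv4 address."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed IP
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{address}/32", "Description": description}],
    }


def whitelist_ip(ec2_client, security_group_id, ip):
    """Add the rule to a security group using an assumed boto3 EC2 client."""
    return ec2_client.authorize_security_group_ingress(
        GroupId=security_group_id,
        IpPermissions=[build_ingress_rule(ip)],
    )


# Example (placeholder values; substitute your group ID and the Cloud IP):
#   import boto3
#   whitelist_ip(boto3.client("ec2"), "sg-...", "203.0.113.10")
```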

Give your cluster a minute to update and then test access from within any Airflow deployment.
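
Before setting up a full connection, a quick way to check that the rule took effect is to see whether the cluster's port accepts TCP connections. The sketch below uses only the Python standard library; the function name can_reach is hypothetical, and the host you pass in would be your own cluster endpoint. The check is only meaningful when run from the whitelisted address, for example inside a task in your Airflow deployment.

```python
import socket


def can_reach(host, port=5439, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

A False result only tells you the port was not reachable from where the code ran; it does not distinguish a missing security group rule from a cluster that is not publicly accessible.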

Step 3: Test a query
Because Redshift uses the same drivers as Postgres, you can add a connection to Airflow using the same methods as any other Postgres db. Simply go to Admin/Connections in the top menu bar and click “Create”.

Pick a recognizable Conn Id (anything that will help you remember) and choose “Postgres” as the Conn Type. Add in the endpoint that was generated for you when you created the cluster as the Host.

The schema will be the value of “Database Name” in the “Cluster Database Properties” section of your Redshift cluster configuration.

Add in the username and password for whatever user you want to execute the queries and set the port to 5439 (rather than 5432 as in a normal Postgres db).
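
The fields above can also be combined into a single Postgres-style connection URI. This is a small sketch, assuming you want to supply the connection as a URI instead of through the form; the function name redshift_conn_uri and the example values are hypothetical. Credentials are percent-encoded so that characters such as “@” or “/” in a password do not break the URI.

```python
from urllib.parse import quote


def redshift_conn_uri(host, database, user, password, port=5439):
    """Build a Postgres-style connection URI for a Redshift cluster."""
    # Percent-encode each part so reserved characters stay unambiguous.
    return (
        f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Example with placeholder values:
#   redshift_conn_uri("my-cluster.example.com", "dev", "airflow", "p@ss/word")
```

Airflow can read connections from environment variables named AIRFLOW_CONN_ followed by the upper-cased Conn Id, so a URI like this could be set there if your deployment lets you define environment variables.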

After saving your connection, go to Data Profiling/Ad Hoc Query from the top menu bar and choose the Redshift connection you just created. Run a simple query to make sure it’s all running properly. If it is, you’re done!