Can I use Airflow to open an SSH tunnel into one of our databases?


#1

How can we set up MongoDB and MySQL connections through an SSH tunnel? When I run my Python scripts on my Windows server, the tunnel into our databases already exists, but it’s not currently part of our Airflow environment.

Do you have an SSH tunnel available by default or would we have to code that into the DAGs/operators?


#2

We’ve had a few customers whose security restrictions require them to open an SSH tunnel as part of their workflow. Typically, we tell folks to use the standard SSH hook (SSHHook) already built into Airflow and open the tunnel from within the task.

Here’s an example of accessing a Postgres database using SSH: https://github.com/vparekh94/test_ssh_tunnel/tree/master/
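
For MySQL, the same idea might look roughly like the sketch below. It is not taken from the linked repo. It assumes Airflow 2.x with the SSH provider installed, pymysql listed in requirements.txt, and an Airflow connection with the hypothetical id ssh_bastion; the host name, user, password, and database are placeholders too. SSHHook.get_tunnel() returns an sshtunnel forwarder, and the small helper just makes sure it gets started and stopped around the query.

```python
from contextlib import contextmanager


@contextmanager
def tunneled_port(hook, remote_host, remote_port):
    """Open an SSH tunnel via hook.get_tunnel() and yield the local port."""
    # On Airflow's SSHHook, get_tunnel() returns an sshtunnel.SSHTunnelForwarder.
    # No local_port is passed, so a free local port is picked for us.
    tunnel = hook.get_tunnel(remote_port=remote_port, remote_host=remote_host)
    tunnel.start()
    try:
        yield tunnel.local_bind_port
    finally:
        tunnel.stop()  # close the tunnel even if the query fails


def query_mysql_through_tunnel():
    """Task callable; connection id, host, and credentials are placeholders."""
    # Imported inside the function so this file can be loaded without them.
    from airflow.providers.ssh.hooks.ssh import SSHHook  # Airflow 2.x path
    import pymysql  # assumption: added to requirements.txt

    hook = SSHHook(ssh_conn_id="ssh_bastion")  # hypothetical connection id
    with tunneled_port(hook, "mysql.internal.example", 3306) as port:
        # Connect to the local end of the tunnel, not to the database host.
        conn = pymysql.connect(
            host="127.0.0.1",
            port=port,
            user="reader",
            password="change-me",
            database="app",
        )
        try:
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
                return cur.fetchone()
        finally:
            conn.close()
```

You would pass query_mysql_through_tunnel as the python_callable of a PythonOperator. The MongoDB case should work the same way: open the tunnel to the Mongo port and point your client at 127.0.0.1 and the yielded local port.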

If you’re running the example above locally, make sure you have all the necessary dependencies. For example, you may see an error like this:

FileNotFoundError: [Errno 2] No such file or directory: 'ssh': 'ssh'

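This error means the ssh client binary is missing from the image. You’ll need to make sure that you’ve added openssl-dev to your packages.txt file and rebuilt the image. If the error persists, also add an OpenSSH client package (such as openssh-client, depending on your base image).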

Feel free to also check out a case study on how one of our customers syncs their application database to their data warehouse through an SSH tunnel. The descriptions of our product and features are a bit outdated, but it might give you an idea of what your workflows could look like.

If SSH-ing via the hook isn’t an option for you, our Enterprise offering might be worth a look, as it’s a version of our platform deployed directly in your VPC.