Inject configuration files in deployments

dcereijodo · November 6, 2019, 1:45pm

In my use case I need to set up configuration files on the worker nodes for my DAGs to function properly. I am using Astronomer Cloud, and as far as I understand the platform, I currently have two options to set those up:

Generate the files before or during the image build process in my CI/CD pipeline and COPY them into the image
Create a specific Airflow DAG that I can run on demand and that fetches the config and sets up the files in the container.

Putting the config files, with potential secrets in them, in the image does not feel very safe. On the other hand, having a full DAG to do the config does not work well either, as I will have to remember to run it every time I re-deploy. Some other, more convenient approaches that I do not think are possible are:

Bootstrapping the entrypoint to fetch and set up the configuration on start-up
SSH to the container and put the files there after deployment

What are the alternatives here?

AndrewHarmon · November 8, 2019, 3:23pm

Yes, you are correct that the last two options wouldn’t really work for Cloud. You do not have access to SSH into the containers. Typically secrets are stored as Airflow connections through the Airflow UI. Is that a viable solution for you? Here at Astronomer, we’ve been brainstorming ways to pull secrets from other systems such as Vault or AWS Secrets Manager. We don’t have anything on the roadmap yet, but it’s definitely something we are thinking about. So for now I’d recommend baking any non-sensitive config settings into your image, and then manually creating Airflow secrets for your sensitive details. We have seen people create a DAG to sync Airflow connections from a remote source like Vault, but that is not an out-of-the-box solution and you would have to trigger that DAG every time the secrets changed. That could be accomplished with the Airflow REST API, I think, though.
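For illustration, the core of such a sync DAG could look roughly like the sketch below. It is plain Python and only covers the comparison step; `sync_connections`, its arguments, and the `upsert` callback are hypothetical names, not part of Airflow or Astronomer. In a real DAG the `remote` mapping would come from something like Vault and `upsert` would write to the Airflow metadata database, both of which are left out here.

```python
from typing import Callable, Dict, List


def sync_connections(
    remote: Dict[str, str],
    current: Dict[str, str],
    upsert: Callable[[str, str], None],
) -> List[str]:
    """Push every remote connection URI that is missing or different locally.

    remote:  conn_id -> URI as fetched from the external secret store (assumed shape).
    current: conn_id -> URI as currently known to Airflow (assumed shape).
    upsert:  hypothetical callback that creates or updates one connection.
    Returns the conn_ids that were written, in sorted order.
    """
    changed = []
    for conn_id, uri in sorted(remote.items()):
        if current.get(conn_id) != uri:
            upsert(conn_id, uri)
            changed.append(conn_id)
    return changed


# Example with an in-memory dict standing in for Airflow's connection store.
store = {"warehouse": "postgres://old-host/db"}
remote_secrets = {
    "warehouse": "postgres://new-host/db",
    "bucket": "aws://",
}
updated = sync_connections(remote_secrets, store, store.__setitem__)
```

Because only changed entries are written, re-running the sync when nothing has changed is a no-op, so the DAG can be triggered as often as needed.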

dcereijodo · November 14, 2019, 10:54am

Storing the secrets in Airflow connections does not work in my case. I need the JSON configuration files with secrets to operate.

In the end we solved the situation by using the KubernetesPodOperator to run the tasks that required physical files with credentials in the filesystem, on containers whose entrypoint we could control, though it feels like quite a complicated workaround…
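A minimal sketch of what such an entrypoint step could look like, assuming the JSON arrives in an environment variable (for example, one populated from a Kubernetes Secret). The function name `write_config_from_env`, the variable name `APP_CONFIG_JSON`, and the target path are all hypothetical and only illustrate the idea; this is not the exact code we run.

```python
import json
import os


def write_config_from_env(env_var: str, path: str) -> dict:
    """Write the JSON held in env_var to path, readable by the owner only."""
    raw = os.environ[env_var]   # raises KeyError if the variable is missing
    config = json.loads(raw)    # fail early if the payload is not valid JSON

    # Create the file with mode 0600 so the credentials are not world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(config, fh)
    # The mode passed to os.open is ignored for pre-existing files, so enforce it.
    os.chmod(path, 0o600)
    return config


# Hypothetical use at the top of a container entrypoint script:
# write_config_from_env("APP_CONFIG_JSON", "/tmp/service-config.json")
```

The entrypoint would call this once before handing over to the actual task command, so the secret never has to be baked into the image.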

Have you considered the possibility of supporting post-start scripts? Similar to what you already do for requirements.txt and packages.txt, but with actual bash code that gets executed right after startup. That would be useful, I think.
