How to set up Azure Key Vault as an external secrets backend on Astronomer?

Hello,

I’m trying to find a way to retrieve Azure Key Vault secrets on my Astronomer Airflow instance.

I tried to follow a similar approach from your documentation on Hashicorp Vault & AWS SSM Parameter Store: https://www.astronomer.io/docs/cloud/stable/customize-airflow/secrets-backend
and this documentation from Airflow: Azure Key Vault Backend — apache-airflow-providers-microsoft-azure Documentation.

I’m using the “Service Principal with Secret” credentials to authenticate with DefaultAzureCredential() method. Documentation found here: Azure Identity client library for Python — Azure SDK for Python 2.0.0 documentation

So far it’s not working (for context, I’m testing this locally).

Here’s what I have in my Dockerfile (note that AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET and AZURE_VAULT_URL are saved in my .env file):

FROM quay.io/astronomer/ap-airflow:2.0.0-buster-onbuild
ENV AZURE_CLIENT_ID=$AZURE_CLIENT_ID
ENV AZURE_TENANT_ID=$AZURE_TENANT_ID
ENV AZURE_CLIENT_SECRET=$AZURE_CLIENT_SECRET
ENV AZURE_VAULT_URL=$AZURE_VAULT_URL

ENV AIRFLOW__SECRETS__BACKEND="airflow.providers.microsoft.azure.secrets.azure_key_vault.AzureKeyVaultBackend"

ENV AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "AIRFLOW-CONNECTIONS", "variables_prefix": "AIRFLOW-VARIABLES", "vault_url": $AZURE_VAULT_URL, "AZURE_CLIENT_ID": $AZURE_CLIENT_ID, "AZURE_TENANT_ID":$AZURE_TENANT_ID, "AZURE_CLIENT_SECRET":$AZURE_CLIENT_SECRET}'

Here’s what I have in my DAG file:

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime
from airflow.hooks.base_hook import BaseHook
from airflow.contrib.secrets.azure_key_vault import AzureKeyVaultBackend


def get_secrets(**kwargs):
    variable = AzureKeyVaultBackend.get_variable(kwargs["var_name"])
    print(variable)

with DAG('test_keyvault_dag', start_date=datetime(2021, 3, 3), schedule_interval=None) as dag:

    test_task = PythonOperator(
        task_id='test-task',
        python_callable=get_secrets,
        op_kwargs={'var_name': 'TEST'}
    )

I’d love to get some feedback on how I can make this work. Thanks in advance.

Hello, just following up on this. Any help would be really appreciated. Thank you.

Hi,
I have just tested this with Azure Key Vault. My setup is almost the same as yours. In my case the credentials are not in AIRFLOW__SECRETS__BACKEND_KWARGS because they are already present as environment variables (only vault_url is there). I then added apache-airflow-providers-microsoft-azure to requirements.txt. After that I was able to access the variable “hello”, which I had stored in the Key Vault as <variables_prefix>-hello. I don’t think you need to call AzureKeyVaultBackend directly; you can just use the general Variable.get().
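
To make the naming convention concrete, here is a minimal sketch of how a variable key maps to a Key Vault secret name. The helper secret_name is hypothetical and only mirrors the pattern described above (prefix, hyphen, key). The underscore-to-hyphen replacement is an assumption, based on Key Vault secret names allowing only alphanumerics and hyphens; check the provider source for the exact rule.

```python
def secret_name(prefix: str, key: str, sep: str = "-") -> str:
    """Hypothetical helper: build the Key Vault secret name for an Airflow key."""
    # Join prefix and key with the separator, e.g. AIRFLOW-VARIABLES + hello.
    name = f"{prefix}{sep}{key}"
    # Assumption: underscores are swapped for the separator, because
    # Key Vault secret names only allow alphanumerics and hyphens.
    return name.replace("_", sep)


if __name__ == "__main__":
    # The secret to create in Key Vault for Variable.get("hello").
    print(secret_name("AIRFLOW-VARIABLES", "hello"))
```

With this convention, the secret is created in Key Vault under the full prefixed name, and the DAG asks only for the short key.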

Thank you! Your suggestion worked.

In case anybody else is looking to see how this works:

  • Add this to the Dockerfile:
    ENV AIRFLOW__SECRETS__BACKEND="airflow.providers.microsoft.azure.secrets.azure_key_vault.AzureKeyVaultBackend"
    ENV AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "AIRFLOW-CONNECTIONS", "variables_prefix": "AIRFLOW-VARIABLES", "vault_url": $AZURE_KEYVAULT_URL}'

  • Add this to the requirements.txt file:
    apache-airflow-providers-microsoft-azure

  • Add this to the .env file (for local development):
    AZURE_CLIENT_ID=your_client_id
    AZURE_TENANT_ID=your_tenant_id
    AZURE_CLIENT_SECRET=your_client_secret
    AZURE_KEYVAULT_URL="https://your_keyvault_name.vault.azure.net"

  • I named the secrets in the Key Vault so that they begin with AIRFLOW-VARIABLES or AIRFLOW-CONNECTIONS (e.g., AIRFLOW-VARIABLES-API-KEY) and retrieved them using Variable.get("API-KEY")
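
If the backend fails to load, one thing worth checking is whether the kwargs string is still valid JSON once $AZURE_KEYVAULT_URL has been filled in. The sketch below is a standalone check and not part of Airflow. The name check_backend_kwargs is hypothetical, and it assumes the variable is substituted verbatim, so the value must carry its own double quotes (as in the .env line above) for the result to parse.

```python
import json
from string import Template

# The kwargs string from the Dockerfile step above.
KWARGS = (
    '{"connections_prefix": "AIRFLOW-CONNECTIONS", '
    '"variables_prefix": "AIRFLOW-VARIABLES", '
    '"vault_url": $AZURE_KEYVAULT_URL}'
)


def check_backend_kwargs(template: str, env: dict) -> dict:
    """Hypothetical check: fill in $NAME placeholders, then parse as JSON."""
    # Assumption: substitution is a plain textual replacement of $NAME.
    expanded = Template(template).substitute(env)
    # Raises json.JSONDecodeError if the expanded string is not valid JSON.
    return json.loads(expanded)


if __name__ == "__main__":
    # The value keeps its surrounding double quotes, as written in .env.
    env = {"AZURE_KEYVAULT_URL": '"https://your_keyvault_name.vault.azure.net"'}
    print(check_backend_kwargs(KWARGS, env))
```

Under this assumption, a value without the surrounding double quotes would leave the URL unquoted in the expanded string, and parsing would fail.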
