DatabricksSqlOperator ValueError "too many values to unpack"

I’m attempting to read data from Databricks with the DatabricksSqlOperator, following the example code here:
https://airflow.apache.org/docs/apache-airflow-providers-databricks/stable/operators/sql.html#selecting-data-into-a-file

My code (deployed in Docker on localhost using the Astro CLI):

    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    select_into_file = DatabricksSqlOperator(
        databricks_conn_id=tokenization_databricks,
        sql_endpoint_name=sql_endpoint_name,
        task_id='extract_tokens_to_file',
        sql="select * from schema.tokenization_input_1000",
        output_path="/tmp/extracted_tokens.csv",
        output_format="csv",
    )

When the results reach the DatabricksSqlOperator.execute() method, I consistently get an error at line 164:

[2022-11-22, 17:07:32 UTC] {databricks_sql.py:161} INFO - Executing: select * from schema.tokenization_input_1000
[2022-11-22, 17:07:32 UTC] {base.py:71} INFO - Using connection ID 'tokenization_databricks' for task execution.
[2022-11-22, 17:07:32 UTC] {databricks_base.py:430} INFO - Using token auth.
[2022-11-22, 17:07:33 UTC] {databricks_base.py:430} INFO - Using token auth.
[2022-11-22, 17:07:34 UTC] {client.py:115} INFO - Successfully opened session b'\x01\xedj\x88(\xb7\x10\xd7\xa2\x16\xf2\t\x0e\xb4\xd9\xe3'
[2022-11-22, 17:07:34 UTC] {sql.py:315} INFO - Running statement: select contributorID, datasetID, PatientID1 from schema.tokenization_input_1000, parameters: None
[2022-11-22, 17:07:37 UTC] {taskinstance.py:1851} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/providers/databricks/operators/databricks_sql.py", line 164, in execute
    schema, results = cast(List[Tuple[Any, Any]], response)[0]
ValueError: too many values to unpack (expected 2)
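
My reading of that traceback (a guess on my part, not anything confirmed by the provider docs): execute() expects response[0] to be a (schema, results) pair, but the hook appears to be handing back the result rows themselves, so a row with more than two columns cannot be unpacked into two names. A minimal sketch of that Python behaviour, using a made-up three-column row like the contributorID/datasetID/PatientID1 row in the logged query:

    # Hypothetical three-column result row (illustration only, not the provider's actual code path).
    row = ("contrib_1", "dataset_1", "patient_1")

    # The operator does roughly: schema, results = response[0]
    schema, results = row  # ValueError: too many values to unpack (expected 2)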
Things I’ve already checked:

  • I’ve verified that my Databricks connection is defined correctly.
  • I can see the queries being received in Databricks from “PyDatabricksSqlConnector 2.0.2”.
  • I can see Databricks responding with the expected query results.
  • I added do_xcom_push=True to the operator call and I do not see any data in XCom.
  • I created a separate Python script using the databricks-sql-connector (roughly like the sketch below this list) to double-check that I can read data to my local machine, and it works correctly.
  • The Databricks SQL Warehouse version is v2022.35.
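
For reference, the standalone check looked roughly like this (the server hostname, HTTP path, and token below are placeholders, not my real values):

    from databricks import sql  # databricks-sql-connector

    with sql.connect(
        server_hostname="<workspace>.cloud.databricks.com",  # placeholder
        http_path="/sql/1.0/warehouses/<warehouse-id>",      # placeholder
        access_token="<personal-access-token>",              # placeholder
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("select * from schema.tokenization_input_1000 limit 10")
            for row in cursor.fetchall():
                print(row)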

Thanks,
Bill

FYI: This issue was fixed by upgrading the Databricks Airflow provider package to v4.0.0, which was released this week:

apache-airflow-providers-databricks==4.0.0