Error installing sagemaker in airflow 1.10 image

I switched my Dockerfile over to the new 1.10.1 image and added sagemaker to requirements.txt, and am seeing the following error:

Step 1/1 : FROM astronomerinc/ap-airflow:0.7.5-1.10.1-onbuild
# Executing 5 build triggers
 ---> Using cache
 ---> Using cache
 ---> Running in 3efe59795a4a
azure-mgmt-nspkg 3.0.2 has requirement azure-nspkg>=3.0.0, but you'll have azure-nspkg 2.0.0 which is incompatible.
docker-compose 1.24.1 has requirement requests!=2.11.0,!=2.12.2,!=2.18.0,<2.21,>=2.6.1, but you'll have requests 2.21.0 which is incompatible.
sagemaker 1.42.6 has requirement boto3>=1.9.213, but you'll have boto3 1.7.84 which is incompatible.
sagemaker 1.42.6 has requirement requests<2.21,>=2.20.0, but you'll have requests 2.21.0 which is incompatible.
Command "/usr/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-xktbq8me/scipy/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-kk58febo/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-xktbq8me/scipy/
You are using pip version 18.1, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
The command '/bin/sh -c pip install --no-cache-dir -q -r requirements.txt' returned a non-zero code: 1
Error: command 'docker build -t airflow/airflow:latest failed: failed to execute cmd: exit status 1

Could this be related to a boto3 issue? https://forums.aws.amazon.com/message.jspa?messageID=867403

Looks like it depends on scipy, which takes FOREVER to build from source; it took over 20 minutes on my machine. Please try adding these to your packages.txt file:

gfortran
gcc
python3-dev
lapack-dev

Thanks Andrew.

Yay for over-bloated packages. Ultimately I’m just trying to use the SagemakerOperator in Airflow, so I’m not sure if it actually depends on that sagemaker package. I saw them being used in conjunction in a tutorial, but it’s possible that’s not actually necessary.

Looks like the operator does an import sagemaker, so I think you’ll need that library installed, unfortunately.
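If it helps, here is a minimal sketch for checking from inside the container whether the modules can be located at all. The helper name is_importable is made up for illustration, and this only confirms that Python can find the module; it does not prove the import will succeed once dependency checks run.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration; pass top-level names only
    (e.g. "sagemaker"), not dotted submodule paths.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Report on the two packages discussed in this thread.
    for name in ("sagemaker", "scipy"):
        print(name, "found" if is_importable(name) else "missing")
```

Running it with the image's python3 should show quickly whether the install step actually put the packages on the path.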

Yeah, I was hoping that the sagemaker sdk was just being used to generate the config dicts that are passed to the SagemakerOperators, but alas.

Did you happen to get any dependency version clashes? I received the following warnings during image build:

azure-mgmt-nspkg 3.0.2 has requirement azure-nspkg>=3.0.0, but you'll have azure-nspkg 2.0.0 which is incompatible.
docker-compose 1.24.1 has requirement requests!=2.11.0,!=2.12.2,!=2.18.0,<2.21,>=2.6.1, but you'll have requests 2.21.0 which is incompatible.
sagemaker 1.42.8 has requirement boto3>=1.9.213, but you'll have boto3 1.7.84 which is incompatible.
sagemaker 1.42.8 has requirement requests<2.21,>=2.20.0, but you'll have requests 2.21.0 which is incompatible.

These were followed by this error from a DAG that imports the package:

Broken DAG: [/usr/local/airflow/dags/sagemaker-example.py] (requests 2.21.0 (/usr/lib/python3.6/site-packages), Requirement.parse('requests<2.21,>=2.20.0'), {'sagemaker'})
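To make the Broken DAG message above easier to read: it says the installed requests 2.21.0 falls outside the range sagemaker declares (<2.21,>=2.20.0). Below is a simplified sketch of that kind of check. It is not how pkg_resources implements it, it handles plain numeric versions only, and the function names (parse_version, satisfies) are my own.

```python
import operator
import re

# Comparison operators supported by this simplified sketch.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text):
    """Turn '2.21.0' into (2, 21, 0). Only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so (2, 21) compares like (2, 21, 0)."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def satisfies(installed, spec):
    """Check an installed version against a spec such as '<2.21,>=2.20.0'."""
    have = parse_version(installed)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        wanted = parse_version(match.group(2))
        left, right = _pad(have, wanted)
        if not _OPS[match.group(1)](left, right):
            return False
    return True


if __name__ == "__main__":
    # Versions and specs copied from the warnings in this thread.
    print(satisfies("2.21.0", "<2.21,>=2.20.0"))
    print(satisfies("1.7.84", ">=1.9.213"))
```

Both checks come out false for the versions in the image, which matches the warnings: requests is too new for sagemaker and boto3 is too old.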

Also, considering the massive build time when including this dependency, might it make sense to install scipy + sagemaker in the Dockerfile so as to benefit from image caching in subsequent builds?

Alright, I got this working well by including the following in the Dockerfile:

RUN apk add python3-dev gfortran gcc lapack-dev build-base py3-scipy
RUN pip install sagemaker