Using Python 3.11 with AWS Lambda

2023/02/28

#software #python #development #docker #serverless #aws

A Python Problem

At the time of writing, the newest Python runtime that AWS Lambda offers is 3.9. That version is challenging to use for many reasons, but primarily because package maintainers do not necessarily cater to users on older versions. And there is more. When you work with Python, you must realize that you usually depend on native code as well, not to mention a specific libc and CPU architecture: the most useful Python libraries are written in C, C++, Fortran, or Rust. So when you try to deploy your code to a less commonly used platform, it can burst into flames in the worst kind of ways.

I have spent more hours trying to get some Python lib to work on a platform than learning Rust.

The last nail in the coffin of trying to use Python 3.9 on AWS was a library whose Rust code was compiled against a newer version of libc than the one AWS has. That triggered the creation of a new solution for deploying Python to AWS Lambda so that we do not need to care about these issues anymore.
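A quick way to see what a given environment offers is to ask Python itself. The sketch below uses only the standard library's platform module to report the C library the interpreter is linked against and the CPU architecture. The function name describe_platform is purely illustrative. Running it both locally and inside the target environment can reveal a libc or architecture mismatch before a compiled package fails at import time.

```python
import platform


def describe_platform() -> dict[str, str]:
    """Collect the details that most often break compiled Python packages."""
    # libc_ver() returns a (name, version) pair of strings; both are empty
    # when the C library cannot be determined (for example on non-glibc systems).
    libc_name, libc_version = platform.libc_ver()
    return {
        "python": platform.python_version(),
        "libc": f"{libc_name} {libc_version}".strip() or "unknown",
        "machine": platform.machine(),  # e.g. "x86_64" or "aarch64"
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```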

Docker to the Rescue

By using a Docker image as the source for your Lambda function, you can simplify your deployment. Docker lets you package your application and all of its dependencies into a single image, and you can deploy that entire stack with a single command.

An image also gives your application a consistent environment: the operating system, system libraries, and Python dependencies are the same wherever the image runs, whether on your machine, in dev, or in prod. That consistency reduces the risk of errors or unexpected behavior that only appear after deployment.

How to Do It

  1. Create, build, and upload a base image with the required Python version
  2. Create, build, and upload an application-specific image
  3. Set up and deploy the Lambda function with Terraform

Base Dockerfile

This Dockerfile defines the base image from which we build our application-specific image, and it provides a stable environment for the app. To keep the final image small, we use a multi-stage build. The FROM lines specify the Python version we require for our runtime. In the first stage, we install the AWS Lambda Runtime Interface Client (awslambdaric), which handles communication between the Lambda environment and our code. In the second stage, we only need to copy the virtual environment that contains the runtime interface client into the final image.

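# Declared before the first FROM so that a stage can opt in with a bare ARG.
ARG FUNCTION_DIR="/var/task"

# Stage 1: build a virtual environment that contains the Runtime Interface Client.
FROM python:3.11.2-slim-bullseye AS build-image
ARG FUNCTION_DIR

RUN mkdir -p ${FUNCTION_DIR} && mkdir -p /venv
# "|| :" keeps the build going if the user already exists.
RUN useradd -m -u 5000 lambda || :
RUN chown lambda ${FUNCTION_DIR} && chown lambda /venv
# Build tools needed to compile the native part of awslambdaric.
RUN apt-get update && \
  apt-get install -y --no-install-recommends \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev && \
  apt-get clean && \
  rm -rf /var/lib/apt/lists/*
USER lambda
RUN python -m venv /venv
ENV PATH="/venv/bin:$PATH"
# Sanity check: pip should now resolve to the one inside /venv.
RUN which pip
RUN pip install pip --upgrade
RUN pip install awslambdaric

# Stage 2: the final image only receives the finished virtual environment.
FROM python:3.11.2-slim-bullseye
RUN mkdir -p /venv
RUN useradd -m -u 5000 lambda || :
RUN chown lambda /venv
USER lambda
COPY --from=build-image /venv /venv
ENV PATH="/venv/bin:$PATH"
RUN pip list
# The Runtime Interface Client receives the handler name from CMD.
ENTRYPOINT [ "python", "-m", "awslambdaric" ]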

As you can see, we run our app as a dedicated user instead of root. Running your application as a non-root user is recommended. In the context of AWS Lambda, security is probably not that big of a deal, but following the principle of least privilege (PoLP) is still a good idea.

We use Ninja for building and uploading images. You can find more about this here: Misusing Ninja. After being built, the base image is uploaded to AWS ECR, so it can be fetched for the application-specific image. This is our build.ninja file:

aws_account_id = xxxxxxxxxx
python_version = 3.11.2
repo = my-repo

rule login-to-ecr
  command = aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com
  description = Logging into ECR

rule build-aws-lambda-base
  command = docker build . -t ${repo}:lambda-py-${python_version} --file Dockerfile.aws_lambda_python_${python_version}
  description = Building ${repo}:lambda-py-${python_version}

rule tag-aws-lambda-base
  command = docker tag ${repo}:lambda-py-${python_version} ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com/${repo}:lambda-py-${python_version}
  description = Tagging ${repo}:lambda-py-${python_version}

rule push-aws-lambda-base
  command = docker push ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com/${repo}:lambda-py-${python_version}
  description = Pushing ${repo}:lambda-py-${python_version}

build login-to-ecr: login-to-ecr
build build-aws-lambda-base: build-aws-lambda-base || login-to-ecr
build tag-aws-lambda-base: tag-aws-lambda-base || build-aws-lambda-base
build push-aws-lambda-base: push-aws-lambda-base || tag-aws-lambda-base

default login-to-ecr build-aws-lambda-base tag-aws-lambda-base push-aws-lambda-base

Let’s run it:

ninja -f build.ninja

We have the base image in ECR. This gives us control over what goes into the application image later. For immutable infrastructure it is paramount to use a base image that is not a moving target, meaning you do not run apt-get update or a similar command every time you build an application-specific container.

Application-Specific Dockerfile

This is the Dockerfile for the application that runs inside the Lambda function as a container. The base image that we just created is referenced on the first line.

FROM <aws_account_id>.dkr.ecr.eu-west-1.amazonaws.com/<repo>:lambda-py-3.11.2
ARG FUNCTION_DIR="/var/task"
ENV PATH="/venv/bin:$PATH"
USER lambda
WORKDIR ${FUNCTION_DIR}
ADD app .
COPY pyproject.toml .
COPY README.md .
RUN pip install .
RUN pip list
CMD ["app.handler"]

When we install the Python packages in this file, we usually lock each one to a specific version. Again, we want immutable infrastructure: running the build on different computers should result in the same image.
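One way to find the exact versions to lock is to list what is installed in a known-good environment in name==version form, and then copy those pins into the dependencies of pyproject.toml. The following is a minimal sketch using only the standard library, similar in spirit to pip freeze. The function name frozen_requirements is hypothetical and not part of our build.

```python
from importlib import metadata


def frozen_requirements() -> list[str]:
    """Return every installed distribution as a 'name==version' string."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```

Running this inside the built image, in addition to the pip list step in the Dockerfile, gives a list that can be compared between two builds to confirm they contain the same packages.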

Our directory looks like the following:

|__app/
| |__app.py
|
|__pyproject.toml
|__README.md
|__Dockerfile
|__build.ninja

If you have a more complicated application there are many more files in the app folder.

Our entry point is the function called handler inside app.py. CMD specifies where the Lambda handler is, in module.function form, and its value is passed to the ENTRYPOINT of the base image.
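For illustration, a minimal app.py could look like the sketch below. It is not our actual application: the response body is made up, and it assumes the event arrives in the API Gateway HTTP API (payload format 2.0) shape, where the request path is in rawPath. The only parts that matter for the Dockerfile are the module name app and the function name handler, which together form the app.handler value in CMD.

```python
import json
from typing import Any


def handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
    """Lambda entry point, referenced as "app.handler" in the Dockerfile CMD."""
    # Assumption: the event uses the API Gateway HTTP API payload format 2.0,
    # which carries the request path in "rawPath".
    path = event.get("rawPath", "/")
    body = {"message": "Hello from Python 3.11", "path": path}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-written event; no AWS services involved.
    print(handler({"rawPath": "/health"}, None))
```

Because the handler is a plain function, it can be called directly from pytest with a hand-written event, without building the image first.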

This image is built similarly to the base image: with Ninja. This is our build.ninja file for the app-specific image:

version   = 0.6.1
aws_account_id = xxxxxxxx
repo = my-repo


rule login-to-ecr
  command = aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com
  description = Logging in to ECR

rule build-image
  command = docker build . -t ${repo}:backend-api-${version} --file Dockerfile
  description = Building ${repo}:backend-api-${version}

rule tag-image
  command = docker tag ${repo}:backend-api-${version} ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com/${repo}:backend-api-${version}
  description = Tagging ${repo}:backend-api-${version}

rule upload-image
  command = docker push ${aws_account_id}.dkr.ecr.eu-west-1.amazonaws.com/${repo}:backend-api-${version}
  description = Push ${repo}:backend-api-${version}


build login-to-ecr: login-to-ecr
build build-image: build-image || login-to-ecr
build tag-image: tag-image || build-image
build upload-image: upload-image || tag-image

default login-to-ecr build-image tag-image upload-image

Let’s run it:

ninja -f build.ninja

Now there is an image in ECR that hosts our application. You can very easily move back and forth between different base image versions, try out new libraries, or update your application. Because we deploy the same image to dev and prod you can gain confidence about the version that you are working on as it passes through different stages from local to dev and finally to prod.

I have left testing, formatting, and linting out of the build process shown here because the complete build file with those steps would be too much to display. We use pytest, Ruff, and Black for most of these tasks. We also implemented integration tests in pytest for more thorough testing.

Lambda Function with Terraform

Our final task is to deploy to AWS Lambda, and we use Terraform to achieve this. The package_type must be Image and the image URI needs to be provided. We take advantage of ARM64 with this setup, which is a bit cheaper than running on x86_64 (AMD64). Note that the image itself must be built for ARM64 for this to work.

resource "aws_lambda_function" "docker-lambda-function" {
  function_name = "my-lambda-function"
  description   = "This is my lambda function"
  image_uri     = "<account_id>.dkr.ecr.eu-west-1.amazonaws.com/<repo>:backend-api-<lambda-function-version>"
  package_type  = "Image"
  role          = <lambda_role_arn>
  memory_size   = 2048
  timeout       = 10
  architectures = ["arm64"]
}

After deploying to AWS Lambda, we also create a CloudFront distribution and an API Gateway v2 API that front the Lambda function. The newer Lambda function URLs make it possible to skip API Gateway. Maybe later on we can have a look at the tradeoffs of that setup.


Conclusion

Using a Docker image as the packaging for an AWS Lambda function provides a consistent, immutable environment for your application, ensuring that it runs the same way regardless of where it is deployed. This helps reduce errors and unexpected behavior and makes it easier to package and distribute your application and its dependencies. As a bonus, Python 3.11 is noticeably faster than previous versions in many cases: the CPython release notes report a 10-60% speedup over 3.10, depending on the workload.