4 Pitfalls to Avoid when Deploying a Python Lambda to AWS in 2024


I have a very simple Python service I’d like to run “in the cloud”.
For nearly all of my personal projects I use GitHub Actions and Linode. At work, though, I use AWS for everything – just like at my last job, and the job before that.

The problem with AWS in 2024 shows up when you have a simple project you want to run cheaply and efficiently. The project management triangle steps in and tells you that you can pick only two of: performant, easy to use, and cost effective.
Because the project is for my wife, I picked based on her criteria – cost effective and performant. I ended up choosing an AWS Lambda function – actually a Python service using FastAPI – with CloudWatch Events triggering a scheduled “job”.
I’m glossing over a lot of the logic and systems design (like data storage, querying, authn/z, etc.) because sadly, that’s not the hard part.
The hard part with AWS Lambda in 2024 is actually getting it debuggable locally, fast to build and iterate on, and working from a CPU architecture/OS combination that isn’t available in the Lambda runtimes (macOS + ARM).

The Pitfalls:

1. Stack Choice

So it’s 2024 and there are still a seemingly infinite number of ways to deploy a Lambda to AWS. The problem is that all of the documentation available from AWS, and all of the help you might get from your TAMs or AWS Support staff, will be for AWS-created IaC. That narrows the main choices to CDK, CloudFormation, and SAM; of those, I chose SAM. It simplifies some things and accomplishes some of what the various other serverless stacks solve.
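For context, a SAM deployment of this shape boils down to a small template. This is a minimal sketch, not my actual template – the resource name, handler path, runtime version, and schedule are all illustrative:

```yaml
# template.yaml – minimal SAM sketch (names and values are assumptions)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: hello_world/
      Handler: app.lambda_handler
      Runtime: python3.12
      Architectures:
        - arm64            # matches the manylinux2014_aarch64 wheels below
      Events:
        Api:
          Type: Api        # fronts the FastAPI service
          Properties:
            Path: /{proxy+}
            Method: ANY
        Scheduled:
          Type: Schedule   # the CloudWatch Events "job"
          Properties:
            Schedule: rate(1 hour)
```

Everything in the rest of this post (the dependency directory, the arm64 wheels, the local debugging) hangs off a template like this one.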

2. Dependencies

Since this is a Python project, I chose to use poetry at the top level of the project and then run poetry export -f requirements.txt --output hello_world/requirements.txt to produce a pip-ready requirements.txt file. If you have a lot of Lambdas sharing similar dependencies, it may be easier to combine them into an external dependency layer and attach that to your Lambda in your SAM template; once a layer has been created it can be reused by other Lambdas. I chose to just set the PYTHONPATH variable to a directory that I’ve told pip to install into.
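If you'd rather not rely on the PYTHONPATH environment variable, the same effect can be achieved in code. This is a small sketch (the helper name is mine, and hello_world/python_deps is the pip target directory used later in this post):

```python
import os
import sys


def add_vendored_deps(deps_dir: str = "hello_world/python_deps") -> None:
    """Prepend a pip-target dependency directory to sys.path,
    mirroring what setting PYTHONPATH to that directory does."""
    path = os.path.abspath(deps_dir)
    if path not in sys.path:
        sys.path.insert(0, path)


# Call this before any imports that live in the vendored directory.
add_vendored_deps()
```

Either way, the point is the same: the Lambda runtime has to be able to find the dependencies pip installed outside of site-packages.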

This all would’ve been easier to manage if I had deployed the Lambdas as container images instead of zip packages, but then you get nailed on cold start times.

3. Architecture

Since I’m trying to run this on ARM and I’m developing on an Apple Silicon Mac, I ran into an issue I had not seen since developing in Perl and deploying to HP-UX Itanium machines. For the first time in ages I was installing dependencies that did not have binaries built for the system I was working on or deploying to. From pygwalker requiring psutil, which was missing a wheel, to numpy being grumpy about the way it was being imported…I thought I was done with this headache when I moved to doing mobile and backend service development in Java/Kotlin/Swift/etc.
After stumbling my way through this mess, I found an AWS KB article that brought me to my final pip install method:

pip install -r hello_world/requirements.txt -t hello_world/python_deps --upgrade --platform manylinux2014_aarch64 --only-binary=:all:

4. Debugging

The minute I had to run sam build, I felt like a failure. Once you run sam build, all sam local commands want to use that build, and rebuilding takes forever. So you either never run sam build, or you’re running it constantly. My happy middle ground ended up like this:

  • I created events for each of the API actions I wanted to debug and used PyCharm’s SAM debugging to invoke them. This triggers a sam build every time you debug.
  • When I was iterating quickly and wanted to see changes fast, I deleted the .aws-sam directory and just ran sam local start-api. Because start-api supports hot reloading, this let me iterate on API development without constantly waiting for containers to build.
  • I 100% gave up on using the -d PORT_NUM option for sam local start-api. It’s best to just create run configurations in PyCharm or VS Code, or let PyCharm do it via the AWS Toolkit plugin’s helpers. This approach uses sam build, so each debug run takes 10–30 seconds on my M1 MacBook Air; I’ve switched my development style to only debug when necessary now.
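The “events” in the first bullet are just saved JSON payloads that mimic what API Gateway hands the function. A trimmed sketch of one, assuming the proxy-integration shape (the path and header values here are illustrative):

```json
{
  "httpMethod": "GET",
  "path": "/items",
  "headers": { "Accept": "application/json" },
  "queryStringParameters": null,
  "body": null,
  "isBase64Encoded": false
}
```

Saving a handful of these per endpoint means any debugging session is one invoke away, instead of hand-crafting requests each time.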

Conclusion

Was it worth it?

I don’t even know anymore. I would’ve had this whole thing up on my Linode-hosted PaaS in 10 minutes. Alternatively, I could’ve just used Linode’s Kubernetes offering and had a fairly performant and resilient solution for a quarter of the cost of running Kubernetes on AWS.

On AWS, I probably should’ve just deployed a Dockerfile with Elastic Beanstalk and tuned the scaling/instance counts. That would’ve been the simplest and cheapest combination that was still fairly performant. I still believe a Lambda is cheaper overall for this use case, but it comes at a significant debugging/config/deployment cost.

On Lambda – you can avoid some size-constraint pitfalls (the 250 MB limit on unzipped package size) and gain some reusability over the alternatives by using Layers as well.

Ideally – this Lambda setup will be more performant when my wife needs it and less costly when she’s not using it. It’s just still a pain in the butt in 2024.
