10 things I learned while deploying my first Python function to AWS Lambda

I spent a few days on and off trying to deploy a Flask REST service to AWS Lambda, just to experience what the cool kids were talking about. These are some of the things I learned along the way:

 

Zappa is the easiest packager/deployer for python (as of December 2018)

Zappa provides good quality feedback on the packaging/deployment process. It’s compatible with all the popular Python REST frameworks. Add a minimal configuration, deploy, and you’re done. It packages your whole environment along with your own modules, so you will face very few “module not found” errors.

Zappa also provides a “tail” command that lets you debug deployment errors directly from the command line.
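For reference, a minimal zappa_settings.json looks something like this (the project name, region, and bucket below are placeholders, not my actual setup):

```json
{
    "dev": {
        "app_function": "app.app",
        "aws_region": "us-east-1",
        "project_name": "my-flask-api",
        "runtime": "python3.6",
        "s3_bucket": "my-zappa-deployments"
    }
}
```

Running “zappa init” generates this file interactively; after that, “zappa deploy dev” ships the package and “zappa tail dev” streams the logs to your terminal.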

 


Zappa doesn’t work properly with Anaconda (as of December 2018)

I use Anaconda for environment management. Zappa is geared towards venv users. There are “hacky” ways of making it work. Setting the VIRTUAL_ENV environment variable, according to https://github.com/Miserlou/Zappa/issues/167, got me a long way. But I had a third-party module that kept failing. I spent 6 hours of a weekend that I’ll never get back on this.
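For the record, here is the workaround from that issue, sketched out under the assumption that your conda environment lives in the default location (yours will differ; conda can tell you where):

```shell
# Zappa looks for the VIRTUAL_ENV variable, which venv sets but conda doesn't.
# Point it manually at the active conda environment before deploying.
# Find your environment's actual path with: conda info --envs
export VIRTUAL_ENV=$HOME/anaconda3/envs/myproject
zappa deploy dev
```

This is environment configuration, not a fix: it got most packages through, but as noted above, one third-party module of mine still refused to deploy.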

 

Anaconda’s environment management is very different from venv’s

Files are stored elsewhere (conda info --envs is your friend). I’ve used Anaconda since forever, so I didn’t know that venv stores its files locally, inside your project folder. Ugh! (Am I getting this wrong? Please tell me.) By default, Anaconda stores your dependencies outside of your project folder. Naturally, this wreaks havoc with anything that expects an environment folder within your project.
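You can see venv’s local-folder behavior with nothing but the standard library; this little sketch creates a throwaway project folder just to show where the files land:

```python
import pathlib
import tempfile
import venv

# Simulate a project directory and create an environment inside it,
# the way `python -m venv .venv` would from the project root.
project = pathlib.Path(tempfile.mkdtemp()) / "myproject"
venv.create(project / ".venv", with_pip=False)

# The interpreter config (and eventually site-packages) lives inside
# the project folder -- unlike conda, which keeps everything under
# something like ~/anaconda3/envs/<name>.
print((project / ".venv" / "pyvenv.cfg").exists())  # True
```

That locally-stored .venv folder is exactly what tools like Zappa go looking for.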

 

Chalice is almost as good as Zappa

Chalice is the native tool from AWS for doing these kinds of things. It’s not a packager/deployer like Zappa but a whole framework, so you have to refactor your code away from whatever framework you are using. Fortunately for me, the syntax is almost the same as Flask’s.

Chalice relies on your requirements.txt to decide which dependencies to package with your lambda functions. Good if you have a messy environment.

Debugging the deployment is brutal.

 

Chalice doesn’t automatically package your own modules

According to the documentation, you have to put all your modules in a magical “chalicelib” directory. Even after doing that, I still had import problems (importing a local module that was itself importing another local module). I solved it by spelling out the relative location with “from . import mymodule”.
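The layout that finally worked for me looked roughly like this (module names are placeholders):

```
my-api/
    app.py              # the Chalice app; does `from chalicelib import mymodule`
    requirements.txt
    chalicelib/
        __init__.py
        mymodule.py     # does `from . import helper`
        helper.py
```

Everything under chalicelib/ gets copied into the deployment package; anything outside it (other than app.py) silently doesn’t.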

 

AWS Lambda packages are restricted to 50MB

Messy environment? Too many requirements? You are out of luck. AWS Lambda deployment packages are limited to 50MB zipped (and 250MB unzipped).

I actually had a huge 28MB library (zipped!) that I had to strategically trim down (hi, technical debt!) in order to fit it into my package.
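Before uploading, you can sanity-check your package against the limit with a few lines of stdlib Python (the helper below is mine, not part of any AWS tooling):

```python
import os

# AWS Lambda rejects direct-upload deployment packages over 50 MB (zipped).
LAMBDA_ZIP_LIMIT = 50 * 1024 * 1024

def check_package(path):
    """Return True if the zipped package at `path` fits Lambda's limit."""
    size = os.path.getsize(path)
    print(f"{path}: {size / 1024 / 1024:.1f} MB "
          f"({'OK' if size <= LAMBDA_ZIP_LIMIT else 'too big!'})")
    return size <= LAMBDA_ZIP_LIMIT
```

Running this on the package Zappa or Chalice produces tells you whether to start trimming before you wait on a failed upload.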

Probably this is a sign that I shouldn’t be uploading a whole Python class, but its methods independently. I know, I know: it’s lambda *functions*, not lambda *classes* with everything and the kitchen sink.

 

You can upload a class with a bunch of methods and a REST API and enjoy the benefits of serverless

Not the best pattern, not the most efficient solution, but hey, for small stuff it works. And you get a million requests per month for free.

 

You can decrease the response time if you increase the memory allocation of your function

I was getting a 500ms response time: good, but an order of magnitude slower than what I was getting on my laptop. Then I read this text below the “Memory” slider in the Lambda console:

“Your function is allocated CPU proportional to the memory configured.”

I moved the slider to 1024MB and the response time went down to 125ms!
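If you’d rather not click through the console, the same configuration change can be made from the AWS CLI (the function name below is a placeholder):

```shell
# Raise the function's memory allocation to 1024 MB.
# CPU is allocated proportionally, so this also buys compute speed.
aws lambda update-function-configuration \
    --function-name my-flask-api-dev \
    --memory-size 1024
```

Note that you pay per GB-second, so doubling the memory roughly doubles the per-millisecond price; in my case the faster execution more than compensated.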

 

Lambda functions are a good solution for APIs

I’ve been using Docker since 2016 for packaging small python APIs, but deploying them and managing the Dockerfiles is kind of a pain. If you manage to refactor your application down to single-purpose methods, AWS Lambda offers an impressively easy way to deploy pay-per-use load-balanced functions in a secure server whose hardware and OS stack you don’t have to manage. This schema works best for functions that will be run sporadically and don’t need an always-on server.
Come to think of it, few functions run *all* the time, and if the inputs are the same, you can leverage the CloudFront cache. This cost calculator might help you estimate the total cost once you deploy.
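As a rough back-of-the-envelope version of that calculator: the prices below are the published us-east-1 rates as I write this, so treat them as assumptions and check the current ones.

```python
# Rough monthly AWS Lambda cost estimate. Prices and free-tier numbers
# are assumptions based on the published us-east-1 rates at time of writing.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per request
PRICE_PER_GB_SECOND = 0.0000166667     # USD per GB-second
FREE_REQUESTS = 1_000_000              # free tier, per month
FREE_GB_SECONDS = 400_000              # free tier, per month

def monthly_cost(requests, duration_ms, memory_mb):
    # Duration is billed in 100 ms increments, rounded up.
    billed_seconds = (-(-duration_ms // 100) * 100) / 1000
    gb_seconds = requests * billed_seconds * (memory_mb / 1024)
    cost = (max(0, requests - FREE_REQUESTS) * PRICE_PER_REQUEST
            + max(0, gb_seconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND)
    return round(cost, 2)

# Two million 125 ms requests per month at 1024 MB:
print(monthly_cost(2_000_000, 125, 1024))  # 0.2
```

Twenty cents a month for two million requests is the kind of number that makes the refactoring worth it.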

 

Links that saved me (besides the docs)

The Right Way™ to do Serverless in Python

Building Serverless Python Apps Using AWS Chalice

The fear and frustration of migrating a simple web app to serverless

 
