Software systems care about time, money, and space. You have to balance all three.
Fast, small, cheap
With metered pricing, serverless lets you pay directly for execution time, memory size, and storage space. No overhead. If you don't use it, you don't pay for it.
Storage is cheap – $0.023/GB/month on S3 – and memory is pretty cheap at $0.000016/GB-second. Memory gets more expensive per unit when you need extra; storage gets cheaper per unit the more you use.
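To get a feel for those numbers, here's a back-of-envelope cost calculator. The rate is the one quoted above; real AWS pricing varies by region and adds a per-request fee, so treat this as a sketch:

```javascript
// Back-of-envelope Lambda compute cost at the rate quoted above
// ($0.000016 per GB-second). Real pricing varies by region and
// includes a per-request fee — this is an estimate, not a bill.
const PRICE_PER_GB_SECOND = 0.000016

function lambdaComputeCost(memoryGB, durationSeconds, invocations) {
  return memoryGB * durationSeconds * invocations * PRICE_PER_GB_SECOND
}

// a 1GB function running 500ms, a million times a month:
console.log(lambdaComputeCost(1, 0.5, 1_000_000)) // ≈ $8/month
```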
Bandwidth is where size gets you. AWS charges for taking data out of the system. Other providers are similar.
You pay to send data to users or between availability zones. Details depend on which services are involved. For example: S3 to CloudFront (the CDN) is free, then CloudFront charges $0.085/GB.
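Same exercise for egress, at the CloudFront rate quoted above. Actual pricing is tiered and varies by region, so this is a rough estimate:

```javascript
// Egress cost at the CloudFront rate quoted above ($0.085/GB).
// Real pricing is tiered — treat this as a rough estimate.
const CLOUDFRONT_PER_GB = 0.085

function egressCost(gigabytes) {
  return gigabytes * CLOUDFRONT_PER_GB
}

console.log(egressCost(100)) // ≈ $8.50 for 100GB/month
```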
Latency measures delay. How long does it take to start working?
You can write the fastest code in the world, but if it takes 2 seconds to get started you'll have unhappy users.
The biggest factors are:
network time
internal routing
lambda wake up time
Network time measures how long it takes for a user's request to reach your server. Depends on distance and connection quality.
Routing is internal to your serverless provider. How long does it take to accept a request and send it to your code? Security rules can make this slower or faster.
Lambda wake up time measures how long it takes to spin up your tiny server. Depends on bundle size, runtime environment, and how your code warms up.
Fancy algorithms are slow when N is small, and N is usually small. Fancy algorithms have big constants. Until you know that N is frequently going to be big, don't get fancy.
~ Rob Pike
That leaves input/output.
Waiting for a database, talking to 3rd party APIs, loading a web page: those will destroy your performance.
The average duration for my screenshot lambda is 10 seconds. Mostly waiting for Chrome to start and webpages to load.
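When waiting dominates, the cheapest win is to overlap the waits. A sketch with timers standing in for real I/O – the names and delays are illustrative:

```javascript
// Simulate two independent I/O calls (say, a database read and a 3rd
// party API) with timers. The names and delays are illustrative.
const wait = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms))

async function sequential() {
  const user = await wait(100, "user") // pretend: database read
  const posts = await wait(100, "posts") // pretend: 3rd party API
  return [user, posts] // total wait ≈ 200ms, the sum of both calls
}

async function parallel() {
  // both waits overlap, so total wait ≈ 100ms, the slower of the two
  return Promise.all([wait(100, "user"), wait(100, "posts")])
}
```

Same result, half the waiting. When your calls don't depend on each other, fire them together.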
Vertical scaling is the art of getting 1 computational resource to do more.
This type of scaling can get expensive. You need more resources – faster CPU, more memory, better hardware, a GPU – and lots of engineering effort to optimize your code.
For many workloads, this approach is best.
It's easier to scale a database by adding CPU and memory than by rebuilding your application to use more databases. AI researchers prefer a computer with terabytes of memory over tweaking algorithms to need less.
And sometimes it's the only way. Like processing a video.
Horizontal scaling is the art of splitting work between computational resources.
This type of scaling can be cheap. Easier to provision, quicker to get going, less effort to optimize.
But you have to find a balance. 6 cheap computers can cost more than 3 expensive computers.
That's where serverless shines.
Horizontal scaling is perfect for isolated operations with little inter-dependency. Like API requests, processing a video library, or serving static files.
Use the map-reduce pattern to split larger tasks like we did in the Lambda Pipelines chapter. You pay with system complexity.
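The pattern itself fits in a few lines. A sketch where `processChunk` is a hypothetical stand-in for invoking a worker Lambda:

```javascript
// Minimal map-reduce sketch: split work into chunks, process each in
// parallel (in real life, each chunk goes to its own Lambda), then
// combine the partial results. processChunk is a hypothetical worker.
function chunk(items, size) {
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

async function processChunk(numbers) {
  // stand-in for a worker Lambda invocation
  return numbers.reduce((sum, n) => sum + n, 0)
}

async function mapReduce(items) {
  const partials = await Promise.all(chunk(items, 3).map(processChunk)) // map
  return partials.reduce((sum, n) => sum + n, 0) // reduce
}

mapReduce([1, 2, 3, 4, 5, 6, 7]).then(console.log) // → 28
```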
Traditional servers split performance into warm and cold. The server starts cold and warms up its caches, algorithm setup, and execution environment with the first few requests.
Most requests reach a warm server, which is orders of magnitude faster.
With serverless, every request could be cold.
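One standard way to soften this: anything you set up in module scope survives between invocations of the same container, so only the cold start pays the setup cost. A sketch, with `connectToDatabase` as a hypothetical stand-in for any expensive setup:

```javascript
// Anything defined outside the handler runs once per container and is
// reused by every warm invocation. connectToDatabase is a hypothetical
// stand-in for expensive setup (DB connection, loading config, etc).
let connections = 0

async function connectToDatabase() {
  connections += 1 // counts how often the expensive part actually runs
  return { query: async (sql) => `result of ${sql}` }
}

let dbPromise = null

function getDb() {
  // lazy init: kick off the expensive setup once, share it afterwards
  if (!dbPromise) dbPromise = connectToDatabase()
  return dbPromise
}

// in a real Lambda you'd export this as exports.handler
const handler = async (event) => {
  const db = await getDb() // cold start: connects; warm starts: reuse
  return db.query("SELECT ...")
}
```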
There is no one solution I can give you. Optimizing cold boot performance takes work and an understanding of your software.
A few areas to look at:
Use a language with a fast and nimble runtime. JavaScript is surprisingly effective, Go can work great. JVM-based languages tend to struggle.
Ruthlessly reduce bundle size. The less code that needs to load, the better. Compile and minimize your source, remove unneeded dependencies. Use the exclude config in serverless.yml.
Avoid fancy algorithms with large constants. Iterating your data 5x to set up a fast algorithm may not be worth it. You're processing small payloads.
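The exclude config mentioned above looks roughly like this (a sketch; the exact patterns depend on your project, and newer versions of the Serverless Framework use `package.patterns` instead):

```yaml
# serverless.yml — patterns are examples, adjust to your project
package:
  exclude:
    - node_modules/aws-sdk/** # already available in the Lambda runtime
    - tests/**
    - "*.md"
```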
Once you've optimized individual components, it's time to look at your system as a whole. Find the bottlenecks.
The fastest algorithm in the world is worthless when throttled by a slow database.
A bottleneck happens when there's a performance mismatch between parts of your system: fast code feeding into slow code, or slow code that your whole system relies on.
Typical offenders include:
large computation (like video and image processing)
poorly optimized databases
3rd party APIs
Bottlenecks impact your whole system.
You can find the bottleneck by looking at your queues. Is one of them filling up with data? Likely feeding a bottleneck.
You can also look at Lambda execution times. A slow Lambda could be full of bad code, but more likely it's talking to a bottleneck.
Can you move it off the critical path? Cache or memoize any API and database responses?
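Memoizing in module scope is often the cheapest win: the cache survives as long as the container stays warm. A sketch, where `fetchFn` stands in for any slow API or database call:

```javascript
// Memoize responses in module scope — the cache lives as long as the
// container stays warm. fetchFn stands in for any slow API/DB call.
const cache = new Map()

async function cached(key, fetchFn, ttlMs = 60_000) {
  const hit = cache.get(key)
  if (hit && Date.now() - hit.time < ttlMs) return hit.value // fresh hit
  const value = await fetchFn()
  cache.set(key, { value, time: Date.now() })
  return value
}

// usage: cached("user:42", () => fetchUserFromApi(42))
// (fetchUserFromApi is a hypothetical slow call)
```

Within the TTL, repeat requests skip the slow call entirely. Pick a TTL your data can tolerate going stale.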
The exact answer depends on your code and your system.
I talked to an AWS billing expert and that was the takeaway. Then I killed the whole chapter on cost.
Your time is worth more than your bill.
However, he suggested you try AWS Lambda Power Tuning: a tool that runs your Lambda in different memory configurations and shows you the balance between power, speed, and cost.
Example graph from AWS Lambda Power Tuning
Execution time goes from 35s with 128MB to less than 3s with 1.5GB, while being 14% cheaper to run.
A great example of how vertical scaling beats horizontal. But you keep every benefit of horizontal, because serverless. 🚀
Next chapter we look at running Serverless Chrome, a typical case where beefy Lambdas help a lot.
Hello! 👋
Are you a frontend engineer diving into backend? Do you have just that one bit of code that can't run in the browser? Something that deals with secrets and APIs?
That's what cloud functions are for, my friend. You take a JavaScript function, run it on serverless, get a URL, and voila.
But that's easy mode. Any tutorial can teach you that.
What happens when you wanna build a real backend? When you want to understand what's going on? Have opinions on REST vs GraphQL, NoSQL vs. SQL, databases, queues, talk about performance, cost, data processing, deployment strategies, developer experience?
Serverless Handbook shows you how: 360 pages for people like you getting into backend programming.
Available as digital + paperback, Serverless Handbook has been more than a year in development. It's built on lessons learned from 14 years of building production-grade websites and webapps.
With Serverless Handbook, Swiz teaches the truths of distributed systems – things will fail – but he also gives you insight on how to architect projects using reliability and resilience perspectives so you can monitor and recover.
~ Thai Wood, author of Resilience Roundup
If you want to understand backends, grok serverless, or just get a feel for modern backend development, this is the book for you.
Serverless Handbook is full of color illustrations, code you can try, and insights you can learn. But it's not a cookbook and it's not a tutorial.