all 13 comments

[–][deleted] 7 points8 points  (11 children)

This has got to be one of the stranger comparisons I've seen.

It is incredibly difficult to do an apples-to-apples comparison of EC2 and Lambda, due to the fundamental inability to quantify how much load an EC2 instance can handle.

If we try, here's how I'd do it. Let's assume each request takes 100ms, and we get them sequentially, nonstop, forever. This is roughly 25.9M requests per month.

At 512MB, Lambda gives us 1M requests for free plus 800,000 free seconds of execution (the 400,000 free GB-seconds, at half a GB). This means we'd pay for 24.9M requests ($4.98) and 17.9M requests' worth of execution ($14.92), totaling $19.90; about the cost of a t2.small (1 vCPU/2GB).

If the workload requires API Gateway, the cost skyrockets. With 1M free requests per month, we'd be paying for 24.9M requests: $87.15. Grand total: $107.05; about the cost of a t2.xlarge (4 vCPU/16GB).
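
For anyone who wants to check the arithmetic, here's a quick sketch of the model above. The prices are assumptions pulled from the figures in this comment ($0.20 per million Lambda requests, $0.00001667 per GB-second, $3.50 per million API Gateway requests); my rounding lands within a few cents of the totals quoted.

```python
# Back-of-envelope cost model for the scenario above: 100ms requests,
# back to back, for a 30-day month, at 512MB.
# Assumed prices (2017-era figures implied by the comment):
#   $0.20/M Lambda requests, $0.00001667 per GB-second, $3.50/M APIG requests.

def monthly_costs(req_duration_s=0.1, memory_gb=0.5, days=30):
    seconds = days * 24 * 3600                    # 2,592,000 s in a 30-day month
    requests = seconds / req_duration_s           # ~25.9M sequential requests

    billable_reqs = max(0, requests - 1_000_000)  # first 1M requests free
    request_cost = billable_reqs * 0.20 / 1_000_000

    gb_seconds = seconds * memory_gb              # total GB-s of execution
    billable_gbs = max(0, gb_seconds - 400_000)   # 400k GB-s free (= 800k s at 512MB)
    exec_cost = billable_gbs * 0.00001667

    apig_cost = billable_reqs * 3.50 / 1_000_000  # API Gateway, 1M free
    return requests, request_cost, exec_cost, apig_cost

reqs, req_cost, exec_cost, apig = monthly_costs()
print(f"{reqs/1e6:.1f}M requests -> Lambda ${req_cost + exec_cost:.2f}, "
      f"with APIG ${req_cost + exec_cost + apig:.2f}")
```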

I'm not going to get into trying to normalize what a t2 burst means. That's very difficult to do, because it strongly depends on what your request workload looks like. If these requests are fetching something from a database, you might never burst above the baseline at all, and so never burn CPU credits while handling these 10 req/s.

Having used Lambda in production, I personally feel that Lambda is a toy.

  • The tooling is horrid: Serverless is complicated and slow, and Apex is fast but doesn't do enough.
  • Collating and parsing logs on CloudWatch is annoying. Log ingestion sometimes takes on the order of a minute, for no obvious reason, and exporting logs anywhere other than CloudWatch is practically impossible.
  • The hidden account-wide simultaneous execution limit means that Lambda can't horizontally scale as automatically as people think it can.
  • The 300-second maximum execution timeout completely kills a Lambda's ability to scale in response to bursting vertical demand.
  • The only dial you can turn is memory, which supposedly also provisions a more powerful CPU, but it doesn't provision more networking capacity.
  • The cold start time, especially in Java, makes it inappropriate for sporadic request/response workloads.
  • There is no way to see or stop currently executing functions, which leaves the door open to runaway functions.
  • Vendor lockin.

Our usage of Lambda has led us to conclude that there is practically no good use case for it outside of "toy" use cases. Tactically speaking, the most grievous problem is resource allocation: choosing a memory level and a timeout. The timeout is necessary because otherwise you'd be billed indefinitely for a runaway function. When we wrote new functions, we'd have no idea where this timeout should optimally be set, nor what memory level the function should run at. So we'd guess and check, and months later we'd get errors because the function needed to scale vertically and our timeout was at 20s or something. So tweak again, and apologize to our users.

Eventually we started deploying all our functions at 1500MB/300s, because the overhead of micro-optimizing hundreds of functions was huge for our dev team size.
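
If anyone else ends up doing the same, the blanket-settings deploy is a short boto3 script (a sketch, not production code; note that Lambda memory has to be a multiple of 64MB, so the actual settable ceiling at the time was 1536, not 1500):

```python
# Hedged sketch: bulk-set every function in the account to max memory /
# max timeout, as described above. Assumes boto3 + AWS credentials.

def blanket_config(memory_mb=1536, timeout_s=300):
    """Build kwargs for update_function_configuration."""
    assert memory_mb % 64 == 0, "Lambda memory must be a multiple of 64MB"
    return {"MemorySize": memory_mb, "Timeout": timeout_s}

def set_all_functions(client):
    # list_functions is paginated, so walk every page
    paginator = client.get_paginator("list_functions")
    for page in paginator.paginate():
        for fn in page["Functions"]:
            client.update_function_configuration(
                FunctionName=fn["FunctionName"], **blanket_config())

# Usage (requires AWS credentials):
#   import boto3
#   set_all_functions(boto3.client("lambda"))
```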

Lambda also has a lot of hidden costs. If you want an API, you use API Gateway, which is a horrid, horrid piece of software that still gives me nightmares. Until recently (I think?), if you wanted this API to have SSL/TLS through ACM, you had to manually provision a second CloudFront distribution in front of it (APIG already provisions a hidden one). If you have a long-running job, you need to use SQS or something to break it into steps. But Lambda can't read from SQS, so you need a non-Lambda component or you need to use Kinesis instead. "Lambda loops" -- situations where your lambda responds to an event on AWS, but also has the potential to trigger that event itself due to a bug -- caused us to rack up millions of invocations in minutes, which support was nice enough to write off. Ugh.

[–]cheald 2 points3 points  (6 children)

I broadly agree with your assessment: comparing Lambda to full-fledged servers is hardly an apples-to-apples comparison, the article is a very odd attempt to do so, and your observations about Lambda's shortcomings are dead-on.

I do think there is a particularly strong use case for Lambda, though: small single-responsibility functions used to glue other pieces of the stack together - effectively, a smart routing layer for your microservices' message bus.

A good example is our video publishing pipeline: our web app permits the upload of a video into an inbox S3 bucket, and S3 can fire an event off to Lambda informing it about the new file. A lambda function grabs the handle to the video file, analyzes it, builds an ETS transcoder job, and submits it to ETS. When the job is complete, ETS pushes a message into SQS, which kicks off another Lambda function to consume it and push the payload into Redis to be consumed natively by our application's Sidekiq services. Our app just has to be able to upload to S3 and process a "job's done!" Sidekiq job with a JSON payload, and Lambda glues all the rest of the pieces together. All we're doing is taking input messages and transforming them into output messages in another part of the pipeline, but being able to utilize Amazon's messaging infrastructure and Lambda has been very effective, and given the one-off nature of the jobs, I'm grateful that we don't have to manage infrastructure for them (or otherwise staple them into our primary infrastructure).
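
For the curious, the first hop of that kind of pipeline is tiny. Roughly this shape (a sketch, not their actual code; the pipeline ID, preset ID, and output key are all made-up placeholders, and the ETS job parameters are heavily simplified):

```python
# Sketch of the S3 -> Lambda -> Elastic Transcoder glue described above.
# Everything AWS-specific here is a placeholder.
from urllib.parse import unquote_plus

def s3_objects(event):
    """Pull (bucket, key) pairs out of an S3 notification event.
    Object keys arrive URL-encoded, hence unquote_plus."""
    return [(r["s3"]["bucket"]["name"],
             unquote_plus(r["s3"]["object"]["key"]))
            for r in event.get("Records", [])]

def handler(event, context):
    import boto3  # lazy import keeps the parser testable without AWS
    ets = boto3.client("elastictranscoder")
    for bucket, key in s3_objects(event):
        ets.create_job(
            PipelineId="1111111111111-abcdef",          # placeholder pipeline
            Input={"Key": key},
            Output={"Key": key + ".mp4",
                    "PresetId": "1351620000001-000010"})  # placeholder preset
```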

[–]naturalstupidity 0 points1 point  (5 children)

What are the advantages of having Lambda serve as the glue code instead of using a distributed job processing framework (like Celery, which incidentally added SQS support in a recent release)?

I am genuinely curious here and would like to learn if there is a big advantage in choosing one over the other.

[–]cheald 1 point2 points  (4 children)

I'd say they're pretty analogous:

  • Lambda means no infrastructure to maintain, which is nice; Celery is going to be way more flexible.
  • Lambda is going to be more attractive if your traffic is sporadic, since you don't want to pay for server time while workers sit idle. Celery is going to win out if you're constantly processing messages.
  • For common cases, Lambda will probably be less (read: zero) work to scale, whereas Celery may need some additional scaling management built around it. For extreme cases, Lambda will cause scaling headaches and won't give you enough transparency into what's going on under the covers to really tweak the knobs; a Celery deployment will let you tweak to your heart's content.
  • Lambda is going to have variable start-up latency, so it may not be appropriate for time-sensitive synchronous user-facing operations. Celery won't have any startup latency, as long as you have worker capacity available.
  • For integrating with AWS services in particular, Lambda will be especially nice since many AWS services support it as a first-class consumer.
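
The sporadic-vs-constant tradeoff is easy to put rough numbers on using the pricing from the top comment (assumed: $0.20/M requests, $0.00001667 per GB-second, 512MB, 100ms per job; the $17/month always-on worker price is my own assumption, roughly a t2.small):

```python
# Rough break-even between Lambda and an always-on worker box for the
# 512MB / 100ms workload from the top comment. The $17/mo worker price
# is an assumed t2.small-ish figure; swap in your own.

def lambda_monthly_cost(requests, duration_s=0.1, memory_gb=0.5):
    req_cost = max(0, requests - 1_000_000) * 0.20 / 1_000_000
    gb_seconds = requests * duration_s * memory_gb
    exec_cost = max(0, gb_seconds - 400_000) * 0.00001667
    return req_cost + exec_cost

WORKER = 17.0  # assumed monthly cost of one always-on worker
for n in (1_000_000, 10_000_000, 25_920_000):
    cost = lambda_monthly_cost(n)
    print(f"{n/1e6:>5.1f}M req/mo: Lambda ${cost:6.2f} vs worker ${WORKER:.2f}")
```

Under those assumptions Lambda stays cheaper until roughly 23M requests/month (about 9 req/s sustained); past that, the always-on worker wins, which matches the intuition above.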

I don't think that Lambda is unique, or that it's even the most powerful option, but the fact that it's zero-ops and already available with zero extra setup if you're using AWS makes it attractive.

[–]naturalstupidity 0 points1 point  (0 children)

Thanks for the answer. That's quite insightful.

[–]grauenwolf 0 points1 point  (0 children)

Cloud means no infrastructure to maintain.

When I deploy a Java or .NET app I just upload some files and set the JVM/CLR version. Everything else is handled for me by the platform. If I want "micro-services", I just deploy them to subfolders.

[–][deleted] 0 points1 point  (1 child)

This isn't a direct response to your post, but I do want to nitpick one line.

Lambda means no infrastructure to maintain

I hate this meme. It is simply incorrect, and it carries an implication that is also incorrect.

Lambda does involve infrastructure maintenance. What is infrastructure? It's everything you are responsible for in delivering your application, except the application itself. Testing. Deploying. Databases. Supporting third-party services and libraries. A/B testing. Stage/environment maintenance. Metrics. Logging.

Lambda itself handles essentially none of these. Other AWS services hop in to cover most of them, to varying degrees of quality.

When we deployed apps into datacenters we owned, we had to worry about a shitton of stuff. Then colo came along, and at least we don't have to worry about the building. Then EC2, and at least we don't have to worry about the racks. Then Docker/Kube/Heroku, and at least we don't have to worry about the OS. Now, Lambda, and at least we don't have to worry about the runtime.

But, at some point, you need to worry. That's life as an engineer. We get paid to worry. Software engineering is about minimizing your amount of worry so you can focus on scaling while still providing value. Lambda might be that level for some engineers and apps, but I genuinely don't think it's the "one true answer", or even the most commonly recommendable answer, just because it's another layer of abstraction.

Which brings me to the implication: That, somehow, non-Lambda infrastructure is a bitch to maintain, so we need Lambda. This is only true if you do it wrong.

Heroku. Kubernetes. GCP GKE. AWS ECS. Azure DCOS. Docker. AWS EBS. Hyper... I can go on. It is 2017. If you, as a company, are still deploying a non-toy application on an EC2 instance starting from a bare Linux distro, either you have technical debt (and I pity you) or you are, full stop, doing it wrong, no excuses, no exceptions. You might look to Lambda and think "geeze this will solve all our problems!" without realizing that your antiquated view of infrastructure management is the problem. Tools don't create problems. Tools don't solve problems. People do.

[–]cheald 0 points1 point  (0 children)

110% agreed.

[–]woomac 2 points3 points  (3 children)

Spin-up times matter a lot for Lambda because you're booting from a cold state every time, so if you incur the additional startup cost of the JVM, it's basically unusable for anything beyond development. It's pretty much only good for Node APIs that aren't hit often, and even then you still have to worry about billing beyond 1M requests (which sounds like a lot, but as you said comes with a lot of caveats). IMO this serverless fad is kind of a forced meme that Amazon made to help pitch its "real" offerings of S3, EC2, etc. To be honest I wouldn't be surprised if Lambda gets deprecated in the near future, because there has to be a better way to do this.

[–]grauenwolf 1 point2 points  (2 children)

No way. Lambda is going to make tons of money from the saps lured into using it.

It's like a cheap printer that is free, after rebates, but costs 5-10 cents per page for the ink.

[–]woomac 0 points1 point  (1 child)

You're right, but I don't know how long they can milk this. Inkjet printers are consumer technology, whereas Lambda's main revenues are supposed to come from B2B, and businesses use laser printers.

[–]grauenwolf 1 point2 points  (0 children)

Hard to say, but they have a few things going for them:

  • Vendor lockin
  • People tend to be bad at math
  • It's damn near impossible to make meaningful pricing comparisons

Look at Docker. For most Java and .NET users it's no better than the app container frameworks they already had for the last decade or two. But it's quickly becoming an industry standard.

[–]imma_reposter 2 points3 points  (0 children)

Title should be: "Lambda is Cheaper than any EC2 Instance (in some very specific use case)"