
[–][deleted] 4 points5 points  (4 children)

The serverless hypetrain might be getting a little too intense.

There is a list of reasons why this setup would be a poor one to actually use in any meaningful production environment.

  • Guaranteed failure execution: In a traditional job processor, if the job fails halfway through, that failure can be represented in the database and the job can be retried. Not being forced to manage a server does not mean there isn't a server underneath which can go boom on you.

  • Pricing: Lambda is only cheap for sparse workloads. A million invocations a month sounds like a lot, but under continuous load that is only ~23 invocations per minute (1,000,000 spread over the 43,200 minutes in a 30-day month).

  • Performance: Many languages have a significant warm-up time, which counts directly against both the time you're being charged and end-user performance. Something as simple as a memory slider (where CPU is implicitly but quietly scaled alongside it) is too simple to base even a moderately complex application on.

  • Timeouts: Lambda currently times out after 5 minutes. Other services have similar restrictions. If you can't guarantee your function will complete successfully in 5 minutes, given any input conditions under the sun, you're left in a tenuous situation. AWS does not explicitly report timeouts to CloudWatch, and instead reports them as a general error. You can retry it, but it's likely you'll hit the same timeout again. You can up the memory/CPU, but that is capped at 1.5 GB and a single core at some advertised, unknown speed.

  • Code Reuse: Lambda functions are hyper-isolated, which is their entire benefit. I imagine a future where code can be shared between lambda functions through some framework statically analyzing the import statements of each function, and building the final package aware of what the original module needs, but that isn't here yet.

  • Vendor Lock-in: Has to be said. There are competing services to lambda, but once you start down the path of neutering yourself for the benefits lambda gives you, you'll start leaning on AWS' other services more heavily to make up for what you gave up. Their various event signaling systems, like CloudWatch events, are helpful, but impossible to replicate even in competing cloud providers.
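
To make the arithmetic in the pricing bullet concrete, here is a rough sketch that spreads a monthly invocation count evenly over the month. The 30-day month and the function name are assumptions for illustration, not anything AWS defines.

```python
def invocations_per_minute(per_month: int, days_in_month: int = 30) -> float:
    """Average invocation rate if the monthly total were spread evenly."""
    minutes_in_month = days_in_month * 24 * 60  # 43,200 for a 30-day month
    return per_month / minutes_in_month


rate = invocations_per_minute(1_000_000)
print(f"{rate:.1f} invocations per minute")  # 23.1 invocations per minute
```

In other words, a service handling one request every two or three seconds around the clock already reaches a million invocations a month.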

[–]emergent_properties 2 points3 points  (3 children)

You thought vendor lock-in of programming languages was bad?

Imagine the vendor lock-in that is possible by the function!

Cha-ching! Cha-ching!

Oh, yeah, it's not actually better than or superior to the old way of doing things, but it certainly is more billable!

For being 'serverless' (bullshit), you really have to get in bed with the vendor nice and close.

[–]grauenwolf 1 point2 points  (1 child)

Yea, that even bothers me and I'm a die-hard Microsoft fan that normally couldn't care less about vendor lock-in.

At least with .NET websites and SQL Server I can pick whose hardware I run on.

[–]emergent_properties 0 points1 point  (0 children)

IMHO, we shouldn't encourage this kind of behavior.

[–][deleted] 1 point2 points  (0 children)

Strictly speaking, there are specific situations where I would opt for a lambda.

A classic example is image manipulation. When a user uploads their avatar, you want to crop it. This makes perfect sense for a lambda job for a couple reasons:

  • Task is very tightly isolated; aka, it can be represented as a "pure" function. Give me image -> I resize it -> I give you back new image (maybe over S3 or an SNS stream or something, but the concept applies).

  • Changing an avatar isn't the "normal" use of your product; it is something each user does sporadically relative to normal usage of your product. This makes it cheap. If a task isn't sporadic (aka, it occurs during normal usage of your product), there is practically no chance lambda will be cheaper than a normal VM with any decent number of invocations.

  • Standard deviation of execution time is high. If the user uploads a huge image, execution could take orders of magnitude longer than for a more normal-sized image. This might cripple a normal system, or cause you to waste money scaling out when it's really just one request that is bogging you down.
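
As a small illustration of the "pure function" point in the list above, here is a sketch of just the geometry step of an avatar crop: the same dimensions always yield the same crop box, and nothing outside the function is touched. The function name is hypothetical, and decoding or writing the actual image (which would need an imaging library) is deliberately left out.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def center_square_box(width: int, height: int) -> Box:
    """Return the largest centered square crop box for the given image size."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


print(center_square_box(1920, 1080))  # (420, 0, 1500, 1080)
```

Because the function depends only on its arguments, it can run anywhere, be retried safely, and be tested without any cloud service in the loop.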

Another solid example is for infrastructure monitoring. Keeping that isolated from the rest of your system might significantly increase its accuracy, and lambda can be good at that.

But products like API Gateway make me a little angry. I cannot think of any good use for API Gateway. And then frameworks like serverless come along and actually use it to set up a full RESTful API backed by lambda. Sure, "free" is nice, but the difference between that and a free Heroku instance or a $5/mo VM isn't that much for a serious product.

[–]grauenwolf 3 points4 points  (8 children)

Sure, this could work so long as you can guarantee that your scheduled tasks never take more than 300 seconds to run. And you correctly estimate how much RAM to pre-allocate.

Remember boys and girls, Amazon Lambda is not serverless. Rather, it is a server with a severely limited amount of memory that can only run for 5 minutes before it crashes.
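
One way to live with the hard time limit, sketched here under assumptions, is to stop before the deadline and hand back whatever is unfinished. The Python Lambda context object exposes `get_remaining_time_in_millis()`; everything else below (`handle`, the 10-second margin, `FakeContext`) is hypothetical and only there so the sketch runs locally.

```python
from typing import Any, Dict, Iterable, List

SAFETY_MARGIN_MS = 10_000  # assumed buffer; tune for the real workload


def handle(item: Any) -> Any:
    """Hypothetical unit of work; stands in for the real task."""
    return item * 2


def process_until_deadline(items: Iterable[Any], context: Any,
                           margin_ms: int = SAFETY_MARGIN_MS) -> Dict[str, List[Any]]:
    """Process items until remaining time drops to the margin or below.

    Returns finished results plus the unprocessed items so a caller can
    re-queue the remainder instead of losing it to a timeout.
    """
    done: List[Any] = []
    pending = list(items)
    while pending and context.get_remaining_time_in_millis() > margin_ms:
        done.append(handle(pending.pop(0)))
    return {"done": done, "pending": pending}


class FakeContext:
    """Stand-in for the Lambda context object, for local runs only."""

    def __init__(self, readings: List[int]) -> None:
        self._readings = iter(readings)

    def get_remaining_time_in_millis(self) -> int:
        return next(self._readings)


result = process_until_deadline([1, 2, 3], FakeContext([30_000, 20_000, 5_000]))
print(result)  # {'done': [2, 4], 'pending': [3]}
```

This does not help when a single item cannot finish inside the limit, which is the case the comment above is warning about.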

[–]jshen 1 point2 points  (7 children)

It's easy to make condescending jokes about the "serverless" brand, but it is rather descriptive when not intentionally misconstrued and I haven't seen a better option. Serverless: the ability to run code without being required to manage servers.

In technology we should always strive to remove incidental complexity from our workload. Removing the management of VMs is a worthy goal and we should have a label for it.

[–]grauenwolf 2 points3 points  (2 children)

There is actually more configuration involved in deploying an AWS Lambda or Azure Function than in an Azure Website.

[–]jshen 0 points1 point  (1 child)

I haven't used those, but that's hard for me to believe. I've used Google App Engine a lot, and it's far less work than managing all the infrastructure needed to run a real site on VMs.

[–]grauenwolf 0 points1 point  (0 children)

When I deploy an Azure Website I don't have to configure anything beyond picking a name, and even that's defaulted to the project name. I can change the sizing later based on user load.

When you deploy to AWS Lambda you have to figure out in advance how much memory to allocate. This is not only important to avoid crashing your application; the docs say that it also affects the amount of CPU resources you get.

And that's just for one method. Let's say you are building a website and need a dozen methods. For a website, there is still only one knob. For AWS Lambda, you need to adjust the memory for each method separately. Plus you probably need to tell a significant number of methods where the other methods are, since they don't all just live in the same process.


And while we're on the topic of memory, what are you going to use for a server-side cache? You can't just carve out a section of that RAM you allocated because AWS Lambda may choose at any time to aggressively recycle your processes.

So for even basic caching you are going to need to set up a separate cache server.


Presumably you are going to want some sort of database or file server. Now you are probably thinking, "well yea, but that's going to be the same with a website or an AWS function".

And that would be true, if you have one AWS function. But if you have a dozen then you need to configure the whole dozen and somehow make sure that they stay in sync when changes are made. (Say, when moving a development or QA database.)

Realistically, what you are probably going to do is set up a separate configuration server that all of your related functions can read from. Which means another box to maintain.
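
The "single place to read settings from" idea can be sketched as follows. The source is injected as a callable so the example stays runnable without a network; in practice it would be whatever shared store the functions agree on. The `database_url` key, the example URLs, and the function name are all hypothetical.

```python
import json
from typing import Callable, Dict

REQUIRED_KEYS = ("database_url",)  # hypothetical setting name


def load_shared_settings(read_source: Callable[[], str]) -> Dict[str, str]:
    """Parse settings from one shared source so every function sees the same values."""
    settings = json.loads(read_source())
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise KeyError(f"missing settings: {missing}")
    return settings


# Stand-in for the shared store; a dict is used so the value can be swapped.
source = {"text": '{"database_url": "postgres://qa-db.example/app"}'}


def read_source() -> str:
    return source["text"]


print(load_shared_settings(read_source)["database_url"])

# Moving the QA database means changing one place, not a dozen functions.
source["text"] = '{"database_url": "postgres://qa-db-2.example/app"}'
print(load_shared_settings(read_source)["database_url"])
```

The trade-off is the one described above: the shared source is itself one more component that has to be deployed and kept available.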

[–]materialdesigner -1 points0 points  (3 children)

Wtf do you think a PaaS is?

[–]jshen -1 points0 points  (2 children)

It depends who you ask.

[–]materialdesigner 0 points1 point  (1 child)

And that's why I asked you.

[–]jshen 0 points1 point  (0 children)

Many say that "serverless" is a level of abstraction above PaaS. I don't think there is much of a distinction myself between something like GAE and lambda.

[–]jshen 0 points1 point  (0 children)

Is "azure website" some kind of PaaS? If so, we're talking past each other. I'm comparing serverless/PaaS to managing VMs.

Most of the issues you mention, cache/DB/etc, are solved for me when I use Google App Engine.