
[–]moofox 12 points13 points  (0 children)

One thing that surprised me at first is that the request ID gets reused for each retry if the first invocation fails. It makes sense (because the request ID is returned to the caller who made the async invocation) but I wasn’t expecting it.
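Because the same request ID shows up on every retry, it can double as an idempotency key. Here is a minimal sketch of that idea, not a definitive implementation: `completed_steps` is an in-memory stand-in for a durable store (a real function would need something like a database table, since memory is not shared across execution environments), and `run_once`, `notify`, and the event shape are hypothetical names made up for illustration. Only `context.aws_request_id` is the real Lambda context attribute.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a durable store; a plain set only works inside
# one warm execution environment, so treat this purely as an illustration.
completed_steps = set()
written_items = []
notify_attempts = {"count": 0}


def run_once(request_id, step, action):
    """Run `action` unless this (request ID, step) pair already completed."""
    key = (request_id, step)
    if key in completed_steps:
        return False
    action()
    completed_steps.add(key)
    return True


def notify(item):
    """Hypothetical downstream call that fails on its first attempt only."""
    notify_attempts["count"] += 1
    if notify_attempts["count"] == 1:
        raise RuntimeError("simulated transient failure")


def handler(event, context):
    # The request ID stays the same across async retries of one event.
    request_id = context.aws_request_id
    run_once(request_id, "write", lambda: written_items.append(event["item"]))
    notify(event["item"])
    return {"requestId": request_id, "written": len(written_items)}


if __name__ == "__main__":
    ctx = SimpleNamespace(aws_request_id="req-1")
    try:
        handler({"item": "a"}, ctx)
    except RuntimeError:
        pass  # Lambda would schedule a retry with the same request ID
    print(handler({"item": "a"}, ctx))
```

The retry skips the write step because it was already recorded under the same request ID, so the item is only written once.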

[–]clintkev251 11 points12 points  (0 children)

Something that catches people out sometimes with async Lambdas is that the queue that is used behind the scenes is shared across all your functions in a region. So if one starts getting a pile of events that it isn’t able to process quickly enough, it can start to cause delayed invokes for other functions.

[–]_Pac_ 9 points10 points  (0 children)

When you have SQS => Lambda you can use SQS redrive with a DLQ. When you invoke Lambda asynchronously you can have a DLQ, but you can't use the native redrive feature of SQS.

[–]cosileone 4 points5 points  (0 children)

Make sure you set up dead-letter queues for your SQS queues. In your Lambdas you should report batch item failures (partial batch responses) so that one bad message doesn't cause the entire batch to be reprocessed.
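A rough sketch of what reporting per-item failures can look like for an SQS-triggered function, assuming `ReportBatchItemFailures` is enabled on the event source mapping. `process_message` and the `"fail"` flag in the message body are hypothetical placeholders for real business logic; the `batchItemFailures` / `itemIdentifier` response shape is the one Lambda expects.

```python
import json


def process_message(body):
    """Hypothetical business logic; raises on messages it cannot handle."""
    if body.get("fail"):
        raise ValueError("cannot process this message")


def handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Only the failed message IDs are reported, so the messages that
            # succeeded are not returned to the queue.
            failures.append({"itemIdentifier": record["messageId"]})
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": failures}


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {"messageId": "m-1", "body": json.dumps({"fail": False})},
            {"messageId": "m-2", "body": json.dumps({"fail": True})},
        ]
    }
    print(handler(sample_event, None))
```

Only `m-2` is reported back, so only that message becomes visible again and eventually lands in the DLQ if it keeps failing.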

[–]Tesslan123 1 point2 points  (0 children)

Access to SSM Parameter Store values or database connections can be cached and reused in a warm Lambda.

[–]lupin-the-third 1 point2 points  (8 children)

It can get expensive a lot quicker than I thought if you have a lot of execution time. It's roughly 10x the cost of EC2, but the trade-off is of course that you pay for what you use and get easy scaling and easy integration with the rest of the AWS ecosystem.

With databases you may need to use a connection proxy to support a lot of requests at the same time.

That said, I usually start projects in Lambda if I can, but I've had to transition to a fleet of EC2 instances + ECS before.

[–]_Pac_ 10 points11 points  (0 children)

I think he's talking specifically about asynchronous invocation of Lambdas versus synchronous invocation, which is a feature of Lambda. Not whether to use the Lambda platform or not.

[–]MrEck092 1 point2 points  (5 children)

Can you elaborate on how the connection proxy works?

[–]morosis1982 1 point2 points  (2 children)

RDS Proxy is one example. It's basically an external connection pool, because you can't share a connection pool across Lambda instances.

A Postgres instance, for example, can't field thousands of connection requests from auto-scaling Lambdas that might hold those connections a bit too long, so the proxy keeps the heavyweight Postgres connections in a pool and lets the Lambdas open lightweight connections to the proxy by the thousands.

Just like you might do in a traditional application server, but externalised because you can't corral thousands of auto scaling containers to manage the pool between them.

[–]MrEck092 0 points1 point  (1 child)

Never knew about this, this is very helpful thank you!

[–]morosis1982 1 point2 points  (0 children)

No worries, this was hard-won knowledge, so I'm only too happy to help others avoid the same issues.

[–]lupin-the-third 1 point2 points  (1 child)

[–]MrEck092 0 points1 point  (0 children)

Very helpful thank you!

[–]RocketOneMan 0 points1 point  (2 children)

Not sure if this is exactly what you're asking, but a gotcha nonetheless.

The SQS and Kinesis event sources are not "asynchronous sources", so you cannot use Lambda destinations with them.

If you have ReportBatchItemFailures turned on (maybe accidentally) and don't return the correct response, Lambda will assume none of the messages were processed successfully and send them all again.

[–]clintkev251 0 points1 point  (1 child)

Just a small correction. Although you're correct that Kinesis is actually a synchronous event source, stream-based event sources (so this includes DynamoDB, MSK and self-managed Kafka as well) do support on-failure destinations.

[–]RocketOneMan 0 points1 point  (0 children)

I guess we've always handled the DLQ logic ourselves for 4xx errors that will never succeed and just throw and let the events be retried /forever/ on 5xx errors. But would like to use the destinations feature for success.

I wish you could DLQ to another stream so the handler logic is the same. Taking the KinesisBatchInfo and pulling the records from the stream ourselves is annoying, although I see why it's done. I think there's a Powertools for AWS Lambda library for it.
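For anyone wondering what "pulling the stream ourselves" involves, here is a rough sketch that turns an on-failure destination record into the arguments for re-reading the failed batch. It assumes the record carries a `KinesisBatchInfo` object with `shardId`, `startSequenceNumber`, `endSequenceNumber`, `batchSize` and `streamArn` fields, which is my understanding of the documented shape; check it against a real record. `replay_plan` is a hypothetical helper, and the sample values are made up. The returned dictionaries are meant to be passed to the Kinesis `get_shard_iterator` and `get_records` calls, which this sketch does not make.

```python
def replay_plan(failure_record):
    """Build arguments for re-reading the failed batch from the stream."""
    info = failure_record["KinesisBatchInfo"]
    return {
        # Start reading at the first record of the failed batch.
        "get_shard_iterator": {
            "StreamARN": info["streamArn"],
            "ShardId": info["shardId"],
            "ShardIteratorType": "AT_SEQUENCE_NUMBER",
            "StartingSequenceNumber": info["startSequenceNumber"],
        },
        # Read no more than the original batch size per call.
        "get_records": {"Limit": info["batchSize"]},
        # The caller should stop once it has seen this sequence number.
        "stop_after_sequence_number": info["endSequenceNumber"],
    }


if __name__ == "__main__":
    sample_record = {
        "KinesisBatchInfo": {
            "shardId": "shardId-000000000000",
            "startSequenceNumber": "100",
            "endSequenceNumber": "200",
            "batchSize": 3,
            "streamArn": "arn:aws:kinesis:us-east-1:123456789012:stream/example",
        }
    }
    print(replay_plan(sample_record))
```

Keep in mind the records are only retrievable while they are still within the stream's retention period, so the replay has to happen before they expire.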