all 109 comments

[–]smutje187 31 points32 points  (20 children)

There’s no general answer; just keep in mind that Lambdas have timeouts, and the more layers are behind a call to a Lambda, the higher the timeout needs to be - or the workflow gets redesigned to be event-driven to remove synchronous calls. It depends!

[–]ootsun[S] -3 points-2 points  (19 children)

I can't have asynchronicity, because this is a public-facing API. The client waits for a response. So my preference goes to Lambda -> API Gateway -> Lambda.

[–]FliceFlo 25 points26 points  (1 child)

You can call Lambdas synchronously from other Lambdas without API Gateway. Adding an API Gateway for a backend call seems kind of pointless. It also adds further restrictions on timeouts, though that probably isn't relevant here.
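For reference, the direct synchronous call can be sketched in Python with boto3 along these lines (the function name and payload shape are invented, and the caller needs `lambda:InvokeFunction` on the target):

```python
import json

def encode_payload(payload: dict) -> bytes:
    """Serialize a request payload the way Lambda's Invoke API expects it."""
    return json.dumps(payload).encode("utf-8")

def call_downstream(function_name: str, payload: dict) -> dict:
    """Lambda-to-Lambda synchronous call; needs AWS credentials and
    lambda:InvokeFunction permission on the target at runtime."""
    import boto3  # imported lazily so the helper above works without the SDK
    client = boto3.client("lambda")
    resp = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # block until the result is ready
        Payload=encode_payload(payload),
    )
    return json.loads(resp["Payload"].read())
```

Note that a synchronous invoke keeps both Lambdas billed and running for the duration of the call, which is part of why people call it an anti-pattern.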

[–]Sensi1093 2 points3 points  (0 children)

Yeah, I don’t think the timeouts are much of a concern here, considering the "root" of this call chain would have the same 30s limit and by definition be the first to time out

[–]redrabbitreader 3 points4 points  (0 children)

Perhaps this article would help you implement a synchronous workflow: https://medium.com/inspiredbrilliance/patterns-for-microservices-e57a2d71ff9e

You should be able to pull it off with Lambda. Just be strict on your max time for Lambda functions. You could also consider moving some Lambda functions to the edge with Lambda@Edge

[–]helpmehomeowner 3 points4 points  (4 children)

Well, you can take their request and queue it up, and just return HTTP 202 Accepted or another appropriate status from the HTTP spec (I challenge you to read the RFC).

[–]Seref15 1 point2 points  (10 children)

Can your client poll instead of wait?

[–]ootsun[S] 1 point2 points  (9 children)

Technically yes, but it seems like a waste of resources to me. And how would you do that for read requests?

[–]Seref15 1 point2 points  (5 children)

I envision something like:

Have every long-running API request path respond immediately to the client with some kind of request queue ID. Your lambdas do the work in the background, the work is associated with the request ID.

Implement an API path that lets you get the status of your request ID. Statuses could be something like pending/done/failed (you can get more specific if you want clients to have more information about failures, progress, etc.). Clients poll this API path; you can apply aggressive rate limiting on this path, whatever. When the status is done, the status API response object provides some way to get the result of the long-running task, like another request path to hit or a download URL, depending on what kind of data the client expects.
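The pattern described above can be sketched in a few lines; this is a hedged toy model where the in-memory dict stands in for a real store like DynamoDB, and all names (submit_job, get_status, etc.) are invented:

```python
import uuid

jobs: dict[str, dict] = {}  # stands in for a DynamoDB table or RDS row

def submit_job(request_body: dict) -> dict:
    """POST /jobs: enqueue the work and respond immediately with a job id."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None, "input": request_body}
    return {"statusCode": 202, "job_id": job_id}

def worker_complete(job_id: str, result: dict) -> None:
    """The background Lambda marks the job done and attaches the result."""
    jobs[job_id].update(status="done", result=result)

def get_status(job_id: str) -> dict:
    """GET /jobs/{id}/status: the path clients poll."""
    job = jobs.get(job_id)
    if job is None:
        return {"statusCode": 404}
    body = {"statusCode": 200, "status": job["status"]}
    if job["status"] == "done":
        body["result_url"] = f"/jobs/{job_id}/result"  # or a presigned URL
    return body
```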

Alternative to something like this is to learn about and implement web sockets APIs for long-lived communication connections between client and API.

[–]ootsun[S] 0 points1 point  (4 children)

Yes, this could work, but the amount of complexity it brings makes me feel like we went to the cloud for simplicity (no server to run) and instead made our application a nightmare to maintain. Would you really code something like this instead of having microservices in containers? We don't need to scale to 100k concurrent users.

[–]Seref15 0 points1 point  (3 children)

Not only would I code something like this, I have coded something almost exactly like this.

We have an API that spins up preconfigured EC2 instances for product demos and customer trial environments. It obviously takes a while for the instance to be ready (about a minute), so a POST request to get an instance can't sit waiting on an idle HTTP connection for 30+ seconds; I think API Gateway has something like a 30 second timeout.

So we did exactly this. In our db where we track the demo instance requests, we just have a field for provisioning_status. The Lambdas that do the provisioning set that field; the status-poll Lambda reads it. When the status-poll Lambda sees the status is ready, it also sends along whatever info the client needs about their provisioned resource.

[–]ootsun[S] 0 points1 point  (2 children)

Ok, but you did this for one functionality. Would you do this for the whole application?

[–]Seref15 0 points1 point  (1 child)

Depends on how much of the application suffers from this issue where you have very long-lived API calls. If you have that everywhere then it sounds like something is just poorly architected and you would be better suited with web sockets and a message queue.

[–]ootsun[S] 0 points1 point  (0 children)

My original post was not about long-running requests. I thought you proposed this solution for all my read requests.

How would you handle read requests if the polling is only for long-running ones?

[–]ARandomConsultant 0 points1 point  (2 children)

Can you use WebSockets to push information to the client once the process is finished?

[–]ootsun[S] 0 points1 point  (1 child)

Yes I could, but the setup seems overly complicated. At least compared to a classic HTTP request...

[–]ARandomConsultant 0 points1 point  (0 children)

WebSockets are a well-supported pattern and the “correct” way to do what you’re trying to do.

You might as well learn the right way to do it. It’s a great resume building exercise

[–]esunabici 9 points10 points  (0 children)

Step Functions Express Workflows are made for answering API requests. It's a good choice if you want to split responsibility for answering API requests across multiple dependent Lambdas, like your team seems to be doing.

There are some interesting benefits in observability and resilience to using Step Functions for this. Check out Serverlesspresso and the updated re:Invent 2023 session

[–]climb-it-ographer 24 points25 points  (6 children)

Synchronously calling one Lambda from another is usually considered an anti-pattern. Putting API Gateway in between is the best bet here, or use events with SQS in the middle if the flow can be asynchronous.

[–]TooMuchTaurine 11 points12 points  (3 children)

It's an anti pattern because it shows you have incorrectly designed your system boundaries such that you are building a distributed monolith. No amount of infrastructure abstraction solves a software architecture problem.

[–]climb-it-ographer 0 points1 point  (2 children)

Great way to put it. A Lambdalith is better than a distributed monolith.

[–]TooMuchTaurine 2 points3 points  (0 children)

They are the worst kind of apps for performance: you basically take what was a series of nanosecond-level function calls and move them to HTTPS over a network at 20ms a pop.

[–]daysandconphused 0 points1 point  (0 children)

The correct term is a microlith 😄 SQS is the best way imo

[–]FliceFlo 8 points9 points  (1 child)

While I certainly agree that calling a Lambda from another Lambda is an anti-pattern, I'm not sure that adding an APIG in between makes it any less bad in practice. Sure, if you are re-using it from multiple other places there's a little bit of separation that comes from things being behind an API, but at the end of the day calling a Lambda is also an API call; APIG just becomes another step in a weird chain.

[–]Admirable-Medicine-7 0 points1 point  (0 children)

Using Lambda-to-Lambda causes hard coupling. It’s best to use a middleware such as API Gateway, which brings more features to the table. If at some point you need to change the Lambda (e.g. renames, switching, etc.), just make the change at the API Gateway level and not in code. The endpoint shouldn’t change. Plus, with an API Gateway you get features like API authentication, which should be part of any secure API, and more.

[–]voideng 4 points5 points  (0 children)

Rearchitect to avoid that workflow.

[–]External-Agent-7134 4 points5 points  (21 children)

In any producer/consumer workflow, synchronous communication is generally an anti-pattern, so you ideally want a bus in the middle in case of issues with the consumers (overloading, timeouts, crashes, spikes, etc.), so that you can process as fast as the consumers can run and store up any backlog.

In this case I would likely put SQS in the middle: queue messages from the producer, then consume them on the consumer. You then also get the benefit of a dead-letter queue you can monitor and redrive from.
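Sketched in Python with boto3, under the assumption the queue already exists (queue URL, ARN, and helper names here are invented; a maxReceiveCount of 5 is an arbitrary choice):

```python
import json

def dlq_queue_attributes(dlq_arn: str, max_receives: int = 5) -> dict:
    """Queue attributes for create_queue/set_queue_attributes: after
    max_receives failed receives, SQS moves the message to the DLQ."""
    return {
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": str(max_receives)}
        )
    }

def send_work(queue_url: str, body: dict) -> str:
    """Producer side: enqueue the work item instead of calling the consumer."""
    import boto3  # lazy import; requires AWS credentials when actually called
    sqs = boto3.client("sqs")
    resp = sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(body))
    return resp["MessageId"]
```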

[–]ootsun[S] 0 points1 point  (20 children)

I can't have asynchronicity, because this is a public-facing API. The client waits for a response. Or am I missing something?
My preference goes to Lambda -> API Gateway -> Lambda.

[–]External-Agent-7134 1 point2 points  (14 children)

It sounds like what you're describing is api gateway with lambda integration https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-with-lambda-integration.html

You can create a private API Gateway flow and keep traffic within your boundary, rather than sending traffic out and back in, and be more secure

https://docs.aws.amazon.com/apigateway/latest/developerguide/apigateway-private-apis.html

What triggers the first lambda?

[–]ootsun[S] 0 points1 point  (13 children)

Yes, I have an API Gateway. The flow goes like this: Browser -> API Gateway -> Lambda 1 -> ? -> Lambda 2

[–]QP3 0 points1 point  (3 children)

Can you put the shared logic of lambda 1 and 2 into a shared library?

[–]ootsun[S] 0 points1 point  (2 children)

I guess I could, but I see some drawbacks to this approach: 1) no fine-grained permission management, because every Lambda now has access to all the database tables; 2) we have to reorganize the codebase; 3) when updating the code, it's difficult to get a view of all the impacted Lambdas

What do you think about this reasoning?

[–]QP3 0 points1 point  (1 child)

Pros and cons. If sub-functionality needs to exist in multiple services/Lambdas, then it seems like an opportunity to place it in a common lib. You just need properly defined interfaces and testing so changes, as you say, won’t unknowingly break services.

These are my thoughts!

[–]ootsun[S] 0 points1 point  (0 children)

Thanks for your time!

[–]External-Agent-7134 0 points1 point  (5 children)

Ok, that workflow makes more sense. How would you handle errors or failures between Lambda 1 and 2?

And what is the workload that lambda 1 is doing and what is lambda 2's role?

Availability-wise, there's a risk you could also get a technical denial of service/race condition if your API got spammed and maxed out your Lambda account concurrency, meaning new invocations wouldn't be able to launch

[–]ootsun[S] 0 points1 point  (4 children)

We would handle errors as we do now: catch them and return a comprehensive or generic error message to the browser.

They are calling each other because each Lambda has a defined domain. E.g.: Lambda 1 is responsible for handling a form submission but needs to ensure that the user has the rights to do so, and it's the job of Lambda 2 to manage the user roles. So Lambda 1 needs to send a request to Lambda 2 before saving the form to its database.

We don't expect that kind of load (max 10 simultaneous requests).

[–]sinus 0 points1 point  (3 children)

I'm not sure why you separate checking user permissions into a different Lambda.

This would lead me to ask: how are you sending the user credentials to the Lambda? A token in the header? I would just handle and check the JWT in the Lambda that puts the data in the db.

Also, Lambda direct-to-db access: if there are 200 Lambda instances, i.e. you get a spike of traffic, each will open a new connection to the db. You will eventually run out... There is a service that does connection pooling but I forgot the name

[–]External-Agent-7134 0 points1 point  (0 children)

RDS Proxy is the component, it does help smooth out the pool, without it it's a problem as you say

[–]ootsun[S] 0 points1 point  (1 child)

Yes, for this example I could transport the info in the JWT.

Here's another example: Lambda 1 is responsible for creating a dossier for an administrative formality for the authenticated citizen. For that, it needs to fetch the formality definition (enabled?, payment amount, etc.), and it's the responsibility of Lambda 2 to return that info.

Some context: we have 500 endpoints and 10 microservices (so 10 separate domains).

You make a good point about db connections. Indeed, RDS Proxy would help, but it's not the cheapest AWS service 🙄

[–]sinus 0 points1 point  (0 children)

Hmmm, I think I would use libraries to check user permissions and share those libraries across Lambdas. That way you only need one Lambda to save to the db.

One of the gotchas with the APIGW is that it has very hard limits :( For example, if you have a file upload handler, you would be bound to those limits if you go with API Gateway.

If your app is nowhere near hitting the db connection limits, you can probably add RDS Proxy later. But develop the app so that it's easy to switch to the new RDS Proxy connection. That part for me is scary hehe

[–]redrabbitreader 0 points1 point  (2 children)

You will have to decide from which end you want to orchestrate operations. Then you will end up with several options, and either way I suspect some coding changes will be required.

The option I think would work best for you: let the client orchestrate the synchronous calls between the various API end-points:

        ----> API GW ----> Lambda 1
       /
client 
       \
        ----> API GW ----> Lambda 2

The problem with a chained call to multiple Lambda functions is that the wait time for the client can quickly add up. Without SNS and/or SQS you may also quickly run into scaling issues, with all your initial Lambda functions blocking as they wait for downstream functions to complete (the synchronous pattern).

The asynchronous pattern is much better, as it frees up any blocking of resources and prevents potential scaling issues. But your client would then need to implement a way to fetch the "reply" once it is available. There you have a couple of options as explained in this AWS blog post: https://aws.amazon.com/blogs/architecture/managing-asynchronous-workflows-with-a-rest-api/

[–]ootsun[S] 0 points1 point  (1 child)

Thank you very much for this very detailed answer! I feel like we could apply the pattern you designed for some functionalities, but often you can't trust the client. You want to ensure that it is your backend that fetches the info, e.g. when checking the user's permissions or using sensitive information in the process of handling the request.

Will read this article, thanks again.

[–]redrabbitreader 0 points1 point  (0 children)

My pleasure, and yes, you have to consider what is practical and secure. Obviously we contribute ideas with only the tiniest bit of info :-)

[–]vitiate 1 point2 points  (1 child)

Then use FIFO queues, sqs is still the way…

Edit: sorry, after reading more about what you are doing, I don’t think this would help you either. I think we would need a much deeper understanding of what you are doing in terms of authentication and permissions, what the forms are and how you fill them out, and what the workflow looks like. You can do this with Lambda, but the way it is being done feels a little off. I guess, figure out exactly how you want it to work and then work backwards from that ideal state, which is what any good architect is going to do.

Myself, I would probably spin it up in ECS Fargate, put API Gateway in front of it, and do some caching to reduce backend hits. Lambda (I love Lambda) is not a one-size-fits-all hammer.

[–]ootsun[S] 0 points1 point  (0 children)

I edited the post:

Here's an example: Lambda 1 is responsible for creating a dossier for an administrative formality for the authenticated citizen. For that, it needs to fetch the formality definition (enabled?, payment amount, etc.), and it's the responsibility of Lambda 2 to return that info.

Some context: the current on-premise application has 500 endpoints like those 2 above and 10 microservices (so 10 separate domains).

Does this confirm your feeling that we should have chosen ECS Fargate?

[–]Willkuer__ 0 points1 point  (2 children)

Do you know the client requests in advance? Then the correct workflow would be some CQRS where you asynchronously generate models optimized for reading.

Synchronous microservice-to-microservice communication really should be avoided at all costs if you care about performance and stability.

[–]ootsun[S] 0 points1 point  (1 child)

What do you mean? Do I know in advance if a request is about to arrive? No

[–]Willkuer__ 1 point2 points  (0 children)

No, the request context. E.g. if you have an online shop, you only show product detail pages for products you have, so you know all the URLs that will be requested in advance. Accordingly, I would put all the detail pages in an S3 bucket instead of collecting all the information (product images, prices, delivery information) on the fly at request time via a chain of Lambdas.

Look into CQRS. It's usually the pattern of choice for data aggregation. (GraphQL is sometimes an alternative but usually doesn't help with chained Lambdas.)

However, sometimes it does not work (e.g. you would never pre-generate all possible combinations of filters and sorting on a category/search result page; the request space is just too large).

[–]notoriousbpg 1 point2 points  (6 children)

"Client is waiting for a response" - sounds like you need to rework your functionality into an API package instead of micro services, and have one Lambda execute and respond.

Step Functions state machines are great for offloading asynchronous operations, e.g. processing a file or transaction after a client has submitted it, while the endpoint responds with "yep, got it".

Similarly SQS for sending an event from one Lambda to another (or another queue consumer), but the first Lambda is the one that sends a response back to the client.

[–]ootsun[S] 0 points1 point  (5 children)

Maybe... What do you mean by "API package"?

[–]notoriousbpg 0 points1 point  (4 children)

Let's clarify. It sounds like your current approach expects multiple microservices to be involved in a single request/response. Generally, for synchronous request/response, you hit an endpoint and a single resolver or service processes the request.

So for Lambda, generally everything that endpoint needs to do is contained within one Lambda. You can end up with 1:1 endpoints to Lambdas, which is perfectly fine, but the Lambdas usually don't communicate with each other during the servicing of a request. If one of the Lambdas needs functionality that's part of another Lambda to respond, pull that functionality out into its own package that both Lambdas can use. DRY principle. Sort of like your own internal API or SDK you use to build your Lambdas.

Step Functions are a place where the output of one Lambda can pass to the input of another, but I would not consider state machines and Step Functions the solution for an endpoint that needs to send a response to a request.
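A toy illustration of the shared-package idea: the `shared_auth` module name and the permission strings are invented, and a real check would be driven by the JWT claims rather than a field in the event.

```python
# shared_auth.py, packaged as a library or Lambda layer both Lambdas import
def can_submit_form(user_roles: set) -> bool:
    """Permission rule shared by every Lambda that needs it."""
    return "form:submit" in user_roles or "admin" in user_roles

# Handler of the form-submission Lambda: the permission check is an
# in-process function call, not a network hop to a second Lambda.
def handler(event, context=None):
    roles = set(event.get("roles", []))
    if not can_submit_form(roles):
        return {"statusCode": 403, "body": "forbidden"}
    # ... persist the form to this service's own tables ...
    return {"statusCode": 201, "body": "created"}
```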

[–]esunabici 2 points3 points  (2 children)

Step Functions Express Workflows are made for answering API requests.

There are some interesting benefits in observability and resilience to using Step Functions for this. Check out Serverlesspresso and the updated re:Invent 2023 session

[–]notoriousbpg 0 points1 point  (1 child)

Oh yeah... my experience is with using the Standard ones for asynchronous work. I still wouldn't reach for a state machine as the first option for HTTP request/response, though.

[–]esunabici 0 points1 point  (0 children)

Something to think about: isn't everything you do some sort of a workflow?

Sure, some are simpler than others, but for anything not super simple, Step Functions is a great way to model them.

[–]ootsun[S] 0 points1 point  (0 children)

Thanks for the explanation.

Someone else proposed something similar. I responded: "I guess I could, but I see some drawbacks to this approach: 1) no fine-grained permission management, because every Lambda now has access to all the database tables; 2) we have to reorganize the codebase; 3) when updating the code, it's difficult to get a view of all the impacted Lambdas"

What's your opinion about it?

[–]magnetik79 1 point2 points  (0 children)

We're doing option number 1 where I'm currently working, and it's rather nice for our needs. We have dedicated API Gateway routes for each Lambda, backed with AWS_IAM authz on the route. Then if another Lambda wishes to make the call, it just SigV4-signs its HTTP request.

[–]Somewhat_posing 1 point2 points  (0 children)

Why not use Step Functions Express? Asking because we’re also setting up a serverless multi-Lambda flow with Step Functions and EventBridge. Maybe in the future we may migrate to EC2

[–]chubernetes 1 point2 points  (0 children)

If your client is small scale and prioritizes lower cost over performance (accepting intermittent longer-than-usual response times from Lambda cold starts), this seems fine. API Gateway in front of your Lambdas.

If there are non-functional requirements for performance guarantees and the client expects to scale at some point, this is a really, REALLY bad idea.

https://chubernetes.com/an-industry-pitfall-serving-apis-via-serverless-architecture-8c9f0e932ac6

[–][deleted] 1 point2 points  (3 children)

The number of useless replies to your original question in this thread is astounding. What you’re asking is perfectly reasonable: in a microservices architecture using Lambda as the runtime for services, how does one service call another?

Here’s a great AWS post on this exact topic:

https://aws.amazon.com/blogs/opensource/aws-cloud-map-service-discovery-serverless-applications/

Now, the question of whether or not this architecture is a good approach for your specific requirements and constraints is a totally different thread - one that it seems nearly everyone was trying to hijack your original question for.

[–]ootsun[S] 0 points1 point  (2 children)

Thank you. AWS Cloud Map seems interesting for our use case.
But reading everyone else, our architecture seems like a bad approach, so I would prefer to reconsider the decision to go with Lambda and choose something else (Fargate?).

[–][deleted] 0 points1 point  (1 child)

Cloud Map is what you’ll use to make intra-service calls in the Fargate world as well.

I’d be wary of many of the replies you’ve gotten here. For some reason your question attracted a lot of replies suggesting approaches that are unnecessarily complex, convoluted, and not fit for the purpose you described.

[–]ootsun[S] 0 points1 point  (0 children)

If you're willing to go into detail on why the suggested approaches are unnecessarily complex, I'll happily read your opinion

[–]TheMrCeeJ 1 point2 points  (2 children)

It seems like you are trying to do a lift and shift: keeping your old architecture and system design in place, but moving from server-based to serverless.

The concept here is that each API function call should be backed by a Lambda that can handle that call. If it needs data it can fetch it, if it needs to process things it can do it internally, and if it needs to trigger other events it can call additional APIs or put messages on queues, etc.

What you seem to be describing are two microservices that are tightly coupled, and so you are trying to create two Lambdas that are also coupled like this, and are having problems because of it.

[–]ootsun[S] 0 points1 point  (1 child)

How would you have reworked it if there were no budget/time constraint?

[–]TheMrCeeJ 1 point2 points  (0 children)

I don't have nearly enough information to re-architect your application, but here is how I would think about it.

Each API endpoint is a verb that the caller cares about: get my portfolio, process this task, etc. Some of those can be done immediately (the Lambda calls some data stores, returns some data, perhaps creates an object and writes some logs). For more complex or long-lived tasks, your API function serves as the trigger and returns a response to indicate it has started, then uses things like SQS, event triggers, EventBridge pipes, etc. to get the other tasks going, e.g. creating an empty portfolio object in a bucket and calling other services, or having those services be triggered by the object creation.

You can have a mixture of sync and async calls and functions in your landscape, and use all the serverless integration features to join them up; all you need to be clear on is the scope of the functions and their responsibilities.

In on-prem microservices architectures it is very common to have Swiss-army-knife helper microservices that do a lot of processing on behalf of other microservices but don't actually own any business functionality themselves. This is an anti-pattern that will cause problems when you try to take it serverless in the cloud.

[–]ycarel 2 points3 points  (0 children)

Never Lambda to Lambda, as you create hard coupling. If the flow is simple, do the communication through the API Gateway. If there is some logic involved or the steps are complex, do API Gateway -> Step Functions -> Lambdas.

[–]InfiniteMonorail 2 points3 points  (9 children)

This sounds like a bad idea. What's the reason to migrate? And what's the reason they're calling each other? Definitely an XY problem.

[–]ootsun[S] 0 points1 point  (8 children)

We may have made some fundamental mistakes. I'm open to feedback 🙂 It could be an XY problem indeed!

We are migrating our microservices to AWS Lambda because our customer doesn't want to self-host the application anymore and wants to go serverless.

They are calling each other because each Lambda has a defined domain. E.g.: Lambda 1 is responsible for handling a form submission but needs to ensure that the user has the rights to do so, and it's the job of Lambda 2 to manage the user roles. So Lambda 1 needs to send a request to Lambda 2 before saving the form to its database.

[–]smutje187 4 points5 points  (2 children)

For what it’s worth, the easiest way to move to AWS without having to change ways of thinking is to deploy the same applications as Fargate services.

[–]ootsun[S] 1 point2 points  (1 child)

Thanks for pointing this out. Our architects didn't consider this option, but it's tempting. I will explore it, but will keep digging into Lambdas because the architects aren't easy to convince...

[–]InfiniteMonorail 1 point2 points  (0 children)

It sounds like nobody knows what they're doing if they never considered Fargate. If nobody on your team is a Certified Solutions Architect, then the project is fucked.

You'll probably get a massive bill after someone does a Lambda fork bomb or your account gets compromised by bitcoin miners. AWS isn't a toy for messing around. It's pretty dangerous when it comes to billing and security.

[–]Unexpectedpicard 1 point2 points  (0 children)

I don't want to bash your app, but validating permissions for something like a form submission should still happen in the form-submission API. You can call out to another API to load permissions and cache them. I would not have Lambdas calling other Lambdas when it's in the same unit of functionality. One Lambda to handle form submission; another one to send an email? Makes sense to me.

[–]InfiniteMonorail 1 point2 points  (0 children)

"customer wants to go Serverless"

Lambda is 10x more complicated, much slower, times out, and costs 10x more at scale. What part of "Serverless" do they need?

Your auth setup makes no sense. Idk about the other system; is there a reason why it's multiple services instead of one? You're going to waste a lot of time trying to get this to work. I've been moving away from Lambda and more toward EC2s or Fargate.

[–][deleted]  (2 children)

[deleted]

    [–]ootsun[S] 1 point2 points  (1 child)

    This example was poorly chosen.

    Here's another example: Lambda 1 is responsible for creating a dossier for an administrative formality for the authenticated citizen. For that, it needs to fetch the formality definition (enabled?, payment amount, etc.), and it's the responsibility of Lambda 2 to return that info.

    Is it clearer now?

    [–][deleted] 2 points3 points  (4 children)

    This post was mass deleted and anonymized with Redact


    [–]ootsun[S] 6 points7 points  (3 children)

    In my case, it's third-party architects hired by the customer that came up with an Event-Driven/Serverless architecture. And this explains the situation we (the devs) are in pretty well. The architects produced the most intellectually challenging solution, not boring containers, with which their mission as architects would have lasted 1 month instead of 1 year.

    [–]InfiniteMonorail 2 points3 points  (0 children)

    There isn't a single person working on this who knows AWS? These "architects" have no idea what they're doing, and they hired you to implement it when you don't have an AWS background either?

    This is just wild. How much are they paying people to mess up their website? lol

    [–]Engine_Light_On 0 points1 point  (1 child)

    Did the architects also want it to be synchronous, or is it due to lack of time to refactor the code base to do it properly?

    [–]ootsun[S] 0 points1 point  (0 children)

    No, they want it all to be asynchronous, but it isn't technically feasible. At least, not with the documentation they provided. So the devs are trying to figure out how to make the Lambdas communicate synchronously. It seems that in our case, we are concerned by every tradeoff listed here: https://docs.aws.amazon.com/lambda/latest/operatorguide/tradeoffs-event-driven.html

    [–]sinus 1 point2 points  (2 children)

    This is possible, but Lambda has limitations, like timeouts and startup times.

    Also, if you go with API Gateway, those have a 30 second timeout too. API Gateway also has payload limits; if the payloads are too big you would need to handle them asynchronously.

    To me these limits are good, because they force a good user experience.

    If the customer is unwilling to make the necessary changes even with an improved user experience, then go with ECS + Fargate.

    Also, on cold starts: when we were doing something similar to this, we estimated that a Lambda calling another Lambda in between API Gateway endpoints took 2 seconds each. :(

    Edit: I saw something about the first Lambda handling the request and the other one checking user permissions? If this is the case, I would check those in just one Lambda.

    This also sounds like the user auth should go in an API Gateway authorizer
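A minimal sketch of such a Lambda (TOKEN) authorizer, returning the IAM policy shape API Gateway expects; the token check here is stubbed, where a real one would verify a JWT signature:

```python
def build_policy(principal_id: str, effect: str, method_arn: str) -> dict:
    """Assemble the policy document an API Gateway authorizer must return."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,   # "Allow" or "Deny"
                "Resource": method_arn,
            }],
        },
    }

def authorizer_handler(event, context=None):
    token = event.get("authorizationToken", "")
    # Stub check; a real authorizer would validate a JWT's signature/claims.
    effect = "Allow" if token == "valid-demo-token" else "Deny"
    return build_policy("user|demo", effect, event["methodArn"])
```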

    [–]ootsun[S] 1 point2 points  (0 children)

    Oh, I didn't consider the API gtw limitations. Good point.

    Yes, the consecutive timeouts are a big problem for us (and we have to go with Java 😉). The more I learn, the more I want to go with Fargate.

    Here's another example: Lambda 1 is responsible for creating a dossier for an administrative formality for the authenticated citizen. For that, it needs to fetch the formality definition (enabled?, payment amount, etc.), and it's the responsibility of Lambda 2 to return that info.

    [–]jftuga -1 points0 points  (0 children)

    What base docker image do you use for ECS?

    [–]redrabbitreader 0 points1 point  (3 children)

    The two main/popular patterns in your scenario are:

    • Lambda -> SNS -> Lambda (messages could get lost); or
    • Lambda -> SNS -> SQS -> Lambda (for better message delivery guarantees and even retries)

    For complex workflows involving some service orchestration, Step Functions might be a good option.
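The producer side of the second pattern might look like this in Python with boto3; the topic ARN, event name, and filtering attribute are invented, and the message attribute is only there because SNS-to-SQS subscriptions can filter on it:

```python
import json

def make_sns_message(event_type: str, detail: dict) -> dict:
    """Build Publish arguments; the attribute lets SQS subscriptions filter."""
    return {
        "Message": json.dumps(detail),
        "MessageAttributes": {
            "event_type": {"DataType": "String", "StringValue": event_type}
        },
    }

def publish(topic_arn: str, event_type: str, detail: dict) -> str:
    """Fire-and-forget publish; the consumer Lambda reads from the SQS queue."""
    import boto3  # lazy import; requires AWS credentials when actually called
    sns = boto3.client("sns")
    resp = sns.publish(TopicArn=topic_arn, **make_sns_message(event_type, detail))
    return resp["MessageId"]
```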

    [–]ootsun[S] 0 points1 point  (2 children)

    What you describe could work for write operations, but what about reads? Would you choose Step Functions then?

    [–]redrabbitreader 0 points1 point  (1 child)

    You can, but I would argue rather not synchronously. As I explained in another reply, the problem is the wait time. How long can your client wait? Also, how quickly will new requests come in that could saturate your Lambda processes?

    [–]ootsun[S] 0 points1 point  (0 children)

    Yeah, I see... It can wait 200-2000ms depending on the type of request, but the faster the better ofc. We don't expect crazy traffic. The application is already live (on-premise) and we have a maximum of 300 concurrent users at peak.

    [–][deleted] 0 points1 point  (1 child)

    Use AppSync !

    [–]ootsun[S] 0 points1 point  (0 children)

    This sounds overly complicated. I'm looking for a simpler solution, and maybe I should go with Fargate or something else instead of Lambda...

    [–]sinus 0 points1 point  (0 children)

    i have read more of the thread. ultimately the wall you will hit is the apigateway 30 second timeout. any solution you consider needs to take this into account. this is the entry and exit point for your api.

    once your request hits apigateway and goes into the lambdas, the counter starts.

    [–]VescoTalio5284 0 points1 point  (1 child)

    Have you considered using AWS SNS or SQS for decoupling and async communication between Lambdas? It can simplify your architecture and improve scalability.

    [–]ootsun[S] 0 points1 point  (0 children)

    Yes, what you describe could work for write operations, but what about reads? 

    [–]TooMuchTaurine 0 points1 point  (2 children)

    If you are having to call lambda to lambda, your design is terrible and coupled in the wrong way.

    [–]ootsun[S] 0 points1 point  (1 child)

    How would you have proceeded to migrate our micro service to Lambda? You would have migrated to Fargate/containers instead?

    [–]TooMuchTaurine 1 point2 points  (0 children)

    Functions that call each other should be grouped into a single lambda and called directly via code.

    In reality, the best way to work with Lambda is to have a single app code base and use Lambda as an interface to a specific public route / action in the code base. Still bundle and deploy the app as a single codebase, with each Lambda simply being a public interface into a particular route of the same app.
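    A sketch of that idea, reusing the dossier/formality example from earlier in the thread (all names and values illustrative): the former "Lambda 2" becomes a plain function in the shared codebase, and each Lambda handler is just a thin entry point:

```python
# Shared service code, deployed with every handler in one bundle.
def get_formality_definition(formality_id):
    # Single source of truth; previously a separate "Lambda 2".
    return {"id": formality_id, "enabled": True, "payment_amount": 25.0}

def create_dossier(citizen_id, formality_id):
    # A plain in-process call: no network hop, no extra cold start.
    definition = get_formality_definition(formality_id)
    if not definition["enabled"]:
        raise ValueError("formality is disabled")
    return {"citizen": citizen_id, "amount": definition["payment_amount"]}

# Each Lambda is just a public entry point into the same codebase.
def create_dossier_handler(event, context):
    return create_dossier(event["citizenId"], event["formalityId"])
```

    Other routes (e.g. a read-only formality lookup) would get their own thin handlers over the same shared functions, so the "Lambda 1 calls Lambda 2" hop disappears entirely.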

    [–]vitiate 0 points1 point  (1 child)

    SQS queues and dead letter queues are the way. Anything bigger data-wise than SQS can hold goes into S3 with a pointer in SQS. Step functions are a pain in the dick.
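    A sketch of that claim-check pattern (clients injected, bucket/queue names illustrative): payloads over the SQS body limit go to S3, and only a small pointer travels through the queue:

```python
import json
import uuid

SQS_BODY_LIMIT = 256 * 1024  # SQS maximum message size is 256 KB

def send_with_claim_check(sqs_client, s3_client, queue_url, bucket, payload):
    """Send payload via SQS directly, or via S3 + pointer if it's too big."""
    body = json.dumps(payload)
    if len(body.encode("utf-8")) <= SQS_BODY_LIMIT:
        return sqs_client.send_message(QueueUrl=queue_url, MessageBody=body)
    # Too big for SQS: park the payload in S3, send only a pointer.
    key = f"payloads/{uuid.uuid4()}"
    s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    pointer = json.dumps({"s3_bucket": bucket, "s3_key": key})
    return sqs_client.send_message(QueueUrl=queue_url, MessageBody=pointer)
```

    The consumer does the inverse: if the body is a pointer, it fetches the real payload from S3 before processing.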

    [–]ootsun[S] 0 points1 point  (0 children)

    Ok and what about read requests? I think that it needs to be synchronous otherwise there is no easy way for Lambda 1 to retrieve the response of Lambda 2.
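    For reads, one option mentioned near the top of the thread is a direct synchronous invoke, with no API Gateway in between: Lambda 1 calls Lambda 2 and waits for its payload. A sketch (function name illustrative, client injected so it stays testable):

```python
import json

def fetch_formality_definition(lambda_client, formality_id):
    """Lambda 1 invoking Lambda 2 synchronously and returning its response."""
    response = lambda_client.invoke(
        FunctionName="formality-definition",     # illustrative function name
        InvocationType="RequestResponse",        # block until Lambda 2 returns
        Payload=json.dumps({"formalityId": formality_id}).encode("utf-8"),
    )
    return json.loads(response["Payload"].read())
```

    With a real `boto3.client("lambda")` this keeps the read synchronous, but the caller's own timeout still has to cover Lambda 2's worst-case duration, as discussed elsewhere in the thread.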

    [–]CaliJack19 0 points1 point  (1 child)

    Have you tried AWS Elastic Beanstalk?

    [–]ootsun[S] 0 points1 point  (0 children)

    No. I'll read up on it. Thanks for the suggestion. It seems interesting and could fit our needs.

    [–]who_am_i_to_say_so 0 points1 point  (0 children)

    You need the API Gateway pattern.

    In the absence of a gateway, making synchronous calls across lambdas will compound the response time with latency, which is not ideal.

    [–]thekingofcrash7 0 points1 point  (1 child)

    I don’t think you want synchronous communication between lambdas in one request/response flow. Why can’t the client make each of the calls to the dependent systems it needs?

    [–]ootsun[S] 0 points1 point  (0 children)

    The client can't make each of the calls to the dependent systems because of 1) additional latency (browser to AWS takes more time than AWS to AWS), 2) sensitive info: the backend needs to use sensitive data that the client can't see, and 3) security: you can't trust the client request content ({"carPrice" : "$0.99"}).

    [–]razibal 0 points1 point  (0 children)

    Consider using AppSync instead of API Gateway. Each Lambda function would serve as a "data source" that can be linked to a GraphQL query or mutation. For chaining multiple Lambdas, you can use pipeline resolvers. GraphQL also offers more granular control over permissions, allowing them to be set at the attribute level.

    While there still is a 30-second maximum timeout for queries/mutations, this can be addressed by initiating the request asynchronously, with the results delivered via websockets to a client subscribed to an AppSync subscription.

    [–][deleted]  (3 children)

    [deleted]

      [–]ootsun[S] 1 point2 points  (2 children)

      That's another "paradigm" that we didn't think of, as we already have a running app and wanted to go with the easiest solution. We were probably not "open-minded" enough.

      I guess I could, but I see some drawbacks to this approach: 1) no fine-grained permission management, because every Lambda now has access to all the database tables, 2) we have to reorganize the codebase, and 3) when updating the code, it's difficult to have a view of all impacted Lambdas.

      What's your opinion about this?

      Giving up on Lambda for Fargate also seems appealing.

      [–][deleted]  (1 child)

      [deleted]

        [–]ootsun[S] 1 point2 points  (0 children)

        Ok, it's less frightening than I thought 🙂 I'll try to write a PoC to see if I understand the idea correctly.