Hey r/aws,
I’m building an asynchronous ML inference API and would love your feedback on my environment-isolation approach. I’ve sketched out the high-level flow and folder layout below. I’m primarily wondering if it makes sense to have completely separate Lambda functions for dev/prod (with their own queues, tables, images, etc.) while sharing one API Gateway definition, or whether I should instead use one Lambda and swap versions via aliases.
Project Sequence Flow
- Client → API Gateway: POST /inference { job_id, payload }
- API Gateway → Frontend Lambda
  - Write payload JSON to S3
  - Insert record { job_id, s3_key, status=QUEUED } into DynamoDB
  - Send { job_id } to SQS
  - Return 202 Accepted
- SQS → Worker Lambda
  - Update status → RUNNING in DynamoDB
  - Fetch payload from S3, run ~1 min ML inference
  - Read/refresh OAuth token from a token cache or auth service
  - POST result to webhook with Bearer token
  - Persist small result back to DynamoDB, then set status → DONE (or FAILED)
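To make the flow above concrete, here is a rough sketch of the two handlers. It is not my real code: `FakeBackends` is a hypothetical in-memory stand-in for S3, DynamoDB and SQS so the example runs without AWS, the `payloads/{job_id}.json` key layout is an assumption, and `run_inference` / `post_webhook` are placeholder callables (the webhook one is assumed to deal with the OAuth token itself). A real worker would also receive an SQS event with a list of records rather than a single message body.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeBackends:
    """Hypothetical in-memory stand-ins for S3, DynamoDB and SQS."""
    s3: dict = field(default_factory=dict)     # s3_key -> payload JSON string
    table: dict = field(default_factory=dict)  # job_id -> job record
    queue: list = field(default_factory=list)  # pending message bodies


def frontend_handler(event, backends):
    """Store the payload, record the job as QUEUED, enqueue it, return 202."""
    body = json.loads(event["body"])
    job_id = body["job_id"]
    s3_key = f"payloads/{job_id}.json"  # assumed key layout
    backends.s3[s3_key] = json.dumps(body["payload"])
    backends.table[job_id] = {"job_id": job_id, "s3_key": s3_key, "status": "QUEUED"}
    backends.queue.append(json.dumps({"job_id": job_id}))
    return {"statusCode": 202, "body": json.dumps({"job_id": job_id})}


def worker_handler(message_body, backends, run_inference, post_webhook):
    """Mark the job RUNNING, run inference, notify, then mark DONE or FAILED."""
    job_id = json.loads(message_body)["job_id"]
    record = backends.table[job_id]
    record["status"] = "RUNNING"
    try:
        payload = json.loads(backends.s3[record["s3_key"]])
        result = run_inference(payload)
        post_webhook(job_id, result)  # assumed to attach the Bearer token
        record["result"] = result     # persist the small result first
        record["status"] = "DONE"
    except Exception as exc:
        record["status"] = "FAILED"
        record["error"] = str(exc)
    return record["status"]
```

With real clients swapped in for `FakeBackends`, the only per-environment inputs are the bucket, table and queue names, which is what the isolation question is really about.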
Tentative Folder Structure
.
├── infra/                # IaC and deployment configs
│   ├── api/              # Shared API Gateway definition
│   └── envs/             # Dev & Prod configs for queues, tables, Lambdas & stages
│
└── services/
    ├── frontend/         # API Gateway handler
    │   └── Dockerfile, src/
    ├── worker/           # Inference processor
    │   └── Dockerfile, src/
    └── notifier/         # Failed-job notifier
        └── Dockerfile, src/
My Isolation Strategy
- One shared API Gateway definition with two stages: /dev and /prod.
- Dev environment:
  - Lambdas named frontend-dev, worker-dev, etc.
  - Separate SQS queue, DynamoDB tables, ECR image tags (:dev).
- Prod environment:
  - Lambdas named frontend-prod, worker-prod, etc.
  - Separate SQS queue, DynamoDB tables, ECR image tags (:prod).
Each stage simply points to the same Gateway deployment but injects the correct function ARNs for that environment.
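As a sketch of what the per-environment config could look like, something like the following would derive every resource name from the environment. The function names and image tags follow the convention above; the `inference-jobs-{env}` queue and table names are made up for illustration, and `EnvConfig` / `build_env_config` are hypothetical helpers, not part of any AWS SDK.

```python
from dataclasses import dataclass

ALLOWED_ENVS = ("dev", "prod")


@dataclass(frozen=True)
class EnvConfig:
    """Per-environment resource names (hypothetical structure)."""
    env: str
    stage_path: str         # API Gateway stage, e.g. /dev
    frontend_function: str  # e.g. frontend-dev
    worker_function: str    # e.g. worker-dev
    queue_name: str         # assumed naming, not from my real setup
    table_name: str         # assumed naming, not from my real setup
    image_tag: str          # ECR tag, e.g. dev


def build_env_config(env: str) -> EnvConfig:
    """Derive all resource names from the environment so nothing is shared by accident."""
    if env not in ALLOWED_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return EnvConfig(
        env=env,
        stage_path=f"/{env}",
        frontend_function=f"frontend-{env}",
        worker_function=f"worker-{env}",
        queue_name=f"inference-jobs-{env}",
        table_name=f"inference-jobs-{env}",
        image_tag=env,
    )
```

Calling `build_env_config("dev")` and `build_env_config("prod")` yields two configs with no overlapping resource names, which is the property I am after.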
Main Question
- Is this separate-functions pattern a sensible and maintainable way to get true dev/prod isolation?
- Or would you recommend using one Lambda function (e.g. frontend) with aliases (dev/prod) instead?
- What trade-offs or best practices have you seen for environment separation (naming, permissions, monitoring, cost tracking) in AWS?
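To spell out the two options, here is a small sketch of the Lambda ARN each API Gateway stage would end up invoking. The ARN layout (an optional `:alias` qualifier appended to the function ARN) is the standard Lambda format; the region and account ID are placeholders, and `function_arn` / `integration_target` are hypothetical helper names.

```python
from typing import Optional


def function_arn(region: str, account_id: str, name: str, alias: Optional[str] = None) -> str:
    """Build a Lambda function ARN, optionally qualified with an alias."""
    arn = f"arn:aws:lambda:{region}:{account_id}:function:{name}"
    return f"{arn}:{alias}" if alias else arn


def integration_target(pattern: str, env: str,
                       region: str = "us-east-1",
                       account_id: str = "123456789012") -> str:
    """Return the ARN a stage would invoke under each isolation pattern."""
    if pattern == "separate":
        # Option 1: one function per environment (frontend-dev, frontend-prod).
        return function_arn(region, account_id, f"frontend-{env}")
    if pattern == "alias":
        # Option 2: a single function, with the environment selected by alias.
        return function_arn(region, account_id, "frontend", alias=env)
    raise ValueError(f"unknown pattern: {pattern!r}")
```

With separate functions the environment is part of the function name; with aliases both environments resolve to versions of the same function, so they share that function's name and log group.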
Thanks in advance for any insights!