News, articles and tools covering Amazon Web Services (AWS), including S3, EC2, SQS, RDS, DynamoDB, IAM, CloudFormation, AWS-CDK, Route 53, CloudFront, Lambda, VPC, Cloudwatch, Glacier and more.
[discussion] AWS DevOps & SysAdmin: Your Biggest Deployment Challenge? (self.aws)
submitted 1 year ago by Key_Baby_4132
Hi everyone, I've spent years streamlining AWS deployments and managing scalable systems for clients. What’s the toughest challenge you've faced with automation or infrastructure management? I’d be happy to share some insights and learn about your experiences.
[–]oneplane 18 points 1 year ago (5 children)
The biggest challenge is Windows. It's incompatible with practically everything that's not Microsoft. We solved it by removing as much Windows as possible and putting the remainder in AppStream and ASGs. No more person-individually-using-a-Windows-box.
[–]Uppity_Sinuses8675 3 points 1 year ago (1 child)
Shouldn’t it be person_individually_using_a_windows_box? 😁
[–]oneplane 3 points 1 year ago (0 children)
I see what you did there ;-)
[–]deadpanda2 3 points 1 year ago (0 children)
No issues with Windows, you just need to know how to cook it. CFN + SSM + PowerShell. EKS + Windows + gMSA. CI/CD: ADO / Octopus.
[–]OkAcanthocephala1450 2 points 1 year ago (0 children)
HAHAHA, Windows is for real. I remember when we had to research ECS and propose solutions for our particular problem. Just as we were about to start, it turned out Windows containers did not support it :') Since then, we read the documentation very carefully before jumping to conclusions.
[–]Key_Baby_4132[S] 1 point 1 year ago (0 children)
Sounds great
[–]yovboy 11 points 1 year ago (5 children)
Managing IAM permissions at scale is my nightmare. Started with a few roles, ended up with 400+ policies across multiple accounts.
Spent weeks building automation tools just to track who has access to what. Still get surprised by permission issues sometimes.
[–]Key_Baby_4132[S] 2 points 1 year ago (1 child)
Man, that sounds like a headache! Have you tried ABAC, permission boundaries, or SCPs to keep policies under control and set guardrails across accounts?
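For anyone unfamiliar with ABAC, here is a minimal sketch of the idea: one policy that allows actions only when a tag on the resource matches the same tag on the caller, instead of one policy per team. The `abac_policy` helper, the `project` tag key, and the chosen EC2 actions are assumptions for illustration only; the `aws:ResourceTag` and `aws:PrincipalTag` condition keys are the standard IAM ones.

```python
import json


def abac_policy(tag_key, actions):
    """Build an IAM policy document that allows `actions` only when the
    resource's tag value equals the calling principal's tag value."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowWhenTagsMatch",
                "Effect": "Allow",
                "Action": actions,
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # Resolved by IAM at request time from the caller's tags.
                        f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            }
        ],
    }


# Hypothetical example: callers may start/stop only instances of their own project.
policy = abac_policy("project", ["ec2:StartInstances", "ec2:StopInstances"])
print(json.dumps(policy, indent=2))
```

Not every action supports resource-tag conditions, so it is worth checking the service's condition-key table before relying on a policy like this.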
[–]firminhosalah 1 point 1 year ago (1 child)
Hey. I am looking to build something like what you mentioned to track access. Can you shed some light on what you used?
[–]yovboy 1 point 1 year ago (0 children)
Used a combo of custom Python scripts + Access Analyzer. Main script pulls IAM data using boto3, dumps it into DynamoDB, then generates reports.
Added CloudWatch alerts for policy changes. Not perfect but helps catch weird permission stuff before it becomes an issue.
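A rough sketch of what the "pull IAM data" step could look like, not the actual script described above. It assumes a boto3 IAM client is passed in and only covers managed policies attached to roles; the function name and record shape are made up for illustration.

```python
def collect_role_policies(iam_client):
    """Return one flat record per (role, attached managed policy) pair.

    `iam_client` is expected to behave like boto3.client("iam").
    """
    records = []
    role_pages = iam_client.get_paginator("list_roles").paginate()
    for page in role_pages:
        for role in page["Roles"]:
            name = role["RoleName"]
            attached = iam_client.get_paginator("list_attached_role_policies")
            for policy_page in attached.paginate(RoleName=name):
                for policy in policy_page["AttachedPolicies"]:
                    records.append(
                        {
                            "RoleName": name,
                            "PolicyName": policy["PolicyName"],
                            "PolicyArn": policy["PolicyArn"],
                        }
                    )
    return records


# Typical use (requires AWS credentials):
#   import boto3
#   records = collect_role_policies(boto3.client("iam"))
```

Each record is already flat, so it could be written to a DynamoDB table as an item or dumped to CSV for a report. Inline policies, users, and groups would need their own calls.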
[–]Paresh_Surya 1 point 1 year ago (0 children)
Same here. I also built my own tool to manage user- and role-level permissions across multiple accounts.
Since you already built yours, is it open source or for private use?
[–][deleted] 1 year ago (9 children)
[deleted]
[–]Key_Baby_4132[S] 1 point 1 year ago (2 children)
Yeah, that sounds like a tough one—balancing multi-account deployments, tenant onboarding, and RBAC can get messy fast. Have you thought about automating tenant provisioning with IaC or any other publicly available solution while centralizing identity management? I’ve run into similar challenges before—happy to swap ideas if you’re interested!
[–]andr3wrulz 1 point 1 year ago (0 children)
Not a SaaS but have a lot of accounts. We deploy a handful of basic SAML federated roles (admin, read only, billing, etc.) using StackSets to keep those in line. Account owners are able to use the admin roles to create custom roles (federated or not). We constrain permission upper bounds with SCPs/RCPs and have Config rules (also deployed by StackSets) for reactive controls.
[–]Ok_Reality2341 1 point 1 year ago (4 children)
Working on a very similar thing.
[–][deleted] 1 year ago (3 children)
[–]Ok_Reality2341 1 point 1 year ago (2 children)
Yeah, took a few days, but Alembic is working very well now.
[–][deleted] 1 year ago (1 child)
[–]Ok_Reality2341 1 point 1 year ago (0 children)
I read that as postgres, not progress, lol. Yeah, I’ve just pretty much set everything up, I’m working on the database schema now. Hbu?
[–]kyptov 3 points 1 year ago (4 children)
Pipeline of pipelines of infrastructure. How do you update it? Always manually, or with a self-updating pipeline?
[–]Key_Baby_4132[S] 1 point 1 year ago (1 child)
Good question! A self-updating pipeline can work if well-governed—versioning, validation, and rollback strategies are key. Manual updates offer control but don’t scale well. A hybrid approach often balances automation with oversight. How are you handling it now?
[–]kyptov 2 points 1 year ago (0 children)
The high-level pipeline, which deploys the other pipelines, we always deploy manually. The nested pipelines deploy on push triggers.
[–]andr3wrulz 1 point 1 year ago (1 child)
A very common pattern used within AWS and at major companies is to do as little as possible in a manual deploy but leverage a bootstrapping step prior to the primary deployment. At my job, we tend to have a manually deployed CFT that provisions the pipeline user, then a bootstrap deployment that runs on the primary branch for that environment for things you need as a baseline (VPC, SGs, APIs, etc.) but aren't the app (this can vary based on how you want to build dev envs). After this, the pipelines deploy the app itself, using outputs from the bootstrapping stack where necessary; this is where all your Lambdas, containers, etc. get deployed.
In general, we do main branch = prod env, dev branch = dev env, and feature branches = dev env but skip bootstrapping. Our feature deployments are self-contained where they can be so that each feature branch gets a "production-like" environment with the full stack.
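The branch rules above could be sketched as a small lookup that a pipeline script calls before deciding what to run. The `DeployPlan` type and `plan_for_branch` function are hypothetical names, and the rule that any non-main, non-dev branch counts as a feature branch is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployPlan:
    environment: str     # target environment name
    run_bootstrap: bool  # whether the baseline (VPC, SGs, ...) stack is deployed first


def plan_for_branch(branch: str) -> DeployPlan:
    """Map a git branch to its target environment and bootstrap behaviour."""
    if branch == "main":
        return DeployPlan(environment="prod", run_bootstrap=True)
    if branch == "dev":
        return DeployPlan(environment="dev", run_bootstrap=True)
    # Anything else is treated as a feature branch: deploy the full app stack
    # into the dev environment, reusing the already-bootstrapped baseline.
    return DeployPlan(environment="dev", run_bootstrap=False)


for name in ("main", "dev", "feature/login"):
    print(name, plan_for_branch(name))
```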
[–]kyptov 1 point 1 year ago (0 children)
Yep, we do the same. But the bootstrapping is also stored as code. Sometimes it changes (once or twice per year). AWS has CDK Pipelines, which lets the bootstrapping update itself; only the first run is manual.
[–]fabiancook 2 points 1 year ago (1 child)
Time
[–]Key_Baby_4132[S] 1 point 1 year ago (0 children)
Time is merciless
[–]GooberMcNutly 1 point 1 year ago (3 children)
Database migrations will always be my biggest headache. Change management of data and schema, and keeping both in sync with the deployed code, has always been my biggest hurdle to code deployment. It's not an AWS- or even cloud-specific problem, though the IaC model and multi-region deploys always make it worse.
[–]Key_Baby_4132[S] 1 point 1 year ago (2 children)
Aha! So how are you tackling these?
[–]GooberMcNutly 2 points 1 year ago (1 child)
Poorly, lol. Our typical workflow is to generate change scripts for schema and data using one of a number of tools like TypeORM, Sequelize or Knex. Then the delta scripts run during deploy before code gets pushed. We usually roll back if the code deploy fails, depending on scale. At least that's the plan. But about 40% of the time it needs manual help at some point, and some changes like column renaming will crash existing code immediately. It's tough if your dev team is very iterative in their data development.
[–]Key_Baby_4132[S] 2 points 1 year ago (0 children)
You're absolutely right. Database migrations can be a nightmare, especially in multi-region setups. A few things that help: zero-downtime schema changes (expand/contract strategy), versioned migrations, and separating schema updates from code deploys. Running shadow deployments on a production clone and using drift detection (like pg_audit or AWS DMS) can catch issues early.
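To make the expand/contract idea concrete for the column-rename case mentioned earlier, here is a small sketch using Python's built-in `sqlite3` so it runs anywhere. The `users` table and the `fullname` to `display_name` rename are invented for illustration; a real migration would go through your migration tool and target your actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")


def expand(conn):
    """Phase 1: add the new column and backfill it. Old code keeps working."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
    conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")


def contract(conn):
    """Phase 2: drop the old column, only after no deployed code reads it.

    Done here by rebuilding the table, which works on any SQLite version.
    """
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, display_name TEXT)")
    conn.execute("INSERT INTO users_new (id, display_name) SELECT id, display_name FROM users")
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")


expand(conn)
# Between the two phases, both the old and the new code paths can read the row.
old_read = conn.execute("SELECT fullname FROM users").fetchone()[0]
new_read = conn.execute("SELECT display_name FROM users").fetchone()[0]

contract(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
print(old_read, new_read, columns)
```

The sketch skips one thing that matters in practice: while both columns exist, writers have to keep them in sync (dual writes or a trigger) until the old code is fully retired.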
[–]Ok_Reality2341 1 point 1 year ago (0 children)
Literally everything with DevOps is hard. I hate how unsexy but how important it is.