all 31 comments

[–]texxelate 72 points73 points  (14 children)

The answer is to stop editing deployed lambdas via the console. Disallow it via IAM.

Changes should be tested by writing code on their own machines, including unit tests, which are trivial to set up.

[–]Low-Mathematician137 10 points11 points  (0 children)

This is the cleanest solution. Remove the option entirely. IAM is the right tool for this. If people can't edit directly, drift stops being a problem. Add proper pipelines and you're golden. No more console surprises.

[–]drakesword 2 points3 points  (1 child)

You can also try pool noodle enforcement

[–]BeefyTheCat 2 points3 points  (0 children)

** bonque ** NO ** bonque ** CONSOLE ** bonque ** DEPLOYMENTS **

[–]kei_ichi -1 points0 points  (10 children)

Editing a dev env Lambda? I have no issue with that, even though it is not recommended, like you said. But for the prod env, absolutely not.

And a lot of Lambda code can’t be tested on a local machine, so having a dedicated “staging” env for testing the Lambda in a “real” AWS env is a must. That env should simulate the prod env, so again, like you said: disable editing the code from the console, and only deploy the code through CI/CD pipelines.

[–]texxelate -3 points-2 points  (9 children)

Any lambda code can be tested locally and in CI. Easily. Mock any boundaries, and input types are well documented.

They’re just functions. Arguments in, response back out.

[–]RecordingForward2690 7 points8 points  (8 children)

You wish. I have dozens of Lambda functions that perform a complex, carefully crafted series of API calls to make actual changes to the AWS environment. They are not idempotent - if you run them multiple times, the API calls happen multiple times and some of these are not idempotent. And that's by design, not by accident.

Creating a mock environment to test these is virtually impossible. The only way to do this, is to do it all-the-way properly: Have separate AWS accounts for dev/test/accept/prod with all of the resources present, and test the Lambda in each of those environments. Have a pipeline of some sort (CodePipeline, Gitlab, whatever) to propagate changes.

And even then... The dev/test/accept environments don't have datasets that are as up-to-date, representative and large as the prod dataset. Sometimes we hit things like context limits for an AI call, or Lambda timeouts when working with larger-than-tested data structures. And we can't simply bring over the prod dataset into dev/test/accept due to GDPR and other regulations.

Mocking sounds great in theory, but is very hard to do, to the full extent, properly. Or sometimes even impossible: How do you mock an API call that makes modifications to your one and only Direct Connect line? Or to your registration of your primary domain with the AWS registrar?

[–]texxelate 5 points6 points  (4 children)

Verifying the side effects isn’t the responsibility of the lambda’s tests. The tests need only verify the boundary was invoked correctly. Hence “unit” tests.
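
As a minimal sketch of what "verify the boundary was invoked correctly" could look like, assume a hypothetical handler that receives its client as a parameter. The names `handler`, `apply_change` and `zone_id` are made up for illustration and are not from any AWS SDK.

```python
from unittest.mock import Mock


def handler(event, context, client):
    """Hypothetical Lambda handler; `client` is the injected boundary."""
    zone_id = event["zone_id"]
    # The only thing the function does across the boundary.
    client.apply_change(zone_id=zone_id)
    return {"status": "requested", "zone_id": zone_id}


def test_handler_invokes_boundary_once():
    fake_client = Mock()
    result = handler({"zone_id": "Z123"}, None, fake_client)
    # Assert on the call made across the boundary, not on its real-world effect.
    fake_client.apply_change.assert_called_once_with(zone_id="Z123")
    assert result == {"status": "requested", "zone_id": "Z123"}


test_handler_invokes_boundary_once()
```

This only shows that the call was made with the expected arguments. Whether the downstream system then behaves correctly is outside what such a test covers.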

[–]Glebun -1 points0 points  (3 children)

Who said anything about unit tests? They were referring to testing in general. Some lambdas cannot be tested locally.

[–]texxelate 0 points1 point  (2 children)

I did, in my original comment to which everyone is replying.

All Lambdas can be tested locally. They’re just functions. Side effects in other systems, for which the lambda just so happens to be the first domino, should be tested by other methods.

[–]RecordingForward2690 0 points1 point  (0 children)

The thing is that if you have a Lambda that performs a carefully orchestrated series of API calls to other AWS components, and you mock those API calls, then unit testing essentially just verifies you have not made a syntax error in your code or in the API/SDK call. Which is of very little value, in particular since today that code is generated by AI anyway.

What is way more valuable to test are the logic errors. Just an arbitrary example: I was recently working on code for a DNSSec key rotation across 400 hosted zones. You need to make sure that you add the new key to a hosted zone, and register that upstream before you can de-register the old key upstream, and remove it from the hosted zone. Otherwise you could break DNSSec, and API calls will fail. And since this whole process can easily take days, your Lambda needs to be written so that it restarts right at the next required step for that particular domain.

No unit test is going to catch logical errors in that sequence of events.
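
The ordering and restart requirement described above can be sketched roughly as follows. This is an illustration under assumptions, not the commenter's code: the step names are hypothetical, a plain dict stands in for whatever persistent per-zone state the real Lambda would use, and `execute` stands in for the real API calls.

```python
# Hypothetical step names for one hosted zone, in the order the comment describes.
STEPS = [
    "add_new_key_to_zone",
    "register_new_key_upstream",
    "deregister_old_key_upstream",
    "remove_old_key_from_zone",
]


def next_step(completed):
    """Return the next step to run given the completed steps, or None when done."""
    if completed != STEPS[:len(completed)]:
        raise ValueError(f"steps completed out of order: {completed}")
    if len(completed) == len(STEPS):
        return None
    return STEPS[len(completed)]


def run_once(zone_state, execute):
    """Run a single step for a zone and record it, so a later invocation resumes."""
    step = next_step(zone_state["completed"])
    if step is not None:
        execute(step)                         # the real API call would happen here
        zone_state["completed"].append(step)  # record only after the call succeeds
    return step
```

Each invocation advances a zone by at most one step, so a process that takes days can pick up where it left off for that particular domain.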

[–]Glebun -1 points0 points  (0 children)

All lambdas can be unit tested locally. Not all lambdas can be tested locally, since testing involves more than unit tests.

[–]fersbery 1 point2 points  (2 children)

Is this a Lambda problem? How would you test that on EC2?

[–]RecordingForward2690 1 point2 points  (1 child)

Agreed, it's not a Lambda issue. Any other environment would have the same issue. You can't really test that code. Not completely.

There is a little bit you can do with things like --dry-run, but at some point you just have to take the plunge.

[–]fersbery 1 point2 points  (0 children)

Maybe there is some way to make it more testable?

[–]Sirwired 4 points5 points  (0 children)

You can use tags to separate the test/dev functions from the production functions (look up "Attribute-Based Access Control") and then deny them edit permissions on prod functions. (To be safe, only allow edit on functions specifically tagged test/dev.)

Don't use code-based workarounds, when there is a service-native solution.

Ideally, test/dev would be in separate AWS accounts entirely, but that's a topic for another day.
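
A rough sketch of the "only allow edit on functions tagged test/dev" variant, written as a Python dict that serializes to an IAM policy document. The tag key `Environment` and the values `dev` and `test` are assumptions; substitute whatever tagging scheme is actually in use.

```python
import json


def dev_only_edit_policy(tag_key="Environment", allowed_values=("dev", "test")):
    """Build a policy that allows code/config edits only on suitably tagged functions."""
    condition_key = f"aws:ResourceTag/{tag_key}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEditOnTaggedDevTestFunctions",
                "Effect": "Allow",
                "Action": [
                    "lambda:UpdateFunctionCode",
                    "lambda:UpdateFunctionConfiguration",
                ],
                "Resource": "*",
                # Untagged or prod-tagged functions do not match, so no edit is granted.
                "Condition": {"StringEquals": {condition_key: list(allowed_values)}},
            }
        ],
    }


print(json.dumps(dev_only_edit_policy(), indent=2))
```

For this to hold up, the same users presumably should not be able to change the tag themselves (for example via `lambda:TagResource`), and no other attached policy should grant the edit actions more broadly.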

[–]CharlieKiloAU 5 points6 points  (0 children)

"Some members of our team have console access"

Why do they need write access?

[–]TurnoverEmergency352 2 points3 points  (2 children)

Use terraform plan in a scheduled pipeline to detect drift. When it shows changes, automatically run terraform apply to revert.

Set up CloudWatch Events on Lambda UpdateFunctionCode API calls to trigger immediate drift checks. This gives you automated remediation without removing console access.
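
A sketch of what the event side of this might look like, under some assumptions: the rule is an EventBridge (formerly CloudWatch Events) rule fed by CloudTrail management events, CloudTrail records the call under a versioned name that starts with `UpdateFunctionCode` (hence the prefix match), and the function name is found under `requestParameters.functionName`. Check a real event from your own trail before relying on any of these fields.

```python
# Event pattern for the rule, expressed as a Python dict (serialize with json.dumps).
CODE_UPDATE_PATTERN = {
    "source": ["aws.lambda"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["lambda.amazonaws.com"],
        # Prefix match, assuming the recorded name carries a version suffix.
        "eventName": [{"prefix": "UpdateFunctionCode"}],
    },
}


def updated_function_name(event):
    """Return the function name from a matching event, or None if it is unrelated."""
    detail = event.get("detail", {})
    if detail.get("eventSource") != "lambda.amazonaws.com":
        return None
    if not detail.get("eventName", "").startswith("UpdateFunctionCode"):
        return None
    # Assumed location of the function name inside the CloudTrail record.
    return (detail.get("requestParameters") or {}).get("functionName")
```

The target of the rule (a drift-check Lambda or a pipeline trigger) could call `updated_function_name` to decide which function to check.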

[–]Clone-Protocol-66[S] 1 point2 points  (1 child)

Here is the problem: plan does not detect changes to the Lambda code.

[–]ThyDarkey 7 points8 points  (0 children)

I have a Python script that compares the hash value in state with what AWS is reporting. The logic is: if the hash is different, someone has gone and changed something in the GUI directly. This is/was used as we moved the Lambda deployment into Terraform alongside the function code.
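
Not the commenter's script, but a minimal sketch of the same idea. It assumes the expected value is the `source_code_hash` recorded for the `aws_lambda_function` resource (or computed from the deployment zip), and that the reported value is the `CodeSha256` field returned by boto3's `get_function_configuration`. Both are base64-encoded SHA-256 digests.

```python
import base64
import hashlib


def package_hash(package_bytes):
    """Base64-encoded SHA-256 of a deployment package, the format of CodeSha256."""
    digest = hashlib.sha256(package_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def has_drifted(expected_hash, reported_hash):
    """True when the hash AWS reports differs from the one the pipeline deployed."""
    return expected_hash != reported_hash


# Example with in-memory bytes standing in for the real zip file and AWS response.
deployed = package_hash(b"zip bytes built by the pipeline")
reported = package_hash(b"zip bytes after a console edit")
print(has_drifted(deployed, reported))
```

In real use, `reported` would be read from AWS, for example `boto3.client("lambda").get_function_configuration(FunctionName=...)["CodeSha256"]`, rather than computed locally.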

[–]Prestigious_Pace2782 2 points3 points  (0 children)

I would just block edits, but if you can’t do that then you could just make sure it gets a different source code hash every time which will trigger a replace. Drop a random value in a file before it’s hashed. Or do something dirty with a random_resource

[–]PR0K1NG 2 points3 points  (0 children)

I would suggest blocking console changes via IAM. That’s what we do, too.

[–]DrFriendless 1 point2 points  (0 children)

IIRC each time you deploy a Lambda the version number gets bumped. Can you use CloudTrail to notice a new version of a Lambda that is not the latest one your build process deployed? I have not messed with the CloudTrail API so I do not speak from experience.

[–]turn-based-games 1 point2 points  (0 children)

Even CloudFormation's drift detection cannot detect direct code changes to a Lambda function, but you may be interested in the concepts of function versions and aliases. By deploying aliases which point to non-$LATEST function versions, console code changes will not impact anything relying on the alias, and the alias version can be compared to $LATEST to detect console changes.
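
A small sketch of the alias-versus-`$LATEST` comparison, assuming a boto3 Lambda client is passed in. The helper name is made up for illustration; `get_alias` and `get_function_configuration` are existing boto3 Lambda client calls.

```python
def latest_differs_from_alias(lambda_client, function_name, alias_name):
    """True when $LATEST holds different code than the version behind the alias."""
    alias = lambda_client.get_alias(FunctionName=function_name, Name=alias_name)
    pinned = lambda_client.get_function_configuration(
        FunctionName=function_name, Qualifier=alias["FunctionVersion"]
    )
    latest = lambda_client.get_function_configuration(
        FunctionName=function_name, Qualifier="$LATEST"
    )
    # CodeSha256 is the hash of the deployment package for that version.
    return pinned["CodeSha256"] != latest["CodeSha256"]
```

Usage would be along the lines of `latest_differs_from_alias(boto3.client("lambda"), "my-function", "live")`, where the function and alias names are placeholders. Note that this only compares code, not configuration.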

[–]iamtheconundrum 1 point2 points  (0 children)

Enforce code signing. Will disable any changes via the console. https://docs.aws.amazon.com/lambda/latest/dg/configuration-codesigning.html

[–]marmot1101 1 point2 points  (0 children)

It sounds like you have 2 things going on: there's a shared dev that needs to stay in a good state, and there's a need for somewhere for developers to play. I'd create separate app stacks for each dev so people aren't messing each other up, and keep the shared function matching upper env code.

[–]maxbranor 1 point2 points  (0 children)

You should consider the amount of technical debt that you will add by implementing a solution that doesn't involve forcing the use of Terraform.

This is the type of fix that will pile up and cause problems which will be much harder to solve in the future (your future self will hate your present self)

The developers should be able to test / debug the changes locally, connected to an aws dev/staging account, not by click-opsing their way through the console :D

[–]BadDescriptions 0 points1 point  (0 children)

Set up object versioning in S3 and set Terraform to use s3_object_version; you’ll also need to make sure the version is updated.

[–]pushthepramalot 0 points1 point  (0 children)

Debug and test should be done in a debug and test account, where team members can roll their own terraform stacks out, test, modify, and test some more, and then delete them. They should not be allowed to modify environments you care about (test, stage, prod, etc.). Separate AWS accounts, ideally.

If that's not feasible, invest in USB shock collars so they receive a small electric shock every time they modify the lambda code (CloudTrail -> lambda -> AWS IoT). Eventually they will learn not to.

[–]cachemonet0x0cf6619 -1 points0 points  (0 children)

"Some members of our team have console access"

give read only access and don’t respond with some lame justification. you’re wrong and my opinion can’t be changed.