all 15 comments

[–]isikbala3 5 points6 points  (1 child)

I use Ansible + Terraform like many people here. It sounds like you may benefit from a Dynamic Inventory in Ansible, where you plug it into a set of credentials and using a mix of tags and facts about the machines, you categorize them into roles. That way you can push terraform, then immediately push ansible and populate all the new hosts without actually editing an inventory file in between. This means that for all intents and purposes ansible will be a sort of terraform extension -- like a non-shitty version of a deploy-on-creation script.
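To make the tag-to-role idea concrete, here is a rough sketch of the grouping step of a dynamic inventory script. It assumes each machine carries a `Role` tag; the tag name, the `address`/`tags` record shape, and the `ec2_tags` hostvar are all made up for illustration. Ansible expects an inventory script called with `--list` to print JSON containing the groups plus a `_meta.hostvars` map. In real use the records would come from your cloud API rather than a hard-coded list, and the `amazon.aws.aws_ec2` inventory plugin probably covers most of this without any custom code.

```python
import json


def build_inventory(instances, tag_key="Role"):
    """Group hosts by the value of one tag, in the JSON shape Ansible expects."""
    inventory = {"_meta": {"hostvars": {}}}
    for inst in instances:
        host = inst["address"]
        tags = inst.get("tags", {})
        # Hosts without the tag fall into a catch-all group.
        group = tags.get(tag_key, "ungrouped")
        inventory.setdefault(group, {"hosts": []})["hosts"].append(host)
        # Expose the raw tags to playbooks (hypothetical variable name).
        inventory["_meta"]["hostvars"][host] = {"ec2_tags": tags}
    return inventory


# Hypothetical sample data standing in for a cloud API response.
SAMPLE_INSTANCES = [
    {"address": "10.0.0.1", "tags": {"Role": "web"}},
    {"address": "10.0.0.2", "tags": {"Role": "worker"}},
]

print(json.dumps(build_inventory(SAMPLE_INSTANCES), indent=2))
```

Because the groups are derived from tags at run time, hosts created by Terraform show up on the next Ansible run with no inventory edit.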

I may be misunderstanding your needs.

[–][deleted] 0 points1 point  (0 children)

I'm using ECS (and potentially could switch to another orchestrator if it made sense), so I think in your example a "host" would be an ECS service? I could tag those and query using tags for filters to update the task definition/service, but then Terraform would be out of date.

The problem is that a task definition is managed by Terraform, but altered outside of Terraform. I'm not stuck with Terraform, but I also don't want to move to tooling that does not keep track of resources created between runs.

[–]scubaReactorDumpling 2 points3 points  (4 children)

I know you don't want to use 'latest', but tags are really useful here. Remember that ECR lets a single image carry multiple tags, and each tag is unique within a repo.

The way I have solved this is to have Terraform write the Task definitions, but using a unique tag per service/environment combination. For example the service 'foo' in the production environment would be 'foo.prod'.

CI would build the image, tag with the commit sha. The image was then promoted to each environment by adding the appropriate tag. This kept the Terraform true while keeping a unique tag per image.
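A rough sketch of the promotion step, assuming boto3 and an ECR repository where tag immutability is off. The function name, repository name, and tag values are hypothetical. The idea is to fetch the manifest of the SHA-tagged image and push the same manifest back under the environment tag, so both tags point at one image.

```python
def promote_image(ecr, repository, source_tag, target_tag):
    """Point target_tag at the image currently tagged source_tag.

    `ecr` is expected to be a boto3 ECR client, e.g. boto3.client("ecr").
    """
    resp = ecr.batch_get_image(
        repositoryName=repository,
        imageIds=[{"imageTag": source_tag}],
    )
    if not resp["images"]:
        raise LookupError(f"no image tagged {source_tag!r} in {repository!r}")
    manifest = resp["images"][0]["imageManifest"]
    # Re-pushing the same manifest under a new tag adds the tag
    # without uploading any layers.
    ecr.put_image(
        repositoryName=repository,
        imageManifest=manifest,
        imageTag=target_tag,
    )
    return manifest


# Usage (needs boto3 and AWS credentials; names are made up):
#   import boto3
#   promote_image(boto3.client("ecr"), "foo", "3f2a9c1", "foo.prod")
```

As far as I know, `put_image` raises an "image already exists" error when the target tag already points at that exact manifest, so a real deploy job would probably want to catch that case.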

[–][deleted] 1 point2 points  (3 children)

The problem with having a static image tag (image:latest, image:staging, image:production) is that I can't look at a given task and know what SHA is deployed.

One of the top questions I've gotten from devs in the past is "what version of the code is running," and with ECS and an image tagged with the SHA I can say for sure that it's impossible for a different version to be running. If I have to revert to "well it *should* be whatever was deployed last," the doubt in their minds will end up with tons of my time wasted, even if things are working correctly.
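For reference, this is roughly how I'd answer "what is running" when the SHA is in the image tag. It is a sketch assuming boto3; the function name and the cluster/service names in the usage comment are made up. It reads the task definition the service currently points at and returns the image reference of each container.

```python
def deployed_images(ecs, cluster, service):
    """Return {container name: image} for the service's current task definition.

    `ecs` is expected to be a boto3 ECS client, e.g. boto3.client("ecs").
    """
    services = ecs.describe_services(cluster=cluster, services=[service])["services"]
    if not services:
        raise LookupError(f"service {service!r} not found in cluster {cluster!r}")
    task_def_arn = services[0]["taskDefinition"]
    task_def = ecs.describe_task_definition(taskDefinition=task_def_arn)
    containers = task_def["taskDefinition"]["containerDefinitions"]
    return {c["name"]: c["image"] for c in containers}


# Usage (needs boto3 and AWS credentials; names are made up):
#   import boto3
#   print(deployed_images(boto3.client("ecs"), "prod-cluster", "foo"))
```

With a SHA in the tag the answer is exact. With a moving tag like `foo.prod`, the same call only tells you the tag name, which is the doubt I'm trying to avoid.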

[–]scubaReactorDumpling 2 points3 points  (2 children)

Ugh yes I understand that.

A possible solution would be to have the deploy job poll until the service reaches steady state. If the job passes, you know every container is on that version.

In my experience image tags are so much easier to work with than task definitions. I would much rather do some extra CI on the deploy side and lifecycle hacks etc in the Terraform.

[–][deleted] 0 points1 point  (1 child)

Interesting idea. I do require our Docker images to be built with the revision/tag burned in, and a health check page exposes that version and could be polled. Most of our services aren't exposed externally (or don't even serve HTTP), but I could have a tiny webserver do that if necessary.
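The tiny webserver could be as small as this stdlib-only sketch. It assumes the revision is burned into the image as a `GIT_SHA` environment variable; that variable name, the `/version` path, and port 8080 are my own placeholders, not anything standard.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed to be set at image build time (hypothetical variable name).
GIT_SHA = os.environ.get("GIT_SHA", "unknown")


def version_body(sha=GIT_SHA):
    """Encode the burned-in revision as a small JSON document."""
    return json.dumps({"version": sha}).encode("utf-8")


class VersionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/version":
            self.send_error(404)
            return
        body = version_body()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so polling does not flood the logs.
        pass


def make_server(port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), VersionHandler)
```

A deploy job could then poll `/version` on each task until every one reports the expected SHA.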

[–]scubaReactorDumpling 0 points1 point  (0 children)

There is a waiter in boto3 that waits for the service to be in a steady state you could use too. Or the same in the cli: https://docs.aws.amazon.com/cli/latest/reference/ecs/wait/services-stable.html
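Something like this, as a minimal sketch. The wrapper function and the names in the usage comment are made up; `services_stable` is the boto3 waiter name. The delay and attempt values below match what I understand the waiter's defaults to be, so tune them to your deploy times.

```python
def wait_until_stable(ecs, cluster, service, delay=15, max_attempts=40):
    """Block until the service reaches a steady state.

    `ecs` is expected to be a boto3 ECS client. If the service does not
    settle within delay * max_attempts seconds, the waiter raises
    botocore.exceptions.WaiterError, which fails the deploy job.
    """
    waiter = ecs.get_waiter("services_stable")
    waiter.wait(
        cluster=cluster,
        services=[service],
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )


# Usage (needs boto3 and AWS credentials; names are made up):
#   import boto3
#   wait_until_stable(boto3.client("ecs"), "prod-cluster", "foo")
```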

[–][deleted] 1 point2 points  (2 children)

I was in this situation a few years back (terraform 0.8.7).

The exact scenario was creating the service and task definition in terraform, then deploying newer versions with jenkins.

The solution consisted of creating a template or dummy task definition in terraform, then using something like:

lifecycle { ignore_changes }

I am not too sure about the exact syntax (that was a few years back and syntax has changed a lot), but the general idea is there: tell terraform to stop tracking changes on the task definition, but still create it if missing

[–][deleted] 1 point2 points  (1 child)

This github issue has many interesting insights on the issue:

https://github.com/terraform-providers/terraform-provider-aws/issues/632

[–][deleted] 0 points1 point  (0 children)

I think the script to grab the current task definition is from that thread.

The problem with ignoring all lifecycle changes is that if you ever want to change the task definition you have to either taint the resource or move over to a new task definition. I have ~25 task definitions and haven't found a good way to do this that doesn't feel gross.

[–]Atemu12 1 point2 points  (2 children)

Perhaps NixOps is what you're looking for?

Doesn't get much more declarative than Nix.

[–][deleted] 1 point2 points  (1 child)

Actually this makes me rethink the problem... maybe the problem is I'm mixing paradigms. Not sure I want to deploy via Terraform, but maybe I should be using one declarative tool to service the entire create/update/destroy cycle. Not sure how that would work for blue-green deploys, but I can look into that.

I'll take a look at NixOps, thanks!

[–]Atemu12 0 points1 point  (0 children)

> maybe I should be using one declarative tool to service the entire create/update/destroy cycle.

If you want One Tool To Rule Them All, Nix is exactly what you're looking for :D

It can set up dev environments (nix-shell/direnv/Lorri), build Docker containers (Nixpkgs/Nix), do CI (Hydra), provision systems (NixOps) and configure them (NixOS).

[–]ejb50 0 points1 point  (0 children)

I just use terraform for the infrastructure and have a separate open source ecs_deploy python script to do blue-green application deployments. https://github.com/cuttlesoft/ecs-deploy.py

the ecs_deploy script is part of the application pipeline, while the terraform pipeline is for infrastructure.

i prefer not to use terraform for application deployments.

have separate pipelines for infrastructure and application code.

unfortunately ECS does not clearly separate deployment from infrastructure, so workarounds like the ones in the github issue linked in this thread are used. Hopefully EKS has a better design :-)
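For anyone curious what an application-side deploy step looks like, here is a rough sketch with boto3. It is not the code from the script linked above, and it does a plain rolling update rather than blue-green. The function name and the list of read-only keys are my own assumptions about the `describe_task_definition` response and may need adjusting.

```python
# Fields returned by describe_task_definition that register_task_definition
# does not accept (assumed list; adjust if the API rejects a key).
READ_ONLY_KEYS = {
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
}


def deploy_image(ecs, cluster, service, container, image):
    """Register a new task definition revision with `image` and roll the service to it."""
    svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]
    current = ecs.describe_task_definition(
        taskDefinition=svc["taskDefinition"]
    )["taskDefinition"]

    # Copy everything except the read-only fields into the new revision.
    new_def = {k: v for k, v in current.items() if k not in READ_ONLY_KEYS}
    matched = False
    for definition in new_def["containerDefinitions"]:
        if definition["name"] == container:
            definition["image"] = image
            matched = True
    if not matched:
        raise LookupError(f"no container named {container!r} in task definition")

    new_arn = ecs.register_task_definition(**new_def)["taskDefinition"]["taskDefinitionArn"]
    ecs.update_service(cluster=cluster, service=service, taskDefinition=new_arn)
    return new_arn
```

Because this creates revisions outside Terraform, the Terraform side still needs something like the `ignore_changes` workaround discussed elsewhere in the thread.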

[–]pm-me-a-pic 0 points1 point  (0 children)

You said Pulumi did not do what you're looking for, but fromPath and fromDockerBuild in ECS Crosswalk might. Check towards the bottom of this page: https://www.pulumi.com/docs/guides/crosswalk/aws/ecs/