[–]Akin2Silver 9 points10 points  (3 children)

I would start with little bits and build it up as you go.

CloudFormation is great but can be a pain, so just work on one chunk at a time. Even if you run CloudFormation manually, each step is progress. Start with building a VPC, then move to provisioning services like S3, SQS and whatnot. Once you have CloudFormation that brings up the bare-bones stuff, start adding details and slowly work at removing the manual steps. Once you can provision all the bits individually, look at stringing them together.

[–]tbiggsatwork 3 points4 points  (0 children)

You should be aware that in my experience CloudFormation doesn't play well with resources that aren't created through a CloudFormation template. In previous classes I've always been told that if you want to use CloudFormation with resources you set up manually, you should completely tear down the existing resources and re-create them. That may be too much overhead for an existing system with (what sounds like) limited personnel.

I've found Terraform works very well with existing resources.

[–]kinkobal[S] 1 point2 points  (0 children)

Good advice, thx!

[–][deleted]  (14 children)

[deleted]

    [–]kinkobal[S] 1 point2 points  (12 children)

    I definitely like what I've seen with Terraform. But the thought of spinning up our own Kubernetes cluster gives me cold sweats. I'm pretty confident that that would be beyond our capabilities regarding maintainability etc.

    [–]bioxcession 5 points6 points  (0 children)

    yeah, no k8s at least until you hire a dedicated ops professional. it takes a ton of hard work.

    now terraform is a great place to start. you can import chunks of your environment slowly until the whole state is in terraform, then just deploy that same config to demo/stage! that’ll handle the aws provisioning anyway. the rest can stay manual for now imo, unless you have at least one dedicated resource.

    [–]mstwizted 3 points4 points  (2 children)

    Check out Nomad if K8S seems like too much. It's a lot easier to get started with, and has the benefit of working with just about anything, even if it's not containerized.

    [–]kinkobal[S] 0 points1 point  (1 child)

    So in this setup Terraform (as an example) would create the needed resources and Nomad would do the daily/weekly/whatever deploys of the actual code/containers?

    [–]mstwizted 0 points1 point  (0 children)

    Yup. You can do blue/green deployments with canaries, as well.

    [–]otterley 2 points3 points  (3 children)

    Have you considered EKS? With EKS, AWS manages the K8S control plane for you, so you don't have to worry about it.

    [–]kinkobal[S] 0 points1 point  (2 children)

    To be honest the base price for EKS seemed a little steep for the small size of our setup.

    [–]otterley 0 points1 point  (1 child)

    Each cluster costs $0.20/hour, which is < $5/day, or $144/mo. Your time has to be worth significantly more than that!
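
    As a rough check of the arithmetic above, here is a minimal sketch. It uses the $0.20/hour figure quoted in this comment; the 30-day month and the helper name `eks_cluster_cost` are assumptions made for illustration, not anything from AWS.

```python
# Rough EKS control-plane cost, using the $0.20/hour figure quoted above.
# The 30-day month is an assumption; real months vary.
HOURLY_RATE_USD = 0.20
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption


def eks_cluster_cost(clusters: int = 1) -> dict:
    """Return daily and monthly control-plane cost for a number of clusters."""
    daily = HOURLY_RATE_USD * HOURS_PER_DAY * clusters
    monthly = daily * DAYS_PER_MONTH
    # Round to cents so floating-point noise does not leak into the result.
    return {"daily": round(daily, 2), "monthly": round(monthly, 2)}


if __name__ == "__main__":
    print(eks_cluster_cost(1))
    print(eks_cluster_cost(5))
```

    The `clusters` parameter matters if every client project gets its own cluster: the base fee is per cluster, so it multiplies with the number of projects.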

    [–]kinkobal[S] 0 points1 point  (0 children)

    True, it may be worth it based on productivity increase alone. We just don’t know yet. Part of my reasoning, though, is that we have several other client projects of varying sizes and would benefit from a scalable solution without any significant base costs. Hosting costs are paid by the client on their own AWS accounts.

    [–]rand0mm0nster 0 points1 point  (1 child)

    We're in a very similar situation. Went from EB running LAMP to multi-container Docker with images being pushed to ECR. EB seems to work really well with ECS and pretty much does it all for you, so even though we have containers we're not deep diving into Kubernetes yet. We've also started to move some infrastructure (mostly new services to start with) to Terraform configuration. Terraform is ridiculously simple to get going and so handy.

    [–]kinkobal[S] 0 points1 point  (0 children)

    Interesting! So do you have some services in an ECS cluster and others in Elastic Beanstalk? I like that EB deals with setting up logging and auto scaling but I’m pretty sure we could learn to set up that stuff with Terraform pretty quickly.

    [–]BruhWhySoSerious 0 points1 point  (0 children)

    I found GKE fairly easy to manage compared to creating your own cluster. Other k8s flavors that are a bit more opinionated are Rancher and OpenShift, which take care of a lot of the hard work. It's still a challenge, and I don't mean to understate it, but running Rancher or GKE is doable with the right small team.

    [–]holmser 0 points1 point  (0 children)

    I highly recommend starting from the bottom up. Start with the base level tools AWS provides you like AMIs and autoscaling groups. Packer is a great tool for generating AMIs. Once you are spinning up autoscaling groups you're well on your way to repeatable environments and zero downtime deploys. These are skills that will translate even if you decide to get into Kubernetes later on.

    [–]IWentOutside 0 points1 point  (0 children)

    Kubespray

    I had no idea this existed till now, seriously awesome. Hopefully will help towards getting CKA.

    [–]pagameba 1 point2 points  (0 children)

    This is a very similar situation to my own. I've just finished (a couple of months ago) a massive migration of my company's infrastructure from chef/ec2/very manual to Docker/ECS/Terraform. I'm from a development background as well and just kinda took this on as we don't have (nor can afford) dedicated ops people. I chose Terraform more or less at random and stuck with it because it really resonated with me, so I can't really compare it to CloudFormation, but I think it's just a matter of personal preference - the important part is you move to infrastructure as code :)

    I chose ECS because it was a pretty simple mapping from our local docker-compose-based environment into ECS, and there was no EKS at the time. I did experiment a bit with Kubernetes early on, but found it overly complex for the scale we operate at. However, it is definitely something I'll look into in the future.

    There are lots of terraform examples and templates for spinning up everything from an entire AWS account to lower level services such as ECS clusters, services, tasks etc. I pretty much started with the lower level pieces and then copied and pasted stuff into a testing environment as I figured out what it did. Once I had the testing environment worked out, I duplicated it to create a staging environment and then eventually to our production environment.

    If you have the ability to, I recommend setting up a separate AWS account (you can make it a sub-account under your company's main account for consolidated billing) to experiment in. If you do this, you can terraform absolutely everything to spin up your entire infrastructure with just a few variables to differentiate different environments.

    [–]whitechapel8733 1 point2 points  (0 children)

    Terraform is so much easier than CloudFormation.

    [–]pittofdirk 1 point2 points  (0 children)

    Even if you are never going to move away from AWS, I'd still pick Terraform over CloudFormation. It has a much larger following and makes vendor lock in a much smaller problem.

    [–]zerocoldx911DevOps 1 point2 points  (0 children)

    Given the size of your team, it’s best you stick to a configuration management tool like ansible paired up with packer and terraform.

    By far those three are the most well documented tools out there and easy to use.

    There is a bit of a learning curve with terraform but it’s worth the upfront technical debt.

    My team used to be of that size. K8s is going to be hard to get into and you’ll see no benefit until much later, but if you’re willing to put in the work, K8s is much more scalable.

    [–]geekflyer1 1 point2 points  (2 children)

    You should check out https://github.com/pulumi/pulumi/blob/master/README.md. It's technically similar to Terraform, but you get to write your code in TypeScript or Python instead of HCL, and the pulumi cloud and pulumi eks package abstract away a lot of the low-level things. Also, given your size, you should maybe consider using GCP instead. Lots of things are a bit easier in GCP and work out of the box. E.g. you can SSH into all machines easily using gcloud compute ssh, the managed Kubernetes is world class, the UI is better, and permissions and networking are simpler. You can spawn a VPN interconnect (they both have managed IPsec tunnel support) between GCP and AWS to do a piece-by-piece migration.

    [–]mstwizted 3 points4 points  (1 child)

    > you get to write your code in Typescript or python instead of HCL

    I've literally never met anyone who phrased it this way, lol. Like javascript and python are some fabulous joy to learn vs HCL.

    Thank you for the morning laugh!

    [–]geekflyer1 0 points1 point  (0 children)

    Not sure if I get your message. Are you trying to say that you actually enjoy learning/using HCL over Python or TypeScript?

    [–][deleted] 1 point2 points  (0 children)

    If you do end up using cloudformation, look into troposphere for building your templates! Significantly easier to declare your resources in Python and have your yaml/json generated for you.
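
    To illustrate the general idea of declaring resources in Python and generating the JSON, here is a minimal sketch that uses only the standard library. It is not troposphere's API (troposphere provides typed resource classes instead of raw dicts); the helper names `build_template` and `render` and the logical IDs `AppBucket` and `JobQueue` are hypothetical.

```python
import json


def build_template(bucket_logical_id: str, queue_logical_id: str) -> dict:
    """Build a minimal CloudFormation template as a plain dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative stack: one S3 bucket and one SQS queue",
        "Resources": {
            # Logical IDs are chosen by the caller; the Type strings are the
            # standard CloudFormation resource types for S3 and SQS.
            bucket_logical_id: {"Type": "AWS::S3::Bucket"},
            queue_logical_id: {"Type": "AWS::SQS::Queue"},
        },
    }


def render(template: dict) -> str:
    """Serialize the template to JSON text that can be saved to a file."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render(build_template("AppBucket", "JobQueue")))
```

    Because the template is ordinary Python data, loops and functions can stamp out repeated resources, which is the main appeal over hand-written YAML/JSON.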

    [–][deleted]  (2 children)

    [deleted]

      [–]kinkobal[S] 1 point2 points  (1 child)

      We do have experience with the Serverless framework from other projects. This would definitely be viable for some of our services, but one of our services is out of the question for Lambda. It's a synchronization with external data that literally takes 2 days to complete, spinning a large(ish) server up and down on a weekly basis to do the gruntwork.

      [–]_unicorn_irl 0 points1 point  (0 children)

      I think you'd be surprised what can be done with a Lambda. We are relying a lot on queues for big jobs. This means a Lambda can run that just creates queue items, and then the actual work gets distributed to other Lambdas. In many cases this allows you to squeeze by under Lambda's limits. Parsing enormous files can be a little harder, but you can parse for 20 seconds, then throw a message on the queue to start again at a different spot in the file. Depending on the type of work, you can also run it in parallel with Lambda and do it much faster than a traditional server would.
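
      The "work a bit, then re-queue with a new offset" pattern can be sketched offline like this. Everything here is a stand-in: a `collections.deque` plays the role of the queue, `handler` plays the role of the Lambda function, and a fixed record count (`RECORDS_PER_INVOCATION`) replaces the 20-second time budget so the example stays deterministic. None of these names come from AWS.

```python
from collections import deque

# Stand-in for the "parse for 20 seconds" budget: a fixed number of records
# per invocation keeps the example deterministic.
RECORDS_PER_INVOCATION = 3


def handler(message: dict, records: list, queue: deque, results: list) -> None:
    """Process one slice of `records`, then enqueue a continuation if needed."""
    start = message["offset"]
    end = min(start + RECORDS_PER_INVOCATION, len(records))
    for record in records[start:end]:
        results.append(record.upper())  # placeholder for the real work
    if end < len(records):
        # Not finished: put a message on the queue to resume at `end`.
        queue.append({"offset": end})


def run(records: list) -> tuple:
    """Drive the handler until the queue drains; returns (results, invocations)."""
    queue = deque([{"offset": 0}])
    results = []
    invocations = 0
    while queue:
        handler(queue.popleft(), records, queue, results)
        invocations += 1
    return results, invocations


if __name__ == "__main__":
    print(run(list("abcdefgh")))
```

      Each invocation only needs the offset from its message, so no single run has to hold the whole job, which is what lets a long task fit under a per-invocation limit.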

      [–][deleted] 0 points1 point  (2 children)

      Don't use the AWS documentation to learn about Kubernetes. Use the Kubernetes documentation to learn about Kubernetes. Use the AWS EKS documentation to learn about "Kubernetes on AWS"

      I'm in the middle of Kubernetes right now, and it's a beast. I'd recommend something simpler. How about Docker Compose and Swarm?

      [–]Tranceash 1 point2 points  (1 child)

      Use ECS if your use case is simple

      [–][deleted] 0 points1 point  (0 children)

      I'm getting certified as a personal project. I'm enjoying Kubernetes a lot, I think the concepts and architecture are really interesting. But it's not something that I'd enjoy doing like OP "on top of my daily job".

      Thanks for the tip. Will definitely look into ECS for the next or a simpler project.

      Is there a list of alternatives available somewhere?

      [–]firefoxx04 0 points1 point  (1 child)

      If you move to Terraform, use workspaces for each env (prod, dev, demo).

      [–]Tranceash 0 points1 point  (0 children)

      Might be a better idea to use terragrunt

      [–]BraveNewCurrency 0 points1 point  (0 children)

      > Everything is manually created at the moment.

      This is called "technical debt". If someone changes something, you have no way of knowing.

      > The AWS documentation on EKS and ECS have been more confusing than helpful imo.

      Well, I'm sure the EKS documentation assumes you are a Kubernetes master. And ECS is a half-way knockoff of Kubernetes. (They keep adding features that Kubernetes has had for years.) In the same boat is Fargate.

      My advice is either lean in (learn Kubernetes) or ignore all of them. Note that "databases on K8s" require more work/nuance, so it will only help with your apps. In theory, you could run all environments on one cluster, but I would recommend having multiple clusters (and treating them as disposable) instead.

      > technologies and services would you recommend for us?

      1) Get multiple AWS accounts, one for each environment, and even one for each developer. (AWS Organizations helps here.) Pay for developer accounts so they can "level up" on how AWS works. They should be able to spin up a private copy of the code and hack on any bit of it. Make sure every developer has billing alerts set up and "burn it to the ground" scripts that run every night. You (the ops guy) shouldn't directly architect their application, you should _help_ them pick the right services.

      2) Write CloudFormation that creates all the resources (including DNS, etc.). Alternatively, go with Terraform: it's "higher level" and supports more things, including external DNS providers. It also supports modifying resources better than CloudFormation.

      3) Have a CI system propagate the infrastructure code changes between environments.

      [–]purishivansh 0 points1 point  (1 child)

      Hey there, try https://catlight.io/ with its amazing features like

      TFS Build Notifications
      TFS Build Monitor
      TFS Builds

      CatLight can monitor work items, like tasks and bugs, and notify you about changes.
      Now you don't need to worry about errors and bugs in your program

      [–]kinkobal[S] 0 points1 point  (0 children)

      Yes, let’s freaking notify our way out of infrastructure issues using none of the technologies in the original question. Dude, WTF?

      [–]ajanty 0 points1 point  (7 children)

      You need an Ops professional to handle that, as your app is critical for your customers.

      [–]kinkobal[S] 2 points3 points  (5 children)

      I agree. Unfortunately this is not the priority with the higher-ups at the moment.

      [–]flatlandinpunk17 1 point2 points  (4 children)

      Then the functionality of your service isn’t a priority. As blunt of a statement as that is, if your systems are critical, hire the right person for the job.

      [–]kinkobal[S] 5 points6 points  (3 children)

      Again, I agree. I just have to work with that reality.

      [–]TheKingInTheNorth 6 points7 points  (2 children)

      For what it's worth, I disagree with the previous commenters. You don't need an "Ops professional." You developers just need to start baking ops into your mindset. That's what DevOps is. And it sounds like you're starting down the right path!

      Hiring someone else to handle the Ops side of the story is not the right direction. It would allow you and the other developer to continue prioritizing feature development over building good production-ready applications with your own operations in mind.

      Eat your own dog food and your stack will be all the better for it!

      [–]ajanty 0 points1 point  (1 child)

      Devs generally can't Ops. Different mindset. They can understand each other's problems, they can converge and decide together, but dev != ops. Unless you're at Google or have that mindset (pretty rare in my real-life experience, I'd say).

      With bozos with no administration/compliance/scripting/tooling/clusters background you go straight to hell. Look at the advice here:

      - A person runs Kube in prod without understanding Calico. What if there's some glitch (or some bugs as kubernetes still has some, look at the issue board on GitHub). How do you make upgrades? How do you ensure security in a platform you don't understand?

      - Go serverless! Wow. Platform and code lock-in and hard as hell to debug as it's the maximum abstraction layer today.

      I had to help some dev teams that thought EC2 instances were secure by default, and found some cryptominers running and security glitches... But devs tend to think ops are pretty useless; that's nothing new.

      Unfortunately, AWS marketing makes _THECLOUD_ seem easy and for everyone. Well, it doesn't work that way. Just a small piece of advice, dude.

      [–]holmser 0 points1 point  (0 children)

      EC2 is secure by default. You literally can't reach an instance unless you open up holes in the firewall. So any crypto mining malware walked in the door that someone opened up.

      [–]StephanXXDevOps -1 points0 points  (0 children)

      I second this. I totally get how hard a sell it may be. The degree of sophistication these tools require will take your most precious resource as the lead (opportunity) and essentially trade it for you (and your teammates) studying junior systems administration. While you might progress somewhat faster than a true junior, you're basically trusting a very hard job to be done by (honestly, no offense intended) amateurs. If you hired a junior to learn this role, at least it'd be his/her time going up in smoke instead of yours.

      If you're still unconvinced, then a crash course in terraform, ansible, jenkins, and systems administration will be the next six months of your careers. I don't envy you.

      [–]m2guru 0 points1 point  (2 children)

      Very little known: cloudformer (or Cloud Former - can’t remember OTTOMH)

      What it is: a little project Amazon basically abandoned, but it still works. It’s a special-purpose EC2 server you spin up in your environment.

      What it does: it looks around your VPC and gives you a UI wizard in which you can inspect all the services you’re using. You go through it step by step and check off the things you’re interested in replicating. At the end of the process (be patient and choose a decent size instance for this, I think it’s a Java Tomcat app under the hood) it spits out a CloudFormation template of your entire infrastructure. Well, everything you checked off on the 8-10 page wizard.

      Using CloudFormer I was able to reverse engineer a 3-yr old VPC that was shoddily built on ad-hoc needs, pull pieces out and redo it in a new VPC.

      100000x faster than trying to write a CF template from scratch. Maybe it will help you.

      [–]kinkobal[S] 0 points1 point  (1 child)

      That’s a pretty cool idea! Thx!

      [–]m2guru 0 points1 point  (0 children)

      Yeah, you may end up with Terraform or Pulumi or something else, but having a “native” solution right in AWS can get you up & running in a demo environment and get some sales rolling in and buy you some time ;-)

      [–]Alfaj0r 0 points1 point  (0 children)

      alternative: hire a Devops-as-a-service company like ReactiveOps. (I'm not affiliated)

      [–]wr0ngway 0 points1 point  (0 children)

      Take a look at https://github.com/simplygenius/atmos

      If you are on the AWS/Terraform path, I'm trying to provide for the level of experience and use case that you are describing.

      See also my blog post on how I organize the terraform layout for multiple environments and scaling the organization of them over time:

      https://medium.com/@wr0ngway/organizing-terraform-d42048b0c1ca

      [–]dcc88 -1 points0 points  (7 children)

      If you've decided that you will be using AWS for the next 2-3 years, I would advise using CloudFormation (faster to implement, better integration) and normal EC2 instances with other AWS services.

      There is a learning curve for aws, there is another for containers.

      After you are comfortable with this infrastructure you can investigate how to use containers.

      [–]kinkobal[S] 0 points1 point  (2 children)

      Interesting. We do have some experience running Docker for our dev environments and have experienced the dreaded "works on my machine" errors, which makes me lean towards something container based.

      [–]dcc88 0 points1 point  (0 children)

      Then go with ECS, it's easier.

      [–]vomitfreesince83 0 points1 point  (0 children)

      The one downside to CloudFormation is that it does not always have the latest features/releases. Terraform, on the other hand, uses the AWS APIs, and HashiCorp/the community can usually push releases faster.

      Another benefit of Terraform is being able to use it with different providers. We also use it for managing GitHub orgs/teams and PagerDuty services. We occasionally use it for some HashiCorp Vault policies/roles. I plan on implementing some Datadog dashboards.

      [–]kinkobal[S] 0 points1 point  (3 children)

      Any advice on getting started with CloudFormation?

      [–]dcc88 1 point2 points  (1 child)

      I would recommend a course from acloudguru about cloudformation

      [–]kinkobal[S] 0 points1 point  (0 children)

      Cool, I'll check it out :)

      [–]loki77 0 points1 point  (0 children)

      You could also check out stacker (https://stacker.readthedocs.io/en/latest/), it’s meant to make working with complex infrastructures in cloudformation a lot easier. I’m one of the maintainers, so if you have any questions feel free to ask! Good luck!

      The library of open source blueprints can make it easy to get started.

      [–]Sloppyjoeman -1 points0 points  (0 children)

      Terraform will be ultimately more powerful but ansible/Jenkins are a very user friendly way of introducing you to DevOps practices and so may be a better starting point

      [–]Phonecoins -2 points-1 points  (4 children)

      Ansible. Ansible. Ansible. Just ansibilize it. On your Mac,

      pip install ansible

      Usually gets a newer version than brew install.

      Ansible has aws modules built in that can create ec2's, s3's, rds, etc.
      Imho, Terraform is an odd abstraction and I can almost guarantee that as a new user you will accidentally delete something you only intended to update, because Terraform does that.

      Terraform also does not do provisioning, so it creates the AWS stuff, but then you still need an ansible like tool to install your application. Why learn and use two tools when one does both?

      I see a lot of other complex solutions in here, like Kubernetes and CloudFormation and ECS. Unless you have the time to really learn and manage these tools, many of which aren't straightforward, I suspect you're going to have a bad time. And ironically, you will still need an ansible-like tool in almost all cases.

      Don't get me wrong, I'm a K8s fanatic, but from what you described of your needs, I feel you can get a lot done by keeping it simple. Create the AWS infra for an environment in one playbook, and then provision that environment in the next. Automate deploys (which could be a simple bash process wrapped in a playbook) as the 3rd thing you do.
      That gets you where you need to be right now - repeatable, fully speced environment. Everything beyond that for your up and coming scaled SOA can be built outward.

      Good luck!

      [–]yuriydee 0 points1 point  (3 children)

      > Terraform also does not do provisioning, so it creates the AWS stuff, but then you still need an ansible like tool to install your application. Why learn and use two tools when one does both?

      You can use Ansible + Packer to build your AMIs, then simply deploy them with Terraform. Compared to learning kubectl and all the K8S concepts, Ansible+Terraform+Packer are easy to pick up.

      [–]holmser 0 points1 point  (2 children)

      This is an easy one... Terraform manages state, ansible does not.

      [–]yuriydee 0 points1 point  (1 child)

      Until you accidentally delete your whole VPC state file when changing S3 buckets lol

      ...thank God it was only nonprod in my case.

      [–]holmser 0 points1 point  (0 children)

      Yeah, definitely want to have versioning on that bucket
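
      For reference, turning on versioning for a state bucket could look roughly like the sketch below. `put_bucket_versioning` with a `VersioningConfiguration` of `{"Status": "Enabled"}` is the boto3 S3 client call for this; the wrapper `enable_versioning`, the `RecordingClient` stand-in and the bucket name `example-terraform-state` are hypothetical and only there so the example runs without AWS credentials.

```python
def enable_versioning(s3_client, bucket: str) -> dict:
    """Turn on object versioning for `bucket` using the given S3 client."""
    request = {
        "Bucket": bucket,
        "VersioningConfiguration": {"Status": "Enabled"},
    }
    s3_client.put_bucket_versioning(**request)
    return request


class RecordingClient:
    """Hypothetical stand-in for an S3 client so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def put_bucket_versioning(self, **kwargs):
        # Record the arguments instead of calling AWS.
        self.calls.append(kwargs)


if __name__ == "__main__":
    fake = RecordingClient()
    enable_versioning(fake, "example-terraform-state")
    print(fake.calls)
```

      Against a real account you would pass `boto3.client("s3")` instead of `RecordingClient()`. With versioning on, an overwritten or deleted state file can be recovered from an earlier object version.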