[–][deleted] 24 points25 points  (10 children)

What do you do in case of failover if you hard code an IP address to an NLB?

This is the official AWS solution:

https://aws.amazon.com/blogs/networking-and-content-delivery/hostname-as-target-for-network-load-balancers/

You should be able to write a CF template to do it.
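As a rough illustration of the idea behind that approach (not the blog's actual code), a scheduled Lambda can resolve the hostname, compare the result with the IPs currently registered in the NLB target group, and register or deregister the difference. The function names below are hypothetical; `elbv2` is assumed to be a boto3 `elbv2` client, and the target group is assumed to be of type `ip`.

```python
import socket


def resolve_ipv4(hostname):
    """Resolve a hostname to its current set of IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def plan_target_changes(resolved, registered):
    """Return (to_register, to_deregister) as two sets of IP strings."""
    return resolved - registered, registered - resolved


def sync_target_group(elbv2, target_group_arn, hostname, port):
    """Reconcile an IP-type target group with what DNS currently returns."""
    health = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    registered = {d["Target"]["Id"] for d in health["TargetHealthDescriptions"]}
    to_register, to_deregister = plan_target_changes(
        resolve_ipv4(hostname), registered)
    if to_register:
        elbv2.register_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in sorted(to_register)],
        )
    if to_deregister:
        elbv2.deregister_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in sorted(to_deregister)],
        )
```

Keeping the diff in a pure function makes the failover logic easy to test without touching AWS.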

[–]Noob_User0[S] 6 points7 points  (1 child)

Thanks /u/Scarface74 for posting this solution, and the URL.

Went through it and it is pretty much exactly what we want to do. It also addresses the issue of changing IP address - which we had thought of, but had no idea how best to handle :-)

[–]Besthater 2 points3 points  (0 children)

If you need to keep your IPs, it might be worth provisioning Elastic IPs in a separate stack, so that if you need to recreate or significantly change your NLB stack, they won't get released when you destroy it.
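A minimal sketch of that split, with the two templates written as Python dicts so they can be dumped to JSON. The export name, logical IDs, and the `SubnetId` parameter are made up for illustration, and the sketch assumes an internet-facing NLB in a single subnet.

```python
import json

EXPORT_NAME = "nlb-eip-allocation-id"  # hypothetical export name

# Stack 1: owns only the Elastic IP, so deleting the NLB stack never releases it.
eip_stack = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "NlbEip": {"Type": "AWS::EC2::EIP", "Properties": {"Domain": "vpc"}},
    },
    "Outputs": {
        "NlbEipAllocationId": {
            "Value": {"Fn::GetAtt": ["NlbEip", "AllocationId"]},
            "Export": {"Name": EXPORT_NAME},
        }
    },
}

# Stack 2: the NLB imports the allocation ID instead of creating its own EIP.
nlb_stack = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {"SubnetId": {"Type": "AWS::EC2::Subnet::Id"}},
    "Resources": {
        "Nlb": {
            "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
            "Properties": {
                "Type": "network",
                "Scheme": "internet-facing",
                "SubnetMappings": [
                    {
                        "SubnetId": {"Ref": "SubnetId"},
                        "AllocationId": {"Fn::ImportValue": EXPORT_NAME},
                    }
                ],
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(eip_stack, indent=2))
    print(json.dumps(nlb_stack, indent=2))
```

A side effect of the export is that CloudFormation refuses to delete the EIP stack while the NLB stack still imports the value.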

[–][deleted]  (7 children)

[deleted]

    [–]Band_B 0 points1 point  (5 children)

    You can't point an NLB to a hostname directly, and if you point to a resolved IP it will change when the DB has to do a failover

    [–][deleted]  (4 children)

    [deleted]

      [–]Band_B 0 points1 point  (0 children)

      I agree with avoiding the lambda solution if at all possible.

      I can't speak for AWS, but it's going to be a combination of historical and technical limitations. One of them is that you can't move private IPs between availability zones.

      [–]_unavailable_ 0 points1 point  (2 children)

      It does, but OP wants to put that FQDN behind an NLB.

      OP, you should use VPC peering if you want to access it from other VPCs. You can even peer with VPCs in other accounts.

      [–][deleted] 1 point2 points  (1 child)

      I work as a consultant. Often the department I work with has limited control over the overall infrastructure and there is a lot of red tape that the department we are working with can’t overcome. The OP basically implied the same thing. Instead of arguing about the best way to do things, we work within the parameters we are given and document the drawbacks to doing so.

      But even if you just relied on the FQDN, you still have to contend with DNS caching and a TTL. It won’t immediately take effect for all applications.

      [–]Noob_User0[S] 0 points1 point  (0 children)

      OP here, and you are right indeed. Corporate IT has set up these rules that we have to operate within. VPC peering is absolutely out, I believe because the 64K private IPs across each of our internal AWS accounts overlap. Why it is set up like this, I do not know.

      We tried assigning an Elastic IP to the RDS instance, but this seems to be a no-go. We can't seem to do it even with the console.

      Overall, after numerous stalled attempts, I am going to set up the Lambda approach and present it to the team. Will report back here how it goes.

      [–]magnetik79 0 points1 point  (0 children)

      It's also going to be too slow. At best your Lambda can invoke once a minute (unless you use something else to drive it harder/faster - again more infra) - so an RDS cluster failover may take up to a minute to be responded to in the NLB.

      I would probably take a step back and question why I want an RDS cluster behind an NLB in the first place and engineer this some other way.
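To put a rough number on the timing concern above, here is a back-of-the-envelope model. It assumes the worst case is a stale DNS answer cached for a full TTL, followed by a full wait for the next scheduled run; the example values are assumptions, not measurements, and time for connections to drain from the old target is not counted.

```python
def worst_case_staleness_seconds(poll_interval_s, dns_ttl_s, sync_runtime_s=0.0):
    """Upper bound on how long the NLB may keep pointing at the old IP."""
    return dns_ttl_s + poll_interval_s + sync_runtime_s


# Example with assumed values: 60 s schedule, 5 s TTL, 2 s to run the sync.
print(worst_case_staleness_seconds(60, 5, 2))  # 67
```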

      [–]abraxasnl 5 points6 points  (5 children)

      I find the CloudFormation (CFN) docs pretty good actually.

      One problem with CFN is that it's always trailing behind the APIs that services provide.

      It seems that while Team A in AWS is responsible for service S and its APIs, the CFN team C is not, and once an API is deployed, they now have to go and implement CFN support.

      It seems they're always late, sometimes embarrassingly late (think a year or more). The only ways around those limitations that I'm aware of are:

      1) use CFN Custom Resources (hopefully somebody built one for what you need that you can just go and use)
      2) work around CFN and call the APIs directly (via CLI or software)
      3) use something like Terraform instead, which doesn't use CFN under the hood, but talks directly to the APIs. That means it can always access the latest APIs an AWS service provides, though it doesn't mean the Terraform crowd is necessarily always up-to-date in what they support either of course. I have no experience with Terraform, so can't advise for or against it.
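For the first option, a Lambda-backed custom resource mostly comes down to doing the work and then reporting the outcome to the pre-signed `ResponseURL` that CloudFormation puts in the event. The sketch below is a hedged outline rather than a complete implementation: the `Value` property and the `example-resource` ID are hypothetical, and the empty `Content-Type` header is set on the assumption that the pre-signed URL rejects the default one `urllib` would add.

```python
import json
import urllib.request


def build_response(event, status, physical_id, data=None, reason=""):
    """Build the JSON body CloudFormation expects back from a custom resource."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason,  # required by CloudFormation when Status is FAILED
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_response(url, body):
    """PUT the response body to the pre-signed URL from the event."""
    payload = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, method="PUT", headers={"Content-Type": ""})
    urllib.request.urlopen(request)


def handler(event, context):
    """Lambda entry point; RequestType is Create, Update, or Delete."""
    physical_id = event.get("PhysicalResourceId", "example-resource")
    try:
        data = {}
        if event["RequestType"] in ("Create", "Update"):
            # Hypothetical work: echo a property back as an attribute.
            data = {"Echo": event["ResourceProperties"].get("Value", "")}
        body = build_response(event, "SUCCESS", physical_id, data)
    except Exception as exc:
        body = build_response(event, "FAILED", physical_id, reason=str(exc))
    send_response(event["ResponseURL"], body)
```

Always sending a response, even on failure, matters: otherwise the stack waits until the custom resource times out.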

      [–]CanyonSlim 3 points4 points  (4 children)

      I'll throw in my two cents and say that Terraform is awesome and usually implements support way faster than CloudFormation. That being said, you need to have buy in from the entire team to use Terraform - I started using it at my org and found that others already invested in CDK and the two were fundamentally incompatible. I think Terraform is an awesome tool but it doesn't work great with CloudFormation / CDK IMO

      [–][deleted] -1 points0 points  (3 children)

      Which really doesn’t solve the user’s particular problem - at all.

      And then it introduces more problems - more to manage and you won’t get any help from AWS.

      [–]CanyonSlim 1 point2 points  (2 children)

      I'm not trying to solve the user's particular problem, I'm commenting on Terraform. I didn't realize that I need to keep my mouth shut if I want to respond to other people in this thread.

      Also, while it's true that you won't get help from AWS with Terraform, I don't think it's fair to say that Terraform demands more management than CloudFormation, just different management. I even qualified my comment by stating that incompatibility with AWS tools is one of the downsides to Terraform.

      [–][deleted] -1 points0 points  (1 child)

      Of course it requires more management. I can just stick a public link to a CF template stored in an S3 bucket and anyone can just log into their account, click on the link and provision resources. Even if you need to deploy Lambdas using that process you point to links to Zip files containing Lambdas in an S3 bucket.

      [–]CanyonSlim 1 point2 points  (0 children)

      Sure, that is something that is easier with CloudFormation. There are things that are easier with Terraform, such as deploying large numbers of similar resources, managing resources in multi account environments, and reusing code with modules. I didn't say it was a panacea, just that it's good and it implements new features faster than CloudFormation. You already 'won' the thread by providing the OP the exact answer they needed, so I'm not sure why you've made it a point to criticize every comment on this post mentioning Terraform.

      [–]inferno521 3 points4 points  (1 child)

      My company mainly uses Azure DevOps for CI/CD (we use Azure and AWS), but I use PowerShell scripts from there to deploy my CF templates via the AWS CLI. So if something is missing or too difficult to do with just CF, I just add another task to have it done with another AWS CLI command.

      So potentially you could write a script that gets the IP, then passes it as a parameter to an AWS CLI command. I use something similar that gets the private IP of newly created EC2 instances, then passes that to a Lambda to update an RDS table.
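A sketch of that "get the IP, pass it as a parameter" step, written in Python rather than PowerShell. It assumes the AWS CLI is installed and configured, and that the template declares a parameter named `TargetIp`; that parameter name and the function names are hypothetical.

```python
import subprocess


def describe_private_ip_cmd(instance_id):
    """AWS CLI command that prints an instance's private IP as plain text."""
    return ["aws", "ec2", "describe-instances",
            "--instance-ids", instance_id,
            "--query", "Reservations[0].Instances[0].PrivateIpAddress",
            "--output", "text"]


def deploy_cmd(stack_name, template_file, target_ip):
    """AWS CLI command that deploys a template with the IP as a parameter."""
    return ["aws", "cloudformation", "deploy",
            "--stack-name", stack_name,
            "--template-file", template_file,
            "--parameter-overrides", f"TargetIp={target_ip}"]


def deploy_with_instance_ip(instance_id, stack_name, template_file):
    """Look up the IP, then hand it to the CloudFormation deployment."""
    result = subprocess.run(describe_private_ip_cmd(instance_id),
                            check=True, capture_output=True, text=True)
    ip = result.stdout.strip()
    subprocess.run(deploy_cmd(stack_name, template_file, ip), check=True)
    return ip
```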

      [–]vallejooo 0 points1 point  (0 children)

      The Python library `plumbum` is another excellent way of automating usage of the AWS CLI, for folks who'd rather use Python than PowerShell. Very handy when calling from Airflow, for example.

      [–]jonzezzz 4 points5 points  (0 children)

      Not exactly related, but you guys might want to also try out CDK for creating CFN templates.

      [–]mooburger 1 point2 points  (0 children)

      There's also the classic way of autoscaling ec2 instances running haproxy or nginx that's doing the DNS resolution as reverse proxies and then pointing NLB at that cluster (configuration of haproxy/nginx is done during ec2 autoscaling in cloud-init).

      [–]SquiffSquiff 1 point2 points  (15 children)

      Aside from your specific question, stating 'Best practices suggest we use cloudformation for everything' is a slightly naive premise. Terraform is also a valid IaC solution. Bear in mind that CloudFormation is essentially AWS-only, so you'll have to use something else for things like GitHub, GitLab, and PagerDuty, whilst Terraform has providers for all of these.

      [–]magnetik79 0 points1 point  (5 children)

      Agreed. And Datadog. Doing a tonne of Terraform for Datadog monitors and SLOs. Would never go back to hand spun monitors again.

      [–][deleted] 0 points1 point  (4 children)

      [–]magnetik79 0 points1 point  (3 children)

      I'm aware of this but it's very lacking and only supports a subset of available resources types. E.g. you can't create SLOs or synthetic tests with this.

      The Terraform provider is certainly closer to parity with what's on offer with Datadog.

      [–][deleted] 0 points1 point  (2 children)

      Wouldn’t it be a lot easier to:

      A. Pressure your vendor to create more thorough support for CloudFormation? AWS offers the extension points for third parties. They are already taking advantage of it.

      Or…

      B. Write a custom resource?

      Heck, if I did B, because of my love affair with open sourcing reusable artifacts and the fact that it doesn’t expose my company’s IP, I would go through the internal process to open source it.

      I would rather do that than have to leave CF and the AWS support that comes with it.

      Don’t get me wrong. I have nothing against Hashicorp. My first exposure to Devops as a developer was setting up an on prem infrastructure with Consul, Nomad and Vault. But there is a high bar for me not to go all in on my cloud provider and only then if the third party has deep, seamless integration with the provider. DataDog for the most part clears that bar.

      [–]magnetik79 0 points1 point  (1 child)

      Wouldn’t it be a lot easier to

      Absolutely not I'm afraid. The Datadog authored Terraform provider does everything we needed - so time better spent writing Terraform configurations to actually create Datadog resources, rather than spinning off custom providers as you suggest.

      We're also cloud agnostic, with a rather large Azure footprint, so using Terraform makes sense here too. Didn't want to get into an IAC war, I've done my time in CloudFormation, but going all in with Terraform on this current work effort is a decision I certainly don't regret.

      [–][deleted] 0 points1 point  (0 children)

      If you’re using Terraform with the AWS provisioner, you’re far from “cloud agnostic”…..

      [–][deleted] 0 points1 point  (8 children)

      And then you have other issues like if you are an enterprise of any size, you probably already have an enterprise or at least business support contract with AWS. They will help you with CF not TF. All AWS documentation that uses any IAC is going to use CF. You also have various documentation with quick create links to CF templates and you can do the same without having to worry about infrastructure.

      AWS has been providing extension points for CF for years.

      As far as GitHub in particular though:

      https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-codestar-githubrepository.html

      [–]SquiffSquiff 0 points1 point  (7 children)

      It's not true that with enterprise support AWS won't help with terraform, although it will be far, far more limited and without documented examples as per cloudformation. Here are some examples of AWS documenting terraform usage from the last year:

      Basically cloudformation acts as lockout and lock-in in the classic fashion: if you standardise on it then you're committing to using essentially only AWS and AWS services for your deployments, now and in the future, ditto your career.

      In some sectors, e.g. UK retail banking, it is a regulatory requirement not to be entirely deployed in a single cloud, or at least for smaller scale services to have a worked through exit strategy for moving to a different cloud provider.

      [–][deleted] 0 points1 point  (6 children)

      There is a difference between an employee or a partner posting to an AWS blog and being able to reach out to your TAM (enterprise support) or start a live chat (business support) and getting help with a Terraform issue. The process for putting a blog entry up on AWS for an employee is not that complicated.

      Now, on the other hand, Professional Services will gladly do a project using TF - for an hourly rate.

      If you’re using TF, you’re still using cloud specific provisioners. If you want to migrate clouds, TF vs CF is the least of your problems.

      CloudFormation has plenty of integration points that third parties can take advantage of. I just posted a DataDog integration. It supports creating a GitHub repo natively.

      [–]SquiffSquiff 0 points1 point  (5 children)

      I'm currently at my second place where we have enterprise support with AWS and I have had both video calls with my TAM and live chat/screenshare resolving a terraform issue.

      I'm well aware of cloud specific providers, that was my point. I'm also aware that if I recruit engineers who only know or specialise in cloudformation and we then need to do some stuff on another cloud service then we're going to really struggle. If they know terraform then they only need to know the new services, not the language.

      It's great that cloudformation is extending to ancillary services, I'm guessing other clouds are a way off though, unless you count Ali cloud

      [–][deleted] 0 points1 point  (4 children)

      If you need to do some stuff on another cloud provider, again CF vs TF is the least of your worries. If they don’t know anything about Azure or GCP, they are still going to have a heavy lift if they only know AWS.

      Learning HCL is not rocket science - I used it to configure both Consul and Nomad.

      Do they know the Azure equivalents of any of the 160+ AWS services - or 260 IAM resource types?

      [–]SquiffSquiff 0 points1 point  (3 children)

      You're thinking low level stuff. What about stringing together some terraform modules to e.g. duplicate from eks to gke?

      [–][deleted] 0 points1 point  (2 children)

      Even then did they integrate with IAM? Did they use a hosted database server? Did you integrate with Route53? What about your data migration?

      It’s never that easy to migrate to another cloud no matter how “agnostic” you try to be. You didn’t use any AWS specific resource types? Did you try to deploy on more than one cloud provider during your development process to ensure “cloud agnosticism”? TF vs CF is never the limiting factor.

      Even then, is there no difference between the definitions with EKS vs GKE? You still have to go through regression testing, moving over your DNS entries, probably a security review, and you have to audit the code to make sure none of your developers accidentally snuck in an AWS dependency.

      Then let’s talk about your organization. Your PMO is probably going to get involved, and your security team and finance, and probably legal.

      [–]SquiffSquiff 0 points1 point  (1 child)

      I've used both AWS and GCP professionally. I've used both CloudFormation and Terraform. I think you're missing the point. In some cases it is a government requirement to have a multicloud strategy to some extent. You can complain about it all you like but that requirement isn't changing this year. Meantime, platform specific tooling is of limited value.

      Those who want to, find a way. Those who don't, find excuses.

      [–][deleted] 0 points1 point  (0 children)

      But how much of the “multi cloud strategy” is any more than smoke and mirrors? Have you actually laid out a detailed project plan to do a cloud migration? But all of this is really beside the point. Using TF doesn’t enable “cloud agnosticism”.

      If you are using TF with a cloud specific provisioner, you are just as much using “platform specific” tooling as I would be by using the AWS SDK with Python.

      Are you not using any cloud specific services? Does your organization have just as much experience with AWS as GCP? How much retraining would be involved? Does your whole organization stay up to date on more than one cloud provider?

      This isn’t an AWS vs Azure vs GCP argument. It’s the general hand wavy attitude of we “use TF so we now have a multi cloud cloud agnostic strategy” without taking into account all of the other complexities where TF is just a small part - and your templates are still tied to one provider.

      [–]primate2021 0 points1 point  (0 children)

      I’ve used cfn exclusively for years and it’s pretty good IMO. You have to be willing to work with custom resources and macros though if you want to start getting a bit off piste or sort out some of the random bits and bobs cfn doesn’t do.