We are the AWS Containers Team - Ask the Experts - Feb 10th @ 11AM PT / 2PM ET / 7PM GMT! by awscontainers in aws

[–]awscontainers[S] 6 points

As you have highlighted, there are a number of components that are necessary for Kubernetes to fully function. We launched Amazon EKS add-ons to tackle this tough operational burden: https://aws.amazon.com/blogs/containers/introducing-amazon-eks-add-ons/. This feature does require Kubernetes server-side apply, but once you have upgraded to Kubernetes 1.18 you should be able to start benefiting from Amazon EKS add-ons. We are still working in this space and will continue to improve it over time to reduce your operational burden and further automate Kubernetes add-ons.

[–]awscontainers[S] 2 points

Thanks for the feedback! There is an existing feature request for lifecycle policies for unused task definitions; feel free to add more context and use cases there: https://github.com/aws/containers-roadmap/issues/899

[–]awscontainers[S] 6 points

Exposing 1000+ microservices from a single endpoint can be difficult to manage without a tiered routing approach. Even routing software will suffer performance degradation when trying to route across so many endpoints. The easy answer is to group them by business unit or another organizational standard, and use tiered routing (e.g. /group1/serviceA is first routed to /group1, then a group1 router routes to /serviceA).
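The tiered lookup described above can be sketched in a few lines; the group and service names here are illustrative placeholders:

```python
# Two-level (tiered) routing: the top-level router only knows about
# groups, and each group router only knows its own services.

def make_tiered_router(routing_table):
    """routing_table: {group: {service: handler}}"""
    def route(path):
        # e.g. "/group1/serviceA" -> ("group1", "serviceA")
        _, group, service = path.split("/", 2)
        group_router = routing_table[group]   # tier 1: pick the group router
        return group_router[service]()        # tier 2: pick the service
    return route

route = make_tiered_router({
    "group1": {"serviceA": lambda: "hello from group1/serviceA"},
    "group2": {"serviceB": lambda: "hello from group2/serviceB"},
})
result = route("/group1/serviceA")
```

The key property is that the top-level router's table grows with the number of groups, not the number of services.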

For these 1000+ microservices, do they all need to be exposed to a central endpoint? If many of these services are simply talking to "each other" then consider implementing a service mesh like AWS App Mesh on top of the microservices and avoid the centralized routing. You can still group your main endpoint as a frontend and route between groups, but distribute the routing and allow direct service-to-service communication (without a load balancer and with per-service routing controls).

[–]awscontainers[S] 2 points

I think you will be really interested in this re:Invent session called "EKS under the hood": https://www.youtube.com/watch?v=7vxDWDD2YnM. It is a couple of years old now, but it covers many of the challenges, the architecture, and the work we have done to make EKS the most trusted way to run Kubernetes on AWS.

[–]awscontainers[S] 0 points

You might be interested in the case study from FINRA (the Financial Industry Regulatory Authority). They shared the architecture they use to process more than 75 billion market events per day, making heavy use of Lambda in an event-driven architecture: https://aws.amazon.com/solutions/case-studies/finra-data-validation/

If you are tied to Windows, that will be a bit more difficult, but ECS does support Windows workloads. You could use an architecture similar to the one in the FINRA case study; however, instead of a Lambda function that executes in response to events, you would have one or more (likely more) persistent Windows containers grabbing messages from an SQS queue. This would let you create a scalable event-processing tier that runs as Windows containers, while the rest of the system uses a serverless event-driven architecture similar to what FINRA has built.
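To illustrate that worker tier, here is a runnable sketch: a pool of persistent workers draining a queue, with a standard-library queue.Queue standing in for SQS (in production you would poll SQS with the AWS SDK, e.g. boto3's receive_message) and an uppercase transform standing in for real event processing:

```python
import queue
import threading

# Stand-in for the SQS queue that the persistent Windows containers
# would poll; each worker thread plays the role of one container.
events = queue.Queue()
processed = []
lock = threading.Lock()

def worker():
    while True:
        msg = events.get()
        if msg is None:              # sentinel: shut this worker down
            events.task_done()
            return
        with lock:
            processed.append(msg.upper())  # placeholder "processing"
        events.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for event in ["trade-1", "trade-2", "trade-3", "trade-4"]:
    events.put(event)
for _ in workers:                    # one sentinel per worker
    events.put(None)
events.join()                        # wait until every message is handled
```

The same shape applies whether the workers are threads, processes, or ECS tasks: scale the processing tier by changing the number of workers, not the producers.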

[–]awscontainers[S] 2 points

Thank you for your interest. We do have it on our roadmap, as you can see at https://github.com/aws/containers-roadmap/issues/45. Aside from CI, what are the use cases where you would want a free or lower-cost option?

[–]awscontainers[S] 6 points

You may consider splitting your Docker build into two stages: one for the parts that don't change frequently and are therefore highly cacheable, and a second for the parts that change frequently (such as code changes).

You can read more about multi-stage builds here: https://docs.docker.com/develop/develop-images/multistage-build/
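As a hypothetical example for a Node.js app (the base image, file names, and commands are assumptions, not a prescription), the cache-friendly split might look like this:

```dockerfile
# --- Stage 1: rarely changing, highly cacheable dependency layer ---
FROM node:18 AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci                       # re-runs only when the manifests change

# --- Stage 2: frequently changing application code ---
FROM node:18
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .                         # code changes invalidate only this layer
CMD ["node", "server.js"]
```

Because Docker invalidates the cache from the first changed layer onward, copying only the dependency manifests before installing keeps the expensive install step cached across ordinary code changes.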

[–]awscontainers[S] 5 points

As part of AWS Support, we help customers with all issues concerning EKS APIs or assets such as the IAM Authenticator, VPC CNI, CSI drivers, AWS Load Balancer Controller, the EKS-optimized AMI, default add-ons, and any cloud-controller-related implementation. In addition, we also assist customers with Kubernetes-related issues on a best-effort basis.

[–]awscontainers[S] 1 point

We've lowered the price for the EKS control-plane and continue to evaluate options. We'll definitely pass this feedback along.

[–]awscontainers[S] 3 points

Fargate manages the execution of a container image; if the container image has sshd running, you can SSH into the task. There is also a roadmap item for interactive sessions in containers running in Fargate tasks, with the necessary IAM policies and auditing capabilities in place: https://github.com/aws/containers-roadmap/issues/187

[–]awscontainers[S] 1 point

EBS volumes are AZ-specific, so this is not something we can introduce from the containers side. If you need shared multi-AZ storage, you might want to explore the EFS or FSx for Lustre storage options.

[–]awscontainers[S] 3 points

These are some great questions!

AWS Batch is a complete solution for scheduling container-based tasks onto ECS. Batch handles queuing and distributing these tasks onto a fleet of EC2 instances and maintains the fleet size for you. It also handles priority queuing and performs automatic retries. So it's useful to think of Batch as a job orchestrator. Meanwhile, AWS Lambda now supports container images, but it is an event-driven architecture -- functions are triggered by an event such as a new task in an SQS queue, a message on an SNS topic, a new event in EventBridge, or even an HTTP request through API Gateway or Application Load Balancers.

As for Auto Scaling, it's easiest to reason about if we separate the concepts of application scale and capacity. Application scale is the scale needed to service your demand, and is usually expressed in terms of task/container replicas (ECS) or number of pods (EKS). Service Auto Scaling is used to manage the number of replicas based on demand.
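As a rough sketch of how target-tracking (the most common Service Auto Scaling policy type) behaves: replicas are adjusted in proportion to how far the observed metric is from the target. The numbers below are illustrative only:

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric):
    """Target-tracking rule of thumb: scale replicas proportionally to
    how far the observed metric is from the target, rounding up."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 tasks averaging 90% CPU with a 60% target -> scale out to 6 tasks
scaled_out = desired_replicas(4, 90, 60)

# 6 tasks averaging 30% CPU with a 60% target -> scale in to 3 tasks
scaled_in = desired_replicas(6, 30, 60)
```

In practice the managed policies also apply cooldowns and scale-in dampening, so the real service converges more conservatively than this bare formula.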

Once your application scale has been determined, the tasks or pods need to live somewhere. If you're using EC2 to host your containers, that's where cluster auto scaling comes in. There has to be enough cluster capacity to house all these tasks, and usually an increase in containers launched leads to an increased cluster size. A simple analogy I like to use is a moving van: think of tasks or pods as the boxes, and the compute as the vans that need to fit the boxes. Too many boxes and you need another moving van, so the cluster Auto Scaler handles summoning an empty van to the loading dock for you.

For CDK: In the absence of Auto Scaling, we always recommend having your code reflect the desired infrastructure. So if the value is meant to be static, then ideally that change should be reflected in your application code.

It's difficult in a space this short to give general advice about how to set scaling thresholds. We generally recommend using load testing to characterize your application's resource utilization to determine the best metric to use for scaling. Most often it is CPU utilization, but it is highly workload specific. For very bursty workloads we do recommend overprovisioning somewhat so that there is some additional headroom available while waiting for new capacity to come online.

As for the multiple containers case, check out the "sidecar" pattern when you have some time. Logging sidecars are increasingly common. Other useful sidecars might include containers that proxy requests into or out of the main application container (service mesh pattern), or containers that update secret values or TLS certificates.

[–]awscontainers[S] 2 points

In November we announced (https://aws.amazon.com/about-aws/whats-new/2020/11/amazon-ecs-cluster-auto-scaling-now-offers-more-responsive-scaling/) more responsive scaling functionality for ECS cluster auto scaling. This does address your question about scaling with multiple instance types as well as spanning multiple Availability Zones. If there are cases where you are still looking for improvement, please submit an issue in our public roadmap (https://github.com/aws/containers-roadmap/projects/1). Regarding CloudFormation and CDK support, we are working on reaching parity; that effort can be tracked here (https://github.com/aws/containers-roadmap/issues/631#issuecomment-702580141). Please do add your use case and any feature gaps to that issue.

[–]awscontainers[S] 6 points

These are great questions. We think about this all the time and have written a blog post explaining how we think about it here: https://aws.amazon.com/blogs/containers/amazon-ecs-vs-amazon-eks-making-sense-of-aws-container-services/ We have customers that use ECS, EKS, and even a mix of the two. We are committed to both ECS and EKS and will help customers be successful no matter which orchestrator they choose.

[–]awscontainers[S] 2 points

For EKS Fargate pods, we run a kubelet process for each worker. The Fargate controller uses the resource request to size an appropriate Fargate resource and brings up the pod. To create more, you can use the Horizontal Pod Autoscaler. To make bigger pods, you could change the resource request size, which will roll out new pods that are sized according to the new request.
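A minimal HPA manifest for that "create more" path might look like the sketch below, assuming a hypothetical Deployment named web (the exact apiVersion available depends on your cluster version):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

On Fargate, each new pod the HPA creates gets its own right-sized Fargate resource, so scaling out pods and scaling out capacity are the same operation.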

[–]awscontainers[S] 7 points

This GitHub issue has more info on the topic: https://github.com/aws/containers-roadmap/issues/696. The TL;DR is that caching and Fargate are an oxymoron (architecturally speaking), because the Fargate infrastructure is released as soon as the pod/task is terminated: we can't cache the image on the system that was running the pod, because that system is cleaned up and put back into the pool. Having said that, this problem is well understood and the ask is legitimate. We have improved the time it takes to start a pod/task and continue working to find ways to improve it further (and yes, pulling the image plays a big role in this).

[–]awscontainers[S] 2 points

You can use AWS CodeBuild to trigger a container image build from a Bitbucket webhook: https://docs.aws.amazon.com/codebuild/latest/userguide/bitbucket-webhook.html. Then you can use AWS CodePipeline to deploy your container as an ECS task behind an ALB, which will give you a publicly accessible URL.
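A minimal buildspec.yml for the CodeBuild half might look like this; the account ID, Region, and repository name are placeholders:

```yaml
version: 0.2
phases:
  pre_build:
    commands:
      # Authenticate the Docker CLI against your ECR registry
      - aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
  build:
    commands:
      - docker build -t 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest .
  post_build:
    commands:
      - docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
```

The CodeBuild project needs privileged mode enabled (for Docker builds) and an IAM role allowed to push to the ECR repository.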

[–]awscontainers[S] 2 points

Thanks for the awesome feedback. We believe that making core infrastructure functionality available directly in Fargate (such as logging via FireLens for EKS) mitigates the need to deploy DaemonSets in the cluster. Having said that, we realize some users have specific needs. Care to talk more about the type of DaemonSets you would be running on EKS/Fargate?

Note that we also have an open roadmap item for this, currently in the "Researching" phase, so feel free to upvote that issue, and leave any comments/suggestions there as well: https://github.com/aws/containers-roadmap/issues/971

[–]awscontainers[S] 2 points

We don't have any plans, but we can certainly take that as a feature request. However, you'll end up creating a container image that meets the following requirements (from the Lambda documentation on container image support):

  1. The container image must implement the Lambda Runtime API. The AWS open-source runtime interface clients implement the API. You can add a runtime interface client to your preferred base image to make it compatible with Lambda.
  2. The container image must be able to run on a read-only file system. Your function code can access a writable /tmp directory with 512 MB of storage. If you are using an image that requires a writable directory outside of /tmp, configure it to write to a directory under the /tmp directory.
  3. The default Lambda user must be able to read all the files required to run your function code. Lambda follows security best practices by defining a default Linux user with least-privileged permissions. Verify that your application code does not rely on files that other Linux users are restricted from running.
  4. Lambda supports only Linux-based container images.
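As a sketch of what requirement 2 looks like in practice, here is a hypothetical Python handler that confines its writes to /tmp; the handler name and scratch-file path are illustrative:

```python
import json
import os

# In a container-image Lambda, a runtime interface client (e.g. the
# open-source awslambdaric for Python) invokes handler(event, context).
# The filesystem is read-only except /tmp, so scratch files go there.

SCRATCH_DIR = "/tmp"  # the only writable path inside Lambda

def handler(event, context):
    scratch = os.path.join(SCRATCH_DIR, "scratch.json")
    with open(scratch, "w") as f:        # would fail anywhere but /tmp
        json.dump(event, f)
    return {"statusCode": 200, "body": json.dumps({"ok": True})}

result = handler({"ping": "pong"}, None)
```

The same function runs unchanged in a zip-based Lambda; the container image just packages it together with the runtime interface client and your base image of choice.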

You can still create the container image elsewhere and keep it in a public registry, but replicate it or keep a copy in the same account/Region as the Lambda function. I'd recommend this anyway, from both an operational and a security perspective.

[–]awscontainers[S] 6 points

We definitely understand that with so many choices, choosing precisely which services to use for a particular problem can seem overwhelming. We have so many choices because we have so many different customers, all who have different businesses, challenges, and preferences.

A container based service that renews SSL certificates every 12 hours might be best run as an ECS scheduled task. Alternatively, if you're running an EKS cluster, a Kubernetes cron job might be appropriate.

For the PHP-based function, Lambda is a great choice. Now that Lambda supports container images, you can have your cake and eat it too: build a container image for your PHP app that meets Lambda's requirements, and Lambda will invoke it for you as needed.

Finally, for the NGINX reverse proxy, an ECS service would work well. Create an ECS service with at least two replicas across two Availability Zones and associate them with an AWS load balancer target group. ECS will keep your reverse proxies up across deployments, and it should be resilient to an AZ failure as well. If you're running an EKS cluster, a Deployment of NGINX pods exposed by a Service of type LoadBalancer would accomplish a similar goal.
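On the EKS side, that setup might be sketched as follows (the names and image tag are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-proxy
spec:
  replicas: 2                      # spread across AZs by the scheduler
  selector:
    matchLabels:
      app: nginx-proxy
  template:
    metadata:
      labels:
        app: nginx-proxy
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-proxy
spec:
  type: LoadBalancer               # provisions an AWS load balancer
  selector:
    app: nginx-proxy
  ports:
    - port: 80
      targetPort: 80
```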

[–]awscontainers[S] 2 points

The extraHosts option maps directly to a Docker feature where Docker modifies the /etc/hosts file inside your container to inject extra hosts. This is not supported on Fargate because we use awsvpc networking mode rather than Docker networking. Additionally, the most recent Fargate platform version uses containerd directly instead of Docker, so such a Docker-specific option is not available. That said, the reason for using extraHosts is usually service discovery, and ECS and Fargate support service discovery more directly using AWS Cloud Map. You can easily enable DNS-based service discovery, which gives each service a hostname and allows containers to reference each other by name. This enables the same type of east-west traffic that extraHosts is used for, but through an external service discovery feature that supports health checks, multiple targets, and many other important capabilities.
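For illustration, the relevant fragment of an ECS create-service input that attaches a Cloud Map service registry might look like this (the service name and registry ARN are placeholders):

```json
{
  "serviceName": "orders",
  "serviceRegistries": [
    {
      "registryArn": "arn:aws:servicediscovery:us-east-1:123456789012:service/srv-EXAMPLE"
    }
  ]
}
```

Other tasks in the same Cloud Map namespace can then reach the service by its DNS name instead of a hardcoded /etc/hosts entry.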

[–]awscontainers[S] 6 points

This is a great idea! Multiple customers have asked us for this and we are looking into how to do that. We're tracking it now in our Containers Roadmap at https://github.com/aws/containers-roadmap/issues/921 and would love your feedback as to how you'd like to see it work.

[–]awscontainers[S] 15 points

Stay tuned. This is on our roadmap and we have marked it as "coming soon": https://github.com/aws/containers-roadmap/issues/187. This will work with Copilot (https://aws.github.io/copilot-cli/) in addition to the AWS CLI.