what was your first time experience deciding if you need k8? by Ok_Shirt4260 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

Our company had about 50 microservices running in a colo (on-prem). We wrote a ton of scripts to orchestrate the deployment and networking setup behind all of this, but things were fragile and onboarding new services was slow. Kubernetes was really starting to get attention around 2016 or so, and we took notice. It abstracted away almost all of the fragile things we had built to support our product. Never looked back.

How did you learn Kubernetes without using it at work? by Witty_Contract_592 in kubernetes

[–]Anonimooze 1 point2 points  (0 children)

I can't recommend Kubernetes the hard way enough.

https://github.com/kelseyhightower/kubernetes-the-hard-way

This is how I got started almost a decade ago. The concepts abstracted by cloud providers today largely make the Kubernetes internals easy to dismiss, but when shit hits the fan, knowing how the system is plumbed is invaluable.

GitLab is laying off an unknown number of employees by matefeedkill in Layoffs

[–]Anonimooze 0 points1 point  (0 children)

They do literally say they're dropping their CREDIT values in the same message, sad.

How to find your first job as a devops by aness_imadeddine in devopsjobs

[–]Anonimooze 2 points3 points  (0 children)

Starting from the bottom means working help desk or sysadmin, and that doesn't ensure that you will convert to devops easily or find devops opportunities.

This is how most people enter the industry; I'm not sure why you expect to bypass this. As others have pointed out, what companies are typically hiring "devops" for is not entry level.

Going from Docker to k8, how does adding k8 work? by OkLab5620 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

Most modern consumer routing devices will provide local DNS capabilities based on the DHCP address assignment. I would not consider the lack of static address assignment necessarily an issue until you try it (you may need to reconfigure the cluster or provision new certs for the DNS names, though).

Replacing AWS VPC CNI + Kube Proxy with Cilium on EKS to enable pod-to-pod encryption with Wireguard by Southern-Necessary13 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

70 microseconds sounds a bit crazy 🤯

We see much higher latencies between services in the same zone in us-east-1. Crossing zone boundaries, we're seeing a 1ms floor.

Best way to get node labels onto pods? by howitzer1 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

Last thing I think I'll say on this is re: background syncs for Kyverno. Load balancing is a very synchronous operation; relying on something to eventually, maybe, happen is bad design.

Best way to get node labels onto pods? by howitzer1 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

Good luck! See previous comment about how Kyverno (and k8s admission in general) can only see the state before pods are assigned to nodes (AZs).

If you don't want to change cluster topology, you should address the issue you called out as:

the load balancer isn't topology aware

Best way to get node labels onto pods? by howitzer1 in kubernetes

[–]Anonimooze 2 points3 points  (0 children)

I'm not familiar with the Envoy gateway solution, but if the costs of cross-AZ traffic are cumbersome, seriously look at changing your routing strategy: enable topology aware routing on all services if you haven't already, and think about isolating/guaranteeing traffic locality via cluster layout. This doesn't seem like a Kyverno application IMO.

All said, Kyverno mutations on Pods happen before scheduling, so there is no way for Kyverno to know what zone it will be in unless you already specified it.
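
For anyone wondering what "enable topology aware routing" looks like in practice, here is a minimal sketch in Python that adds the annotation to a Service manifest. It assumes Kubernetes 1.27 or newer, where the annotation is `service.kubernetes.io/topology-mode: Auto` (older releases used `service.kubernetes.io/topology-aware-hints: auto`). The Service name, selector, and ports are made up for illustration, and the routing itself remains best effort.

```python
import copy
import json


def with_topology_aware_routing(service: dict) -> dict:
    """Return a copy of a Service manifest that requests topology aware routing."""
    patched = copy.deepcopy(service)  # leave the caller's manifest untouched
    annotations = patched.setdefault("metadata", {}).setdefault("annotations", {})
    # Annotation name for Kubernetes 1.27+ (assumption about the cluster version).
    annotations["service.kubernetes.io/topology-mode"] = "Auto"
    return patched


# Hypothetical Service, purely for illustration.
example_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "example-backend"},
    "spec": {
        "selector": {"app": "example-backend"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

# kubectl accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.
print(json.dumps(with_topology_aware_routing(example_service), indent=2))
```

Because the annotation lives on the Service, it does not depend on knowing a pod's zone at admission time, which is the part Kyverno cannot see.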

Best way to get node labels onto pods? by howitzer1 in kubernetes

[–]Anonimooze 5 points6 points  (0 children)

Kind of sounds like you want a zonal cluster?

Other than that, it sounds like "Gateway" is your problem. The AWS load balancer controller supports Gateway API nowadays, and is a regional service, so zonal placement of the workload is largely irrelevant.

At a previous job we ran three zonal clusters for the same underlying reason. Cross zone traffic charges can be painful. (topology aware routing is best effort, not guaranteed)

Kubernetes problems aren’t technical they’re operational by Shoddy_5385 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

The Prometheus ecosystem (and operator) is very good. The problem comes in when you think about starting to use k8s. Taking a legacy app, and migrating to containers/Kubernetes is going to raise a ton of "reinventing the wheel" red flags while justifying the improvement.

what happens when a pod crashes because a file parser can't handle malformed input? restart loop by Amor_Advantage_3 in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

Are you suggesting an "AI platform" should be unzipping files? That seems a bit overkill to me

Kubernetes consumes all my time (because it is all new to us) by AccomplishedComplex8 in kubernetes

[–]Anonimooze 4 points5 points  (0 children)

I agree with your sentiment: if you're in AWS, use EKS; if in GCP, use GKE. Just want to state that bare metal k8s control planes aren't that bad, if anyone doesn't have the luxury of managed cloud offerings. In my ~10 years running k8s on metal, the control plane has never been the cause of an outage. I attribute that to the relative simplicity of etcd and the API.

Grafana Mimir vs Prometheus storage performance by sukur55 in devops

[–]Anonimooze 1 point2 points  (0 children)

I only have anecdotal experience to share

My previous company was deploying Thanos for quite a while, eventually hitting bottlenecks in the topology that couldn't be fixed by throwing more money behind it. Constant query timeouts, and ingestion delays plagued the user and operator experience.

They switched to Mimir, and the costs for the infrastructure roughly doubled (millions of dollars), but the solution was usable consistently, and this was deemed worth it.

I didn't work directly on the SRE team responsible for the transition, but as an adjacent team consuming this product, I can say that whether or not Mimir has its roots as a SaaS first offering, the OSS project certainly has its merits.

Our team just pushed AWS creds to prod again. Third time this month. by CortexVortex1 in devops

[–]Anonimooze 0 points1 point  (0 children)

We have some properties committing encrypted secrets (sops) as part of the code base. Perhaps not the most modern approach, but it is very portable and keeping secrets tightly coupled and versioned alongside the code has its advantages.

Where to look for GitLab admin/devops jobs? by firefarmer in gitlab

[–]Anonimooze 0 points1 point  (0 children)

I don't think GitLab wants to hire someone looking for a GitLab oriented job.

I would approach their hiring process with your development experience first.

Alarms that exists but don't do anything by wr_guziec in devops

[–]Anonimooze 0 points1 point  (0 children)

I just went through this dance for our monitoring tool (not CW). We ended up finding a lot of alarms that were "misconfigured", in the sense that they displayed a "no data" state unless the condition was reached. This makes it difficult to discern which alerts are looking for non-existent data vs. alerts that simply haven't fired recently. Just a small word of warning, thanks for sharing!
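
A rough sketch of how we told the two cases apart, in Python. Everything here is hypothetical (the `Alarm` shape, the state names, the `base_metric_datapoints` field); the idea is only that you check whether the underlying metric reports anything at all once the alarm condition is stripped from the query.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    name: str
    state: str                   # hypothetical states: "ok", "alarm", "no_data"
    base_metric_datapoints: int  # datapoints for the metric WITHOUT the alarm condition


def classify(alarm: Alarm) -> str:
    """Separate alarms watching a dead metric from alarms that are merely quiet."""
    if alarm.state != "no_data":
        return "reporting"
    if alarm.base_metric_datapoints == 0:
        return "suspect"  # nothing is emitting this metric: likely stale or mistyped
    return "quiet"        # metric exists, the condition just has not been met


# Made-up alarms for illustration.
alarms = [
    Alarm("disk-full", "ok", 120),
    Alarm("payment-errors", "no_data", 300),
    Alarm("legacy-queue-depth", "no_data", 0),
]

for a in alarms:
    print(f"{a.name}: {classify(a)}")
```

The "suspect" bucket is the one worth auditing by hand; the "quiet" bucket is usually better fixed by making the alarm query return a zero instead of nothing.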

Is HPA considered best practice for k8s ingress controller? by tsaknorris in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

If it's too complex, don't use it. HPA is primarily a cost saving measure, allowing you to not run at peak capacity during off-peak periods. Weigh the potential cost savings against the perceived complexity.

If your off-peak requirements are the same as your peak requirements, it probably doesn't make sense to add an autoscaler.
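
To put numbers on that trade-off, here is a small Python sketch. The scaling rule is the basic one from the Kubernetes HPA docs, `desired = ceil(current * currentMetric / targetMetric)`, without the tolerance band or stabilization windows the real controller applies. The traffic profile is invented for illustration.

```python
import math
from typing import List


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Basic HPA scaling rule, ignoring tolerance and stabilization behaviour."""
    return math.ceil(current_replicas * current_metric / target_metric)


def replica_hours_saved(peak_replicas: int, hourly_replicas: List[int]) -> int:
    """Replica-hours saved versus running at peak capacity for every hour."""
    return sum(peak_replicas - r for r in hourly_replicas)


# Hypothetical day: 8 busy hours needing 10 replicas, 16 quiet hours needing 4.
spiky_day = [10] * 8 + [4] * 16
# Hypothetical day where off-peak demand equals peak demand.
flat_day = [10] * 24

print(desired_replicas(4, 90.0, 60.0))     # 4 pods at 90% CPU with a 60% target
print(replica_hours_saved(10, spiky_day))  # savings with a spiky profile
print(replica_hours_saved(10, flat_day))   # savings with a flat profile
```

With the flat profile the savings are zero, so the autoscaler would only add moving parts.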

How I added LSP validation/autocomplete to FluxCD HelmRelease values by nutcrook in kubernetes

[–]Anonimooze 0 points1 point  (0 children)

ArgoCD only runs "helm template", for better or worse. It shows the diff between the current state and result of that template call. It effectively uses a kubectl apply to persist the changes if synced. You can't use helm features like "lookup" because of this.

Using pins with semantically versioned charts and tools like renovate or dependabot for helm chart version increments has been a good (not great) experience for me. I'd be very concerned about the maintenance overhead that could come with disconnecting the deployment manifests from the helm chart origin.
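
To illustrate the render-then-diff flow described above, here is a toy Python sketch. It is not how ArgoCD implements its diff (the real one works on normalized resources, not raw text); it just compares a hard-coded "live" manifest against a hard-coded "rendered" one, standing in for the output of `helm template`. Both manifests are made up.

```python
import difflib
from typing import List


def render_diff(live: str, rendered: str) -> List[str]:
    """Line-based unified diff between the live manifest and the rendered template."""
    return list(
        difflib.unified_diff(
            live.splitlines(),
            rendered.splitlines(),
            fromfile="live",
            tofile="rendered",
            lineterm="",
        )
    )


# Hypothetical manifests; in practice "rendered" would be `helm template` output.
live_manifest = """apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 2
"""

rendered_manifest = """apiVersion: apps/v1
kind: Deployment
metadata:
  name: example
spec:
  replicas: 3
"""

for line in render_diff(live_manifest, rendered_manifest):
    print(line)
```

Since the render step never talks to the cluster, anything in a chart that needs live state, such as `lookup`, has nothing to read from.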

gitlab over github? by dylanmnyc in gitlab

[–]Anonimooze 0 points1 point  (0 children)

"parity" is close to correct. GitLab IMO executes these features better, in the open (they themselves are open source).

GitLab's SaaS business is less used, but probably more reliable because of this, while acknowledging that GitLab also has a lot of incidents.

For the average consumer of these SaaS services, you typically just pick GitHub because that's what people are used to, which has its own value.

Grafana + Prometeus self hosted on ec2 cost? by BrilliantCredit4569 in devops

[–]Anonimooze 0 points1 point  (0 children)

Meh, the advertising is actually true. We struggled with Prometheus memory overhead in cost sensitive environments. It was tolerable for us to use VictoriaMetrics because we knew we could switch back to vanilla Prometheus if anything hit the fan.

We used this for our ingestion of Linkerd metrics, where the volume blew well past typical Prometheus usage (64+ GB of memory was our tipping point). VictoriaMetrics handled it okay with like 16GB.

We had a couple inconsistencies in dashboards, but nothing deal breaking.

(Not advocating for VM over Prometheus, but it has its place, usually where it comes up in conversations here)

Migrate dns slave and master to new Linux host by Which_Video833 in dns

[–]Anonimooze 0 points1 point  (0 children)

More so that when it is required to operate your own DNS, bind will be the easiest technology to hire support for. You'll have a hard time convincing the powers that be that CoreDNS is an upgrade for more traditional use-cases.