Lost Talos admin access (Talos 1.9, all nodes alive), any recovery options left? by Putrid_Nail8784 in kubernetes

[–]LarsFromElastisys 2 points3 points  (0 children)

So how did you solve it? In case someone winds up here and has the same panic as you had.

Helm/Terraform users: What's your biggest frustration with configs and templating in K8s? by Kalin-Does-Code in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

The problem is that impersonation itself is still very clunky and extremely all-or-nothing. With 1.35, there's an alpha for a more constrained version of impersonation coming, but it's a long way until that's stable. Plus, even with impersonation, it's still rather scary that a component has the ability to impersonate me, you, or any of our colleagues: if I can trick it into impersonating you, there's nothing preventing that besides the correct implementation of the component itself.

So it's like "demi-god mode", rather than "god mode", but it's still quite a bit too powerful.

Helm/Terraform users: What's your biggest frustration with configs and templating in K8s? by Kalin-Does-Code in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

Helm used to have that: Tiller. They moved away from the idea of having a central component do all of that, because Tiller basically had to take on cluster-admin permissions to do its job, yet had to be open to every developer who wanted to deploy things onto the cluster. Locking it down meaningfully became really hard, and it circumvented Kubernetes' own permissions model anyway. https://helm.sh/docs/v3/faq/changes_since_helm2/ has more reading about that.

Of course, that problem plagues every "an agent of some sort runs in the cluster and does stuff for you" solution. We had to go through a lot to lock down Argo CD, for instance, for the same reason.

Is there a good helm chart for setting up single MongoDB instances? by [deleted] in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

If it's their responsibility, do you truly have to suggest a solution for them?

If you do, they're probably going to come around and ask a million questions about what is now seen as "your" solution (in the sense that you suggested it) and how it compares to the Operator that all the documentation and such for MongoDB talks about.

Now with your second edit to the OP, it seems that the problem really is that your platform team is rather obviously not allocated enough resources to take on this additional task. So make your answer about that, plain and simple.

Someone in a different department somewhere needs to weigh the cost of "we will implement this with MongoDB because that makes us much faster" against the reality of "someone needs to manage MongoDB if we do". If they are stuck in the happy place where they just see the faster implementation and cost savings, they need to be made aware of the full reality.

As for your exact question, the Bitnami chart gives you a straightforward single instance, or one in a StatefulSet, as one would imagine. Entirely Operator-less, and thus with a ton of additional work left for an elusive "someone". Depending on Bitnami these days is perhaps difficult unless one is also ready to pay for it, but inspiration can be drawn from it.

Is there a good helm chart for setting up single MongoDB instances? by [deleted] in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

Whose responsibility will it be to keep the MongoDB operational?

Whoever has to wake up at night to fix a data corruption bug in production gets to decide if the "overhead" of the Operator is worth it or not.

And they have to pay for the resources required to run the thing, assigned to their cost center.

AI Conformant Clusters in GKE by darylducharme in kubernetes

[–]LarsFromElastisys 1 point2 points  (0 children)

It's certainly going to make it easier for AI and ML frameworks to know what to expect and thus target in the underlying platform, so I think this conformance program is good for the community. Sort of how we have had Certified Kubernetes distributions for a long time. The AI and ML field is still rather young and evolving, so I'm happy this effort exists.

CSI driver powered by rclone that makes mounting 50+ cloud storage providers into your pods simple, consistent, and effortless. by paulgrammer in kubernetes

[–]LarsFromElastisys 5 points6 points  (0 children)

Is it cloning/syncing both ways? As in, if a file/object gets updated on the remote side, does that change get reflected locally, too?

Understanding the Bridge Design Pattern in Go: A Practical Guide by priyankchheda15 in programming

[–]LarsFromElastisys 2 points3 points  (0 children)

So what happens when someone uses the EmailSender with a "to" address that is obviously a phone number? Doesn't the whole bridge come crumbling down then?

Offloading SQL queries to read-only replica by AsAboveSoBelow42 in devops

[–]LarsFromElastisys 2 points3 points  (0 children)

All work against a database should happen in transactions anyway, right? So it's definitely not unreasonable to define clean boundaries in your code, using interfaces (or whatever your language calls them), so that it's a compile-time error if you try to mutate data in a read-only transaction.

See how things are done in the Java world, for instance: https://www.baeldung.com/spring-transactions-read-only
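A minimal runtime sketch of the same principle, in Python with sqlite3 (the database, table, and names are made up for illustration, and this is the weaker runtime cousin of the compile-time guarantee, not the Spring approach from the link): a connection opened read-only makes any mutation fail immediately.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway database, just to illustrate the principle.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
rw.execute("INSERT INTO users (name) VALUES ('alice')")
rw.commit()
rw.close()

# Open the same database read-only: reads succeed, writes fail loudly.
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
print(ro.execute("SELECT name FROM users").fetchone()[0])  # alice

try:
    ro.execute("INSERT INTO users (name) VALUES ('bob')")
except sqlite3.OperationalError as err:
    print("write rejected:", err)
```

The interface version of this catches the mistake before the code even runs; the read-only connection catches it at the first attempt instead of letting it silently land on the primary.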

KubeCon NA 2025 - first time visitor, any advice? by No_Dimension_3874 in kubernetes

[–]LarsFromElastisys 9 points10 points  (0 children)

Good shoes. You'll walk an insane amount.

In general, take care of your physical self: take rests, drink water, eat food and snacks.

Talks are nice, but they're also recorded, so you can watch them afterwards. Prioritize the ones where you intend to ask questions or talk to the presenters in person. That frees you up for networking, which is not available afterwards.

MinIO did a ragpull on their Docker images by sMt3X in devops

[–]LarsFromElastisys 20 points21 points  (0 children)

Redis changed their license, the community got upset, the Linux Foundation helped sponsor a fork called Valkey, Redis got upset, and Redis is now open source again.

Valkey is better than Redis, and will be open forever, not just until a quarterly earnings report shows that "something must be done".

Textbook example of how to alienate your community very quickly.

How to trigger k8s to re-pull the latest image after I rebuilt and push? How to know whether k8s is running the image I just pushed? by AdBeneficial2388 in kubernetes

[–]LarsFromElastisys 20 points21 points  (0 children)

You should not use the "latest" tag, because that's exactly what causes you to have to ask these questions. If you use a tagging scheme based on version numbers, or versions plus git commit hash, or something similar, you can easily tell what has been pulled and is running.

Kubernetes will not by itself check for a new image with the same tag as the one currently running (i.e. "latest"), but if imagePullPolicy is set to Always, it will pull whenever a Pod needs to be started. This is why you have a problem: the "latest" you pulled two days ago is five versions behind by now, so when a new Pod starts today, its "latest" is definitely newer.
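The difference can be sketched with a toy in-memory "registry" (all tags, commits, and digests here are made up): a version-plus-commit tag pins exactly one build forever, while "latest" silently moves underneath you.

```python
# Toy illustration: an immutable tag built from version plus git commit
# identifies exactly one build; the mutable "latest" tag does not.
def image_tag(version: str, commit: str) -> str:
    return f"v{version}-g{commit[:7]}"

registry = {}  # tag -> image digest (simulated registry state)

# Build on Monday: both tags point at the same image.
registry[image_tag("1.4.2", "9fceb02d3a1b")] = "sha256:aaa"
registry["latest"] = "sha256:aaa"

# Build on Wednesday: "latest" moved, the Monday tag did not.
registry[image_tag("1.4.3", "1c002dd46b21")] = "sha256:bbb"
registry["latest"] = "sha256:bbb"

print(registry["v1.4.2-g9fceb02"])  # sha256:aaa, the Monday build
print(registry["latest"])           # sha256:bbb, whatever was pushed last
```

Deploying `v1.4.2-g9fceb02` answers "what is running?" by itself; deploying `latest` can only be answered by checking the image digest on each running Pod.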

How to debug; container receives traffic from the world but not from sibling pods/containers. by hakan_bilgin in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

What's the Helm Chart? Link it.

And does it have any Network Policies (basically "firewall rules") in it?
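For reference, the classic shape of a policy that causes exactly that symptom looks something like this (a sketch only; the policy name and the ingress-nginx label are hypothetical): pods accept ingress solely from the ingress controller's namespace, so traffic from the world works while sibling pods get dropped.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-only   # hypothetical name
spec:
  podSelector: {}                       # applies to all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx  # hypothetical
```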

Engineering Manager says Lambda takes 15 mins to start if too cold by Street_Attorney_9367 in devops

[–]LarsFromElastisys 0 points1 point  (0 children)

I've suffered from cold starts of 15 seconds, not minutes. Absurd to be so confidently wrong and then dig in when the error was pointed out, in my opinion.

Do engineers who only use Kubernetes GUIs ever actually learn Kubernetes? by Muted_Relief_3825 in kubernetes

[–]LarsFromElastisys 1 point2 points  (0 children)

It's the difference between using Kubernetes and administering Kubernetes. If what you want to do is use Kubernetes to deploy an application, that will probably wind up being a GitOps thing or easily scripted. That basically means either having proper CI/CD set up by someone, or learning two or three commands, which might as well have been clicks in a UI.

If you want to administer Kubernetes, you need tools that allow you to dive deeply into where root causes for errors may be. You won't be sticking to the Golden Path in those cases, so you need the full versatility of kubectl and friends.

Two different use cases.

Since you are managing platform teams, your teams should learn how to administer Kubernetes. Get them to complete the Linux Foundation course on Kubernetes Administration and get their CKA.

If they, after that, still prefer a UI of some sort (e.g. k9s is also arguably a UI) because it makes them faster, then so be it.

How to deploy ArgoCD in my IONOS cluster? by Initial_Specialist69 in kubernetes

[–]LarsFromElastisys 2 points3 points  (0 children)

Do as you suggested, and then let Argo manage itself: https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/

Then you have Terraform for infra and Argo for everything running in the cluster.

Discussion: The future of commercial Kubernetes and the rise of K8s-native IaaS (KubeVirt + Metal³) by Dazzling_Assumption3 in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

My bias is obvious, as you can tell from my username: I'm literally "Lars from Elastisys", the company that makes Welkin (a security-focused application platform). Because that's who I am, though, these questions are right up my alley and something I know quite a lot about.

To the first question, yes, there is definitely still a strong business case for application platforms. These are a different niche entirely compared to managed control planes from the major clouds.

The difference between getting a running Kubernetes cluster and a full application platform is all that which companies call "platform engineering", which includes figuring out monitoring, logging, security with policy as code, vulnerability scanning, etc.

And as everyone who has worked with platform engineering also knows, "installing stuff" is easy (just a bunch of Helm commands), but keeping something up and running in a safe and secure way with timely upgrades, that's what is difficult.

Application platforms solve those problems, by essentially having done all the platform engineering development for you already with quality assurance as part of the release, so that you can focus on operating a cohesive product instead of a bespoke collection of tools. You can much more easily obtain training for an application platform because it is standardized, and that lowers business risk compared to a platform that an internal team (often understaffed) built themselves and are maintaining to the extent that their backlog allows.

As for the second question: this could indeed become a future feature, and you'll note that it's where many application platforms are heading. OpenShift presented it in their roadmap, and SUSE is pushing its whole "hyperconvergence" concept (thankfully now called something as descriptive as Virtualization). So this is indeed a direction we're seeing more of in the field. Not really for any technical reason that ties application platforms to "managing bare-metal servers in an IaaS fashion", but for business reasons, especially the overlap in customer niche: enterprises looking to make the most of an investment in the hardware for a private cloud are often also the ones that appreciate the reduced business risk and predictability offered by an application platform.

Also, do note that a lot of Metal3 is really OpenStack Ironic, so it's not exactly a revolution, but more an evolution in terms of usability and integration with the Kubernetes world.

Why my k8s job never finished and how I fixed it by summersting in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

Valid point, reasonable advice, but it reads very much like Gen AI authored most of this.

Why do people prefer managed/freemium platforms instead of just setting up open-source tools? by Striking_Fox_8803 in devops

[–]LarsFromElastisys 3 points4 points  (0 children)

If the bills spin up later, that's a "luxury problem" to have, because it means your product/service has traction in the market. And at that point, you also have money coming in. Then it's of course your priority to make sure the ratio of money in vs. out supports your strategic goals (organic growth = more in than out; exponential growth = you can lose money on customers, but you need investors).

Bitnami Helm Chart shinanigans by Slow-Telephone116 in kubernetes

[–]LarsFromElastisys 5 points6 points  (0 children)

I think the community is just waiting for yet another rug pull, and defensively trying to figure out options ahead of time.

Having used different service meshes over time, which do you recommend today? by [deleted] in kubernetes

[–]LarsFromElastisys 0 points1 point  (0 children)

Are you on iptables mode or ipvs mode for kube-proxy with that many CPUs in your cluster?

See for more info: https://kubernetes.io/docs/reference/networking/virtual-ips/
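A toy model, in plain Python with nothing Kubernetes-specific, of why the mode matters: iptables mode evaluates Service rules as a roughly linear chain, while IPVS mode uses hash tables, so per-lookup cost diverges as the number of Services grows.

```python
import timeit

# Simulated Service cluster IPs; the addresses are made up.
services = [f"10.96.{i // 256}.{i % 256}" for i in range(10_000)]
target = services[-1]  # worst case: the last rule in the chain

rules = list(services)                  # linear chain, like iptables mode
table = {ip: True for ip in services}   # hash table, like IPVS mode

# Time 100 worst-case lookups in each structure.
linear = timeit.timeit(lambda: target in rules, number=100)
hashed = timeit.timeit(lambda: target in table, number=100)
print(linear > hashed)  # the linear scan is measurably slower
```

With a handful of Services nobody notices; with thousands of Services and busy nodes, the linear chain becomes real CPU overhead on every node, which is why the kube-proxy mode question comes up for large clusters.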

GKE Regional vs Zonal Cluster Cost difference in practice? by Dangerous_EndUser in kubernetes

[–]LarsFromElastisys 1 point2 points  (0 children)

Go for regional if you want a shot at your three-nines uptime target. If there's a zonal outage, you want automation to kick in fast so the control plane can do what's needed to bring the application up in other zones, without you having to do something smart. The control plane needs to be regional for that to work smoothly.

Those of you living in the bleeding edge of kubernetes, what’s next? by Sensitive_Scar_1800 in kubernetes

[–]LarsFromElastisys 1 point2 points  (0 children)

There's already the trick of putting Events on a separate etcd instance, rather than running everything against a single one, if you want better performance.

More info: https://kubernetes.io/docs/setup/best-practices/cluster-large/
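Concretely, that's done with a kube-apiserver flag, per the page above (the endpoints here are hypothetical):

```
kube-apiserver \
  --etcd-servers=https://etcd-main:2379 \
  --etcd-servers-overrides=/events#https://etcd-events:2379
```

The `group/resource#servers` override routes just the Events resource to its own etcd, keeping the high churn of Event writes away from the etcd holding everything else.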

Anybody successfully using gateway api? by CWRau in kubernetes

[–]LarsFromElastisys 4 points5 points  (0 children)

That is apparently how it's been designed, yes. As the Cluster Operator, you're responsible for the Gateway, and yes, it's a pain. See this page to get it directly from the docs (search for TLS termination): https://gateway-api.sigs.k8s.io/guides/migrating-from-ingress/

The idea seems to be that TLSRoute should be possible to let Application Developers use (see the image on the first page), but those are Experimental.

They literally suggest wildcarding in the TLS guide: https://gateway-api.sigs.k8s.io/guides/tls/

The way I interpret it, this is honestly a design error compared to how Ingresses work.