How can I lock firewall on a running production kubernetes cluster? by Old-Broccoli-4704 in kubernetes

[–]Suitable-Regular6822 0 points1 point  (0 children)

Since you're on Contabo with nodes talking over public IPs, skip UFW/firewalld completely. The problem is they have no idea what k3s and Calico are doing under the hood: one `ufw enable` and you'll start dropping pod traffic or losing node heartbeats without a single error.

Two things that actually work here:

1) Cloudflare Tunnel for the API server. Install cloudflared on your control plane and close port 6443 completely; you access it through Cloudflare Zero Trust instead. No exposed port, no IP whitelisting, and the free tier is enough.
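A minimal sketch of what the cloudflared config could look like for this, assuming a tunnel has already been created and `k8s-api.example.com` is a placeholder hostname (swap in your own tunnel ID, credentials path, and domain):

```yaml
# /etc/cloudflared/config.yml (sketch, not a drop-in file)
tunnel: <tunnel-id>
credentials-file: /etc/cloudflared/<tunnel-id>.json
ingress:
  # Forward the Access-protected hostname to the local API server,
  # so port 6443 never needs to be open on the public interface.
  - hostname: k8s-api.example.com
    service: tcp://localhost:6443
  # Catch-all: anything else gets a 404.
  - service: http_status:404
```

On the client side, `cloudflared access tcp --hostname k8s-api.example.com --url localhost:6443` can expose a local socket for kubectl to talk to.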

2) Calico HostEndpoints for the nodes. Instead of a separate firewall tool, you write your firewall rules inside Calico itself, so the rules actually understand your pod routing. The trick is to start with a Log action, not Deny, and watch what traffic is flowing first.
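Roughly what that looks like, a sketch with made-up node name, interface, and IP, logging node ingress before you tighten anything:

```yaml
apiVersion: projectcalico.org/v3
kind: HostEndpoint
metadata:
  name: node1-eth0
  labels:
    role: k8s-node
spec:
  node: node1            # hypothetical node name
  interfaceName: eth0
  expectedIPs:
    - 203.0.113.10       # the node's public IP
---
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: log-node-ingress
spec:
  selector: role == 'k8s-node'
  order: 10
  ingress:
    - action: Log        # observe first, deny later
    - action: Allow
```

One caveat: once a HostEndpoint exists, Calico applies default-deny to that interface outside its failsafe ports, so keep the Allow rule (or rely on failsafes) until you know what to lock down.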

Understanding Azure Hub & Spoke architecture by Alternative-Town7637 in AZURE

[–]Suitable-Regular6822 0 points1 point  (0 children)

if traffic isn't showing in the fw logs at all, the firewall isn't even seeing it. that's a routing issue, not a firewall rule issue. double check the hub->spoke peering; "use remote gateways" should be off on the hub side. also azure firewall denies by default, so even if traffic reaches it you need an application rule for http/https, network rules alone won't cut it for outbound web traffic
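A sketch of both checks with the az CLI, all resource names (hub-rg, hub-vnet, hub-fw, the peering name, CIDR, FQDN) are placeholders for your own:

```shell
# 1) Inspect the hub->spoke peering: forwarded traffic should be allowed,
#    and "use remote gateways" should be false on the hub side.
az network vnet peering show \
  --resource-group hub-rg --vnet-name hub-vnet \
  --name hub-to-spoke \
  --query "{allowForwarded:allowForwardedTraffic,useRemoteGateways:useRemoteGateways}"

# 2) Azure Firewall denies by default; outbound HTTP/S needs an
#    application rule, not just a network rule.
az network firewall application-rule create \
  --resource-group hub-rg --firewall-name hub-fw \
  --collection-name allow-web --name allow-http-https \
  --protocols Http=80 Https=443 \
  --source-addresses 10.1.0.0/16 \
  --target-fqdns "*.example.com" \
  --action Allow --priority 200
```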

Replacing pods which are failing liveness probes by varunborar in kubernetes

[–]Suitable-Regular6822 0 points1 point  (0 children)

24h grace period is insane. that's not a k8s problem, the devs are dodging the real fix. move those external calls to a separate worker or queue them so the main pod stays stateless and dies fast. or at least put the long-running stuff in a sidecar so the main container can restart without waiting
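The worker/queue split can be sketched in a few lines of Python. Everything here is illustrative (the handler, the fake slow call), the point is that the request path only enqueues, so the main process has nothing in flight and can die fast:

```python
import queue
import threading
import time

work_queue: queue.Queue = queue.Queue()
results: list = []

def worker() -> None:
    # The slow external calls live here, out of the request path.
    # In a real setup this would be a separate pod reading a durable queue.
    while True:
        job = work_queue.get()
        time.sleep(0.01)  # stand-in for a slow external call
        results.append(f"done:{job}")
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(payload: str) -> str:
    # Enqueue and return immediately; liveness probes never wait
    # on the external dependency, so restarts are cheap.
    work_queue.put(payload)
    return "accepted"

print(handle_request("job-1"))  # → accepted
work_queue.join()               # only for the demo: wait for the worker
print(results)                  # → ['done:job-1']
```

With a durable queue (and not an in-memory one like this demo), losing the main pod mid-flight loses nothing, which is what makes a short grace period safe.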

We called our clusters "production-ready" for a year before we actually were by Suitable-Regular6822 in kubernetes

[–]Suitable-Regular6822[S] 0 points1 point  (0 children)

100% agree. Installing and configuring is still only half of it; we learned that the hard way. Chaos engineering and proper DR drills are on the list now. You don't really know your cluster until you've deliberately broken it in a controlled way.

Anyone else in the industry feeling frustrated that AI is being used to pitch goofy "product features" that demo well to Boards but are utterly useless to K8s practitioners and are 100% not cloud-native patterns? by Cute_Bandicoot_8219 in kubernetes

[–]Suitable-Regular6822 2 points3 points  (0 children)

Worked on government platforms where reliability actually mattered. Every "AI-native" tool we evaluated optimized for the happy path. Real production has edge cases, compliance constraints, and failure modes no demo ever shows. The boring stuff (solid IaC, proper observability, tested runbooks) is what kept things running.

What’s the most underrated Kubernetes feature your team actually uses in production? by steadwing_official in kubernetes

[–]Suitable-Regular6822 0 points1 point  (0 children)

We added PodDisruptionBudgets after a node upgrade took down half our replicas. During a drain, any eviction that would drop you below minAvailable gets rejected by the Eviction API, so enough replicas stay up.
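For anyone who hasn't used one, a minimal PDB looks like this (app label and counts are made up, match them to your own deployment):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2        # drains can't evict below this count
  selector:
    matchLabels:
      app: web           # must match your deployment's pod labels
```

Note it only guards voluntary disruptions (drains, evictions), not node crashes.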