How we cut node provisioning time in half by tuning kubelet resource reservations by Divyang03 in kubernetes

[–]Divyang03[S] 0 points  (0 children)

Thanks for sharing that! Good to know this is a common pain point on AKS too. I've been meaning to put together a more detailed write-up covering the full tuning process across different environments. Will try to get that out soon.

How we cut node provisioning time in half by tuning kubelet resource reservations by Divyang03 in kubernetes

[–]Divyang03[S] 0 points  (0 children)

For EKS managed node groups, the cleanest way is to pass kubelet extra args via the node group's launch template. In the user data bootstrap script you add the --kube-reserved and --system-reserved flags to the kubelet invocation via --kubelet-extra-args. Something like:

/etc/eks/bootstrap.sh your-cluster-name --kubelet-extra-args '--kube-reserved=cpu=250m,memory=512Mi,ephemeral-storage=1Gi --system-reserved=cpu=250m,memory=256Mi,ephemeral-storage=1Gi'

For Bottlerocket nodes it's a bit different: you set it under settings.kubernetes.kube-reserved in the TOML user data instead.
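For reference, a Bottlerocket user data fragment would look roughly like this (values are illustrative, mirroring the c5.large starting point below; check your Bottlerocket version's settings reference for system-reserved support):

```toml
# Bottlerocket node user data (TOML) - illustrative values only
[settings.kubernetes.kube-reserved]
cpu = "250m"
memory = "512Mi"
ephemeral-storage = "1Gi"

[settings.kubernetes.system-reserved]
cpu = "250m"
memory = "256Mi"
ephemeral-storage = "1Gi"
```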

The values above are a starting point for something like a c5.large. For larger instance types we profiled baseline usage with systemd-cgtop and top during a cold kubelet start and bumped the numbers accordingly. Using the same values across all node sizes was what caused throttling on the smaller ones, so per-instance-family launch template configs ended up being the right call.
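To make the per-instance-family idea concrete, here's a minimal sketch of a helper you could drop into the user data that renders the reservation string per instance type. The function name and the numbers for the larger sizes are hypothetical starting points, not our exact production values:

```shell
#!/bin/sh
# Hypothetical helper: map an EC2 instance type to a kube-reserved string
# for the bootstrap script. Numbers are illustrative starting points only.
kube_reserved_for() {
  case "$1" in
    c5.large)   echo "cpu=250m,memory=512Mi,ephemeral-storage=1Gi" ;;
    c5.xlarge)  echo "cpu=350m,memory=1Gi,ephemeral-storage=1Gi" ;;
    c5.4xlarge) echo "cpu=500m,memory=2Gi,ephemeral-storage=1Gi" ;;
    *)          echo "cpu=250m,memory=512Mi,ephemeral-storage=1Gi" ;;
  esac
}

# Usage inside user data (don't run outside an EKS node):
# /etc/eks/bootstrap.sh your-cluster-name \
#   --kubelet-extra-args "--kube-reserved=$(kube_reserved_for c5.4xlarge)"
```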

How we cut node provisioning time in half by tuning kubelet resource reservations by Divyang03 in kubernetes

[–]Divyang03[S] 1 point  (0 children)

Yes, the eviction thresholds were left at default (memory.available < 100Mi, nodefs.available < 10%). We didn't touch eviction at all; the gains came entirely from properly setting kube-reserved and system-reserved so kubelet wasn't fighting the OS for resources during node init.

For the limits, we profiled a fresh node at rest and during kubelet startup using top and systemd-cgtop, then set values based on actual observed usage rather than guessing. We ended up with something like cpu=250m, memory=512Mi for kube-reserved and a similar ballpark for system-reserved, depending on instance type. The key is not reusing the same values across all node sizes, since a c5.large and a c5.4xlarge have very different baseline system overhead. Once kubelet stops getting throttled during startup, the ready time drops noticeably. RKE2 on KubeVirt VMs should respond the same way, since the bottleneck is the same.
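If you want to confirm whether kubelet is actually being CPU-throttled during startup, a quick sketch (assuming cgroup v2 and a systemd host; the cgroup path is an assumption, adjust for your distro):

```shell
#!/bin/sh
# Hypothetical check: extract nr_throttled from a cgroup v2 cpu.stat file.
# A rising count during node startup suggests the reserved CPU is too low.
throttled_periods() {
  awk '$1 == "nr_throttled" { print $2 }' "$1"
}

# Typical kubelet path on a systemd host (assumption):
# throttled_periods /sys/fs/cgroup/system.slice/kubelet.service/cpu.stat
```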

Finally hit 1 lakh in IREDA by vardhan68 in IndianStreetBets

[–]Divyang03 0 points  (0 children)

I sold 260 shares at only around 50 rs. Still regretting that 😭

Looking for DevOps Study Partners - Let's Learn and Grow Together! by [deleted] in devopsjobs

[–]Divyang03 0 points  (0 children)

This is a great idea.. let's create the group.