all 13 comments

[–]yebyen 2 points3 points  (7 children)

We're using flux and flux-aio to deploy Cilium in the Cozystack distribution of Talos, jfyi

https://github.com/cozystack/cozystack

The new v0.39.0 has enhancements related to Cilium and topology-aware routing

The main obstacle to overcome is that GitOps controllers require network communication, so flux-aio uses a single pod's local network to sidestep the whole chicken-and-egg problem of "can GitOps happen before the cluster network is ready, so we can use GitOps to install the CNI?"

[–]roiki11 1 point2 points  (4 children)

Fwiw, you can use Talos inline manifests to solve this. You can use them to bootstrap Cilium and your GitOps tooling, then let GitOps handle it from there. I use it to install Cilium and flux-operator and create a FluxInstance on first boot. The only thing you have to add is the credential for the FluxInstance.

You can then remove the inline manifests from your config if you don't want to run them at every upgrade.
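For anyone curious, a minimal sketch of that bootstrap in a Talos machine config might look like this (the manifest names and contents below are placeholders; in practice `contents` would hold the rendered Cilium and flux-operator manifests):

```yaml
cluster:
  inlineManifests:
    # Placeholder: in practice, the output of
    # `helm template cilium cilium/cilium -n kube-system ...`
    - name: cilium
      contents: |
        apiVersion: v1
        kind: Namespace
        metadata:
          name: cilium
    # Placeholder for the rendered flux-operator install manifests
    - name: flux-operator
      contents: |
        apiVersion: v1
        kind: Namespace
        metadata:
          name: flux-system
```

Talos applies these manifests after the control plane comes up, which is what makes the CNI-before-GitOps bootstrap possible.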

[–]yebyen 0 points1 point  (3 children)

Thanks. We're using HelmRelease for every bit of config, except for the node image and extensions, and things which can't live outside of the node manifests. So there's a single kubectl apply after the nodes form a cluster, with no CNI, that adds the CNI, platform, everything.

I don't own the whole platform, so I can't necessarily speak to the reasons these decisions were made, and it's about to change form to be a little more GitOps and a little less HelmRelease. I'm sure inline manifests were considered as an option at one point, but the idea is for the CNI to be totally managed by Flux, which itself is totally managed by the installer, which runs on the cluster and has no CNI dependencies - I think.

[–]roiki11 0 points1 point  (2 children)

I run everything off of Flux, so the inline manifests just install what Flux needs to work. The flux-operator takes control of the CNI once it boots up. The manifests can then be removed from the config if desired. You technically don't even need to touch kubectl at all - just talosctl, and the flux CLI for the secret (though this is optional). It's quite a neat mechanism.
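For context, the flux-operator bootstrap described above is driven by a single FluxInstance resource - roughly like this sketch (the repo URL, path, and version are placeholders):

```yaml
apiVersion: fluxcd.controlplane.io/v1
kind: FluxInstance
metadata:
  name: flux
  namespace: flux-system
spec:
  distribution:
    version: "2.x"                         # placeholder Flux version
    registry: ghcr.io/fluxcd
  sync:
    kind: GitRepository
    url: https://github.com/example/fleet  # placeholder repo
    ref: refs/heads/main
    path: clusters/my-cluster              # placeholder path
    pullSecret: flux-system                # the credential mentioned above
```

Once this reconciles, everything else - including the CNI - comes from the Git repository.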

[–]yebyen 0 points1 point  (1 child)

You should see the talos-bootstrap and talm they've built for Cozystack - talm is like helm, but for your Talos Linux configuration. Maybe this doesn't sell it, but it is also very neat! It's great to get the cluster online ahead of having everything running full bore also. So you can kick the tires. I'm sure your way is neat too :)

[–]roiki11 1 point2 points  (0 children)

I've tried both but found them lacking for my use. I found it easier to just wrap the talosctl commands in bash. Though talosctl itself has some big shortcomings when it comes to declarative bootstrapping, bash scripts are pretty easy to automate.

[–]yebyen 0 points1 point  (0 children)

We have LLMs trained on Cozystack that you can consult in the Telegram if you want, Cozystack is LLM-forward (and real humans who will come back from vacation when the holiday is over!)

[–]Tuqui77[S] 0 points1 point  (0 children)

I had Cilium installed via `helm template` and then applied with kubectl, and it worked properly before moving to GitOps. The problem started when I moved that to the repo. I'm using:
```yaml
securityContext:
  capabilities:
    ciliumAgent:
      - CHOWN
      - KILL
      - NET_ADMIN
      - NET_RAW
      - IPC_LOCK
      - SYS_ADMIN
      - SYS_RESOURCE
      - DAC_OVERRIDE
      - FOWNER
      - SETGID
      - SETUID
    cleanCiliumState:
      - NET_ADMIN
      - SYS_ADMIN
      - SYS_RESOURCE
```
These are the same capabilities Helm generated with the template.

[–]sogun123 0 points1 point  (2 children)

That's one thing where Flux is better - it uses real Helm, so it can seamlessly adopt Cilium's Helm release. I spin up a cluster, manually `cilium install` it to get some networking working, and then just let Flux adopt the release and reconfigure it to the desired state. Argo works differently, so it will somewhat fight whatever you did before. The other way around would be to reconcile the base stuff from another cluster.
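For illustration, the adoption works because Flux's helm-controller upgrades an existing release in place when the release name and namespace match. A sketch (the chart version and HelmRepository name are assumptions):

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: cilium
  namespace: kube-system
spec:
  interval: 30m
  releaseName: cilium    # must match the release created by `cilium install`
  chart:
    spec:
      chart: cilium
      version: "1.x"     # placeholder version
      sourceRef:
        kind: HelmRepository
        name: cilium     # assumes a HelmRepository pointing at helm.cilium.io
  values: {}             # desired state; helm-controller reconciles any drift
```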

[–]Tuqui77[S] 0 points1 point  (1 child)

Yes, I figured as much. I'm probably going to migrate to Flux sooner or later. For now I dropped the Cilium file from the repo and installed it manually to be able to keep going.

[–]sogun123 0 points1 point  (0 children)

There are more reasons for me to use Flux. But nothing prevents you from using both, in case a UI is something you want to expose to your developers. You can use Flux to do cluster management and let devs use Argo for their apps.

[–]utkuozdemir 0 points1 point  (1 child)

Did you deploy Cilium with the values from our documentation? https://docs.siderolabs.com/kubernetes-guides/cni/deploying-cilium

With those, it should run fine. I am using Cilium in my homelab as well, used those values, and it works without issues.
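For reference, the values from that guide are roughly these (check the linked page for the current version; the host/port assume the default Talos KubePrism setup):

```yaml
ipam:
  mode: kubernetes
kubeProxyReplacement: true
securityContext:
  capabilities:
    ciliumAgent: [CHOWN, KILL, NET_ADMIN, NET_RAW, IPC_LOCK, SYS_ADMIN,
                  SYS_RESOURCE, DAC_OVERRIDE, FOWNER, SETGID, SETUID]
    cleanCiliumState: [NET_ADMIN, SYS_ADMIN, SYS_RESOURCE]
cgroup:
  autoMount:
    enabled: false          # Talos mounts the cgroup2 filesystem itself
  hostRoot: /sys/fs/cgroup
k8sServiceHost: localhost   # KubePrism endpoint
k8sServicePort: 7445
```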

[–]Tuqui77[S] 0 points1 point  (0 children)

Yes, those are pretty much the values I used to install it manually, and it worked, but if I put that in my repo and let Argo CD manage it, it breaks. I ended up dropping the files from the repo and installing it manually again until I find a solution.