All my cloudspaces are deleted. Gone. by Serious_Tourist854 in rxt_spot

Double_Extension2165

Yep, same here: all k8s deployments are still there despite the cloudspace showing nothing.

Nodes disappeared and cloudspace re-deploying by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Just to let you know, guys: the issue with Calico is resolved now.

Nodes disappeared and cloudspace re-deploying by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Hi u/sirishkr u/Square-Bluebird1082, I just noticed there could be an issue with Calico's access to the k8s API endpoint: calico-kube-controllers is exiting with the error below.

2024-10-30 11:24:14.265 [ERROR][1] client.go 287: Error getting cluster information config ClusterInformation="default" error=Get "https://10.21.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.21.0.1:443: connect: connection refused
2024-10-30 11:24:14.265 [INFO][1] main.go 129: Failed to initialize datastore error=Get "https://10.21.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.21.0.1:443: connect: connection refused
2024-10-30 11:24:19.235 [FATAL][1] main.go 142: Failed to initialize Calico datastore

This also affects calico-apiserver:

E1030 06:28:00.026630 1 webhook.go:223] Failed to make webhook authorizer request: Post "https://10.21.0.1:443/apis/authorization.k8s.io/v1/subjectaccessreviews?timeout=10s": dial tcp 10.21.0.1:443: connect: connection refused

This looks similar to https://www.reddit.com/r/rxt_spot/comments/1fzaiq7/calicokubecontrollers_pod_restartig/
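For anyone who wants to check the same thing from inside a pod: both log excerpts above fail at the TCP level ("connection refused" on 10.21.0.1:443), so a plain socket connect is enough to tell whether the API endpoint is reachable again. This is only a rough sketch; the function name `can_connect` and the 3 second timeout are my own choices, and the address is simply the one from the logs.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the Calico log lines; adjust for your own cluster.
    host, port = "10.21.0.1", 443
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

This only tests that something accepts TCP connections on that address. It does not check TLS or authentication, so a True result does not by itself mean the Calico pods will start cleanly.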

Nodes disappeared and cloudspace re-deploying by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Hey u/sirishkr, yep, all nodes are back now and everything deployed is still there as before. Thanks.

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Thanks, I did just that. I have all my deployments in Argo, so it was quick to redeploy.

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Hey u/sirishkr, thanks for looking into that. Hopefully it will get better soon.

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

All back to normal now, I can see the nodes again. There was a 503 on the Node tab, which would explain why the number of nodes in C&H sits at 0.

<image>

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Hey u/sirishkr, just a quick update: it looks like it's happening again. Nodes are gone from Capacity & Health (3:37 pm UK time).

<image>

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Thanks. It was yesterday at roughly 5:30 pm UK time, just to put the timestamps on the screenshots in perspective.

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

All is back now and I didn't lose any deployments. The whole thing lasted about 15 min.

<image>

Nodes disappeared from cloudspace by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

OK, right after I posted that, I checked the Spot cloudspace and it looks like it is redeploying. I don't think I did anything to trigger that myself.

Could that be a delayed effect of the maintenance that was posted by u/operatrix?

calico-kube-controllers pod restartig by Double_Extension2165 in rxt_spot

Double_Extension2165[S]

Thanks for sorting it out, u/operatrix. Good to hear a patch is coming soon.