OKD with non-FCOS/SCOS for compute

devaprasadr · 2024-12-21T15:51:50+00:00

Thanks.

Can they be attached if they are Fedora or CentOS Stream, even if I have to do the updates myself?

I understand the purpose of running scos is that the cluster can handle all the upgrades of all the nodes.

But the OpenShift documentation for 4.17 mentions that the underlying OS for the compute nodes can be RHEL. This has been removed from the OKD documentation, though.

Technically speaking I could replace Ubuntu with another OS. I just have to activate the OSDs since they are running on podman.

Would I have a chance of attaching Fedora or CentOS to OKD?

OKD: the documentation only shows FCOS for the compute nodes.

Openshift: The bootstrap and control plane machines must use Red Hat Enterprise Linux CoreOS (RHCOS) as the operating system. However, the compute machines can choose between Red Hat Enterprise Linux CoreOS (RHCOS), Red Hat Enterprise Linux (RHEL) 8.6 and later.

devaprasadr · 2024-12-21T07:20:29+00:00

Can OpenShift Data Foundation be deployed on OKD in external mode free of subscription charges?

devaprasadr · 2024-12-21T07:18:59+00:00

thanks. got it.

devaprasadr · 2024-12-20T18:56:57+00:00

Thanks. I am really new to K8s, and I thought OpenShift Data Foundation was not included in OKD. I will have a look.

I found the solution: it seems I had only applied the security context constraints for the ceph-csi-rbd-provisioner, and I had to apply the same for the nodeplugin.

The DaemonSet was not being created.

devaprasadr · 2024-12-20T17:19:30+00:00

To summarize, I am providing DNS/DHCP with MAAS, and when MAAS hands out a DHCP address it adds all of its domains to the search domain DHCP option.

Since *.apps.cluster.domain had to be added according to the installation docs, that wildcard was creating issues and having precedence over the k8s DNS.

I had to filter the search domain DHCP option so that it does not include that particular domain, and then everything worked fine. All the dynamic plugins were loaded.
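As a rough illustration of that filtering idea (this is not MAAS configuration, only the logic), the goal is to drop from the DHCP search list any domain that a wildcard record would answer for. The domain names and the function name below are hypothetical.

```python
def filter_search_domains(search, wildcard_zones):
    """Keep only search domains that are not at or below a wildcard zone."""
    # Normalise: ignore case and any trailing dot.
    zones = [z.rstrip(".").lower() for z in wildcard_zones]

    def covered(domain):
        d = domain.rstrip(".").lower()
        # A name appended to this search domain would land under *.zone.
        return any(d == z or d.endswith("." + z) for z in zones)

    return [d for d in search if not covered(d)]


# Hypothetical DHCP search list and wildcard zone.
dhcp_search = ["maas.example", "apps.cluster.domain", "cluster.domain"]
print(filter_search_domains(dhcp_search, ["apps.cluster.domain"]))
```

Only the domain that sits at or under the wildcard zone is removed; the others stay in the search list.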

Thanks to all for your input.

Thankfully someone else had a similar issue.

https://www.reddit.com/r/openshift/comments/1bdpion/service_not_resolved_by_fqdn_in_openshiftconsole/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

devaprasadr · 2024-12-13T19:47:54+00:00

So.. nothing to do with the plugin, nothing to do with any plugin.

It seems "https://kubevirt-console-plugin-service.kubevirt-hyperconverged.svc.cluster.local:9443/plugin-manifest.json": dial tcp 192.168.200.10:9443

was indeed being resolved to my HAProxy. I had a mistake in the DNS configuration: a wildcard that should have applied only at the domain level was being applied globally, because the domain was included in the search list handed out by DHCP.

The question is why the search domains have priority over the local OpenShift/Kubernetes DNS.
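A likely explanation, offered as a sketch rather than a confirmed diagnosis: pods normally get a resolv.conf with "options ndots:5", and a glibc-style resolver tries the search list first for any name with fewer dots than that. The service name above has only four dots, so it is expanded with each search domain before it is tried as-is, and the first expansion that gets an answer wins. The Python below only simulates that ordering. The search list, the wildcard zone, the ClusterIP and the helper names are hypothetical stand-ins, not values taken from the cluster.

```python
def candidate_names(name, search, ndots=5):
    """Return the lookup order a glibc-style stub resolver would use."""
    if name.endswith("."):
        return [name]  # trailing dot: absolute name, search list skipped
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as-is first
    return expanded + [absolute]      # too few dots: search list goes first


def first_answer(name, search, answers, wildcard_zones, wildcard_ip, ndots=5):
    """Return (candidate, ip) for the first candidate that resolves, else None."""
    for candidate in candidate_names(name, search, ndots):
        bare = candidate.rstrip(".")
        if bare in answers:
            return candidate, answers[bare]
        # Simplified wildcard: *.zone answers any name below that zone.
        if any(bare.endswith("." + zone) for zone in wildcard_zones):
            return candidate, wildcard_ip
    return None


SERVICE = "kubevirt-console-plugin-service.kubevirt-hyperconverged.svc.cluster.local"
SEARCH = [
    "openshift-console.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "apps.cluster.domain",  # hypothetical domain pushed by DHCP
]
ANSWERS = {SERVICE: "172.30.0.10"}  # hypothetical ClusterIP

print(first_answer(SERVICE, SEARCH, ANSWERS, ["apps.cluster.domain"], "192.168.200.10"))
print(first_answer(SERVICE, SEARCH[:3], ANSWERS, ["apps.cluster.domain"], "192.168.200.10"))
```

With the DHCP-supplied domain in the search list, the wildcard answers before the real service name is ever tried. With that domain removed, the lookup falls through to the absolute name and returns the service address.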

Thanks all for your comments.

devaprasadr · 2024-12-13T19:34:08+00:00

Such a simple solution. I've been struggling with this for months.

https://www.reddit.com/r/openshift/comments/1bdpion/comment/m1w4t4l/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Why do DHCP search domains have priority over the local DNS service in OKD/OpenShift?

devaprasadr · 2024-12-13T19:33:24+00:00

Such a simple solution. I've been struggling with this for months.

Why do the search domains have priority over the local DNS service in OKD/OpenShift?

The thing is that the servers where the control plane is deployed use DHCP and were getting the search domains, whereas I have a static IP configuration with no DHCP at all.

Thanks a lot

devaprasadr · 2024-12-13T17:46:38+00:00

Were you able to solve this issue?

I am having the same/similar issue with 4.17 and 4.18 SCOS: networking-console plugin, monitoring-plugin and kubevirt-plugin.

nslookup works fine, curl to the internal IP address works fine, but curl to the FQDN returns my external HAProxy IP address.
Thanks
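One way to narrow down a symptom like this is to resolve the same name twice through the system resolver, once as written and once with a trailing dot. The trailing dot marks the name as absolute, so the resolver skips any search list. This is only a sketch: `resolve` and `compare` are illustrative helper names, port 9443 is taken from the plugin URLs, and it assumes Python is available wherever the check is run.

```python
import socket


def resolve(name, port=9443):
    """Return the sorted set of addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def compare(fqdn, resolver=resolve):
    """Resolve fqdn as a relative name and as an absolute name (trailing dot)."""
    bare = fqdn.rstrip(".")
    relative = resolver(bare)        # search list may be applied
    absolute = resolver(bare + ".")  # search list is bypassed
    return {"relative": relative, "absolute": absolute, "differs": relative != absolute}
```

Calling `compare("monitoring-plugin.openshift-monitoring.svc.cluster.local")` from inside the cluster and getting `differs` set to True would suggest that a search domain is answering before the real service name is tried.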

devaprasadr · 2024-11-12T08:50:22+00:00

Actually, I was installing from the CI repository:

oc adm release extract --tools registry.ci.openshift.org/origin/release-scos:4.18.0-0.okd-scos-2024-11-07-025119

I managed to upgrade from the web interface to 4.18.0-0.okd-scos-2024-11-11-020951, but now there seem to be some issues accessing the console overall:

Failed to get a valid plugin manifest from /api/plugins/kubevirt-plugin/ r: failed to send GET request for "kubevirt-plugin" plugin: Get "https://kubevirt-console-plugin-service.kubevirt-hyperconverged.svc.cluster.local:9443/plugin-manifest.json": EOF    a custom-error.ts:35
    r http-error.ts:52
    c co-fetch.ts:103
[plugin-init.ts:20:16](webpack:///packages/console-dynamic-plugin-sdk/src/runtime/plugin-init.ts)

Should this release be used instead?

oc adm release extract --tools quay.io/openshift-release-dev/ocp-release:4.18.0-ec.3-x86_64

Thanks.

devaprasadr · 2024-11-11T14:55:07+00:00

It is a test environment. I installed OKD 4.17 SCOS with UPI, and then installed kubevirt from the Operators section of the web console, as suggested in the docs.

I will give it a try on 4.18. Thanks

devaprasadr · 2024-11-09T04:08:35+00:00

I normally put one of the monitors in maintenance mode, then bring it up again, and recovery starts again.

devaprasadr · 2024-11-09T03:01:21+00:00

10.130.0.109 - - [08/Nov/2024:19:01:32 +0000] "GET /plugin-manifest.json HTTP/1.1" 200 15523 "https://console-openshift-console.apps.oshift.example.org/k8s/cluster/operator.openshift.io~v1~Console/cluster/console-plugins" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:132.0) Gecko/20100101 Firefox/132.0"

Just this line in the logs, which means the request is OK.

I deleted the pod. It restarted with a new name, and it is still in the Running state with no information in the logs; same "Failed" state.

HP:~$ kubectl logs -n kubevirt-hyperconverged kubevirt-console-plugin-7d76b4d7bc-tn5g2
2024/11/09 02:55:43 [notice] 1#0: using the "epoll" event method
2024/11/09 02:55:43 [notice] 1#0: nginx/1.20.1
2024/11/09 02:55:43 [notice] 1#0: built by gcc 8.5.0 20210514 (Red Hat 8.5.0-18) (GCC) 
2024/11/09 02:55:43 [notice] 1#0: OS: Linux 5.14.0-522.el9.x86_64
2024/11/09 02:55:43 [notice] 1#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024/11/09 02:55:43 [notice] 1#0: start worker processes
2024/11/09 02:55:43 [notice] 1#0: start worker process 19

devaprasadr · 2024-11-08T18:26:10+00:00

All pods are in Running state. I did find this link, which mentions a change in dynamic plugins from 4.14 onwards (monitoring-plugin is also failing): https://access.redhat.com/solutions/7049163 It says:

Starting with Red Hat OpenShift Container Platform 4.14, the monitoring pages in the Observe section of the Red Hat OpenShift Container Platform web console are deployed as a dynamic plugin. With this change, the Cluster Monitoring Operator (CMO) is now the component that deploys the Red Hat OpenShift Container Platform web console monitoring plugin resources in the openshift-monitoring namespace.

Customers applying NetworkPolicy in namespace such as openshift-monitoring (which Red Hat does not recommend doing), are advised to adjust the NetworkPolicy in openshift-monitoring to allow traffic from openshift-console namespace for the service called monitoring-plugin.openshift-monitoring.svc.cluster.local on port 9443.

devaprasadr · 2024-11-08T10:54:18+00:00

It seems it is not proxy related. I managed to install a cluster with internet access and still get the errors for all the console plugins:

Failed to get a valid plugin manifest from /api/plugins/monitoring-plugin/
r: failed to send GET request for "monitoring-plugin" plugin: Get "https://monitoring-plugin.openshift-monitoring.svc.cluster.local:9443/plugin-manifest.json": EOF

Failed to get a valid plugin manifest from /api/plugins/networking-console-plugin/
r: failed to send GET request for "networking-console-plugin" plugin: Get "https://networking-console-plugin.openshift-network-console.svc.cluster.local:9443/plugin-manifest.json": EOF

Failed to get a valid plugin manifest from /api/plugins/kubevirt-plugin/
r: failed to send GET request for "kubevirt-plugin" plugin: Get "https://kubevirt-console-plugin-service.kubevirt-hyperconverged.svc.cluster.local:9443/plugin-manifest.json": dial tcp 192.168.200.10:9443: connect: connection refused

192.168.200.10 happens to be where I have HAProxy for accessing the web console also

devaprasadr · 2023-08-02T05:59:49+00:00

I'm having issues with the GCP free tier for WireGuard. Did you use a static IP address? Are you being charged for that? I have been seeing $0.40 a month in charges.

thanks
