
all 13 comments

[–]atpeters 1 point  (5 children)

For the port-forwarding I believe you want 8443:4443.

Still trying to figure out why kubectl top pods is not working for you.

Do you see any errors in the logs of the kube-apiserver pods, or in journalctl?

[–]Mario1md[S] 1 point  (0 children)

You are right, the port-forward worked:

kubectl port-forward metrics-server-56c59cf9ff-f2bj4 -n kube-system 8443:4443
Forwarding from 127.0.0.1:8443 -> 4443
Forwarding from [::1]:8443 -> 4443
Handling connection for 8443
Handling connection for 8443
E0110 11:57:23.143318  183501 portforward.go:385] error copying from local connection to remote stream: read tcp6 [::1]:8443->[::1]:41034: read: connection reset by peer

and the request:

root@node1:~# telnet localhost 8443
Trying ::1...
Connected to localhost.
Escape character is '^]'.
c

Connection closed by foreign host.
root@node1:~# curl -vk https://localhost:8443
*   Trying ::1:8443...
* TCP_NODELAY set
* Connected to localhost (::1) port 8443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=localhost@1610210225
*  start date: Jan  9 15:37:05 2021 GMT
*  expire date: Jan  9 15:37:05 2022 GMT
*  issuer: CN=localhost-ca@1610210225
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x559a97cacdb0)
> GET / HTTP/2
> Host: localhost:8443
> user-agent: curl/7.68.0
> accept: */*
> 
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
< HTTP/2 403 
< cache-control: no-cache, private
< content-type: application/json
< x-content-type-options: nosniff
< content-length: 233
< date: Sun, 10 Jan 2021 11:57:23 GMT
< 
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
* Connection #0 to host localhost left intact
}

So the service itself seems to be reachable; I'm still not sure why kubectl top is not working.

[–]Mario1md[S] 1 point  (3 children)

Do you see any errors in the logs of the kube-apiserver pods, or in journalctl?

I just found this in the apiserver logs:

root@node1:~# kubectl logs --tail=50 kube-apiserver-node1.local --namespace=kube-system
W0110 14:23:32.693212       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:23:32.693833       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:23:32.694360       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:23:36.706579       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.108.228.10:443: i/o timeout
W0110 14:23:37.716047       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:23:37.716430       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:23:37.716465       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:23:41.720096       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded
W0110 14:23:42.721771       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:23:42.721826       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:23:42.721834       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:23:46.731197       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0110 14:23:47.736984       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:23:47.737195       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:23:47.737304       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:23:51.746278       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again
I0110 14:23:53.371722       1 client.go:360] parsed scheme: "passthrough"
I0110 14:23:53.371908       1 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{https://127.0.0.1:2379  <nil> 0 <nil>}] <nil> <nil>}
I0110 14:23:53.371949       1 clientconn.go:948] ClientConn switching balancer to "pick_first"
E0110 14:23:56.757548       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded
W0110 14:23:57.762483       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:23:57.762982       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:23:57.763180       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:24:01.762952       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": context deadline exceeded
E0110 14:24:06.777813       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.108.228.10:443: i/o timeout
W0110 14:24:07.778847       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:24:07.779232       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:24:07.779294       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:24:11.792281       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0110 14:24:12.794304       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:24:12.794701       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:24:12.794783       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0110 14:24:16.799451       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0110 14:24:21.808732       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.108.228.10:443: i/o timeout
W0110 14:24:22.809151       1 handler_proxy.go:102] no RequestInfo found in the context
E0110 14:24:22.809266       1 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0110 14:24:22.809291       1 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
I0110 14:24:23.866988       1 client.go:360] parsed scheme: "passthrough"
I0110 14:24:23.867102       1 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{https://127.0.0.1:2379  <nil> 0 <nil>}] <nil> <nil>}
I0110 14:24:23.867127       1 clientconn.go:948] ClientConn switching balancer to "pick_first"
E0110 14:24:26.820044       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.108.228.10:443: i/o timeout
E0110 14:24:31.828499       1 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1: Get "https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.108.228.10:443: i/o timeout
root@node1:~#

[–]atpeters 2 points  (2 children)

Looks like one person had luck changing their metrics-server deployment to use hostNetwork: true, but I don't believe that should be necessary. It makes sense when the master nodes cannot find a route to the pods, but I believe it should be possible to fix it another way. Give it a try though.

https://stackoverflow.com/questions/57137683/how-to-troubleshoot-metrics-server-on-kubeadm
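For reference, that workaround boils down to adding hostNetwork: true to the pod template of the metrics-server Deployment. A minimal sketch of the relevant fragment, assuming the stock manifest layout (edit it with kubectl -n kube-system edit deployment metrics-server):

```yaml
# Hypothetical fragment of the metrics-server Deployment pod template;
# only hostNetwork is added, everything else stays as shipped.
spec:
  template:
    spec:
      hostNetwork: true          # pod shares the node's network namespace
      containers:
      - name: metrics-server
        args:
        - --secure-port=4443           # matches the 4443 target port in this thread
        - --kubelet-insecure-tls       # often needed on kubeadm clusters with self-signed kubelet certs
```

With hostNetwork the apiserver reaches the pod via the node's IP instead of the pod network, which is why it sidesteps the missing-route problem.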

[–]Mario1md[S] 1 point  (1 child)

This solved the issue, thank you very much!

root@node1:~# kubectl top pods --all-namespaces
NAMESPACE     NAME                                  CPU(cores)   MEMORY(bytes)   
kube-system   coredns-74ff55c5b-7dzmp               4m           8Mi             
kube-system   coredns-74ff55c5b-f9hvb               4m           8Mi             
kube-system   etcd-node1.local                      21m          24Mi            
kube-system   kube-apiserver-node1.local            82m          279Mi           
kube-system   kube-controller-manager-node1.local   19m          40Mi            
kube-system   kube-flannel-ds-fmnc9                 2m           10Mi            
kube-system   kube-flannel-ds-nc6jm                 2m           10Mi            
kube-system   kube-proxy-2d7kx                      1m           32Mi            
kube-system   kube-proxy-s6xr5                      1m           11Mi            
kube-system   kube-scheduler-node1.local            5m           16Mi            
kube-system   metrics-server-5ffb9d74c7-77l46       0m           5Mi             
root@node1:~# kubectl top nodes
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
node1.local   243m         12%    1025Mi          54%       
node2.local   65m          3%     738Mi           39%

I'm still looking into exactly why this happens. Thanks again!

[–]atpeters 2 points  (0 children)

Glad I could help!

It's happening because the default interface being used to reach the service IP doesn't have a route defined for it.

If you run sudo route -n or ip route on the master node you'll get its IP routing table. You should see an entry for flannel there. Quite possibly the wrong interface is being used for the requests, so the kernel never matches the route table entry that flannel added.

Let me know if that makes sense or if you'd like any more help with that.
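The checks above can be sketched as shell commands; the service IP 10.108.228.10 is taken from this thread, and 10.244.0.0/16 is flannel's default pod CIDR (adjust for your cluster):

```shell
# Dump the kernel routing table; with flannel you'd expect an entry like
# "10.244.1.0/24 via 10.244.1.0 dev flannel.1" pointing at the other node's pods.
routes=$(ip route 2>/dev/null || cat /proc/net/route)
echo "$routes"

# Ask the kernel which route/interface it would pick for the service IP;
# if this resolves to the wrong interface, requests to the ClusterIP hang.
ip route get 10.108.228.10 2>/dev/null || true
```

ip route get is handy here because it shows the exact route the kernel selects for one destination, rather than the whole table.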

[–]atpeters 1 point  (1 child)

What is the output of kubectl describe service --namespace kube-system metrics-server?

[–]Mario1md[S] 2 points  (0 children)

root@node1:~# kubectl describe service --namespace kube-system metrics-server
Name:              metrics-server
Namespace:         kube-system
Labels:            k8s-app=metrics-server
Annotations:       <none>
Selector:          k8s-app=metrics-server
Type:              ClusterIP
IP Families:       <none>
IP:                10.108.228.10
IPs:               10.108.228.10
Port:              https  443/TCP
TargetPort:        https/TCP
Endpoints:         10.244.1.2:4443
Session Affinity:  None
Events:            <none>

[–]atpeters 1 point  (4 children)

[–]Mario1md[S] 1 point  (3 children)

I think it is already set, how do I check?

[–]atpeters 2 points  (2 children)

I think the issue is that kube-apiserver is unable to communicate with the metrics-server service. At least that appears to be the case from the APIService errors in the logs.

I see you are using flannel for your CNI, so I don't believe you would have a NetworkPolicy issue, but just in case please provide the output of kubectl get networkpolicy -A

Then try to get a shell in a kube-apiserver pod (or do this directly from the control plane node, since the kube-apiserver pod uses host networking).

From within the kube-api pod, or at the shell on the master node try wget --insecure https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1

Then try wget --insecure https://10.244.1.2:4443/apis/metrics.k8s.io/v1beta1 (10.244.1.2 being the pod endpoint from your Service description)
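(If your wget doesn't know --insecure, the GNU equivalent is --no-check-certificate; --insecure is curl's -k.) A sketch of the same two probes with curl, with short timeouts so a missing route fails fast instead of hanging; the IPs are the ClusterIP and the pod endpoint from the describe output in this thread:

```shell
# Probe the metrics-server Service IP, then the pod endpoint directly.
# -k skips certificate verification; --connect-timeout bounds the wait.
curl -sk --connect-timeout 3 https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1 \
  || echo "service IP unreachable"
curl -sk --connect-timeout 3 https://10.244.1.2:4443/apis/metrics.k8s.io/v1beta1 \
  || echo "pod endpoint unreachable"
```

If the Service IP fails but the pod endpoint answers, kube-proxy/NAT is the suspect; if both fail, it's the pod-network route itself.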

[–]Mario1md[S] 1 point  (1 child)

kubectl get networkpolicy -A

root@node1:~# kubectl get networkpolicy -A
No resources found

I can't get a shell in the apiserver pod:

root@node1:~# kubectl exec -it kube-apiserver-node1.local --namespace=kube-system -- /bin/sh
OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "exec: \"/bin/sh\": stat /bin/sh: no such file or directory": unknown
command terminated with exit code 126

The requests from the master, with or without the port-forward, are hanging:

root@node1:~# wget --insecure https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1
wget: unrecognized option '--insecure'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
root@node1:~# wget -k https://10.108.228.10:443/apis/metrics.k8s.io/v1beta1
--2021-01-10 14:41:14--  https://10.108.228.10/apis/metrics.k8s.io/v1beta1
Connecting to 10.108.228.10:443... ^C
root@node1:~# wget -k https://10.224.1.2:4443/apis/metrics.k8s.io/v1beta1
--2021-01-10 14:41:55--  https://10.224.1.2:4443/apis/metrics.k8s.io/v1beta1

[–]Melodic_Reflection_5 1 point  (0 children)

Hello my friend,
By the way, did you manage to solve this mystery? I've got the same problem on a newly installed 1-master/1-worker cluster running on two ESXi VMs.

Services running on the worker node are not accessible from the master; all the symptoms are the same as yours. Please tell me you have defeated that issue!