linuxea: Deploying metrics-server on Kubernetes (46)

metrics-server

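metrics-server is an API server exposed to users. It serves resource metrics only: it does not serve the Kubernetes API, nor a pod API, but only objects such as CPU utilization and memory usage.

metrics-server is not a component of Kubernetes itself; it is just a pod hosted on top of Kubernetes. So that users can consume the metrics-server API seamlessly from within Kubernetes, the new architecture is organized as follows:

Kubernetes keeps running as usual, and metrics-server runs alongside it and provides another set of APIs. To merge these two sets of APIs so they can be used as one, a proxy layer is placed in front of them. This proxy is called the aggregator (kube-aggregator). The aggregator is not limited to metrics-server; other third-party API servers can be aggregated as well.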

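The resource metrics exposed through the aggregator live at /apis/metrics.k8s.io/v1beta1. Kubernetes does not provide this endpoint by default; metrics-server provides /apis/metrics.k8s.io/v1beta1, while Kubernetes provides the native API groups. The two API servers are combined by kube-aggregator, so a user who goes through kube-aggregator can reach both the native API groups and the extra group provided by metrics-server.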
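As a minimal sketch of what "going through the aggregator" looks like from the client side, the helpers below build a path under /apis/metrics.k8s.io/v1beta1 and request it with kubectl get --raw. The function names metrics_api_path and raw_metrics are made up for this illustration, and the calls only return data once metrics-server has been deployed as described later in this post.

```shell
# Build the request path for the resource metrics API group.
#   $1: "nodes" or "pods"
#   $2: optional namespace (only meaningful for pods)
metrics_api_path() {
  if [ "$1" = "pods" ] && [ -n "${2:-}" ]; then
    echo "/apis/metrics.k8s.io/v1beta1/namespaces/$2/pods"
  else
    echo "/apis/metrics.k8s.io/v1beta1/$1"
  fi
}

# Send the request to the main API server; the aggregator forwards
# anything under /apis/metrics.k8s.io/ to metrics-server.
raw_metrics() {
  kubectl get --raw "$(metrics_api_path "$@")"
}
```

For example, raw_metrics nodes asks for node metrics and raw_metrics pods kube-system asks for pod metrics in the kube-system namespace. The client never talks to the metrics-server service directly.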

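In fact, other APIs can be added in the same way by registering them under kube-aggregator. Now that heapster is deprecated, metrics-server becomes a prerequisite for several core Kubernetes features, such as kubectl top; without metrics they do not work. To feed data to these components, metrics-server has to be deployed.

Deployment

We can take metrics-server from the Kubernetes source tree, or clone the standalone metrics-server repository. The two Git addresses are different, and so are their contents.

1. kubernetes-incubator

Clone the metrics-server code from GitHub, then deploy it with the 1.8+ manifests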

[root@linuxea ~]# git clone https://github.com/kubernetes-incubator/metrics-server.git

Deploy all the YAML files under /root/metrics-server/deploy/1.8+ with kubectl apply -f ./

[root@linuxea ~]# cd /root/metrics-server/deploy/1.8+
[root@linuxea 1.8+]# kubectl apply -f ./
  • Make sure the metrics-server service has been created successfully
[root@linuxea 1.8+]# kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP   53d
kubernetes-dashboard   NodePort    10.101.194.113   <none>        443:31780/TCP   32d
metrics-server         ClusterIP   10.99.129.34     <none>        443/TCP         1m
  • Make sure the metrics-server pod (metrics-server-85cc795fbf-7srw2 here) is running
[root@linuxea 1.8+]# kubectl get pods -n kube-system
NAME                                           READY     STATUS    RESTARTS   AGE
metrics-server-85cc795fbf-7srw2                1/1       Running   0          1m
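The check above can also be scripted. The sketch below reads the output of kubectl get pods on standard input and succeeds only if a metrics-server pod is in the Running state. The function name metrics_server_running is hypothetical, and the column layout is assumed to match the listing shown here (NAME, READY, STATUS, ...).

```shell
# Succeeds if stdin (output of `kubectl get pods`) contains a
# metrics-server pod whose STATUS column is Running.
# The optional "pod/" prefix covers `kubectl get pods,svc` output.
metrics_server_running() {
  awk '$1 ~ /^(pod\/)?metrics-server/ && $3 == "Running" { found = 1 }
       END { exit !found }'
}
```

A possible use is kubectl get pods -n kube-system | metrics_server_running && echo "metrics-server is up".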

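2. metrics-server from cluster/addons/metrics-server in the Kubernetes source tree

Take metrics-server from the Kubernetes source tree

Note that I am using Kubernetes v1.11.1 here (the cluster was reinstalled several times along the way), with docker://18.05.0-ce

The metrics-server and metrics-server-nanny versions are as follows:

Tip: if you are not on this version, and especially if you are on a newer one, read the documentation on GitHub, or check the source and the YAML files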

  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.1
  - name: metrics-server-nanny
    image: k8s.gcr.io/addon-resizer:1.8.3

Download just these files

auth-delegator.yaml
auth-reader.yaml
metrics-apiservice.yaml
metrics-server-deployment.yaml
metrics-server-service.yaml
resource-reader.yaml
[root@linuxea metrics-server]# for i in auth-delegator.yaml auth-reader.yaml metrics-apiservice.yaml  metrics-server-deployment.yaml metrics-server-service.yaml resource-reader.yaml;do wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/$i;done

If you run into errors like the following, the changes described below may help:

403 Forbidden", response: "Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=stats)
E0903  1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:<hostname>: unable to fetch metrics from Kubelet <hostname> (<hostname>): Get https://<hostname>:10250/stats/summary/: dial tcp: lookup <hostname> on 10.96.0.10:53: no such host
no response from https://10.101.248.96:443: Get https://10.101.248.96:443: Proxy Error ( Connection refused )
E1109 09:54:49.509521       1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:linuxea.node-2.com: unable to fetch metrics from Kubelet linuxea.node-2.com (10.10.240.203): Get https://10.10.240.203:10255/stats/summary/: dial tcp 10.10.240.203:10255: connect: connection refused, unable to fully scrape metrics from source kubelet_summary:linuxea.node-3.com: unable to fetch metrics from Kubelet linuxea.node-3.com (10.10.240.143): Get https://10.10.240.143:10255/stats/summary/: dial tcp 10.10.240.143:10255: connect: connection refused, unable to fully scrape metrics from source kubelet_summary:linuxea.node-4.com: unable to fetch metrics from Kubelet linuxea.node-4.com (10.10.240.142): Get https://10.10.240.142:10255/stats/summary/: dial tcp 10.10.240.142:10255: connect: connection refused, unable to fully scrape metrics from source kubelet_summary:linuxea.master-1.com: unable to fetch metrics from Kubelet linuxea.master-1.com (10.10.240.161): Get https://10.10.240.161:10255/stats/summary/: dial tcp 10.10.240.161:10255: connect: connection refused, unable to fully scrape metrics from source kubelet_summary:linuxea.node-1.com: unable to fetch metrics from Kubelet linuxea.node-1.com (10.10.240.202): Get https://10.10.240.202:10255/stats/summary/: dial tcp 10.10.240.202:10255: connect: connection refused]
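To make the link between these messages and the changes in this post explicit, here is a rough sketch that maps a log line to the corresponding change. The function name suggest_fix is invented for this example, the matching is plain substring matching on the messages quoted above, and the mapping is my reading of this post rather than an official troubleshooting table. In particular, the hint for port 10255 is an assumption based on the final manifest, where that flag stays commented out.

```shell
# Print the change from this post that most likely addresses a log line.
#   $1: one line from the metrics-server log
suggest_fix() {
  case "$1" in
    *"subresource=stats"*)
      echo "add nodes/stats to the ClusterRole in resource-reader.yaml" ;;
    *"no such host"*)
      echo "add --kubelet-preferred-address-types with InternalIP first" ;;
    *":10255: connect: connection refused"*)
      echo "keep --kubelet-port=10255 commented out" ;;
    *)
      echo "no suggestion" ;;
  esac
}
```

For example, kubectl logs output can be fed to it line by line with while IFS= read -r line; do suggest_fix "$line"; done.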

We change a few parameters to configure it

Edit the command arguments in metrics-server-deployment.yaml to set the CPU and memory sizes

        command:
          - /pod_nanny
          - --config-dir=/etc/config
          - --cpu=100m
          - --extra-cpu=0.5m
          - --memory=100Mi
          - --extra-memory=50Mi
          - --threshold=5
          - --deployment=metrics-server-v0.3.1
          - --container=metrics-server
          - --poll-period=300000
          - --estimator=exponential
          # Specifies the smallest cluster (defined in number of nodes)
          # resources will be scaled to.
          - --minClusterSize=10
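As a rough worked example of what these numbers mean, the sketch below assumes that the nanny sizes the container as base + extra × node count, and that a cluster smaller than --minClusterSize is sized as if it had that many nodes. Both points are my understanding of addon-resizer, not something stated in this post, and the stepping that the exponential estimator applies to larger clusters is not modeled. The function name nanny_estimate is hypothetical.

```shell
# Estimate the amount the nanny would request for one resource.
#   $1: base amount (--cpu or --memory, number only)
#   $2: extra amount per node (--extra-cpu or --extra-memory)
#   $3: actual node count
#   $4: --minClusterSize
nanny_estimate() {
  awk -v base="$1" -v extra="$2" -v nodes="$3" -v min="$4" 'BEGIN {
    if (nodes + 0 < min + 0) nodes = min   # small clusters are sized as min
    printf "%g\n", base + extra * nodes
  }'
}

nanny_estimate 100 0.5 5 10   # CPU in millicores, 5-node cluster
nanny_estimate 100 50 5 10    # memory in Mi, 5-node cluster
```

Under these assumptions the two calls print 105 and 600, that is, about 105m of CPU and 600Mi of memory for the five-node cluster used here.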

Also edit the section of the metrics-server-amd64:v0.3.1 container and add the following:

 - --kubelet-insecure-tls
 - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP

The final result looks like this:

    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        command:
        - /metrics-server
        - --metric-resolution=30s
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        # These are needed for GKE, which doesn't support secure communication yet.
        # Remove these lines for non-GKE clusters, and when GKE supports token-based auth.
        #- --kubelet-port=10255
        #- --deprecated-kubelet-completely-insecure=true

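- --kubelet-insecure-tls disables TLS verification, so it is generally not recommended in production. And because DNS cannot resolve these hostnames, - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP is used as a workaround. Another option is to modify coredns, but I do not recommend doing that.

See: https://github.com/kubernetes-incubator/metrics-server/issues/131

In addition, add - nodes/stats to resource-reader.yaml, as follows: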

[root@linuxea metrics-server]# cat resource-reader.yaml 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces

See: https://github.com/kubernetes-incubator/metrics-server/issues/95
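Before applying, it can be worth confirming that the edit actually made it into the file. The following sketch simply greps the manifest for a nodes/stats list entry; the function name grants_nodes_stats is made up for this example, and it checks the text of the file only, not what the cluster has loaded.

```shell
# Succeeds if the given manifest lists nodes/stats as a resource entry.
#   $1: path to resource-reader.yaml
grants_nodes_stats() {
  grep -Eq '^[[:space:]]*-[[:space:]]+nodes/stats[[:space:]]*$' "$1"
}
```

For example: grants_nodes_stats resource-reader.yaml || echo "nodes/stats is missing". Once the role is applied, kubectl auth can-i get nodes --subresource=stats --as=system:serviceaccount:kube-system:metrics-server should be able to confirm the permission on the cluster side, assuming the service account is named metrics-server in kube-system as in the manifest above.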

apply

[root@linuxea metrics-server]# pwd
/root/metrics-server
[root@linuxea metrics-server]# kubectl apply -f .
[root@linuxea metrics-server]# kubectl get pods,svc -n kube-system 
NAME                                               READY   STATUS    RESTARTS   AGE
pod/coredns-576cbf47c7-65ndt                       1/1     Running   0          2m18s
pod/coredns-576cbf47c7-rrk4f                       1/1     Running   0          2m18s
pod/etcd-linuxea.master-1.com                      1/1     Running   0          89s
pod/kube-apiserver-linuxea.master-1.com            1/1     Running   0          97s
pod/kube-controller-manager-linuxea.master-1.com   1/1     Running   0          84s
pod/kube-flannel-ds-amd64-4dtgp                    1/1     Running   0          115s
pod/kube-flannel-ds-amd64-6g2sm                    1/1     Running   0          48s
pod/kube-flannel-ds-amd64-7txhx                    1/1     Running   0          50s
pod/kube-flannel-ds-amd64-fs4lw                    1/1     Running   0          57s
pod/kube-flannel-ds-amd64-v2qvv                    1/1     Running   0          48s
pod/kube-proxy-bmhfh                               1/1     Running   0          2m18s
pod/kube-proxy-c9wkz                               1/1     Running   0          50s
pod/kube-proxy-d8vlj                               1/1     Running   0          57s
pod/kube-proxy-rpst5                               1/1     Running   0          48s
pod/kube-proxy-t5pzg                               1/1     Running   0          48s
pod/kube-scheduler-linuxea.master-1.com            1/1     Running   0          97s
pod/metrics-server-v0.3.1-69788f46f9-82w76         2/2     Running   0          15s

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
service/kube-dns         ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP   2m32s
service/metrics-server   ClusterIP   10.103.131.149   <none>        443/TCP         19s
[root@linuxea metrics-server]# kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
metrics-server         ClusterIP   10.98.186.115    <none>        443/TCP         42s

At this point, the metrics.k8s.io/v1beta1 group provided by metrics-server shows up in the output of kubectl api-versions

[root@linuxea metrics-server]# kubectl api-versions|grep metrics
metrics.k8s.io/v1beta1

Everything is now in place, so we can try looking at the collected data

[root@linuxea metrics-server]# kubectl top pods
NAME                           CPU(cores)   MEMORY(bytes)   
linuxea-hpa-68ffdc8b94-jjfw7   1m           103Mi           
linuxea-hpa-68ffdc8b94-mbgc8   1m           99Mi            
linuxea-hpa-68ffdc8b94-trtkm   1m           101Mi           
linuxea-hpa-68ffdc8b94-twcxx   1m           100Mi           
linuxea-hpa-68ffdc8b94-w9d7j   1m           100Mi      
[root@linuxea metrics-server]# kubectl top nodes
NAME                   CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
linuxea.master-1.com   197m         4%        3213Mi          41%       
linuxea.node-1.com     60m          1%        939Mi           24%       
linuxea.node-2.com     58m          1%        1066Mi          27%       
linuxea.node-3.com     127m         3%        673Mi           17%       
linuxea.node-4.com     47m          1%        664Mi           17% 
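Since the output of kubectl top is plain columns, it is easy to post-process. As a small sketch, the helper below reads kubectl top nodes output on standard input and prints the nodes whose MEMORY% is at or above a threshold. The name nodes_over_memory is invented here, and the column order is assumed to be the one shown above.

```shell
# Print node names whose MEMORY% (5th column) is >= the given threshold.
#   $1: threshold in percent, e.g. 25
nodes_over_memory() {
  awk -v limit="$1" 'NR > 1 {
    pct = $5
    sub(/%$/, "", pct)                 # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1
  }'
}
```

With the listing above, kubectl top nodes | nodes_over_memory 25 would print linuxea.master-1.com and linuxea.node-2.com.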
Date: 2018-12-10  Category: kubernetes
