Internet Technology / Internet News · March 21, 2024

Hands-On: Installing VictoriaMetrics in K8s

Background

In an earlier post I introduced VictoriaMetrics and some caveats around installing it; today we will walk through an actual installation in K8s. This walkthrough installs a cluster version of VictoriaMetrics on a cloud-hosted K8s cluster and uses the cloud provider's load balancer.

Note: VictoriaMetrics is abbreviated as VM below.

Preparation

- A K8s cluster; mine is v1.20.6
- A StorageClass prepared in the cluster; I am using NFS here
- Operator image tag v0.17.2; vmstorage, vmselect, and vminsert image tag v1.63.0. You can pull the images in advance and push them to a local registry

Installation Notes

VM can be installed in several ways: from a binary, a Docker image, or source. Choose whichever fits your scenario. When installing in K8s, we can install it directly with the operator. Below are the key points to watch during installation.

A minimal cluster must contain the following nodes:

- One vmstorage node, with the -retentionPeriod and -storageDataPath flags specified
- One vminsert node, with -storageNode= specified
- One vmselect node, with -storageNode= specified

Note: for high availability, it is recommended to run at least two nodes of each service.

A load balancer is needed in front of vmselect and vminsert, e.g. vmauth or nginx. Here we use the cloud provider's load balancer. The following routing is required:

- Requests whose path starts with /insert must be routed to port 8480 on the vminsert nodes
- Requests whose path starts with /select must be routed to port 8481 on the vmselect nodes

Note: each service's listen port can be changed with -httpListenAddr.
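The article mentions nginx as one load-balancer option; a minimal sketch of the two routing rules above might look like the following (the upstream server names are placeholders, not names from this setup):

```nginx
upstream vminsert_backend {
    server vminsert-1:8480;
    server vminsert-2:8480;
}

upstream vmselect_backend {
    server vmselect-1:8481;
    server vmselect-2:8481;
}

server {
    listen 80;

    # writes go to vminsert
    location /insert {
        proxy_pass http://vminsert_backend;
    }

    # reads go to vmselect
    location /select {
        proxy_pass http://vmselect_backend;
    }
}
```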

It is recommended to set up monitoring for the cluster itself.

If you install a test cluster on a single host, the -httpListenAddr flags of vminsert, vmselect, and vmstorage must each be unique, and vmstorage's -storageDataPath, -vminsertAddr, and -vmselectAddr flags must each have unique values.
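A minimal single-host sketch of those constraints, assuming the three binaries sit in the current directory (the ports and data path are illustrative, not mandatory):

```shell
# vmstorage: data path plus the ports that vminsert/vmselect dial in on
./vmstorage -retentionPeriod=1 -storageDataPath=/tmp/vmstorage-data \
  -httpListenAddr=:8482 -vminsertAddr=:8400 -vmselectAddr=:8401 &

# vminsert: its own unique HTTP port, pointed at vmstorage's insert port
./vminsert -storageNode=127.0.0.1:8400 -httpListenAddr=:8480 &

# vmselect: its own unique HTTP port, pointed at vmstorage's select port
./vmselect -storageNode=127.0.0.1:8401 -httpListenAddr=:8481 &
```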

When the free space under vmstorage's -storageDataPath directory drops below the value specified by -storage.minFreeDiskSpaceBytes, the node switches to read-only mode; vminsert stops sending data to such nodes and sends it to the other available vmstorage nodes instead.
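When installing via the operator, component flags such as this one can be passed through the extraArgs field of the VMCluster CR; a sketch (the 10GB threshold here is just an example value, not from this setup):

```yaml
# Fragment of a VMCluster spec; each extraArgs entry becomes a CLI flag
spec:
  vmstorage:
    extraArgs:
      storage.minFreeDiskSpaceBytes: "10GB"
```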

Installation Steps

Installing VM

1. Create the CRDs

```shell
# Download the install bundle
export VM_VERSION=`basename $(curl -fs -o/dev/null -w %{redirect_url} https://github.com/VictoriaMetrics/operator/releases/latest)`
wget https://github.com/VictoriaMetrics/operator/releases/download/$VM_VERSION/bundle_crd.zip
unzip bundle_crd.zip
kubectl apply -f release/crds

# Check the CRDs
[root@test opt]# kubectl get crd | grep vm
vmagents.operator.victoriametrics.com                2022-01-05T07:26:01Z
vmalertmanagerconfigs.operator.victoriametrics.com   2022-01-05T07:26:01Z
vmalertmanagers.operator.victoriametrics.com         2022-01-05T07:26:01Z
vmalerts.operator.victoriametrics.com                2022-01-05T07:26:01Z
vmauths.operator.victoriametrics.com                 2022-01-05T07:26:01Z
vmclusters.operator.victoriametrics.com              2022-01-05T07:26:01Z
vmnodescrapes.operator.victoriametrics.com           2022-01-05T07:26:01Z
vmpodscrapes.operator.victoriametrics.com            2022-01-05T07:26:01Z
vmprobes.operator.victoriametrics.com                2022-01-05T07:26:01Z
vmrules.operator.victoriametrics.com                 2022-01-05T07:26:01Z
vmservicescrapes.operator.victoriametrics.com        2022-01-05T07:26:01Z
vmsingles.operator.victoriametrics.com               2022-01-05T07:26:01Z
vmstaticscrapes.operator.victoriametrics.com         2022-01-05T07:26:01Z
vmusers.operator.victoriametrics.com                 2022-01-05T07:26:01Z
```

2. Install the operator

```shell
# Install the operator. Remember to change the operator image address first
kubectl apply -f release/operator/

# After installing, check that the operator is running
[root@test opt]# kubectl get po -n monitoring-system
vm-operator-76dd8f7b84-gsbfs              1/1     Running   0          25h
```

3. Install the VMCluster

Once the operator is installed, build your own CRs according to your needs. Here I install a VMCluster. First, let's look at the VMCluster install file:

```yaml
# cat vmcluster-install.yaml
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: vmcluster
  namespace: monitoring-system
spec:
  replicationFactor: 1
  retentionPeriod: "4"
  vminsert:
    image:
      pullPolicy: IfNotPresent
      repository: images.huazai.com/release/vminsert
      tag: v1.63.0
    podMetadata:
      labels:
        victoriametrics: vminsert
    replicaCount: 1
    resources:
      limits:
        cpu: "1"
        memory: 1000Mi
      requests:
        cpu: 500m
        memory: 500Mi
  vmselect:
    cacheMountPath: /select-cache
    image:
      pullPolicy: IfNotPresent
      repository: images.huazai.com/release/vmselect
      tag: v1.63.0
    podMetadata:
      labels:
        victoriametrics: vmselect
    replicaCount: 1
    resources:
      limits:
        cpu: "1"
        memory: 1000Mi
      requests:
        cpu: 500m
        memory: 500Mi
    storage:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 2G
          storageClassName: nfs-csi
          volumeMode: Filesystem
  vmstorage:
    image:
      pullPolicy: IfNotPresent
      repository: images.huazai.com/release/vmstorage
      tag: v1.63.0
    podMetadata:
      labels:
        victoriametrics: vmstorage
    replicaCount: 1
    resources:
      limits:
        cpu: "1"
        memory: 1500Mi
      requests:
        cpu: 500m
        memory: 750Mi
    storage:
      volumeClaimTemplate:
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 20G
          storageClassName: nfs-csi
          volumeMode: Filesystem
    storageDataPath: /vm-data
```

```shell
# Install the VMCluster
kubectl apply -f vmcluster-install.yaml

# Check the VMCluster install result
[root@test opt]# kubectl get po -n monitoring-system
NAME                                      READY   STATUS    RESTARTS   AGE
vm-operator-76dd8f7b84-gsbfs              1/1     Running   0          26h
vminsert-vmcluster-main-69766c8f4-r795w   1/1     Running   0          25h
vmselect-vmcluster-main-0                 1/1     Running   0          25h
vmstorage-vmcluster-main-0                1/1     Running   0          25h
```

4. Create the vminsert and vmselect Services

 

```shell
# Check the Services that were created
[root@test opt]# kubectl get svc -n monitoring-system
NAME                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
vminsert-vmcluster-main         ClusterIP   10.0.182.73    <none>        8480/TCP                     25h
vmselect-vmcluster-main         ClusterIP   None           <none>        8481/TCP                     25h
vmstorage-vmcluster-main        ClusterIP   None           <none>        8482/TCP,8400/TCP,8401/TCP   25h
```

To let multiple K8s clusters store their data in this VM, and to make later queries convenient, we create two additional Services of type NodePort: vminsert-lbsvc and vmselect-lbsvc. The cloud LB is then configured to listen on ports 8480 and 8481, with the backend servers being the node IPs of the cluster where VM runs and the backend ports being the NodePorts exposed by vminsert-lbsvc and vmselect-lbsvc. Workloads in the same K8s cluster as VM (e.g. opentelemetry) can still write via vminsert-vmcluster-main.monitoring-system.svc.cluster.local:8480; workloads in other K8s clusters write via lb:8480.

```yaml
# cat vminsert-lb-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: monitoring
    app.kubernetes.io/instance: vmcluster-main
    app.kubernetes.io/name: vminsert
  name: vminsert-vmcluster-main-lbsvc
  namespace: monitoring-system
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: http
    nodePort: 30135
    port: 8480
    protocol: TCP
    targetPort: 8480
  selector:
    app.kubernetes.io/component: monitoring
    app.kubernetes.io/instance: vmcluster-main
    app.kubernetes.io/name: vminsert
  sessionAffinity: None
  type: NodePort
```

```yaml
# cat vmselect-lb-svc.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: monitoring
    app.kubernetes.io/instance: vmcluster-main
    app.kubernetes.io/name: vmselect
  name: vmselect-vmcluster-main-lbsvc
  namespace: monitoring-system
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: http
    nodePort: 31140
    port: 8481
    protocol: TCP
    targetPort: 8481
  selector:
    app.kubernetes.io/component: monitoring
    app.kubernetes.io/instance: vmcluster-main
    app.kubernetes.io/name: vmselect
  sessionAffinity: None
  type: NodePort
```

```shell
# Create the Services
kubectl apply -f vmselect-lb-svc.yaml
kubectl apply -f vminsert-lb-svc.yaml

# !! Configure the cloud LB yourself

# Finally, check the VM-related pods and Services
[root@test opt]# kubectl get po,svc -n monitoring-system
NAME                                          READY   STATUS    RESTARTS   AGE
pod/vm-operator-76dd8f7b84-gsbfs              1/1     Running   0          30h
pod/vminsert-vmcluster-main-69766c8f4-r795w   1/1     Running   0          29h
pod/vmselect-vmcluster-main-0                 1/1     Running   0          29h
pod/vmstorage-vmcluster-main-0                1/1     Running   0          29h

NAME                                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
service/vminsert-vmcluster-main         ClusterIP   10.0.182.73    <none>        8480/TCP                     29h
service/vminsert-vmcluster-main-lbsvc   NodePort    10.0.255.212   <none>        8480:30135/TCP               7h54m
service/vmselect-vmcluster-main         ClusterIP   None           <none>        8481/TCP                     29h
service/vmselect-vmcluster-main-lbsvc   NodePort    10.0.45.239    <none>        8481:31140/TCP               7h54m
service/vmstorage-vmcluster-main        ClusterIP   None           <none>        8482/TCP,8400/TCP,8401/TCP   29h
```

Installing the Prometheus exporter
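As a quick sanity check (assuming a node IP or the LB address is reachable; the addresses below are placeholders), VictoriaMetrics components answer on a /health endpoint:

```shell
# vminsert through its NodePort / LB listener
curl http://<node-or-lb-ip>:30135/health
# vmselect through its NodePort / LB listener
curl http://<node-or-lb-ip>:31140/health
# a healthy component responds with: OK
```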

Here we install node exporter to expose K8s node metrics, which will be scraped by the opentelemetry collector installed next and written through vminsert into vmstorage. The data is then queried through vmselect.

```yaml
# kubectl apply -f prometheus-node-exporter-install.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: prometheus-node-exporter
    release: prometheus-node-exporter
  name: prometheus-node-exporter
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: prometheus-node-exporter
      release: prometheus-node-exporter
  template:
    metadata:
      labels:
        app: prometheus-node-exporter
        release: prometheus-node-exporter
    spec:
      containers:
      - args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --web.listen-address=$(HOST_IP):9100
        env:
        - name: HOST_IP
          value: 0.0.0.0
        image: images.huazai.com/release/node-exporter:v1.1.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 9100
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: metrics
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 9100
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 200m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 30Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: true
        - mountPath: /host/sys
          name: sys
          readOnly: true
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 65534
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccount: prometheus-node-exporter
      serviceAccountName: prometheus-node-exporter
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /proc
          type: ""
        name: proc
      - hostPath:
          path: /sys
          type: ""
        name: sys
      - hostPath:
          path: /
          type: ""
        name: root
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
```

```shell
# Check node-exporter
[root@test ~]# kubectl get po -n kube-system | grep prometheus
prometheus-node-exporter-89wjk                 1/1     Running   0          31h
prometheus-node-exporter-hj4gh                 1/1     Running   0          31h
prometheus-node-exporter-hxm8t                 1/1     Running   0          31h
prometheus-node-exporter-nhqp6                 1/1     Running   0          31h
```

Installing opentelemetry

With the Prometheus node exporter in place, let's install opentelemetry (a topic for a future post of its own).

```yaml
# The opentelemetry config file. It defines how data is received, processed, and exported:
# 1. receivers: where data is pulled from
# 2. processors: how the received data is processed
# 3. exporters: where the processed data is exported; here the data goes through vminsert
#    and is ultimately written to vmstorage
# kubectl apply -f opentelemetry-install-cm.yaml
apiVersion: v1
data:
  relay: |
    exporters:
      prometheusremotewrite:
        # I configure lb_ip:8480 here, i.e. the vminsert address
        endpoint: http://lb_ip:8480/insert/0/prometheus
        # Add a different label per cluster, e.g. cluster: uat/prd
        external_labels:
          cluster: uat
    extensions:
      health_check: {}
    processors:
      batch: {}
      memory_limiter:
        ballast_size_mib: 819
        check_interval: 5s
        limit_mib: 1638
        spike_limit_mib: 512
    receivers:
      prometheus:
        config:
          scrape_configs:
          - job_name: opentelemetry-collector
            scrape_interval: 10s
            static_configs:
            - targets:
              - localhost:8888
          # ...omitted...
          - job_name: kube-state-metrics
            kubernetes_sd_configs:
            - namespaces:
                names:
                - kube-system
              role: service
            metric_relabel_configs:
            - regex: ReplicaSet;([\w|-]+)-[0-9|a-z]+
              replacement: $$1
              source_labels:
              - created_by_kind
              - created_by_name
              target_label: created_by_name
            - regex: ReplicaSet
              replacement: Deployment
              source_labels:
              - created_by_kind
              target_label: created_by_kind
            relabel_configs:
            - action: keep
              regex: kube-state-metrics
              source_labels:
              - __meta_kubernetes_service_name
          - job_name: node-exporter
            kubernetes_sd_configs:
            - namespaces:
                names:
                - kube-system
              role: endpoints
            relabel_configs:
            - action: keep
              regex: node-exporter
              source_labels:
              - __meta_kubernetes_service_name
            - source_labels:
              - __meta_kubernetes_pod_node_name
              target_label: node
            - source_labels:
              - __meta_kubernetes_pod_host_ip
              target_label: host_ip
    # ...omitted...
    service:
    # The receivers, processors, exporters, and extensions defined above must be
    # listed here, otherwise they have no effect
      extensions:
      - health_check
      pipelines:
        metrics:
          exporters:
          - prometheusremotewrite
          processors:
          - memory_limiter
          - batch
          receivers:
          - prometheus
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: opentelemetry-collector-hua
    meta.helm.sh/release-namespace: kube-system
  labels:
    app.kubernetes.io/instance: opentelemetry-collector-hua
    app.kubernetes.io/name: opentelemetry-collector-hua
  name: opentelemetry-collector-hua
  namespace: kube-system
```

```yaml
# Install opentelemetry
# kubectl apply -f opentelemetry-install.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/instance: opentelemetry-collector-hua
    app.kubernetes.io/name: opentelemetry-collector-hua
  name: opentelemetry-collector-hua
  namespace: kube-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: opentelemetry-collector-hua
      app.kubernetes.io/name: opentelemetry-collector-hua
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: opentelemetry-collector-hua
        app.kubernetes.io/name: opentelemetry-collector-hua
    spec:
      containers:
      - command:
        - /otelcol
        - --config=/conf/relay.yaml
        - --metrics-addr=0.0.0.0:8888
        - --mem-ballast-size-mib=819
        env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        image: images.huazai.com/release/opentelemetry-collector:0.27.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 13133
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: opentelemetry-collector-hua
        ports:
        - containerPort: 4317
          name: otlp
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 13133
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: 500m
            memory: 1Gi
        volumeMounts:
        - mountPath: /conf
          # The ConfigMap created above for opentelemetry
          name: opentelemetry-collector-configmap-hua
        - mountPath: /etc/otel-collector/secrets/etcd-cert/
          name: etcd-tls
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      # Create the ServiceAccount yourself
      serviceAccount: opentelemetry-collector-hua
      serviceAccountName: opentelemetry-collector-hua
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: relay
            path: relay.yaml
          # The ConfigMap created above for opentelemetry
          name: opentelemetry-collector-hua
        name: opentelemetry-collector-configmap-hua
      - name: etcd-tls
        secret:
          defaultMode: 420
          secretName: etcd-tls
```

```shell
# Check that opentelemetry is running. If opentelemetry is in the same K8s cluster as VM,
# use the in-cluster Service address instead of the LB (on this cloud, a backend server of a
# layer-4 listener cannot currently act as both client and server at the same time)
[root@kube-control-1 ~]# kubectl get po -n kube-system | grep opentelemetry-collector-hua
opentelemetry-collector-hua-647c6c64c7-j6p4b   1/1     Running   0          8h
```

Installation Check

Once all components are installed, open http://lb:8481/select/0/vmui in a browser and enter http://lb:8481/select/0/prometheus as the server URL. Then type in a metric name and you can query the data; auto-refresh can be enabled at the top left.
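The same select endpoint also serves the Prometheus-compatible HTTP API, so queries can be sketched with curl as well (lb is a placeholder for your load-balancer address, and `up` is just an example metric):

```shell
# Instant query for the `up` metric through vmselect (tenant 0)
curl 'http://lb:8481/select/0/prometheus/api/v1/query?query=up'
```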


Summary

The whole installation process is fairly straightforward. Once it is done, you can store monitoring data from multiple K8s clusters. VM supports MetricsQL, which is based on PromQL, and can also serve as a Grafana data source. Think back to installing Prometheus by hand in every K8s cluster, configuring storage for each, and opening each cluster's Prometheus UI separately to query data; doesn't that seem a bit more tedious? If you also think VM looks good, give it a try!

References

- https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster
- https://docs.victoriametrics.com/
- https://opentelemetry.io/docs/
- https://prometheus.io/docs/prometheus/latest/configuration/configuration/