K8s: Probes (Health Check)

Health Check

Probe checks

Containers created with docker display the probe health status; containers created with nerdctl do not display it.

What probes check

1. Whether the container's port and URL can be reached normally (the port exists, the API responds). This matters because a Java app can hit an out-of-memory error or a memory leak while the process stays up.
2. A probe can take a Pod out of service (a failed readiness check removes it from the Service endpoints).
3. What matters is configuring probe checks for the service.
4. Works together with the HPA controller so replicas scale automatically.
5. The check timing is customizable (roughly one check every 3 seconds).


Check timing

Interval
Timeout
Retry count
Protection time (start checking only after the service has been up for, say, 60 seconds)
(see the sketch below for the corresponding YAML fields)
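A minimal sketch of how these four knobs map onto a probe definition (the values echo the examples above):

livenessProbe:
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 60  # protection time: wait 60s after startup before the first check
  periodSeconds: 3         # interval: check roughly every 3 seconds
  timeoutSeconds: 1        # timeout for each individual check
  failureThreshold: 3      # retry count: consecutive failures before acting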

Examples

docker-compose  -f  *.yaml  up  -d		# start
docker-compose  -f  *.yaml  down		# stop
k8s checks:
The kubelet is what runs the probe checks against containers.

Core functions of probes

Checks before the Pod serves traffic:
1. Startup probe: a check configured to run at startup.
		Only after it completes, and the Pod is up, do the two probes below start running.

Periodic checks:
2. readinessProbe (readiness): if its check fails, the Pod can be removed from the Service, so requests are no longer forwarded to the failing Pod.
3. livenessProbe (liveness): if its check fails, the container can be restarted.

Once every check passes: image updates and config updates can proceed (Pods can be rebuilt; before a Pod is deleted you can run a pre-delete action, such as removing it from the current registry or executing a script, as sketched below).
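A minimal sketch of such a pre-delete action using a lifecycle preStop hook (the deregister.sh path and script are made-up placeholders for whatever your registry requires):

spec:
  containers:
  - name: myserver-myapp-frontend-label
    image: nginx:1.20.2
    lifecycle:
      preStop:
        exec:
          # hypothetical script that removes this instance from the registry
          command: ["/bin/sh", "-c", "/opt/scripts/deregister.sh || true"]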

Pod lifecycle

1. After the Pod is accepted and created, init containers run first (see the sketch after this list); once they succeed ->
2. Startup check: the startup probe runs. (Once it has passed ->)
3. Periodic checks: the readiness and liveness probes take over, checking the business container every few seconds. If it is unhealthy, the readiness probe removes the container from the Service and the liveness probe restarts it.
4. If everything passes, a stop action (preStop) runs before the Pod is deleted.
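A minimal sketch of the init container in step 1 (the busybox command is only an illustration; real init containers typically wait for a dependency):

spec:
  initContainers:
  - name: init-check
    image: busybox:1.35
    command: ["/bin/sh", "-c", "echo init done"]  # must exit successfully before the app container starts
  containers:
  - name: myserver-myapp-frontend-label
    image: nginx:1.20.2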

Probe handlers

hook (handler) types:

1. exec: run a command inside the container (a shell command; simple and non-interactive).
2. tcpSocket: check a container port (MySQL, Kafka, ZooKeeper: 6379, 3306, 80), much like a zabbix port check.
3. httpGet: send a GET request to the Pod IP plus a path and judge by the returned status code.
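For quick reference, the three handler stanzas look like this; each one sits under livenessProbe, readinessProbe, or startupProbe (full working examples follow later in this post; the exec command is just an illustration):

# 1. exec handler
exec:
  command: ["cat", "/tmp/healthy"]   # succeeds only while the file exists
# 2. tcpSocket handler
tcpSocket:
  port: 80
# 3. httpGet handler
httpGet:
  path: /index.html
  port: 80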

Check results

Success
Failure
Unknown: the diagnosis itself failed, so no action is taken

Pod restart policy (spec.restartPolicy): Always / OnFailure / Never

Image pull policy (imagePullPolicy): Always / IfNotPresent / Never
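A minimal sketch of where both policies sit in a Pod template (note that a Deployment only permits restartPolicy: Always):

spec:
  restartPolicy: Always            # Always / OnFailure / Never; liveness restarts rely on this
  containers:
  - name: app
    image: nginx:1.20.2
    imagePullPolicy: IfNotPresent  # Always / IfNotPresent / Never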

Experiments

Note:

HTTP check

YAML file

Liveness probe:
1. Its main ability is restarting; after three consecutive failed checks the container is restarted, and repeated restarts are backed off rather than retried immediately.
2. After the repeated failures the Pod sits in the CrashLoopBackOff state: the probe check keeps failing.
Make sure the Service backend actually matches the Pod: the Service's label selector must select the corresponding Pod.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #  - {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        #readinessProbe:
        livenessProbe:
          httpGet:
            #path: /monitor/monitor.html			### monitoring path; enable to demonstrate a failing check
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3


---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 40012				### outside the default NodePort range (30000-32767); requires an extended range
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label

Readiness probe: after the readiness check runs, if it fails:
kubectl get ep -n myserver shows the failed Pod with no address in the endpoints.

TCP check
Same configuration shape as the HTTP check:
	1. Mainly just configure the port to probe.
	2. Typically used to check 3306, 6379, Kafka, and other services that expose no HTTP API.

YAML file

Config file: 2-tcp-Probe.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #  - {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        livenessProbe:
        #readinessProbe:
          tcpSocket:
            port: 80
            #port: 8080
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3


---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 30012				### changed to sit within the default NodePort range (30000-32767)
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label
Exec (command) check
For some services a port check alone may not be accurate, so instead:
	1. Run a non-interactive command inside the container.
Example:
	/usr/local/bin/redis-cli
	quit

Pitfall: the check may pass merely because quit executed successfully, not because Redis is actually healthy.
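A more reliable exec check, as a sketch: verify Redis's PONG reply instead of relying on quit's exit status (assumes redis-cli sits at /usr/local/bin as in the file below):

        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - /usr/local/bin/redis-cli ping | grep -q PONG   # fails unless Redis answers PONG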

YAML file

3-exec-Probe.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-redis-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-redis-label
    #matchExpressions:
    #  - {key: app, operator: In, values: [myserver-myapp-redis,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-redis-label
    spec:
      containers:
      - name: myserver-myapp-redis-container
        image: redis
        ports:
        - containerPort: 6379
        livenessProbe:
        #readinessProbe:
          exec:
            command:
            #- /apps/redis/bin/redis-cli
            - /usr/local/bin/redis-cli
            - quit
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3

---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-redis-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 6379
    targetPort: 6379
    nodePort: 40016				### also outside the default NodePort range (30000-32767)
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-redis-label
Startup probe
If it fails, the container is killed and restarted according to the Pod's restartPolicy (the events below show exactly this).
YAML file: 4-startupProbe.yaml
Two resources: a Deployment and a Service.


apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #  - {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        startupProbe:
          httpGet:
            path: /index.html
            port: 8080								### changing the port here makes the check fail; nginx listens on 80
          initialDelaySeconds: 5 # delay 5s before the first check
          failureThreshold: 3  # consecutive failures before the probe is considered failed
          periodSeconds: 3 # probe interval


---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 30012
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label


#### Error (Events section from kubectl describe pod):
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  2m2s                 default-scheduler  Successfully assigned myserver/myserver-myapp-frontend-deployment-5d59544b5-fn47f to k8s-32
  Normal   Pulled     86s (x4 over 2m2s)   kubelet            Container image "nginx:1.20.2" already present on machine
  Normal   Created    86s (x4 over 2m2s)   kubelet            Created container myserver-myapp-frontend-label
  Normal   Started    86s (x4 over 2m2s)   kubelet            Started container myserver-myapp-frontend-label
  Normal   Killing    86s (x3 over 110s)   kubelet            Container myserver-myapp-frontend-label failed startup probe, will be restarted
  Warning  Unhealthy  80s (x10 over 116s)  kubelet            Startup probe failed: Get "http://10.100.81.90:8080/index.html": dial tcp 10.100.81.90:8080: connect: connection refused


##### Summary: the events report the error; the probe cannot connect to the port.

All probes together
YAML file: 5-startupProbe-livenessProbe-readinessProbe.yaml


apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 3
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #  - {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      terminationGracePeriodSeconds: 60  # allow up to 60s for graceful shutdown before the container is force-killed
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        startupProbe:
          httpGet:
            path: /index.html
            port: 80
          initialDelaySeconds: 5 # delay 5s before the first check
          failureThreshold: 3  # consecutive failures before the probe is considered failed
          periodSeconds: 3 # probe interval
        readinessProbe:
          httpGet:
            #path: /monitor/monitor.html
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
        livenessProbe:
          httpGet:
            #path: /monitor/monitor.html
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 40012				### outside the default NodePort range (30000-32767)
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label


With all probes configured: startup probe, readiness probe, liveness probe.
Note: multiple replicas do not help here; a Pod whose checks fail still does not start serving.


1. Deliberately change the liveness probe port so its check fails.
2. With both periodic probes failing: the startup probe succeeds, but the liveness and readiness checks fail, so the Pod is held back; it is never added to the Service and is restarted over and over (as long as a check does not pass, it restarts).
3. Fix the liveness probe and break the readiness check instead: on readiness failure the Pod is not restarted, but it is also never added to the Service.
4. Fix both, then exec into one replica's Pod and break it by modifying index.html, the file the liveness probe checks (see the command sketch after this list). The container restarts: it is rebuilt from the original image inside the same Pod, so the container filesystem is reset but the Pod IP does not change (only the container is recreated from the image; the Pod's namespaces are untouched), much like reinstalling the OS on a cloud VM.
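A sketch of step 4's break-it commands (the stock nginx image serves index.html from /usr/share/nginx/html; the Pod name suffix is a placeholder):

kubectl exec -it -n myserver myserver-myapp-frontend-deployment-xxxxx -- rm /usr/share/nginx/html/index.html
kubectl get pods -n myserver -w		# watch the liveness check fail and the container restart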
Difference
Readiness does not restart the Pod.
Liveness does restart it (by restarting the container).