K8s: Probes (Health Check)
Health check
Probe checks
Containers created with docker display the health-check status when a probe is enabled
Containers created with nerdctl do not display it
What probes check
1. Whether the container's port and URL can be reached normally (the port exists, the API responds correctly)
   Why: a Java process can hang from an out-of-memory error or a memory leak while the container itself keeps running
2. A probe can remove a pod from the service, taking it out of rotation
3. What matters is configuring probe checks for your services
4. Works together with the HPA controller so replicas scale automatically
5. The check timing is configurable (roughly one check every 3 seconds)
Check timing (see the sketch below):
interval between checks
timeout per check
retry count
protection time (start checking only 60 seconds after the service starts)
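A rough sketch of how these timing settings map onto probe fields (the values are illustrative only):

livenessProbe:                # the same fields apply to readinessProbe / startupProbe
  httpGet:
    path: /index.html
    port: 80
  initialDelaySeconds: 60     # protection time: start checking 60s after the container starts
  periodSeconds: 3            # interval: check roughly every 3 seconds
  timeoutSeconds: 1           # timeout for a single check
  failureThreshold: 3         # retry count: this many consecutive failures marks the check failed
  successThreshold: 1         # one success is enough to flip back to healthy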
Example
docker-compose -f *.yaml up -d   # start
docker-compose -f *.yaml down    # stop
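For comparison, the rough kubectl equivalents (the filename is a placeholder):

kubectl apply -f probe-demo.yaml    # create / update the resources
kubectl delete -f probe-demo.yaml   # remove them again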
Checks in k8s
The kubelet performs the checks against the containers
Core functions of probes
Check before the pod starts serving:
1. Startup probe: a check configured to run at startup
Only after it passes, and the pod is up, do the two probes below run
Periodic checks:
2. readiness probe: when its check fails, the pod is removed from the svc endpoints, i.e. requests are no longer forwarded to the failing pod
3. liveness probe: when its check fails, the pod's container is restarted
Once every check passes: image updates and config updates can rebuild the pod; before a pod is deleted you can run a pre-deletion action, e.g. remove it from the current service registry or run a script (see the preStop sketch below)
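A minimal sketch of such a pre-deletion action; the script path /opt/deregister.sh is a hypothetical example, not from the original note:

        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "/opt/deregister.sh"]   # hypothetical script that removes this instance from the registry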
Pod lifecycle
1. After the pod is created successfully, the init containers run first; once they succeed (see the sketch after this list) ->
2. Pod start check: the startup probe runs (after it passes ->)
3. Periodic checks: the readiness and liveness probes now check the business container every few seconds; if it is unhealthy, the readiness probe removes the container from the service and the liveness probe restarts it
4. If everything passes, a stop (preStop) action runs before the pod is eventually deleted
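A minimal sketch of step 1; the init container name "wait-for-db" and the db:3306 dependency are hypothetical examples:

    spec:
      initContainers:
      - name: wait-for-db             # runs to completion before the app container is started
        image: busybox
        command: ["sh", "-c", "until nc -z db 3306; do sleep 2; done"]
      containers:
      - name: myapp
        image: nginx:1.20.2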
Probe handlers (hooks)
1. exec: run a command inside the container (a shell command, simple non-interactive commands)
2. tcpSocket: check a container port (MySQL, Kafka, ZooKeeper: 3306, 6379, 80), similar to a Zabbix port check
3. httpGet: send a GET request to the pod IP and check the response status code
Check results
Success
Failure
Unknown: the diagnosis itself failed, so no action is taken
Pod restart policy
Image pull policy
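These two headings carry no notes here; as a minimal sketch of where the two fields live in a pod spec (the listed values are the available options):

    spec:
      restartPolicy: Always             # Always | OnFailure | Never (pods managed by a Deployment must use Always)
      containers:
      - name: myapp
        image: nginx:1.20.2
        imagePullPolicy: IfNotPresent   # IfNotPresent | Always | Never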
Experiments
Note:
HTTP check
YAML file
Liveness probe
1. Its main effect is restarting the pod's container; after three consecutive failures the check is considered failed and the pod gets no traffic
2. After repeated failures the pod sits in the CrashLoopBackOff state: the probe check keeps failing
Make sure the service's backends actually match the pod: the service's label selector must select the corresponding pods
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #- {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        #readinessProbe:
        livenessProbe:
          httpGet:
            #path: /monitor/monitor.html ### monitoring path that does not exist here, would make the check fail
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 40012 ### must be inside the cluster's NodePort range
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label
Readiness probe: after the readiness check runs, if it fails:
kubectl get ep -n myserver shows that the pod has no address in the endpoints
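Illustrative commands for observing both failure modes after applying the manifest above:

kubectl get pods -n myserver -w                                          # a failing liveness probe shows RESTARTS climbing, then CrashLoopBackOff
kubectl get ep myserver-myapp-frontend-service -n myserver               # a failing readiness probe leaves the endpoints list empty
kubectl describe pod -n myserver -l app=myserver-myapp-frontend-label    # probe failures show up in the Events section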
TCP check
The configuration is the same as above:
1. The main thing is to configure the port
2. Typically used to check 3306, 6379, Kafka, and other services that expose no API
YAML file
Config file: 2-tcp-Probe.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #- {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        livenessProbe:
        #readinessProbe:
          tcpSocket:
            port: 80
            #port: 8080
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 30012 ### change the port number here
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label
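To watch the TCP probe fail, switch the probe to the commented-out port 8080 and re-apply; the failure then shows up in the pod events (illustrative commands, the expected message is an assumption based on the HTTP example later in this note):

kubectl apply -f 2-tcp-Probe.yaml
kubectl describe pod -n myserver -l app=myserver-myapp-frontend-label    # look for a message like "Liveness probe failed: dial tcp ...:8080"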
Exec (command) check
For some services a plain port check may not be accurate:
1. Run a non-interactive command inside the container
Example:
/usr/local/bin/redis-cli
quit
Pitfall: the check may pass simply because quit executed successfully, not because Redis is actually healthy (see the sketch below)
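A more reliable variant (a sketch, not from the original exercise) checks the PING reply instead of only running quit:

livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - /usr/local/bin/redis-cli ping | grep -q PONG   # fails unless Redis actually answers PONG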
YAML file
3-exec-Probe.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-redis-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-redis-label
    #matchExpressions:
    #- {key: app, operator: In, values: [myserver-myapp-redis,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-redis-label
    spec:
      containers:
      - name: myserver-myapp-redis-container
        image: redis
        ports:
        - containerPort: 6379
        livenessProbe:
        #readinessProbe:
          exec:
            command:
            #- /apps/redis/bin/redis-cli
            - /usr/local/bin/redis-cli
            - quit
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-redis-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 6379
    targetPort: 6379
    nodePort: 40016
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-redis-label
Startup probe
If it fails, the container is killed and restarted according to the restart policy
YAML file: 4-startupProbe.yaml
Two resources: a Deployment and a Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 1
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #- {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        startupProbe:
          httpGet:
            path: /index.html
            port: 8080 ### changing the port here makes the check fail, nginx listens on 80
          initialDelaySeconds: 5 # delay before the first check
          failureThreshold: 3    # consecutive failures before the check is marked failed
          periodSeconds: 3       # probe interval
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 30012
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label
#### Error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m2s default-scheduler Successfully assigned myserver/myserver-myapp-frontend-deployment-5d59544b5-fn47f to k8s-32
Normal Pulled 86s (x4 over 2m2s) kubelet Container image "nginx:1.20.2" already present on machine
Normal Created 86s (x4 over 2m2s) kubelet Created container myserver-myapp-frontend-label
Normal Started 86s (x4 over 2m2s) kubelet Started container myserver-myapp-frontend-label
Normal Killing 86s (x3 over 110s) kubelet Container myserver-myapp-frontend-label failed startup probe, will be restarted
Warning Unhealthy 80s (x10 over 116s) kubelet Startup probe failed: Get "http://10.100.81.90:8080/index.html": dial tcp 10.100.81.90:8080: connect: connection refused
##### Summary: the error is in the events, the probe cannot connect to the port
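The events above come from describing the pod; illustrative commands to reproduce this view (the pod name is taken from the events):

kubectl get pods -n myserver                                                           # the pod never becomes ready and its RESTARTS count keeps growing
kubectl describe pod -n myserver myserver-myapp-frontend-deployment-5d59544b5-fn47f    # shows the Events section above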
All probes combined
YAML file: 5-startupProbe-livenessProbe-readinessProbe.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myserver-myapp-frontend-deployment
  namespace: myserver
spec:
  replicas: 3
  selector:
    matchLabels: #rs or deployment
      app: myserver-myapp-frontend-label
    #matchExpressions:
    #- {key: app, operator: In, values: [myserver-myapp-frontend,ng-rs-81]}
  template:
    metadata:
      labels:
        app: myserver-myapp-frontend-label
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: myserver-myapp-frontend-label
        image: nginx:1.20.2
        ports:
        - containerPort: 80
        startupProbe:
          httpGet:
            path: /index.html
            port: 80
          initialDelaySeconds: 5 # delay before the first check
          failureThreshold: 3    # consecutive failures before the check is marked failed
          periodSeconds: 3       # probe interval
        readinessProbe:
          httpGet:
            #path: /monitor/monitor.html
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
        livenessProbe:
          httpGet:
            #path: /monitor/monitor.html
            path: /index.html
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 3
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: myserver-myapp-frontend-service
  namespace: myserver
spec:
  ports:
  - name: http
    port: 81
    targetPort: 80
    nodePort: 40012
    protocol: TCP
  type: NodePort
  selector:
    app: myserver-myapp-frontend-label
All three probes configured together: startup probe, readiness probe, liveness probe
Note: extra replicas do not help; if the check fails the pod still does not come up
1. Deliberately change the liveness probe port so its check fails
2. With the later two probes failing: the startup probe succeeds, but the liveness and readiness probes fail, so the pod is kept off the service and is restarted over and over (as long as a check does not pass, it keeps restarting)
3. Fix the liveness probe and break the readiness check instead (readiness failure: the pod is not restarted, but it is never put behind the service)
4. Fix both, then exec into one replica and break it by changing index.html, the file the liveness check requests; the container is restarted, rebuilt from the original image inside the same pod (the container filesystem is reset, the pod IP does not change, the existing namespaces are untouched), much like rebuilding the OS of a cloud VM. See the sketch below.
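An illustrative way to carry out step 4 (the pod name is a placeholder):

kubectl exec -it -n myserver <pod-name> -- mv /usr/share/nginx/html/index.html /usr/share/nginx/html/index.html.bak
# the liveness probe on /index.html now fails; once failureThreshold is reached the container is restarted from the original image
kubectl get pods -n myserver -w    # the RESTARTS counter goes up while the pod IP stays the same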
Difference
The readiness probe does not restart the pod
The liveness probe does restart the pod (its container)
