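CRD + Operator + Admission Webhook + Scheduler + Aggregated API Flow
This walkthrough implements the CRD + Operator + Admission Webhook + Scheduler + Aggregated API flow in Python + Flask. Each module has one clearly defined role, and each comes with an example that maps to the API Server, Controller, Scheduler, or Aggregated API. The full design and the Python examples follow.
Kubernetes Node Load Scheduling Example (Python + Flask Version)
1️⃣ Overall System Breakdown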
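| Module | Role | Python implementation | Logging / record level |
|---|---|---|---|
| CRD (NodeLoadRule) | Defines node scheduling rules (CPU/memory limits plus the schedulable nodes) | YAML + kubectl apply | Stored by the API Server; object state can be inspected with kubectl get nlr -o yaml |
| Operator / Controller | Watches the CRD and Nodes, computes load, and generates Pod scheduling suggestions | Flask + Kubernetes client | Operator log: records every load check and scheduling suggestion |
| Admission Webhook | Intercepts Pod creation and modifies the Pod spec (nodeSelector/affinity) according to the CRD rules | Flask + Kubernetes client JSONPatch | Webhook log: records every Pod modification |
| Custom Scheduler | Monitors unscheduled Pods, picks a target Node based on Node load, and binds the Pod | Flask + Kubernetes client | Scheduler log: records every scheduling decision |
| Aggregated API | Exposes Node state and CRD state for observation and debugging | Flask + Kubernetes client | HTTP request log: can record every node-load query |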
2️⃣ CRD Example
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodeloadrules.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                maxCPU:
                  type: integer
                maxMemory:
                  type: integer
                allowedNodes:
                  type: array
                  items:
                    type: string
  scope: Namespaced
  names:
    plural: nodeloadrules
    singular: nodeloadrule
    kind: NodeLoadRule
    shortNames:
      - nlr
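Notes:
- The CRD defines the Pod scheduling rule: a maximum CPU value, a maximum memory value, and the list of allowed Nodes.
- The API Server stores and manages the resulting NodeLoadRule objects.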
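To make the schema concrete, here is a minimal sketch of a NodeLoadRule object built as a plain Python dict. The helper name make_node_load_rule and the sample values (demo-rule, node-a, node-b) are hypothetical; only the field names and types come from the CRD above.

```python
def make_node_load_rule(name, max_cpu, max_memory, allowed_nodes, namespace="default"):
    """Build a NodeLoadRule manifest that matches the CRD schema (hypothetical helper)."""
    # The schema declares maxCPU and maxMemory as integers.
    if not isinstance(max_cpu, int) or not isinstance(max_memory, int):
        raise TypeError("maxCPU and maxMemory must be integers")
    # allowedNodes is an array of strings.
    if not all(isinstance(node, str) for node in allowed_nodes):
        raise TypeError("allowedNodes must be a list of strings")
    return {
        "apiVersion": "example.com/v1",  # group/version from the CRD
        "kind": "NodeLoadRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxCPU": max_cpu,
            "maxMemory": max_memory,
            "allowedNodes": list(allowed_nodes),
        },
    }


# Sample values are illustrative only.
rule = make_node_load_rule("demo-rule", 80, 80, ["node-a", "node-b"])
print(rule["kind"], rule["spec"])
```

A dict like this could then be submitted with the Kubernetes client's CustomObjectsApi.create_namespaced_custom_object (group example.com, version v1, plural nodeloadrules), or dumped to YAML and applied with kubectl apply.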
3️⃣ Operator + Controller (Flask)
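from flask import Flask, jsonify
from kubernetes import client, config
from kubernetes.utils.quantity import parse_quantity
import threading
import time

app = Flask(__name__)
config.load_kube_config()  # inside a Pod, use config.load_incluster_config() instead
v1 = client.CoreV1Api()
# Custom resources are served by CustomObjectsApi, not CoreV1Api
custom_api = client.CustomObjectsApi()

# Simulated Operator/Controller: periodically check Node figures and NodeLoadRule objects
node_load_cache = {}
pod_scheduling_suggestions = []

def operator_controller_loop():
    while True:
        try:
            # Collect per-node figures. status.capacity is the node's static size and is
            # used here as a stand-in for load; live usage would need metrics-server.
            node_load_cache.clear()
            for node in v1.list_node().items:
                capacity = node.status.capacity or {}
                # Quantities are strings such as "4" or "16384Ki", so int() would fail on them
                cpu = float(parse_quantity(capacity.get('cpu', '0')))  # cores
                # GiB is an assumption: the CRD does not state the unit of maxMemory
                mem = float(parse_quantity(capacity.get('memory', '0'))) / 1024**3
                node_load_cache[node.metadata.name] = {"cpu": cpu, "mem": mem}
            # Fetch the NodeLoadRule objects
            rules = custom_api.list_namespaced_custom_object(
                group="example.com", version="v1",
                namespace="default", plural="nodeloadrules"
            )
            suggestions = []
            for rule in rules.get('items', []):
                spec = rule.get('spec', {})
                max_cpu = spec.get('maxCPU', 80)
                max_mem = spec.get('maxMemory', 80)
                allowed_nodes = spec.get('allowedNodes', [])
                for node_name, load in node_load_cache.items():
                    if node_name in allowed_nodes and load['cpu'] <= max_cpu and load['mem'] <= max_mem:
                        suggestions.append({
                            "node": node_name,
                            "rule": rule['metadata']['name']
                        })
            # Swap the contents in one step so the HTTP handler never sees a half-built list
            pod_scheduling_suggestions[:] = suggestions
            print(f"[Operator] Scheduling suggestions: {pod_scheduling_suggestions}")
        except Exception as exc:
            # Keep the loop alive when a single API call fails
            print(f"[Operator] Check failed: {exc}")
        time.sleep(30)

threading.Thread(target=operator_controller_loop, daemon=True).start()

@app.route("/operator/suggestions")
def get_suggestions():
    return jsonify(pod_scheduling_suggestions)

if __name__ == "__main__":
    app.run(port=5000)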
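Notes:
- The Operator and the Controller are merged into a single Flask service.
- Function: poll Nodes and NodeLoadRule objects every 30 seconds → generate Pod scheduling suggestions → expose them through an HTTP API.
- Logging: each round of scheduling suggestions is printed to the console.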
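The matching step inside the loop can be isolated as a pure function, which makes it easy to check without a cluster. The sketch below assumes the same dict shapes the loop uses; the function name build_suggestions and the sample figures are hypothetical.

```python
def build_suggestions(node_load, rules):
    """Return one {"node", "rule"} entry for every node that satisfies a rule.

    node_load: {"node-name": {"cpu": number, "mem": number}}
    rules:     list of NodeLoadRule objects as returned in the "items" list
    """
    suggestions = []
    for rule in rules:
        spec = rule.get("spec", {})
        max_cpu = spec.get("maxCPU", 80)      # same defaults as the loop above
        max_mem = spec.get("maxMemory", 80)
        allowed = set(spec.get("allowedNodes", []))
        for node_name, load in node_load.items():
            within_limits = load["cpu"] <= max_cpu and load["mem"] <= max_mem
            if node_name in allowed and within_limits:
                suggestions.append({"node": node_name, "rule": rule["metadata"]["name"]})
    return suggestions


# Illustrative data only.
sample_load = {
    "node-a": {"cpu": 4, "mem": 16},
    "node-b": {"cpu": 96, "mem": 16},   # exceeds maxCPU
    "node-c": {"cpu": 2, "mem": 8},     # not in allowedNodes
}
sample_rules = [{
    "metadata": {"name": "demo-rule"},
    "spec": {"maxCPU": 80, "maxMemory": 80, "allowedNodes": ["node-a", "node-b"]},
}]
result = build_suggestions(sample_load, sample_rules)
print(result)
```

Only node-a passes here: node-b is allowed but over the CPU limit, and node-c is within limits but not on the allowed list.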
4️⃣ Admission Webhook (Flask)
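from flask import Flask, request, jsonify
import base64
import json

app = Flask(__name__)

@app.route('/mutate', methods=['POST'])
def mutate():
    req = request.get_json()
    metadata = req['request']['object']['metadata']
    # Pods created by controllers may carry only generateName at admission time
    pod_name = metadata.get('name') or metadata.get('generateName', '<unnamed>')
    # Force a nodeSelector (this "add" replaces any nodeSelector already on the Pod)
    patch = [
        {"op": "add", "path": "/spec/nodeSelector", "value": {"custom-node": "true"}}
    ]
    resp = {
        # The response must be a complete AdmissionReview object
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": req['request']['uid'],
            "allowed": True,
            "patchType": "JSONPatch",
            # The API Server expects the JSON Patch as a base64-encoded string
            "patch": base64.b64encode(json.dumps(patch).encode()).decode()
        }
    }
    print(f"[Webhook] Pod {pod_name} patch applied: {patch}")
    return jsonify(resp)

if __name__ == '__main__':
    # The API Server only calls webhooks over HTTPS, so TLS must be
    # terminated in front of this service or configured via ssl_context
    app.run(port=5001)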
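Notes:
- Intercepts Pod creation and forces a fixed nodeSelector (custom-node=true). The example does not yet read NodeLoadRule objects.
- Logging: the webhook prints every modification it makes.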
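The response format is the part of a mutating webhook that most often goes wrong, so it is sketched on its own below. The helpers build_admission_response and node_selector_patch are hypothetical names; the second one merges into an existing nodeSelector instead of overwriting it, which is a variation on the handler above rather than what it does.

```python
import base64
import json


def node_selector_patch(pod, selector):
    """Build JSON Patch operations that add `selector` without dropping existing keys."""
    if pod.get("spec", {}).get("nodeSelector") is None:
        # No nodeSelector yet: create the whole map in one operation.
        return [{"op": "add", "path": "/spec/nodeSelector", "value": dict(selector)}]
    ops = []
    for key, value in selector.items():
        # JSON Pointer escaping: "~" becomes "~0" and "/" becomes "~1".
        escaped = key.replace("~", "~0").replace("/", "~1")
        ops.append({"op": "add", "path": "/spec/nodeSelector/" + escaped, "value": value})
    return ops


def build_admission_response(review, patch_ops):
    """Wrap patch operations in an AdmissionReview response."""
    encoded = base64.b64encode(json.dumps(patch_ops).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": review.get("apiVersion", "admission.k8s.io/v1"),
        "kind": "AdmissionReview",
        "response": {
            "uid": review["request"]["uid"],  # must echo the request uid
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": encoded,
        },
    }


# Illustrative request; the uid and pod contents are made up.
sample_review = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "1234-abcd",
        "object": {"metadata": {"generateName": "web-"},
                   "spec": {"nodeSelector": {"disk": "ssd"}}},
    },
}
ops = node_selector_patch(sample_review["request"]["object"], {"custom-node": "true"})
response = build_admission_response(sample_review, ops)
decoded_patch = json.loads(base64.b64decode(response["response"]["patch"]))
print(decoded_patch)
```

Because the sample Pod already has a nodeSelector, the patch adds a single key under it and the existing disk=ssd entry is left alone.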
5️⃣ Scheduler (Flask)
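from flask import Flask, jsonify
from kubernetes import client, config
from kubernetes.utils.quantity import parse_quantity
import threading
import time

app = Flask(__name__)
config.load_kube_config()
v1 = client.CoreV1Api()

def scheduler_loop():
    while True:
        try:
            # An empty value selects Pods that have no Node assigned yet.
            # In a real cluster, also filter on pod.spec.scheduler_name so this
            # loop does not race the default scheduler.
            pods = v1.list_pod_for_all_namespaces(field_selector="spec.nodeName=").items
            nodes = v1.list_node().items
            for pod in pods:
                target_node = None
                for node in nodes:
                    # capacity is the node's static size, standing in for load
                    cpu = float(parse_quantity(node.status.capacity['cpu']))
                    if cpu < 50:
                        target_node = node.metadata.name
                        break
                if target_node:
                    # Pods are assigned through the Binding subresource;
                    # spec.nodeName cannot be patched on an existing Pod
                    body = client.V1Binding(
                        metadata=client.V1ObjectMeta(name=pod.metadata.name),
                        target=client.V1ObjectReference(api_version="v1", kind="Node", name=target_node)
                    )
                    # _preload_content=False skips response deserialization,
                    # which is known to raise for binding responses
                    v1.create_namespaced_binding(pod.metadata.namespace, body, _preload_content=False)
                    print(f"[Scheduler] Pod {pod.metadata.name} scheduled to {target_node}")
        except Exception as exc:
            # Keep the loop alive when a single API call fails
            print(f"[Scheduler] Scheduling pass failed: {exc}")
        time.sleep(10)

threading.Thread(target=scheduler_loop, daemon=True).start()

@app.route("/scheduler/status")
def scheduler_status():
    return jsonify({"status": "running"})

if __name__ == "__main__":
    app.run(port=5002)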
Notes:
- Monitors unscheduled Pods and picks a Node for each one based on Node load.
- Logging: the scheduler prints every scheduling decision to the console.
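The two decisions a scheduler makes, which Node to pick and what to send to the API Server, can be sketched without a cluster. Note that pick_node below chooses the lowest CPU figure under the threshold, a swapped-in policy that differs from the first-match loop above; both helper names and the sample figures are hypothetical.

```python
def pick_node(node_cpu, threshold=50):
    """Return the node with the lowest CPU figure below `threshold`, or None.

    node_cpu: {"node-name": cpu_figure}. Ties are broken by node name so the
    result is deterministic.
    """
    candidates = {name: cpu for name, cpu in node_cpu.items() if cpu < threshold}
    if not candidates:
        return None
    return min(candidates, key=lambda name: (candidates[name], name))


def binding_body(pod_name, node_name):
    """Build the Binding object that assigns a Pod to a Node."""
    return {
        "apiVersion": "v1",
        "kind": "Binding",
        "metadata": {"name": pod_name},
        "target": {"apiVersion": "v1", "kind": "Node", "name": node_name},
    }


# Illustrative figures only.
chosen = pick_node({"node-a": 64, "node-b": 8, "node-c": 4})
body = binding_body("web-0", chosen)
print(chosen, body["target"])
```

The Binding is what gets posted to the Pod's binding subresource. Returning None when no Node qualifies leaves the Pod pending until the next pass.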
6️⃣ Aggregated API (Flask)
from flask import Flask, jsonify
from kubernetes import client, config

app = Flask(__name__)
config.load_kube_config()
v1 = client.CoreV1Api()

@app.route("/api/node-load")
def node_load():
    nodes = v1.list_node().items
    data = []
    for node in nodes:
        # capacity values are quantity strings (for example "4" or "16384Ki") and
        # describe the node's size, not its current usage
        data.append({
            "name": node.metadata.name,
            "cpu": node.status.capacity['cpu'],
            "memory": node.status.capacity['memory']
        })
    print("[Aggregated API] Node load queried")
    return jsonify({"nodes": data})

if __name__ == "__main__":
    app.run(port=5003)
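Notes:
- Exposes a query endpoint for Node state. The example covers Nodes only; CRD state is not included yet. It runs as a plain HTTP service and is not registered with the API Server through an APIService.
- Logging: every query is recorded on the console.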
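The endpoint returns capacity values as raw quantity strings, which are awkward to compare or chart. A minimal sketch of converting them to numbers follows; to_number and summarize_nodes are hypothetical helpers, and only the common suffixes (m, k/M/G/T, Ki/Mi/Gi/Ti) are handled.

```python
from decimal import Decimal

# Multipliers for the suffixes handled by this sketch.
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def to_number(quantity):
    """Convert a quantity string such as "4", "500m" or "16384Ki" to a Decimal."""
    text = str(quantity).strip()
    if text.endswith("m"):                      # milli-units, e.g. CPU "500m"
        return Decimal(text[:-1]) / 1000
    for suffix, factor in _SUFFIXES.items():    # binary suffixes are checked first
        if text.endswith(suffix):
            return Decimal(text[:-len(suffix)]) * factor
    return Decimal(text)                        # plain number


def summarize_nodes(nodes):
    """Turn the endpoint's node entries into numeric cores and GiB."""
    return [
        {
            "name": node["name"],
            "cpu_cores": float(to_number(node["cpu"])),
            "memory_gib": round(float(to_number(node["memory"]) / 1024**3), 2),
        }
        for node in nodes
    ]


# Illustrative payload in the shape returned by /api/node-load.
sample_nodes = [
    {"name": "node-a", "cpu": "4", "memory": "16Gi"},
    {"name": "node-b", "cpu": "3500m", "memory": "8388608Ki"},
]
summary = summarize_nodes(sample_nodes)
print(summary)
```

Using Decimal avoids floating-point drift while parsing; the values are converted to float only at the end so they serialize cleanly to JSON.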
7️⃣ Summary of Roles and Relationships
| Layer | Role | Logging | Python implementation |
|---|---|---|---|
| API Server | Stores the CRD | kubectl describe nlr | YAML |
| Controller / Operator | Watches the CRD and Nodes, generates scheduling suggestions | Console | Flask + Kubernetes client |
| Admission Webhook | Modifies the Pod nodeSelector | Console | Flask |
| Scheduler | Schedules Pods based on Node load | Console | Flask |
| Aggregated API | Node/CRD state queries | Console | Flask |
+------------------+
|  Kubernetes API  |
|      Server      |
+------------------+
         |
         v
+------------------+     +----------------+
|  Aggregated API  |     |   Custom CRD   |
|      Server      |     | (NodeLoadRule) |
+------------------+     +----------------+
         |
         v
+------------------+
|     Operator     | --watch--> Pod, Node, CRD
+------------------+
         |
         v
+------------------+
|    Scheduler     | --custom scheduling logic
+------------------+
         |
         v
+------------------+
| Admission Webhook|
|   (validating/   |
|    mutating)     |
+------------------+