CRD + Operator + Admission Webhook + Scheduler + Aggregated API Workflow

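This post walks through a CRD + Operator + Admission Webhook + Scheduler + Aggregated API workflow implemented in Python + Flask. Each module has a clearly defined role, and there is a matching example for each of the API Server / Controller / Scheduler / Aggregated API layers. The full design and the Python examples follow.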


Kubernetes Node Load Scheduling Example (Python + Flask Version)

1️⃣ Overall System Breakdown

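Module | Function | Python implementation | Logging / record level
------ | -------- | --------------------- | ----------------------
CRD (NodeLoadRule) | Defines Node scheduling rules (CPU/memory limits + schedulable Nodes) | YAML + kubectl apply | Stored by the API Server; the stored object can be inspected with kubectl get nlr -o yaml
Operator / Controller | Watches the CRD objects and Nodes, computes load, generates Pod scheduling suggestions | Flask + Kubernetes client | Operator log: records every load check and Pod scheduling suggestion
Admission Webhook | Intercepts Pod creation and modifies the Pod spec (nodeSelector/affinity) according to the CRD rules | Flask + Kubernetes client, JSONPatch | Webhook log: records every Pod modification
Custom Scheduler | Monitors unscheduled Pods, picks a target Node based on Node load, and binds the Pod | Flask + Kubernetes client | Scheduler log: records every Pod scheduling decision
Aggregated API | Exposes Node state and CRD state for observation and debugging | Flask + Kubernetes client | HTTP request log: can record every query for Node load information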

2️⃣ CRD Example

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodeloadrules.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                maxCPU:
                  type: integer
                maxMemory:
                  type: integer
                allowedNodes:
                  type: array
                  items:
                    type: string
  scope: Namespaced
  names:
    plural: nodeloadrules
    singular: nodeloadrule
    kind: NodeLoadRule
    shortNames:
      - nlr

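Notes

  • The CRD defines the Pod scheduling rules: a maximum CPU/memory value and a list of allowed Nodes.

  • The API Server stores and manages the NodeLoadRule objects created from this CRD.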
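
To make the schema concrete, here is a minimal sketch of building a NodeLoadRule object that matches the CRD above. The helper name `build_node_load_rule`, the rule name `demo-rule`, and the node names are all hypothetical; only the group, version, kind, and field names come from the CRD.

```python
from typing import Any

GROUP = "example.com"
VERSION = "v1"
PLURAL = "nodeloadrules"


def build_node_load_rule(name: str, max_cpu: int, max_memory: int,
                         allowed_nodes: list[str],
                         namespace: str = "default") -> dict[str, Any]:
    """Return a NodeLoadRule manifest shaped like the CRD schema (hypothetical helper)."""
    # The CRD declares maxCPU and maxMemory as integers
    if not isinstance(max_cpu, int) or not isinstance(max_memory, int):
        raise TypeError("maxCPU and maxMemory must be integers")
    # allowedNodes is declared as an array of strings
    if not all(isinstance(node, str) for node in allowed_nodes):
        raise TypeError("allowedNodes must be a list of strings")
    return {
        "apiVersion": f"{GROUP}/{VERSION}",
        "kind": "NodeLoadRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxCPU": max_cpu,
            "maxMemory": max_memory,
            "allowedNodes": list(allowed_nodes),
        },
    }


# Example values only; node-a / node-b are placeholder node names
rule = build_node_load_rule("demo-rule", 80, 80, ["node-a", "node-b"])
print(rule["apiVersion"], rule["kind"], rule["spec"]["allowedNodes"])
```

With the official Python client, a dict like this could then be submitted through `client.CustomObjectsApi().create_namespaced_custom_object(GROUP, VERSION, "default", PLURAL, rule)`, assuming the CRD has already been applied to the cluster.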


3️⃣ Operator + Controller (Flask)

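from flask import Flask, jsonify
from kubernetes import client, config
import threading
import time

app = Flask(__name__)
config.load_kube_config()
v1 = client.CoreV1Api()
# Custom resources are served through CustomObjectsApi, not CoreV1Api
custom_api = client.CustomObjectsApi()

# Simulated Operator/Controller: periodically check Node capacity and NodeLoadRule objects
node_load_cache = {}
pod_scheduling_suggestions = []

# Binary suffixes of Kubernetes memory quantities such as "16384Ki"
BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def parse_cpu(value):
    # "500m" means 0.5 cores; a plain number is already in cores
    value = str(value)
    return float(value[:-1]) / 1000 if value.endswith("m") else float(value)

def parse_memory_gib(value):
    # Convert "16384Ki", "8Gi" or a plain byte count to GiB
    value = str(value)
    for suffix, factor in BINARY_UNITS.items():
        if value.endswith(suffix):
            return float(value[:-len(suffix)]) * factor / 2**30
    return float(value) / 2**30

def operator_controller_loop():
    while True:
        try:
            # Read Node capacity (total size, used here as a stand-in for load)
            loads = {}
            for node in v1.list_node().items:
                capacity = node.status.capacity or {}
                loads[node.metadata.name] = {
                    "cpu": parse_cpu(capacity.get('cpu', 0)),
                    "mem": parse_memory_gib(capacity.get('memory', 0)),
                }

            # Read the NodeLoadRule objects
            rules = custom_api.list_namespaced_custom_object(
                group="example.com", version="v1",
                namespace="default", plural="nodeloadrules"
            )
            suggestions = []
            for rule in rules.get('items', []):
                spec = rule.get('spec', {})
                # Assumed units: maxCPU in cores, maxMemory in GiB (the CRD does not state them)
                max_cpu = spec.get('maxCPU', 80)
                max_mem = spec.get('maxMemory', 80)
                allowed_nodes = spec.get('allowedNodes', [])
                for node_name, load in loads.items():
                    if node_name in allowed_nodes and load['cpu'] <= max_cpu and load['mem'] <= max_mem:
                        suggestions.append({
                            "node": node_name,
                            "rule": rule['metadata']['name']
                        })

            # Update the shared state only after a complete round has been computed
            node_load_cache.clear()
            node_load_cache.update(loads)
            pod_scheduling_suggestions[:] = suggestions
            print(f"[Operator] Scheduling suggestions: {pod_scheduling_suggestions}")
        except Exception as exc:
            # Keep the loop alive if the API server is temporarily unreachable
            print(f"[Operator] Check failed: {exc}")
        time.sleep(30)

threading.Thread(target=operator_controller_loop, daemon=True).start()

@app.route("/operator/suggestions")
def get_suggestions():
    return jsonify(pod_scheduling_suggestions)

if __name__ == "__main__":
    app.run(port=5000)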

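Notes

  • The Operator and Controller are merged into a single Flask service.

  • Function: poll Nodes and NodeLoadRule objects every 30 seconds → generate Pod scheduling suggestions → expose them through a query API.

  • Logging: each round of scheduling suggestions is printed to the console.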
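
The matching step (Nodes + rules → suggestions) can also be written as a pure function, which makes it easy to try out without a cluster. This is a sketch only: `suggest_nodes` and the sample data are hypothetical, and the defaults of 80 mirror the ones used in the operator loop.

```python
from typing import Any


def suggest_nodes(rules: list[dict[str, Any]],
                  node_load: dict[str, dict[str, float]]) -> list[dict[str, str]]:
    """Return one suggestion per (rule, node) pair that satisfies the rule."""
    suggestions = []
    for rule in rules:
        spec = rule.get("spec", {})
        max_cpu = spec.get("maxCPU", 80)
        max_mem = spec.get("maxMemory", 80)
        allowed = set(spec.get("allowedNodes", []))
        for node_name, load in node_load.items():
            # A node qualifies if it is listed and stays within both limits
            if node_name in allowed and load["cpu"] <= max_cpu and load["mem"] <= max_mem:
                suggestions.append({"node": node_name, "rule": rule["metadata"]["name"]})
    return suggestions


# Sample data (made up for illustration)
rules = [{
    "metadata": {"name": "small-nodes"},
    "spec": {"maxCPU": 8, "maxMemory": 32, "allowedNodes": ["node-a", "node-b"]},
}]
node_load = {
    "node-a": {"cpu": 4, "mem": 16},    # allowed and within limits
    "node-b": {"cpu": 16, "mem": 64},   # allowed but too large
    "node-c": {"cpu": 2, "mem": 4},     # within limits but not allowed
}
suggestions = suggest_nodes(rules, node_load)
print(suggestions)
```

Only `node-a` passes both the allow-list and the limits here, so the result contains a single suggestion for the `small-nodes` rule.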


4️⃣ Admission Webhook (Flask)

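from flask import Flask, request, jsonify
import base64
import json

app = Flask(__name__)

@app.route('/mutate', methods=['POST'])
def mutate():
    review = request.get_json()
    req = review['request']
    # metadata.name can be missing when the Pod is created from generateName
    metadata = req['object'].get('metadata', {})
    pod_name = metadata.get('name') or metadata.get('generateName', '<unnamed>')

    # Force the nodeSelector ("add" on an existing path replaces its value)
    patch = [
        {"op": "add", "path": "/spec/nodeSelector", "value": {"custom-node": "true"}}
    ]

    resp = {
        # The response must be a complete AdmissionReview object
        "apiVersion": review.get("apiVersion", "admission.k8s.io/v1"),
        "kind": "AdmissionReview",
        "response": {
            "uid": req['uid'],
            "allowed": True,
            "patchType": "JSONPatch",
            # The patch must be sent as base64-encoded JSON, not as a raw list
            "patch": base64.b64encode(json.dumps(patch).encode()).decode()
        }
    }
    print(f"[Webhook] Pod {pod_name} patch applied: {patch}")
    return jsonify(resp)

if __name__ == '__main__':
    # Note: the API server only calls webhooks over HTTPS, so a real deployment needs TLS
    app.run(port=5001)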

Notes

  • Intercepts Pod creation and forces the nodeSelector according to the rule.

  • Logging: the webhook prints each modification it makes.
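
One detail worth sketching: the API server expects the `patch` field of the AdmissionReview response to be a base64-encoded JSON Patch document, and a Pod may already carry its own nodeSelector. The sketch below merges labels instead of overwriting them. `build_admission_response` and the label key `example.com/custom-node` are hypothetical names used for illustration.

```python
import base64
import json
from typing import Any


def build_admission_response(review: dict[str, Any],
                             selector: dict[str, str]) -> dict[str, Any]:
    """Build an AdmissionReview response that merges `selector` into nodeSelector."""
    request = review["request"]
    pod_spec = request["object"].get("spec", {})
    if pod_spec.get("nodeSelector"):
        # Keep the existing labels and add ours one by one.
        # "~" and "/" inside a key must be escaped as ~0 and ~1 (JSON Pointer rules).
        patch = [
            {"op": "add",
             "path": "/spec/nodeSelector/" + key.replace("~", "~0").replace("/", "~1"),
             "value": value}
            for key, value in selector.items()
        ]
    else:
        # No nodeSelector yet: create the whole map in one operation
        patch = [{"op": "add", "path": "/spec/nodeSelector", "value": dict(selector)}]
    encoded = base64.b64encode(json.dumps(patch).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": encoded,
        },
    }


# Sample request (made up): the Pod already selects SSD nodes
review = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "abc-123",
        "object": {"metadata": {"name": "demo"},
                   "spec": {"nodeSelector": {"disktype": "ssd"}}},
    },
}
response = build_admission_response(review, {"example.com/custom-node": "true"})
decoded = json.loads(base64.b64decode(response["response"]["patch"]))
print(decoded)
```

Adding keys individually keeps the `disktype` label in place, whereas a single `add` on `/spec/nodeSelector` would replace the whole map.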


5️⃣ Scheduler (Flask)

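from flask import Flask, jsonify
from kubernetes import client, config
import threading
import time

app = Flask(__name__)
config.load_kube_config()
v1 = client.CoreV1Api()

# Assumed scheduler name; Pods opt in by setting spec.schedulerName to this value
SCHEDULER_NAME = "custom-scheduler"

def bind_pod(pod, node_name):
    # Pods are assigned through the Binding subresource;
    # spec.nodeName cannot be changed by patching an existing Pod
    binding = client.V1Binding(
        metadata=client.V1ObjectMeta(name=pod.metadata.name),
        target=client.V1ObjectReference(api_version="v1", kind="Node", name=node_name),
    )
    # _preload_content=False skips response deserialization,
    # which is known to fail for Bindings in some client versions
    v1.create_namespaced_binding(pod.metadata.namespace, binding, _preload_content=False)

def scheduler_loop():
    while True:
        try:
            # "spec.nodeName=" selects Pods that have not been assigned to a Node yet
            pods = v1.list_pod_for_all_namespaces(field_selector="spec.nodeName=").items
            nodes = v1.list_node().items
            for pod in pods:
                # Leave Pods that belong to other schedulers alone
                if pod.spec.scheduler_name != SCHEDULER_NAME or pod.status.phase != "Pending":
                    continue
                target_node = None
                for node in nodes:
                    # capacity['cpu'] is the Node's core count, used here as a stand-in for load
                    cpu = int(node.status.capacity['cpu'])
                    if cpu < 50:
                        target_node = node.metadata.name
                        break
                if target_node:
                    bind_pod(pod, target_node)
                    print(f"[Scheduler] Pod {pod.metadata.name} scheduled to {target_node}")
        except Exception as exc:
            # Keep the loop alive if a request to the API server fails
            print(f"[Scheduler] Scheduling round failed: {exc}")
        time.sleep(10)

threading.Thread(target=scheduler_loop, daemon=True).start()

@app.route("/scheduler/status")
def scheduler_status():
    return jsonify({"status": "running"})

if __name__ == "__main__":
    app.run(port=5002)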

Notes

  • Monitors unscheduled Pods and picks a Node for each one based on Node load.

  • Logging: the Scheduler prints every scheduling decision to the console.
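
The "pick a Node based on load" step can be illustrated with a small, cluster-free sketch. It swaps in a simple most-free-CPU strategy, which is an assumption and not necessarily what a production scheduler would do; `pick_node` and the `allocatable_cpu` / `requested_cpu` field names are hypothetical.

```python
from typing import Optional


def pick_node(pod_cpu_request: float,
              nodes: dict[str, dict[str, float]]) -> Optional[str]:
    """Return the node with the most free CPU that still fits the request, or None."""
    best_name: Optional[str] = None
    best_free = -1.0
    for name, info in nodes.items():
        # Free CPU = what the node offers minus what running Pods already requested
        free = info["allocatable_cpu"] - info["requested_cpu"]
        if free >= pod_cpu_request and free > best_free:
            best_name, best_free = name, free
    return best_name


# Sample data (made up), in cores
nodes = {
    "node-a": {"allocatable_cpu": 4.0, "requested_cpu": 3.5},  # 0.5 free
    "node-b": {"allocatable_cpu": 8.0, "requested_cpu": 2.0},  # 6.0 free
    "node-c": {"allocatable_cpu": 2.0, "requested_cpu": 0.5},  # 1.5 free
}
print(pick_node(1.0, nodes))   # node-b
print(pick_node(10.0, nodes))  # None
```

Returning `None` when nothing fits lets the caller leave the Pod pending and retry on the next round.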


6️⃣ Aggregated API (Flask)

from flask import Flask, jsonify
from kubernetes import client, config

app = Flask(__name__)
config.load_kube_config()
v1 = client.CoreV1Api()

@app.route("/api/node-load")
def node_load():
    nodes = v1.list_node().items
    data = []
    for node in nodes:
        data.append({
            "name": node.metadata.name,
            "cpu": node.status.capacity['cpu'],
            "memory": node.status.capacity['memory']
        })
    print("[Aggregated API] Node load queried")
    return jsonify({"nodes": data})

if __name__ == "__main__":
    app.run(port=5003)

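Notes

  • Exposes an HTTP query interface for Node state. The example above covers Node capacity only; CRD state could be served in the same way.

  • Logging: every query is recorded on the console.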
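
For the API Server to actually proxy requests to a service like this, the Kubernetes aggregation layer expects an APIService object to be registered, and the backing service has to serve HTTPS under `/apis/<group>/<version>`. The following is only a sketch of what such a registration could look like: the group `nodeload.example.com`, the Service name `node-load-api`, and the priority values are all assumptions, not part of the original setup.

```python
import json
from typing import Any


def build_api_service(group: str, version: str, service_name: str,
                      service_namespace: str, port: int = 443) -> dict[str, Any]:
    """Return an APIService manifest for the aggregation layer (hypothetical helper)."""
    return {
        "apiVersion": "apiregistration.k8s.io/v1",
        "kind": "APIService",
        # APIService objects are named <version>.<group>
        "metadata": {"name": f"{version}.{group}"},
        "spec": {
            "group": group,
            "version": version,
            "service": {
                "name": service_name,
                "namespace": service_namespace,
                "port": port,
            },
            # Example priorities; pick values that fit your cluster
            "groupPriorityMinimum": 1000,
            "versionPriority": 15,
            # Demo only: skips certificate checks; use caBundle for anything real
            "insecureSkipTLSVerify": True,
        },
    }


api_service = build_api_service("nodeload.example.com", "v1", "node-load-api", "default")
print(json.dumps(api_service, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`. Without such a registration, the Flask service remains a plain HTTP endpoint that sits beside the API Server instead of behind it.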


7️⃣ Functional Relationship Summary

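Layer | Function | Logging | Python implementation
----- | -------- | ------- | ---------------------
API Server | Stores the CRD | kubectl describe nlr | YAML
Controller / Operator | Watches the CRD objects + Nodes, generates scheduling suggestions | Console | Flask + Kubernetes client
Admission Webhook | Modifies the Pod nodeSelector | Console | Flask
Scheduler | Schedules Pods based on Node load | Console | Flask
Aggregated API | Node/CRD state queries | Console | Flask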

 

+------------------+
| Kubernetes API   |
| Server           |
+------------------+
          |
          v
+------------------+          +----------------+
| Aggregated API   |          | Custom CRD     |
| Server           |          | (NodeLoadRule) |
+------------------+          +----------------+
          |
          v
+------------------+
| Operator         | --watch--> Pod, Node, CRD
+------------------+
          |
          v
+------------------+
| Scheduler        | --custom scheduling logic
+------------------+
          |
          v
+------------------+
| Admission Webhook|
| (validating/     |
| mutating)        |
+------------------+

posted on 2025-09-26 18:59  吃草的青蛙
