Getting Started with Kubernetes Operator Development
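This guide walks through writing a simple Kubernetes Operator with Python and the Kopf framework. We will define a custom resource named WebSite. When a user creates a WebSite, the Operator automatically creates a Deployment and a Service for it, and it cleans them up when the WebSite is deleted.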
Prerequisites
- A working Kubernetes cluster
- Python 3.8+
- The kopf and kubernetes Python packages: pip install kopf kubernetes
- Familiarity with basic Kubernetes concepts (Deployment, Service, CRD)
Step 1: Define the custom resource (CRD)
First, the custom resource has to be registered in the cluster. Create a file named crd-website.yaml:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: websites.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
              required: ["image"]
            # Handler results are written to status; fields that are not
            # declared in the schema would be pruned by the API server.
            status:
              type: object
              x-kubernetes-preserve-unknown-fields: true
  scope: Namespaced
  names:
    plural: websites
    singular: website
    kind: WebSite
    shortNames:
      - ws
Apply the CRD to the cluster:
kubectl apply -f crd-website.yaml
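To make the schema constraints concrete, the following sketch mirrors in plain Python the checks that the CRD above asks the API server to enforce on a WebSite spec: image is required and must be a string, while replicas is optional but must be an integer between 1 and 10. The function name validate_website_spec is hypothetical and purely illustrative; the real validation is done by the API server from the openAPIV3Schema, not by this code.

```python
def validate_website_spec(spec):
    """Return a list of validation errors for a WebSite spec (empty if valid)."""
    errors = []

    # 'image' is listed under 'required' and must be a string.
    if "image" not in spec:
        errors.append("spec.image is required")
    elif not isinstance(spec["image"], str):
        errors.append("spec.image must be a string")

    # 'replicas' is optional, but if present it must be an integer in [1, 10].
    if "replicas" in spec:
        replicas = spec["replicas"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(replicas, bool) or not isinstance(replicas, int):
            errors.append("spec.replicas must be an integer")
        elif not 1 <= replicas <= 10:
            errors.append("spec.replicas must be between 1 and 10")

    return errors


if __name__ == "__main__":
    print(validate_website_spec({"image": "nginx:latest", "replicas": 2}))
    print(validate_website_spec({"replicas": 11}))
```

A spec that fails these checks would be rejected at kubectl apply time, before the Operator ever sees it.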
Step 2: Write the Operator code
Create a file named operator.py with the following content:
import kopf
import kubernetes


# Triggered when a WebSite resource is created or updated
@kopf.on.create('example.com', 'v1', 'websites')
@kopf.on.update('example.com', 'v1', 'websites')
def create_or_update_website(spec, name, namespace, logger, **kwargs):
    # Read the parameters from the spec
    image = spec.get('image')
    replicas = spec.get('replicas', 1)

    # Build the Deployment and Service manifests (as Python dicts)
    deployment_manifest = {
        'apiVersion': 'apps/v1',
        'kind': 'Deployment',
        'metadata': {
            'name': f'{name}-deployment',
            'namespace': namespace,
            'labels': {'app': name}
        },
        'spec': {
            'replicas': replicas,
            'selector': {'matchLabels': {'app': name}},
            'template': {
                'metadata': {'labels': {'app': name}},
                'spec': {
                    'containers': [{
                        'name': 'web',
                        'image': image,
                        'ports': [{'containerPort': 80}]
                    }]
                }
            }
        }
    }
    service_manifest = {
        'apiVersion': 'v1',
        'kind': 'Service',
        'metadata': {
            'name': f'{name}-service',
            'namespace': namespace
        },
        'spec': {
            'selector': {'app': name},
            'ports': [{'protocol': 'TCP', 'port': 80, 'targetPort': 80}],
            'type': 'ClusterIP'
        }
    }

    # Create the API clients. No configuration is loaded explicitly here:
    # the code relies on the kubernetes client already being configured when
    # the handler runs (inside the cluster, from the ServiceAccount).
    api = kubernetes.client.AppsV1Api()
    v1 = kubernetes.client.CoreV1Api()

    # Create or update the Deployment
    try:
        # Check whether the Deployment already exists
        api.read_namespaced_deployment(name=f'{name}-deployment', namespace=namespace)
        # It exists, so patch it
        api.patch_namespaced_deployment(name=f'{name}-deployment', namespace=namespace, body=deployment_manifest)
        logger.info(f"Deployment {name}-deployment updated")
    except kubernetes.client.rest.ApiException as e:
        if e.status == 404:
            # It does not exist, so create it
            api.create_namespaced_deployment(namespace=namespace, body=deployment_manifest)
            logger.info(f"Deployment {name}-deployment created")
        else:
            raise

    # Create or update the Service
    try:
        v1.read_namespaced_service(name=f'{name}-service', namespace=namespace)
        v1.patch_namespaced_service(name=f'{name}-service', namespace=namespace, body=service_manifest)
        logger.info(f"Service {name}-service updated")
    except kubernetes.client.rest.ApiException as e:
        if e.status == 404:
            v1.create_namespaced_service(namespace=namespace, body=service_manifest)
            logger.info(f"Service {name}-service created")
        else:
            raise

    # Optional: the returned dict is recorded in the custom resource's status
    return {'status': 'deployed', 'url': f'http://{name}-service.{namespace}.svc.cluster.local'}


# Triggered when a WebSite resource is deleted
@kopf.on.delete('example.com', 'v1', 'websites')
def delete_website(name, namespace, logger, **kwargs):
    api = kubernetes.client.AppsV1Api()
    v1 = kubernetes.client.CoreV1Api()

    # Delete the Deployment (a 404 means it is already gone)
    try:
        api.delete_namespaced_deployment(name=f'{name}-deployment', namespace=namespace)
        logger.info(f"Deployment {name}-deployment deleted")
    except kubernetes.client.rest.ApiException as e:
        if e.status != 404:
            raise

    # Delete the Service (a 404 means it is already gone)
    try:
        v1.delete_namespaced_service(name=f'{name}-service', namespace=namespace)
        logger.info(f"Service {name}-service deleted")
    except kubernetes.client.rest.ApiException as e:
        if e.status != 404:
            raise
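The Deployment and Service branches in the handler follow the same pattern: read the object, patch it if it exists, create it on a 404. As a sketch, that pattern could be factored into one helper. The names create_or_patch and ApiError are hypothetical stand-ins so that the example runs without a cluster. In the Operator, the exception would be kubernetes.client.rest.ApiException, and the three callables would wrap the read, patch, and create methods of the API client.

```python
class ApiError(Exception):
    """Stand-in for kubernetes.client.rest.ApiException (illustration only)."""

    def __init__(self, status):
        super().__init__(f"API error {status}")
        self.status = status


def create_or_patch(read, patch, create):
    """Patch the object if it exists, create it on a 404, re-raise anything else.

    Returns "patched" or "created" so the caller can log what happened.
    """
    try:
        read()
    except ApiError as e:
        if e.status != 404:
            raise
        create()
        return "created"
    patch()
    return "patched"


if __name__ == "__main__":
    store = {}  # toy in-memory "cluster"

    def read():
        if "web" not in store:
            raise ApiError(404)
        return store["web"]

    def write():
        store["web"] = {"replicas": 2}

    print(create_or_patch(read, write, write))  # first call creates
    print(create_or_patch(read, write, write))  # second call patches
```

Unlike the handler code, this sketch keeps the patch call outside the try block, so a 404 raised by the patch itself is not mistaken for "object does not exist".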
Step 3: Containerize the Operator
To run the Operator inside the Kubernetes cluster, we need to package it as a Docker image.
Create a Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY operator.py .
CMD ["kopf", "run", "/app/operator.py", "--verbose"]
Create requirements.txt:
kopf
kubernetes
pyyaml
Build the image (assuming it is named my-website-operator:latest):
docker build -t my-website-operator:latest .
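Step 4: Deploy the Operator to Kubernetes
The Operator needs RBAC permissions to create and delete Deployments and Services. Create a file named rbac.yaml (the ClusterRoleBinding assumes the Operator runs in the default namespace):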
apiVersion: v1
kind: ServiceAccount
metadata:
  name: website-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: website-operator
rules:
  - apiGroups: ["example.com"]
    resources: ["websites"]
    verbs: ["get", "list", "watch", "patch", "update"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["*"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["*"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: website-operator
subjects:
  - kind: ServiceAccount
    name: website-operator
    namespace: default
roleRef:
  kind: ClusterRole
  name: website-operator
  apiGroup: rbac.authorization.k8s.io
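To illustrate how these rules are evaluated, here is a simplified model of RBAC rule matching: a request is allowed if at least one rule lists its API group, its resource, and its verb, with "*" acting as a wildcard. The function is_allowed is hypothetical, and the sketch deliberately ignores resourceNames, subresources, and non-resource URLs, which real RBAC also takes into account.

```python
# The rules from the ClusterRole in rbac.yaml, transcribed as Python data.
RULES = [
    {"apiGroups": ["example.com"], "resources": ["websites"],
     "verbs": ["get", "list", "watch", "patch", "update"]},
    {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["*"]},
    {"apiGroups": [""], "resources": ["services"], "verbs": ["*"]},
    {"apiGroups": ["apiextensions.k8s.io"],
     "resources": ["customresourcedefinitions"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["events"],
     "verbs": ["create", "patch", "update"]},
]


def _matches(allowed, value):
    """True if the value is listed explicitly or covered by the "*" wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(rules, api_group, resource, verb):
    """Simplified RBAC check: some single rule must match group, resource, and verb."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


if __name__ == "__main__":
    print(is_allowed(RULES, "apps", "deployments", "delete"))
    print(is_allowed(RULES, "example.com", "websites", "delete"))
```

Under this model the Operator may delete Deployments but not WebSite objects themselves, which matches what the handlers in operator.py do: they only create, patch, and delete the child resources.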
Create the Deployment manifest, deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: website-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: website-operator
  template:
    metadata:
      labels:
        app: website-operator
    spec:
      serviceAccountName: website-operator
      containers:
        - name: operator
          image: my-website-operator:latest
          imagePullPolicy: IfNotPresent
Apply the RBAC rules and the Deployment:
kubectl apply -f rbac.yaml
kubectl apply -f deployment.yaml
Check the Operator logs:
kubectl logs -f deployment/website-operator
Step 5: Test the Operator
Now we can create an instance of the custom resource. Create a file named example-website.yaml:
apiVersion: example.com/v1
kind: WebSite
metadata:
  name: my-nginx
spec:
  image: nginx:latest
  replicas: 2
Apply it:
kubectl apply -f example-website.yaml
The Operator logs should now show messages about the Deployment and Service being created. Check the Kubernetes resources:
kubectl get deployments
kubectl get services
kubectl get websites
You should see resources named my-nginx-deployment and my-nginx-service. Now delete the custom resource:
kubectl delete website my-nginx
The corresponding Deployment and Service are deleted automatically by the Operator's delete handler.
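An alternative to deleting the children by hand in a delete handler is to set an owner reference on each child, so that Kubernetes garbage collection removes them once the WebSite is gone. The sketch below builds the ownerReferences entry manually from a WebSite object's metadata. The helper names and the placeholder UID are hypothetical. Kopf also provides kopf.adopt() for this purpose; it is not used here so that the example runs without Kopf installed.

```python
import copy


def owner_reference(owner):
    """Build an ownerReferences entry that points at the given owner object."""
    return {
        "apiVersion": owner["apiVersion"],
        "kind": owner["kind"],
        "name": owner["metadata"]["name"],
        "uid": owner["metadata"]["uid"],
        "controller": True,
        "blockOwnerDeletion": True,
    }


def with_owner(child, owner):
    """Return a copy of the child manifest that is owned by `owner`.

    The owner and the child are expected to live in the same namespace.
    """
    owned = copy.deepcopy(child)
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(owner_reference(owner))
    return owned


if __name__ == "__main__":
    website = {
        "apiVersion": "example.com/v1",
        "kind": "WebSite",
        "metadata": {
            "name": "my-nginx",
            "namespace": "default",
            "uid": "00000000-0000-0000-0000-000000000000",  # placeholder UID
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "my-nginx-service", "namespace": "default"},
    }
    print(with_owner(service, website)["metadata"]["ownerReferences"])
```

If the Deployment and Service manifests were created with such a reference, the explicit delete handler would become optional, since the cluster itself would clean up the children.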
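Going further
- Status updates: the dict returned by create_or_update_website is stored by Kopf in the resource's status field, under status.create_or_update_website (the CRD schema must allow a status field for it to be persisted). You can inspect it with kubectl get website my-nginx -o yaml.
- Error handling: Kopf has a built-in retry mechanism, which can be controlled through the retries parameter of @kopf.on.create.
- More complex logic: add further custom controller behavior as your use case requires.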
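As a sketch of where the handler's return value ends up: Kopf stores a handler's result under a status key named after the handler, so for this guide the URL would sit under status.create_or_update_website. The helper website_url and the sample object are hypothetical. The sample mimics what kubectl get website my-nginx -o json might return, assuming the CRD schema lets the status field be stored.

```python
def website_url(resource, handler="create_or_update_website"):
    """Return the URL recorded by the handler, or None if nothing is recorded yet."""
    result = resource.get("status", {}).get(handler) or {}
    return result.get("url")


if __name__ == "__main__":
    # Hypothetical object, shaped like the JSON output for a WebSite resource.
    sample = {
        "apiVersion": "example.com/v1",
        "kind": "WebSite",
        "metadata": {"name": "my-nginx", "namespace": "default"},
        "spec": {"image": "nginx:latest", "replicas": 2},
        "status": {
            "create_or_update_website": {
                "status": "deployed",
                "url": "http://my-nginx-service.default.svc.cluster.local",
            }
        },
    }
    print(website_url(sample))
```

Returning None for a missing status keeps the helper usable right after creation, before the handler has finished.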
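Summary
This guide showed how to build a Kubernetes Operator quickly with Python and Kopf. Kopf takes care of low-level details such as watching events and dispatching resource changes to handlers, so developers can concentrate on the business logic.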
