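nacos-server 2.4.x cluster deployment (on Alibaba Cloud Kubernetes)
Deployment goal
Deploy a Nacos cluster backed by MySQL inside the private VPC network of an Alibaba Cloud Kubernetes cluster, with authentication disabled.
Prerequisites
- A MySQL instance: 172.16.16.66:3306, username: root, password: pwd123456
- An existing Alibaba Cloud Kubernetes cluster, namespace: pd-cloud-online
Deployment steps
Step 1: Import the table schema required by Nacos
Create a database named nacos in MySQL and import the table schema that Nacos requires into it. For the schema, see: MySQL table schema
Step 2: Deploy the nacos-server StatefulSet
StatefulSet YAML: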
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: pd-nacos-cm
  name: pd-nacos-cm
  namespace: pd-cloud-online
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
  podManagementPolicy: OrderedReady
  replicas: 3
  selector:
    matchLabels:
      app: pd-nacos-cm
  # Must equal the name of the headless Service created in Step 3
  serviceName: pd-nacos-cm-headless
  template:
    metadata:
      labels:
        app: pd-nacos-cm
    spec:
      containers:
        - env:
            - name: MODE
              value: cluster
            - name: NACOS_AUTH_ENABLE
              value: 'false'
            - name: MYSQL_SERVICE_HOST
              value: 172.16.16.66
            - name: MYSQL_SERVICE_DB_NAME
              value: nacos
            - name: MYSQL_SERVICE_USER
              value: root
            - name: MYSQL_SERVICE_PASSWORD
              value: pwd123456
            # One fully qualified pod DNS name per replica, separated by spaces
            - name: NACOS_SERVERS
              value: 'pd-nacos-cm-0.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:8848 pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:8848 pd-nacos-cm-2.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:8848'
            # Store configuration in MySQL instead of the embedded database
            - name: SPRING_DATASOURCE_PLATFORM
              value: mysql
            # Identify cluster members by hostname rather than by pod IP
            - name: PREFER_HOST_MODE
              value: hostname
          image: 'nacos/nacos-server:v2.4.3'
          imagePullPolicy: IfNotPresent
          name: pd-nacos-cm
          ports:
            # 8848: HTTP API and console
            - containerPort: 8848
              name: p-client
              protocol: TCP
            # 9848: client gRPC
            - containerPort: 9848
              name: p-client-rpc
              protocol: TCP
            # 9849: server-to-server gRPC (the Raft port itself is 7848)
            - containerPort: 9849
              name: p-raft-rpc
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            initialDelaySeconds: 15
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 8848
            timeoutSeconds: 1
          resources:
            limits:
              cpu: '1'
              memory: 2Gi
            requests:
              cpu: 500m
              memory: 512Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /etc/localtime
              name: volume-localtime
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - hostPath:
            path: /etc/localtime
            type: ''
          name: volume-localtime
  updateStrategy:
    type: RollingUpdate
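Wait for the pods to finish starting. Note that the per-pod DNS names listed in NACOS_SERVERS only resolve once the headless Service from Step 3 exists, so the nodes cannot find each other before then.
Step 3: Create the Nacos headless Service (used by the Nacos cluster itself)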
apiVersion: v1
kind: Service
metadata:
  name: pd-nacos-cm-headless
  namespace: pd-cloud-online
spec:
  ports:
    - name: p1
      port: 8848
      protocol: TCP
      targetPort: 8848
    - name: p2
      port: 9848
      protocol: TCP
      targetPort: 9848
    - name: p3
      port: 9849
      protocol: TCP
      targetPort: 9849
    - name: p4
      port: 7848
      protocol: TCP
      targetPort: 7848
  selector:
    app: pd-nacos-cm
  clusterIP: None
  type: ClusterIP
Step 4: Create a regular Service (used by other services for service discovery and registration)
apiVersion: v1
kind: Service
metadata:
  name: pd-nacos
  namespace: pd-cloud-online
spec:
  ports:
    - name: p1
      port: 8848
      protocol: TCP
      targetPort: 8848
    - name: p2
      port: 9848
      protocol: TCP
      targetPort: 9848
    - name: p3
      port: 9849
      protocol: TCP
      targetPort: 9849
  selector:
    app: pd-nacos-cm
  type: ClusterIP
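Step 5: Check the cluster status
Open a browser and visit the IP of any one of the pods on port 8848 (the console is served under the /nacos path).
Confirm that the cluster node list shows 3 nodes.
The cluster is healthy when the metadata of every node has the following form: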
{
  "lastRefreshTime": 1737869534723,
  "raftMetaData": {
    "metaDataMap": {
      "naming_instance_metadata": {
        "leader": "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
        "raftGroupMember": [
          "pd-nacos-cm-0.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-2.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_persistent_service": {
        "leader": "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
        "raftGroupMember": [
          "pd-nacos-cm-0.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-2.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_persistent_service_v2": {
        "leader": "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
        "raftGroupMember": [
          "pd-nacos-cm-0.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-2.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_service_metadata": {
        "leader": "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
        "raftGroupMember": [
          "pd-nacos-cm-0.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-2.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848",
          "pd-nacos-cm-1.pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848"
        ],
        "term": 1
      }
    }
  },
  "raftPort": "7848",
  "readyToUpgrade": true,
  "version": "2.4.3"
}
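Comparing this metadata by eye across three nodes is error-prone, so here is a minimal Python sketch of the same check. It assumes you have copied one node's metadata JSON out of the console; the function name check_raft_metadata and the rule set (every Raft group has the expected member count and a leader that is one of its members) are illustrative choices, not something Nacos itself provides.

```python
import json


def check_raft_metadata(node_metadata, expected_members=3):
    """Return a list of problems found in one node's metadata (empty = looks healthy)."""
    problems = []
    groups = node_metadata.get("raftMetaData", {}).get("metaDataMap", {})
    if not groups:
        problems.append("no raft groups reported")
    for name, group in groups.items():
        members = group.get("raftGroupMember", [])
        leader = group.get("leader")
        if len(members) != expected_members:
            problems.append(
                f"{name}: expected {expected_members} members, got {len(members)}"
            )
        if not leader:
            problems.append(f"{name}: no leader elected")
        elif leader not in members:
            problems.append(f"{name}: leader {leader} is not a group member")
    return problems


# Trimmed sample in the same shape as the metadata shown above (one group only).
SUFFIX = "pd-nacos-cm-headless.pd-cloud-online.svc.cluster.local:7848"
sample = json.loads(json.dumps({
    "raftMetaData": {
        "metaDataMap": {
            "naming_instance_metadata": {
                "leader": f"pd-nacos-cm-1.{SUFFIX}",
                "raftGroupMember": [
                    f"pd-nacos-cm-0.{SUFFIX}",
                    f"pd-nacos-cm-2.{SUFFIX}",
                    f"pd-nacos-cm-1.{SUFFIX}",
                ],
                "term": 1,
            }
        }
    },
    "raftPort": "7848",
}))

print(check_raft_metadata(sample))
```

An empty list means the node passed these checks; run it once per node, since the rule above is that every node must report this shape.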
Pitfalls
Pitfall 1: Some cluster nodes are in the DOWN state
Cause: spec.serviceName in the StatefulSet does not match the name of the Nacos headless Service. Set it to exactly the name of the headless Service.
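This mismatch is easy to catch before deploying. The sketch below assumes the two manifests have been loaded as Python dicts, for example from the output of kubectl get ... -o json; the helper name headless_service_problems and the extra checks (same namespace, clusterIP set to None, selector matching the pod labels) are illustrative additions rather than part of the original setup.

```python
def headless_service_problems(statefulset, service):
    """Compare a StatefulSet manifest with its headless Service manifest."""
    problems = []
    sts_spec = statefulset.get("spec", {})
    svc_meta = service.get("metadata", {})
    svc_spec = service.get("spec", {})

    # The rule from Pitfall 1: the two names must be identical.
    if sts_spec.get("serviceName") != svc_meta.get("name"):
        problems.append(
            "spec.serviceName %r does not match Service name %r"
            % (sts_spec.get("serviceName"), svc_meta.get("name"))
        )
    if statefulset.get("metadata", {}).get("namespace") != svc_meta.get("namespace"):
        problems.append("StatefulSet and Service are in different namespaces")
    if svc_spec.get("clusterIP") != "None":
        problems.append("Service is not headless (clusterIP is not None)")

    # Every selector entry must be present on the pod template labels.
    pod_labels = sts_spec.get("template", {}).get("metadata", {}).get("labels", {})
    selector = svc_spec.get("selector", {})
    if not selector or any(pod_labels.get(k) != v for k, v in selector.items()):
        problems.append("Service selector does not match the pod template labels")
    return problems


# Trimmed versions of the manifests from Step 2 and Step 3.
sts = {
    "metadata": {"name": "pd-nacos-cm", "namespace": "pd-cloud-online"},
    "spec": {
        "serviceName": "pd-nacos-cm-headless",
        "template": {"metadata": {"labels": {"app": "pd-nacos-cm"}}},
    },
}
svc = {
    "metadata": {"name": "pd-nacos-cm-headless", "namespace": "pd-cloud-online"},
    "spec": {"clusterIP": "None", "selector": {"app": "pd-nacos-cm"}},
}

print(headless_service_problems(sts, svc))
```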
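Pitfall 2: Some RPC services fail to register
Cause: the cluster metadata is unhealthy. Set the NACOS_SERVERS environment variable correctly: every entry must be the fully qualified domain name of a pod under the headless Service (the form ending in svc.cluster.local).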
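Typing the NACOS_SERVERS value by hand invites typos, so it can help to generate it. This is a small sketch that assumes the standard StatefulSet pod DNS pattern (pod-ordinal.service.namespace.svc.cluster-domain) and the default cluster domain cluster.local; the function name nacos_servers is made up for this example.

```python
def nacos_servers(statefulset, headless_service, namespace, replicas,
                  port=8848, cluster_domain="cluster.local"):
    """Build the space-separated NACOS_SERVERS value from StatefulSet pod FQDNs."""
    return " ".join(
        f"{statefulset}-{i}.{headless_service}.{namespace}.svc.{cluster_domain}:{port}"
        for i in range(replicas)
    )


value = nacos_servers("pd-nacos-cm", "pd-nacos-cm-headless", "pd-cloud-online", 3)
for entry in value.split(" "):
    print(entry)
```

With the names used in this deployment, the result is the same string as the NACOS_SERVERS value in the StatefulSet from Step 2. If the replica count changes, regenerate the value so that it lists every pod.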
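Pitfall 3: The node list does not match expectations (3 pods configured, but 4 nodes listed)
Cause: Nacos identifies members by IP by default, and a pod's IP can change when it restarts, which likely leaves a stale entry behind. Set PREFER_HOST_MODE=hostname.
Pitfall 4: Newly created configurations are lost after a Nacos pod restarts
Cause: MySQL is not being used, so the data lives in the embedded database inside the pod and disappears with it. Set the environment variable SPRING_DATASOURCE_PLATFORM=mysql.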