Provisioning Volumes on Nomad with Ceph CSI (continuously updated)
The key components used in this article are:
- Nomad v1.2.3
- Ceph Storage v14 (Nautilus)
- Ceph CSI v3.3.1
Provisioning volumes on Nomad
First, add the following to the Nomad client configuration so that Docker containers can run in privileged mode on the Nomad client nodes.
cat <<EOC >> /etc/nomad.d/client.hcl 
plugin "docker" {
  config {
    allow_privileged = true
  }
}
EOC
systemctl restart nomad
Before continuing, load the rbd kernel module on all Nomad client nodes.
sudo modprobe rbd;
sudo lsmod |grep rbd;
rbd                    83733  0
libceph               306750  1 rbd
# load the module automatically at boot
echo "rbd" >> /etc/modules-load.d/ceph.conf
A CSI plugin consists of a controller component and a node component.
First, create a Ceph CSI controller job of type service.
Change the following values before creating the job (see the lookup commands right after this list):
- clusterID: the value shown by ceph -s |grep id
- monitors: the addresses shown by ceph -s |grep mon; use IP addresses, not hostnames
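For reference, a small sketch of how to look these values up on a Ceph admin node; ceph fsid and ceph mon dump are standard Ceph CLI commands and print the same information as the grep one-liners above:
# cluster ID (fsid)
ceph fsid
# monitor addresses; take the v1 ...:6789 endpoints
ceph mon dump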
cat <<'EOC' > ceph-csi-plugin-controller.nomad
job "csi-cephrbd-controller" {
  datacenters = ["dc1", "dc2"]
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }
  type = "service"
  group "cephrbd" {
    network {
      port "prometheus" {}
    }
    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }
    task "plugin" {
      driver = "docker"
      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"
        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--controllerserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]
        ports = ["prometheus"]
        # we need to be able to write key material to disk in this location
        mount {
          type     = "bind"
          source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }
        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }
      }
      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT
        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }
      # this is the main part you need to change
      template {
        data = <<EOF
[{
    "clusterID": "380a1e72-da89-4041-8478-xxxxx",
    "monitors": [
      "10.103.3.x:6789",
      "10.103.3.x:6789",
      "10.103.3.x:6789"
    ]
}]
EOF
        destination = "ceph-csi-config/config.json"
      }
      csi_plugin {
        id        = "cephrbd"
        type      = "controller"
        mount_dir = "/csi"
      }
      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}
EOC
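Optionally, you can have Nomad check the job file before submitting it; nomad job validate only parses and validates the spec, it does not schedule anything:
nomad job validate ceph-csi-plugin-controller.nomad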
Next, create a Ceph CSI node job.
cat <<'EOC' > ceph-csi-plugin-nodes.nomad
job "csi-cephrbd-node" {
  datacenters = ["dc1", "dc2"]
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }
  type = "system"
  group "cephrbd" {
    network {
      port "prometheus" {}
    }
    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }
    task "plugin" {
      driver = "docker"
      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"
        network_mode = "host"  #添加本地网络模式  解决系统messages日志 kernel: libceph: connect (1)10.103.3.xxx:6789 error -101
        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--nodeserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]
        privileged = true
        ports      = ["prometheus"]
        # we need to be able to write key material to disk in this location
        mount {
          type     = "tmpfs"
          #source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }
        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }
      }
      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT
        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }
      # this is the main part you need to change
      template {
        data = <<EOF
[{
    "clusterID": "380a1e72-da89-4041-8478-xxxxx",
    "monitors": [
      "10.103.3.x:6789",
      "10.103.3.x:6789",
      "10.103.3.x:6789"
    ]
}]
EOF
        destination = "ceph-csi-config/config.json"
      }
      csi_plugin {
        id        = "cephrbd"   #如果现在有一个csi,这个不要和那个名字冲突
        type      = "node"
        mount_dir = "/csi"
      }
      # note: there's no upstream guidance on resource usage so
      # this is a best guess until we profile it in heavy use
      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}
EOC
This Ceph node job has type system, so a ceph-csi node container will be created on every Nomad client node.
Run the Ceph CSI jobs
nomad job run ceph-csi-plugin-controller.nomad;
nomad job run ceph-csi-plugin-nodes.nomad;
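If you want to watch the allocations come up first, nomad job status works on both jobs:
nomad job status csi-cephrbd-controller
nomad job status csi-cephrbd-node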
Check the status of the Ceph CSI plugin; the plugin ID is the csi_plugin id from the job specs (cephrbd here)
nomad plugin status cephrbd;
ID                   = cephrbd
Provider             = rbd.csi.ceph.com
Version              = v3.3.1
Controllers Healthy  = 1
Controllers Expected = 1
Nodes Healthy        = 2
Nodes Expected       = 2
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
b6268d6d  457a8291  cephrbd     0        run      running  1d21h ago  1d21h ago
ec265d25  709ee9cc  cephrbd     0        run      running  1d21h ago  1d21h ago
4cd7dffa  457a8291  cephrbd     0        run      running  1d21h ago  1d21h ago
Volumes can now be mounted from the external Ceph storage through the ceph-csi driver.
Let's create a Ceph pool named nomad and an admin user nomadAdmin for it.
# create the pool
ceph osd pool create nomad 64 64
rbd pool init nomad;
# create the nomadAdmin user
ceph auth get-or-create-key client.nomadAdmin mds 'allow *' mgr 'allow *' mon 'allow *' osd 'allow * pool=nomad'
# view the user name and key
ceph auth list |grep -A 3 nomadAdmin
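If you only need the key itself (it goes into the userKey field of the volume spec below), ceph auth get-key prints it on its own:
ceph auth get-key client.nomadAdmin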
Now we need a volume registered with Nomad, so let's create one.
Creating volumes by hand is a bit tedious, so I wrote a small helper script instead:
vi /usr/bin/nvc 
#!/bin/bash
# path:/usr/bin/nvc
VSIZE=$3
VNAME=$2
namespace=$1
cat <<EOF > /tmp/$namespace-$VNAME.hcl
type = "csi"
id   = "$VNAME"
name = "$VNAME"
capacity_min = "$VSIZE"
capacity_max = "$VSIZE"
mount_options {
  fs_type     = "xfs"
  mount_flags = ["discard" ,"defaults"]    # 这里主要是 discard 来实时删除
}
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "block-device"
}
plugin_id       = "cephrbd"  # must match the csi_plugin id in the plugin jobs above
secrets {
  userID  = "nomadAdmin"
  userKey = "AQDD6GxiISmwGhAAlONuWEB869f6yeuEY9iicQ=="
}
parameters {
  clusterID = "380a1e72-da89-4041-8478-76383f5f6378"
  pool      = "nomad"
  imageFeatures = "layering"
}
EOF
nomad volume create -namespace $namespace /tmp/$namespace-$VNAME.hcl
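Make the script executable before using it:
chmod +x /usr/bin/nvc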
Now register the volume with Nomad.
# check volume status
nomad volume status -namespace=ic-es
# create and register the volume
namespace=ic-es
name=ic-node-2
nvc $namespace $name 12576GB
# deregister the volume; the backing RBD image stays in Ceph (-force skips the in-use check)
nomad volume deregister -force -namespace=ic-es ic-node-2
nomad volume deregister -namespace=ic-es ic-node-2
# delete the volume; this also removes the backing RBD image from Ceph
nomad volume delete -namespace=ic-es ic-node-2
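For completeness, a minimal sketch of a job that actually mounts the registered volume. The volume name ic-node-2, the namespace ic-es and the access/attachment modes come from the examples above; the job, group and task names and the busybox image are made up for illustration:
cat <<'EOC' > volume-demo.nomad
job "volume-demo" {
  datacenters = ["dc1"]
  namespace   = "ic-es"
  group "demo" {
    # claim the CSI volume registered above
    volume "data" {
      type            = "csi"
      source          = "ic-node-2"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }
    task "app" {
      driver = "docker"
      config {
        image   = "busybox:1.35"
        command = "sleep"
        args    = ["3600"]
      }
      # mount the claimed volume into the container
      volume_mount {
        volume      = "data"
        destination = "/data"
      }
    }
  }
}
EOC
nomad job run volume-demo.nomad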
Problems encountered
Sometimes deleting a volume that has already been created fails, and the error message contains the phrase:
ceph ... is still being used
Solution
Find the client that still holds a connection (watcher) to the RBD image in Ceph, go to that node and unmount/unmap it. In my case I could not locate the offending mount, so I ended up rebooting the server.
Fixing Pods in K8s that fail to mount a PVC: https://os.51cto.com/article/675005.html
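What I would check with the rbd CLI in that situation (a sketch: the pool name nomad comes from above, while the image name csi-vol-... and the device /dev/rbd0 are placeholders; ceph-csi normally prefixes its backing images with csi-vol-):
# list the backing images in the pool
rbd ls -p nomad
# show which client still holds a watcher on the image
rbd status nomad/csi-vol-xxxxxxxx
# on the node holding the watcher: list mapped devices and unmap the stale one
rbd showmapped
sudo rbd unmap /dev/rbd0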
References
https://rancher.com/docs/rancher/v2.x/en/cluster-admin/volumes-and-storage/ceph/
https://learn.hashicorp.com/tutorials/nomad/stateful-workloads-csi-volumes?in=nomad/stateful-workloads
https://github.com/hashicorp/nomad/tree/main/demo/csi/ceph-csi-plugin
https://docs.ceph.com/en/latest/rbd/rbd-nomad/