Enabling vGPU Time-Slicing on k3s
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
Create the RuntimeClass
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
Create the configuration file
cat dp-example-config.yaml
version: v1
flags:
  migStrategy: "none"
  failOnInitError: true
  nvidiaDriverRoot: "/"
  plugin:
    passDeviceSpecs: false
    deviceListStrategy: "envvar"
    deviceIDStrategy: "uuid"
  gfd:
    oneshot: false
    noTimestamp: false
    outputFile: /etc/kubernetes/node-feature-discovery/features.d/gfd
    sleepInterval: 60s
sharing:
  timeSlicing:
    resources:
    - name: nvidia.com/gpu
      replicas: 10
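With replicas: 10 under sharing.timeSlicing, the device plugin advertises each physical GPU as 10 schedulable nvidia.com/gpu resources. Time-slicing provides no memory or fault isolation between the workloads sharing a GPU. The sketch below only works out the count you should expect to see on a node; PHYSICAL_GPUS is an assumed value for illustration, not something taken from this setup.

```shell
#!/bin/sh
# Assumption: the node has a single physical GPU; change this to match your hardware.
PHYSICAL_GPUS=1
# Value of sharing.timeSlicing.resources[].replicas in dp-example-config.yaml.
REPLICAS=10

# Each physical GPU is advertised REPLICAS times.
EXPECTED=$((PHYSICAL_GPUS * REPLICAS))
echo "expected allocatable nvidia.com/gpu: ${EXPECTED}"
```

After the plugin is running, the number reported for nvidia.com/gpu under Allocatable in the output of kubectl describe node should match this value.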
Install
helm template nvidia-device-plugin . \
  -f values.yaml \
  --set gfd.enabled=true \
  --set-file config.map.config=/root/nvidia/dp-example-config.yaml \
  --set runtimeClassName=nvidia \
  --include-crds \
  --dry-run \
  --namespace nvidia-device-plugin \
  > nvidia-device-plugin-with-time-slicing.yml
As discussed in the issue linked below, the helm template command needs --include-crds so that the chart's CRDs are included in the rendered output.
https://github.com/NVIDIA/gpu-operator/issues/546
--set runtimeClassName=nvidia is required because, although K3s automatically detects nvidia-container-runtime, it does not configure it as the default runtime.
https://fissssssh.aiursoft.cn/posts/configure-nvidia-gpus-in-k3s/
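Because the NVIDIA runtime is not the default, a workload that wants a GPU slice also has to select the nvidia RuntimeClass itself. The following is a minimal sketch of a test Pod that does so; the Pod name gpu-test, the file name gpu-test-pod.yaml, and the image tag are assumptions chosen for illustration and are not part of the original setup.

```shell
#!/bin/sh
# Write a hypothetical test Pod manifest; names and image tag are illustrative assumptions.
cat > gpu-test-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  runtimeClassName: nvidia   # selects the RuntimeClass created earlier
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed tag; use any CUDA base image you have
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1    # one time-sliced replica, not a whole physical GPU
EOF
```

Once the rendered device plugin manifest has been applied, running kubectl apply -f gpu-test-pod.yaml followed by kubectl logs gpu-test should show the nvidia-smi output if the runtime and time-slicing are working.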
