GPUManager Deployment
1. Deploy the gpu-manager-daemonset and gpu-quota-admission services:
kubectl apply -f gpu-manager.yaml
Contents of gpu-manager.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: gpu-quota-admission
  namespace: kube-system
data:
  gpu-quota-admission.config: |
    {
      "QuotaConfigMapName": "gpuquota",
      "QuotaConfigMapNamespace": "kube-system",
      "GPUModelLabel": "gaia.tencent.com/gpu-model",
      "GPUPoolLabel": "gaia.tencent.com/gpu-pool"
    }
---
apiVersion: v1
kind: Service
metadata:
  name: gpu-quota-admission
  namespace: kube-system
spec:
  ports:
  - port: 3456
    protocol: TCP
    targetPort: 3456
  selector:
    k8s-app: gpu-quota-admission
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: gpu-quota-admission
  name: gpu-quota-admission
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: gpu-quota-admission
  template:
    metadata:
      labels:
        k8s-app: gpu-quota-admission
      namespace: kube-system
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
            weight: 1
      containers:
      - env:
        - name: LOG_LEVEL
          value: "4"
        - name: EXTRA_FLAGS
          value: --incluster-mode=true
        image: ccr.ccs.tencentyun.com/tkeimages/gpu-quota-admission:latest
        imagePullPolicy: IfNotPresent
        name: gpu-quota-admission
        ports:
        - containerPort: 3456
          protocol: TCP
        resources:
          limits:
            cpu: "2"
            memory: 2Gi
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
        - mountPath: /root/gpu-quota-admission/
          name: config
      dnsPolicy: ClusterFirstWithHostNet
      initContainers:
      - command:
        - sh
        - -c
        - 'mkdir -p /etc/kubernetes/ && cp /root/gpu-quota-admission/gpu-quota-admission.config /etc/kubernetes/'
        image: busybox
        imagePullPolicy: Always
        name: init-kube-config
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /root/gpu-quota-admission/
          name: config
      priority: 2000000000
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      serviceAccount: gpu-manager
      serviceAccountName: gpu-manager
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      volumes:
      - configMap:
          defaultMode: 420
          name: gpu-quota-admission
        name: config
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gpu-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: gpu-manager
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gpu-manager
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  name: gpu-manager-metric
  namespace: kube-system
  annotations:
    prometheus.io/scrape: "true"
  labels:
    kubernetes.io/cluster-service: "true"
spec:
  clusterIP: None
  ports:
  - name: metrics
    port: 5678
    protocol: TCP
    targetPort: 5678
  selector:
    name: gpu-manager-ds
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-manager-daemonset
  namespace: kube-system
spec:
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: gpu-manager-ds
  template:
    metadata:
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      labels:
        name: gpu-manager-ds
    spec:
      serviceAccount: gpu-manager
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - key: tencent.com/vcuda-core
        operator: Exists
        effect: NoSchedule
      priorityClassName: "system-node-critical"
      nodeSelector:
        nvidia-device-enable: enable
      hostPID: true
      initContainers:
      - image: menghe.tencentcloudcr.com/public/alpine-uvm:0.1
        imagePullPolicy: IfNotPresent
        name: nvidia-uvm-enable
        securityContext:
          capabilities:
            add:
            - ALL
        volumeMounts:
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /dev
          name: dev
      containers:
      - image: tkestack/gpu-manager:v1.0.4
        imagePullPolicy: IfNotPresent
        name: gpu-manager
        securityContext:
          privileged: true
        ports:
        - containerPort: 5678
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
        - name: vdriver
          mountPath: /etc/gpu-manager/vdriver
        - name: vmdata
          mountPath: /etc/gpu-manager/vm
        - name: log
          mountPath: /var/log/gpu-manager
        - mountPath: /var/run/docker.sock
          name: docker
          readOnly: true
        - name: run-dir
          mountPath: /var/run
        - name: cgroup
          mountPath: /sys/fs/cgroup
          readOnly: true
        - name: usr-directory
          mountPath: /usr/local/host
          readOnly: true
        env:
        - name: LOG_LEVEL
          value: "4"
        - name: EXTRA_FLAGS
          value: --incluster-mode=true
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
      volumes:
      - name: device-plugin
        hostPath:
          type: Directory
          path: /var/lib/kubelet/device-plugins
      - name: vmdata
        hostPath:
          type: DirectoryOrCreate
          path: /etc/gpu-manager/vm
      - name: vdriver
        hostPath:
          type: DirectoryOrCreate
          path: /etc/gpu-manager/vdriver
      - name: log
        hostPath:
          type: DirectoryOrCreate
          path: /etc/gpu-manager/log
      - name: docker
        hostPath:
          type: File
          path: /var/run/docker.sock
      - name: cgroup
        hostPath:
          type: Directory
          path: /sys/fs/cgroup
      - name: usr-directory
        hostPath:
          type: Directory
          path: /usr
      - name: run-dir
        hostPath:
          type: Directory
          path: /var/run
      - name: lib-modules
        hostPath:
          type: Directory
          path: /lib/modules
      - name: dev
        hostPath:
          type: Directory
          path: /dev/
2. Label each GPU node with nvidia-device-enable=enable:
kubectl label node *.*.*.* nvidia-device-enable=enable
3. Verify that the gpu-manager-daemonset pods have been scheduled onto the GPU nodes:
kubectl get pods -n kube-system
4. Set the dnsPolicy of the kube-scheduler static pod to ClusterFirstWithHostNet.
5. Enable the scheduler policy config on kube-scheduler by adding the following startup flags:
--policy-config-file=/etc/kubernetes/scheduler-policy-config.json
--use-legacy-policy-config=true
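On a kubeadm-provisioned cluster, steps 4 and 5 are typically applied by editing the static pod manifest at /etc/kubernetes/manifests/kube-scheduler.yaml; the kubelet restarts the scheduler automatically when the file is saved. A sketch of the relevant fields only (the exact manifest layout and mount names are assumptions; adapt them to your cluster):

```yaml
# /etc/kubernetes/manifests/kube-scheduler.yaml (relevant fields only)
spec:
  dnsPolicy: ClusterFirstWithHostNet        # step 4
  containers:
  - command:
    - kube-scheduler
    # ...existing flags...
    - --policy-config-file=/etc/kubernetes/scheduler-policy-config.json
    - --use-legacy-policy-config=true
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler-policy-config.json
      name: policy-config
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler-policy-config.json
      type: File
    name: policy-config
```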
Contents of /etc/kubernetes/scheduler-policy-config.json:
{
  "apiVersion": "v1",
  "extenders": [
    {
      "apiVersion": "v1beta1",
      "enableHttps": false,
      "filterVerb": "predicates",
      "managedResources": [
        {
          "ignoredByScheduler": false,
          "name": "tencent.com/vcuda-core"
        }
      ],
      "nodeCacheCapable": false,
      "urlPrefix": "http://gpu-quota-admission.kube-system:3456/scheduler"
    }
  ],
  "kind": "Policy"
}
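Since a malformed policy file will prevent kube-scheduler from starting, it is worth sanity-checking it before restarting the scheduler. A minimal offline sketch (the policy document from above is embedded here as a string for illustration):

```python
import json

# The scheduler extender policy from above, embedded for a quick offline check.
policy_text = """
{
  "apiVersion": "v1",
  "kind": "Policy",
  "extenders": [
    {
      "apiVersion": "v1beta1",
      "enableHttps": false,
      "filterVerb": "predicates",
      "managedResources": [
        {"ignoredByScheduler": false, "name": "tencent.com/vcuda-core"}
      ],
      "nodeCacheCapable": false,
      "urlPrefix": "http://gpu-quota-admission.kube-system:3456/scheduler"
    }
  ]
}
"""

policy = json.loads(policy_text)  # raises ValueError if the JSON is malformed
extender = policy["extenders"][0]
# The scheduler POSTs filter requests to <urlPrefix>/<filterVerb>.
filter_url = extender["urlPrefix"] + "/" + extender["filterVerb"]
print(filter_url)  # http://gpu-quota-admission.kube-system:3456/scheduler/predicates
```

The printed URL is exactly the endpoint that gpu-quota-admission must serve inside the cluster, which is why step 4 (cluster DNS for the scheduler pod) is required.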
Solution Testing
The tests use the TensorFlow framework; the test image ships with MNIST, CIFAR-10, and the AlexNet benchmark as built-in datasets, so you can pick whichever test suits your needs.
Test steps:
1. Run the verification with the TensorFlow framework and the MNIST dataset. The TensorFlow test Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: vcuda-test
    qcloud-app: vcuda-test
  name: vcuda-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: vcuda-test
  template:
    metadata:
      labels:
        k8s-app: vcuda-test
        qcloud-app: vcuda-test
    spec:
      containers:
      - command:
        - sleep
        - 360000s
        env:
        - name: PATH
          value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
        image: menghe.tencentcloudcr.com/public/tensorflow-gputest:0.2
        imagePullPolicy: IfNotPresent
        name: tensorflow-test
        resources:
          limits:
            cpu: "4"
            memory: 8Gi
            tencent.com/vcuda-core: "50"
            tencent.com/vcuda-memory: "32"
          requests:
            cpu: "4"
            memory: 8Gi
            tencent.com/vcuda-core: "50"
            tencent.com/vcuda-memory: "32"
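The vcuda resources in this Deployment are fractional: by GPUManager's convention, 100 tencent.com/vcuda-core units correspond to one physical GPU, and each tencent.com/vcuda-memory unit is 256 MiB of device memory. A quick sketch of what the request above translates to:

```python
# GPUManager convention: 100 vcuda-core units == one physical GPU,
# and one vcuda-memory unit == 256 MiB of device memory.
VCUDA_CORE_PER_GPU = 100
VCUDA_MEMORY_UNIT_MIB = 256

def vcuda_to_physical(core_units: int, memory_units: int):
    """Translate vcuda resource numbers into (GPU fraction, device memory in MiB)."""
    return core_units / VCUDA_CORE_PER_GPU, memory_units * VCUDA_MEMORY_UNIT_MIB

# The test Deployment requests vcuda-core: "50" and vcuda-memory: "32".
gpu_fraction, mem_mib = vcuda_to_physical(50, 32)
print(gpu_fraction, mem_mib)  # 0.5 8192
```

So the pod is allotted half of one card's compute and 8 GiB of its memory, which is what the scheduler extender accounts for when placing pods.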
2. Enter the test container (it runs in the default namespace; if you modified the test YAML, specify the namespace accordingly):
kubectl exec -it `kubectl get pods -o name | cut -d '/' -f2` -- bash
3. Run the test command; choose a training framework/dataset as needed:
a. MNIST
cd /data/tensorflow/mnist && time python convolutional.py
b. AlexNet
cd /data/tensorflow/alexnet && time python alexnet_benchmark.py
c. CIFAR-10
cd /data/tensorflow/cifar10 && time python cifar10_train.py
4. On the physical machine, check GPU usage with the nvidia-smi pmon -s u -d 1 command.
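nvidia-smi pmon prints one row per process per sampling interval, with sm/mem utilization as percentages and "-" where no sample is available. A small sketch for post-processing such output offline (the sample lines below are illustrative, not captured from a real run, and the exact column layout may vary by driver version):

```python
# Parse "nvidia-smi pmon -s u"-style output: one row per process,
# with sm/mem utilization percentages. Sample text is illustrative only.
sample = """\
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0      12345     C    48    30     0     0   python
    0      23456     C     -     -     -     -   sleep
"""

def parse_pmon(text):
    """Return a list of dicts, one per process row; '-' becomes None."""
    rows = []
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        gpu, pid, ptype, sm, mem, enc, dec, command = line.split()
        rows.append({
            "gpu": int(gpu),
            "pid": int(pid),
            "sm": None if sm == "-" else int(sm),
            "mem": None if mem == "-" else int(mem),
            "command": command,
        })
    return rows

rows = parse_pmon(sample)
print(rows[0]["command"], rows[0]["sm"])  # python 48
```

With the vcuda-core limit of "50" above, the sm column for the test process should stay around or below 50% while a benchmark is running.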