Volcano (gang scheduling) test
1. Install Volcano
Get the Volcano source from GitHub; the installation manifest is at /volcano/installer/volcano-development.yaml. Apply it directly.
[root@master1 home]# kubectl apply -f volcano-development.yaml
namespace/volcano-system created
namespace/volcano-monitoring created
serviceaccount/volcano-admission created
configmap/volcano-admission-configmap created
clusterrole.rbac.authorization.k8s.io/volcano-admission created
clusterrolebinding.rbac.authorization.k8s.io/volcano-admission-role created
service/volcano-admission-service created
deployment.apps/volcano-admission created
job.batch/volcano-admission-init created
customresourcedefinition.apiextensions.k8s.io/jobs.batch.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/commands.bus.volcano.sh created
serviceaccount/volcano-controllers created
clusterrole.rbac.authorization.k8s.io/volcano-controllers created
clusterrolebinding.rbac.authorization.k8s.io/volcano-controllers-role created
deployment.apps/volcano-controllers created
serviceaccount/volcano-scheduler created
configmap/volcano-scheduler-configmap created
clusterrole.rbac.authorization.k8s.io/volcano-scheduler created
clusterrolebinding.rbac.authorization.k8s.io/volcano-scheduler-role created
service/volcano-scheduler-service created
deployment.apps/volcano-scheduler created
customresourcedefinition.apiextensions.k8s.io/podgroups.scheduling.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/queues.scheduling.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/numatopologies.nodeinfo.volcano.sh created
mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-pods-mutate created
mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-queues-mutate created
mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-podgroups-mutate created
mutatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-jobs-mutate created
validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-jobs-validate created
validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-pods-validate created
validatingwebhookconfiguration.admissionregistration.k8s.io/volcano-admission-service-queues-validate created
customresourcedefinition.apiextensions.k8s.io/jobtemplates.flow.volcano.sh created
customresourcedefinition.apiextensions.k8s.io/jobflows.flow.volcano.sh created
If the query below returns output like the following, the deployment is complete.
[root@master1 home]# kubectl get pod -n volcano-system
NAME                                   READY   STATUS      RESTARTS   AGE
volcano-admission-cd4f48cb6-6ms8h      1/1     Running     0          5h34m
volcano-admission-init-cxj8b           0/1     Completed   0          5h34m
volcano-controllers-85b55c8649-rtk9c   1/1     Running     0          5h34m
volcano-scheduler-6bdcf79855-lhlk5     1/1     Running     0          5h34m
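The "deployment complete" check can also be done programmatically. The sketch below is only an illustration: the helper name `unhealthy_pods` and the rule "every pod must be Running or Completed" are assumptions made here, not part of Volcano. It parses the text printed by `kubectl get pod` and returns the pods that are in any other state.

```python
# Sample text copied from the `kubectl get pod -n volcano-system` output above.
SAMPLE_OUTPUT = """\
NAME READY STATUS RESTARTS AGE
volcano-admission-cd4f48cb6-6ms8h 1/1 Running 0 5h34m
volcano-admission-init-cxj8b 0/1 Completed 0 5h34m
volcano-controllers-85b55c8649-rtk9c 1/1 Running 0 5h34m
volcano-scheduler-6bdcf79855-lhlk5 1/1 Running 0 5h34m
"""

# Assumption: these two states are what "deployed" means for this check.
HEALTHY_STATES = ("Running", "Completed")


def unhealthy_pods(kubectl_output):
    """Return the names of pods whose STATUS is not in HEALTHY_STATES."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    # Locate the STATUS column from the header row instead of hard-coding it.
    status_col = lines[0].split().index("STATUS")
    bad = []
    for line in lines[1:]:
        fields = line.split()
        if fields[status_col] not in HEALTHY_STATES:
            bad.append(fields[0])  # first column is the pod name
    return bad


print(unhealthy_pods(SAMPLE_OUTPUT))
```

In a live cluster the same function could be fed the standard output of the `kubectl get pod -n volcano-system` command instead of the sample string. It only looks at the STATUS column and does not check the READY counts.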
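2. Testing gang scheduling
Test plan: deploy two workloads, each requesting 80 CPU cores (the node has 128 allocatable CPU cores in total).
First, deploy two Queues with the same weight.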
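To see why this plan exercises gang scheduling, it helps to work out what a scheduler that places pods one at a time would do with the same numbers. The sketch below is a simplified model: the greedy loop and the labels `first` and `second` are illustrative assumptions, not how Volcano or kube-scheduler is implemented. It assumes each workload consists of 4 pods of 20 cores each, which adds up to the 80 cores in the plan.

```python
# Figures from the test plan; the 4 x 20 split per workload is an assumption
# that matches the 80-core total requested by each workload.
NODE_ALLOCATABLE_CPU = 128
CPU_PER_POD = 20
PODS_PER_JOB = 4


def greedy_placement(free_cpu, jobs, pods_per_job, cpu_per_pod):
    """Place pods one by one, job by job, while the node still has room."""
    placed = {}
    for job in jobs:
        count = 0
        for _ in range(pods_per_job):
            if free_cpu >= cpu_per_pod:
                free_cpu -= cpu_per_pod
                count += 1
        placed[job] = count
    return placed, free_cpu


placed, leftover = greedy_placement(
    NODE_ALLOCATABLE_CPU, ["first", "second"], PODS_PER_JOB, CPU_PER_POD
)
print(placed, leftover)  # {'first': 4, 'second': 2} 8
```

Under this model the second workload ends up with only 2 of its 4 pods while holding 40 cores it cannot use productively. That partial placement is the situation the test expects gang scheduling to avoid.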
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: test80-1
spec:
  weight: 1
  reclaimable: false
  capability:
    cpu: 80
---
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: test80-2
spec:
  weight: 1
  reclaimable: false
  capability:
    cpu: 80
Then deploy the two workloads, one for each of the Queues above.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job80-1
spec:
  minAvailable: 4
  schedulerName: volcano
  queue: test80-1
  policies:
    - event: PodEvicted
      action: RestartJob
  tasks:
    - replicas: 4
      name: nginx
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - sleep
                - 10m
              image: mirror.longcloud.me:4443/yhcloud/nginx:alpine
              name: nginx
              resources:
                requests:
                  cpu: 20
                limits:
                  cpu: 20
          restartPolicy: Never
---
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job80-2
spec:
  minAvailable: 4
  schedulerName: volcano
  queue: test80-2
  policies:
    - event: PodEvicted
      action: RestartJob
  tasks:
    - replicas: 4
      name: nginx
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - sleep
                - 10m
              image: mirror.longcloud.me:4443/yhcloud/nginx:alpine
              name: nginx
              resources:
                requests:
                  cpu: 20
                limits:
                  cpu: 20
          restartPolicy: Never
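As shown below, resources are insufficient for both jobs: job80-1 takes 80 of the node's 128 cores, leaving only 48 for job80-2, which needs 80. Under the all-or-nothing principle of gang scheduling, only job80-1 is Running, while job80-2 stays in the Pending state.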
[root@master1 home]# kubectl get pod
NAME              READY   STATUS    RESTARTS   AGE
job80-1-nginx-0   1/1     Running   0          8s
job80-1-nginx-1   1/1     Running   0          8s
job80-1-nginx-2   1/1     Running   0          8s
job80-1-nginx-3   1/1     Running   0          8s
[root@master1 home]# kubectl get podgroup
NAME                                           STATUS    MINMEMBER   RUNNINGS   AGE
job80-1-4437c017-0e60-4aa7-bf61-d887ba72ac29   Running   4           4          8m7s
job80-2-06573f39-04ab-40d3-97c1-927fa267516e   Pending   4                      8m3s
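The all-or-nothing decision observed above can be sketched in a few lines. This is a deliberately minimal model and not Volcano's actual scheduler: the names `GangJob` and `gang_schedule` are hypothetical, and the model considers a single node and CPU only, ignoring queues, priorities, and preemption. A job is admitted only if all `minAvailable` pods fit at once; otherwise none of its pods are placed.

```python
from dataclasses import dataclass


@dataclass
class GangJob:
    name: str
    min_available: int  # corresponds to spec.minAvailable in the Job manifest
    cpu_per_pod: int    # corresponds to resources.requests.cpu of each pod


def gang_schedule(free_cpu, jobs):
    """Admit each job only if its whole gang fits; otherwise leave it Pending."""
    status = {}
    for job in jobs:
        needed = job.min_available * job.cpu_per_pod
        if needed <= free_cpu:
            free_cpu -= needed            # reserve resources for the whole gang
            status[job.name] = "Running"
        else:
            status[job.name] = "Pending"  # no partial placement
    return status


jobs = [GangJob("job80-1", 4, 20), GangJob("job80-2", 4, 20)]
print(gang_schedule(128, jobs))  # {'job80-1': 'Running', 'job80-2': 'Pending'}
```

With 128 allocatable cores the model reproduces the PodGroup states shown above: job80-1 reserves 80 cores, and the remaining 48 are not enough for the 80 that job80-2 needs, so job80-2 gets nothing.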