Submitting a RayJob with KubeRay
0. Background
This walkthrough uses kuberay-operator version 0.4.0.
1. Problem
Submit the job:
kubectl apply -f ray_v1alpha1_rayjob.yaml
The operator log reports the following error:
2023-05-10T07:43:56.288Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "rayjob-sample-raycluster-8l688"}
2023-05-10T07:43:56.304Z ERROR controllers.RayCluster Pod Service create error! {"Pod.Service.Error": "Service \"rayjob-sample-raycluster-8l688-head-svc\" is invalid: spec.type: Unsupported value: \"headService\": supported values: \"ClusterIP\", \"ExternalName\", \"LoadBalancer\", \"NodePort\"", "error": "Service \"rayjob-sample-raycluster-8l688-head-svc\" is invalid: spec.type: Unsupported value: \"headService\": supported values: \"ClusterIP\", \"ExternalName\", \"LoadBalancer\", \"NodePort\""}
github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayClusterReconciler).reconcileServices
/workspace/controllers/ray/raycluster_controller.go:331
github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayClusterReconciler).rayClusterReconcile
/workspace/controllers/ray/raycluster_controller.go:203
github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayClusterReconciler).Reconcile
/workspace/controllers/ray/raycluster_controller.go:102
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2023-05-10T07:43:56.310Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "rayjob-sample-raycluster-8l688", "namespace": "default", "error": "Service \"rayjob-sample-raycluster-8l688-head-svc\" is invalid: spec.type: Unsupported value: \"headService\": supported values: \"ClusterIP\", \"ExternalName\", \"LoadBalancer\", \"NodePort\""}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
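2. Analysis
The job's YAML does not specify serviceType. The log shows the head Service being created with spec.type "headService", which Kubernetes rejects as an unsupported Service type.
3. Solution
Specify serviceType explicitly. According to the error message, the supported values are:
ClusterIP, ExternalName, LoadBalancer, NodePort
In the job's YAML, use ClusterIP: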
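The modified manifest itself is not reproduced in this post, so the fragment below is only a sketch of where the field would go. It assumes the layout of the KubeRay v1alpha1 RayJob sample, where serviceType sits under rayClusterSpec.headGroupSpec; the resource name is taken from the kubectl output, and every other field is left as it is in your own manifest.

```yaml
# Sketch only: the original manifest is not shown in the post.
# Field layout is assumed from the KubeRay v1alpha1 RayJob sample.
apiVersion: ray.io/v1alpha1
kind: RayJob
metadata:
  name: rayjob-sample
spec:
  # entrypoint and other job-level fields stay as in your manifest
  rayClusterSpec:
    headGroupSpec:
      # Must be one of: ClusterIP, ExternalName, LoadBalancer, NodePort
      serviceType: ClusterIP
      # replicas, rayStartParams and template stay as in your manifest
```

ClusterIP keeps the head Service reachable only from inside the cluster; NodePort or LoadBalancer may be a better fit if the Ray dashboard or client port needs to be reached from outside.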

Resubmitting the job then succeeds:
root@DESKTOP-3813A3M:/mnt/d/all/app/Ray/rayjob# kubectl apply -f ray_v1alpha1_rayjob.yaml
rayjob.ray.io/rayjob-sample created
configmap/ray-job-code-sample created