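When I wrote this post I had not yet realized that I had installed the wrong Kubeflow version. The Kubeflow release I installed was not compatible with my Kubernetes version, and that is why this problem showed up. I only figured it out later, after falling into a much bigger pit.
So if you run into this problem, first check whether you installed the wrong Kubeflow version!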
------------------------------------------------------------
-------------------- Original post below --------------------
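After getting Kubeflow deployed I wanted to spin up a Jupyter notebook to play with, but several problems came up while creating it. This post focuses on how I handled the "KFServing webhook not found" problem; the other problems are mentioned at the end of the post for the record.
When I created a new Jupyter notebook task, this happened: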
Error message: Reissued from statefulset/default: create Pod default-0 in StatefulSet default failed error: Internal error occurred: failed calling webhook "inferenceservice.kfserving-webhook-server.pod.mutator": Post https://kfserving-webhook-server-service.kubeflow.svc:443/mutate-pods?timeout=30s: service "kfserving-webhook-service" not found
My setup:
Kubernetes 1.9
Kubeflow 1.0.2
Docker 1.13
CentOS 7
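Since the root cause turned out to be a version mismatch, it helps to read these versions off the machine rather than from memory. A minimal sketch, assuming kubectl and docker are on the PATH; the function name show_versions is made up for this example, and the Kubeflow version is not covered because it depends on how Kubeflow was installed:

```shell
# Sketch: print the versions relevant to Kubeflow/Kubernetes compatibility.
# show_versions is a hypothetical helper name, not part of any tool.
show_versions() {
  # Client and API server versions of Kubernetes
  kubectl version
  # Version of the Docker daemon
  docker version --format '{{.Server.Version}}'
  # OS release string; this file only exists on CentOS hosts
  if [ -r /etc/centos-release ]; then
    cat /etc/centos-release
  fi
}

# Usage: show_versions
```

The Kubernetes server version is the one to compare against the versions supported by the Kubeflow release being installed.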
First, on the master node, check whether the KFServing-related components are healthy:
[root@k8snode01 ~]# kubectl -n kubeflow get statefulset kfserving-controller-manager
NAME                           READY   AGE
kfserving-controller-manager   1/1     14d
[root@k8snode01 ~]# kubectl -n kubeflow get pod kfserving-controller-manager-0
NAME                             READY   STATUS    RESTARTS   AGE
kfserving-controller-manager-0   2/2     Running   6          2d7h
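The two commands above only look at the controller. Because the error complains about a Service that cannot be found, one more check that may be worth running is whether that Service exists and which Service the webhook configuration points at. This is a sketch and not part of the original troubleshooting; the function name check_kfserving_webhook is hypothetical, and the configuration name is the one used in the patch command in this post:

```shell
# Sketch: compare the Services that exist with the Service each webhook references.
# check_kfserving_webhook is a hypothetical helper name.
check_kfserving_webhook() {
  # Services in the kubeflow namespace whose name contains "kfserving"
  kubectl -n kubeflow get svc | grep kfserving
  # For every webhook entry, print "<webhook name> -> <service name>"
  kubectl get mutatingwebhookconfiguration inferenceservice.serving.kubeflow.org \
    -o jsonpath='{range .webhooks[*]}{.name}{" -> "}{.clientConfig.service.name}{"\n"}{end}'
}

# Usage: check_kfserving_webhook
```

If a webhook entry references a Service name that does not appear in the first listing, the webhook configuration is probably left over from a different KFServing version.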
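OK, both are healthy, so it is probably a configuration problem. After some searching on Baidu (🙂 that is how much of a newbie I am), I found that this is a Kubernetes version pitfall. Users on Kubernetes 1.15 or later need to run the command below, which adds an objectSelector to the pod mutator webhook so that it only applies to pods carrying the serving.kubeflow.org/inferenceservice label: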
[root@k8snode01 ~]# kubectl patch mutatingwebhookconfiguration inferenceservice.serving.kubeflow.org --patch '{"webhooks":[{"name": "inferenceservice.kfserving-webhook-server.pod-mutator","objectSelector":{"matchExpressions":[{"key":"serving.kubeflow.org/inferenceservice", "operator": "Exists"}]}}]}'
mutatingwebhookconfiguration.admissionregistration.k8s.io/inferenceservice.serving.kubeflow.org patched
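To confirm that the patch really landed, the objectSelector can be read back from the configuration. A small sketch, with show_object_selector as a made-up helper name; the exact formatting of the output varies between kubectl versions, so no expected output is shown:

```shell
# Sketch: print the objectSelector of every webhook entry in the configuration.
# show_object_selector is a hypothetical helper name.
show_object_selector() {
  kubectl get mutatingwebhookconfiguration inferenceservice.serving.kubeflow.org \
    -o jsonpath='{.webhooks[*].objectSelector}'
}

# Usage: show_object_selector
```

The output should mention the serving.kubeflow.org/inferenceservice key with the Exists operator; empty output would suggest the selector was not applied.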
After that, my Jupyter task was created successfully.
Everything works.
kubeflow/kfserving also offers another fix, which is to delete the stale webhook. See https://github.com/kubeflow/kfserving/blob/master/docs/DEVELOPER_GUIDE.md#troubleshooting
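For orientation only, deleting the stale webhook could look roughly like the sketch below. It is an assumption-heavy illustration, not a copy of the guide: the configuration name, namespace, and pod name are the ones that appear earlier in this post, reset_kfserving_webhook is a made-up helper name, and whether the controller recreates the configuration after a restart depends on the KFServing version. The linked guide is the authority, so read it before running anything like this:

```shell
# Sketch: remove the stale mutating webhook configuration and restart the controller.
# reset_kfserving_webhook is a hypothetical helper name; names below come from this post.
reset_kfserving_webhook() {
  # Delete the (possibly outdated) mutating webhook configuration
  kubectl delete mutatingwebhookconfiguration inferenceservice.serving.kubeflow.org
  # Delete the controller pod; its StatefulSet creates a fresh one
  kubectl -n kubeflow delete pod kfserving-controller-manager-0
}

# Usage: reset_kfserving_webhook
```

Compared with the patch above, this removes the webhook entry entirely instead of narrowing it, so it is the more drastic of the two options.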