Installing Kubeflow

1. Install Kubeflow

kustomize version: 3.3.0
Kubeflow version: 1.6

1.1 Download the official manifests repository

This guide installs version 1.6.0.

mkdir -p /data1/kubeflow_file
cd /data1/kubeflow_file
wget https://github.com/kubeflow/manifests/archive/refs/tags/v1.6.0.zip
unzip v1.6.0.zip
# Optional: rename the extracted directory (the rest of this guide keeps manifests-1.6.0)
# mv manifests-1.6.0/ manifests

1.2 Download and install kustomize

This guide uses kustomize 3.3.0 (the original tutorial used 3.2.0, but that version was no longer available).

cd /data1/kubeflow_file/
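# Alternative: pipe the install script straight into bash
# curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
curl -o install_kustomize.sh "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"
# Arguments: the version to install and the target directory (here, the current one)
bash install_kustomize.sh 3.3.0 .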

./kustomize version
Version: {Version:kustomize/v3.3.0 GitCommit:7050c6a7b692fdba6e831e63c7b83920ab03ad76 BuildDate:2019-10-24T17:54:30Z GoOs:linux GoArch:amd64}

Copy the binary to /bin so it can be run from anywhere:

cp kustomize /bin/

1.3 Find the images hosted on overseas registries and pull them in advance

List the images that need to be pulled in advance:

cd /data1/kubeflow_file/manifests-1.6.0
kustomize build example |grep 'image: gcr.io'|awk '$2 != "" { print $2}' |sort -u
Pull the images hosted on overseas registries ahead of time.
As in the command below, just prefix the image name with m.daocloud.io/.
docker pull m.daocloud.io/gcr.io/ml-pipeline/frontend:2.0.0-alpha.3
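Pulling the images one by one gets tedious, so here is a minimal sketch that feeds the list from the previous command into docker pull through the mirror. The MIRROR variable, the DRY_RUN switch, and the helper names mirror_ref and pull_via_mirror are my own illustrative names, not part of any tool, and I am assuming every listed image is actually served by the mirror.

```shell
#!/usr/bin/env bash
# Sketch only: MIRROR, DRY_RUN and both helper names are illustrative assumptions.
MIRROR="m.daocloud.io"

# Map an upstream image reference to its mirrored equivalent.
mirror_ref() {
  printf '%s/%s\n' "$MIRROR" "$1"
}

# Read image references from stdin, one per line, and pull each via the mirror.
# With DRY_RUN=1 the docker commands are printed instead of executed.
pull_via_mirror() {
  local ref
  while IFS= read -r ref; do
    [ -n "$ref" ] || continue
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker pull $(mirror_ref "$ref")"
    else
      docker pull "$(mirror_ref "$ref")"
    fi
  done
}

# Usage (from /data1/kubeflow_file/manifests-1.6.0):
#   kustomize build example | grep 'image: gcr.io' | awk '$2 != "" { print $2 }' | sort -u | pull_via_mirror
```

Running it once with DRY_RUN=1 is a cheap way to check the generated commands before anything is downloaded.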

This is so convenient thanks to the public-image-mirror project. When you drink the water, remember who dug the well, so many thanks to:
DaoCloud/public-image-mirror

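1.4 Prepare the StorageClass, PVs, and PVCs

Prepare storage for the Kubeflow components.
First, create the local directories:

mkdir -p /data1/k8s/istio-authservice /data1/k8s/katib-mysql /data1/k8s/minio /data1/k8s/mysql-pv-claim
Loosen the permissions on the authservice directory:
chmod -R 777 /data1/k8s/istio-authservice/

Write kubeflow-storage.yaml. Each hostPath in it must match one of the local directories created above.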

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: authservice
  namespace: istio-system
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data1/k8s/istio-authservice"

---
apiVersion: v1
kind: PersistentVolume
metadata:
  namespace: kubeflow
  name: katib-mysql
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data1/k8s/katib-mysql"

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio
  namespace: kubeflow
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data1/k8s/minio"

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv-claim
  namespace: kubeflow
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data1/k8s/mysql-pv-claim"
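Because every hostPath has to exist as a directory, one way to keep the two from drifting apart is to derive the directories from the manifest itself. The sketch below assumes the manifest writes each path on its own `path:` line, as in the file above. The helper names host_paths and ensure_host_paths, and the optional root prefix used for trial runs, are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the optional root prefix are illustrative assumptions.

# Print every hostPath directory declared in a PV manifest, one per line.
host_paths() {
  awk '$1 == "path:" { gsub(/"/, "", $2); print $2 }' "$1"
}

# Create each declared directory, optionally under a root prefix
# (useful for a trial run in a scratch directory).
ensure_host_paths() {
  local manifest="$1" root="${2:-}" dir
  host_paths "$manifest" | while IFS= read -r dir; do
    mkdir -p "$root$dir"
  done
}

# Usage:
#   ensure_host_paths kubeflow-storage.yaml
```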

Apply it:

kubectl apply -f kubeflow-storage.yaml

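Change where the manifests pull images from
Add an images section to kustomization.yaml so that, at apply time, the images that cannot be pulled from overseas registries are replaced with the ones already downloaded.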
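To give an idea of what those entries can look like, here is a rough sketch that turns a list of upstream image references into entries for the kustomize images field (name, newName, newTag, digest). I am assuming the file being edited is example/kustomization.yaml and that the replacement images are the m.daocloud.io-prefixed ones pulled earlier. The helper name images_entries and the MIRROR variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: MIRROR and images_entries are illustrative assumptions.
MIRROR="m.daocloud.io"

# Read upstream image references from stdin and print one kustomize
# `images:` entry per reference, pointing at the mirrored name.
images_entries() {
  local ref name
  while IFS= read -r ref; do
    [ -n "$ref" ] || continue
    case "$ref" in
      *@*)  # digest-pinned reference, e.g. repo@sha256:...
        name=${ref%%@*}
        printf '%s\n' "- name: $name" "  newName: $MIRROR/$name" "  digest: \"${ref#*@}\""
        ;;
      *:*)  # tagged reference, e.g. repo:tag (tag quoted so YAML keeps it a string)
        name=${ref%:*}
        printf '%s\n' "- name: $name" "  newName: $MIRROR/$name" "  newTag: \"${ref##*:}\""
        ;;
      *)    # neither tag nor digest: only the name is rewritten
        printf '%s\n' "- name: $ref" "  newName: $MIRROR/$ref"
        ;;
    esac
  done
}

# Usage: print a block to paste into kustomization.yaml
#   echo "images:"; kustomize build example | grep 'image: gcr.io' | awk '$2 != "" { print $2 }' | sort -u | images_entries
```

The output is meant to be reviewed and pasted by hand rather than appended blindly, since the file may already contain an images section.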

Edit the following YAML files, adding the storage class name to each one: storageClassName: local-storage

apps/katib/upstream/components/mysql/pvc.yaml
apps/pipeline/upstream/third-party/minio/base/minio-pvc.yaml
apps/pipeline/upstream/third-party/mysql/base/mysql-pv-claim.yaml
common/oidc-authservice/base/pvc.yaml
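Editing four files by hand works, but the change can also be scripted. The sketch below assumes each file holds a single PVC with a top-level `spec:` line, which I have not verified for every file. The helper name add_storage_class is hypothetical. It skips files that already set a storage class, so it is safe to run twice.

```shell
#!/usr/bin/env bash
# Sketch only: add_storage_class is an illustrative helper, and it assumes
# one PVC per file with a top-level `spec:` line.

# Insert `storageClassName: <class>` right after the top-level `spec:` line,
# unless the file already sets a storage class.
add_storage_class() {
  local file="$1" class="${2:-local-storage}"
  if grep -q 'storageClassName:' "$file"; then
    return 0
  fi
  awk -v cls="$class" '
    { print }
    /^spec:/ { print "  storageClassName: " cls }
  ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (from /data1/kubeflow_file/manifests-1.6.0):
#   for f in apps/katib/upstream/components/mysql/pvc.yaml \
#            apps/pipeline/upstream/third-party/minio/base/minio-pvc.yaml \
#            apps/pipeline/upstream/third-party/mysql/base/mysql-pv-claim.yaml \
#            common/oidc-authservice/base/pvc.yaml; do
#     add_storage_class "$f"
#   done
```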

One-command install

cd /data1/kubeflow_file/manifests-1.6.0
# while ! kubectl kustomize example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
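The loop above retries forever, which is awkward when something is permanently broken. A bounded variant could look like the sketch below. The retry and apply_example helpers are names I made up, and the attempt count and delay in the usage line are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch only: retry and apply_example are illustrative helpers.

# Run a command until it succeeds, giving up after max_attempts tries.
# Arguments: max_attempts, delay in seconds, then the command to run.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Retrying to apply resources ($attempt/$max_attempts)"
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Same pipeline as the loop above, wrapped so it can be passed to retry.
apply_example() {
  kustomize build example | kubectl apply -f -
}

# Usage (from /data1/kubeflow_file/manifests-1.6.0):
#   retry 30 10 apply_example
```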

The error below appeared. I tried a few connectivity checks, and after a restart it went away. For background, see:
The Definitive Debugging Guide for the cert-manager Webhook Pod

Error from server (InternalError): error when creating "STDIN":
  Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook:
    Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s":
      dial tcp 10.96.20.99:443: connect: connection refused

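After waiting about half an hour, check with kubectl get pods --all-namespaces. Some pods will probably still show image pull errors; pull those images with the method above. For any other status, run kubectl describe pod pod_name -n namespace to see the specific error.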
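With this many pods the full listing is long, so a small filter helps. The sketch below assumes the default column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...) and simply prints pods whose STATUS is neither Running nor Completed. The helper name unhealthy_pods is hypothetical. Note that it does not catch pods that are Running but not yet ready.

```shell
#!/usr/bin/env bash
# Sketch only: unhealthy_pods is an illustrative helper that relies on the
# default column order NAMESPACE NAME READY STATUS ...

# Read `kubectl get pods --all-namespaces` output on stdin and print
# namespace, pod name and status for pods that are neither Running nor Completed.
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Usage:
#   kubectl get pods --all-namespaces | unhealthy_pods
```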

Access the Kubeflow Dashboard

kubectl port-forward --address 0.0.0.0 svc/istio-ingressgateway -n istio-system 8080:80
# --address 0.0.0.0 allows access from other hosts; without it, only local access works
# port-forward forwards local port 8080 to port 80 of the service svc/istio-ingressgateway

Only HTTP access works; HTTPS has problems.
Default username and password:

user@example.com 
12341234

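References

Building the Kubeflow machine learning platform from scratch
Playing with Kubeflow (a new series): installing version 1.3 from China-based image mirrors, with an introduction to the Kubeflow components
A first look at Kubeflow (part 2): installing Kubeflow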
