JupyterHub on Kubernetes部署

最新推荐文章于 2024-05-20 09:32:50 发布

Running-Waiting

最新推荐文章于 2024-05-20 09:32:50 发布

阅读量2.5k

点赞数 5

分类专栏：容器虚拟化文章标签： docker kubernetes linux Jupyter github

本文链接：https://blog.csdn.net/qq_41999455/article/details/115193288

版权

容器虚拟化专栏收录该内容

10 篇文章 2 订阅

订阅专栏

理论是灰色的，实践之树长青🌲 ——恩格斯

近日在做毕设项目，涉及到在K8s和swarm基础上部署JupyterHub，经过两天时间的学习和部署，N次的失败尝试，最终在服务器上成功部署了JupyterHub！

实验依赖

阿里云服务器2核4G - ubuntu18.04 (服务器至少2核）
Docker v20.10.5
K8s v1.20.5
Helm v3.5.2

step1:安装Docker

curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun

step2:安装k8s

因为我们需要在k8s上安装Jupyterhub，所以需要安装k8s，k8s有多种安装方式，在这里我们使用kubeadm进行安装。

1、镜像替换更新

由于众所周知强的问题，所以这里需要更新一下下载的镜像，使用如下命令：

apt-get update && apt-get install -y apt-transport-https
curl -s https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update

2、安装 kubeadm, kubelet and kubectl

root@host1:~# apt-get install -y kubelet kubeadm kubectl
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  conntrack cri-tools kubernetes-cni socat
The following NEW packages will be installed:
  conntrack cri-tools kubeadm kubectl kubelet kubernetes-cni socat
0 upgraded, 7 newly installed, 0 to remove and 88 not upgraded.
Need to get 68.7 MB of archives.
After this operation, 293 MB of additional disk space will be used.
Get:1 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 cri-tools amd64 1.13.0-01 [8,775 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic/main amd64 conntrack amd64 1:1.4.4+snapshot20161117-6ubuntu2 [30.6 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic/main amd64 socat amd64 1.7.3.2-2ubuntu2 [342 kB]
Get:4 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubernetes-cni amd64 0.8.7-00 [25.0 MB]
Get:5 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubelet amd64 1.20.5-00 [18.9 MB]
Get:6 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubectl amd64 1.20.5-00 [7,945 kB]
Get:7 https://mirrors.aliyun.com/kubernetes/apt kubernetes-xenial/main amd64 kubeadm amd64 1.20.5-00 [7,709 kB]
Fetched 68.7 MB in 51s (1,360 kB/s)
Selecting previously unselected package conntrack.
(Reading database ... 111595 files and directories currently installed.)
Preparing to unpack .../0-conntrack_1%3a1.4.4+snapshot20161117-6ubuntu2_amd64.deb ...
Unpacking conntrack (1:1.4.4+snapshot20161117-6ubuntu2) ...
Selecting previously unselected package cri-tools.
Preparing to unpack .../1-cri-tools_1.13.0-01_amd64.deb ...
Unpacking cri-tools (1.13.0-01) ...
Selecting previously unselected package kubernetes-cni.
Preparing to unpack .../2-kubernetes-cni_0.8.7-00_amd64.deb ...
Unpacking kubernetes-cni (0.8.7-00) ...
Selecting previously unselected package socat.
Preparing to unpack .../3-socat_1.7.3.2-2ubuntu2_amd64.deb ...
Unpacking socat (1.7.3.2-2ubuntu2) ...
Selecting previously unselected package kubelet.
Preparing to unpack .../4-kubelet_1.20.5-00_amd64.deb ...
Unpacking kubelet (1.20.5-00) ...
Selecting previously unselected package kubectl.
Preparing to unpack .../5-kubectl_1.20.5-00_amd64.deb ...
Unpacking kubectl (1.20.5-00) ...
Selecting previously unselected package kubeadm.
Preparing to unpack .../6-kubeadm_1.20.5-00_amd64.deb ...
Unpacking kubeadm (1.20.5-00) ...
Setting up conntrack (1:1.4.4+snapshot20161117-6ubuntu2) ...
Setting up kubernetes-cni (0.8.7-00) ...
Setting up cri-tools (1.13.0-01) ...
Setting up socat (1.7.3.2-2ubuntu2) ...
Setting up kubelet (1.20.5-00) ...
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /lib/systemd/system/kubelet.service.
Setting up kubectl (1.20.5-00) ...
Setting up kubeadm (1.20.5-00) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...

3、使用 kubeadm 创建 Kubernetes 集群

# 确保关闭交换空间
$ sudo swapoff -a

# 获取最新 Kubernetes 版本号
$ KUBERNETES_RELEASE_VERSION="$(curl -sSL https://dl.k8s.io/release/stable.txt)"

# 可以用下面的命令列出 kubeadm 需要的 images
$ kubeadm config images list --kubernetes-version=${KUBERNETES_RELEASE_VERSION}
# 所需版本如下（每个人情况可能不同，以自己的结果为主）
root@host1:~# kubeadm config images list --kubernetes-version=${KUBERNETES_RELEASE_VERSION}
k8s.gcr.io/kube-apiserver:v1.20.5
k8s.gcr.io/kube-controller-manager:v1.20.5
k8s.gcr.io/kube-scheduler:v1.20.5
k8s.gcr.io/kube-proxy:v1.20.5
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns:1.7.0

由于墙的问题，我们不能将这些image拉下来，所以在这里我们采用脚本的方式将其下载；
编写脚本k8sInstall.py下载所需镜像,注意，脚本中的版本号要和kubeadm config images list命令显示的版本号一致，脚本文件如下：

// k8sInstall.py 
#! /usr/bin/python3

import os

images=[
    "kube-apiserver:v1.20.5",
    "kube-controller-manager:v1.20.5",
    "kube-scheduler:v1.20.5",
    "kube-proxy:v1.20.5",
    "pause:3.2",
    "etcd:3.4.13-0",
    "coredns:1.7.0",
]

for i in images:
    pullCMD = "docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/{}".format(i)
    print("run cmd '{}', please wait ...".format(pullCMD))
    os.system(pullCMD)

    tagCMD = "docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/{} k8s.gcr.io/{}".format(i, i)
    print("run cmd '{}', please wait ...".format(tagCMD ))
    os.system(tagCMD)

    rmiCMD = "docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/{}".format(i)
    print("run cmd '{}', please wait ...".format(rmiCMD ))
    os.system(rmiCMD)
    
  # 执行脚本文件，下载镜像
root@host1:~# vim k8sInstall.py
root@host1:~# chmod 775 k8sInstall.py
root@host1:~# sudo ./k8sInstall.py
run cmd 'docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5', please wait ...
v1.20.5: Pulling from google_containers/kube-apiserver
fefd475334af: Pull complete
742efefc8a44: Pull complete
98d681774b17: Pull complete
Digest: sha256:e2ca052cbde8532d4c2e613a7fbb3145a5ecbea1bd47ac243877815ed80abb5a
Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5
registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5
run cmd 'docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5 k8s.gcr.io/kube-apiserver:v1.20.5', please wait ...
run cmd 'docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5', please wait ...
Untagged: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.20.5
Untagged: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver@sha256:e2ca052cbde8532d4c2e613a7fbb3145a5ecbea1bd47ac243877815ed80abb5a
run cmd 'docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.20.5', please wait ...
v1.20.5: Pulling from google_containers/kube-controller-manager
fefd475334af: Already exists
742efefc8a44: Already exists
454a7944c47b: Pull complete
Digest: sha256:ddba11d3205b4fde9b8621b79bbb942f1d0797ffcc9dfb72d197b0c84943e369
Status: Downloaded newer image for registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.20.5
......

4、集群初始化

服务器必须是2核及以上，否则在这一步会出现问题！！！

root@host1:~# kubeadm init  --ignore-preflight-errors=SystemVerification
[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [izwz9hwh629hd38je8xvscz kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.21.24.197]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [izwz9hwh629hd38je8xvscz localhost] and IPs [172.21.24.197 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [izwz9hwh629hd38je8xvscz localhost] and IPs [172.21.24.197 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 85.005388 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node izwz9hwh629hd38je8xvscz as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node izwz9hwh629hd38je8xvscz as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 33wev0.vwvmk139exv3ohx3
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.21.24.197:6443 --token 33wev0.vwvmk139exv3ohx3 \
    --discovery-token-ca-cert-hash sha256:f25ef958e127c21168c661355c7df7280c71dc39ce277de9dabd3c1fb9cd9579

5、使用Flannel网络

这里需要使用到kube-flannel.yml文件，但由于墙的问题https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml 该网址下载不了；
可以使用 https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml下载或复制到服务器；

root@host1:~# kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

6、主节点设置为调度节点（多服务器可忽略）

由于博主这里是单节点集群，所以需要将master节点也设置为调度节点

$ kubectl taint nodes --all node-role.kubernetes.io/master-

step3:Helm安装

$ curl -s https://get.helm.sh/helm-v3.5.2-linux-amd64.tar.gz | tar xzv
$ sudo cp linux-amd64/helm /usr/local/bin
$ rm -rf linux-amd64

添加使用 azure.cn 提供的 charts 镜像

$ helm repo add stable https://mirror.azure.cn/kubernetes/charts/
$ helm repo add incubator https://mirror.azure.cn/kubernetes/charts-incubator/

# 更新本地 charts repo
$ helm repo update

# 测试安装 redis chart
$ helm install my-redis stable/redis

# 删除 redis
$ helm uninstall my-redis

到这里我们安装Jupyter的前置工作已经完成了，接下来就需要我们去安装Jupyter！

step4:安装JupyterHub

现在我们已经有了Kubernetes集群，我们可以使用Helm通过 Helm chart安装JupyterHub和相关的Kubernetes资源。

1、相关配置

# 1、生成随机字符串作安全标识：
$ openssl rand -hex 32

# 2、创建一个名为config.yaml的文件
 touch config.yaml

# 3、使用命令vi config.yaml添加以下内容：
proxy:
  secretToken: "<RANDOM_HEX>"
  
  其中<RANDOM_HEX>表示第一步中随机生成的字符串
  注意：secretToken: "<RANDOM_HEX>"中冒号后有一个空格，缺少空格则后续操作无法进行。

# 4、wq保存并退出

2、安装JupyterHub

helm添加软件仓库并更新仓库：

$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update
# 安装完成
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "sqlfiddle" chart repository
...Successfully got an update from the "jupyterhub" chart repository
...Successfully got an update from the "emberstack" chart repository
Update Complete. ⎈ Happy Helming!⎈

在config.yaml所在目录下执行下列命令安装chart：

RELEASE=jhub
NAMESPACE=jhub

helm upgrade --cleanup-on-fail \
  --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.10.6 \
  --values config.yaml

⚠️在helm3中我们需要指定release，如果没有就使用 --generat-name=

在执行步骤2的过程中，您可以通过输入其他终端来查看正在创建的Pod：

root@host1:~# kubectl get pod --namespace jhub
NAME                              READY   STATUS              RESTARTS   AGE
continuous-image-puller-mp7mq     1/1     Running             0          25s
hub-55dcd585c-6dpwr               0/1     ContainerCreating   0          25s
proxy-7645dd5d94-fvgsm            1/1     Running             0          25s
user-scheduler-7db7bfbdc6-k76xj   0/1     ErrImagePull        0          25s
user-scheduler-7db7bfbdc6-qg64l   0/1     ContainerCreating   0          25s

等待hub和proxy都进入running状态

root@host1:~# kubectl get pod --namespace jhub
NAME                              READY   STATUS             RESTARTS   AGE
continuous-image-puller-mp7mq     1/1     Running            0          2m22s
hub-55dcd585c-6dpwr               1/1     Running            0          2m22s
proxy-7645dd5d94-fvgsm            1/1     Running            0          2m22s

使用 kubectl get service --namespace jhub命令找到proxy-public对应的端口号

root@host1:~# kubectl get service --namespace jhub
NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
hub            ClusterIP   10.105.101.125   <none>        8081/TCP       14h
proxy-api      ClusterIP   10.100.241.193   <none>        8001/TCP       14h
proxy-public   NodePort    10.101.104.188   <none>        80:31420/TCP   14h

打开浏览器，地址栏输入主机ip:端口号访问并使用jupyterhub。

⚠️这里如果有朋友使用的是服务器，一定要将服务器的安全组修改，入方向的端口和IP修改一下！！！切记

在这里插入图片描述

3、⚠️问题

1、proxy-public如果处于pending状态，是因为svc暴露给外网的方式为LoadBalancer，又没有配转发，所以获取不到ip；修改svc中的type为NodePort，通过使用本机ip+端口的方式暴露给外网。使用如下命令：

kubectl -n jhub get svc
kubectl -n jhub edit svc proxy-public
# 找到type字段将LoadBalancer修改为NodePort

2、hub的pod处于pending状态，kubectl describe pod + pod名，如果报错“pod has unbound immediate PersistentVolumeClaims”，则修改helm安装jupyterhub时的config.yaml文件，添加以下内容：（这个问题困扰了我很久，感谢博友的解答）

hub:
  db:
    type: sqlite-memory
singleuser:
  storage:
    type: none

然后重新执行以下命令：

RELEASE=jhub
NAMESPACE=jhub

helm upgrade --cleanup-on-fail \
  --install $RELEASE jupyterhub/jupyterhub \
  --namespace $NAMESPACE \
  --create-namespace \
  --version=0.10.6 \
  --values config.yaml

总结

在整个安装过程中遇到了各类bug，好在博友们都很强大，在github和博客上有很多解决方案，大家可以去参考！
在这里记录一下对我有重要帮助的博友链接：
https://github.com/maguowei/gotok8s
https://zero-to-jupyterhub.readthedocs.io/en/stable/jupyterhub/installation.html
https://blog.csdn.net/u012174752/article/details/89173916
https://www.jianshu.com/p/491a562127fa