Installing Docker
The Ubuntu commands below are copied from someone else:
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates \
    curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io
CentOS is simpler:
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
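When the installation finishes, the installer prints its summary: [screenshot: Docker installer output]
Docker has to be started once at this point so that the /etc/docker directory gets created. Docker's default cgroup driver is cgroupfs, and it needs to be changed to systemd, so do the following: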
[root@VM-16-10-centos system]# cd /etc/docker
[root@VM-16-10-centos docker]# ls
key.json
[root@VM-16-10-centos docker]# vi daemon.json
[root@VM-16-10-centos docker]# cat daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
[root@VM-16-10-centos docker]# systemctl restart docker
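The vi step above can also be scripted. The following is a minimal sketch, not the author's method: write_daemon_json is a hypothetical helper name, and it overwrites any existing daemon.json in the target directory, so it only suits a fresh install like the one shown here.

```shell
# write_daemon_json DIR  (hypothetical helper, for illustration)
# Writes a daemon.json that selects the systemd cgroup driver into DIR.
# The directory is a parameter so the function can be tried on a scratch
# directory first; on a real host DIR would be /etc/docker.
# Note: an existing daemon.json in DIR is overwritten.
write_daemon_json() (
    conf_dir="$1"
    mkdir -p "$conf_dir"
    cat > "$conf_dir/daemon.json" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
)
```

On the real machine this would be called as write_daemon_json /etc/docker, followed by systemctl restart docker as in the transcript above.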
Preparation
First set up the system environment:
[root@VM-16-10-centos ~]# swapoff -a
[root@VM-16-10-centos ~]# setenforce 0
setenforce: SELinux is disabled
[root@VM-16-10-centos ~]# service firewalld status
Redirecting to /bin/systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
If firewalld is running, stop it:
[root@VM-16-14-centos docker]# service firewalld stop
Redirecting to /bin/systemctl stop firewalld.service
Next, configure the yum repositories needed to install K8S:
[root@VM-16-10-centos ~]# curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-8.repo
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2590 100 2590 0 0 109k 0 --:--:-- --:--:-- --:--:-- 105k
[root@VM-16-10-centos ~]#
Edit the file /etc/yum.repos.d/kubernetes.repo and add the following:
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
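If a network failure occurs, it looks like this: [screenshot: yum network error]
There are two ways this can fail: either the version is wrong (in the screenshot above, el7 was written as el8), or Aliyun really is unreachable, in which case you can switch to the Tencent Cloud yum mirror: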
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com//repo/Centos-8.repo
As for /etc/yum.repos.d/kubernetes.repo, the mirror can be swapped in one go with substitutions in vim:
:1,$s/mirrors\.aliyun\.com/mirrors\.cloud\.tencent\.com/g
:1,$s/https/http/g
And to switch back to Aliyun:
:1,$s/mirrors\.cloud\.tencent\.com/mirrors\.aliyun\.com/g
:1,$s/http:/https:/g
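The same mirror swap can be done without opening vim. Below is a rough sed-based sketch; switch_mirror is a hypothetical helper name, and the file path is passed in so it can be tested on a copy first. It rewrites the scheme and the host together, which avoids turning an existing https into httpss when the substitution is run twice.

```shell
# switch_mirror FILE FROM_HOST TO_HOST TO_SCHEME  (hypothetical helper)
# Rewrites every http:// or https:// URL on FROM_HOST in FILE so that it
# points at TO_HOST using TO_SCHEME (http or https).
switch_mirror() (
    file="$1"; from_host="$2"; to_host="$3"; to_scheme="$4"
    # Escape the dots so they match literally in the regular expression.
    from_re="$(printf '%s' "$from_host" | sed 's/\./\\./g')"
    tmp="$(mktemp)"
    sed -E "s#https?://${from_re}#${to_scheme}://${to_host}#g" "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # write back in place, keeping the file's permissions
    rm -f "$tmp"
)
```

For example, switch_mirror /etc/yum.repos.d/kubernetes.repo mirrors.aliyun.com mirrors.cloud.tencent.com http mirrors the two vim commands above, and swapping the host arguments with https reverses it.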
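It worked for me with Aliyun but failed with Tencent Cloud. The Tencent Cloud failure looked like this: [screenshot: Tencent Cloud repo error]
On closer inspection, that file really does not exist in the Tencent Cloud directory: [screenshot: Tencent Cloud mirror directory listing]
I did not work out how to make Tencent Cloud succeed; if anyone knows, please tell me. Finally, rebuild the cache with yum clean all and yum makecache. Several prompts appear along the way; just keep answering y. The end result: [screenshot: yum makecache output]
Next comes installing kubelet, kubeadm and kubectl. Kubernetes 1.24 and later no longer work with Docker out of the box (dockershim was removed), so an older version is needed. Without a version, the command below installs the newest release the repository offers: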
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
To be safe, pin an explicit version below 1.24 instead:
yum install kubelet-1.23.9-0 kubeadm-1.23.9-0 kubectl-1.23.9-0
Finally, enable and start kubelet:
[root@VM-16-10-centos ~]# systemctl enable --now kubelet
Created symlink /etc/systemd/system/multi-user.target.wants/kubelet.service → /usr/lib/systemd/system/kubelet.service.
[root@VM-16-10-centos ~]# systemctl start kubelet
[root@VM-16-10-centos ~]#
Configuring the cluster
Start Docker first:
systemctl start docker.service
systemctl enable docker.service
Then initialize the control plane:
kubeadm init \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16
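If initialization fails, run kubeadm reset before initializing again.
A failure looks like this: [screenshot: kubeadm init failure]
In that case, check the state of kubelet. Here is what I saw: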
[root@VM-16-10-centos ~]# service kubelet status
Redirecting to /bin/systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Sat 2022-09-24 13:01:05 CST; 1s ago
Docs: https://kubernetes.io/docs/
Process: 752823 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 752823 (code=exited, status=1/FAILURE)
So the kubelet service keeps exiting. The next step is to read the kubelet log:
journalctl -xefu kubelet
Near the end there is this entry:
9月 24 13:07:13 VM-16-10-centos kubelet[766488]: E0924 13:07:13.905771 766488 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
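The error means the two cgroup drivers do not match; both sides have to use systemd. Since the message says Docker is the one using cgroupfs, change Docker. Go to its config directory first: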
[root@VM-16-10-centos system]# cd /etc/docker
[root@VM-16-10-centos docker]# ls
key.json
[root@VM-16-10-centos docker]# vi daemon.json
[root@VM-16-10-centos docker]# cat daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
[root@VM-16-10-centos docker]# systemctl restart docker
[root@VM-16-10-centos docker]# systemctl restart kubelet
[root@VM-16-10-centos docker]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sat 2022-09-24 13:32:33 CST; 7s ago
Docs: https://kubernetes.io/docs/
Main PID: 824643 (kubelet)
Tasks: 13 (limit: 23722)
Memory: 35.8M
CGroup: /system.slice/kubelet.service
└─824643 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plu>
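To confirm that both sides now agree, the two drivers can be compared directly. This is only a sketch under assumptions: kubelet_cgroup_driver and drivers_match are hypothetical helper names, and it assumes the kubelet config file (/var/lib/kubelet/config.yaml on this setup) contains a top-level cgroupDriver key. If the key is absent, the first function prints nothing and the comparison fails.

```shell
# kubelet_cgroup_driver FILE  (hypothetical helper)
# Prints the value of the top-level cgroupDriver key in a kubelet config file.
kubelet_cgroup_driver() {
    sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1"
}

# drivers_match DOCKER_DRIVER KUBELET_CONFIG  (hypothetical helper)
# Succeeds only when Docker's driver equals the one in the kubelet config.
drivers_match() {
    [ "$1" = "$(kubelet_cgroup_driver "$2")" ]
}
```

On the node it could be used as: drivers_match "$(docker info --format '{{.CgroupDriver}}')" /var/lib/kubelet/config.yaml && echo "cgroup drivers match"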
That finally worked. Then run kubeadm reset and execute the long initialization command again; it ends with the following output:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.206.16.10:6443 --token h0d5wz.3duev7uaebfkhpv1 \
--discovery-token-ca-cert-hash sha256:e1f145aa0ea1baaa0f306e422ed01339379aaf28d2e64631073545b2057e5ac7
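Joining the cluster
Don't celebrate too early, though: there are two more machines. On those you only need kubelet running; do not run kubeadm init. Instead, run the join command on each of them: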
kubeadm join 10.206.16.10:6443 --token h0d5wz.3duev7uaebfkhpv1 --discovery-token-ca-cert-hash sha256:e1f145aa0ea1baaa0f306e422ed01339379aaf28d2e64631073545b2057e5ac7
On success, the following is printed:
[preflight] Running pre-flight checks
[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
[WARNING FileExisting-tc]: tc not found in system path
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
But a quick test fails:
[root@VM-16-10-centos kubernetes]# kubectl get node
The connection to the server localhost:8080 was refused - did you specify the right host or port?
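For this error I found an article by an expert abroad, titled "The connection to the server localhost:8080 was refused – did you specify the right host or port?". Following it step by step, the commands below on the master node fix the problem: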
[root@VM-16-10-centos kubernetes]# echo $KUBECONFIG
[root@VM-16-10-centos kubernetes]# export KUBECONFIG=/etc/kubernetes/admin.conf
[root@VM-16-10-centos kubernetes]# echo $KUBECONFIG
/etc/kubernetes/admin.conf
[root@VM-16-10-centos kubernetes]# kubectl get node
NAME STATUS ROLES AGE VERSION
vm-16-10-centos NotReady control-plane,master 49m v1.23.9
vm-16-13-centos NotReady <none> 31m v1.23.9
vm-16-14-centos NotReady <none> 11m v1.23.9
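For a non-root user, copy the admin config into that user's home directory instead (create $HOME/.kube first and prefix the commands with sudo, as in the kubeadm output above):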
[root@VM-16-10-centos .kube]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@VM-16-10-centos .kube]# chown $(id -u):$(id -g) $HOME/.kube/config
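The export shown earlier for root only lasts for the current shell session. A small sketch of making it persistent follows; persist_kubeconfig is a hypothetical helper name, and the rc file is a parameter (for root on this setup it would typically be ~/.bashrc, which is an assumption about the login shell).

```shell
# persist_kubeconfig RC_FILE  (hypothetical helper)
# Appends the KUBECONFIG export to RC_FILE unless the exact line is already
# there, so running it repeatedly does not create duplicates.
persist_kubeconfig() (
    rc_file="$1"
    line='export KUBECONFIG=/etc/kubernetes/admin.conf'
    grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
)
```

After persist_kubeconfig ~/.bashrc, new shells pick up the variable automatically.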
The nodes are NotReady, so a network plugin has to be installed. Run this on the master node only:
[root@VM-16-10-centos kubernetes]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
namespace/kube-flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Check the result afterwards:
[root@VM-16-10-centos ~]# kubectl get pods -n kube-flannel
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-4mz4f 1/1 Running 0 66s
kube-flannel-ds-6kpf5 1/1 Running 0 66s
kube-flannel-ds-zvwq5 1/1 Running 0 10s
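In my case that was not enough: the nodes only became Ready after I also deployed a second network plugin, again on the master node only. (A cluster normally needs just one CNI plugin, so treat this as what happened here rather than a required step.)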
[root@VM-16-10-centos kubernetes]# kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created
[root@VM-16-10-centos kubernetes]# kubectl get node
NAME STATUS ROLES AGE VERSION
vm-16-10-centos Ready control-plane,master 61m v1.23.9
vm-16-13-centos Ready <none> 43m v1.23.9
vm-16-14-centos Ready <none> 23m v1.23.9
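Deployment test - building the image
First download the Windows version of the Docker client to build the image; the download page is "Docker Desktop on Windows". The installation looks like this: [screenshot: Docker Desktop installer]
After installing, start it. Once running it looks like this: [screenshot: Docker Desktop running]
Next, write a Dockerfile. Oddly, this file has no extension; its name is simply Dockerfile. Its contents: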
FROM openjdk:8
COPY target/web-demo-1.0-SNAPSHOT.jar app.jar
EXPOSE 8080
ENTRYPOINT java -jar app.jar
Then build the image:
PS C:\Users\86135\IdeaProjects\enterprise\spring-boot\web-demo> docker build -t web-demo:1.0 .
[+] Building 2.9s (7/7) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 31B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/openjdk:8 2.7s
=> [internal] load build context 0.0s
=> => transferring context: 80B 0.0s
=> [1/2] FROM docker.io/library/openjdk:8@sha256:86e863cc57215cfb181bd319736d0baf625fe8f150577f9eb58bd937f5452cb8 0.0s
=> CACHED [2/2] COPY target/web-demo-1.0-SNAPSHOT.jar app.jar 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:e8c11c1a68283d7dc48e09b9fdfb2d539acbaaba86f8c00d30c14cf28bf02ec2 0.0s
=> => naming to docker.io/library/web-demo:1.0 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
List the images to see the one just built, then tag it for the registry:
PS C:\Users\86135\IdeaProjects\enterprise\spring-boot\web-demo> docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo 1.0 e8c11c1a6828 39 minutes ago 526MB
juanqilaixuexi-docker.pkg.coding.net/web-demo/node-base 1.0 e8c11c1a6828 39 minutes ago 526MB
ccr.ccs.tencentyun.com/web-demo/node-base 1.0 e8c11c1a6828 39 minutes ago 526MB
web-demo 1.0 e8c11c1a6828 39 minutes ago 526MB
web-demo/node-base 1.0 e8c11c1a6828 39 minutes ago 526MB
juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo/node-base 1.0 e8c11c1a6828 39 minutes ago 526MB
PS C:\Users\86135\IdeaProjects\enterprise\spring-boot\web-demo> docker tag web-demo:1.0 juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo:1.0
Next, log in to the registry from the command line (I won't paste that part), then push:
PS C:\Users\86135\IdeaProjects\enterprise\spring-boot\web-demo> docker push juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo:1.0
The push refers to repository [juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo]
924f9707f40f: Pushed
6b5aaff44254: Pushed
53a0b163e995: Pushed
b626401ef603: Pushed
9b55156abf26: Pushed
293d5db30c9f: Pushed
03127cdb479b: Pushed
9c742cd6c7a5: Pushed
1.0: digest: sha256:b180e1ad3ace3aea0cb0c73e4bd63b4878f7d2dda4e9c589e8fe3a2668af9b85 size: 2003
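It is worth checking the registry's web page to confirm the upload; mine shows it succeeded: [screenshot: registry page]
Deployment test - deploying to K8S
Now leave Windows, log in to the K8S master machine, and create a namespace first: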
[root@VM-16-10-centos ~]# kubectl create namespace hope
namespace/hope created
[root@VM-16-10-centos ~]# kubectl get namespace
NAME STATUS AGE
default Active 6h19m
hope Active 81m
kube-flannel Active 5h20m
kube-node-lease Active 6h19m
kube-public Active 6h19m
kube-system Active 6h19m
Then write a K8S deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo-deployment
  namespace: hope
  labels:
    app: web-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-demo
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
      - image: juanqilaixuexi-docker.pkg.coding.net/docker-demo/docker/web-demo:1.0
        imagePullPolicy: IfNotPresent
        name: web-demo
        args:
        ports:
        - containerPort: 8080
Save it and apply it:
[root@VM-16-10-centos ~]# vi app.yml
[root@VM-16-10-centos ~]# kubectl apply -f app.yml
deployment.apps/web-demo-deployment created
But the containers stayed stuck in creation:
[root@VM-16-10-centos ~]# kubectl get pods -n hope
NAME READY STATUS RESTARTS AGE
web-demo-deployment-5bfc7dcd4d-6b2m9 0/1 ContainerCreating 0 9m15s
web-demo-deployment-5bfc7dcd4d-6fgl6 0/1 ContainerCreating 0 9m15s
If a pod stays in ContainerCreating for a long time, inspect its events like this:
kubectl describe pod web-demo-deployment-5bfc7dcd4d-xfnnn -n hope
The events showed this warning:
Warning FailedCreatePodSandBox 2m47s (x4 over 2m50s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "110bbaacb7d0826541ad848c145659459a9b5f6e53b89b33374e4678a76e076d" network for pod "web-demo-deployment-5bfc7dcd4d-xfnnn": networkPlugin cni failed to set up pod "web-demo-deployment-5bfc7dcd4d-xfnnn_hope" network: open /run/flannel/subnet.env: no such file or directory
The underlying cause was that the flannel pods themselves were crashing:
[root@VM-16-10-centos ~]# kubectl get pod -n kube-flannel
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-d7gvs 0/1 CrashLoopBackOff 76 (3m34s ago) 6h8m
kube-flannel-ds-lz54w 0/1 CrashLoopBackOff 76 (2m14s ago) 6h8m
kube-flannel-ds-s72bq 0/1 CrashLoopBackOff 76 (2m35s ago) 6h8m
[root@VM-16-10-centos ~]#
Their logs showed:
Error registering network: failed to acquire lease: subnet "10.244.0.0/16" specified in the flannel net config doesn't contain "10.0.1.0/24" PodCIDR of the "vm-16-13-centos" node.
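The flannel error above is a containment check: each node's PodCIDR must fall inside the network flannel is configured with. The sketch below reproduces that check for IPv4 in plain shell arithmetic so the two values can be compared before re-initializing. ip_to_int and cidr_contains are hypothetical helper names, not part of kubeadm or flannel, and input validation is omitted.

```shell
# ip_to_int A.B.C.D  (hypothetical helper)
# Prints the dotted IPv4 address as a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1          # split the address into its four octets
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# cidr_contains OUTER INNER  (hypothetical helper)
# Exit status 0 if the INNER network lies entirely inside the OUTER network.
cidr_contains() (
    outer_ip="${1%/*}"; outer_len="${1#*/}"
    inner_ip="${2%/*}"; inner_len="${2#*/}"
    # An inner network with a shorter prefix is larger than the outer one.
    [ "$inner_len" -ge "$outer_len" ] || exit 1
    mask=$(( (4294967295 << (32 - $outer_len)) & 4294967295 ))
    [ $(( $(ip_to_int "$outer_ip") & $mask )) -eq $(( $(ip_to_int "$inner_ip") & $mask )) ]
)
```

With the values from the log, cidr_contains 10.244.0.0/16 10.0.1.0/24 fails, which is the mismatch flannel reports; a node PodCIDR such as 10.244.1.0/24 passes.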
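It only worked after I ran kubeadm reset and went through the whole setup again (the --pod-network-cidr passed to kubeadm init has to match flannel's 10.244.0.0/16 network). After that, configure the registry credentials: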
kubectl create secret docker-registry default --docker-server=${server} --docker-username=${username} --docker-password=${password} --docker-email="${email}" --namespace=${namespace}
kubectl patch sa default --namespace="hope" -p '{"imagePullSecrets": [{"name": "default"}]}'
Note: if the username or password contains special characters, wrap it in quotes. In the end my deployment succeeded:
[root@VM-16-10-centos ~]# kubectl get pods -nhope
NAME READY STATUS RESTARTS AGE
web-demo-deployment-7869d7f759-vsh45 1/1 Running 0 21s
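On the earlier note about quoting credentials, here is a small demonstration of why it matters. The password is a made-up value for illustration, and count_args is a hypothetical helper that only reports how many arguments it received; nothing here talks to kubectl.

```shell
# count_args prints how many separate arguments it received.
count_args() { echo "$#"; }

# Single quotes keep every character literal, including $, ! and spaces.
password='p@ss $word!'   # made-up password, for illustration only

count_args --docker-password="$password"   # quoted: stays one argument
count_args --docker-password=$password     # unquoted: split at the space
```

The first call prints 1 and the second prints 2, so an unquoted password like this would reach kubectl create secret as two separate arguments.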