1. Deployment goals
- Install Docker and kubeadm on all nodes
- Deploy the Kubernetes master
- Deploy a container network plugin
2. Environment preparation
- Install VirtualBox
- Install the Vagrant tool
- Prepare the box image (ubuntu-16.04 is used here)
3. Create a Vagrantfile with the following contents:
Vagrant.configure("2") do |config|
  (1..3).each do |i|
    config.vm.define "k8s-node#{i}" do |node|
      # Box image for the VM
      node.vm.box = "bento/ubuntu-16.04"
      # Hostname of the VM
      node.vm.hostname = "k8s-node#{i}"
      # Private-network IP of the VM
      node.vm.network "private_network", ip: "192.168.56.#{99+i}", netmask: "255.255.255.0"
      # Shared folder between the host and the VM
      # node.vm.synced_folder "~/Documents/vagrant/share", "/home/vagrant/share"
      # VirtualBox-specific settings
      node.vm.provider "virtualbox" do |v|
        # VM name
        v.name = "k8s-node#{i}"
        # Memory size (MB)
        v.memory = 2048
        # Number of CPUs
        v.cpus = 2
      end
    end
  end
end
Run vagrant up and wait a few minutes.
Log in to each of the three VMs with vagrant ssh k8s-node1 (and likewise for node2 and node3) and check their IPs:
192.168.56.100 master
192.168.56.101 node1
192.168.56.102 node2
Edit /etc/hosts on all three VMs and add the host entries above. The hostname can be changed with
hostnamectl set-hostname master
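The host entries above can also be appended with a short script run on each VM. A minimal sketch; it writes to a demo file so it is safe to dry-run, and on the real VMs you would point HOSTS at /etc/hosts (with sudo):

```shell
# Append the cluster host entries (run on each VM).
# HOSTS points at a demo copy here; on the VMs use HOSTS=/etc/hosts.
HOSTS=/tmp/hosts.demo
: > "$HOSTS"
cat <<'EOF' >> "$HOSTS"
192.168.56.100 master
192.168.56.101 node1
192.168.56.102 node2
EOF
cat "$HOSTS"
```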
4. Disable the firewall
sudo ufw disable
5. Disable swap
vim /etc/fstab
Comment out the swap line:
/dev/mapper/vagrant--vg-swap_1 none swap sw 0 0
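Commenting the line by hand works; it can also be scripted. A sketch, demonstrated on a sample fstab so it is safe to dry-run. Note that editing fstab alone only takes effect after a reboot, so on the VMs you would additionally run `sudo swapoff -a` to disable swap immediately:

```shell
# Demonstrated on a sample copy; on the VMs point FSTAB at /etc/fstab
# and also run: sudo swapoff -a
FSTAB=/tmp/fstab.demo
cat <<'EOF' > "$FSTAB"
/dev/mapper/vagrant--vg-root / ext4 errors=remount-ro 0 1
/dev/mapper/vagrant--vg-swap_1 none swap sw 0 0
EOF
# Comment out any entry whose filesystem type is swap
sed -i '/\sswap\s/ s/^/#/' "$FSTAB"
cat "$FSTAB"
```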
6. Install the Kubernetes components and start the master node
# Remove old Docker versions
sudo yum remove docker docker-common docker-selinux docker-engine
# Install some dependencies
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# Download the distribution's repo file
sudo wget -O /etc/yum.repos.d/docker-ce.repo https://download.docker.com/linux/centos/docker-ce.repo
# The official repo is too slow; switch the repository URL to the TUNA mirror
sudo sed -i 's+download.docker.com+mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo
# Add the Kubernetes package repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Refresh the package index cache
sudo yum makecache fast
# Install the packages
yum install -y docker-ce-20.1 kubelet kubeadm kubectl
# Start and enable the services
systemctl start docker
systemctl enable docker
systemctl start kubelet
systemctl enable kubelet
Synchronize the VMs' clocks, to prevent cluster-join failures caused by clock skew:
ntpdate -u time.pool.aliyun.com
**Note:** all of the steps above must be performed on both the master and the node machines.
On the master, run:
kubeadm init --kubernetes-version=v1.21.1 \
--apiserver-advertise-address=192.168.56.100 \
--image-repository registry.aliyuncs.com/google_containers \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
After the command above succeeds, save the printed join command:
kubeadm join 192.168.56.100:6443 --token ac5j8h.h9f29939ynyjl0u2 --discovery-token-ca-cert-hash sha256:d93e524c749bc79396a525a23df1bbef5689659eb515c383857b6256e13883e1
Then run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
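At this point kubectl on the master should be able to talk to the API server; a quick sanity check (the node will show NotReady until a network plugin is installed in the next step):

```shell
# Should list the master node and the kube-system control-plane pods
kubectl get nodes
kubectl get pods -n kube-system
```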
- Install the Flannel network plugin:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
- Check that the flannel network was created:
ifconfig | grep flannel
The master node is now set up.
7. Set up the worker nodes
1. Join the cluster with kubeadm join:
kubeadm join 192.168.56.100:6443 --token ac5j8h.h9f29939ynyjl0u2 --discovery-token-ca-cert-hash sha256:d93e524c749bc79396a525a23df1bbef5689659eb515c383857b6256e13883e1
2. Copy admin.conf from the master to node1: scp /etc/kubernetes/admin.conf root@node1:/etc/kubernetes/
3. Configure the KUBECONFIG environment variable
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
4. Install the Flannel network plugin
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
5. Create the directory on node1:
mkdir -p /etc/cni/net.d/
On the master: scp /etc/cni/net.d/* root@nodeip:/etc/cni/net.d/
Run kubectl get nodes and check that the nodes are in the Ready state.
The cluster is now configured, with both the master and the workers Ready.
8. Problems encountered during installation
1. cgroupfs driver warning
Solution:
See: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
Edit /etc/docker/daemon.json as follows:
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "registry-mirrors": ["https://registry.cn-hangzhou.aliyuncs.com"],
  "storage-driver": "overlay2"
}
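After editing daemon.json, Docker has to be restarted for the new cgroup driver to take effect; the standard systemd sequence (not shown in the original notes) is:

```shell
sudo systemctl daemon-reload
sudo systemctl restart docker
# Confirm which cgroup driver Docker is now using
docker info | grep -i 'cgroup driver'
```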
Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf as follows:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --cgroup-driver=cgroupfs"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
KUBELET_HOSTNAME="--hostname-override=master"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Only the --cgroup-driver=cgroupfs argument was added.
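As with the Docker change, the edited kubelet drop-in only takes effect after systemd reloads it and the service restarts:

```shell
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```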
2.error: couldn't get available api versions from server: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
Solution:
Set the KUBECONFIG environment variable:
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
3. On a worker node: The connection to the server localhost:8080 was refused - did you specify the right host or port?
Run: mv /etc/kubernetes/kubelet.conf /etc/kubernetes/admin.conf
4. The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error:
Solution:
Edit /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
and add the line: Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false"
5. If a worker node hits port-in-use or similar errors, run kubeadm reset, then initialize the node again.
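Note that the join token printed by kubeadm init expires after 24 hours, so if a reset node rejoins later, a fresh join command can be generated on the master with the standard kubeadm subcommand:

```shell
# Run on the master; prints a complete kubeadm join command with a new token
kubeadm token create --print-join-command
```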
6. When the worker nodes cannot ping each other
kubectl get po -n kube-system
coredns-545d6fc579-7k2h6 0/1 Running 0 5h23m
coredns-545d6fc579-b67t6 0/1 Running 0 5h23m
If the pods are in the state shown above, the coredns add-on is most likely the problem.
Check the pod's status with:
kubectl logs -f coredns-545d6fc579-7k2h6 -n kube-system
Try editing the clusterrole:
kubectl edit clusterrole system:coredns
+- apiGroups:
+ - discovery.k8s.io
+ resources:
+ - endpointslices
+ verbs:
+ - list
+ - watch
This adds the discovery.k8s.io API group and the endpointslices resource. Checking again, the logs still showed the following error:
E0529 18:58:47.544862 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "namespaces" in API group "" at the cluster scope
In the end I removed the add-on and reinstalled it, after which everything worked (I don't know the root cause).
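A lighter-weight alternative to reinstalling the whole add-on is to force the coredns pods to be recreated; a sketch using kubectl's rollout subcommands:

```shell
# Recreate the coredns pods; the Deployment controller brings them back
kubectl -n kube-system rollout restart deployment coredns
# Wait until the new pods are up
kubectl -n kube-system rollout status deployment coredns
```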
Now everything is running normally.
All sorts of problems came up during installation; the above are just the few that left the deepest impression. The cluster setup is now complete.