1: Prepare all nodes
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
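yum clean all && yum makecache # switch to the Aliyun mirrors and rebuild the cache
swapoff -a # turn swap off for the current boot
vim /etc/fstab # delete or comment out the swap line so swap stays off after a reboot
yum install -y net-tools vim git wget # common tools; netstat is provided by net-tools, it is not a separate package
setenforce 0 # put SELinux in permissive mode for the current boot
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config # disable SELinux permanently (takes effect after a reboot)
systemctl stop firewalld && systemctl disable firewalld # stop firewalld now and keep it off after a reboot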
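The manual fstab edit above can also be scripted. The following is only a sketch: disable_swap_in_fstab is a hypothetical helper name, it assumes GNU sed (as shipped with CentOS 7), and it comments the swap entry out rather than deleting it so the change is easy to undo.

```shell
# Comment out every active swap entry in an fstab-style file.
# The file path is an argument so the helper can be tried on a copy first.
disable_swap_in_fstab() {
    fstab="${1:-/etc/fstab}"
    # Keep a backup, then prefix each uncommented line that has a
    # whitespace-delimited "swap" field with '#'.
    cp "$fstab" "$fstab.bak" &&
    sed -i '/^[[:space:]]*#/! { /[[:space:]]swap[[:space:]]/ s/^/#/; }' "$fstab"
}

# Example (as root): disable_swap_in_fstab /etc/fstab
```

Running it on a copy of /etc/fstab first and comparing the result with diff is a cheap way to confirm that only the swap line changed.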
If you do not want the kernel to be updated, block kernel updates:
[root@spgpu ~]# vim /etc/yum.conf
Add the following line to the [main] section:
exclude=kernel* centos-release
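# Set up name resolution on all nodes (later Kubernetes steps refer to the nodes by these domain names)
vim /etc/hosts # append the IP address and hostname of every cluster node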
192.168.1.171 k8s-master.hikvision.com
192.168.1.172 k8s-node1.hikvision.com
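# Then set the hostname (the master is shown here; use each node's own name on that node)
hostnamectl set-hostname k8s-master.hikvision.com
cat /etc/hostname # confirm that the new name was written
# After setting the names you can reboot and check that the hostname persisted
Keep the output of the hostname command, /etc/hostname, and /etc/hosts consistent with one another to avoid problems later in the deployment.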
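The consistency check described above can be automated. This is a minimal sketch, and host_in_hosts_file is a hypothetical helper name: it reports whether a given name appears as a hostname field in a hosts-style file.

```shell
# Succeed when the given hostname is listed as a name in a hosts-style file.
host_in_hosts_file() {
    name="$1"
    hosts_file="${2:-/etc/hosts}"
    # Skip comment lines, then compare every field after the IP address.
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$hosts_file"
}

# Example:
# host_in_hosts_file "$(hostname)" /etc/hosts || echo "hostname is missing from /etc/hosts"
```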
# Load the br_netfilter module
lsmod |grep br_netfilter
# If the module is not loaded, run the commands below; otherwise skip them
# Load br_netfilter for the current boot:
$ modprobe br_netfilter
$ lsmod |grep br_netfilter
br_netfilter 22256 0
bridge 151336 1 br_netfilter
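# A module loaded this way is gone after a reboot
# Load br_netfilter on every boot:
cat > /etc/rc.sysinit << 'EOF'
#!/bin/bash
# The quoted 'EOF' keeps $file from being expanded while this script is written
for file in /etc/sysconfig/modules/*.modules ; do
[ -x "$file" ] && "$file"
done
EOF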
cat > /etc/sysconfig/modules/br_netfilter.modules << EOF
modprobe br_netfilter
EOF
chmod 755 /etc/sysconfig/modules/br_netfilter.modules
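On systemd-based releases such as CentOS 7, systemd-modules-load also reads module names from files under /etc/modules-load.d at boot. As an alternative sketch (persist_kernel_module is a hypothetical helper name, not part of the original procedure), the module can be registered there:

```shell
# Write a modules-load.d drop-in that names one kernel module.
# The directory is an argument so the helper can be tried outside /etc.
persist_kernel_module() {
    module="$1"
    conf_dir="${2:-/etc/modules-load.d}"
    mkdir -p "$conf_dir" &&
    printf '%s\n' "$module" > "$conf_dir/$module.conf"
}

# Example (as root): persist_kernel_module br_netfilter
```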
vi /etc/sysctl.conf # add the following two lines:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
## Without these lines, docker info later prints the following warnings
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Finally, apply the settings:
sysctl -p
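The two settings can also be written without an interactive editor. The sketch below uses a hypothetical helper, write_bridge_sysctl, and assumes a drop-in file named /etc/sysctl.d/k8s.conf (the file name is an arbitrary choice). Note that sysctl -p only reads /etc/sysctl.conf, so a drop-in is applied with sysctl --system instead.

```shell
# Write the two bridge netfilter settings to a sysctl drop-in file.
write_bridge_sysctl() {
    target="${1:-/etc/sysctl.d/k8s.conf}"
    cat > "$target" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}

# Example (as root, with br_netfilter already loaded):
# write_bridge_sysctl && sysctl --system
```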
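2: Install Docker on all nodes
# Install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the Docker repository
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Update installed packages (this upgrades the packages themselves, not just the repository metadata)
yum update
# Install the latest Docker version
yum -y install docker-ce docker-ce-cli containerd.io
# To install a specific Docker version, see:
https://blog.csdn.net/mayi_xiaochaun/article/details/123421532?spm=1001.2014.3001.5501
# Start Docker and enable it at boot
systemctl start docker && systemctl enable docker
# Add a registry mirror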
touch /etc/docker/daemon.json
sudo tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://ogeydad1.mirror.aliyuncs.com"]
}
EOF
Restart Docker: systemctl daemon-reload && systemctl restart docker
Test Docker:
docker info
docker run hello-world
3: Install kubelet, kubeadm, and kubectl on all nodes
# Configure a Kubernetes repository hosted in China (Aliyun)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
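# Refresh the cache
yum clean all && yum makecache
# List the available versions (shown for kubelet; kubeadm and kubectl can be listed the same way)
yum list kubelet --showduplicates|sort -r
# Install
yum install -y kubelet-1.21.0-0 kubeadm-1.21.0-0 kubectl-1.21.0-0
# Enable kubelet at boot, but do not start it yet. If kubelet is started before the node joins the cluster, it cannot find its config.yaml, fails to come up, and reports "connection refused" on port 10248
systemctl enable kubelet
# Enable kubectl command completion
echo "source <(kubectl completion bash)" >> ~/.bash_profile
source ~/.bash_profile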
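Because the three packages have to be pinned to the same version, it can help to build the package list from a single version string. A small sketch, where k8s_packages is a hypothetical helper and the "-0" release suffix is taken from the install command above:

```shell
# Print the pinned kubelet, kubeadm, and kubectl package names for one version.
k8s_packages() {
    version="$1"
    release="${2:-0}"   # RPM release suffix; "0" matches the packages used above
    printf 'kubelet-%s-%s kubeadm-%s-%s kubectl-%s-%s\n' \
        "$version" "$release" "$version" "$release" "$version" "$release"
}

# Example: yum install -y $(k8s_packages 1.21.0)
```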
4: Configure the master node and initialize the cluster
4.1: Export the default configuration and edit it
mkdir -p /usr/local/docker/kubernetes
cd /usr/local/docker/kubernetes/
kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm.yaml # 1 export the default configuration
root@kubernetes-master:/usr/local/docker/kubernetes# vim kubeadm.yaml # 2 edit kubeadm.yaml so that it reads as follows
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.171 # IP address of the master machine
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master.hikvision.com # hostname of the master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers # from mainland China the images have to be pulled from this mirror
kind: ClusterConfiguration
kubernetesVersion: 1.21.0
networking:
  dnsDomain: cluster.local
  podSubnet: "10.0.0.0/16" # Pod subnet used by Calico: if your hosts are on 192.168.*.* use 10.0.0.0/16 here; if they are on 10.0.*.* use 192.168.0.0/16
  serviceSubnet: 10.96.0.0/12 # subnet for Services
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: cgroupfs
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
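The podSubnet rule from the comment in the ClusterConfiguration section can be written down as a small lookup. This is only a sketch of that one rule: choose_pod_subnet is a hypothetical helper, and it deliberately gives no answer for host networks the rule does not cover.

```shell
# Suggest a podSubnet that does not collide with the host network,
# following the rule stated in the kubeadm.yaml comment.
choose_pod_subnet() {
    case "$1" in
        192.168.*) echo "10.0.0.0/16" ;;
        10.0.*)    echo "192.168.0.0/16" ;;
        *)         echo "no suggestion for $1; pick a range that does not overlap the host network" >&2
                   return 1 ;;
    esac
}

# Example: choose_pod_subnet 192.168.1.171
```

The value printed for the master's address is what goes into networking.podSubnet.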
4.2: List the images required by the YAML file and pull them
kubeadm config images list --config kubeadm.yaml
kubeadm config images pull --config kubeadm.yaml
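If pulling coredns:v1.8.0 fails:
Look the image up on Docker Hub. The official repository has no v1.8.0 tag, only 1.8.0, so pull the official image directly:
docker pull coredns/coredns:1.8.0
Check the result with docker images, then change the tag. Docker can retag a local image, so give the image the tag that kubeadm expects (replace 296a6d5035e2 with the image ID that docker images shows on your machine):
docker tag 296a6d5035e2 registry.cn-hangzhou.aliyuncs.com/google_containers/coredns/coredns:v1.8.0
Run docker images again to confirm the new tag.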
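Retagging by name instead of by image ID avoids copying an ID that may differ between machines. A sketch under stated assumptions: both function names are hypothetical, and the target name follows the repository/coredns/coredns:vX.Y.Z pattern seen in the docker tag command above.

```shell
# Build the image name that the docker tag command above targets.
coredns_target_image() {
    repo="$1"      # the imageRepository value from kubeadm.yaml
    version="$2"   # upstream tag without the leading "v", e.g. 1.8.0
    printf '%s/coredns/coredns:v%s\n' "$repo" "$version"
}

# Pull the upstream image and retag it (needs a running Docker daemon).
retag_coredns() {
    docker pull "coredns/coredns:$2" &&
    docker tag "coredns/coredns:$2" "$(coredns_target_image "$1" "$2")"
}

# Example:
# retag_coredns registry.cn-hangzhou.aliyuncs.com/google_containers 1.8.0
```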
4.3: Initialize the cluster from the edited configuration file
kubeadm init --config=kubeadm.yaml --upload-certs | tee kubeadm-init.log
# On success, the output ends with:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.171:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:09859fe7b4a268409af878a0842013c15b118f1ed6516fa46fe6dad2f128cdf9
4.4: Set up kubeconfig
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf # alternative for the root user; applies to the current shell only
5: Kubernetes networking
5.1: Install the Calico network from the master node
The Calico download and installation steps are:
cd /usr/local/docker/kubernetes
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
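If the command above fails with an error, fetch the manifest with wget and apply the local copy; otherwise no further action is needed:
wget https://docs.projectcalico.org/manifests/calico.yaml
If wget reports a certificate verification error, add --no-check-certificate as its message suggests.
Once the download finishes, install from the local calico.yaml: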
kubectl apply -f calico.yaml
5.2: Install the flannel network from the master node (the procedure is the same as for Calico)
Download location:
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
6: Join the worker nodes to the cluster
kubeadm join 192.168.1.171:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:09859fe7b4a268409af878a0842013c15b118f1ed6516fa46fe6dad2f128cdf9
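Since step 4.3 saved the init output to kubeadm-init.log, the join command can be recovered from that file instead of being retyped. A minimal sketch, assuming the log keeps the two-line "kubeadm join ... \" layout shown in 4.3; extract_join_command is a hypothetical helper name.

```shell
# Print the first "kubeadm join" command from a saved init log as one line.
extract_join_command() {
    log_file="${1:-kubeadm-init.log}"
    awk '
        /kubeadm join/ { grab = 1 }
        grab {
            sub(/^[ \t]+/, "")                # drop leading indentation
            more = sub(/[ \t]*\\$/, "")       # drop a trailing backslash, remember it
            cmd = (cmd == "" ? $0 : cmd " " $0)
            if (!more) { print cmd; exit }    # no continuation: command is complete
        }
    ' "$log_file"
}

# Example (on the master): extract_join_command /usr/local/docker/kubernetes/kubeadm-init.log
```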
7: Check the cluster from the master node
kubectl get node
kubectl get pod -n kube-system
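Nodes usually stay NotReady until the network add-on from section 5 is running, so it is worth checking the STATUS column rather than only the node list. A sketch that counts nodes that are not Ready from the default kubectl get node table; count_not_ready_nodes is a hypothetical helper and it assumes STATUS is the second column.

```shell
# Read "kubectl get node" output on stdin and print how many nodes are not Ready.
count_not_ready_nodes() {
    # NR > 1 skips the header row; "Ready,SchedulingDisabled" still counts as Ready.
    awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { n++ } END { print n + 0 }'
}

# Example: kubectl get node | count_not_ready_nodes
```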
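8: Common errors
For troubleshooting, see the post "k8s问题处理" on mayi_xiaochaun's CSDN blog.
1: kubelet was started before the node joined the cluster, which causes connection timeouts.
2: The cgroup driver set in /etc/docker/daemon.json (for example "exec-opts": ["native.cgroupdriver=cgroupfs"]) differs from the cgroupDriver value in /var/lib/kubelet/config.yaml (for example cgroupDriver: cgroupfs). Both files must name the same driver; systemd is generally recommended, but cgroupfs also works.
After making them match, run systemctl daemon-reload
Then restart both services: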
systemctl restart docker
systemctl restart kubelet
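The mismatch in item 2 can be detected before restarting anything. The sketch below reads the two files named above; the function names are hypothetical, and an empty result from docker_cgroup_driver means daemon.json sets no driver, in which case Docker's own default applies.

```shell
# Print the driver named by native.cgroupdriver= in daemon.json (empty if unset).
docker_cgroup_driver() {
    sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "${1:-/etc/docker/daemon.json}"
}

# Print the cgroupDriver value from the kubelet configuration file.
kubelet_cgroup_driver() {
    sed -n 's/^cgroupDriver:[[:space:]]*\([a-z]*\).*/\1/p' "${1:-/var/lib/kubelet/config.yaml}"
}

# Succeed only when both files name the same driver.
cgroup_drivers_match() {
    [ "$(docker_cgroup_driver "$1")" = "$(kubelet_cgroup_driver "$2")" ]
}

# Example: cgroup_drivers_match || echo "cgroup drivers differ"
```

For the driver Docker is actually running with, docker info --format '{{.CgroupDriver}}' is the more direct check.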
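9: Token expired: create a new token
Symptom: a worker node fails to join the cluster.
1: Generate a new token and hash. Adding --print-join-command when creating the token prints the complete join command, which is convenient because the token exists only to add nodes. --ttl=0 creates a token that never expires. Without --ttl the token is valid for 24 hours by default, and within that time any number of workers can join with it.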
kubeadm token create --print-join-command --ttl=0
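When a token and hash are pasted by hand, a quick format check catches truncated values before the join is attempted. kubeadm bootstrap tokens have the form of 6 lowercase alphanumeric characters, a dot, and 16 more. Both helpers below are hypothetical names and only assemble the same command line used in section 6.

```shell
# Succeed when the argument has the kubeadm bootstrap token format.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Assemble a join command from endpoint, token, and CA cert hash (hex, without "sha256:").
build_join_command() {
    is_bootstrap_token "$2" || { echo "invalid token: $2" >&2; return 1; }
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' "$1" "$2" "$3"
}

# Example: build_join_command 192.168.1.171:6443 abcdef.0123456789abcdef <hash>
```

kubeadm token list on the master shows which tokens exist and when they expire.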
10: Final result:
Master node:
Worker node: