Environment:
- Ubuntu 18.04 64-bit
- master IP: 192.168.1.131
- node1 IP: 192.168.1.132
- Gateway and DNS server: 192.168.1.100
On Windows 10, the subnet's gateway and DNS server addresses can be found under Network & Internet settings via "Status -> View hardware and connection properties".
I. Ubuntu system configuration:
To fix the problem where shutdown hangs for a long time on "A stop job is running for Snappy daemon":
sudo gedit /etc/systemd/system.conf
Find the following two lines (around line 35):
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
Remove the leading '#' and change 90 to 10:
DefaultTimeoutStartSec=10s
DefaultTimeoutStopSec=10s
Run the following to make the change take effect:
sudo systemctl daemon-reload
Install the required packages:
sudo apt-get install -y curl telnet wget man apt-transport-https ca-certificates software-properties-common vim
Set the time zone and synchronize the time:
Set the time zone:
sudo timedatectl set-timezone Asia/Shanghai
# or alternatively
sudo dpkg-reconfigure tzdata
# After the change, restart rsyslog if you want system log timestamps to pick up the new time zone immediately:
sudo systemctl restart rsyslog
Time synchronization:
sudo apt-get install ntpdate
# Sync the system time with a network time server (cn.pool.ntp.org is a public NTP pool located in China)
sudo ntpdate cn.pool.ntp.org
sudo hwclock --systohc
date
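As an optional sanity check (not part of the original steps), timedatectl summarizes the time zone and clock state in one place:
# should report "Time zone: Asia/Shanghai" together with the local and universal time
timedatectl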
II. Installing Docker manually
https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/amd64/
Download: docker-ce_18.06.0_ce_3-0_ubuntu_amd64.deb
A manual install does not require adding a package repository.
Install:
sudo dpkg -i docker-ce_18.06.0_ce_3-0_ubuntu_amd64.deb
Verify:
sudo docker version
The output is:
Client:
Version: 18.06.0-ce
API version: 1.38
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:09:54 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.0-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:07:56 2018
OS/Arch: linux/amd64
Experimental: false
Enable Docker at boot and start it:
sudo systemctl enable docker
sudo systemctl start docker
Reboot, then confirm that Docker is running:
sudo docker ps
Configure a domestic (China) Docker registry mirror:
Create the file /etc/docker/daemon.json with the following content:
{
"registry-mirrors" : ["https://xxxx.mirror.aliyuncs.com"]
}
The address above is my own Alibaba Cloud registry accelerator; replace it with your own.
Restart the Docker service:
# reload all modified configuration files
sudo systemctl daemon-reload
# restart the Docker service
sudo systemctl restart docker
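To confirm the mirror is actually in use (an optional check, not in the original steps), docker info lists the configured registry mirrors:
sudo docker info | grep -A 1 "Registry Mirrors"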
Verify by running an image:
sudo docker run -it --rm alpine:latest sh
III. Installing and deploying k8s (common to both master and node):
1. Environment preparation:
Disable swap:
sudo swapoff -a
# permanently disable the swap partition
sudo sed -i 's/.*swap.*/#&/' /etc/fstab
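To double-check that swap is really off (optional), free should show 0 for swap and swapon should print nothing:
free -h
swapon --show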
Disable the firewall:
sudo ufw disable
Edit /etc/sysctl.d/k8s.conf and add the following:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness = 0
Run the following commands to make the changes take effect:
sudo modprobe br_netfilter
sudo sysctl -p /etc/sysctl.d/k8s.conf
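Note that modprobe only loads br_netfilter for the current boot. As an optional addition to the original steps, you can have it loaded automatically on every boot and then verify the sysctl values:
# load br_netfilter at boot via systemd-modules-load
echo "br_netfilter" | sudo tee /etc/modules-load.d/k8s.conf
# both values should print 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables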
2. Installing k8s:
#sudo touch /etc/apt/sources.list.d/kubernetes.list
#sudo chmod 666 /etc/apt/sources.list.d/kubernetes.list
sudo gedit /etc/apt/sources.list.d/kubernetes.list
Add the following source:
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
Or run:
cat << EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
Then run:
sudo apt update
It will complain:
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 6A030B21BA07F4FB NO_PUBKEY 8B57C5C2836F4BEB
Note the last 8 characters of each NO_PUBKEY value and run:
sudo gpg --keyserver keyserver.ubuntu.com --recv-keys BA07F4FB 836F4BEB
sudo gpg --export --armor BA07F4FB 836F4BEB | sudo apt-key add -
sudo apt update
Install the latest k8s version:
sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
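Optionally (not in the original steps), pin the three packages so that a later apt upgrade does not move the cluster to a newer version unexpectedly:
sudo apt-mark hold kubelet kubeadm kubectl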
Enable kubelet at boot and start it:
sudo systemctl enable kubelet && sudo systemctl start kubelet
sudo shutdown -r now
kubelet --version
Kubernetes v1.20.1
A VM configured up to this point serves as the common base for both master and node; what follows is the master-specific configuration.
3. Network configuration and hostname change:
Edit the network interface configuration:
sudo gedit /etc/netplan/01-network-manager-all.yaml
Enter:
# Let NetworkManager manage all devices on this system
network:
  version: 2
  ethernets:
    ens33:                            # name of the NIC to configure
      dhcp4: false                    # disable DHCP
      addresses: [192.168.1.131/24]   # static IP address and prefix of this host
      gateway4: 192.168.1.100         # gateway
      nameservers:
        addresses: [192.168.1.100]    # DNS server
      optional: true
The /24 after the IP address is the prefix length, i.e. the number of 1 bits in the subnet mask.
Apply the configuration:
sudo netplan apply
Verify the configuration:
ifconfig
ping baidu.com
Change the hostname:
# set the hostname
sudo hostnamectl set-hostname master
cat /etc/hostname
# configure hosts
sudo gedit /etc/hosts
# add:
192.168.1.131 master
# node addresses do not need to be added here
IV. Configuring the master node
Create a working directory:
mkdir /home/yanghua/working
cd /home/yanghua/working
sudo kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml
Edit the generated kubeadm.yml; the main items to change are:
- imageRepository
- kubernetesVersion
- localAPIEndpoint
- networking
cat kubeadm.yml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.131   # change to the master IP address
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers   # change to the Alibaba Cloud repository
kind: ClusterConfiguration
kubernetesVersion: v1.20.1   # check the current version with kubectl version
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16    # added; matches the flannel default pod network
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Check which images need to be pulled:
kubeadm config images list --config kubeadm.yml
If it reports:
kubeadm config images list --config kubeadm.yml
W0111 21:54:41.885266 5008 kubelet.go:200] cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': exit status 1
then run the command with root privileges, since docker itself requires them:
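# the same command, elevated; with sudo the docker info call succeeds and the warning disappears
sudo kubeadm config images list --config kubeadm.yml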
Pull the images:
sudo kubeadm config images pull --config ./kubeadm.yml
Initialize the master node:
sudo kubeadm init --config=kubeadm.yml --upload-certs | tee kubeadm-init.log
The end of the log output is:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.131:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:27e1eb261ecd5d5d8ee9005bf810416965998aef72ac7780e62e46cec14d7f86
Note down the join command and token from the log output:
kubeadm join 192.168.1.131:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:27e1eb261ecd5d5d8ee9005bf810416965998aef72ac7780e62e46cec14d7f86
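The bootstrap token above is only valid for 24 hours (ttl: 24h0m0s in kubeadm.yml). If a node needs to join after it has expired, a fresh join command can be generated on the master (an addition to the original steps):
sudo kubeadm token create --print-join-command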
Following the hint in the log output above (configuring kubectl), run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Set up the system service and start it:
# enable kubelet to start at boot
sudo systemctl enable kubelet
# start the k8s service
sudo systemctl start kubelet
Verify that Kubernetes has started:
kubectl get nodes
Check the current k8s cluster status:
kubectl get cs
If you see:
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
then comment out the "- --port=0" line in kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
#    - --port=0
    image: registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.1
    imagePullPolicy: IfNotPresent
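The kubelet watches /etc/kubernetes/manifests and recreates these static pods automatically once the files are saved, so after a short wait the status check can simply be repeated (optional verification):
# scheduler, controller-manager and etcd-0 should all report Healthy now
kubectl get cs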
V. Deploying the flannel network for intra-cluster communication
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
That URL may be blocked; in that case, download kube-flannel.yml manually, copy it to the VM, and run the following in the directory containing it:
kubectl apply -f kube-flannel.yml
Then run:
kubectl get node
The output:
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 11m v1.20.1
The master's STATUS column now shows Ready.
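Optionally, you can also confirm that the flannel and CoreDNS pods in the kube-system namespace are running (not part of the original steps):
kubectl get pods -n kube-system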
VI. Configuring the node
Edit the network interface configuration:
sudo gedit /etc/netplan/01-network-manager-all.yaml
Enter:
# Let NetworkManager manage all devices on this system
network:
  version: 2
  ethernets:
    ens33:                            # name of the NIC to configure
      dhcp4: false                    # disable DHCP
      addresses: [192.168.1.132/24]   # static IP address and prefix of this host
      gateway4: 192.168.1.100         # gateway
      nameservers:
        addresses: [192.168.1.100]    # DNS server
      optional: true
Apply the configuration:
sudo netplan apply
Change the hostname:
# set the hostname
sudo hostnamectl set-hostname node1
# configure hosts
sudo gedit /etc/hosts
# add:
192.168.1.132 node1
Copy /etc/kubernetes/admin.conf from the master to node1.
For example, drag the master's admin.conf into node1's Desktop directory, then run:
mkdir -p $HOME/.kube
sudo cp -i /home/yanghua/Desktop/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Have node1 join the cluster:
sudo kubeadm join 192.168.1.131:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:27e1eb261ecd5d5d8ee9005bf810416965998aef72ac7780e62e46cec14d7f86
Then, on the master, run:
kubectl get nodes
The output is:
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 46m v1.20.1
node1 NotReady <none> 28s v1.20.1
At this point node1's STATUS is NotReady.
Start the flannel network on node1:
Copy kube-flannel.yml from the master to node1 and, in the directory containing it, run:
kubectl apply -f kube-flannel.yml
After running this command, the flannel interface is also visible on node1 via ifconfig.
On the master, run:
kubectl get nodes
The output is:
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 54m v1.20.1
node1 Ready <none> 8m15s v1.20.1
node1 now shows the Ready status.
At this point the k8s environment on master and node1 is ready; additional nodes such as node2 can be set up by repeating the node1 steps.
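As a final smoke test (an addition to the original steps), a small deployment confirms that pods are scheduled onto node1 (the master carries a NoSchedule taint) and that a service can be exposed:
# create a test deployment and expose it on a NodePort
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
# the pod should land on node1 and reach Running; note the assigned NodePort
kubectl get pods -o wide
kubectl get svc nginx
# clean up afterwards
kubectl delete svc,deployment nginx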