1. View all cluster contexts and the current context
kubectl config get-contexts
kubectl config current-context
cat ~/.kube/config | grep current
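An alternative that prints only the current context via jsonpath (a standard kubectl output option):
kubectl config view -o jsonpath='{.current-context}'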
2. Schedule a Pod on the master node
k get node
k describe node cluster1-master1 |grep Taint
k get node cluster1-master1 --show-labels
k run pod1 --image=httpd:2.4.41-alpine $do > 2.yaml
vim 2.yaml
  tolerations:                                  # add
  - effect: NoSchedule                          # add
    key: node-role.kubernetes.io/master         # add
  - effect: NoSchedule                          # add
    key: node-role.kubernetes.io/control-plane  # add
  nodeSelector:                                 # add
    node-role.kubernetes.io/control-plane: ""   # add
status: {}
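Create the Pod and verify it was scheduled onto the control-plane node (a quick check):
k create -f 2.yaml
k get pod pod1 -o wide   # NODE column should show cluster1-master1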
3. Scale the StatefulSet down to 1 replica
k scale statefulset/osdb --replicas=1 -n <namespace>   # substitute the task's namespace
4. Pod ready if Service is reachable
k run ready-if-service-ready --image=nginx:1.16.1-alpine $do > 4-pod1.yaml
vim 4-pod1.yaml
  livenessProbe:                                      # add from here
    exec:
      command:
      - 'true'
  readinessProbe:
    exec:
      command:
      - sh
      - -c
      - 'wget -T2 -O- http://service-am-i-ready:80'   # to here
k run am-i-ready --image=nginx:1.16.1-alpine --labels="id=cross-server-ready"
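The readiness probe only succeeds once the Service has endpoints, i.e. once the am-i-ready Pod is running (a quick check, assuming the existing Service service-am-i-ready selects the label id=cross-server-ready):
k get ep service-am-i-ready        # should list the am-i-ready Pod IP
k get pod ready-if-service-ready   # READY should switch to 1/1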
5. kubectl sort
There are various Pods in all namespaces. Write a command into /opt/course/5/find_pods.sh which lists all Pods sorted by their AGE (metadata.creationTimestamp).
k get pod -A --sort-by=.metadata.creationTimestamp
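Writing the command into the requested script (a sketch):
echo 'kubectl get pod -A --sort-by=.metadata.creationTimestamp' > /opt/course/5/find_pods.sh
sh /opt/course/5/find_pods.sh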
6. Storage: PV, PVC, Pod volume
# 6_pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: safari-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/Volumes/Data"
# 6_pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: safari-pvc
  namespace: project-tiger
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
k -n project-tiger create deploy safari \
--image=httpd:2.4.41-alpine $do > 6_dep.yaml
      volumes:                          # add
      - name: data                      # add
        persistentVolumeClaim:          # add
          claimName: safari-pvc         # add
      containers:
      - image: httpd:2.4.41-alpine
        name: container
        volumeMounts:                   # add
        - name: data                    # add
          mountPath: /tmp/safari-data   # add
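Verify the binding and the running Pod (a quick check):
k -n project-tiger get pv,pvc    # both should show STATUS Bound
k -n project-tiger get pod | grep safari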
7. Node and Pod resource usage
k top node
Show Pods and their containers' resource usage:
kubectl top pod --containers=true
8. Get master information
Approach: for each component, check whether it runs as a process or a Pod; look under /etc/systemd/system for a service unit, check /etc/kubernetes/manifests for a static Pod YAML, and otherwise look for a regular Pod.
SSH into the master node with ssh cluster1-master1. Check how the master components kubelet, kube-apiserver, kube-scheduler, kube-controller-manager and etcd are started/installed on the master node. Also find out the name of the DNS application and how it's started/installed on the master node.
# /opt/course/8/master-components.txt
kubelet: [TYPE]
kube-apiserver: [TYPE]
kube-scheduler: [TYPE]
kube-controller-manager: [TYPE]
etcd: [TYPE]
dns: [TYPE] [NAME]
ps aux | grep kubelet # shows kubelet process
find /etc/systemd/system/ | grep kube
/etc/systemd/system/kubelet.service.d
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
/etc/systemd/system/multi-user.target.wants/kubelet.service
find /etc/systemd/system/ | grep etcd
find /etc/kubernetes/manifests/
/etc/kubernetes/manifests/
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/etcd.yaml
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml
kubectl -n kube-system get pod -o wide | grep master1
coredns-5644d7b6d9-c4f68 1/1 Running ... cluster1-master1
coredns-5644d7b6d9-t84sc 1/1 Running ... cluster1-master1
etcd-cluster1-master1 1/1 Running ... cluster1-master1
kube-apiserver-cluster1-master1 1/1 Running ... cluster1-master1
kube-controller-manager-cluster1-master1 1/1 Running ... cluster1-master1
kube-proxy-q955p 1/1 Running ... cluster1-master1
kube-scheduler-cluster1-master1 1/1 Running ... cluster1-master1
weave-net-mwj47 2/2 Running ... cluster1-master1
kubectl -n kube-system get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
coredns 2/2 2 2 155m
# /opt/course/8/master-components.txt
kubelet: process
kube-apiserver: static-pod
kube-scheduler: static-pod
kube-controller-manager: static-pod
etcd: static-pod
dns: pod coredns
9. Kill the scheduler, manual scheduling
SSH into the master node and move the scheduler's static Pod YAML out of the manifests directory:
cd /etc/kubernetes/manifests/
mv kube-scheduler.yaml ..
With the scheduler gone, create a Pod and confirm it stays Pending, then schedule it manually via nodeName:
k run manual-schedule --image=httpd:2.4-alpine
Check its status, then take the Pod's YAML, add nodeName: cluster1-master1 and re-create it (see the sketch below).
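A minimal sketch of that change (nodeName is immutable on a running Pod, so it has to be replaced):
k get pod manual-schedule -o yaml > 9.yaml
# in 9.yaml, under spec:
#   nodeName: cluster1-master1   # add: assigns the node directly, bypassing the scheduler
k replace -f 9.yaml --force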
mv ../kube-scheduler.yaml ./
Newly created Pods are scheduled again:
kubectl run manual-schedule2 --image=httpd:2.4-alpine
10. RBAC
k -n project-hamster create sa processor
k -n project-hamster create role processor \
--verb=create \
--resource=secret \
--resource=configmap
k -n project-hamster create rolebinding processor \
--role processor \
--serviceaccount project-hamster:processor
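Verify the permissions with kubectl auth can-i (a quick check):
k -n project-hamster auth can-i create secret --as system:serviceaccount:project-hamster:processor      # yes
k -n project-hamster auth can-i create configmap --as system:serviceaccount:project-hamster:processor   # yes
k -n project-hamster auth can-i delete secret --as system:serviceaccount:project-hamster:processor      # no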
11. DaemonSet
k -n project-tiger create deployment --image=httpd:2.4-alpine ds-important $do > 11.yaml
vi 11.yaml
# 11.yaml
apiVersion: apps/v1
kind: DaemonSet                                      # change from Deployment to DaemonSet
metadata:
  creationTimestamp: null
  labels:                                            # add
    id: ds-important                                 # add
    uuid: 18426a0b-5f59-4e10-923f-c0e078e82462       # add
  name: ds-important
  namespace: project-tiger                           # important
spec:
  #replicas: 1                                       # remove
  selector:
    matchLabels:
      id: ds-important                               # add
      uuid: 18426a0b-5f59-4e10-923f-c0e078e82462     # add
  #strategy: {}                                      # remove
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: ds-important                             # add
        uuid: 18426a0b-5f59-4e10-923f-c0e078e82462   # add
    spec:
      containers:
      - image: httpd:2.4-alpine
        name: ds-important
        resources:
          requests:                                  # add
            cpu: 10m                                 # add
            memory: 10Mi                             # add
      tolerations:                                   # add
      - effect: NoSchedule                           # add
        key: node-role.kubernetes.io/master          # add
      - effect: NoSchedule                           # add
        key: node-role.kubernetes.io/control-plane   # add
#status: {}                                          # remove
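Apply and verify one Pod per node, control plane included (a quick check):
k create -f 11.yaml
k -n project-tiger get ds ds-important
k -n project-tiger get pod -l id=ds-important -o wide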
12. Deployment on all nodes
With 3 replicas and only 2 worker nodes, two Pods are scheduled (one per worker) and the third stays Pending.
There are two possible ways: one using podAntiAffinity and one using topologySpreadConstraints.
k -n project-tiger create deployment \
--image=nginx:1.17.6-alpine deploy-important $do > 12.yaml
vim 12.yaml
PodAntiAffinity
# 12.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    id: very-important                                      # change
  name: deploy-important
  namespace: project-tiger                                  # important
spec:
  replicas: 3                                               # change
  selector:
    matchLabels:
      id: very-important                                    # change
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        id: very-important                                  # change
    spec:
      containers:
      - image: nginx:1.17.6-alpine
        name: container1                                    # change
        resources: {}
      - image: kubernetes/pause                             # add
        name: container2                                    # add
      affinity:                                             # add
        podAntiAffinity:                                    # add
          requiredDuringSchedulingIgnoredDuringExecution:   # add
          - labelSelector:                                  # add
              matchExpressions:                             # add
              - key: id                                     # add
                operator: In                                # add
                values:                                     # add
                - very-important                            # add
            topologyKey: kubernetes.io/hostname             # add
status: {}
TopologySpreadConstraints
      topologySpreadConstraints:              # add
      - maxSkew: 1                            # add
        topologyKey: kubernetes.io/hostname   # add
        whenUnsatisfiable: DoNotSchedule      # add
        labelSelector:                        # add
          matchLabels:                        # add
            id: very-important                # add
status: {}
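With either variant, two replicas should run (one per worker) and the third should stay Pending (a quick check):
k -n project-tiger get deploy deploy-important
k -n project-tiger get pod -o wide -l id=very-important   # one Pod remains Pending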
13. Multi-container Pod with a shared volume
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: multi-container-playground
  name: multi-container-playground
spec:
  containers:
  - image: nginx:1.17.6-alpine
    name: c1                                    # change
    resources: {}
    env:                                        # add
    - name: MY_NODE_NAME                        # add
      valueFrom:                                # add
        fieldRef:                               # add
          fieldPath: spec.nodeName              # add
    volumeMounts:                               # add
    - name: vol                                 # add
      mountPath: /vol                           # add
  - image: busybox:1.31.1                       # add
    name: c2                                    # add
    command: ["sh", "-c", "while true; do date >> /vol/date.log; sleep 1; done"]   # add
    volumeMounts:                               # add
    - name: vol                                 # add
      mountPath: /vol                           # add
  - image: busybox:1.31.1                       # add
    name: c3                                    # add
    command: ["sh", "-c", "tail -f /vol/date.log"]   # add
    volumeMounts:                               # add
    - name: vol                                 # add
      mountPath: /vol                           # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                                      # add
  - name: vol                                   # add
    emptyDir: {}                                # add
status: {}
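Verify the env variable in c1 and the shared log via c3 (a quick check):
k exec multi-container-playground -c c1 -- env | grep MY_NODE_NAME
k logs multi-container-playground -c c3   # shows the dates written by c2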
14. You're asked to find out the following information about the cluster k8s-c1-H:
1. How many master nodes are available?
2. How many worker nodes are available?
3. What is the Service CIDR?
4. Which Networking (or CNI Plugin) is configured and where is its config file?
5. Which suffix will static pods have that run on cluster1-worker1?
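Questions 1 and 2 can be read straight from the node list:
k get node   # count the control-plane and worker nodes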
3
ssh cluster1-master1
cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep range
- --service-cluster-ip-range=10.96.0.0/12
4
find /etc/cni/net.d/
/etc/cni/net.d/
/etc/cni/net.d/10-weave.conflist
cat /etc/cni/net.d/10-weave.conflist
{
"cniVersion": "0.3.0",
"name": "weave",
...
# How many master nodes are available?
1: 1
# How many worker nodes are available?
2: 2
# What is the Service CIDR?
3: 10.96.0.0/12
# Which Networking (or CNI Plugin) is configured and where is its config file?
4: Weave, /etc/cni/net.d/10-weave.conflist
# Which suffix will static pods have that run on cluster1-worker1?
5: -cluster1-worker1
15. Event logs
Write a command into /opt/course/15/cluster_events.sh which shows the latest events in the whole cluster, ordered by time. Use kubectl for it.
Now kill the kube-proxy Pod running on node cluster2-worker1 and write the events this caused into /opt/course/15/pod_kill.log.
Finally kill the containerd container of the kube-proxy Pod on node cluster2-worker1 and write the events into /opt/course/15/container_kill.log.
kubectl get events -A --sort-by=.metadata.creationTimestamp
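Writing it into the requested script (a sketch):
echo 'kubectl get events -A --sort-by=.metadata.creationTimestamp' > /opt/course/15/cluster_events.sh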
k -n kube-system get pod -o wide | grep proxy # find pod running on cluster2-worker1
k -n kube-system delete pod kube-proxy-z64cg
sh /opt/course/15/cluster_events.sh
# /opt/course/15/pod_kill.log
kube-system 9s Normal Killing pod/kube-proxy-jsv7t ...
kube-system 3s Normal SuccessfulCreate daemonset/kube-proxy ...
kube-system <unknown> Normal Scheduled pod/kube-proxy-m52sx ...
default 2s Normal Starting node/cluster2-worker1 ...
kube-system 2s Normal Created pod/kube-proxy-m52sx ...
kube-system 2s Normal Pulled pod/kube-proxy-m52sx ...
kube-system 2s Normal Started pod/kube-proxy-m52sx ...
ssh cluster2-worker1
crictl ps | grep kube-proxy
1e020b43c4423 36c4ebbc9d979 About an hour ago Running kube-proxy ...
crictl rm 1e020b43c4423
1e020b43c4423
crictl ps | grep kube-proxy
0ae4245707910 36c4ebbc9d979 17 seconds ago Running kube-proxy ...
sh /opt/course/15/cluster_events.sh
# /opt/course/15/container_kill.log
kube-system 13s Normal Created pod/kube-proxy-m52sx ...
kube-system 13s Normal Pulled pod/kube-proxy-m52sx ...
kube-system 13s Normal Started pod/kube-proxy-m52sx ...
16. Namespaces and API resources
k api-resources      # shows all
k api-resources -h   # help is always useful
k api-resources --namespaced -o name > /opt/course/16/resources.txt
➜ k -n project-c13 get role --no-headers | wc -l
No resources found in project-c13 namespace.
0
➜ k -n project-c14 get role --no-headers | wc -l
300
➜ k -n project-hamster get role --no-headers | wc -l
No resources found in project-hamster namespace.
0
➜ k -n project-snake get role --no-headers | wc -l
No resources found in project-snake namespace.
0
➜ k -n project-tiger get role --no-headers | wc -l
No resources found in project-tiger namespace.
0
17. Check container status and runtime info of a Pod
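First find the node the Pod runs on and SSH into it (assuming the tigers-reunite Pod lives in Namespace project-tiger, as in the original task):
k -n project-tiger get pod -o wide | grep tigers-reunite
ssh cluster1-worker2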
crictl ps | grep tigers-reunite
crictl inspect b01edbe6f89ed | grep runtimeType
ssh cluster1-worker2 'crictl logs b01edbe6f89ed' &> /opt/course/17/pod-container.log
18. Fix kubelet
service kubelet status
service kubelet start
service kubelet status
Failed at step EXEC spawning /usr/local/bin/kubelet
whereis kubelet
/usr/bin/kubelet
vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf # fix
systemctl daemon-reload && systemctl restart kubelet
systemctl status kubelet # should now show running
# /opt/course/18/reason.txt
wrong path to kubelet binary specified in service config
19. Create a Secret and mount it into a Pod
k -n secret create secret generic secret2 --from-literal=user=user1 --from-literal=pass=1234
k -n secret run secret-pod --image=busybox:1.31.1 $do -- sh -c "sleep 5d" > 19.yaml
  containers:
  - args:
    - sh
    - -c
    - sleep 5d
    image: busybox:1.31.1
    name: secret-pod
    resources: {}
    env:                          # add
    - name: APP_USER              # add
      valueFrom:                  # add
        secretKeyRef:             # add
          name: secret2           # add
          key: user               # add
    - name: APP_PASS              # add
      valueFrom:                  # add
        secretKeyRef:             # add
          name: secret2           # add
          key: pass               # add
    volumeMounts:                 # add
    - name: secret1               # add
      mountPath: /tmp/secret1     # add
      readOnly: true              # add
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:                        # add
  - name: secret1                 # add
    secret:                       # add
      secretName: secret1         # add
status: {}
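Create the Pod and verify both the env variables and the mounted Secret (a quick check):
k create -f 19.yaml
k -n secret exec secret-pod -- env | grep APP
k -n secret exec secret-pod -- find /tmp/secret1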
20. Update k8s version and join cluster
Update cluster3-worker2 to the same version as cluster3-master1 and join it to the cluster.
k get node
NAME STATUS ROLES AGE VERSION
cluster3-master1 Ready control-plane,master 116m v1.23.1
cluster3-worker1 NotReady <none> 112m v1.23.1
➜ ssh cluster3-worker2
➜ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1",
GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:39:51Z",
GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
➜ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4",
GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z",
GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server localhost:8080 was refused - did you specify the right host or port?
➜ kubelet --version
Kubernetes v1.22.4
Here kubeadm is already installed in the wanted version, so we can run:
kubeadm upgrade node
couldn’t create a Kubernetes client from file “/etc/kubernetes/kubelet.conf”: failed to load admin kubeconfig: open
/etc/kubernetes/kubelet.conf: no such file or directory
This is usually the proper command to upgrade a node. But this error means that this node was never even initialised, so nothing to update
here. This will be done later using kubeadm join . For now we can continue with kubelet and kubectl:
apt update
apt show kubectl -a | grep 1.23
apt install kubectl=1.23.1-00 kubelet=1.23.1-00
kubelet --version
systemctl restart kubelet
service kubelet status
Add cluster3-worker2 to the cluster
ssh cluster3-master1
kubeadm token create --print-join-command
ssh cluster3-worker2
kubeadm join 192.168.100.31:6443 --token leqq1l.1hlg4rw8mu7brv73 \
    --discovery-token-ca-cert-hash sha256:3c9cf14535ebfac8a23a91132b75436b36df2c087aa99c433f79d531
service kubelet status
If you run into trouble with kubeadm join you might need to run kubeadm reset.
This looks great though for us. Finally we head back to the main terminal and check the node status:
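k get node   # cluster3-worker2 should now be listed and become Ready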
21. Create a static Pod and Service
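The static Pod itself is created by dropping its manifest into the kubelet manifests directory (a sketch, assuming the task's defaults of name my-static-pod and image nginx:1.16-alpine on cluster3-master1):
ssh cluster3-master1
cd /etc/kubernetes/manifests/
k run my-static-pod --image=nginx:1.16-alpine $do > my-static-pod.yaml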
k expose pod my-static-pod-cluster3-master1 \
--name static-pod-service \
--type=NodePort \
--port 80
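Check that the Service picked up the static Pod as an endpoint (a quick check):
k get svc,ep static-pod-service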
22. Check how long certificates are valid
Check how long the kube-apiserver server certificate is valid on cluster2-master1. Do this with openssl or cfssl. Write the expiration date into /opt/course/22/expiration.
Also run the correct kubeadm command to list the expiration dates and confirm both methods show the same date.
Write the correct kubeadm command that would renew the apiserver server certificate into /opt/course/22/kubeadm-renew-certs.sh
➜ ssh cluster2-master1
➜ root@cluster2-master1:~# find /etc/kubernetes/pki | grep apiserver
/etc/kubernetes/pki/apiserver.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.key
/etc/kubernetes/pki/apiserver-kubelet-client.crt
/etc/kubernetes/pki/apiserver.key
/etc/kubernetes/pki/apiserver-kubelet-client.key
openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt | grep Validity -A2
Validity
Not Before: Jan 14 18:18:15 2021 GMT
Not After : Jan 14 18:49:40 2022 GMT
# /opt/course/22/expiration
Jan 14 18:49:40 2022 GMT
kubeadm certs check-expiration | grep apiserver
apiserver Jan 14, 2022 18:49 UTC 363d ca no
apiserver-etcd-client Jan 14, 2022 18:49 UTC 363d etcd-ca no
apiserver-kubelet-client Jan 14, 2022 18:49 UTC 363d ca no
# /opt/course/22/kubeadm-renew-certs.sh
kubeadm certs renew apiserver
23. Kubelet client/server certificate info
Find the "Issuer" and "Extended Key Usage" values on cluster2-worker1 for:
- the kubelet client certificate, used for outgoing connections to the kube-apiserver
- the kubelet server certificate, used for incoming connections from the kube-apiserver
To find the correct kubelet certificate directory, look up the default value of the kubelet --cert-dir argument; searching for "kubelet" in the Kubernetes docs leads to https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet. Check ps aux or /etc/systemd/system/kubelet.service.d/10-kubeadm.conf to see whether a different certificate directory is configured.
ssh cluster2-worker1
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep Issuer
        Issuer: CN = kubernetes
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet-client-current.pem | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage:
                TLS Web Client Authentication
Next we check the kubelet server certificate:
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep Issuer
        Issuer: CN = cluster2-worker1-ca@1588186506
openssl x509 -noout -text -in /var/lib/kubelet/pki/kubelet.crt | grep "Extended Key Usage" -A1
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
24. NetworkPolicy
To prevent this, create a NetworkPolicy called np-backend in Namespace project-snake. It should only allow the backend-* Pods to:
- connect to db1-* Pods on port 1111
- connect to db2-* Pods on port 2222
Use the app label of the Pods in your policy.
k -n project-snake get pod -L app
NAME        READY   STATUS    RESTARTS   AGE     APP
backend-0   1/1     Running   0          3m15s   backend
db1-0       1/1     Running   0          3m15s   db1
db2-0       1/1     Running   0          3m17s   db2
vault-0     1/1     Running   0          3m17s   vault
# 24_np.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: np-backend
  namespace: project-snake
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress                  # policy is only about Egress
  egress:
  -                         # first rule
    to:                     # first condition "to"
    - podSelector:
        matchLabels:
          app: db1
    ports:                  # second condition "port"
    - protocol: TCP
      port: 1111
  -                         # second rule
    to:                     # first condition "to"
    - podSelector:
        matchLabels:
          app: db2
    ports:                  # second condition "port"
    - protocol: TCP
      port: 2222
allow outgoing traffic if:
(destination pod has label app=db1 AND port is 1111)
OR
(destination pod has label app=db2 AND port is 2222)
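A quick connectivity test from the backend Pod (assuming its image ships curl; the <ip> placeholders come from k -n project-snake get pod -o wide):
k -n project-snake exec backend-0 -- curl -m 1 <db1-0-ip>:1111   # allowed
k -n project-snake exec backend-0 -- curl -m 1 <vault-0-ip>:3333 # now blocked by the policy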
25. etcd snapshot save and restore
ssh cluster3-master1
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db
Error: rpc error: code = Unavailable desc = transport is closing
cat /etc/kubernetes/manifests/kube-apiserver.yaml | grep etcd
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key
kubectl run test --image=nginx
cd /etc/kubernetes/manifests/
mv * ..            # stop all control-plane static Pods
watch crictl ps    # wait until their containers are gone
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --data-dir /var/lib/etcd-backup \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key
vim /etc/kubernetes/etcd.yaml
- hostPath:
    path: /var/lib/etcd-backup    # change
    type: DirectoryOrCreate
  name: etcd-data
mv ../*.yaml .
watch crictl ps    # wait until the control plane is back
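Once the control-plane containers are back, the restore can be confirmed because the test Pod created after the backup should be gone (a quick check):
k get pod   # pod/test no longer exists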