DevOps CI/CD, Ceph, and K8s/Istio Traffic Management

Table of Contents

1. Preface

This article touches on a number of open-source projects, so there is a lot of ground to cover.
Because space is limited, only the basic operations of each topic are described; if you want more detailed configuration of any single project, feel free to like, bookmark, and leave a comment so we can dig into it together.

If you find any mistakes in this article, please point them out!

I have built similar architectures many times at work, and every build teaches me something new; knowledge has to be refreshed continuously.

In real projects, quite a few images have to be downloaded from the Internet. It is best to pull them in advance and push them into a private registry.

1.1 Basic Architecture

In real projects, a very common requirement is to build an agile development environment for the developers, or to run agile-development training for the organization.

A well-designed architecture lets development and operations each focus on their own responsibilities and features while still cooperating efficiently, keeping the whole system highly performant and highly available. Once the architecture is in place, the path from development to delivery is fully automated.

The basic architecture is shown in the figure below.

[screenshot]

This article builds a demo environment from open-source software.

The layers, in brief:
Development layer: the developers, working in C++/Java/Go/Python/Node.js. Go is used as the example here (mainly because Go is my primary language).
CI/CD layer: GitLab CI/CD and GitLab Runner.
Business operations layer: the network design calls for a redundant architecture, no single point of failure, and a dedicated network for business traffic.
Compute layer: a PaaS layer (k8s/istio) plus an image registry (Harbor).
Storage layer: Ceph distributed storage.
Operations layer, base support: DNS, os-w, DHCP, CA, web download services, and so on.
Operations layer, platform operations: monitoring with Prometheus/Grafana, logging with ELK, Dashboard/Rancher, etc.
Operations layer, business operations: backend administration of the business applications, e.g. MySQL management.
Operations layer, hardware operations: management of the hardware itself, such as servers, network devices, and IaaS equipment.

Notes:
1. In the test environment the network design covers functionality only; no redundancy is built in.
2. A production environment needs end-to-end redundancy.
3. In production, choose hardware and design the architecture according to the actual requirements of the project.
4. Keep the clocks of all nodes synchronized.

1.2 Test Environment Used in This Article

1. Network 192.168.3.1/24, 300 Mb telecom broadband.
2. One Cisco layer-3 switch with vlan3 (192.168.3.1), vlan20 (10.2.20.20), and vlan10 (10.2.20.10).
3. One HP DL360 Gen9 with a Xeon E5-2620, running ESXi-8.0.0-20513097.
4. Server nic1 is connected to switch port G0/1 (trunk mode).

[screenshot]

1.3 Deployment Order

The steps, in brief:
Base support layer: os-w service, management node, DNS, CA, DHCP, web download service
CI/CD layer and private image registry: 1. install GitLab CI/CD and GitLab Runner; 2. install Harbor
Storage layer: install Ceph
Compute layer: install k8s/istio
Operations layer: monitoring with Prometheus/Grafana, logging with ELK, Dashboard/Rancher, etc.
Integration: CI/CD tests and Istio traffic-management tests

2. Base Support Layer

2.1 os-w Service

Most of the servers in this article are installed through the os-w service.

os-w is a PXE-based provisioning system with a self-developed backend (written in Go). It provides standardized system management inside an enterprise, performs everything over the network, and noticeably reduces the routine workload of IT operations staff: no local console work and no local USB media.

Feature summary:
1. Standardized batch deployment of mainstream systems (RedHat/CentOS/Rocky/Ubuntu/Debian/ESXi/Windows, etc.) on mainstream x86 servers and PCs.
2. For Windows installs: automatic installation of common hardware drivers, pre-installed applications (Office, Outlook, etc.), automatic domain join, and automatic activation.
3. Custom templates.
4. One-click backup of a partition or the whole system to remote storage, and one-click restore back to the local machine.
5. Automatic switching between EFI and BIOS modes.
6. Built-in operations tools covering routine Linux/Windows maintenance needs.

Install mode | Intended users | Scenario
Single-machine install: staff with no IT skills (new or ordinary employees), guided by a boot menu; suitable for ordinary organizations of any size where employees install their own systems, as easy as drinking a glass of water.
Batch install: IT operations staff (basic skills), driven by DHCP configuration; suitable for cluster deployment in server rooms / IaaS environments, e.g. 500 machines installed in 30 minutes.

Note:
This article does not cover os-w installation in detail; if you are interested, contact me directly and we can study it together.

2.2 Management Node

The management node is used to administer the other nodes in the system:
1. the Ceph nodes,
2. the Kubernetes nodes,
3. everything else.
Management is done over SSH with key-based, password-less login.

Rocky Linux release 9.1 (Blue Onyx)

Basic tool packages

yum -y install epel-release
yum -y install bash-completion net-tools gcc wget curl telnet tree lrzsz iproute bind-utils

Install Ansible

yum -y install ansible

SSH key pair setup

ssh-keygen

This produces two files: /root/.ssh/id_rsa (the private key) and /root/.ssh/id_rsa.pub (the public key).
Use ssh-copy-id to copy the public key to the target machines, as in the sketch below.
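A minimal sketch for pushing the key to several nodes in one go; the host names are placeholders for whatever nodes you manage:

for node in ceph-mon1 ceph-node1 ceph-node2 ceph-node3; do
    # copies /root/.ssh/id_rsa.pub into root's authorized_keys on each node
    ssh-copy-id root@$node
done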

Note:
The management node receives further configuration in later sections.

2.3 DNS Service Configuration

There are many open-source DNS servers for Linux; the common ones are BIND, PowerDNS, CoreDNS, and dnsmasq.

BIND is the classic DNS server and also runs on Windows.
32-bit:
https://ftp.isc.org/isc/bind9/9.15.5/BIND9.15.5.x86.zip

64-bit:
https://ftp.isc.org/isc/bind9/9.17.5/BIND9.17.5.x64.zip

In real production environments DNS is always deployed on Linux.

Since this is a demo, we will install BIND on Windows for a change and see how BIND 9 behaves there.
Download https://ftp.isc.org/isc/bind9/9.15.5/BIND9.15.5.x86.zip, unzip it, and run BINDInstall.exe to complete the installation.

The sample configuration below defines three domains for testing: demo.io, demo.com, and test.com.

Configure named.conf

options {
    directory "C:\Program Files\ISC BIND 9\etc";
    forwarders {
        8.8.8.8;
        223.6.6.6;    
    };   
    allow-query { any; };
};
zone "." IN {
             type hint;
             file "named.ca";
};

zone "localhost" IN {
             type master;
             file "localhost.zone";
};

zone "0.0.127.in-addr.arpa" IN {
             type master;
             file "127.0.0.addr.arpa";
             allow-update { none;};
};

zone "demo.com" IN {
             type master;
             file "demo.com.zone";
             allow-update { none;};
};

zone "demo.io" IN {
             type master;
             file "demo.io.zone";
             allow-update { none;};
};

zone "test.com" IN {
             type master;
             file "test.com.zone";
             allow-update { none;};
};

Configure localhost.zone

$TTL 1D
@       IN      SOA     localhost.      root.localhost. (        2007091701          ; Serial
        30800               ; Refresh
        7200                ; Retry
        604800              ; Expire
        300 )               ; Minimum
        IN      NS      localhost.
localhost.        IN      A       127.0.0.1
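named.conf above also references a loopback reverse zone file, 127.0.0.addr.arpa, which the listing does not show; a minimal sketch of such a file (standard BIND reverse-zone content, assumed rather than taken from the original article):

$TTL 1D
@       IN      SOA     localhost.      root.localhost. (
        2007091701          ; Serial
        30800               ; Refresh
        7200                ; Retry
        604800              ; Expire
        300 )               ; Minimum
        IN      NS      localhost.
1       IN      PTR     localhost.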

Configure demo.com.zone

demo.com.    IN  SOA   ns1.demo.com.  root.demo.com. (        2007091701         ; Serial
        30800              ; Refresh
        7200               ; Retry
        604800             ; Expire
        300 )              ; Minimum
        IN    NS        ns1.demo.com.
*       IN    A         192.168.110.10
test	IN	A	192.168.3.110
master1.k8s	IN	A	10.2.20.60
harbor	IN	A	10.2.20.70
git	IN	A	10.2.20.71
rancher	IN	A	10.2.20.151
grafana	IN	A	192.168.3.180
prometheus	IN	A	192.168.3.180
alert	IN	A	192.168.3.180
kiali	IN	A	192.168.3.182
prometheus-istio	IN	A	192.168.3.180
dashboard	IN	A	192.168.3.181

Configure demo.io.zone

demo.io.    IN  SOA   ns1.demo.io.  root.demo.io. (        2007091701         ; Serial
        30800              ; Refresh
        7200               ; Retry
        604800             ; Expire
        300 )              ; Minimum
        IN    NS        ns1.demo.io.
*       IN    A         192.168.3.180
www	IN	A	192.168.3.180
master1.k8s	IN	A	10.2.20.60
harbor	IN	A	10.2.20.70
git	IN	A	10.2.20.71
rancher	IN	A	10.2.20.151
grafana	IN	A	192.168.3.180
prometheus	IN	A	192.168.3.180
alert	IN	A	192.168.3.180
kiali	IN	A	192.168.3.182
prometheus-istio	IN	A	192.168.3.180
dashboard	IN	A	192.168.3.181

Configure test.com.zone

test.com.    IN  SOA   ns1.test.com.  root.test.com. (        2007091701         ; Serial
        30800              ; Refresh
        7200               ; Retry
        604800             ; Expire
        300 )              ; Minimum
        IN    NS        ns1.test.com.
*       IN    A         192.168.3.180
www	IN	A	192.168.3.180
master1.k8s	IN	A	10.2.20.60
harbor	IN	A	10.2.20.70
git	IN	A	10.2.20.71
rancher	IN	A	10.2.20.151
grafana	IN	A	192.168.3.180
prometheus	IN	A	192.168.3.180
alert	IN	A	192.168.3.180
kiali	IN	A	192.168.3.182
prometheus-istio	IN	A	192.168.3.180
dashboard	IN	A	192.168.3.181

Start the service

C:\>net start named
The ISC BIND service is starting.
The ISC BIND service was started successfully.

Test

C:\>nslookup -q  www.demo.io 192.168.3.250
Server:   UnKnown
Address:  192.168.3.250

Name:     www.demo.io
Address:  192.168.3.180


C:\>nslookup -q  www.demo.com 192.168.3.250
Server:   UnKnown
Address:  192.168.3.250

Name:     www.demo.com
Address:  192.168.110.10

Note:
DNS is shared infrastructure; the relevant records will be adjusted as later sections need them.

2.4 CA Certificate Configuration

In production, SSL certificates are indispensable.
For example, the Harbor setup later in this article uses one.

We therefore need to prepare certificates ourselves. For testing we use the cfssl toolkit to create a private CA and issue certificates with it.

Official site: https://pkg.cfssl.org/
GitHub: https://github.com/cloudflare/cfssl

2.4.1 Installing the cfssl Tools

Binary installation

curl -s -L -o /bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl_1.6.4_linux_amd64
curl -s -L -o /bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssljson_1.6.4_linux_amd64
curl -s -L -o /bin/cfssl-certinfo https://github.com/cloudflare/cfssl/releases/download/v1.6.4/cfssl-certinfo_1.6.4_linux_amd64
chmod +x /bin/cfssl*

2.4.2 Certificate Generation

Step 1: generate the root CA certificate and private key.
Create the root CA certificate signing request (CSR) template:

# cat > ca-csr.json <<EOF
{
    "CN":"Root-CA",
    "key":{
        "algo":"rsa",
        "size":2048
    },
    "names":[
        {
            "C":"CN",
            "L":"Gudong",
            "ST":"shenzhen",
            "O":"k8s",
            "OU":"System"
        }
    ]
}
EOF

Generate the root certificate and private key from that CSR template:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca 

Three files are produced:

    ca.csr        // root CA certificate signing request
    ca-key.pem    // root CA private key
    ca.pem        // root CA certificate

Step 2: issue the server certificate.
Certificate policy file; the validity is set very long, 100 years:

# cat ca-config.json
{
    "signing":{
        "default":{
            "expiry":"876000h"
        },
        "profiles":{
            "web-auth":{
                "expiry":"876000h",
                "usages":[
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

Certificate request listing the hosts the certificate must cover:

# cat server-csr.json
{
    "CN":"domain-ssl-test",
    "hosts":[
        "127.0.0.1",
        "localhost",
        "*.test.com",
        "*.test.io",
        "*.demo.com",
        "*.demo.io",
        "*.gfs.com"
    ],
    "key":{
        "algo":"rsa",
        "size":2048
    },
    "names":[
        {
            "C":"CN",
            "L":"HeNan",
            "ST":"ZhuMaDian",
            "O":"k8s",
            "OU":"System"
        }
    ]
}

Issue the certificate

cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=web-auth \
server-csr.json | cfssljson -bare web

The resulting files are:

# tree /root/cfssl
/root/cfssl
├── ca.csr            root CA certificate signing request
├── ca-csr.json       root CA CSR template
├── ca-key.pem        root CA private key
├── ca.pem            root CA certificate
├── ca-config.json    certificate policy (profiles)
├── server-csr.json   server certificate request (host list)
├── web.csr           server certificate signing request
├── web-key.pem       server certificate private key
└── web.pem           server certificate

The rest of the lab mainly uses three of these files: ca.pem, web.pem, and web-key.pem.
Put them on the web download server so the other nodes can fetch them.

Step 3: inspect the certificate

# openssl x509 -noout -text -in  web.pem 
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            16:e7:8c:e8:f5:3a:a4:bb:3d:6c:90:04:ac:d4:d9:50:03:ea:21:52
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C = CN, ST = shenzhen, L = Gudong, O = k8s, OU = System, CN = Root-CA-gfs
        Validity
            Not Before: May 22 05:59:00 2023 GMT
            Not After : Apr 28 05:59:00 2123 GMT
        Subject: C = CN, ST = ZhuMaDian, L = HeNan, O = k8s, OU = System, CN = domain-ssl-test
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:c6:a9:45:9b:f2:e4:a1:43:b6:a8:5d:01:31:d9:
                    13:d0:f3:e3:c6:a6:38:8f:bc:2b:6c:bc:8c:84:32:
                    b5:16:22:85:dd:a1:a4:d6:87:a3:f3:91:66:4c:9b:
                    3f:45:1d:6c:97:98:6e:fb:c5:a9:00:95:d5:5f:7d:
                    86:de:26:34:bf:66:92:a8:57:39:c0:36:fb:12:b1:
                    f8:c9:3b:3a:0c:7d:79:d7:10:5f:3d:ba:0e:3c:17:
                    e7:08:45:23:58:cc:6f:d7:f3:3a:fb:4d:61:eb:92:
                    e7:2b:d4:21:ce:7e:a7:32:00:e9:d5:58:f2:94:e6:
                    ea:79:d9:7a:19:92:95:40:5c:f2:80:3c:57:b0:52:
                    f2:ab:9c:d6:d7:9e:1b:fd:d6:d5:66:8d:27:d1:8d:
                    b7:10:dc:a6:4d:ec:fb:21:71:e3:27:d2:b1:fb:f4:
                    63:4d:86:ba:35:fb:7a:40:b2:b1:6d:c7:c5:e9:98:
                    ac:8d:ab:36:2b:8e:79:9d:4a:fa:3a:c3:11:8f:2c:
                    69:04:28:ac:ac:93:c7:bf:6c:a3:b7:32:b1:d0:a4:
                    90:6f:63:2c:37:0e:94:43:d0:ed:9a:b3:e0:f9:ba:
                    2d:18:86:7f:72:44:b3:1c:b0:8c:cb:b4:d1:9d:19:
                    21:c3:48:8f:c9:6a:38:b8:c8:07:07:56:0a:49:e5:
                    a4:5f
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                CF:E0:D0:A9:3A:20:5C:B0:CD:AF:4C:C6:5B:51:E7:C4:CB:5F:E3:64
            X509v3 Authority Key Identifier: 
                78:69:9C:18:47:D1:D3:44:05:1F:20:A7:B2:FF:6A:E6:68:85:D3:48
            X509v3 Subject Alternative Name: 
                localhost, DNS:*.test.com, DNS:*.demo.com, DNS:*.test.io, DNS:*.demo.io, DNS:*.gfs.com, IP Address:127.0.0.1
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        5e:59:bc:88:39:1b:12:f6:13:5e:8e:d4:4a:78:1a:42:21:70:
        c4:1e:61:80:c1:64:01:6b:9e:06:55:3f:bd:f2:89:5c:7b:6b:
        2c:ff:9a:5a:79:45:61:a3:25:09:3f:9f:4f:45:ce:bd:8e:b6:
        8e:2c:1e:5b:9e:37:6e:26:83:59:1f:2f:42:04:36:d6:29:59:
        df:e2:17:e8:2e:3c:f1:12:bf:dd:d8:68:e9:3f:26:10:2d:10:
        2e:51:05:a8:b9:6b:e1:f4:08:06:ec:67:2f:91:b4:3c:c5:15:
        c2:10:21:a1:81:af:ce:4f:64:6b:46:b2:6a:6f:01:44:11:c6:
        91:67:f2:33:94:d3:76:a7:b9:29:ff:05:6c:8c:de:64:27:6e:
        dd:c6:17:51:1b:50:00:b4:5f:5e:54:52:31:92:84:53:92:3c:
        8c:58:c8:60:8a:28:89:dd:59:62:42:ff:1b:5f:82:7e:1d:39:
        e3:bd:22:d8:c5:d2:b8:82:73:89:66:43:a8:5a:0a:c1:6f:58:
        2c:a0:6e:c6:e2:6e:b5:d8:0e:30:28:b1:34:bf:e1:0d:bf:ee:
        b2:13:98:27:3f:09:43:f9:0f:87:d1:f4:a2:30:c1:de:71:31:
        cc:b1:10:cc:ff:7e:14:3a:ff:08:40:9a:fe:08:b6:83:89:e0:
        1e:ba:05:d9

The output shows that the certificate was issued for localhost, *.test.com, *.test.io, *.demo.com, *.demo.io, *.gfs.com, and 127.0.0.1, and that it is valid until 2123.
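To double-check that web.pem really chains to the private root CA, an openssl verification run such as the following should report OK:

# openssl verify -CAfile ca.pem web.pem
web.pem: OK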

2.5 DHCP Service

The DHCP scope must hand out both the DNS server and the default gateway; a sample scope is sketched after the install commands.
Install the DHCP server:

yum -y install dhcp-server
systemctl enable dhcpd.service
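A minimal dhcpd.conf scope sketch for the 10.2.20.0/24 VLAN, assuming the BIND server from section 2.3 (192.168.3.250) as DNS and the switch's VLAN 20 SVI (10.2.20.254) as the gateway; the address range is an arbitrary example:

cat > /etc/dhcp/dhcpd.conf << 'EOF'
subnet 10.2.20.0 netmask 255.255.255.0 {
    range 10.2.20.100 10.2.20.200;
    option routers 10.2.20.254;
    option domain-name-servers 192.168.3.250;
}
EOF
systemctl restart dhcpd.service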

Notes:
DHCP relay must be configured on the Cisco layer-3 switch.
For example:

interface Vlan20
 ip address 10.2.20.254 255.255.255.0
 ip helper-address 192.168.3.246
 ip dhcp relay information trusted

2.6 Web Download Service

When the same files, such as certificates, have to be placed on many machines, a simple web download service makes distribution easier.

yum -y install httpd
systemctl enable httpd.service
systemctl start httpd.service 
mkdir /var/www/html/ssl
cp {ca.pem,ca-key.pem,web-key.pem,web.pem} /var/www/html/ssl/
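Other nodes can then fetch the certificates over plain HTTP; 10.2.20.59 below is the address this web server is reached at later in the article:

wget http://10.2.20.59/ssl/ca.pem
wget http://10.2.20.59/ssl/web.pem
wget http://10.2.20.59/ssl/web-key.pem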

3. CI/CD Layer

This layer is the heart of the DevOps agile-development approach.

3.1 Image Registry: Harbor

Harbor resources
https://github.com/vmware/harbor/releases
https://github.com/goharbor/harbor/releases

Many companies offer public container registry services, for example:

  1. Docker 官方的 Registry
  2. 亚马逊 ECR(Elastic Container Registry)
  3. Google云Registry
  4. Project Atomic
  5. JFrog Artifactory
  6. dockerhub
  7. harbor
  8. quay.io

Harbor is an open-source Docker registry project from VMware (https://github.com/vmware/harbor). Its goal is to let users quickly stand up an enterprise-grade Docker registry with a graphical UI and access control. It provides role-based access control (RBAC), LDAP integration, audit logging, a management console, self-registration, image replication, Chinese localization, and more.

3.1.1 Preparation

Host information

# hostname
image

# cat /etc/redhat-release 
Rocky Linux release 9.1 (Blue Onyx)

# ip addr | grep ens
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.2.20.70/24 brd 10.2.20.255 scope global noprefixroute ens33
3: ens36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 192.168.3.20/24 brd 192.168.3.255 scope global dynamic noprefixroute ens36

Install Docker

yum -y install yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum -y install docker-ce docker-ce-cli containerd.io
systemctl enable docker containerd
systemctl start docker containerd

Verify

# docker version
Client: Docker Engine - Community
 Version:           23.0.2
 API version:       1.42
 Go version:        go1.19.7
 Git commit:        569dd73
 Built:             Mon Mar 27 16:19:13 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.2
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.7
  Git commit:       219f21b
  Built:            Mon Mar 27 16:16:18 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.20
  GitCommit:        2806fc1057397dbaeefbea0e4e17bddfbd388f38
 runc:
  Version:          1.1.5
  GitCommit:        v1.1.5-0-gf19387a
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Install Docker Compose

# ver=v2.17.3
# curl -L https://github.com/docker/compose/releases/download/$ver/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
# chmod +x /usr/local/bin/docker-compose
# docker-compose version
Docker Compose version v2.17.3

3.1.2 Installing Harbor

Download

# wget https://github.com/goharbor/harbor/releases/download/v2.8.1/harbor-offline-installer-v2.8.1.tgz
# tar zxvf harbor-offline-installer-v2.8.1.tgz 
# tree harbor
harbor
├── common.sh
├── harbor.v2.8.1.tar.gz
├── harbor.yml.tmpl       // configuration template; copy it first with "cp harbor.yml.tmpl harbor.yml"
├── install.sh            // installation script
├── LICENSE
└── prepare

Configure harbor.yml

# cp harbor.yml.tmpl harbor.yml
# vi harbor.yml
hostname: harbor.demo.com		// the access domain; the certificate and key are configured below
http:
  port: 80
https:
  port: 443
  certificate: /etc/ssl/test-ssl/web.pem
  private_key: /etc/ssl/test-ssl/web-key.pem
harbor_admin_password: 123qweasd+pp
database:
  password: root123
data_volume: /data/harbor

Note: the certificate files for harbor.demo.com must be copied to /etc/ssl/test-ssl/ first.

Install Harbor

# ./install.sh 
Note: docker version: 23.0.2
Note: Docker Compose version v2.17.2
Note: stopping existing Harbor instance ...
[+] Running 5/5
 ✔ Container harbor-portal  Removed                                                                                                                                0.1s 
 ✔ Container redis          Removed                                                                                                                                0.1s 
 ✔ Container harbor-db      Removed                                                                                                                                0.1s 
 ✔ Container harbor-log     Removed                                                                                                                                0.1s 
 ✔ Network harbor_harbor    Removed                                                                                                                                0.3s 


[Step 5]: starting Harbor ...
[+] Running 10/10
 ✔ Network harbor_harbor        Created                                                                                                                            0.3s 
 ✔ Container harbor-log         Started                                                                                                                            2.1s 
 ✔ Container registryctl        Started                                                                                                                            4.6s 
 ✔ Container harbor-db          Started                                                                                                                            4.1s 
 ✔ Container redis              Started                                                                                                                            4.1s 
 ✔ Container harbor-portal      Started                                                                                                                            3.6s 
 ✔ Container registry           Started                                                                                                                            4.4s 
 ✔ Container harbor-core        Started                                                                                                                            5.2s 
 ✔ Container harbor-jobservice  Started                                                                                                                            6.4s 
 ✔ Container nginx              Started                                                                                                                            6.8s 
✔ ----Harbor has been installed and started successfully.----

Check

# docker ps
CONTAINER ID   IMAGE                                COMMAND                  CREATED       STATUS                 PORTS                                                                            NAMES
ea6ff7de2bd3   goharbor/harbor-jobservice:v2.8.1    "/harbor/entrypoint.…"   9 days ago    Up 6 hours (healthy)                                                                                    harbor-jobservice
51c3d360f8f7   goharbor/nginx-photon:v2.8.1         "nginx -g 'daemon of…"   9 days ago    Up 6 hours (healthy)   0.0.0.0:80->8080/tcp, :::80->8080/tcp, 0.0.0.0:443->8443/tcp, :::443->8443/tcp   nginx
1d777e5c999c   goharbor/harbor-core:v2.8.1          "/harbor/entrypoint.…"   9 days ago    Up 6 hours (healthy)                                                                                    harbor-core
f37900962e2c   goharbor/harbor-registryctl:v2.8.1   "/home/harbor/start.…"   9 days ago    Up 6 hours (healthy)                                                                                    registryctl
64bf28a7ee91   goharbor/registry-photon:v2.8.1      "/home/harbor/entryp…"   9 days ago    Up 6 hours (healthy)                                                                                    registry
86f26071fac1   goharbor/harbor-db:v2.8.1            "/docker-entrypoint.…"   9 days ago    Up 6 hours (healthy)                                                                                    harbor-db
2988ed0c418f   goharbor/redis-photon:v2.8.1         "redis-server /etc/r…"   9 days ago    Up 6 hours (healthy)                                                                                    redis
f898c0d10656   goharbor/harbor-portal:v2.8.1        "nginx -g 'daemon of…"   9 days ago    Up 6 hours (healthy)                                                                                    harbor-portal
f99caa642448   goharbor/harbor-log:v2.8.1           "/bin/sh -c /usr/loc…"   9 days ago    Up 6 hours (healthy)   127.0.0.1:1514->10514/tcp                                                        harbor-log

# ss -lnt
State       Recv-Q      Send-Q           Local Address:Port             Peer Address:Port      Process      
LISTEN      0           4096                 127.0.0.1:1514                  0.0.0.0:*                      
LISTEN      0           4096                   0.0.0.0:80                    0.0.0.0:*                      
LISTEN      0           32                     0.0.0.0:53                    0.0.0.0:*                      
LISTEN      0           128                    0.0.0.0:22                    0.0.0.0:*                      
LISTEN      0           4096                   0.0.0.0:443                   0.0.0.0:*                      
LISTEN      0           4096                      [::]:80                       [::]:*                      
LISTEN      0           128                       [::]:22                       [::]:*                      
LISTEN      0           4096                      [::]:443                      [::]:*                      
LISTEN      0           4096                         *:2375                        *:*       

Restarting Harbor

# docker-compose start | stop | restart    // these commands rely on the docker-compose.yml in the installation directory

3.1.3 Access Test

Note: before opening the UI, import the private root CA certificate into the browser's trusted root store.
[screenshot]
[screenshot]

3.1.4 Image Push/Pull Test

Any client will do: crictl, podman, docker, and so on.

For this test a throwaway test machine is used with the docker CLI.

Log in to the private registry

Download the CA certificate
# wget http://10.2.20.59/ssl/ca.pem
Append it to the root CA bundle
# cat ca.pem >> /etc/pki/tls/certs/ca-bundle.crt
Log in to the private registry
# docker login harbor.demo.com
Username: admin
Password: 
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
#
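Instead of appending the CA to the system-wide bundle, the Docker daemon can also be given the CA just for this one registry via its certs.d directory (a standard Docker mechanism); a sketch:

# Docker reads this directory when it talks to harbor.demo.com over TLS
mkdir -p /etc/docker/certs.d/harbor.demo.com
cp ca.pem /etc/docker/certs.d/harbor.demo.com/ca.crt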

Push an image to the private registry


Pull a test image
# docker pull busybox:latest
latest: Pulling from library/busybox
325d69979d33: Pull complete 
Digest: sha256:560af6915bfc8d7630e50e212e08242d37b63bd5c1ccf9bd4acccf116e262d5b
Status: Downloaded newer image for busybox:latest
docker.io/library/busybox:latest

Re-tag it
# docker tag busybox:latest harbor.demo.com/temp/busybox:latest

Push it to the private registry
# docker push harbor.demo.com/temp/busybox:latest
The push refers to repository [harbor.demo.com/temp/busybox]
9547b4c33213: Pushed 
latest: digest: sha256:5cd3db04b8be5773388576a83177aff4f40a03457a63855f4b9cbe30542b9a43 size: 528

Open https://harbor.demo.com and have a look:
[screenshot]

Pull the image back from the private registry

List local images
# docker images
REPOSITORY                     TAG       IMAGE ID       CREATED       SIZE
harbor.demo.com/temp/busybox   latest    8135583d97fe   12 days ago   4.86MB
busybox                        latest    8135583d97fe   12 days ago   4.86MB

Remove the local copies
# docker image rm -f 8135583d97fe
Untagged: busybox:latest
Untagged: busybox@sha256:560af6915bfc8d7630e50e212e08242d37b63bd5c1ccf9bd4acccf116e262d5b
Untagged: harbor.demo.com/temp/busybox:latest
Untagged: harbor.demo.com/temp/busybox@sha256:5cd3db04b8be5773388576a83177aff4f40a03457a63855f4b9cbe30542b9a43
Deleted: sha256:8135583d97feb82398909c9c97607159e6db2c4ca2c885c0b8f590ee0f9fe90d
Deleted: sha256:9547b4c33213e630a0ca602a989ecc094e042146ae8afa502e1e65af6473db03

List images again; the list is now empty
# docker images
REPOSITORY   TAG       IMAGE ID   CREATED   SIZE

Pull the image from the private registry
# docker pull harbor.demo.com/temp/busybox:latest
latest: Pulling from temp/busybox
325d69979d33: Pull complete 
Digest: sha256:5cd3db04b8be5773388576a83177aff4f40a03457a63855f4b9cbe30542b9a43
Status: Downloaded newer image for harbor.demo.com/temp/busybox:latest
harbor.demo.com/temp/busybox:latest

List images
# docker images
REPOSITORY                     TAG       IMAGE ID       CREATED       SIZE
harbor.demo.com/temp/busybox   latest    8135583d97fe   12 days ago   4.86MB

3.2 Code Repository: GitLab

GitLab (JiHu in China) is an open-source application built with Ruby on Rails that provides a self-hosted Git repository service; public and private projects are accessed through a web UI. Ruby on Rails is a framework that makes developing, deploying, and maintaining web applications simpler.

GitLab offers functionality similar to GitHub: you can browse source code, manage issues and comments, and control team access to repositories. It makes it easy to walk through committed versions and keeps a per-file history. It also offers a snippets feature, so reusable pieces of code can be collected and found again later.

GitLab comes in two editions:
1. GitLab Community Edition (CE), which is free.
2. GitLab Enterprise Edition (EE), which is paid.

Note:
If the GitLab service is exposed on the public Internet, open ports 80 (HTTP), 443 (HTTPS), and 22 (SSH).

3.2.1 Installing and Configuring GitLab

GitLab can be installed from source, via yum, or via Docker.
This test uses the yum method.

Install GitLab

yum -y install epel-release  curl policycoreutils openssh-server openssh-clients
systemctl disable firewalld
systemctl stop firewalld
curl -s https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.rpm.sh | sudo bash
yum -y install gitlab-ce
systemctl enable gitlab-runsvdir.service
systemctl start gitlab-runsvdir.service

Configure HTTPS in /etc/gitlab/gitlab.rb and copy the certificates into /etc/gitlab/ssl/ (create that directory first):

# vi /etc/gitlab/gitlab.rb
external_url 'https://git.demo.com'
letsencrypt['enable'] = false
nginx['redirect_http_to_https'] = true
nginx['ssl_certificate'] = "/etc/gitlab/ssl/web.pem"
nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/web-key.pem"
# gitlab-ctl reconfigure

Check the GitLab version
# cat /opt/gitlab/embedded/service/gitlab-rails/VERSION
16.0.1

Check the initial password of the GitLab administrator (root)
# cat /etc/gitlab/initial_root_password

Open https://git.demo.com
[screenshot]
Note:
It is recommended to change the root password after the first login.

If user avatars do not display correctly, a Gravatar mirror can be configured as follows:

# vi /etc/gitlab/gitlab.rb
### Gravatar Settings
gitlab_rails['gravatar_plain_url'] = 'https://sdn.geekzu.org/avatar/%{hash}?s=%{size}&d=identicon'
gitlab_rails['gravatar_ssl_url'] = 'https://sdn.geekzu.org/avatar/%{hash}?s=%{size}&d=identicon'
# gitlab-ctl reconfigure

Notification settings (e-mail)

# vi /etc/gitlab/gitlab.rb
gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = "smtp.qq.com"
gitlab_rails['smtp_port'] = 465
gitlab_rails['smtp_user_name'] = "xxxxxxx@qq.com"
gitlab_rails['smtp_password'] = "oherqwzatxxxxxxxxj"
gitlab_rails['smtp_domain'] = "qq.com"
gitlab_rails['smtp_authentication'] = "login"
gitlab_rails['smtp_enable_starttls_auto'] = false
gitlab_rails['smtp_tls'] = true
gitlab_rails['smtp_pool'] = true
gitlab_rails['gitlab_email_from'] = 'xxxxxxxxx@qq.com'
gitlab_rails['gitlab_email_display_name'] = 'Administrator'
gitlab_rails['gitlab_email_reply_to'] = 'xxxxxxxxxx@qq.com'
# gitlab-ctl reconfigure

Test

Enter the Rails console
# gitlab-rails console
--------------------------------------------------------------------------------
 Ruby:         ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
 GitLab:       16.0.1 (34d6370bacd) FOSS
 GitLab Shell: 14.20.0
 PostgreSQL:   13.8
------------------------------------------------------------[ booted in 96.22s ]
Loading production environment (Rails 6.1.7.2)

Send a test mail, e.g. Notify.test_email('xxxxxxx@163.com','this is title','hello gitlab').deliver_now

irb(main):001:0> Notify.test_email('xxxxxxx@163.com','this is title','hello gitlab').deliver_now
Delivered mail 647965bb85fe7_21e4317463563@test.mail (959.0ms)
=> #<Mail::Message:291920, Multipart: false, Headers: <Date: Fri, 02 Jun 2023 11:44:59 +0800>, <From: Administrator <xxxxxxxxxxxxxxx@qq.com>>, <Reply-To: Administrator <xxxxxxxxxxxx@qq.com>>, <To: xxxxxxxxxxxx@163.com>, <Message-ID: <647965bb85fe7_21e4317463563@test.mail>>, <Subject: this is title>, <Mime-Version: 1.0>, <Content-Type: text/html; charset=UTF-8>, <Content-Transfer-Encoding: 7bit>, <Auto-Submitted: auto-generated>, <X-Auto-Response-Suppress: All>>
irb(main):002:0> 
If the mail arrives, the configuration is working.

Create a new user
[screenshot]

3.2.2 Tests

Assume that the user guofs has been created and that a project web.git exists under that account.

3.2.2.1 SSH access

Generate a key pair on the user's machine (for GitLab):

# ssh-keygen -t rsa -C "guofs@163.com"
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:AP1rvv6ySqYAwZeXQb7AXCvOFx2riAwToptIV++DixI guofs@163.com
The key's randomart image is:
+---[RSA 2048]----+
|o   o* .         |
|+.o =.B o        |
|=o O *.=         |
|==* = *..        |
|=E + = oS.       |
| .. o . +        |
| ... .oo         |
|  .. +  o        |
|    . .o+=.      |
+----[SHA256]-----+
# cat /root/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDAQSNJDMRjORJ599Ez6qdYpKh8V7L+TWv3kqkqxTmJf0ijEvdG/NqPAuC1QqswMlRVb8Zlu1hYawCYfF2FTQUnxW7dvgUXkbxaUvYBacarG/3ewVoa60+9w/kQFNyQsndt4BCYy8G0XsZfB1OmqFlErgQogHAGyau+CF3Fa8yY5j8b5dbHwtR9Yhrs3wyQlNuluU4TAAHTBMDQ6XkAagc53lAbz8VOF7NUbcDMXQ3EdZ74gYHh/RygS003gE+pNSoON+QX9y2uDmPWZQyB0ouRlqRpQx7taxq/nFva3bq55gCIzLAD52CotKeEPnHjEBnhUOAqMo8BIoMVs4Wl8mk5 guofs@163.com

Copy the contents of id_rsa.pub into GitLab (user settings, SSH Keys).
[screenshot]

Clone the project

# git clone -b main git@git.demo.com:guofs/web.git
Cloning into 'web'...
The authenticity of host 'git.demo.com (10.2.20.36)' can't be established.
ED25519 key fingerprint is SHA256:KDdQTbTJm1fCmC0n3RrNmCJXGBBzXehOQbm4j31tYNg.
This host key is known by the following other names/addresses:
    ~/.ssh/known_hosts:65: 10.2.20.36
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'git.demo.com' (ED25519) to the list of known hosts.
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (3/3), done.

# tree web -a -L 2
web
|-- .git
|   |-- HEAD
|   |-- branches
|   |-- config
|   |-- description
|   |-- hooks
|   |-- index
|   |-- info
|   |-- logs
|   |-- objects
|   |-- packed-refs
|   `-- refs
`-- README.md

Push code

# cd web
# git config user.name "Darry.Guo"
# git config user.email "guofs@139.com"
# date >> 1.txt
# git add *
# git commit -m "11"
[main d4a8520] 11
 1 file changed, 1 insertion(+)
# git push git@git.demo.com:guofs/web.git main
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 2 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 301 bytes | 301.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
To git.demo.com:guofs/web.git
   b492ec7..d4a8520  main -> main
3.2.2.2 HTTP/HTTPS access

Log in to https://git.demo.com as guofs and create an access token.
For example: glpat-nC7mYxxfTdJQEJLuGpsR
[screenshot]

Clone the project

# git clone -b main https://git.demo.com/guofs/web.git
Cloning into 'web'...
Username for 'https://git.demo.com': guofs
Password for 'https://guofs@git.demo.com': 	// enter the token here
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 12 (delta 1), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (12/12), done.
Resolving deltas: 100% (1/1), done.

Push code

# cd web
# git config user.name "Darry.Guo"
# git config user.email "guofs@139.com"
# date >> 1.txt
# git add *
# git commit -m "11"
[main d4a8520] 11
 1 file changed, 1 insertion(+)
# git push https://git.demo.com/guofs/web.git main
Username for 'https://git.demo.com': guofs
Password for 'https://guofs@git.demo.com': 	// enter the token here
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 2 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 308 bytes | 308.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
To https://git.demo.com/guofs/web.git
   10e3dec..347fa15  main -> main

3.3 Configuring GitLab CI/CD and GitLab Runner

Continuous Integration and Continuous Delivery are where agile development becomes concrete.
They can be implemented with Jenkins or with GitLab's built-in CI/CD. With GitLab CI/CD there are four main steps:
1. Install GitLab Runner.
2. Use the runner to register executors with CI/CD. Executors are what actually run pipeline jobs; different executors perform different kinds of work, and a given executor only handles its specific kind.
3. Write .gitlab-ci.yml.
4. GitLab CI/CD then works from .gitlab-ci.yml, running pipelines in response to push, merge, trigger, and other events.

3.3.1 Installing GitLab Runner

Official docs: https://docs.gitlab.com/runner/
The runner can be installed on a dedicated machine or run as a Docker container.
This article uses the Docker method.

# docker pull gitlab/gitlab-runner

# docker images
REPOSITORY             TAG       IMAGE ID       CREATED         SIZE
gitlab/gitlab-runner   latest    4575ef0329b0   2 weeks ago     749MB

# docker run -itd --restart=always  \
--name gitlab-runner \
-v /etc/gitlab-runner:/etc/gitlab-runner \
-v /var/run/docker.sock:/var/run/docker.sock  \
gitlab/gitlab-runner:latest

Notes:
-v /etc/gitlab-runner:/etc/gitlab-runner
mounts the runner's configuration directory from the host, so the runner can be configured by editing the files on the host.
-v /var/run/docker.sock:/var/run/docker.sock
mounts the host's Docker socket into the container, so docker commands issued inside the container are actually executed by the host's Docker daemon.
This prepares the ground for the Docker executor.
Both mounts are essential.

Enter the container
# docker exec -it gitlab-runner /bin/bash		// registration can be done from inside the container
root@7a8453ddab09:/# gitlab-runner -h

3.3.2 Registering a Runner Executor (docker-in-docker)

Executors are a central concept in GitLab CI. Different executors do different work: compiling sources in different languages needs different executors, and different executors can place applications onto physical nodes, virtual machines, Docker containers, or pods in k8s.

Executors are registered with GitLab CI by the runner. Each successfully registered executor gets a tag, and .gitlab-ci.yml selects executors by that tag.

Runners come in three scopes:

Type | Description
Shared runners: available to every project
Group runners: available to the projects of a group
Project runners: available only to the one project

This article uses the docker-in-docker approach.
Goal: create an executor that compiles Go source code, packages the resulting binary into an image, and pushes it to the private registry harbor.demo.com.

Note:
The docker-in-docker setup uses a home-made image stored in the private registry, harbor.demo.com/cicd/centos8.5-tool:v0.2. It is based on CentOS 8.5 and has the software needed for these tests installed: the docker client, Go, curl, and so on.

Enter the gitlab-runner container

# docker exec -it gitlab-runner /bin/bash

Register the runner with GitLab CI

root@7a8453ddab09:/# gitlab-runner register \
--url http://git.demo.com \
--registration-token GR13489413ocr9Hnhx-eTHAbcNCXx \
--tag-list docker-in-docker-test-1 \
--description "docker-in-docker-test-1" \
--maintenance-note "docker-in-docker-test-1" \
--executor docker \
--docker-pull-policy if-not-present \
--docker-helper-image "harbor.demo.com/cicd/gitlab-runner-helper:x86_64-dcfb4b66" \
--docker-image "harbor.demo.com/cicd/centos8.5-tool:v0.2" \
--docker-volumes /var/run/docker.sock:/var/run/docker.sock \
--env 'DOCKER_AUTH_CONFIG={"auths":{"harbor.demo.com":{"auth":"YWRtaW46MTJxd2FzenhvcGtsbm0="}}}'

Notes:
The two images given by --docker-helper-image and --docker-image are pulled from the private registry by the machine running GitLab Runner, which requires the following:

1. When the private registry uses HTTPS, append its CA certificate to /etc/ssl/certs/ca-certificates.crt on that machine.

2. When the registry requires credentials, pass them with --env as shown:
     --env 'DOCKER_AUTH_CONFIG={"auths":{"harbor.demo.com":{"auth":"YWRtaW46MTJxd2FzenhvcGtsbm0="}}}'
   The username:password pair is encoded like this:
     # printf "admin:12qwaszxopklnm" | openssl base64 -A
   The DOCKER_AUTH_CONFIG variable is really just a JSON string:
		{
		  "auths": {
		    "harbor.demo.com": {
		      "auth": "YWRtaW46MTJxd2FzenhvcGtsbm0="
		    }
		  }
		}

3. The pull policy for these two images:
       --docker-pull-policy {never, if-not-present, always}
   The default is always.

Check the registration

root@7a8453ddab09:/# gitlab-runner list
Runtime platform                                    arch=amd64 os=linux pid=51 revision=dcfb4b66 version=15.10.1
docker-in-docker-test-1                             Executor=docker Token=uyyzkGqaTayfRsJ8yJxg URL=http://git.demo.com

This runner can compile Go code and can reach the private registry harbor.demo.com.

3.3.3 Writing .gitlab-ci.yml

Write a small piece of test code in GoLand:
[screenshot]
Then create the CI/CD pipeline file .gitlab-ci.yml:

# global variables
variables:
  image_name: "busybox"
  #image_name: "centos"
  image_ver: "v2.1"

# pipeline stages
stages:
  - build
  - push_image
  - test
  - deploy

# job1: compile the Go source into a binary.
# The local variable Is_Run defaults to yes, so this job runs on every push.
job1:
  variables:
    Is_Run: "yes"
  stage: build
  script:
    - echo "build the code..."
    - export GOROOT=/usr/local/go
    - export PATH=$PATH:/usr/local/go/bin
    - export GOPATH=/opt
    - export GO115MODULE=on
    - export GOOS=linux
    - export GOARCH=amd64
    - export GOPROXY="https://goproxy.cn,direct"
    - go version
    - go mod tidy
    - go build -o app .
    - mkdir build
    - mv app build/
    - mv Dockerfile_nobuild build/Dockerfile
  artifacts:
    paths:
      - build
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

# job2 packages the binary produced by job1 into an image and pushes it to the private registry.
# The local variable Is_Run defaults to yes, so this job runs on every push.
# Note: $UserName and $PassWord are project-level variables defined in the GitLab project; they hold the private registry's username and password.
job2:
  variables:
    Is_Run: "yes"
  stage: push_image
  needs:
    - job: job1
      artifacts: true
  script:
    - echo "build image and push harbor register ..."
    - cd build/
    - ls -l
    - docker build -t harbor.demo.com/web/$image_name:$image_ver .
    - docker logout harbor.demo.com
    - echo $PassWord | base64 -d | docker login --username $UserName  --password-stdin harbor.demo.com
    - docker push harbor.demo.com/web/$image_name:$image_ver
    - docker rmi harbor.demo.com/web/$image_name:$image_ver
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

# job3 tests the application.
# The local variable Is_Run defaults to yes, so this job runs on every push; it is normally used for testing during development.
job3:
  variables:
    Is_Run: "yes"
    deploy_svc_name: "app-test"
  stage: test
  script:
    - echo "deploy_to_k8s, $deploy_svc_name, http://www.test.com ..."
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

# job4 performs the release.
# The local variable Is_Run defaults to no, so the job does not run on push; it only runs when $Is_Run == "deploy".
# It is meant to be triggered through a webhook, typically from an OA workflow in which a manager approves the official release of the application.
job4:
  variables:
    Is_Run: "no"
    deploy_svc_name: "app-demo-io"
  stage: deploy
  script:
    - echo "deploy_to_k8s, $deploy_svc_name, http://www.demo.io ..."
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "deploy"

This .gitlab-ci.yml has four jobs. Each job selects a runner executor by tag; every job could use a different runner, but in this example they all use the same one.

Runner executors can handle Java, Node.js, Python, C, C++, PHP, Go, and other languages, and can deploy applications onto physical servers, cloud hosts, containers, or k8s/istio.

This step only exercises the pipeline flow; the final test deploys the application into k8s. A sketch of the Dockerfile_nobuild referenced by job1 follows.
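The repository's Dockerfile_nobuild is not shown in the article; a minimal sketch of what such a file could look like, assuming the Go binary runs on a small base image (the base image choice is an assumption, not the author's actual file):

# Dockerfile_nobuild (sketch): package the binary built by job1; nothing is compiled inside the image
FROM busybox:latest
WORKDIR /opt/app
# job1 placed the compiled binary "app" next to this Dockerfile in build/
COPY app .
ENTRYPOINT ["./app"]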

3.3.4 CI/CD Test

GitLab CI/CD is enabled by default; as soon as a .gitlab-ci.yml file is present, pipelines are triggered automatically.
There are two common ways to trigger a pipeline:
1. a push or merge event;
2. a webhook call.

3.3.4.1 Configuring Variables and the Webhook Token

Define the variables UserName and PassWord to hold the private registry credentials.
[screenshot]
Define the token used by the webhook.
[screenshot]

3.3.4.2 Push Test

Push the source code

git add *
git commit -m "test-1"
git push http://git.demo.com/guofs/cicdtest.git main

Watch the CI/CD pipeline run:
[screenshot]
The details of every job can be inspected, for example job2:
[screenshot]
Check the private image registry:
[screenshot]

3.3.4.3 Webhook Test

As configured, job4 runs when Is_Run=deploy is passed in via a webhook.
This is how an OA workflow could drive the pipeline, letting a manager approve whether the application is officially released.

curl -X POST \
     --fail \
     -F token=glptt-938d9966afdc10180540a775d6e5e399fcd2cea0 \
     -F ref=main \
     -F "variables[Is_Run]=deploy" \
     -F "variables[deploy_svc_name]=demo-io-test" \
     http://git.demo.com/api/v4/projects/8/trigger/pipeline

Check the pipeline:
[screenshot]
Job details:

$ echo "deploy_to_k8s, $deploy_svc_name, http://www.demo.io ..."
deploy_to_k8s, demo-io-test, http://www.demo.io ...

4. Storage Layer: Ceph Distributed Storage

The storage layer is Ceph, a distributed storage system that provides block, object, and file storage,
and it will back the StorageClasses used by k8s.

The Ceph test environment is listed below; in production, every daemon role should be deployed redundantly (active/standby).

Node        OS               Spec                                  IP                        Roles
mgm         Rocky 9.1        2 vCPU, 2 GB RAM, 8 GB HD             10.2.20.59/192.168.3.x    management node, password-less SSH
ceph-mon1   CentOS 8.5.2111  2 vCPU, 2 GB RAM, 8 GB HD             10.2.20.90/192.168.3.x    mon, mgr, mds, dashboard, rgw
ceph-node1  CentOS 8.5.2111  2 vCPU, 2 GB RAM, 8 GB + 2x10 GB HD   10.2.20.91/192.168.3.x    osd
ceph-node2  CentOS 8.5.2111  2 vCPU, 2 GB RAM, 8 GB + 2x10 GB HD   10.2.20.92/192.168.3.x    osd
ceph-node3  CentOS 8.5.2111  2 vCPU, 2 GB RAM, 8 GB + 2x10 GB HD   10.2.20.93/192.168.3.x    osd

Ceph version 17.2.6 quincy (stable) is used.

All five hosts above are installed through os-w.

4.1 Basic Configuration

4.1.1 Basic Setup on All Nodes

# configure /etc/hosts
cat >> /etc/hosts << 'EOF'
10.2.20.90      ceph-mon1
10.2.20.91      ceph-node1
10.2.20.92      ceph-node2
10.2.20.93      ceph-node3
EOF

# install base packages
cd /etc/yum.repos.d/
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*
rm -fr Centos8-2111*
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo
yum clean all 
yum makecache
yum install -y epel-release
yum -y install net-tools wget bash-completion lrzsz unzip zip tree

# disable the firewall and SELinux
systemctl disable --now firewalld
systemctl stop firewalld
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

Ceph 17.2.6 yum repository

cat> /etc/yum.repos.d/ceph.repo << 'EOF'
[ceph]
name=Ceph packages for $basearch
baseurl=https://download.ceph.com/rpm-17.2.6/el8/$basearch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-17.2.6/el8/noarch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=https://download.ceph.com/rpm-17.2.6/el8/SRPMS
enabled=0
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
EOF

List the available Ceph packages

# yum list Ceph*
Repository extras is listed more than once in the configuration
Last metadata expiration check: 0:01:01 ago on Mon 24 Apr 2023 10:22:10 PM CST.
Installed Packages
ceph-release.noarch                                  1-1.el8                @System    
Available Packages
ceph.x86_64                                          2:17.2.6-0.el8         ceph       
ceph-base.x86_64                                     2:17.2.6-0.el8         ceph       
ceph-base-debuginfo.x86_64                           2:17.2.6-0.el8         ceph       
ceph-common.x86_64                                   2:17.2.6-0.el8         ceph       
ceph-common-debuginfo.x86_64                         2:17.2.6-0.el8         ceph       
ceph-debuginfo.x86_64                                2:17.2.6-0.el8         ceph       
ceph-debugsource.x86_64                              2:17.2.6-0.el8         ceph       
ceph-exporter.x86_64                                 2:17.2.6-0.el8         ceph       
ceph-exporter-debuginfo.x86_64                       2:17.2.6-0.el8         ceph       
ceph-fuse.x86_64                                     2:17.2.6-0.el8         ceph       
ceph-fuse-debuginfo.x86_64                           2:17.2.6-0.el8         ceph       
ceph-grafana-dashboards.noarch                       2:17.2.6-0.el8         ceph-noarch
ceph-immutable-object-cache.x86_64                   2:17.2.6-0.el8         ceph       
ceph-immutable-object-cache-debuginfo.x86_64         2:17.2.6-0.el8         ceph       
ceph-mds.x86_64                                      2:17.2.6-0.el8         ceph       
ceph-mds-debuginfo.x86_64                            2:17.2.6-0.el8         ceph       
ceph-mgr.x86_64                                      2:17.2.6-0.el8         ceph       
ceph-mgr-cephadm.noarch                              2:17.2.6-0.el8         ceph-noarch
ceph-mgr-dashboard.noarch                            2:17.2.6-0.el8         ceph-noarch
ceph-mgr-debuginfo.x86_64                            2:17.2.6-0.el8         ceph       
ceph-mgr-diskprediction-local.noarch                 2:17.2.6-0.el8         ceph-noarch
ceph-mgr-k8sevents.noarch                            2:17.2.6-0.el8         ceph-noarch
ceph-mgr-modules-core.noarch                         2:17.2.6-0.el8         ceph-noarch
ceph-mgr-rook.noarch                                 2:17.2.6-0.el8         ceph-noarch
ceph-mon.x86_64                                      2:17.2.6-0.el8         ceph       
ceph-mon-debuginfo.x86_64                            2:17.2.6-0.el8         ceph       
ceph-osd.x86_64                                      2:17.2.6-0.el8         ceph       
ceph-osd-debuginfo.x86_64                            2:17.2.6-0.el8         ceph       
ceph-prometheus-alerts.noarch                        2:17.2.6-0.el8         ceph-noarch
ceph-radosgw.x86_64                                  2:17.2.6-0.el8         ceph       
ceph-radosgw-debuginfo.x86_64                        2:17.2.6-0.el8         ceph       
ceph-resource-agents.noarch                          2:17.2.6-0.el8         ceph-noarch
ceph-selinux.x86_64                                  2:17.2.6-0.el8         ceph       
ceph-test.x86_64                                     2:17.2.6-0.el8         ceph       
ceph-test-debuginfo.x86_64                           2:17.2.6-0.el8         ceph       
ceph-volume.noarch                                   2:17.2.6-0.el8         ceph-noarch
cephadm.noarch                                       2:17.2.6-0.el8         ceph-noarch
cephfs-mirror.x86_64                                 2:17.2.6-0.el8         ceph       
cephfs-mirror-debuginfo.x86_64                       2:17.2.6-0.el8         ceph       
cephfs-top.noarch                                    2:17.2.6-0.el8         ceph-noarch

4.1.2 Management Node

Password-less SSH

ssh-keygen -t rsa
ssh-copy-id root@ceph-mon1
ssh-copy-id root@ceph-node1
ssh-copy-id root@ceph-node2
ssh-copy-id root@ceph-node3

Configure Ansible

# yum -y install ansible
# vi /etc/ansible/hosts 
[ceph]
ceph-mon1
ceph-node1
ceph-node2
ceph-node3
# ansible ceph -m shell -a "date"
ceph-mon1 | CHANGED | rc=0 >>
Sat Jun  3 22:32:43 CST 2023
ceph-node3 | CHANGED | rc=0 >>
Sat Jun  3 22:32:43 CST 2023
ceph-node1 | CHANGED | rc=0 >>
Sat Jun  3 22:32:43 CST 2023
ceph-node2 | CHANGED | rc=0 >>
Sat Jun  3 22:32:43 CST 2023

Install the Ceph client commands

# yum -y install ceph-common ceph-base
# ceph -v
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

4.1.3 Ceph Cluster Nodes

# ansible ceph -m shell -a "yum -y install net-tools gdisk lvm2"
# ansible ceph -m shell -a "yum -y install ceph"

# ansible ceph -m shell -a "systemctl list-unit-files | grep ceph"
...
ceph-crash.service                         enabled  
ceph-mds@.service                          disabled 
ceph-mgr@.service                          disabled 
ceph-mon@.service                          disabled 
ceph-osd@.service                          disabled 
ceph-volume@.service                       disabled 
ceph-mds.target                            enabled  
ceph-mgr.target                            enabled  
ceph-mon.target                            enabled  
ceph-osd.target                            enabled  
ceph.target                                enabled  
# ansible ceph -m shell -a "ceph -v"
ceph-mon1 | CHANGED | rc=0 >>
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
ceph-node1 | CHANGED | rc=0 >>
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
ceph-node3 | CHANGED | rc=0 >>
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
ceph-node2 | CHANGED | rc=0 >>
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

Working directory on each node

# tree /var/lib/ceph
/var/lib/ceph
├── bootstrap-mds
├── bootstrap-mgr
├── bootstrap-osd
├── bootstrap-rbd
├── bootstrap-rbd-mirror
├── bootstrap-rgw
├── crash
│   └── posted
├── mds
├── mgr
├── mon
├── osd
└── tmp

On all nodes, the Ceph log directory is /var/log/ceph.

4.2 Management Node Configuration

The management node's main job is to administer the Ceph cluster: the configuration files are produced there, and the ceph command can talk to the cluster from there directly.

To keep things organized, create a directory on the management node for everything generated while configuring the cluster; the various configuration files are produced in this directory by default and synced to the Ceph nodes when needed. For example:

# mkdir /root/ceph
# cd /root/ceph

4.2.1 Cluster-wide Unique Identifier (fsid)

# uuidgen
9b7095ab-5193-420c-b2fb-2d343c57ef52
# ansible ceph -m shell -a "echo export cephuid=9b7095ab-5193-420c-b2fb-2d343c57ef52 >> /etc/profile"
# ansible ceph -m shell -a "source /etc/profile"
# ansible ceph -m shell -a "cat /etc/profile | grep cephuid"
ceph-node1 | CHANGED | rc=0 >>
export cephuid=9b7095ab-5193-420c-b2fb-2d343c57ef52
ceph-mon1 | CHANGED | rc=0 >>
export cephuid=9b7095ab-5193-420c-b2fb-2d343c57ef52
ceph-node3 | CHANGED | rc=0 >>
export cephuid=9b7095ab-5193-420c-b2fb-2d343c57ef52
ceph-node2 | CHANGED | rc=0 >>
export cephuid=9b7095ab-5193-420c-b2fb-2d343c57ef52

4.2.2 Keyring Setup

ceph-authtool --create-keyring ./ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
ceph-authtool --create-keyring ./ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
ceph-authtool --create-keyring ./ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'

4.2.3 Initial ceph.conf

# cat > /root/ceph/ceph.conf <<EOF
[global]
fsid = 9b7095ab-5193-420c-b2fb-2d343c57ef52
public network = 10.2.20.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
osd pool default size = 3
osd pool default min size = 2
osd pool default pg num = 32
osd pool default pgp num = 32
osd crush chooseleaf type = 0
mon_host = 10.2.20.90
mon_max_pg_per_osd = 1000

[client.admin]
#mon host = 10.2.20.90
keyring = /etc/ceph/ceph.client.admin.keyring
EOF

4.2.4 Client-side ceph Command Setup

Once the cluster is up, the ceph command on the management node can be used to manage it.
Set this up on every node, including the management node.
Management node:

# cp ceph.client.admin.keyring /etc/ceph/
# cp ceph.conf /etc/ceph/

Ceph cluster nodes

ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"
ansible ceph -m copy -a "src=ceph.client.admin.keyring dest=/etc/ceph/"

Once the cluster has been built, the ceph command can be used on any node to manage it.

4.3 mon Daemon Configuration

On the management node, add the following to /root/ceph/ceph.conf:

[mon]
mon initial members = mon1
mon allow pool delete = true

and push it to every node:

ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"

4.3.1 ceph.mon.keyring

Import the administrator keyring and the bootstrap-osd keyring into ceph.mon.keyring.

ceph-authtool ./ceph.mon.keyring --import-keyring ./ceph.client.admin.keyring
ceph-authtool ./ceph.mon.keyring --import-keyring ./ceph.keyring

Copy ceph.mon.keyring to all mon nodes and set its ownership:

# scp ceph.mon.keyring ceph-mon1:/tmp/
# ssh ceph-mon1 "chown ceph:ceph /tmp/ceph.mon.keyring"

4.3.2 Monitor Map

# monmaptool --create --add mon1 10.2.20.90 --fsid $cephuid /root/ceph/monmap
monmaptool: monmap file /root/ceph/monmap
setting min_mon_release = octopus
monmaptool: set fsid to 9b7095ab-5193-420c-b2fb-2d343c57ef52
monmaptool: writing epoch 0 to /root/ceph/monmap (1 monitors)
# scp monmap ceph-mon1:/tmp/
# ssh ceph-mon1 "chown ceph:ceph /tmp/monmap"

4.3.3 Creating the Monitor Data Directory

Run on the ceph-mon1 node:

# sudo -u ceph ceph-mon --mkfs -i mon1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
# tree /var/lib/ceph/mon/ceph-mon1
/var/lib/ceph/mon/ceph-mon1
├── keyring
├── kv_backend
└── store.db
    ├── 000004.log
    ├── CURRENT
    ├── IDENTITY
    ├── LOCK
    ├── MANIFEST-000003
    └── OPTIONS-000006

4.3.4 Starting the Monitor Service

Enable it at boot on ceph-mon1:

# systemctl enable ceph-mon@mon1
# systemctl start ceph-mon.target
# ss -lnt
State    Recv-Q   Send-Q     Local Address:Port       Peer Address:Port   Process   
LISTEN   0        128           10.2.20.90:6789            0.0.0.0:*                
LISTEN   0        128              0.0.0.0:22              0.0.0.0:*                
LISTEN   0        128                 [::]:22                 [::]:*                
# ceph config set mon auth_allow_insecure_global_id_reclaim false
# ps -ef | grep ceph-mon
ceph        1106       1  0 08:11 ?        00:00:02 /usr/bin/ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph

Check

# ceph mon stat
e2: 1 mons at {mon1=[v2:10.2.20.90:3300/0,v1:10.2.20.90:6789/0]} removed_ranks: {}, election epoch 7, leader 0 mon1, quorum 0 mon1
# ceph -s
  cluster:
    id:     9b7095ab-5193-420c-b2fb-2d343c57ef52
    health: HEALTH_WARN
            1 monitors have not enabled msgr2
 
  services:
    mon: 1 daemons, quorum mon1 (age 39m)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:         

4.4 mgr Daemon Configuration

Run on the ceph-mon1 node.

4.4.1 Configuring the mgr Service

yum -y install ceph-mgr*
mgr_name='mgr1'
mkdir /var/lib/ceph/mgr/ceph-${mgr_name}
ceph auth get-or-create mgr.${mgr_name} mon 'allow profile mgr' osd 'allow *' mds 'allow *' > /var/lib/ceph/mgr/ceph-${mgr_name}/keyring
chown ceph:ceph -R /var/lib/ceph/mgr/ceph-${mgr_name}
ceph mon enable-msgr2

Add the following to /root/ceph/ceph.conf on the management node:

[mgr.mgr1]
# mon host = 10.2.20.90
keyring =  /var/lib/ceph/mgr/ceph-mgr1/keyring 

Push the updated configuration:

ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"

Start the ceph-mgr daemon:

# systemctl enable ceph-mgr@mgr1
# systemctl start ceph-mgr.target     

Check that mgr is running

# ps -ef | grep mgr
ceph        2059       1 78 09:16 ?        00:00:20 /usr/bin/ceph-mgr -f --cluster ceph --id mgr1 --setuser ceph --setgroup ceph
root        2205    1677  0 09:16 pts/0    00:00:00 grep --color=auto mgr
# ceph -s
  cluster:
    id:     9b7095ab-5193-420c-b2fb-2d343c57ef52
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum mon1 (age 7m)
    mgr: mgr1(active, since 9s)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:   
# ss -lntp
State   Recv-Q   Send-Q     Local Address:Port     Peer Address:Port  Process                               
LISTEN  0        128           10.2.20.90:6800          0.0.0.0:*      users:(("ceph-mgr",pid=2059,fd=30))  
LISTEN  0        128           10.2.20.90:6801          0.0.0.0:*      users:(("ceph-mgr",pid=2059,fd=31))  
LISTEN  0        128              0.0.0.0:22            0.0.0.0:*      users:(("sshd",pid=1024,fd=4))       
LISTEN  0        128           10.2.20.90:3300          0.0.0.0:*      users:(("ceph-mon",pid=1106,fd=27))  
LISTEN  0        128           10.2.20.90:6789          0.0.0.0:*      users:(("ceph-mon",pid=1106,fd=28))  
LISTEN  0        128                 [::]:22               [::]:*      users:(("sshd",pid=1024,fd=6))  

4.4.2 Configuring mgr Modules

List the mgr modules

# ceph mgr module ls
MODULE                              
balancer              on (always on)
crash                 on (always on)
devicehealth          on (always on)
orchestrator          on (always on)
pg_autoscaler         on (always on)
progress              on (always on)
rbd_support           on (always on)
status                on (always on)
telemetry             on (always on)
volumes               on (always on)
iostat                on            
nfs                   on            
restful               on            
alerts                -             
cephadm               -             
dashboard             -             
diskprediction_local  -             
influx                -             
insights              -             
k8sevents             -             
localpool             -             
mds_autoscaler        -             
mirroring             -             
osd_perf_query        -             
osd_support           -             
prometheus            -             
rook                  -             
selftest              -             
snap_schedule         -             
stats                 -             
telegraf              -             
test_orchestrator     -             
zabbix                -         

Enable the prometheus module

ceph mgr module enable prometheus

The metrics scraped by Prometheus can be viewed at:
http://10.2.20.90:9283/metrics
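A quick check from any node that the exporter is answering; the first lines are metrics in Prometheus text format:

curl -s http://10.2.20.90:9283/metrics | head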

4.4.3 Configuring the dashboard Module

Copy the domain certificates to /etc/ceph/cert/, then:

ceph mgr module enable dashboard
ceph dashboard set-ssl-certificate -i /etc/ceph/cert/web.pem
ceph dashboard set-ssl-certificate-key -i /etc/ceph/cert/web-key.pem
echo "abc123xyz" > pwd.txt
ceph dashboard ac-user-create admin -i ./pwd.txt administrator

Check
https://ceph.demo.com:8443/
[screenshot]
[screenshot]
Once the whole cluster is configured, the dashboard shows its details.

4.5 OSD Storage Configuration

4.5.1 ceph.conf

Add the following to ceph.conf:

[client.bootstrap-osd]
# mon host = 10.2.20.90
keyring = /var/lib/ceph/bootstrap-osd/ceph.keyring

Push ceph.conf to every node:

# ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"

Copy the bootstrap-osd keyring to every node:

ansible ceph -m copy -a "src=ceph.keyring dest=/var/lib/ceph/bootstrap-osd/"
ansible ceph -m shell -a "chown ceph:ceph -R /var/lib/ceph/bootstrap-osd"

4.5.2 Adding OSD Volumes

Run on the OSD nodes; ceph-node1 is shown here as the example, and every OSD node gets the same treatment.
Check the raw disks on the node:

# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0    8G  0 disk 
├─sda1        8:1    0  600M  0 part /boot/efi
├─sda2        8:2    0    1G  0 part /boot
└─sda3        8:3    0  6.4G  0 part 
  ├─cl-root 253:0    0  5.6G  0 lvm  /
  └─cl-swap 253:1    0  820M  0 lvm  [SWAP]
sdb           8:16   0   10G  0 disk 
sdc           8:32   0   10G  0 disk 
sr0          11:0    1 1024M  0 rom  

Bring /dev/sdb and /dev/sdc into Ceph by creating OSD volumes on them; each OSD gets its own ID.

# ceph-volume lvm create --data /dev/sdb
...
Running command: /usr/bin/systemctl enable --runtime ceph-osd@0
 stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@0.service → /usr/lib/systemd/system/ceph-osd@.service.
Running command: /usr/bin/systemctl start ceph-osd@0
--> ceph-volume lvm activate successful for osd ID: 0
--> ceph-volume lvm create successful for: /dev/sdb

# ceph-volume lvm create --data /dev/sdc
Running command: /usr/bin/systemctl enable --runtime ceph-osd@1
 stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@1.service → /usr/lib/systemd/system/ceph-osd@.service.
Running command: /usr/bin/systemctl start ceph-osd@1
--> ceph-volume lvm activate successful for osd ID: 1
--> ceph-volume lvm create successful for: /dev/sdc

Enable the services; each OSD has its own ID, which must be given when enabling the per-OSD unit.

# systemctl enable ceph-osd@0
# systemctl enable ceph-osd@1
# systemctl start ceph-osd.target

# ps -ef | grep osd
ceph        3492       1  0 10:38 ?        00:00:01 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph
ceph        4993       1  1 10:39 ?        00:00:01 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Do the same on the other two OSD nodes, then check the OSD status:

# ceph osd status
ID  HOST         USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE      
 0  ceph-node1  20.6M  9.97G      0        0       0        0   exists,up  
 1  ceph-node1  20.6M  9.97G      0        0       0        0   exists,up  
 2  ceph-node2  21.0M  9.97G      0        0       0        0   exists,up  
 3  ceph-node2  20.3M  9.97G      0        0       0        0   exists,up  
 4  ceph-node3  19.7M  9.97G      0        0       0        0   exists,up  
 5  ceph-node3  20.2M  9.97G      0        0       0        0   exists,up  

# ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE    RAW USE  DATA     OMAP  META     AVAIL   %USE  VAR   PGS  STATUS
 0    hdd  0.00980   1.00000  10 GiB   21 MiB  496 KiB   0 B   21 MiB  10 GiB  0.21  1.01   16      up
 1    hdd  0.00980   1.00000  10 GiB   21 MiB  500 KiB   0 B   21 MiB  10 GiB  0.21  1.02   24      up
 2    hdd  0.00980   1.00000  10 GiB   22 MiB  920 KiB   0 B   21 MiB  10 GiB  0.21  1.02   22      up
 3    hdd  0.00980   1.00000  10 GiB   21 MiB  928 KiB   0 B   20 MiB  10 GiB  0.20  0.99   24      up
 4    hdd  0.00980   1.00000  10 GiB   20 MiB  500 KiB   0 B   20 MiB  10 GiB  0.20  0.97   18      up
 5    hdd  0.00980   1.00000  10 GiB   21 MiB  908 KiB   0 B   20 MiB  10 GiB  0.20  0.98   19      up
                       TOTAL  60 GiB  127 MiB  4.2 MiB   0 B  122 MiB  60 GiB  0.21                   
MIN/MAX VAR: 0.97/1.02  STDDEV: 0.00

# ceph -s
  cluster:
    id:     9b7095ab-5193-420c-b2fb-2d343c57ef52
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum mon1 (age 103m)
    mgr: mgr1(active, since 77m)
    osd: 6 osds: 6 up (since 74s), 6 in (since 106s)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 449 KiB
    usage:   123 MiB used, 60 GiB / 60 GiB avail
    pgs:     1 active+clean

# ceph osd pool ls
.mgr

4.6 mds进程配置

一个 Ceph 文件系统需要至少两个 RADOS 存储池,一个用于数据、一个用于元数据。
在生产中,配置这些存储池时需考虑:
1。为元数据存储池设置较高的副本水平,因为此存储池丢失任何数据都会导致整个文件系统失效。
2。为元数据存储池分配低延时存储器(像 SSD ),因为它会直接影响到客户端的操作延时。

将mds配置在ceph-mon1节点,在多个节点上可以配置多个mds服务。

4.6.1 配置mds服务

创建mds数据目录。

sudo -u ceph mkdir -p /var/lib/ceph/mds/ceph-mon1

创建keyring,并配置权限。

ceph-authtool --create-keyring /var/lib/ceph/mds/ceph-mon1/keyring --gen-key -n mds.mon1
ceph auth add mds.mon1 osd "allow rwx" mds "allow" mon "allow profile mds" -i /var/lib/ceph/mds/ceph-mon1/keyring
chown ceph:ceph -R /var/lib/ceph/mds/ceph-mon1

ceph.conf添加如下信息

[mds.mon1]
host = ceph-mon1
#mon host = 10.2.20.90
keyring = /var/lib/ceph/mds/ceph-mon1/keyring

更新配置ceph.conf到各节点

ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"

运行服务

# systemctl enable ceph-mds@mon1
# systemctl start ceph-mds.target

查看ceph状态

# ps -ef | grep mds
ceph        3617       1  1 11:16 ?        00:00:00 /usr/bin/ceph-mds -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph
# ceph -s
  cluster:
    id:     9b7095ab-5193-420c-b2fb-2d343c57ef52
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum mon1 (age 2h)
    mgr: mgr1(active, since 2h)
    osd: 6 osds: 6 up (since 49m), 6 in (since 49m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 449 KiB
    usage:   123 MiB used, 60 GiB / 60 GiB avail
    pgs:     1 active+clean
# ceph mds stat
 1 up:standby

4.6.2 创建fs卷

1。一个cephfs默认只有1个active mds,可另配standby mds做故障接管;active mds数量可通过max_mds调整。
2。若有多个cephfs,需配置多个mds进程。
3。一个mds服务进程同一时刻只服务于一个fs卷。
4。当需要多个fs卷时,也可采用子卷方式。

ceph osd pool create guo-metadata 8
ceph osd pool create guo-data 8
ceph fs new guo-fs guo-metadata guo-data

查看fs卷

# ceph fs ls
name: guo-fs, metadata pool: guo-metadata, data pools: [guo-data ]

# ceph mds stat
guo-fs:1 {0=mon1=up:active}

# ceph -s
  cluster:
    id:     9b7095ab-5193-420c-b2fb-2d343c57ef52
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum mon1 (age 2h)
    mgr: mgr1(active, since 2h)
    mds: 1/1 daemons up
    osd: 6 osds: 6 up (since 57m), 6 in (since 57m)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 41 pgs
    objects: 24 objects, 451 KiB
    usage:   126 MiB used, 60 GiB / 60 GiB avail
    pgs:     41 active+clean

# ceph fs volume ls
[
    {
        "name": "guo-fs"
    }
]

# ceph fs status guo-fs
guo-fs - 0 clients
======
RANK  STATE   MDS      ACTIVITY     DNS    INOS   DIRS   CAPS  
 0    active  mon1  Reqs:    0 /s    10     13     12      0   
    POOL        TYPE     USED  AVAIL  
guo-metadata  metadata  96.0k  18.9G  
  guo-data      data       0   18.9G  
MDS version: ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

# ceph fs get guo-fs
Filesystem 'guo-fs' (1)
fs_name guo-fs
epoch   4
flags   12 joinable allow_snaps allow_multimds_snaps
created 2023-06-04T11:46:12.324425+0800
modified        2023-06-04T11:46:13.614449+0800
tableserver     0
root    0
session_timeout 60
session_autoclose       300
max_file_size   1099511627776
required_client_features        {}
last_failure    0
last_failure_osd_epoch  0
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in      0
up      {0=24251}
failed
damaged
stopped
data_pools      [3]
metadata_pool   2
inline_data     disabled
balancer
standby_count_wanted    0
[mds.mon1{0:24251} state up:active seq 454 addr [v2:10.2.20.90:6802/3326420411,v1:10.2.20.90:6803/3326420411] compat {c=[1],r=[1],i=[7ff]}]

查看fs使用情况

# ceph fs volume info  guo-fs
{
    "mon_addrs": [
        "10.2.20.90:6789"
    ],
    "pools": {
        "data": [
            {
                "avail": 20347840512,
                "name": "guo-data",
                "used": 0
            }
        ],
        "metadata": [
            {
                "avail": 20347840512,
                "name": "guo-metadata",
                "used": 98304
            }
        ]
    }
}

4.6.3 cephfs mount测试

cephfs挂载的方式有多种,本文采用linux内核模块ceph方式,centos7.x或更高版本的内核默认安装ceph模块。
查验内核ceph模块

modinfo ceph

在管理机上操作

在cephfs卷上创建后续使用的子目录
# mount -t ceph -o mds_namespace=guo-fs,name=admin,secret=AQCwXntkCw+CGBAA/mdug0WT2jYDAFEN8tATOA== 10.2.20.90:6789:/ /root/cephfs
# mkdir -p /root/cephfs/{tp1,tp2}
# umount /root/cephfs

创建cephfs访问用户
# ceph fs authorize guo-fs client.guofs /tp1 rw
[client.guofs]
        key = AQAmFnxkwo4WAxAAPpMEpIOfTvgc6jAQBKlf8A==

查看
# ceph auth get client.guofs

删除
# ceph auth rm client.guofs

在用户机上操作

# mount -t ceph -o mds_namespace=guo-fs,name=guofs,secret=AQAmFnxkwo4WAxAAPpMEpIOfTvgc6jAQBKlf8A== 10.2.20.90:6789:/tp1 /root/tp1
# df -Th | grep tp1
10.2.20.90:6789:/tp1 ceph       19G     0   19G   0% /root/tp1
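
若希望开机自动挂载,可将key保存为secret文件并写入/etc/fstab(示意配置,secret文件路径为假设):

# cat > /etc/ceph/guofs.secret << 'EOF'
AQAmFnxkwo4WAxAAPpMEpIOfTvgc6jAQBKlf8A==
EOF
# chmod 600 /etc/ceph/guofs.secret
# cat >> /etc/fstab << 'EOF'
10.2.20.90:6789:/tp1  /root/tp1  ceph  name=guofs,secretfile=/etc/ceph/guofs.secret,mds_namespace=guo-fs,_netdev  0  0
EOF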

4.7 rbd块存储配置

rbd块存储不需要特别的服务进程,通过mon进程可直接访问。

4.7.1 创建块设备

创建rbd设备使用的存储池

# ceph osd pool create rbd01_pool 64 64
# ceph osd pool application enable rbd01_pool rbd
# rbd pool init rbd01_pool
# ceph osd pool application get rbd01_pool
# ceph osd pool get rbd01_pool all

创建rbd类的pool的命名空间
rbd类pool的命名空间的作用:
在pool存储池上划分多个逻辑区域,不同区域间的用户是隔离的,相同区域的多个用户是可以访问对方资源的。
rbd类pool默认没有命名空间。

# rbd namespace create rbd01_pool/ns1
# rbd namespace create rbd01_pool/ns2
# rbd namespace ls rbd01_pool
NAME
ns1 
ns2

建立块设备对象

rbd create --size 1024 --image-feature layering rbd01_pool/ns1/disk11
rbd create --size 1024 --image-feature layering rbd01_pool/ns1/disk21
rbd create --size 1024 --image-feature layering rbd01_pool/ns2/disk11
rbd create --size 1024 --image-feature layering rbd01_pool/ns2/disk21

查看块设备对象

# rbd list  rbd01_pool/ns1  --long
NAME    SIZE   PARENT  FMT  PROT  LOCK
disk11  1 GiB            2            
disk21  1 GiB            2 
           
# rbd list  rbd01_pool/ns2  --long
NAME    SIZE   PARENT  FMT  PROT  LOCK
disk11  1 GiB            2            
disk21  1 GiB            2  

# rbd info rbd01_pool/ns1/disk11
rbd image 'disk11':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 5f57d0156264
        block_name_prefix: rbd_data.5f57d0156264
        format: 2
        features: layering
        op_features: 
        flags: 
        create_timestamp: Sun Jun  4 12:58:54 2023
        access_timestamp: Sun Jun  4 12:58:54 2023
        modify_timestamp: Sun Jun  4 12:58:54 2023

创建rbd设备用户

# ceph auth get-or-create client.user01 mon 'profile rbd' osd 'profile rbd pool=rbd01_pool namespace=ns1'
# ceph auth get client.user01
[client.user01]
        key = AQAGL3xkXzJ8GxAAOOj9RmDe5jb96koJTYEpwA==
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=rbd01_pool namespace=ns1"

4.7.2 用户使用块设备

在用户主机上操作.

低版本ceph-common中的rbd命令不支持pool的命名空间,需安装较新版本的ceph-common以获得支持命名空间的rbd命令。

# yum -y install ceph-common

认证配置

# mkdir /etc/ceph   

# cat > /etc/ceph/ceph.conf << 'EOF'
[global]
mon_host = 10.2.20.90:6789
EOF

# cat > /etc/ceph/ceph.client.user01.keyring << 'EOF'
[client.user01]
        key = AQAGL3xkXzJ8GxAAOOj9RmDe5jb96koJTYEpwA==
EOF

查看rbd设备

# rbd -n client.user01  -m 10.2.20.90 -k /etc/ceph/ceph.client.user01.keyring  ls rbd01_pool/ns1
disk11
disk21

# rbd -n client.user01  ls rbd01_pool/ns1
disk11
disk21

针对块设备执行写入性能测试

# rbd bench --io-type write rbd01_pool/ns1/disk11 -n client.user01
bench  type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC   BYTES/SEC
    1      6208   6066.31    24 MiB/s
    2      6672   3192.38    12 MiB/s
    3      6928   2173.41   8.5 MiB/s
    4      9712   2317.31   9.1 MiB/s
    5     11840   2363.65   9.2 MiB/s
    6     14832   1730.69   6.8 MiB/s

挂载块存储

# rbd map rbd01_pool/ns1/disk11 -n client.user01
/dev/rbd0

查看已映射块设备

# rbd showmapped
id  pool        namespace  image   snap  device   
0   rbd01_pool  ns1        disk11  -     /dev/rbd0

格式化

# mkfs.xfs /dev/rbd0
# mkdir /tp2
# mount /dev/rbd0 /tp2
# df -h
...
/dev/rbd0           1014M   40M  975M   4% /tp2

取消块设备映射

rbd unmap rbd01_pool/ns1/disk11
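
作为补充,下面给出rbd镜像在线扩容的一个最小示意(假设disk11仍映射为/dev/rbd0并以xfs挂载在/tp2):

# rbd resize --size 2048 rbd01_pool/ns1/disk11 -n client.user01     # 扩容到2GiB
# xfs_growfs /tp2                                                   # xfs在线扩展文件系统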

4.8 rgw配置

4.8.1 存储池配置

对象存储池配置

ceph osd pool create .rgw.root 16 16 replicated
ceph osd pool create zone-test.rgw.control 16 16 replicated
ceph osd pool create zone-test.rgw.meta 16 16 replicated
ceph osd pool create zone-test.rgw.log 16 16 replicated
ceph osd pool create zone-test.rgw.buckets.index 16 16 replicated
ceph osd pool create zone-test.rgw.buckets.data 16 16 replicated
ceph osd pool create zone-test.rgw.buckets.non-ect 16 16 replicated

ceph osd pool application enable  .rgw.root rgw
ceph osd pool application enable  zone-test.rgw.control rgw
ceph osd pool application enable  zone-test.rgw.meta rgw
ceph osd pool application enable  zone-test.rgw.log rgw
ceph osd pool application enable  zone-test.rgw.buckets.index rgw
ceph osd pool application enable  zone-test.rgw.buckets.data rgw
ceph osd pool application enable  zone-test.rgw.buckets.non-ect rgw

realm配置

radosgw-admin realm create --rgw-realm=realm-test --default

zonegroup配置

radosgw-admin zonegroup create --rgw-zonegroup=zonegroup-test --endpoints=10.2.20.90:80 --default --master

zone配置

radosgw-admin zone create --rgw-zone=zone-test --rgw-zonegroup=zonegroup-test  --endpoints=10.2.20.90:80 --default --master

period更新

radosgw-admin period update --commit

4.8.2 rgw进程配置

在ceph-mon1节点上面操作

4.8.2.1 keyring配置
配置实例名称变量
# instance_name=rgw1

新增keyring存放目录
# mkdir -p /var/lib/ceph/radosgw/ceph-radosgw.${instance_name}

创建rgw服务需要的keyring
# ceph auth get-or-create client.radosgw.${instance_name} osd 'allow rwx' mon 'allow rw' -o /var/lib/ceph/radosgw/ceph-radosgw.${instance_name}/keyring

配置权限
# chown -R ceph:ceph /var/lib/ceph/radosgw

查看cephx
# ceph auth get client.radosgw.${instance_name}
[client.radosgw.rgw1]
        key = AQAwQ3xkHy6/EBAAKQlW/7WXpt7HyxiOdcIv8w==
        caps mon = "allow rw"
        caps osd = "allow rwx"
4.8.2.2 rgw服务配置

ceph.conf添加如下信息

[client.radosgw.rgw1]
host = ceph-mon1
rgw_frontends = "beast port=80"
rgw_enable_usage_log = true
keyring = /var/lib/ceph/radosgw/ceph-radosgw.rgw1/keyring
rgw_realm = "realm-test"
rgw_zonegroup = "zonegroup-test"
rgw_zone = "zone-test"
rgw_verify_ssl = false

更新配置ceph.conf到各节点

ansible ceph -m copy -a "src=ceph.conf dest=/etc/ceph/"

安装rgw服务

yum -y install ceph-radosgw

启动服务

# systemctl enable ceph-radosgw@radosgw.${instance_name}
# systemctl start ceph-radosgw.target

查看进程

# ps -ef | grep radosgw
ceph       12853       1  5 00:32 ?        00:00:00 /usr/bin/radosgw -f --cluster ceph --name client.radosgw.sr1 --setuser ceph --setgroup ceph

4.8.3 rgw测试

在管理机上建立rgw用户

# radosgw-admin user create --uid="guofs" --display-name="test"
# radosgw-admin user info --uid="guofs" | grep access_key -A1
            "access_key": "LLOGCYL0FAVR2K4YFZB8",
            "secret_key": "FbkyDqNGumDob5n54NRMtaYskvrVQgRrddHRivcS"

在用户主机上配置.
测试工具
https://github.com/peak/s5cmd

外部主机上操作如下

export AWS_ENDPOINT=10.2.20.90:80
export AWS_ACCESS_KEY_ID=LLOGCYL0FAVR2K4YFZB8
export AWS_SECRET_ACCESS_KEY=FbkyDqNGumDob5n54NRMtaYskvrVQgRrddHRivcS

在上传文件前,需先建立存储桶。

# s5cmd --endpoint-url http://$AWS_ENDPOINT mb s3://test01
# s5cmd --endpoint-url http://$AWS_ENDPOINT mb s3://test02

查看存储桶

# s5cmd --endpoint-url http://$AWS_ENDPOINT ls
2023/06/04 10:36:59  s3://test01
2023/06/04 10:37:08  s3://test02

上传

# echo "hello rgw" > /tmp/test.txt
# s5cmd --endpoint-url http://$AWS_ENDPOINT cp /tmp/test.txt s3://test01

查看文件列表

# s5cmd --endpoint-url http://$AWS_ENDPOINT ls s3://test01
2023/06/04 10:37:44                10 test.txt

下载

# s5cmd --endpoint-url http://$AWS_ENDPOINT cp s3://test01/test.txt ./
cp s3://test01/test.txt test.txt
# ll
total 4
-rw-r--r-- 1 root root 10 Jun  4 18:45 test.txt
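
除s5cmd外,也可用aws cli访问该S3接口(示意,假设已安装awscli并沿用上面的环境变量;region可任填一个):

# export AWS_DEFAULT_REGION=us-east-1
# aws --endpoint-url http://10.2.20.90:80 s3 ls
# aws --endpoint-url http://10.2.20.90:80 s3 cp s3://test01/test.txt /tmp/test2.txt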

五、计算层/k8s/istio

节点说明

节点 | os | 配置 | ip | 角色
mgm | Rocky9.1 | 2vCPU,RAM4GB,HD:8GB | 10.2.20.59/192.168.3.x | 管理节点,ssh免密
k8s-master1 | Rocky9.1 | 4vCPU,RAM4GB,HD:32GB | 10.2.20.110/192.168.3.x | 主控
k8s-node1 | Rocky9.1 | 4vCPU,RAM4GB,HD:32GB | 10.2.20.111/192.168.3.x | worker
k8s-node2 | Rocky9.1 | 4vCPU,RAM4GB,HD:32GB | 10.2.20.112/192.168.3.x | worker

K8S版本:v1.27.2

5.1 K8S节点配置

5.1.1 基础配置

k8s所有节点配置此部分

5.1.1.1 基础包和内核参数

配置hosts文件

cat >> /etc/hosts << 'EOF'
10.2.20.110     k8s-master1
10.2.20.111     k8s-node1
10.2.20.112     k8s-node2
EOF

基础配置及软件包

swapoff -a
sed -i '/swap/s/^/#/' /etc/fstab
free -m
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
ip_tables
iptable_filter
overlay
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
yum -y install epel-release
yum -y install bash-completion net-tools gcc wget curl telnet tree lrzsz iproute zip
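
配置完成后,可用如下命令快速核对内核模块与内核参数是否生效(示意):

# lsmod | grep -E 'br_netfilter|overlay'
# sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
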
5.1.1.2 容器运行时配置

常用的容器运行时有docker和cri-o两种。
本文采用cri-o
https://github.com/cri-o/cri-o/

安装cri-o

yum -y install curl jq tar
curl https://raw.githubusercontent.com/cri-o/cri-o/main/scripts/get | bash -s -- -a amd64
systemctl enable --now crio.service
systemctl start crio

配置cri-o

# cat /etc/crictl.yaml
runtime-endpoint: unix:///var/run/crio/crio.sock
image-endpoint: unix:///var/run/crio/crio.sock
timeout: 10
debug: false

# vi /etc/crio/crio.conf
[crio.image]
pause_image = "registry.aliyuncs.com/google_containers/pause:3.9"

# systemctl restart crio

测试

# crictl --runtime-endpoint unix:///run/crio/crio.sock version
Version:  0.1.0
RuntimeName:  cri-o
RuntimeVersion:  1.27.0
RuntimeApiVersion:  v1

# crio --version
crio version 1.27.0
Version:        1.27.0
GitCommit:      844b43be4337b72a54b53518667451c975515d0b
GitCommitDate:  2023-06-03T07:36:19Z
GitTreeState:   dirty
BuildDate:      1980-01-01T00:00:00Z
GoVersion:      go1.20.4
Compiler:       gc
Platform:       linux/amd64
Linkmode:       static
BuildTags:      
  static
  netgo
  osusergo
  exclude_graphdriver_btrfs
  exclude_graphdriver_devicemapper
  seccomp
  apparmor
  selinux
LDFlags:          unknown
SeccompEnabled:   true
AppArmorEnabled:  false
5.1.1.3 kubectl kubelet kubeadm安装

配置阿里kubernetes源

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/    
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

版本查看

# yum -y makecache
# yum list kubelet --showduplicates | sort -r
...
kubelet.x86_64                       1.27.2-0                        kubernetes 
kubelet.x86_64                       1.27.2-0                        @kubernetes
kubelet.x86_64                       1.27.1-0                        kubernetes 
kubelet.x86_64                       1.27.0-0                        kubernetes 
...

安装kubectl kubelet kubeadm ,默认安装最新版

# yum -y install kubectl kubelet kubeadm

提示:
在各节点安装k8s成功后再“systemctl enable kubelet”
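
若希望与本文环境严格一致,也可显式指定版本安装(示意,以仓库中的1.27.2为例):

# yum -y install kubelet-1.27.2 kubeadm-1.27.2 kubectl-1.27.2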

5.1.1.4 k8s系统镜像准备

在配置master和worker节点时,会从公网拉取k8s系统镜像。
可将这些镜像提前pull到节点本地。
查看k8s系统镜像列表

# kubeadm config images list --kubernetes-version=1.27.2  --image-repository="registry.aliyuncs.com/google_containers" 
W0604 23:32:32.215609   11292 images.go:80] could not find officially supported version of etcd for Kubernetes v1.27.2, falling back to the nearest etcd version (3.5.7-0)
registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.27.2
registry.aliyuncs.com/google_containers/kube-scheduler:v1.27.2
registry.aliyuncs.com/google_containers/kube-proxy:v1.27.2
registry.aliyuncs.com/google_containers/pause:3.9
registry.aliyuncs.com/google_containers/etcd:3.5.7-0
registry.aliyuncs.com/google_containers/coredns:v1.10.1

pull镜像

kubeadm config images pull --kubernetes-version=1.27.2 --image-repository="registry.aliyuncs.com/google_containers"

查看

# crictl images
IMAGE                                                             TAG                 IMAGE ID            SIZE
registry.aliyuncs.com/google_containers/coredns                   v1.10.1             ead0a4a53df89       53.6MB
registry.aliyuncs.com/google_containers/etcd                      3.5.7-0             86b6af7dd652c       297MB
registry.aliyuncs.com/google_containers/kube-apiserver            v1.27.2             c5b13e4f7806d       122MB
registry.aliyuncs.com/google_containers/kube-controller-manager   v1.27.2             ac2b7465ebba9       114MB
registry.aliyuncs.com/google_containers/kube-proxy                v1.27.2             b8aa50768fd67       72.7MB
registry.aliyuncs.com/google_containers/kube-scheduler            v1.27.2             89e70da428d29       59.8MB
registry.aliyuncs.com/google_containers/pause                     3.9                 e6f1816883972       750kB

5.1.2 master节点配置

安装第一台master节点

# kubeadm init \
--kubernetes-version="1.27.2" \
--cri-socket="/var/run/crio/crio.sock" \
--control-plane-endpoint="10.2.20.110" \
--apiserver-advertise-address=10.2.20.110 \
--image-repository="registry.aliyuncs.com/google_containers" \
--service-cidr=10.10.0.0/16 \
--pod-network-cidr="10.244.0.0/16" \
--ignore-preflight-errors=Swap \
--upload-certs
输出
...
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:
第二台master安装
  kubeadm join 10.2.20.110:6443 --token y1dzd6.rmojednvdy1ukevo \
        --discovery-token-ca-cert-hash sha256:4fc878964ab80032ee47e17cdf8a67700f1cc58a72af69d7ffa3b7e0ac0b2b09 \
        --control-plane --certificate-key 45d54477eeb7228c6728cbc343c1bb59cce539f3f65e83e6136a724a43b45ac9

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:
worker节点安装
kubeadm join 10.2.20.110:6443 --token y1dzd6.rmojednvdy1ukevo \
        --discovery-token-ca-cert-hash sha256:4fc878964ab80032ee47e17cdf8a67700f1cc58a72af69d7ffa3b7e0ac0b2b09 

配置kubelet开机引导

systemctl enable kubelet.service

配置kubectl

创建kubectl环境变量
# mkdir -p $HOME/.kube
# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# sudo chown $(id -u):$(id -g) $HOME/.kube/config
# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
# source /etc/profile
执行下面命令,使kubectl可以自动补全
# echo "source <(kubectl completion bash)" >> ~/.bash_profile
# source .bash_profile

测试

# kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.27.2
Kustomize Version: v5.0.1
Server Version: v1.27.2

# kubectl get node
NAME          STATUS   ROLES           AGE     VERSION
k8s-master1   Ready    control-plane   8m45s   v1.27.2

5.1.3 worker节点配置

所有worker节点运行如下命令

# kubeadm join 10.2.20.110:6443 --token y1dzd6.rmojednvdy1ukevo \
        --discovery-token-ca-cert-hash sha256:4fc878964ab80032ee47e17cdf8a67700f1cc58a72af69d7ffa3b7e0ac0b2b09 
输出
...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

配置开机启动

systemctl enable kubelet.service
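
若初始化时生成的token(默认24小时有效)已过期,可在master节点重新生成join命令(示意):

# kubeadm token create --print-join-command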

5.1.4 管理机mgm配置kubectl

在k8s集群外安装k8s客户端命令kubectl。
创建kubectl环境变量

scp k8s-master1:/usr/bin/kubectl /usr/bin/
mkdir -p $HOME/.kube
scp k8s-master1:/etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
mkdir /etc/kubernetes
scp k8s-master1:/etc/kubernetes/admin.conf /etc/kubernetes/
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
source /etc/profile

配置kubectl自动补全

echo "source <(kubectl completion bash)" >> ~/.bash_profile
source .bash_profile

测试

# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok                              
scheduler            Healthy   ok                              
etcd-0               Healthy   {"health":"true","reason":""}   

# kubectl get node
NAME          STATUS   ROLES           AGE   VERSION
k8s-master1   Ready    control-plane   33m   v1.27.2
k8s-node1     Ready    <none>          20m   v1.27.2
k8s-node2     Ready    <none>          19m   v1.27.2

# kubectl get pod -A
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-7bdc4cb885-hcl6t              1/1     Running   0          16m
kube-system   coredns-7bdc4cb885-hvmgs              1/1     Running   0          16m
kube-system   etcd-k8s-master1                      1/1     Running   0          17m
kube-system   kube-apiserver-k8s-master1            1/1     Running   0          16m
kube-system   kube-controller-manager-k8s-master1   1/1     Running   0          16m
kube-system   kube-proxy-464dg                      1/1     Running   0          16m
kube-system   kube-proxy-7vtxg                      1/1     Running   0          2m53s
kube-system   kube-proxy-crfkg                      1/1     Running   0          3m52s
kube-system   kube-scheduler-k8s-master1            1/1     Running   0          16m

5.1.5 访问私有仓库harbor配置

在管理机上操作。
k8s各节点安装成功后再配置此项

5.1.5.1 k8s/crictl访问私有仓库配置

私有CA根证书添加到k8s所有节点的根证书链中

ansible k8s -m shell -a "wget http://10.2.20.59/ssl/ca.pem -O /tmp/ca.pem"
ansible k8s -m shell -a "cat /tmp/ca.pem >> /etc/pki/tls/certs/ca-bundle.crt"

创建config.json用于存储私仓用户和密码。

# cat > config.json << 'EOF'
{
        "auths": {
                "harbor.demo.com": {
                        "auth": "YWRtaW46MTIzNDU2NzgK"
                }
        }
}
EOF
# ansible k8s -m copy -a "src=config.json dest=/var/lib/kubelet/"
# ansible k8s -m shell -a "systemctl restart kubelet.service"
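
其中auth字段为“用户名:密码”的base64编码,可按如下方式生成后填入(示意,口令仅作演示):

# echo -n 'admin:12345678' | base64
YWRtaW46MTIzNDU2Nzg=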

配置cri-o/crictl使用config.json

# vi crio.conf
...
[crio.image]
global_auth_file = "/var/lib/kubelet/config.json"

# ansible k8s -m copy -a "src=crio.conf dest=/etc/crio/"
# ansible k8s -m shell -a "systemctl restart crio"

提示:
上述办法是将私仓的帐号存储在config.json,供所有命名空间使用。

5.1.5.2 测试

crictl拉取镜像(在k8s某节点上测试)

# crictl pull harbor.demo.com/web/busybox:v2.1
Image is up to date for harbor.demo.com/web/busybox@sha256:0152995fd9b720acfc49ab88e48bc9f4509974fb17025896740ae02396e37388

k8s从私仓拉取镜像

# kubectl create namespace test

# cat app-c19-1.yaml
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
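
应用上述清单:

# kubectl apply -f app-c19-1.yaml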

# kubectl -n test get all
NAME                            READY   STATUS    RESTARTS   AGE
pod/app-test-55f5b45c96-7fg8g   1/1     Running   0          17s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/app-test   1/1     1            1           17s

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/app-test-55f5b45c96   1         1         1       17s
5.1.5.3 secrets存储私仓帐号

“5.1.5.1”是将私仓的帐号存储在config.json,供所有命名空间使用。
也可以使用secrets存储私仓的帐号。

# kubectl create secret docker-registry harbor-test \
  --docker-server="harbor.demo.com" \
  --docker-username="admin" \
  --docker-password="12qwaszx+pp"

# cat app-c19-2.yaml
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      imagePullSecrets:
        - name: harbor-test
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.2
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP

# kubectl apply -f app-c19-2.yaml

# kubectl -n test get pod
NAME                       READY   STATUS    RESTARTS   AGE
app-test-6644fb79b-g4njz   1/1     Running   0          18s

采用imagePullSecrets指定私仓secrets。
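
需要注意,secret仅在其所属命名空间内有效;由于示例工作负载位于test命名空间,创建secret时应指定同一命名空间(示意):

# kubectl -n test create secret docker-registry harbor-test \
  --docker-server="harbor.demo.com" \
  --docker-username="admin" \
  --docker-password="12qwaszx+pp"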

5.2 网络配置calico

Kubernetes通过CNI协议支持多种网络模型,如Calico、Flannel、Open vSwitch、Weave、Cilium等。

本文以calico为例。

5.2.1 Calico安装

https://github.com/projectcalico/cni-plugin
https://github.com/projectcalico/calico
https://docs.tigera.io/calico/latest/getting-started/kubernetes/quickstart

本文采用的Calico插件是一个纯三层方案,不需要 Overlay,基于 iptables 实现策略配置。

Calico特点

安装

# wget https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/calico.yaml
# cat calico.yaml | grep "image:"
          image: docker.io/calico/cni:v3.26.0
          image: docker.io/calico/cni:v3.26.0
          image: docker.io/calico/node:v3.26.0
          image: docker.io/calico/node:v3.26.0
          image: docker.io/calico/kube-controllers:v3.26.0
将这些镜像转存到私仓,并修改calico.yaml中的镜像地址,将docker.io改为harbor.demo.com
# cat calico.yaml | grep "image: "
          image: harbor.demo.com/calico/cni:v3.26.0
          image: harbor.demo.com/calico/cni:v3.26.0
          image: harbor.demo.com/calico/node:v3.26.0
          image: harbor.demo.com/calico/node:v3.26.0
          image: harbor.demo.com/calico/kube-controllers:v3.26.0
安装
# kubectl apply -f  calico.yaml

查看

# kubectl get pod --all-namespaces
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-868d576d4-7jrwh   1/1     Running   0          12m
kube-system   calico-node-ld8gv                         1/1     Running   0          17m
kube-system   calico-node-s5x7q                         1/1     Running   0          17m
kube-system   calico-node-zfr76                         1/1     Running   0          17m
kube-system   coredns-7bdc4cb885-hcl6t                  1/1     Running   0          4h20m
kube-system   coredns-7bdc4cb885-hvmgs                  1/1     Running   0          4h20m
kube-system   etcd-k8s-master1                          1/1     Running   0          4h20m
kube-system   kube-apiserver-k8s-master1                1/1     Running   0          4h20m
kube-system   kube-controller-manager-k8s-master1       1/1     Running   0          4h20m
kube-system   kube-proxy-464dg                          1/1     Running   0          4h20m
kube-system   kube-proxy-7vtxg                          1/1     Running   0          4h6m
kube-system   kube-proxy-crfkg                          1/1     Running   0          4h7m
kube-system   kube-scheduler-k8s-master1                1/1     Running   0          4h20m

配置cri-o采用cni插件

# tree /etc/cni/net.d/
/etc/cni/net.d/
├── 10-calico.conflist
├── 11-crio-ipv4-bridge.conflist
└── calico-kubeconfig

# tree /opt/cni/bin/
/opt/cni/bin/
├── bandwidth
├── bridge
├── calico
├── calico-ipam
├── dhcp
├── dummy
├── firewall
├── flannel
├── host-device
├── host-local
├── install
├── ipvlan
├── loopback
├── macvlan
├── portmap
├── ptp
├── sbr
├── static
├── tap
├── tuning
├── vlan
└── vrf


可修改cri-o配置来识别calico网络
# vi /etc/crio/crio.conf
[crio.network]

# The default CNI network name to be selected. If not set or "", then
# CRI-O will pick-up the first one found in network_dir.
# cni_default_network = ""

# Path to the directory where CNI configuration files are located.
network_dir = "/etc/cni/net.d/"

# Paths to directories where CNI plugin binaries are located.
plugin_dirs = [
      "/opt/cni/bin/",
]

# ansible k8s -m copy -a "src=crio.conf dest=/etc/crio/"
# ansible k8s -m shell -a "systemctl restart crio"

5.2.2 Calicoctl工具

calicoctl 是 Calico 的客户端管理工具,可以方便地管理 calico 网络、配置和安全策略。calicoctl 命令行提供了许多资源管理命令,允许创建、修改、删除和查看不同的 Calico 资源,资源类型包括 node、bgpPeer、hostEndpoint、workloadEndpoint、ipPool、policy、profile 等。

提示
1。calico版本与calicoctl版本要相同
2。在master节点安装此命令

安装

# curl -L https://github.com/projectcalico/calico/releases/latest/download/calicoctl-linux-amd64 -o calicoctl
# mv calicoctl /sbin/
# chmod +x /sbin/calicoctl

# calicoctl version
Client Version:    v3.26.0
Git commit:        8b103f46f
Cluster Version:   v3.25.1
Cluster Type:      k8s,bgp,kubeadm,kdd

# mkdir /etc/calico

# vi /etc/calico/calicoctl.cfg 
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "kubernetes"
  kubeconfig: "/root/.kube/config"

# calicoctl node status
Calico process is running.

IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 192.168.3.13 | node-to-node mesh | up    | 08:19:18 | Established |
| 192.168.3.8  | node-to-node mesh | up    | 08:19:09 | Established |
+--------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

5.3 metrics-server配置

https://github.com/kubernetes-sigs/metrics-server
https://github.com/kubernetes-sigs/metrics-server/releases

Metrics Server 是 Kubernetes 内置自动伸缩体系中可扩展、高效的容器资源指标来源,用于K8S资源指标监控,如pod的内存或cpu使用情况。

Metrics Server 从各节点的 Kubelet 收集资源指标,并通过 Metrics API 在 Kubernetes apiserver 中公开,以供 HPA(Horizontal Pod Autoscaler,水平自动伸缩)和 VPA(Vertical Pod Autoscaler,垂直自动伸缩)使用。Metrics API 也可以通过 kubectl top 访问,从而更容易调试自动伸缩相关问题。

kube-apiserver 必须启用聚合层,即在/etc/kubernetes/manifests/kube-apiserver.yaml中添加“--enable-aggregator-routing=true”,如下:

# vi /etc/kubernetes/manifests/kube-apiserver.yaml
...
spec:
  containers:
  - command:
    - kube-apiserver
    - --advertise-address=10.2.20.110
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
    - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
    - --etcd-servers=https://127.0.0.1:2379
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-issuer=https://kubernetes.default.svc.cluster.local
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.10.0.0/16
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --enable-aggregator-routing=true
    image: registry.aliyuncs.com/google_containers/kube-apiserver:v1.27.2
    imagePullPolicy: IfNotPresent
...
# systemctl restart kubelet.service

安装

# wget github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml

添加kubelet-insecure-tls,如下:
# vi components.yaml 
...
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
        image: harbor.demo.com/metrics-server/metrics-server:v0.6.3
...

# kubectl apply -f components.yaml

测试

# kubectl top node
NAME              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
c8-k8s-master01   242m         6%     1454Mi          40%       
c8-k8s-worker01   92m          2%     687Mi           18%       
c8-k8s-worker02   102m         2%     725Mi           19%   

# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{},"items":[{"metadata":{"name":"k8s-master1","creationTimestamp":"2023-06-05T08:10:10Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"k8s-master1","kubernetes.io/os":"linux","node-role.kubernetes.io/control-plane":"","node.kubernetes.io/exclude-from-external-load-balancers":""}},"timestamp":"2023-06-05T08:09:56Z","window":"20.059s","usage":{"cpu":"283196769n","memory":"1611636Ki"}},{"metadata":{"name":"k8s-node1","creationTimestamp":"2023-06-05T08:10:10Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"k8s-node1","kubernetes.io/os":"linux"}},"timestamp":"2023-06-05T08:09:54Z","window":"20.042s","usage":{"cpu":"104153377n","memory":"1059760Ki"}},{"metadata":{"name":"k8s-node2","creationTimestamp":"2023-06-05T08:10:10Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"k8s-node2","kubernetes.io/os":"linux"}},"timestamp":"2023-06-05T08:09:55Z","window":"20.042s","usage":{"cpu":"104032381n","memory":"976512Ki"}}]}    

5.4 metallb配置

https://github.com/google/metallb
https://metallb.universe.tf/installation/
https://metallb.universe.tf/configuration/

MetalLB 是一个负载均衡器,专门解决裸金属 Kubernetes 集群中无法使用 LoadBalancer 类型服务的痛点。MetalLB 使用标准化的路由协议,
以便裸金属 Kubernetes 集群上的外部服务也尽可能地工作。

在云厂商提供的 Kubernetes 集群中,Service 声明使用 LoadBalancer时,云平台会自动分配一个负载均衡器的IP地址给你,应用可以通过
这个地址来访问。

MetalLB 会在 Kubernetes 内运行,监控服务对象的变化,一旦监测到有新的 LoadBalancer 服务运行,并且没有可申请的负载均衡器之后,
就会完成地址分配和外部声明两部分的工作。

网络宣告方式

1。Layer 2 模式
Layer 2 模式下,每个 Service 由集群中的一个 Node 负责。服务的入口流量全部经由该单个节点,再由该节点的 Kube-Proxy 把流量转发给服务的 Pods。也就是说,该模式下 MetalLB 并没有提供真正意义上的负载均衡。尽管如此,MetalLB 提供了故障转移功能:如果持有 IP 的节点出现故障,默认 10 秒后即发生故障转移,IP 会被分配给其它健康的节点。

优点:
通用性好:可以在任何以太网网络上运行,不需要特殊的硬件。

2。BGP 模式
BGP 模式下,集群中所有node都会跟上联路由器建立BGP连接,并且会告知路由器应该如何转发service的流量。

优点:
BGP模式下才是一个真正的 LoadBalancer,通过BGP协议正确分布流量,不再需要一个Leader节点。

5.4.1 安装

# wget https://raw.githubusercontent.com/metallb/metallb/v0.13.10/config/manifests/metallb-native.yaml 
# cat metallb-native.yaml | grep image:
        image: quay.io/metallb/controller:v0.13.10
        image: quay.io/metallb/speaker:v0.13.10
将镜像转存至私有仓库
# cat metallb-native.yaml | grep image:
        image: harbor.demo.com/metallb/controller:v0.13.10
        image: harbor.demo.com/metallb/speaker:v0.13.10

# kubectl apply -f ./metallb-native.yaml

# kubectl get all -n metallb-system
NAME                              READY   STATUS    RESTARTS   AGE
pod/controller-746c786cf9-hdcvp   0/1     Running   0          17s
pod/speaker-224m6                 0/1     Running   0          17s
pod/speaker-cqhnr                 0/1     Running   0          17s
pod/speaker-s2fq6                 0/1     Running   0          17s

NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/webhook-service   ClusterIP   10.10.237.41   <none>        443/TCP   18s

NAME                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/speaker   3         3         0       3            0           kubernetes.io/os=linux   18s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/controller   0/1     1            0           18s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/controller-746c786cf9   1         1         0       17s

5.4.2 配置LB网络

# cat metallb-cm.yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:

  addresses:
  - 192.168.3.180-192.168.3.200


---

apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
spec:
  ipAddressPools:
  - first-pool

# kubectl apply -f  metallb-cm.yaml

# kubectl get IPAddressPool,L2Advertisement -n metallb-system
NAME                                  AUTO ASSIGN   AVOID BUGGY IPS   ADDRESSES
ipaddresspool.metallb.io/first-pool   true          false             ["192.168.3.180-192.168.3.200"]

NAME                                 IPADDRESSPOOLS   IPADDRESSPOOL SELECTORS   INTERFACES
l2advertisement.metallb.io/example   ["first-pool"]                         

5.4.3 测试

pod/svc样例,配置为:“type: LoadBalancer”

# cat app-c19-3.yaml
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test
  name: app-test
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test
  ports:
    - name: port01
      port: 7071
      targetPort: 8080
      protocol: TCP
  type: LoadBalancer

应用及查看

# kubectl apply -f app-c19-3.yaml 

# kubectl -n test get svc
NAME       TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)          AGE
app-test   LoadBalancer   10.10.39.255   192.168.3.180   7071:31874/TCP   95s

这样可以直接使用LB IP来访问

# curl -X POST http://192.168.3.180:7071/test
POST    Hello,world-v10
hello,method is post
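
若需让某个Service固定使用地址池中的某个IP,可在该Service的metadata中添加注解(示意,MetalLB v0.13+支持,IP须位于地址池内):

metadata:
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.3.190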

5.5 Ingress配置

pod对外提供服务。主要有如下两种方式。

其一:pod直接对外方式

类型 | 说明
hostNetwork: true | pod直接使用物理节点的网络命名空间资源
hostPort: 8088 | pod仅使用物理节点自身IP和某个端口

其二:svc方式(svc转发pod端口流量,原理是基于iptables和ipvs)

类型 | 说明
ClusterIP | 情况一:svc只供svc所在网络内部访问;情况二:通过代理方式(如kubectl proxy、kube proxy)将svc服务代理出去
NodePort | 使用节点ip和端口将svc服务暴露出去
LoadBalancer | 需要MetalLB支持。情况一:svc直接使用外部网络(非节点网络)将svc暴露出去;情况二:采用Ingress Controller将svc服务代理出去,此时svc采用ClusterIP方式,而Ingress采用LB方式

其中Ingress Controller是主要方式。

5.5.1 Ingress Controller

Service的工作原理基于iptables和ipvs,属于四层(Layer 4)代理。Layer 4代理的缺陷是只工作在tcp/ip协议栈,无法处理Layer 7流量,基于此需求背景产生了Ingress。

Ingress 提供了负载均衡器的典型特性:HTTP 路由、粘性会话、SSL 终止、SSL直通、TCP 和 UDP 负载平衡等。

Ingress 只是一个统称,其由 Ingress 和 Ingress Controller 两部分组成,如下。

类型 | 说明
ingress resources | ingress规则,即一个类型为Ingress的k8s api对象
ingress controller | 核心是一个deployment,实现方式有很多,比如nginx、Contour、Haproxy、traefik、Istio,其中service的类型用LoadBalancer方式

5.5.2 nginx ingress

参考
http://github.com/nginxinc/kubernetes-ingress
https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-manifests/
https://hub.docker.com/r/nginx/nginx-ingress
https://docs.nginx.com/nginx-ingress-controller
https://www.nginx.com/products/nginx-ingress-controller

这是 NGINX 公司开发的官方产品,它也有一个基于 NGINX Plus 的商业版。NGINX 的控制器具有很高的稳定性、持续的向后兼容性,且没有任何第三方模块。

5.5.2.1 安装

下载

# wget https://github.com/nginxinc/kubernetes-ingress/archive/refs/heads/main.zip
# unzip main.zip
# cd kubernetes-ingress-main/deployments

安装

kubectl apply -f common/ns-and-sa.yaml
kubectl apply -f rbac/rbac.yaml
kubectl apply -f common/nginx-config.yaml
kubectl apply -f common/ingress-class.yaml
kubectl apply -f common/crds/k8s.nginx.org_virtualservers.yaml
kubectl apply -f common/crds/k8s.nginx.org_virtualserverroutes.yaml
kubectl apply -f common/crds/k8s.nginx.org_transportservers.yaml
kubectl apply -f common/crds/k8s.nginx.org_policies.yaml
kubectl apply -f common/crds/k8s.nginx.org_globalconfigurations.yaml
kubectl apply -f deployment/nginx-ingress.yaml
kubectl apply -f service/loadbalancer.yaml

查看

# kubectl get ingressclass
NAME    CONTROLLER                     PARAMETERS   AGE
nginx   nginx.org/ingress-controller   <none>       25s

# kubectl get svc --namespace=nginx-ingress
NAME            TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
nginx-ingress   LoadBalancer   10.10.122.65   192.168.3.180   80:31582/TCP,443:32381/TCP   23s

# kubectl get pods --namespace=nginx-ingress
NAME                             READY   STATUS    RESTARTS   AGE
nginx-ingress-6f6b89c69b-nxgq4   1/1     Running   0          39s
5.5.2.2 测试

命名空间定义

kubectl create namespace test
kubectl config set-context kubernetes-admin@kubernetes --namespace='test'

创建pod/svc

# cat app-c24.yaml 
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test
  name: app-test
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test
  ports:
    - name: port01
      port: 7071
      targetPort: 8080
      protocol: TCP
  type: ClusterIP

# kubectl apply -f app-c24.yaml

http测试

# cat ingress-02.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  namespace: test
spec:
  ingressClassName: nginx
  rules:
  - host: www.test.com 
    http:
      paths:
      - backend:
          service:
            name: app-test
            port:
              number: 7071
        path: /
        pathType: Prefix

# kubectl apply -f ingress-02.yaml
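
测试前需将www.test.com解析到nginx-ingress的EXTERNAL-IP(本例为192.168.3.180);若暂未配置DNS,也可用curl的--resolve临时指定(示意):

# curl --resolve www.test.com:80:192.168.3.180 http://www.test.com/test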

# curl http://www.test.com/test
GET     Hello,world-v10
hello,method is get

https测试

# kubectl delete -f ingress-02.yaml
# kubectl create secret tls web-ssl --cert=./web.pem --key=./web-key.pem

# cat ingress-01.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  namespace: test
spec:
  ingressClassName: nginx
  tls:
  - hosts:
      - www.demo.com
      - www.test.com
    secretName: web-ssl
  rules:
  - host: www.test.com 
    http:
      paths:
      - backend:
          service:
            name: app-test
            port:
              number: 7071
        path: /
        pathType: Prefix

# kubectl apply -f ingress-01.yaml

# curl -v -L https://www.test.com/test
*   Trying 192.168.3.180:443...
* Connected to www.test.com (192.168.3.180) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
* TLSv1.0 (OUT), TLS header, Certificate Status (22):
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS header, Certificate Status (22):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS header, Finished (20):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS header, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS header, Unknown (23):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=CN; ST=ZhuMaDian; L=HeNan; O=k8s; OU=System; CN=domain-ssl-test
*  start date: May 14 08:27:00 2023 GMT
*  expire date: Apr 20 08:27:00 2123 GMT
*  subjectAltName: host "www.test.com" matched cert's "*.test.com"
*  issuer: C=CN; ST=shenzhen; L=Gudong; O=k8s; OU=System; CN=Root-CA-gfs
*  SSL certificate verify ok.
* TLSv1.2 (OUT), TLS header, Unknown (23):
> GET /test HTTP/1.1
> Host: www.test.com
> User-Agent: curl/7.76.1
> Accept: */*
> 
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.2 (IN), TLS header, Unknown (23):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* TLSv1.2 (IN), TLS header, Unknown (23):
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: nginx/1.23.4
< Date: Mon, 05 Jun 2023 17:02:29 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 40
< Connection: keep-alive
< 
GET     Hello,world-v10
hello,method is get
* Connection #0 to host www.test.com left intact

http跳转https
# curl -L http://www.test.com/test
GET     Hello,world-v10
hello,method is get

5.5.3 istio ingress

https://istio.io/latest/zh/
建议大家查阅istio官网。

在 Kubernetes 下,对网络流量的管理只能到 Pod 级别,更细粒度的控制依然得靠应用代码层面来支撑。也就是说,与业务无关的网络控制逻辑依然夹杂在程序员开发的业务代码中。在此背景下,为让程序员更专注于业务代码,出现了服务网格类软件(如istio),让运维和开发更专注于各自的本职工作。从某种意义上讲,istio是k8s的深度延伸。

Istio 使用功能强大的 Envoy 服务代理扩展了 Kubernetes,以建立一个可编程的、可感知的应用程序网络。Istio 与 Kubernetes 和传统工作负载一起使用,为复杂的部署带来了标准的通用流量管理、遥测和安全性。

Istio 由以下组件组成
1。Envoy
每个微服务的 Sidecar 代理,用于处理集群中服务之间以及从服务到外部服务的入口/出口流量。这些代理形成了一个安全的微服务网格,提供了一组丰富的功能,如发现、丰富的第 7 层路由、断路器、策略实施和遥测记录/报告功能。
2。Istiod
Istio 控制平面。它提供服务发现、配置和证书管理。它由以下子组件组成:
Pilot - 负责在运行时配置代理。
Citadel - 负责证书的颁发和轮换。
Galley - 负责在 Istio 中验证、摄取、聚合、转换和分发配置。
3。Operator
该组件提供用户友好的选项来操作 Istio 服务网格。
istio架构,istio官网提供

5.5.3.1 安装istio(从私仓安装)

https://istio.io/latest/zh/docs/setup/getting-started/#download

下载 Istio 发行版

# curl -L https://istio.io/downloadIstio | sh -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   101  100   101    0     0    168      0 --:--:-- --:--:-- --:--:--   168
100  4541  100  4541    0     0   3699      0  0:00:01  0:00:01 --:--:--  9914

Downloading istio-1.17.2 from https://github.com/istio/istio/releases/download/1.17.2/istio-1.17.2-linux-amd64.tar.gz ...

Istio 1.17.2 Download Complete!

Istio has been successfully downloaded into the istio-1.17.2 folder on your system.

Next Steps:
See https://istio.io/latest/docs/setup/install/ to add Istio to your Kubernetes cluster.

To configure the istioctl client tool for your workstation,
add the /root/tp/istio-1.17.2/bin directory to your environment path variable with:
         export PATH="$PATH:/root/tp/istio-1.17.2/bin"

Begin the Istio pre-installation check by running:
         istioctl x precheck 

Need more information? Visit https://istio.io/latest/docs/setup/install/ 

# cp -fr bin/istioctl /sbin/

安装清单修改(从私仓安装)

导出清单
# istioctl manifest generate --set profile=demo --set components.cni.enabled=true > istio.yaml

查看配置信息中的镜像
# cat istio.yaml | grep image:
              image:
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy_init.image }}"
            image: "{{ .ProxyImage }}"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy_init.image }}"
            image: "{{ .ProxyImage }}"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "{{ .ProxyImage }}"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "{{ .ProxyImage }}"
              image: busybox:1.28
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "{{ .ProxyImage }}"
          image: "docker.io/istio/install-cni:1.17.2"
        image: docker.io/istio/proxyv2:1.17.2
        image: docker.io/istio/proxyv2:1.17.2
        image: docker.io/istio/pilot:1.17.2

将上面的镜像转存至私仓,其中“{{ .ProxyImage }}”为边车注入镜像。
修改配置文件,如下
sed -i 's/{{ .ProxyImage }}/harbor.demo.com\/istio\/proxyv2:1.17.2/g' istio.yaml
sed -i 's/busybox:1.28/harbor.demo.com\/istio\/busybox:1.28/g' istio.yaml
sed -i 's/image: docker.io\/istio/image: harbor.demo.com\/istio/g' istio.yaml
sed -i 's/docker.io\/istio\/install-cni:1.17.2/harbor.demo.com\/istio\/install-cni:1.17.2/g' istio.yaml
查看
# cat istio.yaml | grep image:
              image:
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy_init.image }}"
            image: "harbor.demo.com/istio/proxyv2:1.17.2"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy_init.image }}"
            image: "harbor.demo.com/istio/proxyv2:1.17.2"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "harbor.demo.com/istio/proxyv2:1.17.2"
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "harbor.demo.com/istio/proxyv2:1.17.2"
              image: harbor.demo.com/istio/busybox:1.28
            image: "{{ annotation .ObjectMeta `sidecar.istio.io/proxyImage` .Values.global.proxy.image }}"
            image: "harbor.demo.com/istio/proxyv2:1.17.2"
          image: "harbor.demo.com/istio/install-cni:1.17.2"
        image: harbor.demo.com/istio/proxyv2:1.17.2
        image: harbor.demo.com/istio/proxyv2:1.17.2
        image: harbor.demo.com/istio/pilot:1.17.2

安装

# kubectl create namespace istio-system    # 创建istio-system命名空间
# kubectl apply -f istio.yaml

验证安装是否成功
# istioctl verify-install -f istio.yaml  -n istio-system

查看

# kubectl -n istio-system get pod
# kubectl -n istio-system get deploy,ds
# kubectl -n istio-system get svc
5.5.3.2 边车代理注入

https://istio.io/latest/zh/docs/setup/additional-setup/sidecar-injection/
Pod 中注入 Istio Sidecar 的两种方法:使用 istioctl 手动注入或启用 Pod 所属命名空间的 Istio sidecar 注入器自动注入。当 Pod 所属命名空间启用自动注入后,自动注入器会使用准入控制器在创建 Pod 时自动注入代理配置。
自动注入配置

# kubectl label namespace test istio-injection=enabled  --overwrite=true
namespace/test labeled

检查默认策略
在 istio-sidecar-injector configmap 中检查默认注入策略。
# kubectl -n istio-system get configmap istio-sidecar-injector -o jsonpath='{.data.config}' | grep policy:
policy: enabled

查看
# kubectl get namespace -L istio-injection
NAME                   STATUS   AGE    ISTIO-INJECTION
default                Active   24d    
istio-system           Active   166m   
kube-node-lease        Active   24d    
kube-public            Active   24d    
kube-system            Active   24d    
kubernetes-dashboard   Active   22d    
metallb-system         Active   5d5h   
test                   Active   16d    enabled

解除自动注入sidecar
# kubectl label namespace test istio-injection=disabled --overwrite=true
namespace/test labeled
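
除按命名空间自动注入外,也可用istioctl对单个清单手动注入后再部署(示意,以下文的1pod.yaml为例):

# istioctl kube-inject -f 1pod.yaml | kubectl apply -f -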

测试

# cat 1pod.yaml 
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
# kubectl apply -f 1pod.yaml

查看

# kubectl describe pod/app-test-6474687d88-2qthn 
...
Init Containers:
  istio-validation:
    Container ID:  cri-o://655a8fde38216d8bb183c8b45e533d924fa74106f3e5fb222d39cfe41f0215bf
    Image:         harbor.demo.com/istio/proxyv2:1.17.2
    ...
Containers:
  http:
    Container ID:   cri-o://0c2142ef3680f6c530b5b848ecf5e2ede8312c5e42ff21754031d4435284fde8
    Image:          harbor.demo.com/test/centos:v0.1.1
    ...
  istio-proxy:
    Container ID:  cri-o://a46c0a2deeeff68b8ca5b696903c1d6942b9949317ba99ccc3b251fbb0e6f203
    Image:         harbor.demo.com/istio/proxyv2:1.17.2

pod模板中只定义了一个业务镜像,但部署后Pod中被额外注入了init容器istio-validation和边车容器istio-proxy,且均从私仓中拉取。

注入istio-proxy后,pod流量将被代理。

5.5.3.3 http测试

只简单http/https测试,流量治理测试请查看5.7部分。

查看istio入口地址

# kubectl -n istio-system get svc
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                                                      AGE
istio-egressgateway    ClusterIP      10.10.196.10    <none>          80/TCP,443/TCP                                                               137m
istio-ingressgateway   LoadBalancer   10.10.217.255   192.168.3.181   15021:30721/TCP,80:30406/TCP,443:31152/TCP,31400:30352/TCP,15443:30154/TCP   137m
istiod                 ClusterIP      10.10.79.63     <none>          15010/TCP,15012/TCP,443/TCP,15014/TCP                                        137m

域名解析到istio入口地址

# nslookup -q www.test.com 192.168.3.250
*** Invalid option: q
Server:         192.168.3.250
Address:        192.168.3.250#53

Name:   www.test.com
Address: 192.168.3.181

创建入口网关Gateway

# cat gateway-http.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-test-getway
  namespace: test
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - www.demo.com
    - www.test.com

创建VirtualService

# cat virsr.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-virsr
  namespace: test
spec:
  hosts:
  - www.test.com
  - www.demo.com
  gateways:
  - app-test-getway
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test
        port:
          number: 7071

创建pod及svc

# cat deply_pod_svc_centos_1pod.yaml 
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      containers:
      - name: http
        image: harbor.demo.com/test/centos:v0.1.1
        imagePullPolicy: Always
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP

---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test
  name: app-test
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test
  ports:
    - name: port01
      port: 7071
      targetPort: 8080
      protocol: TCP
  type: ClusterIP

配置应用

# kubectl apply -f gateway-http.yaml -f virsr.yaml -f deply_pod_svc_centos_1pod.yaml

测试效果

# curl -XPOST http://www.test.com/test
POST    Hello,world-v10
hello,method is post

查看istio gateway日志

# kubectl -n istio-system get pod
NAME                                    READY   STATUS    RESTARTS   AGE
istio-cni-node-25vdk                    1/1     Running   0          155m
istio-cni-node-74txt                    1/1     Running   0          155m
istio-cni-node-pbmhn                    1/1     Running   0          155m
istio-egressgateway-bfc9d88d8-7p74f     1/1     Running   0          155m
istio-ingressgateway-775955bfb4-252gt   1/1     Running   0          155m
istiod-555d5d64fb-rnxx2                 1/1     Running   0          155m

# kubectl -n istio-system logs pod/istio-ingressgateway-775955bfb4-252gt -f
...
[2023-06-07T09:17:52.962Z] "GET /test HTTP/1.1" 200 - via_upstream - "-" 0 40 3 2 "10.244.36.65" "curl/7.76.1" "e607b10f-dcb6-969e-a41f-758b97572b85" "www.test.com" "10.244.169.141:8080" outbound|7071||app-test.test.svc.cluster.local 10.244.169.133:37806 10.244.169.133:8080 10.244.36.65:34632 - -

查看pod的proxy日志

# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
app-test-6fb979b794-p5h6d   2/2     Running   0          62m

# kubectl logs pod/app-test-6fb979b794-p5h6d -c istio-proxy -f
...
[2023-06-07T09:17:52.963Z] "GET /test HTTP/1.1" 200 - via_upstream - "-" 0 40 1 0 "10.244.36.65" "curl/7.76.1" "e607b10f-dcb6-969e-a41f-758b97572b85" "www.test.com" "10.244.169.141:8080" inbound|8080|| 127.0.0.6:47637 10.244.169.141:8080 10.244.36.65:0 outbound_.7071_._.app-test.test.svc.cluster.local default
5.5.3.4 https测试
# kubectl -n istio-system create secret tls web-ssl --cert=./web.pem --key=./web-key.pem
存储证书的secret必须位于istio-system命名空间,否则ingress gateway无法找到该证书。
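
上面secret所用的web.pem/web-key.pem一般由CA签发;若demo环境暂无证书,也可先用openssl生成一张自签名证书代替测试(示意,域名与有效期均为假设,需OpenSSL 1.1.1及以上支持-addext):

# openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -keyout web-key.pem -out web.pem \
    -subj "/CN=www.test.com" \
    -addext "subjectAltName=DNS:www.test.com,DNS:www.demo.com"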

# cat gateway-https.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-test-getway
  namespace: test
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - www.demo.com
    - www.test.com
    tls:
      httpsRedirect: true # sends 301 redirect for http requests
  - port:
      number: 443
      name: https-443
      protocol: HTTPS
    hosts:
    - www.test.com
    - www.demo.com
    tls:
      mode: SIMPLE
      credentialName: web-ssl

# kubectl apply -f gateway-https.yaml

测试

# curl -L https://www.test.com/test
GET     Hello,world-v10
hello,method is get
# curl -L http://www.test.com/test
GET     Hello,world-v10
hello,method is get
5.5.3.5 istio dashboard

Istio 和几个遥测应用(如下)做了集成。 遥测能帮您了解服务网格的结构、展示网络的拓扑结构、分析网格的健康状态。

# tree samples/addons/
samples/addons/
├── extras
│   ├── prometheus-operator.yaml
│   ├── prometheus_vm_tls.yaml
│   ├── prometheus_vm.yaml
│   ├── skywalking.yaml
│   └── zipkin.yaml
├── grafana.yaml
├── jaeger.yaml
├── kiali.yaml
├── prometheus.yaml
└── README.md
组件类别组件说明
服务拓扑结构显示类Kiali,会调用jaeger/Prometheus等,建议先安装jaeger/Prometheus
监控类Prometheus:数据采集
Grafana:可视化
服务追踪类Zipkin:由Twitter开发,基于语言的探针
jaeger:由Uber推出的一款开源分布式追踪系统,兼容OpenTracing API
skywalking:基于语言的探针

提示:
istio集成的Prometheus/Grafana版本低于其官方版本,建议采用官方版本

准备
1。三个域名(kiali.test.com、prometheus-istio.test.com、jaeger.test.com),解析指向istio-ingressgateway的入口地址。
2。将所需应用的镜像转存私仓。
3。创建gw和vs,用于暴露服务。

# cat istio-dashboard.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: istio-ui-gw
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - kiali.test.com
    - prometheus-istio.test.com
    - jaeger.test.com

---

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: istio-ui-kiali
  namespace: istio-system
spec:
  hosts:
  - kiali.test.com
  gateways:
  - istio-ui-gw
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: kiali
        port:
          number: 20001
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: istio-ui-prometheus
  namespace: istio-system
spec:
  hosts:
  - prometheus-istio.test.com
  gateways:
  - istio-ui-gw
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: prometheus
        port:
          number: 9090

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: istio-ui-jaeger
  namespace: istio-system
spec:
  hosts:
  - jaeger.test.com
  gateways:
  - istio-ui-gw
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: tracing
        port:
          number: 80

安装

kubectl apply -f istio-dashboard.yaml
kubectl apply -f kiali.yaml
kubectl apply -f prometheus.yaml
kubectl apply -f jaeger.yaml
kubectl -n istio-system get svc

测试

# for i in $(seq 1 100);do curl http://www.test.com/test;done

查看
http://kiali.test.com
在这里插入图片描述
http://jaeger.test.com
在这里插入图片描述

5.6 存储持久化

https://kubernetes.io/docs/concepts/storage/volumes

k8s的存储方案有很多种,如ceph、glusterfs、cinder、nfs等,均可通过CSI(Container Storage Interface,容器存储接口)协议接入。
本文以ceph为主。
k8s存储持久化,涉及较多概念,如:
在这里插入图片描述

5.6.1 准备

下载cephcsi

# git clone https://github.com/ceph/ceph-csi.git

# tree ceph-csi
ceph-csi/
├── actions
├── api
├── assets
├── build.env
├── charts
├── cmd
├── deploy
│   ├── ceph-conf.yaml
│   ├── cephcsi
│   ├── cephfs
│   ├── csi-config-map-sample.yaml
│   ├── Makefile
│   ├── nfs
│   ├── rbd
│   ├── scc.yaml
│   └── service-monitor.yaml
├── deploy.sh
├── docs
├── e2e
├── examples
│   ├── ceph-conf.yaml
│   ├── cephfs
│   ├── csi-config-map.yaml
│   ├── csi-kms-config-map.yaml
│   ├── kms
│   ├── nfs
│   ├── rbd
│   └── README.md
├── go.mod
├── go.sum
├── internal
├── LICENSE
├── Makefile
├── README.md
├── scripts
├── tools
├── troubleshooting
└── vendor

转存镜像到本地私仓

# find ceph-csi/deploy/{cephfs,rbd} | xargs grep image:  > image.txt
# cat image.txt | cut -d' ' -f12 | sort | uniq
quay.io/cephcsi/cephcsi:canary
registry.k8s.io/sig-storage/csi-attacher:v4.3.0
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
registry.k8s.io/sig-storage/csi-resizer:v1.8.0
registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2

可以从国内镜像源拉取

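转存可参考如下脚本示意(假设已docker login到harbor.demo.com,且私仓中已创建名为ceph的项目;若registry.k8s.io无法直连,可先把镜像地址替换为国内镜像源再pull):

for img in $(cat image.txt | cut -d' ' -f12 | sort | uniq); do
  docker pull $img
  new=harbor.demo.com/ceph/${img##*/}     # 只保留"镜像名:tag",仓库前缀替换为私仓项目
  docker tag $img $new
  docker push $new
done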

转存后记得更改如下清单文件中的镜像拉取地址为私仓。

# cat image.txt
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-resizer:v1.8.0
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml:          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-snapshotter:v6.2.2
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-attacher:v4.3.0
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: registry.k8s.io/sig-storage/csi-resizer:v1.8.0
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin.yaml:          image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.8.0
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin.yaml:          image: quay.io/cephcsi/cephcsi:canary
ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin.yaml:          image: quay.io/cephcsi/cephcsi:canary

可采用如下脚本更改,例如私仓地址为harbor.demo.com/ceph.

sed -i 's/image: registry.k8s.io\/sig-storage/image: harbor.demo.com\/ceph/g'	ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml
sed -i 's/image: registry.k8s.io\/sig-storage/image: harbor.demo.com\/ceph/g'	ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
sed -i 's/image: quay.io\/cephcsi/image: harbor.demo.com\/ceph/g'		ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin-provisioner.yaml
sed -i 's/image: quay.io\/cephcsi/image: harbor.demo.com\/ceph/g'		ceph-csi/deploy/cephfs/kubernetes/csi-cephfsplugin.yaml
sed -i 's/image: registry.k8s.io\/sig-storage/image: harbor.demo.com\/ceph/g'	ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
sed -i 's/image: registry.k8s.io\/sig-storage/image: harbor.demo.com\/ceph/g'	ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin.yaml
sed -i 's/image: quay.io\/cephcsi/image: harbor.demo.com\/ceph/g'		ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin-provisioner.yaml
sed -i 's/image: quay.io\/cephcsi/image: harbor.demo.com\/ceph/g'		ceph-csi/deploy/rbd/kubernetes/csi-rbdplugin.yaml

创建命名空间,并配置为当前上下文的默认命名空间。

kubectl create namespace ceph-csi
kubectl config set-context kubernetes-admin@kubernetes --namespace='ceph-csi'

ceph-csi/deploy/cephfs/kubernetes中的csi-nodeplugin-rbac.yaml和csi-provisioner-rbac.yaml文件里,部分k8s资源在定义时未指定命名空间,此时会采用当前默认的命名空间。
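
可用如下命令确认当前上下文的默认命名空间已切换为ceph-csi(示例):

# kubectl config get-contexts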

配置csi认证方式

# cat ceph-csi/deploy/ceph-conf.yaml 
---
# This is a sample configmap that helps define a Ceph configuration as required
# by the CSI plugins.

# Sample ceph.conf available at
# https://github.com/ceph/ceph/blob/master/src/sample.ceph.conf Detailed
# documentation is available at
# https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/
apiVersion: v1
kind: ConfigMap
data:
  ceph.conf: |
    [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx

  # keyring is a required key and its value should be empty
  keyring: |
metadata:
  name: ceph-config

# kubectl apply -f ceph-csi/deploy/ceph-conf.yaml

配置csi-config-map

# ceph mon dump
epoch 2
fsid 9b7095ab-5193-420c-b2fb-2d343c57ef52
last_changed 2023-06-04T09:09:31.753367+0800
created 2023-06-04T01:27:47.063896+0800
min_mon_release 17 (quincy)
election_strategy: 1
0: [v2:10.2.20.90:3300/0,v1:10.2.20.90:6789/0] mon.mon1
dumped monmap epoch 2

cat > ceph-csi/deploy/csi-config-map.yaml << 'EOF'
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "9b7095ab-5193-420c-b2fb-2d343c57ef52",
        "monitors": [
          "10.2.20.90:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
EOF

# kubectl apply -f ceph-csi/deploy/csi-config-map.yaml

配置CSI-KMS-config-map

ceph-csi还需要一个额外的ConfigMap对象来定义密钥管理服务(KMS)提供程序的详细信息。
如果未设置KMS,按以下内容配置csi-kms-config-map.yaml即可,或参考 https://github.com/ceph/ceph-csi/tree/master/examples/kms 中的示例。

# cat <<EOF > ceph-csi/deploy/csi-kms-config-map.yaml
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    {}
metadata:
  name: ceph-csi-encryption-kms-config
EOF

# kubectl apply -f ceph-csi/deploy/csi-kms-config-map.yaml

5.6.2 ceph-fs csi配置

5.6.2.1 安装ceph-fs csi

进入cephfs插件目录

# cd ceph-csi/deploy/cephfs/kubernetes
# tree .
.
├── csi-cephfsplugin-provisioner.yaml
├── csi-cephfsplugin.yaml
├── csi-config-map.yaml
├── csidriver.yaml
├── csi-nodeplugin-rbac.yaml
└── csi-provisioner-rbac.yaml

更改清单文件中的命名空间名称

# NAMESPACE=ceph-csi
# sed -r -i "s/namespace: [^ ]+/namespace: $NAMESPACE/g" ./*.yaml

安装

kubectl apply -f ./csidriver.yaml
kubectl apply -f ./csi-provisioner-rbac.yaml
kubectl apply -f ./csi-nodeplugin-rbac.yaml
kubectl apply -f ./csi-cephfsplugin-provisioner.yaml
kubectl apply -f ./csi-cephfsplugin.yaml

查看

# kubectl get pod
NAME                                            READY   STATUS    RESTARTS   AGE
csi-cephfsplugin-5bwgs                          3/3     Running   0          82s
csi-cephfsplugin-provisioner-64b57b7f4c-d4hr7   5/5     Running   0          83s
csi-cephfsplugin-provisioner-64b57b7f4c-hwcd4   0/5     Pending   0          82s  # Pending是由于master节点默认不调度普通pod所致
csi-cephfsplugin-provisioner-64b57b7f4c-x6qtb   5/5     Running   0          82s
csi-cephfsplugin-q5wks                          3/3     Running   0          82s
5.6.2.2 sc存储类配置

提示:
sc/pvc对象的命名空间与用户pod相同,但不必与ceph-csi的pod在同一个命名空间。本例中:ceph-csi在ceph-csi命名空间,而Secret/sc/pv/pvc在test命名空间。

修改上下文的默认命名空间

# kubectl config set-context kubernetes-admin@kubernetes --namespace='test'

创建fs卷

ceph osd pool create k8s-metadata 4 4
ceph osd pool create k8s-data 4 4
ceph fs new k8s-fs k8s-metadata k8s-data

# ceph fs ls
name: k8s-fs, metadata pool: k8s-metadata, data pools: [k8s-data ]

重要提示:
在建立fs卷前,一定要先查看mds进程的个数和已有fs卷的个数,只有"fs卷个数 < mds进程个数"时才能创建新卷。若不能创建,可使用已存在的fs卷,或删除某个卷后再建新卷。
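
可先用如下命令确认mds进程数与已有fs卷数(示例):

# ceph mds stat    # 查看mds进程及状态
# ceph fs ls       # 列出已有的fs卷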

配置csi-fs-secret
csi会通过pod访问ceph集群,使用client.admin

查看管理员密码
# ceph auth get client.admin
[client.admin]
        key = AQCwXntkCw+CGBAA/mdug0WT2jYDAFEN8tATOA==
        caps mds = "allow *"
        caps mgr = "allow *"
        caps mon = "allow *"
        caps osd = "allow *"

创建访问fs卷的用户
# ceph auth get-or-create client.k8sfs mon "allow * fsname=k8s-fs" mds "allow * fsname=k8s-fs" osd "allow * tag cephfs data=k8s-fs"
[client.k8sfs]
        key = AQCX94FkmzL0KRAA3t/G7S7Qn631V9YeUYMuWQ==

# ceph auth get client.k8sfs
[client.k8sfs]
        key = AQCX94FkmzL0KRAA3t/G7S7Qn631V9YeUYMuWQ==
        caps mds = "allow * fsname=k8s-fs"
        caps mon = "allow * fsname=k8s-fs"
        caps osd = "allow * tag cephfs data=k8s-fs"

配置secret
cat > ./csi-fs-secret.yaml << 'EOF'
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: test
stringData:
  # Required for statically provisioned volumes
  userID: k8sfs
  userKey: AQCX94FkmzL0KRAA3t/G7S7Qn631V9YeUYMuWQ==

  # Required for dynamically provisioned volumes
  adminID: admin
  adminKey: AQCwXntkCw+CGBAA/mdug0WT2jYDAFEN8tATOA==
EOF

# kubectl apply -f csi-fs-secret.yaml
此csi-cephfs-secret供sc存储类使用。

配置ceph-fs存储类

# cat storageclass.yaml 
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
  namespace: test
provisioner: cephfs.csi.ceph.com
parameters:
  # (required) String representing a Ceph cluster to provision storage from.
  # Should be unique across all Ceph clusters in use for provisioning,
  # cannot be greater than 36 bytes in length, and should remain immutable for
  # the lifetime of the StorageClass in use.
  # Ensure to create an entry in the configmap named ceph-csi-config, based on
  # csi-config-map-sample.yaml, to accompany the string chosen to
  # represent the Ceph cluster in clusterID below
  clusterID: 9b7095ab-5193-420c-b2fb-2d343c57ef52

  # (required) CephFS filesystem name into which the volume shall be created
  # eg: fsName: myfs
  fsName: k8s-fs

  # (optional) Ceph pool into which volume data shall be stored
  # pool: <cephfs-data-pool>

  # (optional) Comma separated string of Ceph-fuse mount options.
  # For eg:
  # fuseMountOptions: debug

  # (optional) Comma separated string of Cephfs kernel mount options.
  # Check man mount.ceph for mount options. For eg:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes

  # The secrets have to contain user and/or Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: test
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: test
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: test

  # (optional) The driver can use either ceph-fuse (fuse) or
  # ceph kernelclient (kernel).
  # If omitted, default volume mounter will be used - this is
  # determined by probing for ceph-fuse and mount.ceph
  # mounter: kernel

  # (optional) Prefix to use for naming subvolumes.
  # If omitted, defaults to "csi-vol-".
  # volumeNamePrefix: "foo-bar-"

  # (optional) Boolean value. The PVC shall be backed by the CephFS snapshot
  # specified in its data source. `pool` parameter must not be specified.
  # (defaults to `true`)
  # backingSnapshot: "false"

  # (optional) Instruct the plugin it has to encrypt the volume
  # By default it is disabled. Valid values are "true" or "false".
  # A string is expected here, i.e. "true", not true.
  # encrypted: "true"

  # (optional) Use external key management system for encryption passphrases by
  # specifying a unique ID matching KMS ConfigMap. The ID is only used for
  # correlation to configmap entry.
  # encryptionKMSID: <kms-config-id>


reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug

# kubectl apply -f storageclass.yaml
# kubectl get sc -n test
NAME            PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-cephfs-sc   cephfs.csi.ceph.com   Delete          Immediate           true                   72s
5.6.2.3 无状态部署测试

配置PVC

# cat pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
  namespace: test
spec:
  storageClassName: csi-cephfs-sc
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 2Gi

# kubectl apply -f pvc.yaml 
# kubectl get sc,pv,pvc -n test
NAME                                        PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/csi-cephfs-sc   cephfs.csi.ceph.com   Delete          Immediate           true                   2m52s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS    REASON   AGE
persistentvolume/pvc-2c474ef8-d076-4d64-8cf4-75c47d868d67   2Gi        RWX            Delete           Bound    test/csi-cephfs-pvc   csi-cephfs-sc            9s

NAME                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
persistentvolumeclaim/csi-cephfs-pvc   Bound    pvc-2c474ef8-d076-4d64-8cf4-75c47d868d67   2Gi        RWX            csi-cephfs-sc   10s

pod模板配置

# cat test-pod-1.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      volumes:
      - name: mypvc-cephfs
        persistentVolumeClaim:
          claimName: csi-cephfs-pvc
          readOnly: false
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: mypvc-cephfs
          mountPath: /home/cephfs
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP

# kubectl apply -f test-pod-1.yaml          

查看

# kubectl get pod -n test
NAME                        READY   STATUS    RESTARTS   AGE
app-test-7fc767985c-cbc8k   2/2     Running   0          10s
# kubectl -n test exec po/app-test-7fc767985c-cbc8k -- df -Th
Filesystem           Type            Size      Used Available Use% Mounted on
overlay              overlay        27.2G      6.7G     20.5G  25% /
tmpfs                tmpfs          64.0M         0     64.0M   0% /dev
shm                  tmpfs          64.0M         0     64.0M   0% /dev/shm
tmpfs                tmpfs         730.8M     10.6M    720.2M   1% /etc/resolv.conf
tmpfs                tmpfs         730.8M     10.6M    720.2M   1% /etc/hostname
tmpfs                tmpfs         730.8M     10.6M    720.2M   1% /run/.containerenv
10.2.20.90:6789:/volumes/csi/csi-vol-11adc09a-f118-44ee-81ca-3b80427cbcb6/a84f61f3-d0a1-4952-925b-987bc6dbe401     # 通过csi-cephfs-sc动态供给的PV
                     ceph            2.0G         0      2.0G   0% /home/cephfs
/dev/mapper/rl-root  xfs            27.2G      6.7G     20.5G  25% /etc/hosts
/dev/mapper/rl-root  xfs            27.2G      6.7G     20.5G  25% /dev/termination-log
tmpfs                tmpfs           3.5G     12.0K      3.5G   0% /var/run/secrets/kubernetes.io/serviceaccount
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/kcore
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/keys
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/timer_list

若pod多副本,它们之间是共享存储。
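
可按如下方式简单验证多副本间共享存储(示意,pod名以实际环境为准):

# kubectl -n test scale deployment app-test --replicas=2
# kubectl -n test get pod -o name
# kubectl -n test exec app-test-7fc767985c-cbc8k -c http -- sh -c 'echo hello-cephfs > /home/cephfs/share.txt'
# kubectl -n test exec <另一个副本的pod名> -c http -- cat /home/cephfs/share.txt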

5.6.2.4 有状态部署测试

采用卷模板为每一个pod副本提供一个pvc.

# cat test-pod-2.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 3
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: test-sf
          mountPath: /home/cephfs
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
  volumeClaimTemplates:               
  - metadata:
      name: test-sf
    spec:
      accessModes: [ "ReadWriteMany" ]
      storageClassName: csi-cephfs-sc
      resources:
        requests:
          storage: 1Gi 

# kubectl apply -f test-pod-2.yaml           

查看

# kubectl -n test get sc,pv,pvc
NAME                                        PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/csi-cephfs-sc   cephfs.csi.ceph.com   Delete          Immediate           true                   10h

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS    REASON   AGE
persistentvolume/pvc-34d03e58-bd10-4a94-b427-83cd1de389b2   1Gi        RWX            Delete           Bound    test/test-sf-app-test-2   csi-cephfs-sc            9h
persistentvolume/pvc-7a85a6f6-c48c-4474-a498-eaeb802dd331   1Gi        RWX            Delete           Bound    test/test-sf-app-test-1   csi-cephfs-sc            9h
persistentvolume/pvc-f276331a-416f-45af-82eb-c008cf57839a   1Gi        RWX            Delete           Bound    test/test-sf-app-test-0   csi-cephfs-sc            9h

NAME                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
persistentvolumeclaim/test-sf-app-test-0   Bound    pvc-f276331a-416f-45af-82eb-c008cf57839a   1Gi        RWX            csi-cephfs-sc   9h
persistentvolumeclaim/test-sf-app-test-1   Bound    pvc-7a85a6f6-c48c-4474-a498-eaeb802dd331   1Gi        RWX            csi-cephfs-sc   9h
persistentvolumeclaim/test-sf-app-test-2   Bound    pvc-34d03e58-bd10-4a94-b427-83cd1de389b2   1Gi        RWX            csi-cephfs-sc   9h
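
也可在ceph侧确认每个pvc对应的cephfs子卷(示意,子卷组名csi为ceph-csi的默认值):

# ceph fs subvolumegroup ls k8s-fs
# ceph fs subvolume ls k8s-fs csi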

5.6.3 ceph-rbd csi配置

5.6.3.1 安装ceph-rbd csi
进入rbd插件目录
# cd ceph-csi/deploy/rbd/kubernetes
# tree .
.
├── csi-config-map.yaml
├── csidriver.yaml
├── csi-nodeplugin-rbac.yaml
├── csi-provisioner-rbac.yaml
├── csi-rbdplugin-provisioner.yaml
└── csi-rbdplugin.yaml


更改清单文件中的命名空间名称
# NAMESPACE=ceph-csi
# sed -r -i "s/namespace: [^ ]+/namespace: $NAMESPACE/g" ./*.yaml
# kubectl config set-context kubernetes-admin@kubernetes --namespace='ceph-csi'


安装(安装前需将镜像转存私仓)
kubectl apply -f ./csi-provisioner-rbac.yaml
kubectl apply -f ./csi-nodeplugin-rbac.yaml
kubectl apply -f ./csi-rbdplugin-provisioner.yaml
kubectl apply -f ./csi-rbdplugin.yaml

查看

# kubectl -n ceph-csi get pod
NAME                                            READY   STATUS    RESTARTS   AGE
csi-rbdplugin-provisioner-7b657bbc4-bq9lv       0/7     Pending   0          3h9m
csi-rbdplugin-provisioner-7b657bbc4-rrtr5       7/7     Running   0          3h9m
csi-rbdplugin-provisioner-7b657bbc4-sbbkr       7/7     Running   0          3h9m
csi-rbdplugin-pw76p                             3/3     Running   0          3h9m
csi-rbdplugin-tm8st                             3/3     Running   0          3h9m
5.6.3.2 sc存储类配置

提示:
sc/pvc对象的命名空间与用户pod相同,但不必与ceph-csi的pod在同一个命名空间。本例中:ceph-csi在ceph-csi命名空间,而Secret/sc/pv/pvc在test命名空间。

修改上下文的默认命名空间

# kubectl config set-context kubernetes-admin@kubernetes --namespace='test'

创建k8s的rbd池

# ceph osd pool create k8s-pool 8 8
# rbd pool init k8s-pool

配置csi-rbd-secret

创建访问pool的用户
# ceph auth get-or-create client.k8srbd mon 'profile rbd' mgr 'profile rbd pool=k8s-pool'  osd 'profile rbd pool=k8s-pool' 
[client.k8srbd]
        key = AQDZ44Fk2SGfJRAAGOQnJFAGzDlRw00r/s+s2A==
此值用于k8s中的secret建立。
# ceph auth get client.k8srbd
[client.k8srbd]
        key = AQDZ44Fk2SGfJRAAGOQnJFAGzDlRw00r/s+s2A==
        caps mgr = "profile rbd pool=k8s-poold"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=k8s-pool"

# cat > ./csi-rbd-secret.yaml << 'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: test
stringData:
  userID: k8srbd
  userKey: AQDZ44Fk2SGfJRAAGOQnJFAGzDlRw00r/s+s2A==
EOF

# kubectl apply -f csi-rbd-secret.yaml

配置ceph-rbd存储类

# cat storageclass.yaml 
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: csi-rbd-sc
   namespace: test
provisioner: rbd.csi.ceph.com
# If topology based provisioning is desired, delayed provisioning of
# PV is required and is enabled using the following attribute
# For further information read TODO<doc>
# volumeBindingMode: WaitForFirstConsumer
parameters:
   # (required) String representing a Ceph cluster to provision storage from.
   # Should be unique across all Ceph clusters in use for provisioning,
   # cannot be greater than 36 bytes in length, and should remain immutable for
   # the lifetime of the StorageClass in use.
   # Ensure to create an entry in the configmap named ceph-csi-config, based on
   # csi-config-map-sample.yaml, to accompany the string chosen to
   # represent the Ceph cluster in clusterID below
   clusterID: 9b7095ab-5193-420c-b2fb-2d343c57ef52

   # (optional) If you want to use erasure coded pool with RBD, you need to
   # create two pools. one erasure coded and one replicated.
   # You need to specify the replicated pool here in the `pool` parameter, it is
   # used for the metadata of the images.
   # The erasure coded pool must be set as the `dataPool` parameter below.
   # dataPool: <ec-data-pool>

   # (required) Ceph pool into which the RBD image shall be created
   # eg: pool: rbdpool
   pool: k8s-pool

   # (optional) RBD image features, CSI creates image with image-format 2 CSI
   # RBD currently supports `layering`, `journaling`, `exclusive-lock`,
   # `object-map`, `fast-diff`, `deep-flatten` features.
   # Refer https://docs.ceph.com/en/latest/rbd/rbd-config-ref/#image-features
   # for image feature dependencies.
   # imageFeatures: layering,journaling,exclusive-lock,object-map,fast-diff
   imageFeatures: "layering"

   # (optional) Options to pass to the `mkfs` command while creating the
   # filesystem on the RBD device. Check the man-page for the `mkfs` command
   # for the filesystem for more details. When `mkfsOptions` is set here, the
   # defaults will not be used, consider including them in this parameter.
   #
   # The default options depend on the csi.storage.k8s.io/fstype setting:
   # - ext4: "-m0 -Enodiscard,lazy_itable_init=1,lazy_journal_init=1"
   # - xfs: "-onouuid -K"
   #
   # mkfsOptions: "-m0 -Ediscard -i1024"

   # (optional) Specifies whether to try other mounters in case if the current
   # mounter fails to mount the rbd image for any reason. True means fallback
   # to next mounter, default is set to false.
   # Note: tryOtherMounters is currently useful to fallback from krbd to rbd-nbd
   # in case if any of the specified imageFeatures is not supported by krbd
   # driver on node scheduled for application pod launch, but in the future this
   # should work with any mounter type.
   # tryOtherMounters: false

   # (optional) mapOptions is a comma-separated list of map options.
   # For krbd options refer
   # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
   # For nbd options refer
   # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
   # Format:
   # mapOptions: "<mounter>:op1,op2;<mounter>:op1,op2"
   # An empty mounter field is treated as krbd type for compatibility.
   # eg:
   # mapOptions: "krbd:lock_on_read,queue_depth=1024;nbd:try-netlink"

   # (optional) unmapOptions is a comma-separated list of unmap options.
   # For krbd options refer
   # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
   # For nbd options refer
   # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
   # Format:
   # unmapOptions: "<mounter>:op1,op2;<mounter>:op1,op2"
   # An empty mounter field is treated as krbd type for compatibility.
   # eg:
   # unmapOptions: "krbd:force;nbd:force"

   # The secrets have to contain Ceph credentials with required access
   # to the 'pool'.
   csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
   csi.storage.k8s.io/provisioner-secret-namespace: test
   csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
   csi.storage.k8s.io/controller-expand-secret-namespace: test
   csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
   csi.storage.k8s.io/node-stage-secret-namespace: test

   # (optional) Specify the filesystem type of the volume. If not specified,
   # csi-provisioner will set default as `ext4`.
   csi.storage.k8s.io/fstype: ext4

   # (optional) uncomment the following to use rbd-nbd as mounter
   # on supported nodes
   # mounter: rbd-nbd

   # (optional) ceph client log location, eg: rbd-nbd
   # By default host-path /var/log/ceph of node is bind-mounted into
   # csi-rbdplugin pod at /var/log/ceph mount path. This is to configure
   # target bindmount path used inside container for ceph clients logging.
   # See docs/rbd-nbd.md for available configuration options.
   # cephLogDir: /var/log/ceph

   # (optional) ceph client log strategy
   # By default, log file belonging to a particular volume will be deleted
   # on unmap, but you can choose to just compress instead of deleting it
   # or even preserve the log file in text format as it is.
   # Available options `remove` or `compress` or `preserve`
   # cephLogStrategy: remove

   # (optional) Prefix to use for naming RBD images.
   # If omitted, defaults to "csi-vol-".
   # volumeNamePrefix: "foo-bar-"

   # (optional) Instruct the plugin it has to encrypt the volume
   # By default it is disabled. Valid values are "true" or "false".
   # A string is expected here, i.e. "true", not true.
   # encrypted: "true"

   # (optional) Select the encryption type when encrypted: "true" above.
   # Valid values are:
   #   "file": Enable file encryption on the mounted filesystem
   #   "block": Encrypt RBD block device
   # When unspecified assume type "block". "file" and "block" are
   # mutally exclusive.
   # encryptionType: "block"

   # (optional) Use external key management system for encryption passphrases by
   # specifying a unique ID matching KMS ConfigMap. The ID is only used for
   # correlation to configmap entry.
   # encryptionKMSID: <kms-config-id>

   # Add topology constrained pools configuration, if topology based pools
   # are setup, and topology constrained provisioning is required.
   # For further information read TODO<doc>
   # topologyConstrainedPools: |
   #   [{"poolName":"pool0",
   #     "dataPool":"ec-pool0" # optional, erasure-coded pool for data
   #     "domainSegments":[
   #       {"domainLabel":"region","value":"east"},
   #       {"domainLabel":"zone","value":"zone1"}]},
   #    {"poolName":"pool1",
   #     "dataPool":"ec-pool1" # optional, erasure-coded pool for data
   #     "domainSegments":[
   #       {"domainLabel":"region","value":"east"},
   #       {"domainLabel":"zone","value":"zone2"}]},
   #    {"poolName":"pool2",
   #     "dataPool":"ec-pool2" # optional, erasure-coded pool for data
   #     "domainSegments":[
   #       {"domainLabel":"region","value":"west"},
   #       {"domainLabel":"zone","value":"zone1"}]}
   #   ]

   # Image striping, Refer https://docs.ceph.com/en/latest/man/8/rbd/#striping
   # For more details
   # (optional) stripe unit in bytes.
   # stripeUnit: <>
   # (optional) objects to stripe over before looping.
   # stripeCount: <>
   # (optional) The object size in bytes.
   # objectSize: <>
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
   - discard

# kubectl apply -f storageclass.yaml
5.6.3.3 无状态部署测试

PVC配置

# cat test-pvc-block-fs.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc-rbd-fs
  namespace: test
spec:
  storageClassName: csi-rbd-sc
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 2Gi
      
# kubectl apply -f test-pvc-block-fs.yaml

查看

# kubectl get sc,pv,pvc
NAME                                        PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/csi-cephfs-sc   cephfs.csi.ceph.com   Delete          Immediate           true                   11h
storageclass.storage.k8s.io/csi-rbd-sc      rbd.csi.ceph.com      Delete          Immediate           true                   82m

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                   STORAGECLASS   REASON   AGE
persistentvolume/pvc-6d68f26c-28e5-45f2-9a3f-f9ff77769684   2Gi        RWO            Delete           Bound      test/ceph-pvc-rbd-fs    csi-rbd-sc              39s
persistentvolume/pvc-89aa9a09-1aa1-42f7-b8b0-58061953d681   3Gi        RWO            Delete           Released   test/ceph-pvc-rbd-raw   csi-rbd-sc              37m

NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/ceph-pvc-rbd-fs   Bound    pvc-6d68f26c-28e5-45f2-9a3f-f9ff77769684   2Gi        RWO            csi-rbd-sc     40s

pod模板

# cat test-pod-fs.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 1
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      volumes:
      - name: ceph-rbd-fs
        persistentVolumeClaim:
          claimName: ceph-pvc-rbd-fs
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: ceph-rbd-fs
          mountPath: /home/ceph
          readOnly: false

# kubectl apply -f test-pod-fs.yaml

查看

# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
app-test-68b7fccb49-g2mvz   1/1     Running   0          46s

# kubectl exec pod/app-test-68b7fccb49-g2mvz -- df -Th
Filesystem           Type            Size      Used Available Use% Mounted on
overlay              overlay        27.2G      6.7G     20.5G  25% /
tmpfs                tmpfs          64.0M         0     64.0M   0% /dev
shm                  tmpfs          64.0M         0     64.0M   0% /dev/shm
tmpfs                tmpfs         730.8M     11.3M    719.5M   2% /etc/resolv.conf
tmpfs                tmpfs         730.8M     11.3M    719.5M   2% /etc/hostname
tmpfs                tmpfs         730.8M     11.3M    719.5M   2% /run/.containerenv
/dev/rbd0            ext4            1.9G     28.0K      1.9G   0% /home/ceph     # rbd卷挂载成功
/dev/mapper/rl-root  xfs            27.2G      6.7G     20.5G  25% /etc/hosts
/dev/mapper/rl-root  xfs            27.2G      6.7G     20.5G  25% /dev/termination-log
tmpfs                tmpfs           3.5G     12.0K      3.5G   0% /var/run/secrets/kubernetes.io/serviceaccount
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/kcore
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/keys
devtmpfs             devtmpfs        4.0M         0      4.0M   0% /proc/timer_list
5.6.3.4 有状态部署测试

pod模板

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 3
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: test-sf
          mountPath: /home/cephfs
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
  volumeClaimTemplates:               
  - metadata:
      name: test-sf
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: csi-rbd-sc
      resources:
        requests:
          storage: 1Gi            

应用上述模板后,查看pvc

# kubectl get pvc
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-sf-app-test-0   Bound    pvc-ef0e2be8-6192-4e27-aab4-1a1482dc2077   1Gi        RWO            csi-rbd-sc     9m38s
test-sf-app-test-1   Bound    pvc-e92e1ec2-497a-4c03-ba5a-b0fc4a30aa5f   1Gi        RWO            csi-rbd-sc     9m22s
test-sf-app-test-2   Bound    pvc-56bbbbf9-ac02-49cf-a5ae-b0b979fd1d59   1Gi        RWO            csi-rbd-sc     9m17s

查看ceph

# rbd ls k8s-pool
csi-vol-5fd326fa-45a1-460f-911a-c47d004fd215
csi-vol-c0dd0862-a4cf-488e-8100-e5e79c4d98e1
csi-vol-e95e63d1-a943-4003-9e41-c8aac0068ac8
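
可进一步查看各rbd镜像的容量占用与特性(示例):

# rbd du k8s-pool
# rbd info k8s-pool/csi-vol-5fd326fa-45a1-460f-911a-c47d004fd215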

5.7 测试

5.7.1 istio流量治理

流量治理是Istio的核心能力。通过服务级别的配置,Istio可以管理服务网格内的服务发现、流量路由与负载均衡,实现蓝绿发布(A/B发布)、灰度发布以及按百分比分配流量等发布策略;此外还可以实现故障注入、熔断限流、超时重试等流量治理功能。

路由规则是将特定流量子集路由到指定目标地址的强大工具。
您可以在流量端口、header 字段、URI 等内容上面设置匹配条件。

5.7.1.1 微服务准备

三个微服务配置

# cat deply_pod_svc_centos_3pod.yaml 
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test-1
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v1
  replicas: 1
  template:
    metadata:
      name: app-test-1
      namespace: test
      labels:
        app: app-test
        ver: v1
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.1
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
---
apiVersion: apps/v1
kind: Deployment 
metadata:
  name: app-test-2
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v2
  replicas: 1
  template:
    metadata:
      name: app-test-2
      namespace: test
      labels:
        app: app-test
        ver: v2
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.2
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-02
          containerPort: 8080
          protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-test-3
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
      ver: v3
  replicas: 1
  template:
    metadata:
      name: app-test-3
      namespace: test
      labels:
        app: app-test
        ver: v3
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/centos:v0.3
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-03
          containerPort: 8080
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test
  name: app-test
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test
  ports:
    - name: port01
      port: 7071
      targetPort: 8080
      protocol: TCP
  type: ClusterIP

配置网关

# cat gateway-http.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-test-getway
  namespace: test
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - www.demo.com
    - www.test.com

配置虚拟服务

# cat virsr.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-virsr
  namespace: test
spec:
  hosts:
  - www.test.com
  - www.demo.com
  gateways:
  - app-test-getway
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test
        port:
          number: 7071

生效应用

# kubectl apply -f virsr.yaml -f gateway-http.yaml -f deply_pod_svc_centos_3pod.yaml

测试

# for i in $(seq 1 6);do elinks --dump http://www.test.com/test;done
   GET testing,v0.3 hello,method is get
   GET testing,v0.2 hello,method is get
   GET testing,v0.3 hello,method is get
   GET testing,v0.1 hello,method is get
   GET testing,v0.2 hello,method is get
   GET testing,v0.2 hello,method is get

在上面的配置下,三个版本的微服务随机对外提供服务。

配置目标规则,此步为细分流量做准备。

# cat dr.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dest-rule-test
  namespace: test
spec:
  host: app-test
  trafficPolicy:
    loadBalancer:
      simple: RANDOM
  subsets:
  - name: v1
    labels:
      ver: v1
  - name: v2
    labels:
      ver: v2
  - name: v3
    labels:
      ver: v3

# kubectl apply -f dr.yaml       
5.7.1.2 百分比流量策略发布

配置流量规则,如下:
微服务v1占10%流量、v2占20%流量、v3占70%流量。

# cat virsr.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-virsr
  namespace: test
spec:
  hosts:
  - www.test.com
  - www.demo.com
  gateways:
  - app-test-getway
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v1
      weight: 10
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v2
      weight: 20
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v3
      weight: 70

# kubectl apply -f virsr.yaml       

测试

# for i in $(seq 1 10);do elinks --dump http://www.test.com/test;done
   GET testing,v0.3 hello,method is get
   GET testing,v0.3 hello,method is get
   GET testing,v0.1 hello,method is get
   GET testing,v0.3 hello,method is get
   GET testing,v0.2 hello,method is get
   GET testing,v0.3 hello,method is get
   GET testing,v0.2 hello,method is get
   GET testing,v0.2 hello,method is get
   GET testing,v0.3 hello,method is get
   GET testing,v0.1 hello,method is get
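
若想粗略验证权重是否生效,可统计一批请求中各版本出现的次数(示例):

# for i in $(seq 1 100); do curl -s http://www.test.com/test | head -1; done | sort | uniq -c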
5.7.1.3 识别浏览器类型

配置流量规则,如下:
微服务v1提供给edge、v2提供给chrome、v3提供给其它。

# cat virsr-chrome.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-virsr
  namespace: test
spec:
  hosts:
  - www.test.com
  - www.demo.com
  gateways:
  - app-test-getway
  http:
  - name: test1
    match:
    - headers:
        User-Agent:
          regex: ".*Edg.*"
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v1
  - name: test2
    match:
    - headers:
        User-Agent:
          regex: ".*Chrome.*"
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v2
  - name: test3
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v3

测试
在这里插入图片描述
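
除了用浏览器,也可用curl伪造User-Agent来验证路由(示例,UA字符串为简化示意):

# curl -A "Mozilla/5.0 Edg/113.0" http://www.test.com/test       # 命中".*Edg.*",应返回v0.1
# curl -A "Mozilla/5.0 Chrome/113.0" http://www.test.com/test    # 命中".*Chrome.*",应返回v0.2
# curl http://www.test.com/test                                  # 默认UA为curl,走test3路由,返回v0.3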

5.7.1.4 识别header信息

配置流量规则,如下:
header中有信息"X-Guo-1: abc1"时由微服务v1提供服务,并添加header信息x-version: v1。
header中有信息"X-Guo-2: abc2"时由微服务v2提供服务,并添加header信息x-version: v2。
header中有信息"X-Guo-3: abc3"时由微服务v3提供服务,并添加header信息x-version: v3。

# cat virsr-header.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-virsr
  namespace: test
spec:
  hosts:
  - www.test.com
  - www.demo.com
  gateways:
  - app-test-getway
  http:
  - name: test1
    match:
    - headers:
        X-Guo-1:
          exact: abc1
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v1
      headers:
        response:
          add:
            x-version: "v1"          
  - name: test2
    match:
    - headers:
        X-Guo-2:
          exact: abc2
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v2
      headers:
        response:
          add:
            x-version: "v2"          
  - name: test3
    match:
    - headers:
        X-Guo-3:
          exact: abc3
    route:
    - destination:
        host: app-test
        port:
          number: 7071
        subset: v3
      headers:
        response:
          add:
            x-version: "v3"        

测试

# curl -i -H "X-Guo-1: abc1" -XGET 'http://www.test.com/test'
HTTP/1.1 200 OK
date: Wed, 07 Jun 2023 15:39:15 GMT
content-length: 37
content-type: text/plain; charset=utf-8
x-envoy-upstream-service-time: 5
server: istio-envoy
x-version: v1

GET     testing,v0.1
hello,method is get

# curl -i -H "X-Guo-2: abc2" -XGET 'http://www.test.com/test'
HTTP/1.1 200 OK
date: Wed, 07 Jun 2023 15:39:20 GMT
content-length: 37
content-type: text/plain; charset=utf-8
x-envoy-upstream-service-time: 2
server: istio-envoy
x-version: v2

GET     testing,v0.2
hello,method is get

# curl -i -H "X-Guo-3: abc3" -XGET 'http://www.test.com/test'
HTTP/1.1 200 OK
date: Wed, 07 Jun 2023 15:39:24 GMT
content-length: 37
content-type: text/plain; charset=utf-8
x-envoy-upstream-service-time: 3
server: istio-envoy
x-version: v3

GET     testing,v0.3
hello,method is get

5.7.2 CI/CD测试(AB部署)

本节是前面3.3节后续,部署到k8s上面。

需升级docker-in-docker runner所用镜像,使其能访问k8s,以便执行部署。

5.7.2.1 Ingress配置http/https

demo.test.io由app-test-1提供服务,模拟测试环境,由开发人员自动发布。
www.test.io由app-test-2提供服务,模拟生产环境,由管理员通过OA流程发布。
两者是同一个镜像,但发布条件和环境不同。

查看ingress网关地址

# kubectl get svc -A | grep LoadBalancer
istio-system     istio-ingressgateway           LoadBalancer   10.10.217.255   192.168.3.181   15021:30721/TCP,80:30406/TCP,443:31152/TCP,31400:30352/TCP,15443:30154/TCP   2d21h
nginx-ingress    nginx-ingress                  LoadBalancer   10.10.122.65    192.168.3.180   80:31582/TCP,443:32381/TCP                                                   4d8h

以istio为例

配置解析

# nslookup demo.test.io 192.168.3.250
Server:         192.168.3.250
Address:        192.168.3.250#53

Name:   demo.test.io
Address: 192.168.3.181

# nslookup www.test.io 192.168.3.250
Server:         192.168.3.250
Address:        192.168.3.250#53

Name:   www.test.io
Address: 192.168.3.181

参照5.5.3.4,配置如下。

# kubectl config get-contexts 
CURRENT   NAME                          CLUSTER      AUTHINFO           NAMESPACE
*         kubernetes-admin@kubernetes   kubernetes   kubernetes-admin   test

# kubectl  -n istio-system create secret tls web-ssl --cert=./web.pem --key=./web-key.pem

# cat istio-test.yaml 
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-test-getway-demo
  namespace: test
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - demo.test.io
    - www.test.io
    tls:
      httpsRedirect: true # sends 301 redirect for http requests
  - port:
      number: 443
      name: https-443
      protocol: HTTPS
    hosts:
    - demo.test.io
    - www.test.io
    tls:
      mode: SIMPLE
      credentialName: web-ssl

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-demo
  namespace: test
spec:
  hosts:
  - demo.test.io
  gateways:
  - app-test-getway-demo
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test-1
        port:
          number: 7071
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-www
  namespace: test
spec:
  hosts:
  - www.test.io
  gateways:
  - app-test-getway-demo
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test-2
        port:
          number: 7071

5.7.2.2 制作gitlab-ci
stages/任务	说明
build/job1	编译go源码为二进制文件
push_image/job2	将job1生成的应用打包为镜像,并push到私有仓库
test/job3	开发人员push时触发CI/CD。将微服务镜像部署到测试环境,若deploy已存在则更新它,若不存在则创建它
deploy/job4	OA流程触发CI/CD。将微服务镜像部署到正式环境,若deploy已存在则更新它,若不存在则创建它

.gitlab-ci.yml文件详细如下

#全局变量赋值
variables:
  image_name: "busybox"
  #image_name: "centos"
  image_ver: "v0.1.4"

#定义stages
stages:
  - build
  - push_image
  - test
  - deploy

#job1:编译go源码为二进制文件
#局部变量Is_Run默认值为yes,则在push时会执行此任务。
job1:
  variables:
    Is_Run: "yes"
  stage: build
  script:
    - echo "build the code..."
    - export GOROOT=/usr/local/go
    - export PATH=$PATH:/usr/local/go/bin
    - export GOPATH=/opt
    - export GO111MODULE=on
    - export GOOS=linux
    - export GOARCH=amd64
    - export GOPROXY="https://goproxy.cn,direct"
    - go version
    - go mod tidy
    - go build -o app .
    - mkdir build
    - mv app build/
    - mv Dockerfile_nobuild build/Dockerfile
  artifacts:
    paths:
      - build
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

#job2的工作是将job1生成的应用打包上镜像,并push到私有仓库。
#局部变量Is_Run默认值为yes,则在push时会执行此任务。
#提示:$UserName和$PassWord是在gitlab项目定义的项目级变量,用于存放私有仓库的用户名和密码
job2:
  variables:
    Is_Run: "yes"
  stage: push_image
  needs:
    - job: job1
      artifacts: true
  script:
    - echo "build image and push harbor register ..."
    - cd build/
    - ls -l
    - docker build -t harbor.demo.com/web/$image_name:$image_ver .
    - docker logout harbor.demo.com
    - echo $PassWord | base64 -d | docker login --username $UserName  --password-stdin harbor.demo.com
    - docker push harbor.demo.com/web/$image_name:$image_ver
    - docker rmi harbor.demo.com/web/$image_name:$image_ver
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

#job3的任务是测试应用。
#局部变量Is_Run默认值为yes,则在push时会执行此任务。通常开发过程中测试。
#script的逻辑:若指定的deploy资源不存在,就创建它及对应的service;若已存在,则通过patch更新镜像并注入随机环境变量,从而触发滚动更新。
job3:
  variables:
    Is_Run: "yes"
    deploy_svc_name: "app-test-1"
    namespace: "test"
  stage: test
  script:
    - echo "deploy_to_k8s, $deploy_svc_name, https://demo.test.io ..."
    - kubectl -n $namespace get deploy $deploy_svc_name 2>/dev/null &&  if [ 0 -eq 0 ]; then containers_name=`kubectl get deployments.apps $deploy_svc_name -o jsonpath={.spec.template.spec.containers[0].name}`; updateRolly=`openssl rand -hex 8`; test='{"spec":{"template":{"spec":{"containers":[{"name":"'$containers_name'","image":"harbor.demo.com/web/'$image_name:$image_ver'","imagePullPolicy":"Always","env":[{"name":"rollingUpdate","value":"'$updateRolly'"}]}]}}}}';   kubectl -n $namespace  patch deployment $deploy_svc_name --patch ''$test''; fi
    - kubectl -n $namespace get deploy $deploy_svc_name 2>/dev/null ||  if [ 0 -eq 0 ]; then kubectl -n $namespace create deployment $deploy_svc_name --image=harbor.demo.com/web/$image_name:$image_ver --port=8080 --replicas=1 ;   kubectl -n $namespace create service clusterip $deploy_svc_name --tcp=7071:8080;fi
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "yes"

#job4的任务用于发布
#局部变量Is_Run默认值为no,则在push时不会执行此任务,执行条件为:$Is_Run == "deploy"。
#需通过webhook方式执行此任务。通常用于在OA工作流中供领导审批是否正式发布此应用。
#script的逻辑:若指定的deploy资源不存在,就创建它及对应的service;若已存在,则通过patch更新镜像并注入随机环境变量,从而触发滚动更新。
job4:
  variables:
    Is_Run: "no"
    deploy_svc_name: "app-test-2"
    namespace: "test"
  stage: deploy
  script:
    - echo "deploy_to_k8s, $deploy_svc_name, https://www.test.io ..."
    - kubectl -n $namespace get deploy $deploy_svc_name 2>/dev/null &&  if [ 0 -eq 0 ]; then containers_name=`kubectl get deployments.apps $deploy_svc_name -o jsonpath={.spec.template.spec.containers[0].name}`; updateRolly=`openssl rand -hex 8`; test='{"spec":{"template":{"spec":{"containers":[{"name":"'$containers_name'","image":"harbor.demo.com/web/'$image_name:$image_ver'","imagePullPolicy":"Always","env":[{"name":"rollingUpdate","value":"'$updateRolly'"}]}]}}}}';   kubectl -n $namespace  patch deployment $deploy_svc_name --patch ''$test''; fi
    - kubectl -n $namespace get deploy $deploy_svc_name 2>/dev/null ||  if [ 0 -eq 0 ]; then kubectl -n $namespace create deployment $deploy_svc_name --image=harbor.demo.com/web/$image_name:$image_ver --port=8080 --replicas=1 ;   kubectl -n $namespace create service clusterip $deploy_svc_name --tcp=7071:8080;fi
  tags:
    - docker-in-docker-test-1
  rules:
    - if: $Is_Run == "deploy"
5.7.2.3 开发人员push发布到测试环境

push源码

git add *
git commit -m "test-1"
git push http://git.demo.com/guofs/cicdtest.git main

查看

# curl https://demo.test.io/test
GET     testing v0.1.6
hello,method is get
5.7.2.4 OA流程调用webhook发布到正式环境

测试环境通过后,可通过OA流程调用webhook发布到正式环境。
用于领导审批是否正式发布。

curl -X POST \
     --fail \
     -F token=glptt-938d9966afdc10180540a775d6e5e399fcd2cea0 \
     -F ref=main \
     -F "variables[Is_Run]=deploy" \
     -F "variables[deploy_svc_name]=app-test-2" \
     http://git.demo.com/api/v4/projects/8/trigger/pipeline

查看

# curl https://www.test.io/test
GET     testing v0.1.6
hello,method is get

5.7.3 滚动更新

K8S默认配置滚动更新,默认策略如下

  replicas: 1
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate

触发滚动升级的条件

1。pod模板中镜像的策略必须是"imagePullPolicy: Always"。
2。Deployment中的pod模板中的配置发生变化。

发生滚动升级后,有如下变化

1。更新后,pod的ID发生变化
2。产生一个新的replicaset.apps对象

结合5.7.2的办法,用如下go代码更新并构建新镜像。

package main

import (
	"fmt"
	"log"
	"net/http"
)

func handler_url(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%v\t%v\n", r.Method, "testing v0.1.1")
	if r.Method == "GET" {
		fmt.Fprintf(w, "hello,method is %v\n", "get")
	}
	if r.Method == "POST" {
		fmt.Fprintf(w, "hello,method is %v\n", "post")
	}
	if r.Method == "PUT" {
		fmt.Fprintf(w, "%v\n", "post")
	}

}

func main() {
	http.HandleFunc("/test", handler_url)
	err := http.ListenAndServe(":8080", nil) //启动http服务器
	if err != nil {
		log.Fatalln(err)
	}
}

更改Fprintf输出后,依次构建v0.1、v0.2、v0.3、v0.4四个版本的镜像并逐次发布。

查看deploy和rs

# kubectl get deploy,rs  -o wide
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS   IMAGES                             SELECTOR
deployment.apps/app-test-1   1/1     1            1           8m45s   busybox      harbor.demo.com/web/busybox:v0.4   app=app-test-1

NAME                                    DESIRED   CURRENT   READY   AGE     CONTAINERS   IMAGES                             SELECTOR
replicaset.apps/app-test-1-6467784f89   0         0         0       5m55s   busybox      harbor.demo.com/web/busybox:v0.2   app=app-test-1,pod-template-hash=6467784f89
replicaset.apps/app-test-1-7b5f4489d4   0         0         0       3m42s   busybox      harbor.demo.com/web/busybox:v0.3   app=app-test-1,pod-template-hash=7b5f4489d4
replicaset.apps/app-test-1-7c997956fc   1         1         1       98s     busybox      harbor.demo.com/web/busybox:v0.4   app=app-test-1,pod-template-hash=7c997956fc
replicaset.apps/app-test-1-86cdb4866c   0         0         0       8m45s   busybox      harbor.demo.com/web/busybox:v0.1   app=app-test-1,pod-template-hash=86cdb4866c

查看滚动历史

# kubectl rollout history deployment app-test-1 
deployment.apps/app-test-1 
REVISION  CHANGE-CAUSE
1         <none>
2         <none>
3         <none>
4         <none>

当前版本是v0.4。

# curl https://demo.test.io/test
GET     testing v0.4
hello,method is get

回滚到v2

kubectl rollout undo deployment app-test-1 --to-revision=2

查看回滚效果

# curl https://demo.test.io/test
GET     testing v0.2
hello,method is get

查看回滚后的deploy和rs

# kubectl get deploy,rs  -o wide
NAME                         READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                             SELECTOR
deployment.apps/app-test-1   1/1     1            1           14m   busybox      harbor.demo.com/web/busybox:v0.2   app=app-test-1

NAME                                    DESIRED   CURRENT   READY   AGE     CONTAINERS   IMAGES                             SELECTOR
replicaset.apps/app-test-1-6467784f89   1         1         1       11m     busybox      harbor.demo.com/web/busybox:v0.2   app=app-test-1,pod-template-hash=6467784f89
replicaset.apps/app-test-1-7b5f4489d4   0         0         0       9m31s   busybox      harbor.demo.com/web/busybox:v0.3   app=app-test-1,pod-template-hash=7b5f4489d4
replicaset.apps/app-test-1-7c997956fc   0         0         0       7m27s   busybox      harbor.demo.com/web/busybox:v0.4   app=app-test-1,pod-template-hash=7c997956fc
replicaset.apps/app-test-1-86cdb4866c   0         0         0       14m     busybox      harbor.demo.com/web/busybox:v0.1   app=app-test-1,pod-template-hash=86cdb4866c

5.7.4 HPA autoscaling

Build the microservice image harbor.demo.com/web/busybox:v0.6 from the Go code below (the handler runs a long Sqrt loop on purpose, so each request burns CPU and gives the HPA something to react to).

package main

import (
	"fmt"
	"log"
	"math"
	"net/http"
)

func handler_url(w http.ResponseWriter, r *http.Request) {
	var num = 3.1425926
	var i = 0
	for ; i < 10000000; i++ {
		num += math.Sqrt(num)
	}

	fmt.Fprintf(w, "num=%v,i=%v\n", num, i)
	fmt.Fprintf(w, "%v\t%v\n", r.Method, "testing v0.5")
	if r.Method == "GET" {
		fmt.Fprintf(w, "hello,method is %v\n", "get")
	}
	if r.Method == "POST" {
		fmt.Fprintf(w, "hello,method is %v\n", "post")
	}
	if r.Method == "PUT" {
		fmt.Fprintf(w, "%v\n", "post")
	}

}

func main() {

	http.HandleFunc("/test", handler_url)
	err := http.ListenAndServe(":8080", nil) // start the HTTP server
	if err != nil {
		log.Fatalln(err)
	}
}

Create the Deployment and Service

# cat pod.yaml
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  replicas: 2
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      containers:
      - name: http
        image: harbor.demo.com/web/busybox:v0.6
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-01
          containerPort: 8080
          protocol: TCP
        resources:
          limits:
            cpu: 80m
          requests:
            cpu: 20m
---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test
  name: app-test
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test
  ports:
    - name: port01
      port: 7071
      targetPort: 8080
      protocol: TCP
  type: ClusterIP

Create the Istio Gateway/VirtualService for ingress

# cat istio-test.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-test-getway-hpa
  namespace: test
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - www.test.com
    tls:
      httpsRedirect: true
  - port:
      number: 443
      name: https-443
      protocol: HTTPS
    hosts:
    - www.test.com
    tls:
      mode: SIMPLE
      credentialName: web-ssl        

---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-test-hpa
  namespace: test
spec:
  hosts:
  - www.test.com
  gateways:
  - app-test-getway-hpa
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: app-test
        port:
          number: 7071

Create the HPA

kubectl autoscale deployment app-test --cpu-percent=50 --min=1 --max=10
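
The same HPA can also be written declaratively. A sketch using the autoscaling/v2 API (the namespace test matches the Deployment above; everything else mirrors the command):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-test
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-test
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50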

Before the load test

# kubectl get hpa
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app-test   Deployment/app-test   16%/50%   1         10        1          6m21s

Run the load test

# for i in $(seq 1 200);do curl https://www.test.com/test;done
num=2.499998060324e+13,i=10000000
GET     testing v0.5
hello,method is get
num=2.499998060324e+13,i=10000000
GET     testing v0.5
hello,method is get
num=2.499998060324e+13,i=10000000
GET     testing v0.5
hello,method is get
num=2.499998060324e+13,i=10000000
GET     testing v0.5
hello,method is get
...

Check the HPA again; the replica count has scaled up to 9.

# kubectl get hpa
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app-test   Deployment/app-test   48%/50%   1         10        9          9m9s

# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
app-test-748d6b44bb-5vwnl   2/2     Running   0          2m
app-test-748d6b44bb-78llx   2/2     Running   0          3m30s
app-test-748d6b44bb-8b5sn   2/2     Running   0          95m
app-test-748d6b44bb-8zqzk   2/2     Running   0          3m30s
app-test-748d6b44bb-glgvc   2/2     Running   0          3m15s
app-test-748d6b44bb-r77cj   2/2     Running   0          45s
app-test-748d6b44bb-w8z92   2/2     Running   0          2m
app-test-748d6b44bb-xwqct   2/2     Running   0          3m30s
app-test-748d6b44bb-zrrqn   2/2     Running   0          3m15s

Stop the test

# kubectl get hpa
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
app-test   Deployment/app-test   8%/50%   1         10        1          24m

# kubectl top pod 
NAME                        CPU(cores)   MEMORY(bytes)   
app-test-748d6b44bb-8b5sn   4m           61Mi            

Scale-down takes a while (the HPA applies a stabilization window before removing replicas), after which the Deployment returns to the minimum replica count.

六、Deploying third-party images

6.1 MySQL database (stateful deployment)

MySQL is deployed as a StatefulSet, since it needs a stable hostname and a persistent data directory.

# cat mysql.yaml 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-test-mysql
  namespace: test
spec:
  serviceName: app-test-mysql
  selector:
    matchLabels:
      app: app-test-mysql
  replicas: 1
  template:
    metadata:
      name: app-test-mysql
      namespace: test
      labels:
        app: app-test-mysql
    spec:
      containers:
      - name: mysql
        image: harbor.demo.com/app/mysql:latest
        #image: harbor.demo.com/web/busybox:v0.1
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
        ports:
        - name: port-test-01
          containerPort: 3306
          protocol: TCP
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: 12qwaszx
  volumeClaimTemplates:               
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: csi-rbd-sc
      resources:
        requests:
          storage: 5Gi            
---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test-mysql
  name: app-test-mysql
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test-mysql
  ports:
    - name: port01
      port: 3306
      targetPort: 3306
      protocol: TCP
  type: ClusterIP


# kubectl apply -f mysql.yaml 

Test pod (MySQL client)

# cat client.yaml
apiVersion: apps/v1 
kind: Deployment 
metadata:
  name: app-test
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test
  template:
    metadata:
      name: app-test
      namespace: test
      labels:
        app: app-test
    spec:
      containers:
      - name: mysql-client
        image: harbor.demo.com/test/testtool:v0.2
        imagePullPolicy: IfNotPresent

# kubectl apply -f client.yaml 

# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
app-test-6b76f4f697-mzn7c   2/2     Running   0          22m
app-test-mysql-0            2/2     Running   0          22m


# kubectl exec pod/app-test-6b76f4f697-mzn7c -it /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[root@app-test-6b76f4f697-mzn7c app]# mysql -u root -h app-test-mysql.test.svc.cluster.local -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 16
Server version: 8.0.33 MySQL Community Server - GPL

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| gfs                |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

Delete the pod and let a new replica be created: the data is still there, because it lives on the PVC. A quick check is sketched below.
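
A quick way to verify persistence (the PVC name follows the StatefulSet naming convention <claimTemplate>-<pod>, i.e. mysql-data-app-test-mysql-0):

kubectl -n test delete pod app-test-mysql-0
kubectl -n test get pvc mysql-data-app-test-mysql-0
kubectl -n test exec -it app-test-mysql-0 -c mysql -- mysql -uroot -p -e 'show databases;'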

6.2 Private ChatGPT deployment

七、Operations layer

7.1 Dashboard configuration

Install the official Kubernetes dashboard

# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml -O dashboard.yaml

# cat dashboard.yaml | grep image:
          image: kubernetesui/dashboard:v2.7.0
          image: kubernetesui/metrics-scraper:v1.0.8
Push these images to the private registry and change the image references accordingly (a retag/push sketch follows the listing below):

# cat dashboard.yaml | grep image:
          image: harbor.demo.com/k8s/dashboard:v2.7.0
          image: harbor.demo.com/k8s/metrics-scraper:v1.0.8
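
A minimal retag-and-push sketch (assumes docker is already logged in to harbor.demo.com and that a k8s project exists there):

docker pull kubernetesui/dashboard:v2.7.0
docker pull kubernetesui/metrics-scraper:v1.0.8
docker tag kubernetesui/dashboard:v2.7.0       harbor.demo.com/k8s/dashboard:v2.7.0
docker tag kubernetesui/metrics-scraper:v1.0.8 harbor.demo.com/k8s/metrics-scraper:v1.0.8
docker push harbor.demo.com/k8s/dashboard:v2.7.0
docker push harbor.demo.com/k8s/metrics-scraper:v1.0.8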

Change the Service to type LoadBalancer
# vi dashboard.yaml 
...
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: LoadBalancer
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
...


# kubectl apply -f dashboard.yaml 
# kubectl get all -n kubernetes-dashboard
NAME                                            READY   STATUS    RESTARTS   AGE
pod/dashboard-metrics-scraper-d97df5556-vvv9w   1/1     Running   0          16s
pod/kubernetes-dashboard-6694866798-pcttp       1/1     Running   0          16s

NAME                                TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)         AGE
service/dashboard-metrics-scraper   ClusterIP      10.10.153.173   <none>          8000/TCP        17s
service/kubernetes-dashboard        LoadBalancer   10.10.186.6     192.168.3.182   443:31107/TCP   18s

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/dashboard-metrics-scraper   1/1     1            1           17s
deployment.apps/kubernetes-dashboard        1/1     1            1           17s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/dashboard-metrics-scraper-d97df5556   1         1         1       17s
replicaset.apps/kubernetes-dashboard-6694866798       1         1         1       17s

Add an admin user

# cat admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

# kubectl apply -f admin-user.yaml

# kubectl -n kubernetes-dashboard create token admin-user
eyJhbGciOiJSUzI1NiIsImtpZCI6IjlUNmROZTZZSEJ4WEJIell2OG5IQS1oTGVLYjJWRU9QRlhzUFBmdlVONU0ifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjg2NTQ3MTE3LCJpYXQiOjE2ODY1NDM1MTcsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiNjk1MWFlODktODYwMi00NzAzLTk3NzYtMmNhNmU0OTJlZjQ2In19LCJuYmYiOjE2ODY1NDM1MTcsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.j9XHrznphuwv56hcSGRlcOxvzuGGbKEdPZB1r5jc84kNICp2sTwXvr71d6wdYtzGxjODZ81kTqVqRQUcUKi0Uh8OWjxWcspNJIWk0y6_Eub823YWzkusktb7NdqCb6BYIyX79V4iFUQaVjp9BlEXSZ6vnuJhwvEonumDrIo0JtUF8PT1ZV3319kajFTZMWza-QHRMFOjGC74YleMd-7gDA-aimoxjPQIVfIWF2PhssLj38Ci-KZddxOE1yE42QFOmPozOzCT348ZEJEO1lhDC4trnK2TTU8jb1sM7RyPKuvyY0fbimqNi6iGL-aqCaQT6_nWDvxkVycapJ3KAwz2Zw
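
Tokens issued by kubectl create token expire. For a lab setup where a long-lived token is more convenient, one option (a sketch, not part of the original steps) is a service-account token Secret:

# cat admin-user-token.yaml
apiVersion: v1
kind: Secret
metadata:
  name: admin-user-token
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: admin-user
type: kubernetes.io/service-account-token

# kubectl apply -f admin-user-token.yaml
# kubectl -n kubernetes-dashboard get secret admin-user-token -o jsonpath='{.data.token}' | base64 -d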

Access the dashboard

# nslookup dashboard.demo.com
Server:         192.168.3.250
Address:        192.168.3.250#53

Name:   dashboard.demo.com
Address: 192.168.3.182

https://dashboard.demo.com
(screenshot)

7.2 rancher

https://github.com/rancher/rancher

Common Kubernetes management platforms include OpenShift and Rancher.
This article uses Rancher, which can manage multiple K8s clusters at the same time.

Rancher image tags:
latest: currently the newest release, v2.7.3
stable: currently the stable release, v2.6.12

7.2.1 Installing the Rancher node

Install Docker

yum -y install yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum -y install docker-ce docker-ce-cli containerd.io
systemctl enable docker containerd
systemctl start docker containerd

Pull the Rancher image (it is large; pulling once and pushing it to the private registry is recommended)

# docker pull rancher/rancher:latest

Create the Rancher node directories

mkdir -p /opt/rancher

7.2.1.1 Certificates for the Rancher web UI

Copy the domain certificates to /opt/rancher/ca.

Rename the certificate files as follows:
ca.pem      ----> cacerts.pem
web-key.pem ----> key.pem
web.pem     ----> cert.pem

# tree /opt/rancher/ca
/opt/rancher/ca
├── cacerts.pem
├── cert.pem
└── key.pem

7.2.1.2 Private registry configuration

# cat /opt/rancher/registries.yaml 
mirrors:
  harbor.demo.com:
    endpoint:
      - "harbor.demo.com"
configs:
  "harbor.demo.com":
    auth:
      username: admin
      password: 12qwaszx+pp
    tls:
      ca_file: /opt/harbor/ca.crt
      cert_file: /opt/harbor/harbor.demo.com.cert
      key_file: /opt/harbor/harbor.demo.com.key
The /opt/harbor/ directory above is a path inside the Rancher container at runtime.


Certificates needed to access the private registry:
# tree /opt/rancher/ca_harbor/
/opt/rancher/ca_harbor/
├── ca.crt
├── harbor.demo.com.cert
└── harbor.demo.com.key
When starting Rancher, mount /opt/rancher/ca_harbor/ into the container at /opt/harbor/ (the directory referenced in registries.yaml above).

7.2.1.3 Installing the Rancher node
# docker run -d -it -p 80:80 -p 443:443  --name rancher --privileged=true --restart=unless-stopped \
-v /opt/rancher/k8s:/var/lib/rancher \
-v /opt/rancher/ca:/etc/rancher/ssl \
-e SSL_CERT_DIR="/etc/rancher/ssl" \
-e CATTLE_SYSTEM_DEFAULT_REGISTRY=harbor.demo.com \
-v /opt/rancher/registries.yaml:/etc/rancher/k3s/registries.yaml \
-v /opt/rancher/ca_harbor:/opt/harbor \
rancher/rancher:latest

Check the startup logs

# docker logs rancher -f

7.2.1.4 Accessing the Rancher web UI

# nslookup rancher.demo.com
Server:         192.168.3.250
Address:        192.168.3.250#53

Name:   rancher.demo.com
Address: 10.2.20.151

Open https://rancher.demo.com
(screenshot)

Get the default admin password

# docker exec -it rancher kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{"\n"}}'

If the password is forgotten, reset it

# docker exec -it rancher reset-password

7.2.2 Importing an existing K8s cluster

(screenshots)

# curl --insecure -sfL https://rancher.demo.com/v3/import/rndjzbgwn78v6v6dx28dlngn7r7qrlwv4b949c47567ltjz7g76tqn_c-m-68r9m4vz.yaml  -o rancher.yaml
# cat rancher.yaml | grep image:
          image: rancher/rancher-agent:v2.7.3
Change it to the private registry address
# cat rancher.yaml | grep image:
          image: harbor.demo.com/rancher/rancher-agent:v2.7.3

Install
# kubectl apply -f rancher.yaml 

Check
# kubectl -n cattle-system get all
NAME                                        READY   STATUS              RESTARTS   AGE
pod/cattle-cluster-agent-5cb7bb7b9b-kc5fn   0/1     ContainerCreating   0          27s

NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/cattle-cluster-agent   ClusterIP   10.10.104.246   <none>        80/TCP,443/TCP   27s

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cattle-cluster-agent   0/1     1            0           28s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/cattle-cluster-agent-5cb7bb7b9b   1         1         0       28s

(screenshots)

Viewing a single pod:
(screenshot)

7.2.3 Creating a new K8s cluster

Creating a new K8s cluster with Rancher is straightforward: run the command Rancher generates directly on the target nodes.
(screenshots)

7.3 prometheus/grafana

https://github.com/prometheus/prometheus/

Prometheus is an open-source monitoring and alerting system, one of the most popular today and a CNCF-hosted project. It is the usual choice for monitoring Kubernetes clusters, and its performance is sufficient for clusters of ten thousand nodes and more.
(screenshot)
Key features of Prometheus:
1. A multi-dimensional data model (time series identified by a metric name and key/value labels) with efficient storage.
2. A flexible query language, PromQL.
3. Metrics are pulled over HTTP.
4. Push is supported through an intermediate gateway.
5. A rich ecosystem of exporters for collecting metrics.
6. Works well with Grafana, which provides the visualization layer.

Grafana is an open-source system for visualizing large amounts of measurement data. It is powerful and has a polished UI: you can build custom dashboards, choose which data to display and how, and there is a large catalogue of third-party visualization plugins.

Prometheus and Grafana can be installed in several ways (from source, binaries, or container images), all of which are fairly simple.

This article installs Prometheus/Grafana on K8s using kube-prometheus.
Official quick start: https://prometheus-operator.dev/docs/prologue/quick-start/
Compatibility matrix: https://github.com/prometheus-operator/kube-prometheus#compatibility
GitHub: https://github.com/prometheus-operator/kube-prometheus

7.3.1 Installing kube-prometheus

Download

# git clone https://github.com/coreos/kube-prometheus.git
# cd kube-prometheus/manifests

List the images the manifests need, push them to the private registry, and change the image references in the manifests to point at it (a sketch follows the listing below).

# find ./ | xargs grep image:
# cat prometheusOperator-deployment.yaml | grep prometheus-config-reloader
        - --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.65.2
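
A rough sketch of the retag step. The grep pattern, the harbor project name k8s, and the sed substitution are assumptions; review the generated image list before scripting against it:

# list every image referenced by the manifests
grep -rh 'image:' . | awk '{print $2}' | sort -u

# for each image: pull, retag into the private registry, push (one example shown)
docker pull quay.io/prometheus-operator/prometheus-config-reloader:v0.65.2
docker tag  quay.io/prometheus-operator/prometheus-config-reloader:v0.65.2 harbor.demo.com/k8s/prometheus-config-reloader:v0.65.2
docker push harbor.demo.com/k8s/prometheus-config-reloader:v0.65.2

# then rewrite the references in the manifests, e.g.
sed -i 's#quay.io/prometheus-operator/#harbor.demo.com/k8s/#g' prometheusOperator-deployment.yaml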

Edit prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml

# cat prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: metrics-adapter
    app.kubernetes.io/name: prometheus-adapter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.10.0
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
  #namespace: monitoring
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - services
  - endpoints
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch

Note:
The ClusterRole defined here already exists (it was created when the metrics component was installed), so it only needs to be applied as an update.

Edit prometheus-clusterRole.yaml

The changes below follow the Prometheus configuration shipped with Istio.
# cat prometheus-clusterRole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.44.0
  name: prometheus-k8s
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/proxy
      - nodes/metrics
      - services
      - endpoints
      - pods
      - ingresses
      - configmaps
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
      - "networking.k8s.io"
    resources:
      - ingresses/status
      - ingresses
    verbs:
      - get
      - list
      - watch
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get

Note:
If the ClusterRole shipped with kube-prometheus is kept as-is, the ServiceMonitors created later in this article are not recognized by Prometheus.

Install

# kubectl create -f setup/
# kubectl apply -f ./prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
# rm -f prometheusAdapter-clusterRoleAggregatedMetricsReader.yaml
# kubectl create -f ./

Check the Services

# kubectl -n monitoring get svc
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   10.10.251.51    <none>        9093/TCP,8080/TCP            7m18s
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   6m51s
blackbox-exporter       ClusterIP   10.10.195.115   <none>        9115/TCP,19115/TCP           7m17s
grafana                 ClusterIP   10.10.121.183   <none>        3000/TCP                     7m13s
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            7m12s
node-exporter           ClusterIP   None            <none>        9100/TCP                     7m10s
prometheus-k8s          ClusterIP   10.10.230.211   <none>        9090/TCP,8080/TCP            7m9s
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     6m48s
prometheus-operator     ClusterIP   None            <none>        8443/TCP                     7m8s

DNS records

prometheus.demo.com 192.168.3.180
grafana.demo.com 192.168.3.180
alert.demo.com 192.168.3.180

Expose the UIs via ingress

# cat open-ui.yaml 
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-prometheus
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: prometheus.demo.com
    http:
      paths:
      - backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-grafana
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: grafana.demo.com
    http:
      paths:
      - backend:
          service:
            name: grafana
            port:
              number: 3000
        path: /
        pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitor-alert
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: alert.demo.com
    http:
      paths:
      - backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
        path: /
        pathType: Prefix


# kubectl apply -f open-ui.yaml

Access
http://grafana.demo.com (default user admin, password admin)

kube-prometheus ships with a number of Grafana dashboards, for example:
(screenshot)
More dashboards can be imported from https://grafana.com/grafana/dashboards/, e.g. dashboard 14518:
(screenshot)

7.3.2 Application monitoring examples

The ServiceMonitor custom resource (CRD) declares how a dynamically changing set of services should be monitored. It selects the services to scrape with label selectors. This lets an organization define conventions for exposing metrics: any new service that follows the conventions is discovered and monitored automatically, with no reconfiguration of the monitoring system.

Prometheus pulls metrics from the endpoints that a ServiceMonitor points at; the ServiceMonitor is associated with a Service and its Endpoints via labels.

List the ServiceMonitors that ship with kube-prometheus

# kubectl -n monitoring get ServiceMonitor
NAME                      AGE
alertmanager-main         96m
app-test-ext-test         91s
blackbox-exporter         96m
coredns                   96m
grafana                   96m
kube-apiserver            96m
kube-controller-manager   96m
kube-scheduler            96m
kube-state-metrics        96m
kubelet                   96m
node-exporter             96m
prometheus-k8s            96m
prometheus-operator       96m

7.3.2.1 Ceph metrics (external Ceph cluster)

Enable the Ceph exporter (the ceph-mgr prometheus module)

# ceph mgr module enable prometheus
# ss -lntp | grep mgr
LISTEN 0      128       10.2.20.90:6802      0.0.0.0:*    users:(("ceph-mgr",pid=1108,fd=27))
LISTEN 0      128       10.2.20.90:6803      0.0.0.0:*    users:(("ceph-mgr",pid=1108,fd=28))
LISTEN 0      5                  *:8443            *:*    users:(("ceph-mgr",pid=1108,fd=47))
LISTEN 0      5                  *:9283            *:*    users:(("ceph-mgr",pid=1108,fd=38))    # 9283 is the ceph exporter listen port

Check the metrics endpoint in a browser:
http://10.2.20.90:9283/metrics
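
Or, from any host that can reach the mgr, a quick curl check (plain HTTP, no authentication by default):

# curl -s http://10.2.20.90:9283/metrics | head -n 20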

Create the Service and Endpoints

# cat ext-ceph.yaml 
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test-ext-ceph
  name: app-test-ext-ceph
  namespace: test
spec:
  ports:
  - name: ceph-metrics
    port: 9283
    targetPort: 9283
    protocol: TCP

---
apiVersion: v1
kind: Endpoints
metadata:
  name: app-test-ext-ceph
  namespace: test
  labels:
    app: app-test-ext-ceph
subsets:
- addresses:
  - ip: 10.2.20.90
  ports:
  - name: ceph-metrics
    port: 9283

Create the ServiceMonitor

# cat sm.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: app-test-ext-ceph
  name: app-test-ext-ceph
  namespace: test
spec:
  jobLabel: app-test-ext-ceph
  endpoints:
  - interval: 10s
    port: ceph-metrics
    path: /metrics
    scheme: http
  selector:
    matchLabels:
      app: app-test-ext-ceph
  namespaceSelector:
    matchNames:
    - test

# kubectl apply -f ext-ceph.yaml -f sm.yaml 

The new ServiceMonitor shows up as a target in Prometheus.
(screenshot)

Import Ceph dashboard 2842 in Grafana, for example:
(screenshot)

7.3.2.2 Istio sidecar metrics (auto discovery)

Istio sidecars expose merged metrics on port 15020 by default.

Create the Service object

# cat sv.yaml 
apiVersion: v1
kind: Service
metadata:
  labels:
    sidecar-metrice: test
  name: app-test-15020
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    security.istio.io/tlsMode: istio    # matches pods that have the istio sidecar injected
  ports:
    - name: istio-sidecar
      port: 15020
      targetPort: 15020
      protocol: TCP
  type: ClusterIP
                

Create the ServiceMonitor

# cat sm-test.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: app-test-sm
  name: app-test-sm-istio-sidecar
  namespace: test
spec:
  jobLabel: app.kubernetes.io/name
  endpoints:
  - interval: 5s
    port: istio-sidecar
    path: /metrics
    scheme: http
  selector:
    matchLabels:
      sidecar-metrice: test    # matches the Service/Endpoints labelled "sidecar-metrice: test"
  namespaceSelector:
    matchNames:
    - test     

The new ServiceMonitor is picked up by Prometheus.

7.3.2.3 Nginx ingress metrics

Edit the nginx ingress pod template to enable metrics.

# vi ingress/nginx/kubernetes-ingress-main/deployments/deployment/nginx-ingress.yaml
...
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        - name: readiness-port
          containerPort: 8081
        - name: prometheus
          containerPort: 9113
...
        args:
          - -nginx-configmaps=$(POD_NAMESPACE)/nginx-config
         #- -default-server-tls-secret=$(POD_NAMESPACE)/default-server-secret
         #- -include-year
         #- -enable-cert-manager
         #- -enable-external-dns
         #- -v=3 # Enables extensive logging. Useful for troubleshooting.
         #- -report-ingress-status
         #- -external-service=nginx-ingress
          - -enable-prometheus-metrics       # uncomment this flag
...

# kubectl apply -f ingress/nginx/kubernetes-ingress-main/deployments/deployment/nginx-ingress.yaml
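
Before wiring Prometheus to it, the exporter can be checked directly. A sketch (the deployment name nginx-ingress is taken from the manifest path and is an assumption):

# kubectl -n nginx-ingress port-forward deploy/nginx-ingress 9113:9113 &
# curl -s http://127.0.0.1:9113/metrics | head -n 20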

Create a Service for the metrics port

# cat sv.yaml 
apiVersion: v1
kind: Service
metadata:
  labels:
    nginx-metrice: test
  name: app-test-9113
  namespace: nginx-ingress
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: nginx-ingress
  ports:
    - name: prometheus
      port: 9113
      targetPort: 9113
      protocol: TCP
  type: ClusterIP

Create the ServiceMonitor

# cat sm.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: app-test-nginx-ingress
  name: app-test-nginx-ingress
  namespace: nginx-ingress
spec:
  jobLabel: app.kubernetes.io/name
  endpoints:
  - interval: 10s
    port: prometheus
    path: /metrics
    scheme: http
  selector:
    matchLabels:
      nginx-metrice: test
  namespaceSelector:
    matchNames:
    - nginx-ingress

The new ServiceMonitor is picked up by Prometheus.
(screenshot)

Import a matching dashboard in Grafana:
(screenshot)

7.3.2.4 MySQL metrics

Pull mysqld-exporter and push it to the private registry as harbor.demo.com/exporter/mysqld-exporter:latest

# docker pull bitnami/mysqld-exporter

Deploy the database as a StatefulSet

# cat mysql.yaml 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-test-mysql
  namespace: test
spec:
  serviceName: app-test-mysql
  selector:
    matchLabels:
      app: app-test-mysql
  replicas: 1
  template:
    metadata:
      name: app-test-mysql
      namespace: test
      labels:
        app: app-test-mysql
    spec:
      containers:
      - name: mysql
        image: harbor.demo.com/app/mysql:latest
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
        ports:
        - name: port-test-01
          containerPort: 3306
          protocol: TCP
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: abc123456
        args:
        - --character-set-server=gbk
  volumeClaimTemplates:               
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: csi-rbd-sc
      resources:
        requests:
          storage: 5Gi            
---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test-mysql
  name: app-test-mysql
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test-mysql
  ports:
    - name: port01
      port: 3306
      targetPort: 3306
      protocol: TCP
  type: ClusterIP

Create a MySQL user for the exporter and grant privileges (a least-privilege variant is sketched after this block)

use mysql;
create user 'admin'@'%' identified with mysql_native_password  by '123456';
grant ALL on *.* to 'admin'@'%' with grant option;
flush privileges;
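
Granting ALL works for a demo but is far broader than the exporter needs. The mysqld_exporter documentation recommends a narrower grant; a hedged variant (the user name exporter is an assumption):

CREATE USER 'exporter'@'%' IDENTIFIED WITH mysql_native_password BY '123456' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';
FLUSH PRIVILEGES;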

Start the exporter; it serves metrics on port 9104 by default

# cat exporter.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-test-mysql-exporter
  namespace: test
spec:
  selector:
    matchLabels:
      app: app-test-mysql-exporter
  replicas: 1
  template:
    metadata:
      name: app-test-mysql-exporter
      namespace: test
      labels:
        app: app-test-mysql-exporter
    spec:
      containers:
      - name: mysqld-exporter
        image: harbor.demo.com/exporter/mysqld-exporter:latest
        imagePullPolicy: IfNotPresent
        ports:
        - name: port-test-02
          containerPort: 9104
          protocol: TCP
        env:
        - name: DATA_SOURCE_NAME
          value: 'admin:123456@(app-test-mysql.test.svc:3306)/'
---

apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-test-mysql-exporter
  name: app-test-mysql-exporter
  namespace: test
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800
  selector:
    app: app-test-mysql-exporter
  ports:
  - name: mysql-exporter
    port: 9104
    targetPort: 9104
    protocol: TCP
  type: ClusterIP

Create the ServiceMonitor

# cat sm.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: app-test-mysql-exporter
  name: app-test-mysql-exporter
  namespace: test
spec:
  jobLabel: app.kubernetes.io/name
  endpoints:
  - interval: 10s
    port: mysql-exporter
    path: /metrics
    scheme: http
  selector:
    matchLabels:
      app: app-test-mysql-exporter
  namespaceSelector:
    matchNames:
    - test 

The new ServiceMonitor is picked up by Prometheus.
Import a MySQL dashboard in Grafana, for example:
(screenshot)

7.3.3 Configuring rules and alerting

7.3.3.1 Configuring rules

List the rules installed by kube-prometheus

# kubectl -n monitoring get prometheusrules
NAME                              AGE
alertmanager-main-rules           22h
grafana-rules                     22h
kube-prometheus-rules             22h
kube-state-metrics-rules          22h
kubernetes-monitoring-rules       22h
node-exporter-rules               22h
prometheus-k8s-prometheus-rules   22h
prometheus-operator-rules         22h

To make the test results easier to observe, the default rules can be deleted.

Add test rules

# cat rule1.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: rule-test-1
  namespace: monitoring
spec:
  groups:
  - name: rule-test-1
    rules:
    - alert: InstanceDown
      expr: up == 0
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "服务 {{ $labels.instance }} 下线了"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minutes."


# cat rule2.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: rule-test-2
  namespace: monitoring
spec:
  groups:
  - name: rule-test-2
    rules:
    - alert: Watchdog
      annotations:
        message: |
          This alert is meant to confirm that the entire alerting pipeline is functional. It always fires, so it should always show up in Alertmanager and always be routed to every configured receiver.
      expr: vector(1)
      labels:
        severity: none

# kubectl create -f rule1.yaml -f rule2.yaml
# kubectl -n monitoring get prometheusrules
NAME          AGE
rule-test-1   87m
rule-test-2   32m
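
Once loaded, the rules and the alerts they produce can also be checked from the Prometheus UI (Status -> Rules) or via the HTTP API, for example:

# curl -s http://prometheus.demo.com/api/v1/rules
# curl -s http://prometheus.demo.com/api/v1/alerts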

(screenshot)

7.3.3.2 Configuring Alertmanager

# cat /tmp/alert.conf
global:
  smtp_smarthost: 'smtp.139.com:25'
  smtp_from: 'guofs@139.com'
  smtp_auth_username: 'guofs@139.com'
  smtp_auth_password: 'abc1239034b78612345678'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'default-receiver'

receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'guofs@163.com'

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']


# kubectl -n monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml=/tmp/alert.conf --dry-run=client -o yaml | kubectl -n monitoring replace -f -

Restart the pod so the new configuration takes effect.
kubectl -n monitoring delete pod alertmanager-main-0
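
To confirm which configuration Alertmanager actually loaded, the secret can be decoded (the escaped dot in the jsonpath is needed because the key name contains a dot):

# kubectl -n monitoring get secret alertmanager-main -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d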


Check the logs
# kubectl -n monitoring logs pod/alertmanager-main-0 -f

An alert email arrives, for example:
(screenshot)

八、Appendix

There are many excellent posts on CSDN about the open-source software covered in this article, often going deep on a single topic.
While testing the Prometheus/Alertmanager alerting part I referred to the posts below; thanks to their authors for the effort.
https://blog.csdn.net/qq_40859395/article/details/124111257
https://blog.csdn.net/zfw_666666/article/details/126870239
https://blog.csdn.net/weixin_43451568/article/details/129781576
