Setting up etcd on CentOS 7: troubleshooting a startup error (k8s study notes)
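While preparing the environment for a k8s cluster today, I set up etcd, but after starting the service it stayed stuck in the activating (start) state.
Running systemctl status etcd.service -l
showed the following details.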
1. Error message
[root@hdss7-21 cfg]# systemctl status etcd.service -l
● etcd.service - Etcd Server
Loaded: loaded (/etc/systemd/system/etcd.service; disabled; vendor preset: disabled)
Active: activating (start) since 五 2020-12-25 15:53:26 CST; 49s ago
Main PID: 4150 (etcd)
Memory: 17.6M
CGroup: /system.slice/etcd.service
└─4150 /opt/etcd/bin/etcd --name=etcd-2 --data-dir=/var/lib/etcd/default.etcd --listen-peer-urls=https://192.168.6.21:2380 --listen-client-urls=https://192.168.6.21:2379,http://127.0.0.1:2379 --advertise-client-urls=https://192.168.6.21:2379,http://127.0.0.1:2379 --initial-advertise-peer-urls=https://192.168.6.21:2380 --initial-cluster=etcd-1=https://192.168.6.12:2380,etcd-2=https://192.168.6.21:2380,etcd-3=https://192.168.6.22:2380 --initial-cluster-token=etcd-cluster --initial-cluster-state=new --cert-file=/opt/etcd/ssl/server.pem --key-file=/opt/etcd/ssl/server-key.pem --peer-cert-file=/opt/etcd/ssl/server.pem --peer-key-file=/opt/etcd/ssl/server-key.pem --trusted-ca-file=/opt/etcd/ssl/ca.pem --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49034" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55488" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55492" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49040" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.22:49038" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55500" (error "remote error: tls: bad certificate", ServerName "")
12月 25 15:54:15 hdss7-21.host.com etcd[4150]: rejected connection from "192.168.6.12:55498" (error "remote error: tls: bad certificate", ServerName "")
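When the log is flooded with lines like these, it helps to see which peers are involved. The following is a minimal sketch, not part of etcd: the function name count_rejected_peers is made up for illustration, and it assumes the log lines have the same "rejected connection from" format as above. As far as I understand Go's TLS error text, "remote error" means the alert was sent by the other side, which would suggest that the peers are the ones refusing this node's certificate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "rejected connection" log lines per remote IP.
# Reads etcd log lines on stdin and prints "<count> <ip>", sorted by IP string.
count_rejected_peers() {
  grep 'rejected connection from' \
    | sed -E 's/.*rejected connection from "([0-9.]+):[0-9]+".*/\1/' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}'   # normalise the padding that uniq -c adds
}
```

A possible invocation on the affected node would be: journalctl -u etcd.service --no-pager | count_rejected_peers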
2. My etcd service configuration
Directory layout:
[root@hdss7-21 opt]# tree etcd/
etcd/
├── bin
│ ├── etcd
│ └── etcdctl
├── cfg
│ └── etcd.conf
├── etcd.service
├── ssl
│ ├── ca.pem
│ ├── etcd-peer-key.pem
│ └── etcd-peer.pem
└── ssl-bak
├── ca.pem
├── server-key.pem
└── server.pem
The main configuration files are etcd.conf and etcd.service.
etcd.conf holds the startup parameters for the etcd service:
[root@hdss7-21 cfg]# cat etcd.conf
#[Member]
ETCD_NAME="etcd-2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.6.21:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.6.21:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.6.21:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.6.21:2379,http://127.0.0.1:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.6.12:2380,etcd-2=https://192.168.6.21:2380,etcd-3=https://192.168.6.22:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
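Parameter notes (see also the article "etcd configuration file explained"):
ETCD_NAME="etcd-2": member name; it must match this member's entry in ETCD_INITIAL_CLUSTER
ETCD_DATA_DIR: directory where the data is stored
ETCD_LISTEN_PEER_URLS: URLs this member listens on for traffic from other etcd members
ETCD_LISTEN_CLIENT_URLS: URLs this member listens on for client requests
ETCD_INITIAL_ADVERTISE_PEER_URLS: peer URLs this member advertises to the rest of the cluster
ETCD_ADVERTISE_CLIENT_URLS: client URLs this member advertises to clients
ETCD_INITIAL_CLUSTER: name=peer-URL pairs for all members of the cluster
ETCD_INITIAL_CLUSTER_TOKEN: token used when creating the cluster; keep it unique per cluster
ETCD_INITIAL_CLUSTER_STATE: initial cluster state (new or existing)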
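Several of these values have to agree with each other, so a quick consistency check can be useful before starting the service. This is only a sketch under assumptions: the helper names initial_cluster_url_for and check_member_in_cluster are hypothetical, the config file is assumed to contain plain KEY="value" lines as in etcd.conf above, and each member is assumed to advertise a single peer URL.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of etcd) for sanity-checking an etcd.conf-style file.

# Print the peer URL that an ETCD_INITIAL_CLUSTER value lists for a member name.
# $1 = member name, $2 = value of ETCD_INITIAL_CLUSTER. Fails if the name is absent.
initial_cluster_url_for() (
  IFS=','                      # split the cluster string on commas (subshell only)
  for entry in $2; do
    if [ "${entry%%=*}" = "$1" ]; then
      printf '%s\n' "${entry#*=}"
      exit 0
    fi
  done
  exit 1
)

# Check that ETCD_NAME is listed in ETCD_INITIAL_CLUSTER with the same URL as
# ETCD_INITIAL_ADVERTISE_PEER_URLS.
# $1 = path to the config file; use a path containing a slash. The file is
# sourced as shell, so only run this on files you trust.
check_member_in_cluster() (
  . "$1" || exit 2
  url=$(initial_cluster_url_for "$ETCD_NAME" "$ETCD_INITIAL_CLUSTER") || {
    echo "$ETCD_NAME is not listed in ETCD_INITIAL_CLUSTER" >&2
    exit 1
  }
  if [ "$url" != "$ETCD_INITIAL_ADVERTISE_PEER_URLS" ]; then
    echo "peer URL mismatch: $url vs $ETCD_INITIAL_ADVERTISE_PEER_URLS" >&2
    exit 1
  fi
  echo "ok: $ETCD_NAME -> $url"
)
```

For the layout in this post, the call would be: check_member_in_cluster /opt/etcd/cfg/etcd.conf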
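The etcd.service file
Make the file executable and copy it to /etc/systemd/system/
(my understanding: this registers the service with systemd; a unit file does not strictly need execute permission)
Its contents: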
[root@hdss7-12 etcd]# cat etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--name=${ETCD_NAME} \
--data-dir=${ETCD_DATA_DIR} \
--listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
--listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
--advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
--initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
--initial-cluster=${ETCD_INITIAL_CLUSTER} \
--initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
--initial-cluster-state=new \
--cert-file=/opt/etcd/ssl/server.pem \
--key-file=/opt/etcd/ssl/server-key.pem \
--peer-cert-file=/opt/etcd/ssl/server.pem \
--peer-key-file=/opt/etcd/ssl/server-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
3. Fixing the problem
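The etcd package had been copied from another machine, and the certificates in it had not been updated. I replaced the certificates and pointed the service at the new ones (the main change is in /etc/systemd/system/etcd.service).
The updated file: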
[root@hdss7-21 system]# cat etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--name=${ETCD_NAME} \
--data-dir=${ETCD_DATA_DIR} \
--listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
--listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
--advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
--initial-advertise-peer-urls=${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
--initial-cluster=${ETCD_INITIAL_CLUSTER} \
--initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
--initial-cluster-state=new \
--cert-file=/opt/etcd/ssl/etcd-peer.pem \
--key-file=/opt/etcd/ssl/etcd-peer-key.pem \
--peer-cert-file=/opt/etcd/ssl/etcd-peer.pem \
--peer-key-file=/opt/etcd/ssl/etcd-peer-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
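Since the whole fix comes down to which certificate files the unit points at, it can help to list those paths and confirm they exist before restarting. The sketch below is an illustration only: unit_tls_files and check_unit_tls_files are hypothetical names, and the parsing assumes one flag per line in the --flag=value form used in the unit file above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of etcd or systemd).

# List the TLS-related flags and file paths found in a unit file.
# $1 = path to the unit file; prints "<flag> <path>" per line.
unit_tls_files() {
  sed -nE 's/^[[:space:]]*--((peer-)?(cert|key|trusted-ca)-file)=([^[:space:]\\]+).*/\1 \4/p' "$1"
}

# Report every referenced file that is missing or unreadable.
# Returns non-zero if at least one file is reported.
check_unit_tls_files() {
  missing=$(unit_tls_files "$1" | while read -r flag path; do
    [ -r "$path" ] || echo "missing: --$flag=$path"
  done)
  [ -z "$missing" ] && return 0
  printf '%s\n' "$missing"
  return 1
}

# Possible follow-up checks with openssl (not run here):
#   openssl x509 -in /opt/etcd/ssl/etcd-peer.pem -noout -subject -issuer -dates
#   openssl x509 -in /opt/etcd/ssl/etcd-peer.pem -noout -text | grep -A1 'Subject Alternative Name'
#   openssl verify -CAfile /opt/etcd/ssl/ca.pem /opt/etcd/ssl/etcd-peer.pem
```

The existence check alone would not have caught the original problem, because the stale files were present; the openssl commands in the trailing comments are the part that shows whether a certificate was issued by the expected CA and lists the node's IP.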
4. Restart the service
systemctl daemon-reload
systemctl start etcd
The service now comes up.
Status output:
[root@hdss7-21 system]# systemctl status etcd.service
● etcd.service - Etcd Server
Loaded: loaded (/etc/systemd/system/etcd.service; disabled; vendor preset: disabled)
Active: active (running) since 五 2020-12-25 16:12:11 CST; 39s ago
Main PID: 4306 (etcd)
Memory: 11.0M
CGroup: /system.slice/etcd.service
└─4306 /opt/etcd/bin/etcd --name=etcd-2 --data-dir=/var/lib/etcd/default.etcd --listen-peer-urls...
12月 25 16:12:20 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...GE")
12月 25 16:12:25 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...OT")
12月 25 16:12:25 hdss7-21.host.com etcd[4306]: health check for peer 44cf137c1267f893 could not connec...GE")
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: peer 44cf137c1267f893 became active
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...ter)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...ter)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...der)
12月 25 16:12:26 hdss7-21.host.com etcd[4306]: established a TCP streaming connection with peer 44cf13...der)
12月 25 16:12:27 hdss7-21.host.com etcd[4306]: updated the cluster version from 3.0 to 3.3
12月 25 16:12:27 hdss7-21.host.com etcd[4306]: enabled capabilities for version 3.3
Hint: Some lines were ellipsized, use -l to show in full.
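Because the original symptom was a unit hanging in activating (start), a small polling helper can make the restart step scriptable. This is a generic sketch, not something from the original setup: wait_until is a hypothetical name, and the attempt count and delay in the example call are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# $1 = maximum number of attempts, $2 = delay between attempts in seconds,
# remaining arguments = the command to run.
wait_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" && return 0                       # success: stop polling
    [ "$i" -ge "$attempts" ] && return 1   # out of attempts: give up
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, to wait up to about a minute for the unit to become active: wait_until 30 2 systemctl is-active --quiet etcd.service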
5. Check the health of the etcd cluster
[root@hdss7-21 bin]# ./etcdctl cluster-health
member 44cf137c1267f893 is healthy: got healthy result from http://127.0.0.1:2379
member 47856ed020c3771a is healthy: got healthy result from http://127.0.0.1:2379
member 4f089e69d0c31399 is healthy: got healthy result from http://127.0.0.1:2379
cluster is healthy
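For use in scripts, the output above can be reduced to a single line and an exit code. The following is a sketch that assumes the output format shown here; summarize_cluster_health is a hypothetical name, and any "member" line that does not contain "is healthy" is simply counted as not healthy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise `etcdctl cluster-health` output read from stdin.
# Prints "<healthy>/<total> members healthy" and succeeds only if every member
# line is healthy and the final "cluster is healthy" line is present.
summarize_cluster_health() {
  awk '
    /^member /              { total++; if ($0 ~ / is healthy/) healthy++ }
    /^cluster is healthy$/  { ok = 1 }
    END {
      printf "%d/%d members healthy\n", healthy, total
      exit !(ok && total > 0 && healthy == total)
    }'
}
```

It could be used as: /opt/etcd/bin/etcdctl cluster-health | summarize_cluster_health. Note that cluster-health belongs to etcdctl's v2 API; with ETCDCTL_API=3 the corresponding command is etcdctl endpoint health, whose output format differs, so this parser would not apply to it.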