Notes on pitfalls installing Ceph Octopus 15.2 with cephadm on CentOS 7.9, with instructions for mounting from K8s

This article records in detail the problems encountered while installing Ceph Octopus 15.2 with cephadm on CentOS 7.9 and how they were solved, including upgrading the system kernel, upgrading the container runtime, configuring time synchronization, and setting up the firewall. It also covers resolving the permission and kernel-version issues that appear when mounting CephFS from K8s, as well as configuring the Ceph object storage gateway (RGW) and troubleshooting access failures.


1 Preparation

1.1 Configure shell history

1. Edit the profile file
(1) Add timestamps to history entries: append HISTTIMEFORMAT at the end of the file
(2) Increase the number of history entries: raise HISTSIZE from the default 1000 to 10000

# vi /etc/profile
export HISTTIMEFORMAT='%F %T '
HISTSIZE=10000

2. Apply the configuration

# source /etc/profile
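The `%F %T` value above is a standard strftime format (ISO date plus 24-hour time). A quick way to preview what the new history timestamps will look like:

```shell
# Preview the timestamp format that HISTTIMEFORMAT will prepend to each entry:
# %F expands to YYYY-MM-DD and %T to HH:MM:SS.
date '+%F %T'
```

After sourcing the profile, `history | tail` should show each command prefixed with a timestamp in exactly this form.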

1.2 Upgrade the system kernel

1. Check the current kernel:

# uname -rs 

2. Enable the ELRepo repository on CentOS 7
Import the repository's GPG key

# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

3. CentOS 7: install the repository package

# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
Retrieving http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
Retrieving http://elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:elrepo-release-7.0-4.el7.elrepo  ################################# [100%]

CentOS 8: install the repository package

# yum install https://www.elrepo.org/elrepo-release-8.0-1.el8.elrepo.noarch.rpm

4. List the kernel versions available for installation

# yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * elrepo-kernel: mirrors.neusoft.edu.cn
elrepo-kernel                                                                                                                                                          | 3.0 kB  00:00:00     
elrepo-kernel/primary_db                                                                                                                                                  | 2.0 MB  00:00:16     
Available Packages
elrepo-release.noarch                 7.0-5.el7.elrepo        elrepo-kernel
kernel-lt.x86_64                      5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-devel.x86_64                5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-doc.noarch                  5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-headers.x86_64              5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-tools.x86_64                5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-tools-libs.x86_64           5.4.113-1.el7.elrepo    elrepo-kernel
kernel-lt-tools-libs-devel.x86_64     5.4.113-1.el7.elrepo    elrepo-kernel
kernel-ml.x86_64                      5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-devel.x86_64                5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-doc.noarch                  5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-headers.x86_64              5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-tools.x86_64                5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-tools-libs.x86_64           5.11.15-1.el7.elrepo    elrepo-kernel
kernel-ml-tools-libs-devel.x86_64     5.11.15-1.el7.elrepo    elrepo-kernel
perf.x86_64                           5.11.15-1.el7.elrepo    elrepo-kernel
python-perf.x86_64

5. Install the long-term support (lt) kernel (recommended; do NOT install the unstable mainline version)

# yum --enablerepo=elrepo-kernel install kernel-lt -y

6. Fix the boot hang
This step is mandatory; without it the system will not boot. After switching kernels on CentOS 7, boot can hang with the message: pstore: unknown compression: deflate.

# vi /etc/default/grub

Append mgag200.modeset=0 to the end of GRUB_CMDLINE_LINUX, i.e.:

GRUB_CMDLINE_LINUX="crashkernel=auto spectre_v2=retpoline rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet mgag200.modeset=0"

7. Regenerate the grub configuration (this host boots via UEFI, hence the /boot/efi path; on BIOS systems the file is /boot/grub2/grub.cfg)

# grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg

8. List the current boot entries and their order

# awk -F\' '$1=="menuentry " {print i++ " : " $2}' /boot/efi/EFI/centos/grub.cfg
0 : CentOS Linux (5.4.113-1.el7.elrepo.x86_64) 7 (Core)
1 : CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)
2 : CentOS Linux (0-rescue-022eb2e91fc04884b3ecfa2051d8e32a) 7 (Core)
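To make the awk one-liner above less cryptic: it splits each line on single quotes, keeps lines whose first field is `menuentry `, and numbers the quoted titles. It can be tried against a small sample fragment (written to a temporary file purely for illustration):

```shell
# Demonstrate the menuentry parser on a sample grub.cfg fragment.
cat > /tmp/grub-sample.cfg <<'EOF'
menuentry 'CentOS Linux (5.4.113-1.el7.elrepo.x86_64) 7 (Core)' --class centos {
menuentry 'CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)' --class centos {
EOF
awk -F\' '$1=="menuentry " {print i++ " : " $2}' /tmp/grub-sample.cfg
# 0 : CentOS Linux (5.4.113-1.el7.elrepo.x86_64) 7 (Core)
# 1 : CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)
```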

9. Set the default boot kernel

# grub2-set-default "CentOS Linux (5.4.113-1.el7.elrepo.x86_64) 7 (Core)"

10. Verify the default boot kernel

# grub2-editenv list
saved_entry=CentOS Linux (5.4.113-1.el7.elrepo.x86_64) 7 (Core)

11. Reboot the system

# reboot

12. Check the running kernel version:

# uname -rs

1.3 Configure password-less SSH login

1. On every node: edit the hosts file to map the hostnames

# sudo vi /etc/hosts
10.14.83.183 ceph1 ceph1
10.14.83.184 ceph2 ceph2
10.14.83.119 ceph3 ceph3

2. On every node: add a user for Ceph (do not name it "ceph"; that name is reserved internally by Ceph and installation will fail)

ceph1:# useradd cephpmsc
ceph2:# useradd cephpmsc
ceph3:# useradd cephpmsc

3. On every node: set the user's password

ceph1:# passwd cephpmsc
ceph1:# B9N5SjkXJd*****
ceph2:# passwd cephpmsc
ceph2:# B9N5SjkXJd*****
ceph3:# passwd cephpmsc
ceph3:# B9N5SjkXJd*****

4. On every node: grant sudo privileges

ceph1:# echo "cephpmsc ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephpmsc
ceph1:# sudo chmod 0440 /etc/sudoers.d/cephpmsc
ceph2:# echo "cephpmsc ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephpmsc
ceph2:# sudo chmod 0440 /etc/sudoers.d/cephpmsc
ceph3:# echo "cephpmsc ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephpmsc
ceph3:# sudo chmod 0440 /etc/sudoers.d/cephpmsc

5. On every node: switch to the new user

ceph1: # su - cephpmsc
ceph2: # su - cephpmsc
ceph3: # su - cephpmsc

6. On every node: create the SSH client config file:

$ mkdir ~/.ssh
$ vi ~/.ssh/config
Host ceph1
   Hostname ceph1
   User cephpmsc
Host ceph2
   Hostname ceph2
   User cephpmsc
Host ceph3
   Hostname ceph3
   User cephpmsc

7. Fix the config file permissions (otherwise ssh reports "Bad owner or permissions on .ssh/config")

ceph1: $ sudo chmod 700 ~/.ssh/config
ceph2: $ sudo chmod 700 ~/.ssh/config
ceph3: $ sudo chmod 700 ~/.ssh/config
ceph1: $ sudo chmod 600 ~/.ssh/authorized_keys
ceph2: $ sudo chmod 600 ~/.ssh/authorized_keys
ceph3: $ sudo chmod 600 ~/.ssh/authorized_keys
ceph1: $ sudo chmod 755 ~/.ssh
ceph2: $ sudo chmod 755 ~/.ssh
ceph3: $ sudo chmod 755 ~/.ssh
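For reference, the permissions OpenSSH itself expects are slightly stricter than the ones above: 700 on `~/.ssh` and 600 on the files inside it. A throwaway sketch that demonstrates the convention on a temporary directory (not a real home directory):

```shell
# Sketch: conventional strict SSH permissions, shown on a temporary directory.
d=$(mktemp -d)
mkdir "$d/.ssh"
touch "$d/.ssh/config" "$d/.ssh/authorized_keys"
chmod 700 "$d/.ssh"                                   # directory: owner-only access
chmod 600 "$d/.ssh/config" "$d/.ssh/authorized_keys"  # files: owner read/write only
stat -c '%a %n' "$d/.ssh" "$d/.ssh/config"
rm -rf "$d"
```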

8. On every node: generate an SSH key pair (as the cephpmsc user); press Enter twice to accept the defaults

ceph1: $ ssh-keygen
ceph2: $ ssh-keygen
ceph3: $ ssh-keygen

9. Copy the public key to the other Ceph nodes (the user already exists and the hosts file is in place, so there is no need to copy to the local node; answer yes when prompted and enter the password B9N5SjkXJ******)

ceph1: $ ssh-copy-id cephpmsc@ceph2
ceph1: $ ssh-copy-id cephpmsc@ceph3
ceph2: $ ssh-copy-id cephpmsc@ceph1
ceph2: $ ssh-copy-id cephpmsc@ceph3
ceph3: $ ssh-copy-id cephpmsc@ceph1
ceph3: $ ssh-copy-id cephpmsc@ceph2

10. Test password-less login (note: the local node must work too)

ceph1: $ ssh cephpmsc@ceph1
ceph1: $ ssh cephpmsc@ceph2
ceph1: $ ssh cephpmsc@ceph3
ceph2: $ ssh cephpmsc@ceph1
ceph2: $ ssh cephpmsc@ceph2
ceph2: $ ssh cephpmsc@ceph3
ceph3: $ ssh cephpmsc@ceph1
ceph3: $ ssh cephpmsc@ceph2
ceph3: $ ssh cephpmsc@ceph3
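The nine manual logins above can be condensed into a loop run on each node (hostnames and user as set up earlier in this guide). `BatchMode=yes` makes ssh fail immediately instead of falling back to a password prompt, which is exactly the failure we want to surface:

```shell
# Check password-less login from this node to every node in the cluster.
for h in ceph1 ceph2 ceph3; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "cephpmsc@$h" true 2>/dev/null; then
    echo "$h: OK"
  else
    echo "$h: FAILED"
  fi
done
```

Any host that still asks for a password (or is unreachable) is reported as FAILED.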

Problem 1: password-less login fails; check the security log

# sudo tail /var/log/secure -n 20

The log shows:

Apr 19 14:56:19 ceph2 sshd[2748]: Authentication refused: bad ownership or modes for directory /home/cephpmsc/.ssh

Fix:
The directory permissions must be 755 or 700; they must not be 77x (group-writable).
Check:

$ sudo ls ~/ -la
drwxrwxr-x. 2 cephpmsc cephpmsc  94 Apr 19 14:58 .ssh

Change it to:

$ sudo chmod 755 ~/.ssh
drwxr-xr-x. 2 cephpmsc cephpmsc  94 Apr 19 14:45 .ssh

Re-run:

# ssh-copy-id cephpmsc@ceph2
ceph1: $ ssh cephpmsc@ceph2

1.4 Upgrade the container runtime

On CentOS 7, Docker must be upgraded to docker-ce 20 or later, or podman installed instead. Note: do not mix docker and podman across the nodes of one cluster.
1. First remove any old Docker packages

# sudo yum remove docker docker-common container-selinux docker-selinux docker-engine docker-ce

2. Update yum

# yum update

3. Install yum-utils; it provides yum-config-manager, used to manage yum repositories

# sudo yum install -y yum-utils

4. Add the Docker yum repository

# sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Loaded plugins: fastestmirror
adding repo from: http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
grabbing file http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo to /etc/yum.repos.d/docker-ce.repo
repo saved to /etc/yum.repos.d/docker-ce.repo

5. Rebuild the yum cache

# sudo yum makecache fast

6. Install docker-ce

# sudo yum install -y docker-ce

7. Start docker

# sudo systemctl start docker

8. Verify the installation

# sudo docker info
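Beyond `docker info`, the engine version itself must be 20 or newer (per the note at the start of this section). A minimal sketch of checking that, assuming the version string comes from `docker version --format '{{.Server.Version}}'` (the sample value below is purely illustrative):

```shell
# Sketch: check a Docker server version string against the >= 20 requirement.
ver="20.10.17"        # in practice: ver=$(docker version --format '{{.Server.Version}}')
major=${ver%%.*}      # keep only the major version (text before the first dot)
if [ "$major" -ge 20 ]; then
  echo "docker-ce OK ($ver)"
else
  echo "docker-ce too old ($ver), upgrade required"
fi
# prints: docker-ce OK (20.10.17)
```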

9. Alternatively, install podman on every node in the cluster (choose either this or the docker-ce upgrade above, not both)

# yum install -y podman

1.5 Configure time synchronization

1. Install chrony
Time synchronization must be enabled on every node; otherwise the distributed consistency protocol's requirements cannot be met and the cluster health status will raise a warning.
Because the Ceph containers are based on CentOS 8, which no longer supports the ntp service, the newer chrony time service is used instead and must be installed:

# yum install chrony -y

2. Start the service and enable it at boot

# systemctl enable chronyd.service
# systemctl start chronyd.service
# systemctl status chronyd.service
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2021-08-24 14:32:11 CST; 5s ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 2743 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 2737 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2740 (chronyd)
    Tasks: 1
   Memory: 2.6M
   CGroup: /system.slice/chronyd.service
           └─2740 /usr/sbin/chronyd

3. Set the system time zone to Asia/Shanghai:

$ timedatectl set-timezone Asia/Shanghai

4. After setting the time zone, force-step the system clock:

$ chronyc -a makestep
200 OK

5. Keep the hardware clock in local time (it defaults to UTC):

$ timedatectl set-local-rtc 1

6. Enable NTP time synchronization:

$ timedatectl set-ntp yes

7. Check tracking against the time servers:

$ chronyc tracking
Reference ID    : 55D62674 (mail.light-speed.de)
Stratum         : 3
Ref time (UTC)  : Sat May 22 01:18:49 2021
System time     : 0.016832426 seconds fast of NTP time
Last offset     : +0.006914454 seconds
RMS offset      : 0.014994740 seconds
Frequency       : 26.872 ppm slow
Residual freq   : +1.933 ppm
Skew            : 132.852 ppm
Root delay      : 0.404872686 seconds
Root dispersion : 0.017830199 seconds
Update interval : 64.7 seconds
Leap status     : Normal
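For scripted health checks, the "System time" offset can be extracted from `chronyc tracking` output with awk; the sample line below is copied from the output above:

```shell
# Sketch: pull the numeric offset (in seconds) out of a `chronyc tracking` line.
# In a live check the line would come from: chronyc tracking | grep '^System time'
line="System time     : 0.016832426 seconds fast of NTP time"
offset=$(echo "$line" | awk -F': *' '{print $2}' | awk '{print $1}')
echo "$offset"
# 0.016832426
```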

8. Switch the chrony configuration to internal NTP sources (the hosts will lose external network access in the future)

[root@ceph3 ~]# vi /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server 10.14.83.52 iburst
server 10.0.64.17 iburst

9. Restart chronyd so the configuration takes effect

[root@ceph3 ~]# systemctl restart chronyd

10. Verify the chronyd source configuration

[root@ceph3 ~]# chronyc sources -v
210 Number of sources = 1

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error,