Ceph Alert Troubleshooting Case Studies

Problem 1: Resolving the Ceph "low disk space" warning

Overview: sample alert output (the detail line shows how much space remains):
[root@hh-yun-puppet-129021 ~]# ceph health detail
HEALTH_WARN mon.hh-yun-ceph-cinder026-128076 low disk space
mon.hh-yun-ceph-cinder026-128076 low disk space -- 30% avail

Resolution: find out what is consuming space on the monitor host, e.g. from the partition holding the mon data:

du -sh ./* | grep G
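
A frequent cause of this warning is a bloated monitor store under /var/lib/ceph/mon; if that is where du says the space went, compacting the store usually reclaims it (a sketch, using the monitor name from the alert above):

ceph tell mon.hh-yun-ceph-cinder026-128076 compact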

Problem 2: Setting pool application enable on Ceph 14

  1. Background
    After deploying a Ceph 14 (Nautilus) cluster, pools were created without a pool application set, so ceph -s shows a WARN.
  2. Goal
    Set the pool application attribute on each pool after it is created.
  3. Steps
    3.1. rbd
    Command: ceph osd pool application enable ${POOL_NAME} rbd
    Pools involved (a batch sketch follows the list):
    volumes
    images
    vms
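    All three rbd pools can be handled in one loop (a sketch over the pool names above):
    for pool in volumes images vms; do ceph osd pool application enable ${pool} rbd; done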
    3.2. rgw
    Command: ceph osd pool application enable ${POOL_NAME} rgw
    Pools involved:
    .rgw
    .rgw.root
    .rgw.control
    .rgw.gc
    .log
    .users
    .users.uid
    .rgw.buckets.index
    .rgw.buckets
    .rgw.buckets.extra
    default.rgw.log
    3.3. cephfs
    Command: ceph osd pool application enable ${POOL_NAME} cephfs
    Pools involved:
    cephfs-metadata
    cephfs-data
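    Each setting can be verified afterwards with the matching get command, which lists the applications enabled on a pool:
    ceph osd pool application get ${POOL_NAME}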

Problem 3: ceph application not enabled on 1 pool(s)

Resolution: manually add the application to each pool:

root@ceph-node3:~# ceph osd pool application enable cephfs_data cephfs
root@ceph-node3:~# ceph osd pool application enable cephfs_metadata cephfs
root@ceph-node3:~# ceph osd pool application enable rbd rbd --yes-i-really-mean-it

Problem 4: authentication error connecting to the cluster

root@ceph-client:~# ceph -s
2017-08-05 19:27:40.722096 7fa149759700  0 librados: client.admin authentication error (1) Operation not permitted
[errno 1] error connecting to the cluster

Resolution:

root@ceph-admin:~/my-cluster# ceph-deploy install ceph-client
root@ceph-admin:~/my-cluster# ceph-deploy --overwrite-conf admin ceph-client
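If the error persists after pushing the keys, it is often a plain permissions problem: make sure the admin keyring on the client is readable by the user running ceph (an extra check, not part of the original fix):
root@ceph-client:~# chmod +r /etc/ceph/ceph.client.admin.keyring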
root@ceph-client:~# ceph -s
  cluster:
    id:     135bca7f-4582-4d35-a1fa-aa9b86b9c730
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
    mgr: ceph-node3(active), standbys: ceph-node2, ceph-node1
    mds: 1/1/1 up {0=ceph-node3=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in
    rgw: 3 daemons active

  data:
    pools:   7 pools, 68 pgs
    objects: 212 objects, 3394 bytes
    usage:   3509 MB used, 27211 MB / 30720 MB avail
    pgs:     68 active+clean

root@ceph-client:~#

Problem 5: feature set mismatch

Hit this while testing CephFS and RBD, when mounting CephFS with the kernel driver:

root@ceph-client:~# mount /root/test_fs
mount error 110 = Connection timed out
root@ceph-client:/var/log# tail -f syslog
Aug 5 14:03:06 ceph-client kernel: [ 565.194592] libceph: mon0 192.168.56.201:6789 feature set mismatch, my 107b84a842aca < server's 40107b84a842aca, missing 400000000000000

Resolution:

root@ceph-client:/var/log# ceph osd crush tunables hammer
root@ceph-client:/var/log# ceph osd crush reweight-all
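
The missing bit, 400000000000000, is the CRUSH_TUNABLES5 feature that comes with the jewel tunables profile and is not understood by older kernel clients; dropping the profile to hammer removes the requirement. The active profile can be inspected before and after the change:
root@ceph-client:/var/log# ceph osd crush show-tunables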

Problem 6: librados: client.bootstrap-rgw authentication error (1) Operation not permitted

Resolution: copy the key back to its correct path:
root@ceph-admin:~/my-cluster# cp ceph.bootstrap-rgw.keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring

Problem 7: s3cmd cannot create a bucket: gaierror: [Errno -2] Name or service not known
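
The gaierror usually means s3cmd is resolving a virtual-host-style name (bucket.hostname) that has no DNS entry; pointing host_base and host_bucket at the RGW endpoint in ~/.s3cfg is the common workaround (a sketch, endpoint taken from the URL in Problem 8):

host_base = 192.168.56.200:7480
host_bucket = 192.168.56.200:7480
use_https = False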

Problem 8: Bucket not accessible from IE - AccessDenied

http://192.168.56.200:7480/my-new-bucket
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AccessDenied</Code>
  <BucketName>my-new-bucket</BucketName>
  <RequestId>tx000000000000000000007-00598d4b45-78a3e-default</RequestId>
  <HostId>78a3e-default-default</HostId>
</Error>
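
RGW denies anonymous browser requests unless the bucket and its objects are publicly readable; one way to open them up is with s3cmd ACLs (a sketch, assuming a working s3cmd setup and the bucket from the error above):
s3cmd setacl s3://my-new-bucket --acl-public --recursive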

Problem 9: Four features had to be disabled before cephfs could be mounted and used
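
If this actually refers to mapping the RBD image from Problem 11 with the kernel client, the four image features the kernel driver commonly rejects are disabled like this (an assumption about which four features were meant; image name taken from Problem 11):
rbd feature disable rbd/rbd_test_image exclusive-lock object-map fast-diff deep-flatten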

Problem 10: osd pg number exceeds 300

Resolution:

Update the config file on ceph-admin, under the [global] section:
vi /root/my-cluster/ceph.conf
mon_pg_warn_max_per_osd = 1000

Push the updated config file to the whole cluster:
ceph-deploy --overwrite-conf config push ceph-admin ceph-node1 ceph-node2 ceph-node3

Restart the mons and OSDs on each node:
root@ceph-node3:/etc/ceph# systemctl restart ceph-mon.target
root@ceph-node3:/etc/ceph# systemctl restart ceph-osd.target
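
The new threshold can also be injected into the running mons without a restart; injected values do not survive a reboot, so keep the ceph.conf change as well (an alternative path, not in the original notes):
ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 1000'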

Problem 11: An fstab mount entry for the rbd device keeps the system from booting on its own, and this has to be run manually every time:

rbd map rbd/rbd_test_image --id admin --keyring /etc/ceph/ceph.client.admin.keyring
This needs further troubleshooting.
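
A standard way to make the mapping survive reboots is the rbdmap service: list the image in /etc/ceph/rbdmap, mark the fstab entry noauto so boot does not block on a not-yet-mapped device, and let the service map and mount it (a sketch under those assumptions; the mount point /mnt/rbd_test is hypothetical):

# /etc/ceph/rbdmap -- one image per line
rbd/rbd_test_image id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

# /etc/fstab -- noauto keeps boot from hanging on the unmapped device
/dev/rbd/rbd/rbd_test_image  /mnt/rbd_test  ext4  noauto  0 0

# enable the service that maps the images (and mounts the noauto entries) at boot
systemctl enable rbdmap.service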

Problem 12: "Failed to start Ceph object storage daemon" hit at system boot
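
The unit in that message is the OSD daemon; the usual first step is to find which OSD unit failed and read its journal (a generic debugging sketch; osd id 0 is only an example):
systemctl --failed | grep ceph
journalctl -xe -u ceph-osd@0.service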
