Ceph Alert Troubleshooting Case Studies

Problem 1: Resolving the Ceph "low disk space" warning

Overview: sample alert output (the detail line shows how much space remains):
[root@hh-yun-puppet-129021 ~]# ceph health detail
HEALTH_WARN mon.hh-yun-ceph-cinder026-128076 low disk space
mon.hh-yun-ceph-cinder026-128076 low disk space -- 30% avail

Resolution: find out what is consuming space on the monitor host, e.g. from the partition holding the mon data:

du -sh ./* | grep G
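
A frequent cause of this warning is a bloated monitor store under /var/lib/ceph/mon; if that is where du says the space went, compacting the store usually reclaims it (a sketch, using the monitor name from the alert above):

ceph tell mon.hh-yun-ceph-cinder026-128076 compact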

Problem 2: Setting pool application enable on Ceph 14

  1. Background
    After deploying a Ceph 14 (Nautilus) cluster, pools were created without a pool application set, so ceph -s shows a WARN.
  2. Goal
    Set the pool application attribute on each pool after it is created.
  3. Steps
    3.1. rbd
    Command: ceph osd pool application enable ${POOL_NAME} rbd
    Pools involved (a batch sketch follows the list):
    volumes
    images
    vms
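    All three rbd pools can be handled in one loop (a sketch over the pool names above):
    for pool in volumes images vms; do ceph osd pool application enable ${pool} rbd; done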
    3.2. rgw
    Command: ceph osd pool application enable ${POOL_NAME} rgw
    Pools involved:
    .rgw
    .rgw.root
    .rgw.control
    .rgw.gc
    .log
    .users
    .users.uid
    .rgw.buckets.index
    .rgw.buckets
    .rgw.buckets.extra
    default.rgw.log
    3.3. cephfs
    Command: ceph osd pool application enable ${POOL_NAME} cephfs
    Pools involved:
    cephfs-metadata
    cephfs-data
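    Each setting can be verified afterwards with the matching get command, which lists the applications enabled on a pool:
    ceph osd pool application get ${POOL_NAME}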

Problem 3: ceph application not enabled on 1 pool(s)

Resolution: manually add the application to each pool:

root@ceph-node3:~# ceph osd pool application enable cephfs_data cephfs
root@ceph-node3:~# ceph osd pool application enable cephfs_metadata cephfs
root@ceph-node3:~# ceph osd pool application enable rbd rbd --yes-i-really-mean-it

Problem 4: authentication error connecting to the cluster

root@ceph-client:~# ceph -s
2017-08-05 19:27:40.722096 7fa149759700  0 librados: client.admin authentication error (1) Operation not permitted
[errno 1] error connecting to the cluster

Resolution:

root@ceph-admin:~/my-cluster# ceph-deploy install ceph-client
root@ceph-admin:~/my-cluster# ceph-deploy --overwrite-conf admin ceph-client
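If the error persists after pushing the keys, it is often a plain permissions problem: make sure the admin keyring on the client is readable by the user running ceph (an extra check, not part of the original fix):
root@ceph-client:~# chmod +r /etc/ceph/ceph.client.admin.keyring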
root@ceph-client:~# ceph -s
  cluster:
    id:     135bca7f-4582-4d35-a1fa-aa9b86b9c730
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3
    mgr: ceph-node3(active), standbys: ceph-node2, ceph-node1
    mds: 1/1/1 up {0=ceph-node3=up:active}, 2 up:standby
    osd: 3 osds: 3 up, 3 in
    rgw: 3 daemons active

  data:
    pools:   7 pools, 68 pgs
    objects: 212 objects, 3394 bytes
    usage:   3509 MB used, 27211 MB / 30720 MB avail
    pgs:     68 active+clean

root@ceph-client:~#

Problem 5: feature set mismatch

Hit this while testing CephFS and RBD, when mounting CephFS with the kernel driver:

root@ceph-client:~# mount /root/test_fs
mount error 110 = Connection timed out
root@ceph-client:/var/log# tail -f syslog
Aug 5 14:03:06 ceph-client kernel: [ 565.194592] libceph: mon0 192.168.56.201:6789 feature set mismatch, my 107b84a842aca < server's 40107b84a842aca, missing 400000000000000

Resolution:

root@ceph-client:/var/log# ceph osd crush tunables hammer
root@ceph-client:/var/log# ceph osd crush reweight-all
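
The missing bit, 400000000000000, is the CRUSH_TUNABLES5 feature that comes with the jewel tunables profile and is not understood by older kernel clients; dropping the profile to hammer removes the requirement. The active profile can be inspected before and after the change:
root@ceph-client:/var/log# ceph osd crush show-tunables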

Problem 6: librados: client.bootstrap-rgw authentication error (1) Operation not permitted

Resolution: copy the key back to its correct path:
root@ceph-admin:~/my-cluster# cp ceph.bootstrap-rgw.keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring

Problem 7: s3cmd cannot create a bucket: gaierror: [Errno -2] Name or service not known
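
The gaierror usually means s3cmd is resolving a virtual-host-style name (bucket.hostname) that has no DNS entry; pointing host_base and host_bucket at the RGW endpoint in ~/.s3cfg is the common workaround (a sketch, endpoint taken from the URL in Problem 8):

host_base = 192.168.56.200:7480
host_bucket = 192.168.56.200:7480
use_https = False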

Problem 8: Bucket not accessible from IE - AccessDenied

http://192.168.56.200:7480/my-new-bucket
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AccessDenied</Code>
  <BucketName>my-new-bucket</BucketName>
  <RequestId>tx000000000000000000007-00598d4b45-78a3e-default</RequestId>
  <HostId>78a3e-default-default</HostId>
</Error>
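
RGW denies anonymous browser requests unless the bucket and its objects are publicly readable; one way to open them up is with s3cmd ACLs (a sketch, assuming a working s3cmd setup and the bucket from the error above):
s3cmd setacl s3://my-new-bucket --acl-public --recursive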

Problem 9: Four features had to be disabled before cephfs could be mounted and used
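
If this actually refers to mapping the RBD image from Problem 11 with the kernel client, the four image features the kernel driver commonly rejects are disabled like this (an assumption about which four features were meant; image name taken from Problem 11):
rbd feature disable rbd/rbd_test_image exclusive-lock object-map fast-diff deep-flatten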

Problem 10: osd pg number exceeds 300

Resolution:

Update the config file on ceph-admin, under the [global] section:
vi /root/my-cluster/ceph.conf
mon_pg_warn_max_per_osd = 1000

Push the updated config file to the whole cluster:
ceph-deploy --overwrite-conf config push ceph-admin ceph-node1 ceph-node2 ceph-node3

Restart the mons and OSDs on each node:
root@ceph-node3:/etc/ceph# systemctl restart ceph-mon.target
root@ceph-node3:/etc/ceph# systemctl restart ceph-osd.target
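
The new threshold can also be injected into the running mons without a restart; injected values do not survive a reboot, so keep the ceph.conf change as well (an alternative path, not in the original notes):
ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 1000'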

Problem 11: An fstab mount entry for the rbd device keeps the system from booting on its own, and this has to be run manually every time:

rbd map rbd/rbd_test_image --id admin --keyring /etc/ceph/ceph.client.admin.keyring
This needs further troubleshooting.
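
A standard way to make the mapping survive reboots is the rbdmap service: list the image in /etc/ceph/rbdmap, mark the fstab entry noauto so boot does not block on a not-yet-mapped device, and let the service map and mount it (a sketch under those assumptions; the mount point /mnt/rbd_test is hypothetical):

# /etc/ceph/rbdmap -- one image per line
rbd/rbd_test_image id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

# /etc/fstab -- noauto keeps boot from hanging on the unmapped device
/dev/rbd/rbd/rbd_test_image  /mnt/rbd_test  ext4  noauto  0 0

# enable the service that maps the images (and mounts the noauto entries) at boot
systemctl enable rbdmap.service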

Problem 12: "Failed to start Ceph object storage daemon" hit at system boot
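
The unit in that message is the OSD daemon; the usual first step is to find which OSD unit failed and read its journal (a generic debugging sketch; osd id 0 is only an example):
systemctl --failed | grep ceph
journalctl -xe -u ceph-osd@0.service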
