Version
Ceph Luminous 12.2.11
Error message
/home/jenkins-build/build/workspace/ceph-build/ARCH/arm64/AVAILABLE_ARCH/arm64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.11/rpm/el7/BUILD/ceph-12.2.11/src/mon/AuthMonitor.cc: In function 'virtual void AuthMonitor::update_from_paxos(bool*)' thread ffff79840010 time 2019-10-11 14:28:13.809177
/home/jenkins-build/build/workspace/ceph-build/ARCH/arm64/AVAILABLE_ARCH/arm64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.11/rpm/el7/BUILD/ceph-12.2.11/src/mon/AuthMonitor.cc: 157: FAILED assert(ret == 0)
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x100) [0xaaaaada5ef3c]
2: (AuthMonitor::update_from_paxos(bool*)+0x1340) [0xaaaaad8967a8]
3: (PaxosService::refresh(bool*)+0x1b4) [0xaaaaad9568ec]
4: (Monitor::refresh_from_paxos(bool*)+0x184) [0xaaaaad8336b8]
5: (Monitor::init_paxos()+0x114) [0xaaaaad833abc]
6: (Monitor::preinit()+0xa68) [0xaaaaad834588]
7: (main()+0x3cd0) [0xaaaaad780ae0]
8: (__libc_start_main()+0xf0) [0xffff79970d64]
9: (()+0x3ac768) [0xaaaaad80c768]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
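The useful parts of a crash dump like this are the failed assertion and the numbered backtrace frames. Below is a minimal sketch for pulling just those lines out of a mon log. The function name `summarize_crash` and the log path are assumptions for illustration (the path follows the usual `/var/log/ceph/<cluster>-mon.<id>.log` layout and may differ on your system).

```shell
#!/bin/sh
# Hypothetical helper: print only the failed assertion and the numbered
# backtrace frames from a Ceph log file passed as $1.
summarize_crash() {
    grep -E 'FAILED assert|^[[:space:]]*[0-9]+: \(' "$1"
}

# Assumed log location for the mon with id "mon01"; adjust as needed.
LOG=/var/log/ceph/ceph-mon.mon01.log
if [ -r "$LOG" ]; then
    summarize_crash "$LOG" || echo "no assertion found in $LOG"
fi
```

For the log above, this would keep the `FAILED assert(ret == 0)` line and frames 1 through 9, which is enough to see that the mon dies in `AuthMonitor::update_from_paxos` while loading its local store during `Monitor::preinit`.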
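Troubleshooting
- Apart from the mon service on this node, all other services and all other nodes were running normally.
- Starting the mon service manually from the command line produced the same error.
- No system-level or other errors were found.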
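Recovery
Try rebuilding the mon on this node directly. Because the remaining mons still form a quorum, a freshly created mon store should sync from them once the service starts.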
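# Work in the mon data directory
cd /var/lib/ceph/mon/
# Fetch the current monmap from the surviving quorum
ceph mon getmap -o /tmp/monmap
# Save this mon's keyring before wiping its store
cp -a ceph-mon01/keyring /tmp/
# Remove the damaged mon data directory
rm -rf ceph-mon01/
# Recreate an empty mon store from the monmap and keyring
ceph-mon --cluster ceph --id mon01 --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
# Restore ownership to the ceph user, then start the service
chown -R ceph:ceph ceph-mon01/
systemctl start ceph-mon@mon01.service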
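After the service starts, it is worth confirming that the rebuilt mon has actually rejoined the quorum. The sketch below checks whether a given mon name appears in the `quorum_names` list of `ceph quorum_status` JSON output. The helper name `mon_in_quorum` is hypothetical, and the text matching is a rough stand-in for a real JSON parser, so treat it as an illustration rather than a robust check.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph quorum_status` JSON on stdin and succeed
# if the mon named in $1 is listed under "quorum_names".
mon_in_quorum() {
    # Join lines so pretty-printed JSON is handled too, isolate the
    # quorum_names array, then look for the quoted mon name inside it.
    tr -d '\n' |
        grep -Eo '"quorum_names"[[:space:]]*:[[:space:]]*\[[^]]*\]' |
        grep -Fq "\"$1\""
}

# Only query the cluster when the ceph CLI is available on this host.
if command -v ceph >/dev/null 2>&1; then
    if ceph quorum_status -f json | mon_in_quorum mon01; then
        echo "mon01 is in quorum"
    else
        echo "mon01 is NOT in quorum yet"
    fi
fi
```

A newly rebuilt mon may take a short while to sync its store before it shows up in the quorum, so a negative result immediately after `systemctl start` is not necessarily a failure.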