Environment
Red Hat OpenShift Data Foundation (RHODF) v4.x
Red Hat OpenShift Container Storage (RHOCS) v4.x
Red Hat OpenShift Container Platform (RHOCP) v4.x
Red Hat Ceph Storage (RHCS) 4.x
Red Hat Ceph Storage (RHCS) 5.x
Red Hat Ceph Storage (RHCS) 6.x
Issue
A cluster in HEALTH_WARN with 1 MDSs report slow requests and 1 MDSs behind on trimming.
For example:
[pao@edon1 ~]$ ceph -s
cluster:
id: 8d23xxxx-Redacted-Cluster-ID-yyyya00794f2
health: HEALTH_WARN
1 MDSs report slow requests
1 MDSs behind on trimming
services:
mon: 3 daemons, quorum edon3,edon2,edon1 (age 6d)
mgr: edon1(active, since 6d), standbys: edon1, edon1
mds: 2/2 daemons up, 1 standby
osd: 324 osds: 324 up (since 6d), 324 in (since 6d)
data:
volumes: 1/1 healthy
pools: 13 pools, 10625 pgs
objects: 101.79M objects, 51 TiB
usage: 158 TiB used, 458 TiB / 616 TiB avail
pgs: 10625 active+clean
[pao@edon1 ~]$ ceph health detail
HEALTH_WARN 1 MDSs report slow requests; 1 MDSs behind on trimming
[WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
mds.cephfs.edon1.xxxyyy(mds.0): 569 slow requests are blocked > 30 secs
[WRN] MDS_TRIM: 1 MDSs behind on trimming
mds.cephfs.edon1.xxxyyy(mds.0): Behind on trimming (431866/128) max_segments: 128, num_segments: 431866
Additionally, this KCS applies to MDS Slow Requests (Blocked Ops) that have been present for hours and where the MDS has not been restarted.
Normally, restarting the MDS while it is behind on trimming is a very bad idea.
Resolution
The code issue is resolved in RHCS 6.1z2 and later, and will be resolved in RHCS 5.3z6.
Red Hat recommends upgrading to remove the possibility of this issue from your environment.
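Before upgrading, it can help to confirm which release the daemons are currently running. A minimal sketch, assuming the ceph CLI is available from an admin node:
# Show the Ceph version reported by each daemon type
# (RHCS 5.x corresponds to upstream Pacific 16.2.x, RHCS 6.x to Quincy 17.2.x)
ceph versions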
Workaround:
- Evict the client holding the Slow Request or Blocked Op (see the command sketch after this list).
- Restarting the MDS service is generally not recommended.
  - Because the MDS is behind on trimming, restarting it could put the MDS into a journal replay loop.
- Use the "Diagnostic Steps" below to determine whether the issue matches.
- If it matches, evict the single oldest MDS client for the sake of all the other MDS clients.
- Once resolved, there may be a large amount of cleanup in the client environment.
  - This is because the MDS was blocked for many hours.
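A minimal command sketch of the eviction workaround, assuming placeholder values <mds-name> and <client-id>; substitute the MDS name and the client ID identified in the Diagnostic Steps:
# List current client sessions on the affected MDS (to cross-check the client ID)
ceph tell mds.<mds-name> session ls
# Evict the client holding the oldest Blocked Op
ceph tell mds.<mds-name> client evict id=<client-id>
# Confirm the slow request and trimming warnings begin to clear
ceph health detail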
Root Cause
A small code issue in which 2 MDS requests deadlock with each other.
Artifacts:
RHCS 5.3.6 Errata RHSA-2024:0745
RHCS 6.1.2 Errata RHSA-2023:5693
RHCS 7.0 Errata RHBA-2023:7780
Diagnostic Steps
Check to make sure the issue matches:
[pao@edon1 ~]$ ceph tell mds.cephfs.edon1.xxxyyy dump_blocked_ops > mds.edon1_dump_blocked_ops
[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops -c
17771
[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops | sort | head -4
"initiated_at": "2023-08-27T05:15:06.892950+0000",
"initiated_at": "2023-08-27T05:15:06.946241+0000",
"initiated_at": "2023-08-27T05:15:08.815470+0000",
"initiated_at": "2023-08-27T05:15:12.489081+0000",
[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops | sort | tail -4
"initiated_at": "2023-08-28T12:35:48.849942+0000",
"initiated_at": "2023-08-28T12:35:55.138531+0000",
"initiated_at": "2023-08-28T12:36:01.142467+0000",
"initiated_at": "2023-08-28T12:36:03.433241+0000",
We see there are 17771 blocked Ops and that the issue started roughly 32 hours earlier. If you hit a similar issue, follow the steps below.
Note the deadlock between the two operations starting at 2023-08-27T05:15:06.
If the Linux command jq is not available, use less or vim to locate the timestamps in the output file and record all of the data for the two oldest Blocked Ops.
[pao@edon1 ~]$ cat mds.edon1_dump_blocked_ops | jq '.ops | sort_by(.initiated_at)' | less
[
{
"description": "client_request(client.190868098:174670 getattr AsLsXsFs #0x100923f6cef 2023-08-27T05:15:06.891957+0000 caller_uid=0, caller_gid=0{0,})",
"initiated_at": "2023-08-27T05:15:06.892950+0000",
"age": 112893.175672991,
"duration": 112893.406657507,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.190868098:174670",
"op_type": "client_request",
"client_info": {
"client": "client.190868098", <---- Client to evict
"tid": 174670
},
"events": [
{
"time": "2023-08-27T05:15:06.892950+0000",
"event": "initiated"
},
{
"time": "2023-08-27T05:15:06.892953+0000",
"event": "throttled"
},
{
"time": "2023-08-27T05:15:06.892950+0000",
"event": "header_read"
},
{
"time": "2023-08-27T05:15:06.892962+0000",
"event": "all_read"
},
{
"time": "2023-08-27T05:15:06.957344+0000",
"event": "dispatched"
},
{
"time": "2023-08-27T05:15:06.957437+0000",
"event": "failed to rdlock, waiting"
},
{
"time": "2023-08-27T05:15:06.962274+0000",
"event": "failed to rdlock, waiting"
},
{
"time": "2023-08-27T05:15:06.970398+0000",
"event": "failed to rdlock, waiting"
},
{
"time": "2023-08-27T05:15:06.972494+0000",
"event": "failed to rdlock, waiting"
},
{
"time": "2023-08-27T05:15:06.974707+0000",
"event": "failed to rdlock, waiting"
},
{
"time": "2023-08-27T05:15:06.989712+0000",
"event": "failed to rdlock, waiting"
}
]
}
},
{
"description": "client_request(client.190797717:174052 create #0x1000006fb46/container-pod-namespace 2023-08-27T05:15:06.944967+0000 caller_uid=0, caller_gid=0{0,})",
"initiated_at": "2023-08-27T05:15:06.946241+0000",
"age": 112893.12238222299,
"duration": 112893.370337324,
"type_data": {
"flag_point": "failed to rdlock, waiting",
"reqid": "client.190797717:174052",
"op_type": "client_request",
"client_info": {
"client": "client.190797717", <---- Client to evict
"tid": 174052
},
"events": [
{
"time": "2023-08-27T05:15:06.946241+0000",
"event": "initiated"
},
{
"time": "2023-08-27T05:15:06.946242+0000",
"event": "throttled"
},
{
"time": "2023-08-27T05:15:06.946241+0000",
"event": "header_read"
},
{
"time": "2023-08-27T05:15:06.946252+0000",
"event": "all_read"
},
{
"time": "2023-08-27T05:15:06.957448+0000",
"event": "dispatched"
},
{
"time": "2023-08-27T05:15:06.957736+0000",
"event": "failed to wrlock, waiting"
},
{
"time": "2023-08-27T05:15:06.967760+0000",
"event": "acquired locks"
},
{
"time": "2023-08-27T05:15:06.967764+0000",
"event": "acquired locks"
},
{
"time": "2023-08-27T05:15:06.967766+0000",
"event": "failed to xlock, waiting"
},
{
"time": "2023-08-27T05:15:06.972425+0000",
"event": "acquired locks"
},
{
"time": "2023-08-27T05:15:06.972437+0000",
"event": "failed to xlock, waiting"
},
{
"time": "2023-08-27T05:15:06.974656+0000",
"event": "acquired locks"
},
{
"time": "2023-08-27T05:15:06.974704+0000",
"event": "failed to xlock, waiting"
},
{
"time": "2023-08-27T05:15:06.989666+0000",
"event": "acquired locks"
},
{
"time": "2023-08-27T05:15:06.989710+0000",
"event": "failed to rdlock, waiting"
}
]
}
},
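If only the client IDs of the two oldest Blocked Ops are needed, a jq filter along the following lines (a sketch against the same dump file) can shorten the search:
# Print only initiated_at, client, and description for the two oldest blocked ops
jq '.ops | sort_by(.initiated_at) | .[0:2] | .[] | {initiated_at, client: .type_data.client_info.client, description}' mds.edon1_dump_blocked_ops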
Choose one of the 2 clients to evict from the MDS. In this case we chose the create request, reasoning that it had only just started rather than being partway through.
[pao@edon1 ~]$ ceph tell mds.cephfs.edon1.xxxyyy client evict id=190797717
The output of ceph -s should show the system returning to normal.
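A short verification sketch, assuming the same MDS name as above; the blocked-op count and num_segments should both trend down over the following minutes:
# Overall cluster health should return to HEALTH_OK once trimming catches up
ceph -s
ceph health detail
# The number of blocked ops reported by the MDS should drop toward zero
ceph tell mds.cephfs.edon1.xxxyyy dump_blocked_ops | grep -c initiated_at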