Ceph Cluster in HEALTH_WARN with 1 MDSs report slow requests and 1 MDSs behind on trimming

Environment

Red Hat OpenShift Data Foundation (RHODF) v4.x

Red Hat OpenShift Container Storage (RHOCS) v4.x

Red Hat OpenShift Container Platform (RHOCP) v4.x

Red Hat Ceph Storage (RHCS) 4.x

Red Hat Ceph Storage (RHCS) 5.x

Red Hat Ceph Storage (RHCS) 6.x

Issue

The cluster is in HEALTH_WARN with 1 MDSs report slow requests and 1 MDSs behind on trimming.

For example:

[pao@edon1 ~]$ ceph -s
  cluster:
    id:     8d23xxxx-Redacted-Cluster-ID-yyyya00794f2
    health: HEALTH_WARN
            1 MDSs report slow requests
            1 MDSs behind on trimming

  services:
    mon: 3 daemons, quorum edon3,edon2,edon1 (age 6d)
    mgr: edon1(active, since 6d), standbys: edon1, edon1
    mds: 2/2 daemons up, 1 standby
    osd: 324 osds: 324 up (since 6d), 324 in (since 6d)

  data:
    volumes: 1/1 healthy
    pools:   13 pools, 10625 pgs
    objects: 101.79M objects, 51 TiB
    usage:   158 TiB used, 458 TiB / 616 TiB avail
    pgs:     10625 active+clean

[pao@edon1 ~]$ ceph health detail
HEALTH_WARN 1 MDSs report slow requests; 1 MDSs behind on trimming
[WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
    mds.cephfs.edon1.xxxyyy(mds.0): 569 slow requests are blocked > 30 secs
[WRN] MDS_TRIM: 1 MDSs behind on trimming
    mds.cephfs.edon1.xxxyyy(mds.0): Behind on trimming (431866/128) max_segments: 128, num_segments: 431866

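In addition, this KCS applies to MDS Slow Requests (Blocked Ops) that have been outstanding for several hours on an MDS that has not yet been restarted.

In general, restarting an MDS while it is behind on trimming is a very bad idea.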
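
To gauge how far behind the journal is, the MDS_TRIM line from the ceph health detail output above can be parsed. The following is a minimal Python sketch; the function name parse_trim_backlog is hypothetical, and the regular expression assumes the line keeps the "Behind on trimming (num/max)" layout shown in this example.

```python
import re

# Sample line copied from the `ceph health detail` output above.
line = ("mds.cephfs.edon1.xxxyyy(mds.0): Behind on trimming (431866/128) "
        "max_segments: 128, num_segments: 431866")


def parse_trim_backlog(text):
    """Return (num_segments, max_segments) from an MDS_TRIM detail line.

    Hypothetical helper: returns None when the expected pattern is absent.
    """
    match = re.search(r"Behind on trimming \((\d+)/(\d+)\)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


num_segments, max_segments = parse_trim_backlog(line)
ratio = num_segments / max_segments
print(f"{num_segments} segments against a limit of {max_segments} "
      f"(about {ratio:.0f} times the limit)")
```

A backlog this far above max_segments is the reason a restart is risky here: the replacement MDS would have to replay that journal.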

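Resolution

The code issue is resolved in RHCS 6.1z2 and later, and for the 5.x stream in RHCS 5.3z6.

Red Hat recommends upgrading to eliminate the possibility of this issue in your environment.

Workaround: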

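  • Evict the client that holds the Slow Request / Blocked Op.
  • Restarting the MDS service is generally not recommended.
  • Because the MDS is behind on trimming, a restart may put the MDS into a journal replay loop.
  • Use the "Diagnostic Steps" to confirm that the issue matches.
  • If it matches, evict the 1 oldest MDS client for the sake of all other MDS clients.
  • Once resolved, there may be a substantial amount of cleanup to do in the client environment.
  • This is because the MDS has been blocked for many hours.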

Root Cause

A minor code issue in which 2 MDS requests deadlock each other.

Artifacts:

Ceph Upstream Tracker #62052

RHCS 5.x Bugzilla #2236190

RHCS 6.x Bugzilla #2236188

RHCS 7.x Bugzilla #2235338

RHCS 5.3.6 Errata RHSA-2024:0745

RHCS 6.1.2 Errata RHSA-2023:5693

RHCS 7.0 Errata RHBA-2023:7780

Diagnostic Steps

Check to make sure the issue matches:

[pao@edon1 ~]$ ceph tell mds.cephfs.edon1.xxxyyy dump_blocked_ops  > mds.edon1_dump_blocked_ops

[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops -c
17771

[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops | sort | head -4
            "initiated_at": "2023-08-27T05:15:06.892950+0000",
            "initiated_at": "2023-08-27T05:15:06.946241+0000",
            "initiated_at": "2023-08-27T05:15:08.815470+0000",
            "initiated_at": "2023-08-27T05:15:12.489081+0000",

[pao@edon1 ~]$ grep initiated_at mds.edon1_dump_blocked_ops | sort | tail -4
            "initiated_at": "2023-08-28T12:35:48.849942+0000",
            "initiated_at": "2023-08-28T12:35:55.138531+0000",
            "initiated_at": "2023-08-28T12:36:01.142467+0000",
            "initiated_at": "2023-08-28T12:36:03.433241+0000",

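We see 17771 blocked ops, and the issue started roughly 31 hours before the dump was taken. If you encounter a similar pattern, follow the steps below.

Note that the deadlock between two operations begins at 2023-08-27T05:15:06.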
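
The elapsed time can be checked from the oldest and newest initiated_at values shown above. This is a small Python sketch using only the two timestamps from this example; the format string assumes the timestamps always carry microseconds and a numeric UTC offset, as they do here.

```python
from datetime import datetime

# Timestamp layout used by dump_blocked_ops in this example (assumption:
# microseconds and a +0000 style offset are always present).
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

oldest = datetime.strptime("2023-08-27T05:15:06.892950+0000", FMT)
newest = datetime.strptime("2023-08-28T12:36:03.433241+0000", FMT)

# Difference between the first and last blocked op, in hours.
span_hours = (newest - oldest).total_seconds() / 3600
print(f"Blocked ops span {span_hours:.1f} hours")
```

The result agrees with the age field of the oldest op in the dump (about 112893 seconds).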

If the Linux command jq is not available, use less or vim to find the timestamps in the output file, and record all of the data for the two oldest Blocked Ops.

[pao@edon1 ~]$ cat mds.edon1_dump_blocked_ops |  jq '.ops | sort_by(.initiated_at)' | less
[
  {
    "description": "client_request(client.190868098:174670 getattr AsLsXsFs #0x100923f6cef 2023-08-27T05:15:06.891957+0000 caller_uid=0, caller_gid=0{0,})",
    "initiated_at": "2023-08-27T05:15:06.892950+0000",
    "age": 112893.175672991,
    "duration": 112893.406657507,
    "type_data": {
      "flag_point": "failed to rdlock, waiting",
      "reqid": "client.190868098:174670",
      "op_type": "client_request",
      "client_info": {
        "client": "client.190868098",            <---- Client to evict
        "tid": 174670
      },
      "events": [
        {
          "time": "2023-08-27T05:15:06.892950+0000",
          "event": "initiated"
        },
        {
          "time": "2023-08-27T05:15:06.892953+0000",
          "event": "throttled"
        },
        {
          "time": "2023-08-27T05:15:06.892950+0000",
          "event": "header_read"
        },
        {
          "time": "2023-08-27T05:15:06.892962+0000",
          "event": "all_read"
        },
        {
          "time": "2023-08-27T05:15:06.957344+0000",
          "event": "dispatched"
        },
        {
          "time": "2023-08-27T05:15:06.957437+0000",
          "event": "failed to rdlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.962274+0000",
          "event": "failed to rdlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.970398+0000",
          "event": "failed to rdlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.972494+0000",
          "event": "failed to rdlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.974707+0000",
          "event": "failed to rdlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.989712+0000",
          "event": "failed to rdlock, waiting"
        }
      ]
    }
  },
  {
    "description": "client_request(client.190797717:174052 create #0x1000006fb46/container-pod-namespace 2023-08-27T05:15:06.944967+0000 caller_uid=0, caller_gid=0{0,})",
    "initiated_at": "2023-08-27T05:15:06.946241+0000",
    "age": 112893.12238222299,
    "duration": 112893.370337324,
    "type_data": {
      "flag_point": "failed to rdlock, waiting",
      "reqid": "client.190797717:174052",
      "op_type": "client_request",
      "client_info": {
        "client": "client.190797717",            <---- Client to evict
        "tid": 174052
      },
      "events": [
        {
          "time": "2023-08-27T05:15:06.946241+0000",
          "event": "initiated"
        },
        {
          "time": "2023-08-27T05:15:06.946242+0000",
          "event": "throttled"
        },
        {
          "time": "2023-08-27T05:15:06.946241+0000",
          "event": "header_read"
        },
        {
          "time": "2023-08-27T05:15:06.946252+0000",
          "event": "all_read"
        },
        {
          "time": "2023-08-27T05:15:06.957448+0000",
          "event": "dispatched"
        },
        {
          "time": "2023-08-27T05:15:06.957736+0000",
          "event": "failed to wrlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.967760+0000",
          "event": "acquired locks"
        },
        {
          "time": "2023-08-27T05:15:06.967764+0000",
          "event": "acquired locks"
        },
        {
          "time": "2023-08-27T05:15:06.967766+0000",
          "event": "failed to xlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.972425+0000",
          "event": "acquired locks"
        },
        {
          "time": "2023-08-27T05:15:06.972437+0000",
          "event": "failed to xlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.974656+0000",
          "event": "acquired locks"
        },
        {
          "time": "2023-08-27T05:15:06.974704+0000",
          "event": "failed to xlock, waiting"
        },
        {
          "time": "2023-08-27T05:15:06.989666+0000",
          "event": "acquired locks"
        },
        {
          "time": "2023-08-27T05:15:06.989710+0000",
          "event": "failed to rdlock, waiting"
        }
      ]
    }
  },
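
Where jq is not installed, the same sort can be approximated with the Python standard library. The sketch below assumes the dump is a JSON document with a top-level "ops" list (as the jq filter above implies) and that all timestamps share one format and offset, so that plain string ordering matches time ordering. The helper names oldest_blocked_ops and summarize are hypothetical, and the sample data is trimmed from the output above.

```python
import json
import re


def oldest_blocked_ops(dump, count=2):
    """Return the `count` oldest ops from a parsed dump_blocked_ops document."""
    ops = sorted(dump.get("ops", []), key=lambda op: op["initiated_at"])
    return ops[:count]


def summarize(op):
    """Pick out the fields needed to decide which client to evict."""
    type_data = op.get("type_data", {})
    description = op.get("description", "")
    # The request type (getattr, create, ...) follows the reqid in the description.
    match = re.match(r"client_request\(\S+ (\w+)", description)
    return {
        "initiated_at": op["initiated_at"],
        "client": type_data.get("client_info", {}).get("client"),
        "request": match.group(1) if match else None,
        "flag_point": type_data.get("flag_point"),
    }


# Trimmed sample built from the two ops shown above, deliberately out of order.
sample = {"ops": [
    {"description": "client_request(client.190797717:174052 create "
                    "#0x1000006fb46/container-pod-namespace "
                    "2023-08-27T05:15:06.944967+0000 caller_uid=0, caller_gid=0{0,})",
     "initiated_at": "2023-08-27T05:15:06.946241+0000",
     "type_data": {"flag_point": "failed to rdlock, waiting",
                   "client_info": {"client": "client.190797717", "tid": 174052}}},
    {"description": "client_request(client.190868098:174670 getattr AsLsXsFs "
                    "#0x100923f6cef 2023-08-27T05:15:06.891957+0000 "
                    "caller_uid=0, caller_gid=0{0,})",
     "initiated_at": "2023-08-27T05:15:06.892950+0000",
     "type_data": {"flag_point": "failed to rdlock, waiting",
                   "client_info": {"client": "client.190868098", "tid": 174670}}},
]}

summaries = [summarize(op) for op in oldest_blocked_ops(sample)]
for entry in summaries:
    print(json.dumps(entry))

# Against the real file, load it instead of using the sample:
# with open("mds.edon1_dump_blocked_ops") as fh:
#     summaries = [summarize(op) for op in oldest_blocked_ops(json.load(fh))]
```

The two entries printed are the candidates for eviction; the full events list for each should still be recorded from the file itself.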

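Choose one of the 2 clients to evict from the MDS. In this case we chose the create request, on the reasoning that it had only just started rather than being partway through.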

[pao@edon1 ~]$ ceph tell mds.cephfs.edon1.xxxyyy client evict id=190797717
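
The id passed to client evict is the numeric part of the client field in the blocked op. As an illustration, the Python sketch below builds the same command line from that field without running it; the function name evict_command is hypothetical, and the MDS name and client value are the ones from this example.

```python
import re
import shlex


def evict_command(mds_name, client):
    """Build the `ceph tell mds.<name> client evict id=<n>` argument list.

    `client` is the value of client_info.client, e.g. 'client.190797717'.
    """
    match = re.fullmatch(r"client\.(\d+)", client)
    if match is None:
        raise ValueError(f"unexpected client identifier: {client!r}")
    return ["ceph", "tell", f"mds.{mds_name}",
            "client", "evict", f"id={match.group(1)}"]


cmd = evict_command("cephfs.edon1.xxxyyy", "client.190797717")
# Print the command for review rather than executing it.
print(shlex.join(cmd))
```

Printing the command first leaves the decision to run it with the operator, which suits an action that disconnects a client.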

The output of ceph -s should then show the system returning to normal.
