Manually fixing the RabbitMQ error "Crash dump is being written to"

RabbitMQ fails to start with the following error:

2023-11-07 16:38:52.682 [error] emulator Error in process <0.368.0> on node 'rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local' with exit value:
{shutdown,[{mnesia_loader,handle_exit,2,[{file,"mnesia_loader.erl"},{line,963}]},{mnesia_loader,tab_receiver,5,[{file,"mnesia_loader.erl"},{line,440}]},{mnesia_loader,spawned_receiver,8,[{file,"mnesia_loader.erl"},{line,343}]}]}
2023-11-07 16:38:52.683 [error] emulator Error in process <0.367.0> on node 'rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local' with exit value:
{badarg,[{ets,insert,[mnesia_gvar,{last_error,{{shutdown,[{mnesia_loader,handle_exit,2,[{file,"mnesia_loader.erl"},{line,963}]},{mnesia_loader,tab_receiver,5,[{file,"mnesia_loader.erl"},{line,440}]},{mnesia_loader,spawned_receiver,8,[{file,"mnesia_loader.erl"},{line,343}]}]},[{mnesia_loader,wait_on_load_complete,1,[{file,"mnesia_loader.erl"},{line,359}]},{mnesia_tm,apply_fun,3,[{file,"mnesia_tm.erl"},{line,840}]},{mnesia_tm,execute_transaction,5,[{file,"mnesia_tm.erl"},{line,816}]},{mnesia_loader,init_receiver,5,[{file,"mnesia_loader.erl"},{line,285}]},{mnesia_loader,do_get_network_copy,5,[{file,"mnesia_loader.erl"},{line,221}]},{mnesia_controller,'-load_table_fun/1-fun-4-',5,[{file,"mnesia_controller.erl"},{line,2186}]},{mnesia_controller,'-load_and_reply/2-fun-0-',2,[{file,"mnesia_controller.erl"},{line,2133}]}]}}],[]},{mnesia_lib,set,2,[{file,"mnesia_lib.erl"},{line,443}]},{mnesia_lib,fix_error,1,[{file,"mnesia_lib.erl"},{line,906}]},{mnesia_tm,return_abort,3,[{file,"mnesia_tm.erl"},{line,962}]},{mnesia_loader,init_receiver,5,[{file,"mnesia_loader.erl"},{line,285}]},{mnesia_loader,do_get_network_copy,5,[{file,"mnesia_loader.erl"},{line,221}]},{mnesia_controller,'-load_table_fun/1-fun-4-',5,[{file,"mnesia_controller.erl"},{line,2186}]},{mnesia_controller,'-load_and_reply/2-fun-0-',2,[{file,"mnesia_controller.erl"},{line,2133}]}]}
2023-11-07 16:38:52.685 [info] <0.43.0> Application mnesia exited with reason: stopped
2023-11-07 16:38:52.685 [info] <0.43.0> Application tools exited with reason: stopped
2023-11-07 16:38:52.685 [error] <0.8.0> 
Error description:
    init:do_boot/3
    init:start_em/1
    rabbit:start_it/1 line 465
    rabbit:broker_start/1 line 341
    rabbit:start_loaded_apps/2 line 586
    app_utils:manage_applications/6 line 126
    lists:foldl/3 line 1263
    rabbit:'-handle_app_error/1-fun-0-'/3 line 709
throw:{could_not_start,ra,
       {ra,
        {{shutdown,
          {failed_to_start_child,ra_system_sup,
           {shutdown,
            {failed_to_start_child,ra_log_sup,
             {shutdown,
              {failed_to_start_child,ra_log_wal_sup,
               {shutdown,
                {failed_to_start_child,ra_log_wal,
                 {{case_clause,{ok,<<>>}},
                  [{ra_log_wal,open_existing,1,
                    [{file,"src/ra_log_wal.erl"},{line,556}]},
                   {ra_log_wal,'-recover_wal/2-lc$^0/1-0-',1,
                    [{file,"src/ra_log_wal.erl"},{line,240}]},
                   {ra_log_wal,recover_wal,2,
                    [{file,"src/ra_log_wal.erl"},{line,243}]},
                   {ra_log_wal,init,1,
                    [{file,"src/ra_log_wal.erl"},{line,186}]},
                   {gen_batch_server,init_it,6,
                    [{file,"src/gen_batch_server.erl"},{line,125}]},
                   {proc_lib,init_p_do_apply,3,
                    [{file,"proc_lib.erl"},{line,249}]}]}}}}}}}}},
         {ra_app,start,[normal,[]]}}}}
Log file(s) (may contain more information):
   <stdout>

BOOT FAILED
===========

Error description:
    init:do_boot/3
    init:start_em/1
    rabbit:start_it/1 line 465
    rabbit:broker_start/1 line 341
    rabbit:start_loaded_apps/2 line 586
    app_utils:manage_applications/6 line 126
    lists:foldl/3 line 1263
    rabbit:'-handle_app_error/1-fun-0-'/3 line 709
throw:{could_not_start,ra,
       {ra,
        {{shutdown,
          {failed_to_start_child,ra_system_sup,
           {shutdown,
            {failed_to_start_child,ra_log_sup,
             {shutdown,
              {failed_to_start_child,ra_log_wal_sup,
               {shutdown,
                {failed_to_start_child,ra_log_wal,
                 {{case_clause,{ok,<<>>}},
                  [{ra_log_wal,open_existing,1,
                    [{file,"src/ra_log_wal.erl"},{line,556}]},
                   {ra_log_wal,'-recover_wal/2-lc$^0/1-0-',1,
                    [{file,"src/ra_log_wal.erl"},{line,240}]},
                   {ra_log_wal,recover_wal,2,
                    [{file,"src/ra_log_wal.erl"},{line,243}]},
                   {ra_log_wal,init,1,
                    [{file,"src/ra_log_wal.erl"},{line,186}]},
                   {gen_batch_server,init_it,6,
                    [{file,"src/gen_batch_server.erl"},{line,125}]},
                   {proc_lib,init_p_do_apply,3,
                    [{file,"proc_lib.erl"},{line,249}]}]}}}}}}}}},
         {ra_app,start,[normal,[]]}}}}
Log file(s) (may contain more information):
   <stdout>

{"init terminating in do_boot",{could_not_start,ra,{ra,{{shutdown,{failed_to_start_child,ra_system_sup,{shutdown,{failed_to_start_child,ra_log_sup,{shutdown,{failed_to_start_child,ra_log_wal_sup,{shutdown,{failed_to_start_child,ra_log_wal,{{case_clause,{ok,<<>>}},[{ra_log_wal,open_existing,1,[{file,"src/ra_log_wal.erl"},{line,556}]},{ra_log_wal,'-recover_wal/2-lc$^0/1-0-',1,[{file,"src/ra_log_wal.erl"},{line,240}]},{ra_log_wal,recover_wal,2,[{file,"src/ra_log_wal.erl"},{line,243}]},{ra_log_wal,init,1,[{file,"src/ra_log_wal.erl"},{line,186}]},{gen_batch_server,init_it,6,[{file,"src/gen_batch_server.erl"},{line,125}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}}}}}}}}},{ra_app,start,[normal,[]]}}}}}
init terminating in do_boot ({could_not_start,ra,{ra,{{shutdown,{_}},{ra_app,start,[_]}}}})

Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done

How to fix it:
(1) Find the PV used by RabbitMQ, for example the one bound to the rabbitmq-0 pod:

# kubectl get pv | grep rabbitmq-0
pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937   200Gi      RWO            Delete           Bound    openstack/rabbitmq-data-rabbitmq-0                                    ceph-ssd                6d17h

(2) Look up the details of that PV:

# kubectl get pv pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: rbd-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/rbd
  creationTimestamp: "2023-10-31T15:40:59Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937
  resourceVersion: "7552"
  uid: 6848417a-dd4f-430c-85e5-f3234a1ac6bf
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 200Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: rabbitmq-data-rabbitmq-0
    namespace: openstack
    resourceVersion: "4704"
    uid: 70ed48bf-bef8-4658-b530-1fd3a6ef5937
  persistentVolumeReclaimPolicy: Delete
  rbd:
    image: kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f
    keyring: /etc/ceph/keyring
    monitors:
    - ceph-mon.ceph.svc.cluster.local:6789
    pool: ssdpool
    secretRef:
      name: pvc-ceph-client-key
    user: admin
  storageClassName: ceph-ssd
  volumeMode: Filesystem
status:
  phase: Bound

The piece of information needed from this output:

    image: kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f
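Steps (1) and (2) can also be combined so that the image name does not have to be picked out of the YAML by eye. The following is a minimal sketch, not part of the original procedure. The function name `pvc_to_rbd_image` is hypothetical, and it assumes the PV uses the in-tree `rbd` volume source (`spec.rbd`), as in the output above.

```shell
#!/usr/bin/env bash
# Sketch: resolve the Ceph RBD pool/image behind a PVC.
# Assumption: the PV exposes spec.rbd.pool and spec.rbd.image, as in the YAML above.
pvc_to_rbd_image() {
  local namespace="$1" pvc="$2" pv
  # The bound PV name is stored in the PVC's spec.volumeName.
  pv=$(kubectl get pvc "$pvc" -n "$namespace" -o jsonpath='{.spec.volumeName}') || return 1
  [ -n "$pv" ] || { echo "PVC $pvc is not bound to a PV" >&2; return 1; }
  # Print "<pool>/<image>" for the bound PV.
  kubectl get pv "$pv" -o jsonpath='{.spec.rbd.pool}/{.spec.rbd.image}'
}
```

For the pod in this article it would be called as `pvc_to_rbd_image openstack rabbitmq-data-rabbitmq-0`.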

(3) On the node that runs the pod, find the block device mapped to that image

# ssh node-2 rbd showmapped | grep kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f
0  ssdpool           kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f -    /dev/rbd0  
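If the device name is needed in a script, it can be taken from the `rbd showmapped` output instead of being read off the screen. This is a small sketch under the assumption that the device path is the last column, as in the output above. The helper name `rbd_device_for_image` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: read `rbd showmapped` output on stdin and print the device of one image.
# Assumption: the device path is the last whitespace-separated column.
rbd_device_for_image() {
  awk -v image="$1" '
    {
      # Compare every column, so a differing column layout does not matter.
      for (i = 1; i <= NF; i++) {
        if ($i == image) { print $NF; exit }
      }
    }'
}
```

Example: `ssh node-2 rbd showmapped | rbd_device_for_image kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f`.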

(4) Check where the device is mounted

# ssh node-2 mount | grep rbd0
/dev/rbd0 on /var/lib/kubelet/plugins/kubernetes.io/rbd/mounts/ssdpool-image-kubernetes-dynamic-pvc-c8a3585f-dc7b-438c-a22e-cca9d84c341f type ext4 (rw,relatime,stripe=1024)
/dev/rbd0 on /var/lib/kubelet/pods/3a37e264-4fd5-4cb8-844b-6b6cd4a6859c/volumes/kubernetes.io~rbd/pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937 type ext4 (rw,relatime,stripe=1024)

(5) Locate the WAL file; the search path is the pod volume mount point from step (4)

# ssh node-2 find /var/lib/kubelet/pods/3a37e264-4fd5-4cb8-844b-6b6cd4a6859c/volumes/kubernetes.io~rbd/pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937 -name "*.wal"
/var/lib/kubelet/pods/3a37e264-4fd5-4cb8-844b-6b6cd4a6859c/volumes/kubernetes.io~rbd/pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937/mnesia/rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local/quorum/rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local/00000025.wal
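Before deleting anything, it can help to confirm that the file found is actually damaged. The `{case_clause,{ok,<<>>}}` term raised from `ra_log_wal:open_existing` in the log above looks like an empty binary being read back from an existing WAL file. That would be consistent with a zero-length file, but it is an inference from the stack trace and is not stated in the original write-up. Under that assumption, the sketch below lists only the empty `.wal` files; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list zero-length *.wal files under a directory.
# Assumption (not confirmed by the source): the broken WAL file is empty.
find_empty_wal_files() {
  # -size 0 matches files that occupy zero blocks, i.e. empty files.
  find "$1" -type f -name '*.wal' -size 0
}
```

It would be run on node-2 against the mount path from step (4). If it prints nothing, the WAL file is not empty and the assumption above does not hold for that case.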

(6) Delete the WAL file
Be careful with this step: back up the file before deleting it.

# ssh node-2 rm -rf /var/lib/kubelet/pods/3a37e264-4fd5-4cb8-844b-6b6cd4a6859c/volumes/kubernetes.io~rbd/pvc-70ed48bf-bef8-4658-b530-1fd3a6ef5937/mnesia/rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local/quorum/rabbit@rabbitmq-0.rabbitmq-discovery.openstack.svc.cluster.local/00000025.wal
Warning: Permanently added 'node-2' (ED25519) to the list of known hosts.
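The backup recommended for this step could look like the following sketch, which copies the WAL file aside, verifies the copy, and only then removes the original. The function name and the backup directory are assumptions chosen for illustration, and the function is meant to be run on the node itself (node-2 here).

```shell
#!/usr/bin/env bash
# Sketch: back up a WAL file, verify the copy, then delete the original.
# The backup directory is a placeholder chosen by the operator.
backup_and_remove_wal() {
  local wal="$1" backup_dir="$2" copy
  [ -f "$wal" ] || { echo "not a regular file: $wal" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1
  copy="$backup_dir/$(basename "$wal")"
  # -p keeps the mode and timestamps of the original file.
  cp -p "$wal" "$copy" || return 1
  # Refuse to delete unless the copy is byte-for-byte identical.
  cmp -s "$wal" "$copy" || { echo "backup differs from original" >&2; return 1; }
  rm -f "$wal"
}
```

Keeping the copy outside the RabbitMQ data volume, for example under `/root/rabbitmq-wal-backup`, avoids RabbitMQ finding it again on the next start.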

(7) Delete the pod so that it gets recreated

# kubectl delete pods rabbitmq-0 -n openstack 
pod "rabbitmq-0" deleted

Wait for the pod to start again; after a while the data is resynchronized and the node recovers.
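One way to check that the node really came back is to wait for the pod to become Ready and then ask RabbitMQ for its cluster status. This is a sketch, not part of the original procedure. The function name and the 300-second timeout are arbitrary choices, and `kubectl wait` may report "not found" if it runs before the StatefulSet has recreated the pod, in which case it can simply be retried.

```shell
#!/usr/bin/env bash
# Sketch: wait for the recreated pod, then print the RabbitMQ cluster status.
# The 300s timeout is an arbitrary value for illustration.
wait_for_rabbitmq_pod() {
  local namespace="$1" pod="$2"
  # Block until the pod reports the Ready condition (or the timeout expires).
  kubectl wait --for=condition=Ready "pod/$pod" -n "$namespace" --timeout=300s || return 1
  # Show which nodes are running in the cluster, as seen from this pod.
  kubectl exec "$pod" -n "$namespace" -- rabbitmqctl cluster_status
}
```

For this article's pod: `wait_for_rabbitmq_pod openstack rabbitmq-0`.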
