Troubleshooting with crash on the Inspur Yunqi Operating System (InLinux)

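Preface

As the operating system has come into wider use, we have run into many system problems that cannot be diagnosed at the application level. This article describes how to troubleshoot such problems on the Inspur Yunqi Operating System (InLinux) using the crash tool together with the vmcore file generated by kdump.

Operating system version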

[root@localhost ~]# cat /etc/os-release
NAME="InLinux"
VERSION="23.12 (LTS-SP1)"
ID="InLinux"
VERSION_ID="23.12"
PRETTY_NAME="InLinux 23.12 (LTS-SP1)"
ANSI_COLOR="0;31"
BUILD_TIME="2024-04-23_15:40:11"

Installing the components

Install crash

yum install -y crash-debuginfo crash

Install kernel-debuginfo and kernel-debugsource

These packages are installed to obtain vmlinux. After installation, vmlinux is located at:
/usr/lib/debug/lib/modules/$(uname -r)/vmlinux

yum install -y  kernel-debugsource kernel-debuginfo
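Before going further, it can be worth confirming that vmlinux really exists at the path given above for the running kernel. The following is a minimal sketch; the helper name vmlinux_path is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: build the expected vmlinux path for a kernel release.
vmlinux_path() {
    printf '/usr/lib/debug/lib/modules/%s/vmlinux\n' "$1"
}

# Check the path for the kernel that is currently running.
path=$(vmlinux_path "$(uname -r)")
if [ -f "$path" ]; then
    echo "found: $path"
else
    echo "missing: $path (is kernel-debuginfo for $(uname -r) installed?)"
fi
```

If the file is reported missing, the installed kernel-debuginfo version most likely does not match the running kernel.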

Install kexec-tools

yum install kernel-debuginfo-$(uname -r) kexec-tools
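kdump can only capture a vmcore if memory has been reserved for the capture kernel, which is normally done with a crashkernel= boot parameter (the boot command line in the log output later in this article contains crashkernel=512M). A minimal sketch for checking this follows; the helper name crashkernel_value is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the crashkernel= value found in a kernel
# command line string, or return non-zero if the parameter is absent.
crashkernel_value() {
    for word in $1; do
        case $word in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Inspect the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ]; then
    if value=$(crashkernel_value "$(cat /proc/cmdline)"); then
        echo "crashkernel reserved: $value"
    else
        echo "no crashkernel= parameter found; kdump is unlikely to work"
    fi
fi
```

It is also worth checking that the kdump service itself is running, for example with systemctl status kdump, assuming the service carries that name on your system.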

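Using crash

Overview

Analyzing a problem with crash requires both the vmcore file and the vmlinux file. Keep the following in mind:

  • The vmcore file is generated by kdump and is usually found under /var/crash/. If there are several, choose the one you need.
  • The vmlinux file is located at /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
  • Make sure the kernel and kernel-debuginfo versions are exactly the same.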

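Generating a test vmcore

A vmcore is a snapshot of the system's memory at a point in time, generated by kdump when the system crashes or panics. For testing, we can produce one by simulating a crash.
Trigger the crash manually and wait a few minutes. The virtual machine reboots automatically; once it is back up, check that kdump has saved the vmcore. Only do this on a test machine, because it crashes the running system immediately. The commands are: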

# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger

The generated vmcore file is placed under /var/crash/.
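When /var/crash/ holds several dumps, the most recent one is usually the one of interest. The sketch below assumes the layout seen in this article, with one subdirectory per dump containing a file named vmcore; the helper name latest_vmcore is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified vmcore found one
# level below the given dump directory (prints nothing if there is none).
latest_vmcore() {
    ls -1t "$1"/*/vmcore 2>/dev/null | head -n 1
}

dump=$(latest_vmcore /var/crash)
if [ -n "$dump" ]; then
    echo "latest vmcore: $dump"
else
    echo "no vmcore found under /var/crash"
fi
```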

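Running the command

crash is invoked as follows:
crash {vmcore file} {debug kernel vmlinux}

  • The first argument is the vmcore file generated by kdump; a test dump can be produced as described above.
  • The second argument is vmlinux, which is installed by the kernel-debuginfo package.

Example: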
 

crash  /var/crash/127.0.0.1-2024-08-05-08\:47\:09/vmcore  /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
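crash can also be run without interaction: the -i option reads commands from a file, and -s suppresses the startup banner. The following is a sketch of how a small report might be collected that way. The helper name write_crash_cmds, the output file names, and the /tmp/report directory are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a crash command file that saves a few reports
# into the given output directory and then leaves crash.
write_crash_cmds() {
    out_dir=$1
    cmd_file=$2
    cat > "$cmd_file" <<EOF
log > $out_dir/log.txt
bt -a > $out_dir/bt-all.txt
ps > $out_dir/ps.txt
exit
EOF
}

# Usage (not executed here; substitute your own dump directory):
#   mkdir -p /tmp/report
#   write_crash_cmds /tmp/report /tmp/report/cmds.txt
#   crash -s -i /tmp/report/cmds.txt \
#       /var/crash/<dump-dir>/vmcore \
#       /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
```

Saving the output to files makes it easy to attach to a bug report or to process with ordinary text tools.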

Starting a crash debugging session

[root@localhost ~]# crash  /var/crash/127.0.0.1-2024-08-05-08\:47\:09/vmcore  /usr/lib/debug/lib/modules/$(uname -r)/vmlinux

crash 8.0.2-1.ile2312sp1
Copyright (C) 2002-2022  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
Copyright (C) 2015, 2021  VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...

WARNING: kernel version inconsistency between vmlinux and dumpfile

      KERNEL: /usr/lib/debug/lib/modules/5.10.0-197.0.0.110.ile2312sp1.x86_64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2024-08-05-08:47:09/vmcore  [PARTIAL DUMP]
        CPUS: 16
        DATE: Mon Aug  5 08:47:04 CST 2024
      UPTIME: 2 days, 15:28:13
LOAD AVERAGE: 0.00, 0.05, 0.09
       TASKS: 227
    NODENAME: localhost.localdomain
     RELEASE: 5.10.0-197.0.0.110.ile2312sp1.x86_64
     VERSION: #1 SMP Tue Apr 30 10:18:42 UTC 2024
     MACHINE: x86_64  (2194 Mhz)
      MEMORY: 16 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 7961
     COMMAND: "bash"
        TASK: ffff9c36410bb400  [THREAD_INFO: ffff9c36410bb400]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

crash>

 

Viewing logs: the log and dmesg commands

Reading the kernel log is enough to diagnose most application problems.


The log command
 

crash> log
[    0.000000] Linux version 5.10.0-197.0.0.110.ile2312sp1.x86_64 (abuild@obsworker208) (gcc_old (GCC) 10.3.1, GNU ld (GNU Binutils) 2.37) #1 SMP Tue Apr 30 10:18:42 UTC 2024
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.10.0-197.0.0.110.ile2312sp1.x86_64 root=UUID=17bb1f2f-3fb1-49de-b9b2-747a32892161 ro cgroup_disable=files apparmor=0 crashkernel=512M
[    0.000000] signal: max sigframe size: 944
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffd8fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bffd9000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000043fffffff] usable
...

[  503.309075] capability: warning: `yum' uses 32-bit capabilities (legacy support in use)
[228492.577924] sysrq: Trigger a crash
[228492.578847] Kernel panic - not syncing: sysrq triggered crash
[228492.579793] CPU: 1 PID: 7961 Comm: bash Kdump: loaded Not tainted 5.10.0-197.0.0.110.ile2312sp1.x86_64 #1
[228492.581162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[228492.582513] Call Trace:
[228492.583225]  dump_stack+0x57/0x6e
[228492.583939]  panic+0x10e/0x2ef
[228492.584615]  ? printk+0x58/0x73
[228492.585332]  sysrq_handle_crash+0x16/0x20
[228492.586137]  __handle_sysrq.cold+0x43/0x11a
[228492.586561]  write_sysrq_trigger+0x34/0x60
[228492.586983]  proc_reg_write+0x40/0x90
[228492.587384]  vfs_write+0xde/0x250
[228492.587764]  ksys_write+0x5f/0xe0
[228492.588151]  do_syscall_64+0x40/0x80
[228492.588554]  entry_SYSCALL_64_after_hwframe+0x62/0xc7
[228492.589013] RIP: 0033:0x7f72f9878c67
[228492.589401] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[228492.590702] RSP: 002b:00007fffc84d3cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[228492.591383] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f72f9878c67
[228492.592025] RDX: 0000000000000002 RSI: 0000559b08fc7870 RDI: 0000000000000001
[228492.592673] RBP: 0000559b08fc7870 R08: 00007f72f992c380 R09: 00007f72f992c400
[228492.593313] R10: 00007f72f992c300 R11: 0000000000000246 R12: 0000000000000002
[228492.593967] R13: 00007f72f996e5a0 R14: 0000000000000002 R15: 00007f72f996e7a0
[228492.597097] kexec: Bye!
crash>


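The dmesg command

In crash, dmesg is a built-in alias for log, so both commands print the same kernel log buffer.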


crash> dmesg
[    0.000000] Linux version 5.10.0-197.0.0.110.ile2312sp1.x86_64 (abuild@obsworker208) (gcc_old (GCC) 10.3.1, GNU ld (GNU Binutils) 2.37) #1 SMP Tue Apr 30 10:18:42 UTC 2024
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.10.0-197.0.0.110.ile2312sp1.x86_64 root=UUID=17bb1f2f-3fb1-49de-b9b2-747a32892161 ro cgroup_disable=files apparmor=0 crashkernel=512M
[    0.000000] signal: max sigframe size: 944
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffd8fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bffd9000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000043fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 409401001, primary cpu clock
[    0.000000] kvm-clock: using sched offset of 475504025149 cycles
[    0.000008] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000016] tsc: Detected 2194.908 MHz processor
[    0.001274] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.001278] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.001283] last_pfn = 0x440000 max_arch_pfn = 0x400000000
[    0.001322] MTRR default type: write-back
[    0.001324] MTRR fixed ranges enabled:
[    0.001325]   00000-9FFFF write-back
[    0.001326]   A0000-BFFFF uncachable
[    0.001327]   C0000-FFFFF write-protect
[    0.001328] MTRR variable ranges enabled:
...

[  503.309075] capability: warning: `yum' uses 32-bit capabilities (legacy support in use)
[228492.577924] sysrq: Trigger a crash
[228492.578847] Kernel panic - not syncing: sysrq triggered crash
[228492.579793] CPU: 1 PID: 7961 Comm: bash Kdump: loaded Not tainted 5.10.0-197.0.0.110.ile2312sp1.x86_64 #1
[228492.581162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[228492.582513] Call Trace:
[228492.583225]  dump_stack+0x57/0x6e
[228492.583939]  panic+0x10e/0x2ef
[228492.584615]  ? printk+0x58/0x73
[228492.585332]  sysrq_handle_crash+0x16/0x20
[228492.586137]  __handle_sysrq.cold+0x43/0x11a
[228492.586561]  write_sysrq_trigger+0x34/0x60
[228492.586983]  proc_reg_write+0x40/0x90
[228492.587384]  vfs_write+0xde/0x250
[228492.587764]  ksys_write+0x5f/0xe0
[228492.588151]  do_syscall_64+0x40/0x80
[228492.588554]  entry_SYSCALL_64_after_hwframe+0x62/0xc7
[228492.589013] RIP: 0033:0x7f72f9878c67
[228492.589401] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[228492.590702] RSP: 002b:00007fffc84d3cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[228492.591383] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f72f9878c67
[228492.592025] RDX: 0000000000000002 RSI: 0000559b08fc7870 RDI: 0000000000000001
[228492.592673] RBP: 0000559b08fc7870 R08: 00007f72f992c380 R09: 00007f72f992c400
[228492.593313] R10: 00007f72f992c300 R11: 0000000000000246 R12: 0000000000000002
[228492.593967] R13: 00007f72f996e5a0 R14: 0000000000000002 R15: 00007f72f996e7a0
[228492.597097] kexec: Bye!
crash>
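The log is long, and the line that matters most is usually the panic message. Assuming the log has been saved to a text file (for example with crash's output redirection, log > log.txt), a sketch like the following pulls out the first panic line. The helper name panic_reason is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first "Kernel panic" message in a saved
# kernel log, with the leading [timestamp] removed.
panic_reason() {
    sed -n '/Kernel panic/{s/^\[[^]]*] *//;p;q;}' "$1"
}

# Usage (not executed here): panic_reason log.txt
```

For the dump analyzed in this article, the extracted line would be the sysrq-triggered panic message shown in the log output above.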

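The bt command

bt shows the call stack. With no arguments, it unwinds the stack of the current task using the stack pointer (SP) and frame pointer (FP) and prints the backtrace.

When the log is not enough to identify the problem, use bt to inspect the call stack and investigate further.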

crash> bt
PID: 7961     TASK: ffff9c36410bb400  CPU: 1    COMMAND: "bash"
 #0 [ffffbfbac6b3bde0] panic at ffffffff9089b0f2
 #1 [ffffbfbac6b3be60] sysrq_handle_crash at ffffffff904fad86
 #2 [ffffbfbac6b3be68] __handle_sysrq.cold at ffffffff908c13b8
 #3 [ffffbfbac6b3be98] write_sysrq_trigger at ffffffff904fb6a4
 #4 [ffffbfbac6b3beb0] proc_reg_write at ffffffff9022c900
 #5 [ffffbfbac6b3bec8] vfs_write at ffffffff9019946e
 #6 [ffffbfbac6b3bf00] ksys_write at ffffffff901998cf
 #7 [ffffbfbac6b3bf38] do_syscall_64 at ffffffff908e0750
 #8 [ffffbfbac6b3bf50] entry_SYSCALL_64_after_hwframe at ffffffff90a000da
    RIP: 00007f72f9878c67  RSP: 00007fffc84d3cb8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f72f9878c67
    RDX: 0000000000000002  RSI: 0000559b08fc7870  RDI: 0000000000000001
    RBP: 0000559b08fc7870   R8: 00007f72f992c380   R9: 00007f72f992c400
    R10: 00007f72f992c300  R11: 0000000000000246  R12: 0000000000000002
    R13: 00007f72f996e5a0  R14: 0000000000000002  R15: 00007f72f996e7a0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
crash>
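When comparing several dumps, it can help to reduce a backtrace to just its chain of function names. The sketch below assumes the bt output has been saved to a text file in the format shown above; the helper name bt_functions is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the function name of each numbered frame in a
# saved bt output. Frames look like: "#0 [stack-address] function at address".
bt_functions() {
    awk '$1 ~ /^#[0-9]+$/ && $4 == "at" { print $3 }' "$1"
}

# Usage (not executed here): bt_functions bt.txt
```

Register lines and the PID header do not start with a frame number, so they are skipped.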

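bt -T: displays all text symbols found on a task's kernel stack, from just above thread_info through the rest of the stack. It usually prints more information than bt without arguments.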

crash> bt -T
PID: 7961     TASK: ffff9c36410bb400  CPU: 1    COMMAND: "bash"
  [ffffbfbac6b3b480] __update_blocked_fair at ffffffff8ff2977d
  [ffffbfbac6b3b4f0] raw_spin_rq_unlock at ffffffff8ff1e7ea
  [ffffbfbac6b3b530] update_nohz_stats at ffffffff8ff2cba0
  [ffffbfbac6b3b538] cpumask_next_and at ffffffff903c773a
  [ffffbfbac6b3b540] find_busiest_group at ffffffff8ff3d1c2
  [ffffbfbac6b3b670] can_migrate_task at ffffffff8ff39855
  [ffffbfbac6b3b690] detach_tasks at ffffffff8ff39e4f
  [ffffbfbac6b3b6e0] load_balance at ffffffff8ff3dc66
  [ffffbfbac6b3b788] __update_load_avg_cfs_rq at ffffffff8ff4f19c
  [ffffbfbac6b3b798] __update_load_avg_se at ffffffff8ff4ee88
  [ffffbfbac6b3b7a0] update_curr at ffffffff8ff30c0e
  [ffffbfbac6b3b7e8] set_next_entity at ffffffff8ff2e453
  [ffffbfbac6b3b818] pick_next_task_fair at ffffffff8ff3eaf9
  [ffffbfbac6b3b890] vsnprintf at ffffffff903d68cc
  [ffffbfbac6b3b8a8] number at ffffffff903d1c4f
  [ffffbfbac6b3b8f0] widen_string at ffffffff903d24fb
  [ffffbfbac6b3b910] vsnprintf at ffffffff903d690e
  [ffffbfbac6b3b928] number at ffffffff903d1c4f
  [ffffbfbac6b3b970] widen_string at ffffffff903d24fb
  [ffffbfbac6b3b980] number at ffffffff903d1c4f
  [ffffbfbac6b3b9c8] widen_string at ffffffff903d24fb
  [ffffbfbac6b3b9e8] vsnprintf at ffffffff903d690e
  [ffffbfbac6b3ba40] vgacon_scroll at ffffffff9042d9cf
  [ffffbfbac6b3ba68] desc_read_finalized_seq at ffffffff8ff6971f
  [ffffbfbac6b3ba70] con_scroll at ffffffff9050595a
  [ffffbfbac6b3ba90] prb_read at ffffffff8ff697f0
  [ffffbfbac6b3baa8] kvm_io_delay at ffffffff8fe751a0
  [ffffbfbac6b3bab0] atomic_notifier_call_chain at ffffffff8ff15257
  [ffffbfbac6b3bb08] _prb_read_valid at ffffffff8ff699dd
  [ffffbfbac6b3bb60] prb_read_valid at ffffffff8ff6a6d7
  [ffffbfbac6b3bc28] vprintk_emit at ffffffff8ff689a8
  [ffffbfbac6b3bc70] printk at ffffffff908a0b78
  [ffffbfbac6b3bcd0] machine_kexec.cold at ffffffff90897c76
  [ffffbfbac6b3bd20] __crash_kexec at ffffffff8ffb409a
  [ffffbfbac6b3bda8] __crash_kexec at ffffffff8ffb40c8
  [ffffbfbac6b3bde0] panic at ffffffff9089b0f2
  [ffffbfbac6b3be08] printk at ffffffff908a0b78
  [ffffbfbac6b3be60] sysrq_handle_crash at ffffffff904fad86
  [ffffbfbac6b3be68] __handle_sysrq.cold at ffffffff908c13b8
  [ffffbfbac6b3be98] write_sysrq_trigger at ffffffff904fb6a4
  [ffffbfbac6b3beb0] proc_reg_write at ffffffff9022c900
  [ffffbfbac6b3bec8] vfs_write at ffffffff9019946e
  [ffffbfbac6b3bf00] ksys_write at ffffffff901998cf
  [ffffbfbac6b3bf38] do_syscall_64 at ffffffff908e0750
  [ffffbfbac6b3bf50] entry_SYSCALL_64_after_hwframe at ffffffff90a000da
    RIP: 00007f72f9878c67  RSP: 00007fffc84d3cb8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f72f9878c67
    RDX: 0000000000000002  RSI: 0000559b08fc7870  RDI: 0000000000000001
    RBP: 0000559b08fc7870   R8: 00007f72f992c380   R9: 00007f72f992c400
    R10: 00007f72f992c300  R11: 0000000000000246  R12: 0000000000000002
    R13: 00007f72f996e5a0  R14: 0000000000000002  R15: 00007f72f996e7a0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
crash>

bt -a: displays the stack traces of all active tasks.
 

crash> bt -a
PID: 0        TASK: ffffffff91812940  CPU: 0    COMMAND: "swapper/0"
 #0 [fffffe1743122e50] crash_nmi_callback at ffffffff8fe6011b
 #1 [fffffe1743122e58] nmi_handle at ffffffff8fe2b408
 #2 [fffffe1743122ea0] default_do_nmi at ffffffff908e1e22
 #3 [fffffe1743122ec8] exc_nmi at ffffffff908e2042
 #4 [fffffe1743122ef0] end_repeat_nmi at ffffffff90a01549
    [exception RIP: default_idle+19]
    RIP: ffffffff908f12f3  RSP: ffffffff91803ec0  RFLAGS: 00000246
    RAX: ffffffff908f12e0  RBX: ffffffff91812940  RCX: ffff9c396f636c80
    RDX: 00000000003f3e9a  RSI: 0000000000000000  RDI: ffff9c396f627820
    RBP: 0000000000000000   R8: 000000cd42e4dffb   R9: 0000000000000001
    R10: 0000000000000001  R11: 00000000000123ed  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffffff91803ec0] default_idle at ffffffff908f12f3
 #6 [ffffffff91803ec0] default_idle_call at ffffffff908f1544
 #7 [ffffffff91803ec8] cpuidle_idle_call at ffffffff8ff28345
 #8 [ffffffff91803f00] do_idle at ffffffff8ff283f2
 #9 [ffffffff91803f18] cpu_startup_entry at ffffffff8ff285c9
#10 [ffffffff91803f28] start_kernel at ffffffff9226c856
#11 [ffffffff91803f50] secondary_startup_64_no_verify at ffffffff8fe00107

PID: 7961     TASK: ffff9c36410bb400  CPU: 1    COMMAND: "bash"
 #0 [ffffbfbac6b3bde0] panic at ffffffff9089b0f2
 #1 [ffffbfbac6b3be60] sysrq_handle_crash at ffffffff904fad86
 #2 [ffffbfbac6b3be68] __handle_sysrq.cold at ffffffff908c13b8
 #3 [ffffbfbac6b3be98] write_sysrq_trigger at ffffffff904fb6a4
 #4 [ffffbfbac6b3beb0] proc_reg_write at ffffffff9022c900
 #5 [ffffbfbac6b3bec8] vfs_write at ffffffff9019946e
 #6 [ffffbfbac6b3bf00] ksys_write at ffffffff901998cf
 #7 [ffffbfbac6b3bf38] do_syscall_64 at ffffffff908e0750
 #8 [ffffbfbac6b3bf50] entry_SYSCALL_64_after_hwframe at ffffffff90a000da
    RIP: 00007f72f9878c67  RSP: 00007fffc84d3cb8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f72f9878c67
    RDX: 0000000000000002  RSI: 0000559b08fc7870  RDI: 0000000000000001
    RBP: 0000559b08fc7870   R8: 00007f72f992c380   R9: 00007f72f992c400
    R10: 00007f72f992c300  R11: 0000000000000246  R12: 0000000000000002
    R13: 00007f72f996e5a0  R14: 0000000000000002  R15: 00007f72f996e7a0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

...
PID: 0        TASK: ffff9c3640340000  CPU: 15   COMMAND: "swapper/15"
 #0 [fffffe3f47bcfe50] crash_nmi_callback at ffffffff8fe6011b
 #1 [fffffe3f47bcfe58] nmi_handle at ffffffff8fe2b408
 #2 [fffffe3f47bcfea0] default_do_nmi at ffffffff908e1e22
 #3 [fffffe3f47bcfec8] exc_nmi at ffffffff908e2042
 #4 [fffffe3f47bcfef0] end_repeat_nmi at ffffffff90a01549
    [exception RIP: default_idle+19]
    RIP: ffffffff908f12f3  RSP: ffffbfbac00ebee8  RFLAGS: 00000242
    RAX: ffffffff908f12e0  RBX: ffff9c3640340000  RCX: ffff9c396fdb6c80
    RDX: 000000000037b5da  RSI: 0000000000000083  RDI: 000000000000000f
    RBP: 0000000000000000   R8: 0000d03ebd5eae08   R9: 0000000000000001
    R10: 0000000000000001  R11: 0000000000002800  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #5 [ffffbfbac00ebee8] default_idle at ffffffff908f12f3
 #6 [ffffbfbac00ebee8] default_idle_call at ffffffff908f1544
 #7 [ffffbfbac00ebef0] cpuidle_idle_call at ffffffff8ff28345
 #8 [ffffbfbac00ebf28] do_idle at ffffffff8ff283f2
 #9 [ffffbfbac00ebf40] cpu_startup_entry at ffffffff8ff285c9
#10 [ffffbfbac00ebf50] secondary_startup_64_no_verify at ffffffff8fe00107
crash>

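The ps command

ps: shows the state of the processes in the system, much like the ps command on a running system. A leading ">" marks the task that was active on each CPU.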

crash> ps
      PID    PPID  CPU       TASK        ST  %MEM      VSZ      RSS  COMM
>       0       0   0  ffffffff91812940  RU   0.0        0        0  [swapper/0]
        0       0   1  ffff9c36402e0000  RU   0.0        0        0  [swapper/1]
>       0       0   2  ffff9c36402e4e00  RU   0.0        0        0  [swapper/2]
>       0       0   3  ffff9c36402e3400  RU   0.0        0        0  [swapper/3]
>       0       0   4  ffff9c3640311a00  RU   0.0        0        0  [swapper/4]
>       0       0   5  ffff9c3640310000  RU   0.0        0        0  [swapper/5]
>       0       0   6  ffff9c3640314e00  RU   0.0        0        0  [swapper/6]
>       0       0   7  ffff9c3640313400  RU   0.0        0        0  [swapper/7]
>       0       0   8  ffff9c3640320000  RU   0.0        0        0  [swapper/8]
>       0       0   9  ffff9c3640324e00  RU   0.0        0        0  [swapper/9]
>       0       0  10  ffff9c3640323400  RU   0.0        0        0  [swapper/10]
>       0       0  11  ffff9c3640321a00  RU   0.0        0        0  [swapper/11]
>       0       0  12  ffff9c3640344e00  RU   0.0        0        0  [swapper/12]
>       0       0  13  ffff9c3640343400  RU   0.0        0        0  [swapper/13]
>       0       0  14  ffff9c3640341a00  RU   0.0        0        0  [swapper/14]
>       0       0  15  ffff9c3640340000  RU   0.0        0        0  [swapper/15]
        1       0   4  ffff9c364028ce00  IN   0.1   170388    19044  systemd
        2       0  12  ffff9c364028b400  IN   0.0        0        0  [kthreadd]
        3       2   0  ffff9c3640289a00  ID   0.0        0        0  [rcu_gp]
        4       2   0  ffff9c3640288000  ID   0.0        0        0  [rcu_par_gp]
        6       2   0  ffff9c36402bb400  ID   0.0        0        0  [kworker/0:0H]
        8       2   0  ffff9c36402b8000  ID   0.0        0        0  [mm_percpu_wq]
        9       2   0  ffff9c36402cb400  IN   0.0        0        0  [rcu_tasks_rude_]
       10       2   0  ffff9c36402c9a00  IN   0.0        0        0  [rcu_tasks_trace]
       11       2   0  ffff9c36402c8000  IN   0.0        0        0  [ksoftirqd/0]
       12       2   5  ffff9c36402cce00  ID   0.0        0        0  [rcu_sched]
       13       2   0  ffff9c36402e1a00  IN   0.0        0        0  [migration/0]
       14       2   0  ffff9c3640369a00  IN   0.0        0        0  [cpuhp/0]
       15       2   1  ffff9c3640368000  IN   0.0        0        0  [cpuhp/1]
       16       2   1  ffff9c364036ce00  IN   0.0        0        0  [migration/1]
       17       2   1  ffff9c364036b400  IN   0.0        0        0  [ksoftirqd/1]
       19       2   1  ffff9c3640383400  ID   0.0        0        0  [kworker/1:0H]
       20       2   2  ffff9c3640381a00  IN   0.0        0        0  [cpuhp/2]
       21       2   2  ffff9c3640380000  IN   0.0        0        0  [migration/2]
       22       2   2  ffff9c36403ab400  IN   0.0        0        0  [ksoftirqd/2]
       24       2   2  ffff9c36403a8000  ID   0.0        0        0  [kworker/2:0H]
       25       2   3  ffff9c36403ace00  IN   0.0        0        0  [cpuhp/3]
       26       2   3  ffff9c36403d0000  IN   0.0        0        0  [migration/3]
       27       2   3  ffff9c36403d4e00  IN   0.0        0        0  [ksoftirqd/3]
       29       2   3  ffff9c36403d1a00  ID   0.0        0        0  [kworker/3:0H]
       30       2   4  ffff9c36403f3400  IN   0.0        0        0  [cpuhp/4]
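A quick way to summarize a long ps listing is to count tasks per state. The sketch below assumes the ps output has been saved to a text file in the column layout shown above, where the fifth column is the state (ST); the helper name ps_state_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count tasks per state in a saved crash "ps" output.
# The leading ">" that marks active tasks is removed first so that the
# columns line up; the header line is skipped because its PID is not numeric.
ps_state_counts() {
    awk '{ sub(/^>/, "") }
         $1 ~ /^[0-9]+$/ { count[$5]++ }
         END { for (s in count) print s, count[s] }' "$1" | sort
}

# Usage (not executed here): ps_state_counts ps.txt
```

An unusually large number of tasks in the UN (uninterruptible) state, for instance, can be a hint to look at I/O or locking.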

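The dis command

dis <address>: disassembles the code at the given address; the -l option also shows source line information.
First use bt to get the call trace, then run dis on an address taken from it.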

crash> bt
PID: 7961     TASK: ffff9c36410bb400  CPU: 1    COMMAND: "bash"
 #0 [ffffbfbac6b3bde0] panic at ffffffff9089b0f2
 #1 [ffffbfbac6b3be60] sysrq_handle_crash at ffffffff904fad86
 #2 [ffffbfbac6b3be68] __handle_sysrq.cold at ffffffff908c13b8
 #3 [ffffbfbac6b3be98] write_sysrq_trigger at ffffffff904fb6a4
 #4 [ffffbfbac6b3beb0] proc_reg_write at ffffffff9022c900
 #5 [ffffbfbac6b3bec8] vfs_write at ffffffff9019946e
 #6 [ffffbfbac6b3bf00] ksys_write at ffffffff901998cf
 #7 [ffffbfbac6b3bf38] do_syscall_64 at ffffffff908e0750
 #8 [ffffbfbac6b3bf50] entry_SYSCALL_64_after_hwframe at ffffffff90a000da
    RIP: 00007f72f9878c67  RSP: 00007fffc84d3cb8  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000000002  RCX: 00007f72f9878c67
    RDX: 0000000000000002  RSI: 0000559b08fc7870  RDI: 0000000000000001
    RBP: 0000559b08fc7870   R8: 00007f72f992c380   R9: 00007f72f992c400
    R10: 00007f72f992c300  R11: 0000000000000246  R12: 0000000000000002
    R13: 00007f72f996e5a0  R14: 0000000000000002  R15: 00007f72f996e7a0
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
crash> dis ffffffff908e0750
0xffffffff908e0750 <do_syscall_64+64>:  mov    %rax,0x50(%r12)
crash>
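Note that the call trace in the kernel log prints offsets in hexadecimal (do_syscall_64+0x40/0x80), while dis prints them in decimal (<do_syscall_64+64>). They refer to the same location, which a one-line conversion confirms. The helper name hex_to_dec is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a hexadecimal offset (with 0x prefix) to decimal.
hex_to_dec() {
    printf '%d\n' "$1"
}

# Offset of do_syscall_64 from the call trace in the kernel log.
hex_to_dec 0x40    # prints 64
```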

The mount command

mount: lists the filesystems that were mounted at the time of the dump

crash> mount
     MOUNT           SUPERBLK     TYPE   DEVNAME   DIRNAME
ffff9c36401f1180 ffff9c364004a800 rootfs rootfs    /
ffff9c3948ee4140 ffff9c3645c9b800 proc   proc      /proc
ffff9c3948ee43c0 ffff9c3645c99000 sysfs  sysfs     /sys
ffff9c3948ee1900 ffff9c396f64e000 devtmpfs devtmpfs /dev
ffff9c3948ee0280 ffff9c396f64e800 securityfs securityfs /sys/kernel/security
ffff9c3949405400 ffff9c3645c9f800 tmpfs  tmpfs     /dev/shm
ffff9c3949404c80 ffff9c3645c98000 devpts devpts    /dev/pts
ffff9c3949404b40 ffff9c3645c99800 tmpfs  tmpfs     /run
ffff9c3949406b40 ffff9c3645c98800 tmpfs  tmpfs     /sys/fs/cgroup
ffff9c3949406280 ffff9c3645c9d000 cgroup cgroup    /sys/fs/cgroup/systemd
ffff9c3949407e00 ffff9c3645c9f000 bpf    none      /sys/fs/bpf
ffff9c3949412280 ffff9c3641002800 cgroup cgroup    /sys/fs/cgroup/pids
ffff9c39494143c0 ffff9c3641006800 cgroup cgroup    /sys/fs/cgroup/net_cls,net_prio
ffff9c3949414b40 ffff9c3641006000 cgroup cgroup    /sys/fs/cgroup/perf_event
ffff9c3949415180 ffff9c3641004000 cgroup cgroup    /sys/fs/cgroup/cpuset
ffff9c3949414500 ffff9c3641005800 cgroup cgroup    /sys/fs/cgroup/hugetlb
ffff9c3949416c80 ffff9c3641002000 cgroup cgroup    /sys/fs/cgroup/freezer
ffff9c3949417900 ffff9c3641007000 cgroup cgroup    /sys/fs/cgroup/cpu,cpuacct
ffff9c3949416280 ffff9c3641005000 cgroup cgroup    /sys/fs/cgroup/rdma
ffff9c39497e1900 ffff9c3641000800 cgroup cgroup    /sys/fs/cgroup/devices
ffff9c39497e1cc0 ffff9c3641001800 cgroup cgroup    /sys/fs/cgroup/blkio
ffff9c39497e1a40 ffff9c3641003000 cgroup cgroup    /sys/fs/cgroup/memory
ffff9c3645c42780 ffff9c3645cf7000 xfs    /dev/vda1 /
ffff9c3948e2ef00 ffff9c364226c000 selinuxfs selinuxfs /sys/fs/selinux
ffff9c3641092140 ffff9c3645ef6000 autofs systemd-1 /proc/sys/fs/binfmt_misc
ffff9c3949469900 ffff9c394778b800 hugetlbfs hugetlbfs /dev/hugepages
ffff9c3641321400 ffff9c364226d800 mqueue mqueue    /dev/mqueue
ffff9c3948c80000 ffff9c396f64a800 debugfs debugfs  /sys/kernel/debug
ffff9c3648648a00 ffff9c3640f3d000 tracefs tracefs  /sys/kernel/tracing
ffff9c3948d4bcc0 ffff9c3948cc0800 tmpfs  tmpfs     /tmp
ffff9c3948e197c0 ffff9c3648430000 configfs configfs /sys/kernel/config
ffff9c396f66ba40 ffff9c3645cf5000 fusectl fusectl  /sys/fs/fuse/connections
ffff9c3949413a40 ffff9c36438bd800 xfs    /dev/vda2 /boot
crash>

The net command

net: shows network-related information

crash> net
   NET_DEVICE     NAME   IP ADDRESS(ES)
ffff9c3641554000  lo     127.0.0.1
ffff9c3645863000  ens3   192.168.xxx.xxx
crash>

Exiting crash

The exit command

crash> exit

Getting help

The commands above are the most commonly used ones. To learn more, run the help command inside crash.

crash> help

*              files          mod            sbitmapq       union
alias          foreach        mount          search         vm
ascii          fuser          net            set            vtop
bpf            gdb            p              sig            waitq
bt             help           ps             struct         whatis
btop           ipcs           pte            swap           wr
dev            irq            ptob           sym            q
dis            kmem           ptov           sys
eval           list           rd             task
exit           log            repeat         timer
extend         mach           runq           tree

crash version: 8.0.2-1.ile2312sp1   gdb version: 10.2
For help on any command above, enter "help <command>".
For help on input options, enter "help input".
For help on output options, enter "help output".

crash>

Getting help for a specific command

crash> help ps

NAME
  ps - display process status information

SYNOPSIS
  ps [-k|-u|-G|-y policy] [-s] [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S|-A]
     [pid | task | command] ...

DESCRIPTION
  This command displays process status for selected, or all, processes
  in the system.  If no arguments are entered, the process data is
  is displayed for all processes.  Specific processes may be selected
  by using the following identifier formats:

       pid  a process PID.
      task  a hexadecimal task_struct pointer.
   command  a command name.  If a command name is made up of letters that
            are all numerical values, precede the name string with a "\".
            If the command string is enclosed within "'" characters, then
            the encompassed string must be a POSIX extended regular expression
            that will be used to match task names.

  The process list may be further restricted by the following options:

        -k  restrict the output to only kernel threads.
        -u  restrict the output to only user tasks.
        -G  display only the thread group leader in a thread group.
 -y policy  restrict the output to tasks having a specified scheduling policy
            expressed by its integer value or by its (case-insensitive) name;
            multiple policies may be entered in a comma-separated list:
              0 or NORMAL
              1 or FIFO
              2 or RR
              3 or BATCH
              4 or ISO
              5 or IDLE
              6 or DEADLINE

  The process identifier types may be mixed.  For each task, the following
  items are displayed:

    1. the process PID.
    2. the parent process PID.
    3. the CPU number that the task ran on last.
    4. the task_struct address or the kernel stack pointer of the process.
       (see -s option below)
    5. the task state (RU, IN, UN, ZO, ST, TR, DE, SW, WA, PA, ID, NE).
    6. the percentage of physical memory being used by this task.
    7. the virtual address size of this task in kilobytes.
    8. the resident set size of this task in kilobytes.
    9. the command name.

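Summary

This has been a brief introduction to troubleshooting with crash on the Inspur Yunqi Operating System (InLinux). Readers can explore further according to their own needs.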
