Kernel crash: "kernel tried to execute NX-protected page"

Environment

  • Red Hat Enterprise Linux 6, 7
    RHEL 7
    • kernel-3.10.0-514.2.2.el7
    RHEL 6
    • kernel-2.6.32-220.13.1.el6
    • kernel-2.6.32-431.23.3.el6
    • kernel-2.6.32-504.8.1.el6

Issue

  • The server crashed and rebooted itself.
  • The kernel crashed with the following panic message:

[3574572.516975] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[3574572.518146] BUG: unable to handle kernel paging request at ffff88010162fbf8
[3574572.519369] IP: [<ffff88010162fbf8>] 0xffff88010162fbf8
[3574572.520376] PGD 1a86063 PUD 80000001000001e3 
[3574572.521248] Oops: 0011 [#1] SMP 
[3574572.521959] last sysfs file: /sys/devices/system/cpu/cpu11/cache/index2/shared_cpu_map
[3574572.523456] CPU 5 
[3574572.523876] Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss mptctl mptbase openafs(P)(U) autofs4 ipmi_devintf sunrpc bonding ipv6 ext3 jbd dm_multipath power_meter ipmi_si ipmi_msghandler hpilo hpwdt lpfc scsi_dh_emc scsi_transport_fc scsi_tgt bnx2 sg e1000e microcode serio_raw iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
[3574572.532479] 
[3574572.532807] Pid: 139, comm: kswapd1 Tainted: P           ----------------   2.6.32-220.13.1.el6.x86_64 #1 HP ProLiant DL360 G6
[3574572.534715] RIP: 0010:[<ffff88010162fbf8>]  [<ffff88010162fbf8>] 0xffff88010162fbf8
[3574572.536038] RSP: 0018:ffff88060e1f7b08  EFLAGS: 00010202
[3574572.536935] RAX: ffff88010162fb28 RBX: ffff88010162fac0 RCX: ffff88060ce54d00
[3574572.538169] RDX: dead000000100100 RSI: ffff88060f538000 RDI: ffff88010162fac0
[3574572.539405] RBP: ffff88060e1f7b20 R08: ffff88060ce544d8 R09: 0000000000000000
[3574572.540669] R10: ffff880028402a00 R11: 0000000000000000 R12: ffff88060ce544c0
[3574572.541916] R13: ffff8804d6201040 R14: ffff8804d62012b8 R15: 0000000000000000
[3574572.543168] FS:  0000000000000000(0000) GS:ffff880635440000(0000) knlGS:0000000000000000
[3574572.544578] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[3574572.545543] CR2: ffff88010162fbf8 CR3: 00000007c67c4000 CR4: 00000000000006e0
[3574572.546806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3574572.548046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[3574572.549308] Process kswapd1 (pid: 139, threadinfo ffff88060e1f6000, task ffff88060e1f0ac0)
[3574572.550725] Stack:
[3574572.551220]  ffffffffa04ad49d ffff88060ce544c0 ffff88060ce544c0 ffff88060e1f7b40
[3574572.552451] <0> ffffffffa05c33ca ffff88060e1f7ba0 ffff88060e1f7b70 ffff88060e1f7b60
[3574572.553851] <0> ffffffffa05c3cc1 ffff8804d6201110 ffff88060e1f7b70 ffff88060e1f7bb0
[3574572.555339] Call Trace:
[3574572.555832]  [<ffffffffa04ad49d>] ? put_rpccred+0x13d/0x150 [sunrpc]
[3574572.556893]  [<ffffffffa05c33ca>] nfs_access_free_entry+0x1a/0x40 [nfs]
[3574572.558213]  [<ffffffffa05c3cc1>] nfs_access_free_list+0x31/0x40 [nfs]
[3574572.559306]  [<ffffffffa05c3fc0>] nfs_access_zap_cache+0xe0/0x120 [nfs]
[3574572.560424]  [<ffffffffa05caefe>] nfs_clear_inode+0x3e/0x60 [nfs]
[3574572.561492]  [<ffffffff811911ac>] clear_inode+0xac/0x140
[3574572.562388]  [<ffffffff81191280>] dispose_list+0x40/0x120
[3574572.563317]  [<ffffffff811915d4>] shrink_icache_memory+0x274/0x2e0
[3574572.564364]  [<ffffffff81129afa>] shrink_slab+0x12a/0x1a0
[3574572.565280]  [<ffffffff8112c8ad>] balance_pgdat+0x57d/0x7e0
[3574572.566249]  [<ffffffff8112cec0>] ? isolate_pages_global+0x0/0x350
[3574572.567294]  [<ffffffff8112cc46>] kswapd+0x136/0x3b0
[3574572.568150]  [<ffffffff81090c30>] ? autoremove_wake_function+0x0/0x40
[3574572.569221]  [<ffffffff8112cb10>] ? kswapd+0x0/0x3b0
[3574572.570085]  [<ffffffff810908c6>] kthread+0x96/0xa0
[3574572.570914]  [<ffffffff8100c14a>] child_rip+0xa/0x20
[3574572.571765]  [<ffffffff81090830>] ? kthread+0x0/0xa0
[3574572.572611]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
[3574572.573486] Code: 88 ff ff 98 fc 62 01 01 88 ff ff 00 00 00 00 00 00 00 00 b0 fc 62 01 01 88 ff ff 20 46 ec 81 ff ff ff ff 00 ab a1 f9 01 88 ff ff <38> fc 62 01 01 88 ff ff 69 25 0a 81 ff ff ff ff 18 fd 62 01 01 
[3574572.577192] RIP  [<ffff88010162fbf8>] 0xffff88010162fbf8
[3574572.578099]  RSP <ffff88060e1f7b08>
[3574572.578719] CR2: ffff88010162fbf8

kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffff88302c61e8c0
IP: [<ffff88302c61e8c0>] 0xffff88302c61e8c0
PGD 1a86063 PUD 80000030000001e3 
Oops: 0011 [#1] SMP 
last sysfs file: /sys/devices/virtual/net/bond0/carrier
CPU 31 
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mpt3sas mpt2sas raid_class mptctl ipmi_devintf dell_rbu nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc dm_round_robin dm_multipath microcode iTCO_wdt iTCO_vendor_support dcdbas power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp sb_edac edac_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom ahci sd_mod crc_t10dif bnx2x libcrc32c wmi tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod crc32c_intel be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi qla2xxx scsi_transport_fc scsi_tgt mptsas mptscsih mptbase scsi_transport_sas megaraid_sas [last unloaded: scsi_wait_scan]

Pid: 45116, comm: awk Tainted: P        W  ---------------    2.6.32-504.8.1.el6.x86_64 #1 Dell Inc. PowerEdge R720/0020HJ
RIP: 0010:[<ffff88302c61e8c0>]  [<ffff88302c61e8c0>] 0xffff88302c61e8c0
RSP: 0000:ffff88303c719c20  EFLAGS: 00010206
RAX: ffff88303c719df8 RBX: ffff883034719d00 RCX: ffff883043c22d38
RDX: 0000003fb4e00280 RSI: ffff88303c719c58 RDI: ffff883034719d00
RBP: ffff88303c719ca8 R08: 0000000000000000 R09: 0000000000000028
R10: ffff88304452ae00 R11: 0000000000000000 R12: ffff883054a79000
R13: ffff883043c22d38 R14: 0000003fb4e00280 R15: ffff88304452ae00
FS:  0000000000000000(0000) GS:ffff8818d49e0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88302c61e8c0 CR3: 00000030285ad000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process awk (pid: 45116, threadinfo ffff88303c718000, task ffff8830418b6ae0)
Stack:
 ffffffff8114eae4 8000001800000028 0000000000000000 0000000000000000
<d> ffffea00a7c5dd30 00000200b5082000 ffff882fef63c410 ffff883000000028
<d> 0000000000000000 0000003fb4e00000 0000000000000000 ffff881880021b48
Call Trace:
 [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
 [<ffffffff8114f0b7>] handle_pte_fault+0xf7/0xb00
 [<ffffffff8116c69a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff810516b7>] ? pte_alloc_one+0x37/0x50
 [<ffffffff8114fcea>] handle_mm_fault+0x22a/0x300
 [<ffffffff8104d0d8>] __do_page_fault+0x138/0x480
 [<ffffffff81156225>] ? do_mmap_pgoff+0x335/0x380
 [<ffffffff8152ffde>] do_page_fault+0x3e/0xa0
 [<ffffffff8152d395>] page_fault+0x25/0x30
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a8 e8 61 2c 30 88 ff ff a8 e8 61 2c 30 88 ff ff 08 36 d9 29 30 88 ff ff <40> 98 05 44 30 88 ff ff e8 b0 e3 55 30 88 ff ff 80 d5 e1 5b 30 
RIP  [<ffff88302c61e8c0>] 0xffff88302c61e8c0
 RSP <ffff88303c719c20>
CR2: ffff88302c61e8c0

Resolution

  • Contact the hardware vendor (Dell) and run a complete set of hardware diagnostic tests.
    • Focus on the CPU and memory.

Root Cause

RIP  [<ffff88302c61e8c0>] 0xffff88302c61e8c0

This address belongs to an object in the kernel's "filp" slab cache, as the kmem output in the diagnostic steps shows. It is not a valid RIP value, and that is what caused the kernel panic.

The CPU's RIP (instruction pointer) register should hold the address of the next instruction to be executed. In this crash, RIP instead holds the address of a data object in the slab, which lives in a page marked non-executable (NX). Fetching an instruction from that page is illegal and raised the exception.

This kind of misbehaviour is generally the result of a hardware fault, most likely in the CPU or its socket.
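
The distinction between a valid and an invalid RIP can be made concrete by checking which region of the kernel address space an address falls in. The sketch below is only illustrative: the region boundaries are an assumption taken from the x86_64 memory map documented for kernels of this era (Documentation/x86/x86_64/mm.txt), not something read from the vmcore, and the `classify` helper is hypothetical.

```python
# Assumed x86_64 virtual memory layout for 2.6.32-era kernels
# (half-open ranges: start inclusive, end exclusive).
REGIONS = [
    ("direct mapping of physical memory", 0xffff880000000000, 0xffffc80000000000),
    ("kernel text", 0xffffffff80000000, 0xffffffffa0000000),
    ("module space", 0xffffffffa0000000, 0xfffffffffff00000),
]


def classify(addr):
    """Return the name of the region containing addr, or 'unknown'."""
    for name, start, end in REGIONS:
        if start <= addr < end:
            return name
    return "unknown"


# Addresses copied from the panic messages in this article.
for label, addr in [
    ("faulting RIP", 0xffff88302c61e8c0),
    ("handle_pte_fault+0xf7", 0xffffffff8114f0b7),
    ("nfs_access_free_entry+0x1a [nfs]", 0xffffffffa05c33ca),
]:
    print(f"{label:34s} {addr:#018x}  {classify(addr)}")
```

Under these assumptions the faulting RIP lands in the direct mapping of physical memory, where the kernel keeps data such as slab objects, while the legitimate return addresses from the call traces land in kernel text or module space.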

Diagnostic Steps

System Information:

crash> sys | grep -e NODE -e RELEASE -e PANIC
    NODENAME: dcba3
     RELEASE: 2.6.32-504.8.1.el6.x86_64
       PANIC: "BUG: unable to handle kernel paging request at ffff88302c61e8c0"

crash> log | grep DMI:
DMI: Dell Inc. PowerEdge R720/0020HJ, BIOS 2.4.3 07/09/2014

CPU family, model name:

crash> px boot_cpu_data.x86_model_id
$1 = "Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"

The "NX-protected page" message means the kernel tried to execute code from a page marked non-executable, that is, from a data area.
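
The `Oops: 0011` value in the panic message supports this reading. It is the x86 page-fault error code printed in hexadecimal, and its bits can be decoded with a few lines of Python. This is a minimal sketch: the bit meanings follow the architectural definition of the page-fault error code, while the `decode_oops` helper and its wording are hypothetical.

```python
# (bit, meaning when set, meaning when clear); None means nothing is reported.
PF_BITS = [
    (0, "protection violation on a present page", "page not present"),
    (1, "write access", "not a write"),
    (2, "user mode", "kernel mode"),
    (3, "reserved bit set in a page-table entry", None),
    (4, "instruction fetch", None),
]


def decode_oops(code_str):
    """Decode the hex error code shown after 'Oops:' into readable flags."""
    code = int(code_str, 16)
    flags = []
    for bit, when_set, when_clear in PF_BITS:
        text = when_set if code & (1 << bit) else when_clear
        if text is not None:
            flags.append(text)
    return flags


for flag in decode_oops("0011"):
    print(flag)
```

For `0011`, bits 0 and 4 are set: the page was present, the access was an instruction fetch, and it happened in kernel mode. A present page that cannot be fetched from is what an NX violation looks like.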

Kernel Ring Buffer:

crash> log
[..]
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffff88302c61e8c0
IP: [<ffff88302c61e8c0>] 0xffff88302c61e8c0
PGD 1a86063 PUD 80000030000001e3 
Oops: 0011 [#1] SMP 
last sysfs file: /sys/devices/virtual/net/bond0/carrier
CPU 31 
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mpt3sas mpt2sas raid_class mptctl ipmi_devintf dell_rbu nfs lockd fscache auth_rpcgss nfs_acl sunrpc bonding 8021q garp stp llc dm_round_robin dm_multipath microcode iTCO_wdt iTCO_vendor_support dcdbas power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp sb_edac edac_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom ahci sd_mod crc_t10dif bnx2x libcrc32c wmi tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod crc32c_intel be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi qla2xxx scsi_transport_fc scsi_tgt mptsas mptscsih mptbase scsi_transport_sas megaraid_sas [last unloaded: scsi_wait_scan]

Pid: 45116, comm: awk Tainted: P        W  ---------------    2.6.32-504.8.1.el6.x86_64 #1 Dell Inc. PowerEdge R720/0020HJ
RIP: 0010:[<ffff88302c61e8c0>]  [<ffff88302c61e8c0>] 0xffff88302c61e8c0
RSP: 0000:ffff88303c719c20  EFLAGS: 00010206
RAX: ffff88303c719df8 RBX: ffff883034719d00 RCX: ffff883043c22d38
RDX: 0000003fb4e00280 RSI: ffff88303c719c58 RDI: ffff883034719d00
RBP: ffff88303c719ca8 R08: 0000000000000000 R09: 0000000000000028
R10: ffff88304452ae00 R11: 0000000000000000 R12: ffff883054a79000
R13: ffff883043c22d38 R14: 0000003fb4e00280 R15: ffff88304452ae00
FS:  0000000000000000(0000) GS:ffff8818d49e0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88302c61e8c0 CR3: 00000030285ad000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process awk (pid: 45116, threadinfo ffff88303c718000, task ffff8830418b6ae0)
Stack:
 ffffffff8114eae4 8000001800000028 0000000000000000 0000000000000000
<d> ffffea00a7c5dd30 00000200b5082000 ffff882fef63c410 ffff883000000028
<d> 0000000000000000 0000003fb4e00000 0000000000000000 ffff881880021b48
Call Trace:
 [<ffffffff8114eae4>] ? __do_fault+0x54/0x530
 [<ffffffff8114f0b7>] handle_pte_fault+0xf7/0xb00
 [<ffffffff8116c69a>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff810516b7>] ? pte_alloc_one+0x37/0x50
 [<ffffffff8114fcea>] handle_mm_fault+0x22a/0x300
 [<ffffffff8104d0d8>] __do_page_fault+0x138/0x480
 [<ffffffff81156225>] ? do_mmap_pgoff+0x335/0x380
 [<ffffffff8152ffde>] do_page_fault+0x3e/0xa0
 [<ffffffff8152d395>] page_fault+0x25/0x30
Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a8 e8 61 2c 30 88 ff ff a8 e8 61 2c 30 88 ff ff 08 36 d9 29 30 88 ff ff <40> 98 05 44 30 88 ff ff e8 b0 e3 55 30 88 ff ff 80 d5 e1 5b 30 
RIP  [<ffff88302c61e8c0>] 0xffff88302c61e8c0
 RSP <ffff88303c719c20>
CR2: ffff88302c61e8c0
[..]

Backtrace of the panicking task:

crash> set -p
    PID: 45116
COMMAND: "awk"
   TASK: ffff8830418b6ae0  [THREAD_INFO: ffff88303c718000]
    CPU: 31
  STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 45116  TASK: ffff8830418b6ae0  CPU: 31  COMMAND: "awk"
 #0 [ffff88303c719810] machine_kexec at ffffffff8103b5bb
 #1 [ffff88303c719870] crash_kexec at ffffffff810c9852
 #2 [ffff88303c719940] oops_end at ffffffff8152e090
 #3 [ffff88303c719970] no_context at ffffffff8104c80b
 #4 [ffff88303c7199c0] __bad_area_nosemaphore at ffffffff8104ca95
 #5 [ffff88303c719a10] bad_area_nosemaphore at ffffffff8104cb63
 #6 [ffff88303c719a20] __do_page_fault at ffffffff8104d2bf
 #7 [ffff88303c719b40] do_page_fault at ffffffff8152ffde
 #8 [ffff88303c719b70] page_fault at ffffffff8152d395
 #9 [ffff88303c719cb0] handle_pte_fault at ffffffff8114f0b7
#10 [ffff88303c719d90] handle_mm_fault at ffffffff8114fcea
#11 [ffff88303c719e00] __do_page_fault at ffffffff8104d0d8
#12 [ffff88303c719f20] do_page_fault at ffffffff8152ffde
#13 [ffff88303c719f50] page_fault at ffffffff8152d395
    RIP: 0000003fb3a09210  RSP: 00007fff671bfe38  RFLAGS: 00010206
    RAX: 0000003fb4e00280  RBX: 00007ffa26c68000  RCX: 0000003fb3a16ee7
    RDX: 0000003fb5082f88  RSI: 00007ffa26c68040  RDI: 00007ffa26c68000
    RBP: 00007fff671bffb0   R8: 0000000070000029   R9: 000000006ffffdff
    R10: 000000006ffffeff  R11: 0000000000000246  R12: 00007fff671c0088
    R13: 000000006fffff48  R14: 00007fff671bfd20  R15: 00007fff671bfcc0
    ORIG_RAX: ffffffffffffffff  CS: 0033  SS: 002b

crash> dis -rl ffff88302c61e8c0
WARNING: ffff88302c61e8c0: no associated kernel symbol found
   0xffff88302c61e8c0:  rex cwtl 
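
The disassembly above is not meaningful code. One way to see that the bytes at RIP are data is to read them as little-endian 64-bit words instead of instructions. The sketch below does this for the 16 bytes starting at the `<40>` marker in the `Code:` line of the panic message; the `as_qwords` helper is hypothetical and is not part of the crash utility.

```python
import struct

# The 16 bytes starting at the <40> marker in the "Code:" line,
# i.e. the bytes located at the faulting RIP.
code_at_rip = "40 98 05 44 30 88 ff ff e8 b0 e3 55 30 88 ff ff"


def as_qwords(hex_bytes):
    """Interpret a hex dump as little-endian 64-bit words."""
    raw = bytes.fromhex(hex_bytes)        # whitespace between bytes is ignored
    usable = len(raw) - len(raw) % 8      # drop any trailing partial word
    return [q for (q,) in struct.iter_unpack("<Q", raw[:usable])]


for q in as_qwords(code_at_rip):
    print(f"{q:#018x}")
```

Both words have the ffff88... form of kernel data addresses, which is consistent with RIP pointing at pointers stored inside a data object rather than at machine code.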

A detailed investigation shows that this address (ffff88302c61e8c0) belongs to the "filp" slab cache:

crash> kmem ffff88302c61e8c0
CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
ffff88305be30500 filp                     192      11911     16180    809     4k
SLAB              MEMORY            TOTAL  ALLOCATED  FREE
ffff88302c61e000  ffff88302c61e080     20         19     1
FREE / [ALLOCATED]
  [ffff88302c61e8c0]

      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
ffffea00a89b5690 302c61e000                0 ffff88305483d740  1 c0000000000080 slab
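
The numbers in this kmem output can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in the output itself: that objects are laid out back to back at OBJSIZE intervals starting from the slab's MEMORY address, and that the direct-map base for this kernel is 0xffff880000000000.

```python
DIRECT_MAP_BASE = 0xffff880000000000  # assumed base of the direct mapping
PAGE_MASK = ~0xfff                    # 4k pages, per the SSIZE column

rip = 0xffff88302c61e8c0              # faulting RIP
slab_memory = 0xffff88302c61e080      # MEMORY column of the slab
objsize = 192                         # OBJSIZE column of the filp cache
page_physical = 0x302c61e000          # PHYSICAL column of the page

# Which object does RIP fall in, and how far into that object?
index, offset = divmod(rip - slab_memory, objsize)
print(f"object index {index}, offset {offset} bytes into the object")

# Does the direct-map translation agree with the page reported by kmem?
phys = rip - DIRECT_MAP_BASE
print(f"physical {phys:#x}, same page as kmem: {(phys & PAGE_MASK) == page_physical}")
```

Under these assumptions RIP sits exactly at the start of the twelfth object (index 11) of the slab, matching the allocated object that kmem lists, and its physical address falls in the page shown in the PHYSICAL column.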