Analysis and fix of the vCPU pinning bug in libvirt-1.2.5

How to trigger the bug [Note: this analysis assumes the physical host has 32 cores]

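  1. Precondition: when the VM is created, every vcpu is already pinned (that is, not pinned to all CPUs)
  2. Reproduction scenario 1:
    1. Unpin all vcpus (pin each of them to 0-31)
    2. Shut down the VM
  3. Reproduction scenario 2:
    1. Unpin all vcpus (pin each of them to 0-31)
    2. Re-pin any one vcpu of the VM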
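
To make "pin to 0-31" concrete: the pinning API takes a byte array (the cpumap argument visible in the scenario 2 backtrace) with one bit per physical CPU. The sketch below is a standalone illustration with hypothetical helper names (cpu_maplen, cpumap_set_range); it does not use libvirt itself and only mirrors the bit layout of the VIR_CPU_MAPLEN and VIR_USE_CPU macros as I understand them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to hold one bit per CPU (hypothetical helper,
 * same arithmetic as (ncpus + 7) / 8). */
static size_t cpu_maplen(unsigned int ncpus)
{
    return (ncpus + 7) / 8;
}

/* Clear the map, then set bits first..last (inclusive).
 * Returns 0 on success, -1 if the range is invalid or does not fit. */
static int cpumap_set_range(unsigned char *cpumap, size_t maplen,
                            unsigned int first, unsigned int last)
{
    unsigned int cpu;

    if (first > last || last / 8 >= maplen)
        return -1;

    memset(cpumap, 0, maplen);
    for (cpu = first; cpu <= last; cpu++)
        cpumap[cpu / 8] |= (unsigned char)(1u << (cpu % 8));
    return 0;
}
```

On a 32-core host the "unpinned" map for 0-31 is four bytes of 0xff. A map whose last byte is 0x7f instead covers CPUs 0-30, which appears to match the first four bytes (\377\377\377\177) of the cpumap shown in the scenario 2 backtrace.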

 

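Bug analysis

A full unpin runs virDomainVcpuPinDel, which deletes every element of def->cputune.vcpupin (cputune.vcpupin points to the per-vcpu physical CPU pinning entries). After a full unpin, all elements that cputune.vcpupin pointed to have been deleted, so cputune.vcpupin is left as a dangling pointer while cputune.nvcpupin is 0. If free is then called on cputune.vcpupin again, a double free is triggered.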

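The cputune structure: (screenshot missing from the source)

The vcpupin-related structures: (screenshot missing from the source)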

Call chain when a vcpu is unpinned:

  qemuDomainPinVcpuFlags  -->
    virDomainVcpuPinDel  -->
      VIR_DELETE_ELEMENT  [which actually calls virDeleteElementsN]

  

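Only a full unpin sets doReset to a non-zero value (one of the key trigger conditions), which in turn leads to the call to virDomainVcpuPinDel.

At this point virDeleteElementsN is used to delete all elements of cputune.vcpupin. This operation ultimately leaves cputune.vcpupin as a dangling pointer.

Both scenario 1 and scenario 2 above eventually call one key function: virDomainVcpuPinDefArrayFree

Once the conditions above are met, cputune.vcpupin is a dangling pointer. It is passed in as the parameter def, and executing VIR_FREE(def) triggers the crash.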
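
The sequence described above can be replayed without crashing by using a small tracking allocator. This is only a simplified model under stated assumptions: cputune_model, unpin_all, array_free_unpatched and the track_* helpers are hypothetical names, not libvirt code, and "freed" blocks are merely marked as released so that the stale pointer can be examined safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BLOCKS 16

static void *blocks[MAX_BLOCKS];   /* every block handed out so far */
static int released[MAX_BLOCKS];   /* 1 once the block was "freed" */
static size_t nblocks;
static int double_frees;           /* second releases that were detected */

static void *track_alloc(size_t size)
{
    void *p;

    if (nblocks >= MAX_BLOCKS || !(p = malloc(size)))
        return NULL;
    blocks[nblocks] = p;
    released[nblocks] = 0;
    nblocks++;
    return p;
}

/* Mark a block as released; count it if it was released before. */
static void track_free(void *p)
{
    size_t i;

    for (i = 0; p && i < nblocks; i++) {
        if (blocks[i] != p)
            continue;
        if (released[i])
            double_frees++;
        else
            released[i] = 1;
        return;
    }
}

/* Really give all memory back and clear the bookkeeping. */
static void track_reset(void)
{
    while (nblocks > 0)
        free(blocks[--nblocks]);
    double_frees = 0;
}

struct cputune_model {             /* hypothetical stand-in for def->cputune */
    int **vcpupin;
    size_t nvcpupin;
};

/* Model of a full unpin: elements and array storage are released and the
 * count drops to 0, but the owner's pointer keeps the old address. */
static void unpin_all(struct cputune_model *t)
{
    size_t i;

    for (i = 0; i < t->nvcpupin; i++)
        track_free(t->vcpupin[i]);
    track_free(t->vcpupin);
    t->nvcpupin = 0;
}

/* Model of the unpatched array-free function: always frees the array. */
static void array_free_unpatched(int **def, size_t nvcpupin)
{
    size_t i;

    if (!def)
        return;
    for (i = 0; i < nvcpupin; i++)
        track_free(def[i]);
    track_free(def);
}

/* Returns the number of double frees detected, or -1 on setup failure. */
static int simulate(size_t nvcpus, int unpin_first)
{
    struct cputune_model t;
    size_t i;
    int result;

    track_reset();
    if (nvcpus == 0 || nvcpus + 1 > MAX_BLOCKS)
        return -1;
    t.vcpupin = track_alloc(nvcpus * sizeof(*t.vcpupin));
    if (!t.vcpupin)
        return -1;
    for (i = 0; i < nvcpus; i++)
        t.vcpupin[i] = track_alloc(sizeof(int));
    t.nvcpupin = nvcpus;

    if (unpin_first)
        unpin_all(&t);
    array_free_unpatched(t.vcpupin, t.nvcpupin);

    result = double_frees;
    track_reset();
    return result;
}
```

In this model, simulate(4, 1) reports one double free (the array storage is released a second time through the stale pointer), while simulate(4, 0) reports none, because without the preceding full unpin the array is released exactly once.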

 

Crash backtrace for scenario 1 [VM shutdown]:

(gdb) bt
#0  0x00007ffff4a5c625 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff4a5de05 in abort () at abort.c:92
#2  0x00007ffff4a9a537 in __libc_message (do_abort=2, fmt=0x7ffff4b828c0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x00007ffff4a9ff4e in malloc_printerr (action=3, str=0x7ffff4b82c50 "double free or corruption (!prev)", ptr=<value optimized out>, ar_ptr=<value optimized out>) at malloc.c:6350
#4  0x00007ffff4aa2cf0 in _int_free (av=0x7ffff4db9e80, p=0x7fffdc0ec890, have_lock=0) at malloc.c:4836
#5  0x00007ffff74284b9 in virFree (ptrptr=0x7fffffffcd58) at util/viralloc.c:582
#6  0x00007ffff74990dd in virDomainVcpuPinDefArrayFree (def=0x7fffdc0ec8a0, nvcpupin=0) at conf/domain_conf.c:1929
#7  0x00007ffff749c883 in virDomainDefFree (def=0x7fffdc0ec190) at conf/domain_conf.c:2083
#8  0x00007fffe35a873f in qemuProcessStop (driver=<value optimized out>, vm=0x7fffdc140b40, reason=VIR_DOMAIN_SHUTOFF_SHUTDOWN, flags=<value optimized out>) at qemu/qemu_process.c:4527
#9  0x00007fffe35a8e8c in qemuProcessHandleMonitorEOF (mon=<value optimized out>, vm=0x7fffdc140b40, opaque=0x7fffdc012c90) at qemu/qemu_process.c:329
#10 0x00007fffe35c6d11 in qemuMonitorIO (watch=<value optimized out>, fd=<value optimized out>, events=<value optimized out>, opaque=0x7fffdc13f530) at qemu/qemu_monitor.c:746
#11 0x00007ffff74452f7 in virEventPollDispatchHandles () at util/vireventpoll.c:510
#12 virEventPollRunOnce () at util/vireventpoll.c:660
#13 0x00007ffff7443c90 in virEventRunDefaultImpl () at util/virevent.c:308
#14 0x00007ffff7fe460d in virNetServerRun (srv=0x7ffff8225980) at rpc/virnetserver.c:1139
#15 0x00007ffff7faab67 in main (argc=<value optimized out>, argv=<value optimized out>) at libvirtd.c:1507

 

Crash backtrace for scenario 2 [re-pin after a full unpin]:

(gdb) bt
#0  0x00007ffff4a5c625 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff4a5de05 in abort () at abort.c:92
#2  0x00007ffff4a9a537 in __libc_message (do_abort=2, fmt=0x7ffff4b828c0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x00007ffff4a9ff4e in malloc_printerr (action=3, str=0x7ffff4b82b98 "double free or corruption (fasttop)", ptr=<value optimized out>, ar_ptr=<value optimized out>) at malloc.c:6350
#4  0x00007ffff4aa2cad in _int_free (av=0x7fffcc000020, p=0x7fffcc0008b0, have_lock=0) at malloc.c:4836
#5  0x00007ffff74284b9 in virFree (ptrptr=0x7fffe8f718f8) at util/viralloc.c:582
#6  0x00007ffff74990dd in virDomainVcpuPinDefArrayFree (def=0x7fffcc0008c0, nvcpupin=0) at conf/domain_conf.c:1929
#7  0x00007fffe35f67a1 in qemuDomainPinVcpuFlags (dom=<value optimized out>, vcpu=0, cpumap=0x7fffd40016e0 "\377\377\377\177\377\177", maplen=4, flags=0) at qemu/qemu_driver.c:4435
#8  0x00007ffff75335aa in virDomainPinVcpu (domain=0x7fffd4001350, vcpu=0, cpumap=0x7fffd40016e0 "\377\377\377\177\377\177", maplen=4) at libvirt.c:9621
#9  0x00007ffff7fcdba4 in remoteDispatchDomainPinVcpu (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, rerr=0x7fffe8f71ba0, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:6578
#10 remoteDispatchDomainPinVcpuHelper (server=<value optimized out>, client=<value optimized out>, msg=<value optimized out>, rerr=0x7fffe8f71ba0, args=<value optimized out>, ret=<value optimized out>) at remote_dispatch.h:6556
#11 0x00007ffff7590578 in virNetServerProgramDispatchCall (prog=0x7ffff8227800, server=0x7ffff8225980, client=0x7ffff8232240, msg=0x7ffff8231e70) at rpc/virnetserverprogram.c:437
#12 virNetServerProgramDispatch (prog=0x7ffff8227800, server=0x7ffff8225980, client=0x7ffff8232240, msg=0x7ffff8231e70) at rpc/virnetserverprogram.c:307
#13 0x00007ffff7fe4f2e in virNetServerProcessMsg (srv=<value optimized out>, client=0x7ffff8232240, prog=<value optimized out>, msg=0x7ffff8231e70) at rpc/virnetserver.c:172
#14 0x00007ffff7fe5ba8 in virNetServerHandleJob (jobOpaque=<value optimized out>, opaque=0x7ffff8225980) at rpc/virnetserver.c:193
#15 0x00007ffff74822ee in virThreadPoolWorker (opaque=<value optimized out>) at util/virthreadpool.c:145
#16 0x00007ffff7481776 in virThreadHelper (data=<value optimized out>) at util/virthread.c:197
#17 0x00007ffff51ccaa1 in start_thread (arg=0x7fffe8f72700) at pthread_create.c:301
#18 0x00007ffff4b1293d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

 

 

Bug fix:

Modifying the virDomainVcpuPinDefArrayFree function is sufficient:

 

void virDomainVcpuPinDefArrayFree(virDomainVcpuPinDefPtr *def, int nvcpupin)
{
    size_t i;

    if (!def)
        return;

    for (i = 0; i < nvcpupin; i++) {
        virDomainVcpuPinDefFree(def[i]);
    }

    /* When nvcpupin is 0, def is a dangling pointer whose storage has
     * already been released, so skip VIR_FREE to avoid freeing it twice. */
    if (nvcpupin)
        VIR_FREE(def);
}
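
The effect of the guard can be checked in isolation with a small standalone model. The names below (array_free_patched, array_frees) are hypothetical, plain free stands in for VIR_FREE, and int pointers stand in for the pin definitions, so this is a sketch of the idea rather than the libvirt function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int array_frees;   /* how many times the array storage was released */

/* Model of the patched function: the array itself is only released
 * when the element count is non-zero. */
static void array_free_patched(int **def, size_t nvcpupin)
{
    size_t i;

    if (!def)
        return;

    for (i = 0; i < nvcpupin; i++)
        free(def[i]);

    if (nvcpupin) {
        free(def);
        array_frees++;
    }
}
```

Calling array_free_patched with a non-NULL pointer and a count of 0 leaves the pointer alone, which is the stale-pointer case from the analysis. Note that the guard relies on the assumption that a count of 0 always means the storage has already been released; an array that was allocated but holds no entries would be leaked. An alternative that does not depend on this assumption would be to reset the owner's pointer to NULL at the point where the last element is deleted.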
