Perform a bisect test to identify the kernel problem (by quqi99)

Tags: kernel
Author: Zhang Hua (张华), published 2016-12-15
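  1. First test the latest Ubuntu kernel for Xenial (sudo apt-cache policy linux-image-generic-lts-xenial). If it does not show the problem, the fix from the upstream kernel has already been backported to the Ubuntu Xenial kernel; if it shows the problem as well, you generally need to bisect the upstream kernel.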
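To compare the running kernel with what the archive currently offers, the `Candidate:` line of the `apt-cache policy` output can be pulled out. A sketch, assuming the usual `Installed:` / `Candidate:` layout of that output; `candidate_version` is a hypothetical helper name:

```shell
# Print the version found on the "Candidate:" line of
# `apt-cache policy <package>` output read from stdin.
candidate_version() {
    awk '$1 == "Candidate:" { print $2; exit }'
}

# Typical use on an Ubuntu host (not executed here):
#   uname -r
#   apt-cache policy linux-image-generic-lts-xenial | candidate_version
```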
  2. Skim the commit history
git clone git://
git clone git://
git log --tags --simplify-by-decoration --pretty="format:%ci %d"
git log --since="2015-06-21 22:05:43" --before="2015-07-05 11:01:52" --oneline --no-merges |wc -l
#print each commit's author date (%ad); note that --since/--before select on the commit date, which may change when a commit is rebased or cherry-picked
git log --since="2015-06-21 22:05:43" --before="2015-07-05 11:01:52" --format="%ad %h %s" --date=iso --no-merges > diff
#Unfortunately, 'git log b953c0d..d770e55' or 'git log tag1..tag2' is not guaranteed to give the expected list,
#because those commits may be on different branches.
#'A..B' means all commits reachable from B (here v4.2-rc1) that are not reachable from A (here v4.1)
#git log --no-merges --oneline v4.1..v4.2-rc1 > diff
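The commit count from `wc -l` above also gives a rough idea of how many kernels will have to be built and tested, since bisection halves the candidate range on each round. A small sketch; `bisect_steps` is my own helper name, not a git command, and the result is only an estimate:

```shell
# bisect_steps COUNT
# Print roughly how many rounds a bisection over COUNT commits needs,
# i.e. the number of halvings until one commit is left (ceil(log2(COUNT))).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))     # halve, rounding up
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Example: feed it the number printed by the `git log ... | wc -l` command.
#   bisect_steps 12000
```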
  3. Continue bisecting in the upstream kernel. In this case the problem had been narrowed down to between v4.1 and v4.2-rc1.
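git bisect start
git bisect bad v4.2-rc1              #first version known to show the problem
git bisect good v4.1                 #last version known to work
#build and test each commit that git checks out, then mark it with 'git bisect good' or 'git bisect bad'

git checkout bisect/bad              #when the bisect has finished, this ref points at the first bad commit
#git bisect log > bisectlog          #save the bisect history
git bisect reset                     #leave bisect mode and return to the original HEAD
#git bisect replay bisectlog         #replay a saved bisect history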
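When the check for the bug can be scripted (for example a module that fails to build, or a reproducer that runs inside a VM), the manual good/bad marking can be handed over to `git bisect run`. It treats exit code 0 as good, 125 as "skip this commit", and any other code from 1 to 127 as bad. A minimal sketch; `bisect_verdict`, `bisect-test.sh` and `run-my-test.sh` are hypothetical names:

```shell
# bisect_verdict BUILD_STATUS TEST_STATUS
# Map the exit statuses of a build step and a test step to the exit code
# that `git bisect run` expects.
bisect_verdict() {
    if [ "$1" -ne 0 ]; then
        return 125    # commit cannot be built: ask git to skip it
    fi
    if [ "$2" -ne 0 ]; then
        return 1      # test failed: mark the commit bad
    fi
    return 0          # test passed: mark the commit good
}

# Hypothetical wrapper, saved as bisect-test.sh and started with
#   git bisect run ./bisect-test.sh
# (./run-my-test.sh stands for your own reproducer):
#
#   build=0; result=0
#   make -j "$(getconf _NPROCESSORS_ONLN)" >/dev/null 2>&1 || build=$?
#   [ "$build" -eq 0 ] && { ./run-my-test.sh || result=$?; }
#   bisect_verdict "$build" "$result"
```

A bug that only shows up after booting the new kernel on real hardware still needs the manual loop.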

sudo apt-get install git build-essential kernel-package fakeroot libncurses5-dev libssl-dev ccache
cp /boot/config-`uname -r` .config   #or make menuconfig 
make localmodconfig                  #speed up test builds by enabling only the modules that are currently loaded
make clean
yes '' | make oldconfig              #bring the config file up to date.
make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-custom
#make drivers/usb/host/ehci-hcd.ko   #build just one module
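Every bisect round produces another kernel package, and with a fixed `LOCALVERSION=-custom` they all carry the same suffix. One option, sketched here with a hypothetical helper `localversion_for`, is to fold the abbreviated commit id into the suffix so that the installed test kernels can be told apart:

```shell
# localversion_for COMMIT_ID
# Print a LOCALVERSION suffix built from the first 7 characters of the commit id.
localversion_for() {
    short=$(printf '%s' "$1" | cut -c1-7)
    printf '%s\n' "-bisect-$short"
}

# Intended use inside the kernel tree (not executed here):
#   make -j "$(getconf _NPROCESSORS_ONLN)" deb-pkg \
#        LOCALVERSION="$(localversion_for "$(git rev-parse HEAD)")"
```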
  4. (Optional) SR-IOV test method
$ cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-4.1.36-040136-generic root=UUID=70f8e4d5-1773-4bb9-9dcc-f46ac966ebac ro pci-stub.ids=8086:10ca apparmor=0 intel_iommu=pt intel_iommu=on pci=assign-busses

ln -s /etc/apparmor.d/usr.sbin.libvirtd  /etc/apparmor.d/disable/
ln -s /etc/apparmor.d/usr.lib.libvirt.virt-aa-helper  /etc/apparmor.d/disable/
apparmor_parser -R  /etc/apparmor.d/usr.sbin.libvirtd
apparmor_parser -R  /etc/apparmor.d/usr.lib.libvirt.virt-aa-helper

sudo dmesg |grep -e DMAR -e IOMMU
sudo lspci |grep 82576
sudo virsh nodedev-list --tree 
find /sys/kernel/iommu_groups/ -type l
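Before touching sysfs it can help to confirm that the IOMMU options really made it onto the running kernel's command line. A minimal sketch; `has_kernel_arg` is a hypothetical helper, and its optional second argument exists only so the check can be tried against a sample string:

```shell
# has_kernel_arg ARG [CMDLINE]
# Succeed when ARG appears as a whole word on the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline.
has_kernel_arg() {
    cmdline=${2:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example on the test host:
#   has_kernel_arg intel_iommu=on || echo "IOMMU not enabled on the command line"
```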

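readlink -f /sys/bus/pci/devices/0000:07:10.2/iommu_group
echo 0000:07:10.2 > /sys/bus/pci/devices/0000:07:10.2/driver/unbind
#register the device ID with vfio-pci first; the driver only accepts devices whose ID it knows
echo "8086 10ca" > /sys/bus/pci/drivers/vfio-pci/new_id
#often unnecessary: writing new_id already makes vfio-pci probe matching unbound devices
echo 0000:07:10.2 > /sys/bus/pci/drivers/vfio-pci/bind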

sudo tail -f /var/log/libvirt/qemu/openstack.log
sudo virsh attach-device openstack /root/new-device.xml --live --config
$ sudo cat /root/new-device.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x07' slot='0x10' function='0x3'/>
  </source>
</hostdev>
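When several virtual functions have to be tested, writing the device XML by hand gets tedious. The following sketch derives it from a PCI address in the `domain:bus:slot.function` form that `lspci -D` and sysfs use; `hostdev_xml` is a hypothetical helper, and the element layout assumes libvirt's usual `<hostdev>` / `<source>` / `<address>` nesting:

```shell
# hostdev_xml PCI_ADDRESS
# Print a libvirt <hostdev> element for an address such as 0000:07:10.3.
hostdev_xml() {
    addr=$1
    domain=${addr%%:*}; rest=${addr#*:}     # 0000 | 07:10.3
    bus=${rest%%:*};    rest=${rest#*:}     # 07   | 10.3
    slot=${rest%%.*};   func=${rest#*.}     # 10   | 3
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}

# Example (not executed here):
#   hostdev_xml 0000:07:10.3 > /root/new-device.xml
#   sudo virsh attach-device openstack /root/new-device.xml --live --config
```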
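  5. Once the fix patch has been found, there is still the SRU (Stable Release Update) question. The upstream kernel has LTS releases, and so does the Ubuntu kernel.
    a) For Xenial, for example, the Ubuntu kernel is v4.4, which is also an upstream LTS kernel. The fix should therefore be backported to the upstream v4.4 kernel first; the Ubuntu LTS kernel periodically pulls from the upstream LTS kernel, so no separate backport to the Ubuntu kernel is needed.
    b) For Trusty, the Ubuntu kernel is 3.13, which is not an upstream LTS kernel, so the fix has to be backported directly to the Ubuntu v3.13 kernel.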
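The rule in a) and b) boils down to one lookup: is the base version of the Ubuntu kernel also an upstream LTS version? A sketch with hypothetical names; the list of upstream LTS versions is passed in by the caller rather than hard-coded, because it changes over time:

```shell
# backport_target BASE_VERSION "LTS_VERSION ..."
# Print where a fix should be sent first.
backport_target() {
    base=$1
    for lts in $2; do
        if [ "$lts" = "$base" ]; then
            echo "upstream-stable"   # Ubuntu pulls it in from the upstream LTS tree
            return 0
        fi
    done
    echo "ubuntu-kernel"             # no upstream LTS tree carries this version
}
```

With the two cases from the text, `backport_target 4.4 "4.4"` prints `upstream-stable` and `backport_target 3.13 "4.4"` prints `ubuntu-kernel`.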




