First, two in-depth and very helpful links:
http://www.kgdb.info/category/kgdb/understand_kgdb/
http://kernel.org/pub/linux/kernel/people/jwessel/kdb/index.html
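The steps come first; the rambling is at the end, along with a fair amount of pasted source:
Kernel version 2.6.32, gdb version 6.8 (newer versions should work as well). There are plenty of guides online on how to build the kernel, so I will not go over that again, including the few key options.
1. Build the kernel and add the boot parameter kgdboc=ttyS0. The kgdbwait parameter makes the kernel stop during boot, although by that point most of the kernel has already come up. If you use kgdbwait, it must come after the kgdboc parameter, otherwise it has no effect.
2. Run gdb vmlinux, then enter the command target remote /dev/pts/2. Because I am using kvm, the serial port is a virtual serial device file (the pts number varies between runs).
3. If you used kgdbwait, kgdb should connect at this point, as shown below: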
- (gdb) target remote /dev/pts/5
- Remote debugging using /dev/pts/5
- kgdb_breakpoint () at kernel/kgdb.c:1721
- 1721 wmb(); /* Sync point after breakpoint */
- (gdb) bt
- #0 kgdb_breakpoint () at kernel/kgdb.c:1721
- #1 kgdb_initial_breakpoint () at kernel/kgdb.c:1631
- #2 kgdb_register_io_module (new_kgdb_io_ops=0xc08e5638) at kernel/kgdb.c:1673
- #3 0xc063557e in configure_kgdboc () at drivers/serial/kgdboc.c:67
- #4 0xc092209e in init_kgdboc () at drivers/serial/kgdboc.c:88
- #5 0xc0401033 in do_one_initcall (fn=0xc092208b <init_kgdboc>)
- at init/main.c:721
- #6 0xc08fa375 in do_initcalls () at init/main.c:761
- #7 do_basic_setup () at init/main.c:783
- #8 kernel_init (unused=<value optimized out>) at init/main.c:879
- #9 0xc042e0c7 in kernel_thread_helper () at arch/x86/kernel/entry_32.S:990
- (gdb)
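You are now in debugging mode.
4. If you did not use kgdbwait (I usually don't, it is a hassle), wait for the virtual machine to boot normally and then enter: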
- [root@localhost /]# echo g > /proc/sysrq-trigger
- Program received signal SIGTRAP, Trace/breakpoint trap.
- [Switching to Thread 943]
- kgdb_breakpoint () at kernel/kgdb.c:1721
- 1721 wmb(); /* Sync point after breakpoint */
- (gdb) bt
- #0 kgdb_breakpoint () at kernel/kgdb.c:1721
- #1 sysrq_handle_gdb (key=<value optimized out>, tty=0x0) at kernel/kgdb.c:1581
- #2 0xc061eae6 in __handle_sysrq (key=103, tty=0x0, check_mask=0)
- at drivers/char/sysrq.c:521
- #3 0xc061eb9c in write_sysrq_trigger (file=<value optimized out>,
- buf=0xb77b0000 "g\n", count=2, ppos=0xdeeb4f98) at drivers/char/sysrq.c:599
- #4 0xc05452da in proc_reg_write (file=0xdf8a77c0, buf=0xb77b0000 "g\n",
- count=2, ppos=0xdeeb4f98) at fs/proc/inode.c:207
- #5 0xc050af1c in vfs_write (file=0xdf8a77c0, buf=0xb77b0000 "g\n",
- count=<value optimized out>, pos=0xdeeb4f98) at fs/read_write.c:347
- #6 0xc050b09d in sys_write (fd=1, buf=0xb77b0000 "g\n", count=2)
- at fs/read_write.c:399
- #7 0xc042d595 in system_call () at arch/x86/kernel/entry_32.S:529
- #8 0xb77b0000 in ?? ()
- #9 0x00000002 in ?? ()
- #10 0xb77b0000 in ?? ()
- #11 0xbf9a7d70 in ?? ()
- #12 0x00000004 in ?? ()
- Backtrace stopped: previous frame inner to this frame (corrupt stack?)
- (gdb)
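5. I started using kgdb because I wanted to debug kernel modules; with the kernel itself I only wanted to look around inside. At this point, though, modules cannot be debugged yet. The only option is to load their symbols with gdb's add-symbol-file command and debug from there:
Next, load the module in the virtual machine with insmod my_dir.ko; gdb then shows the following:
Can anyone explain to me why mod is inaccessible one moment and valid the next? If it never becomes valid for you, step into load_module and work from there.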
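Step 1 says that kgdbwait only takes effect when it follows kgdboc on the kernel command line. Here is a minimal Python sketch of that ordering check, meant to be run against a command-line string such as the contents of /proc/cmdline. The function name and the sample strings (including root=/dev/sda1) are made up for illustration:

```python
def kgdbwait_is_effective(cmdline):
    """Return True if kgdbwait appears after a kgdboc=... parameter."""
    params = cmdline.split()
    # Position of the first kgdboc=... parameter, or None if it is absent.
    kgdboc_index = next(
        (i for i, p in enumerate(params) if p.startswith("kgdboc=")), None
    )
    if kgdboc_index is None or "kgdbwait" not in params:
        return False
    return params.index("kgdbwait") > kgdboc_index


# Hypothetical command lines, for illustration only.
print(kgdbwait_is_effective("ro root=/dev/sda1 kgdboc=ttyS0 kgdbwait"))
print(kgdbwait_is_effective("ro root=/dev/sda1 kgdbwait kgdboc=ttyS0"))
```

On a running guest you could pass it open("/proc/cmdline").read() instead of the sample strings.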
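That's it. I recommend reading the links at the top first. I don't really understand sysrq and the like myself, but my guess is that it executes an instruction that triggers a trap, dropping the system into an exception handler so that kgdb can take over. I also found material online about gdbmod, gdb-light and so on, but could not get any of it working (for the latter, gdb 7.0 would not compile, probably because my f10 is too old, and I did not feel like fighting with it). For reference, this is how the add-symbol-file command is used:
You only need to give it the ELF file and the start address of the text section. At first I simply used: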
add-symbol-file /home/sword/se/my_proc/my_dir.o 0xXXXXXXXX
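But that does not let you debug my_init, the module initialization function. The kernel test functions I write all run once inside my_init, which then returns -1 so the module does not stay loaded; that saves me from ending up with a buggy module that cannot be unloaded and having to reboot. So, as in the gdb session, I also added the .init.text and .exit.text sections, which makes it possible to debug the module's initialization function. If your module has global variables and the like, you also need to add the data sections; you can look them up the same way: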
(gdb) p *(mod->sect_attrs->attrs+n)
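n can go up to 16; you can use objdump (for example objdump -h) to look at the module's section table.
If you try this, you will inevitably have to check your own kernel source: the sect_attrs member of the mod structure is not always present, and sometimes a kernel build option has to be enabled for it.
One last remark: both module loading via insmod and add-symbol-file work by parsing the ELF headers.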
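The add-symbol-file command takes the .text address first, followed by one -s pair per extra section, while gdb prints the sect_attrs addresses in decimal. Here is a small Python sketch that assembles the command from such values. The helper name build_add_symbol_file is my own invention; the path and addresses are the ones from the session in this post:

```python
def build_add_symbol_file(obj_path, sections):
    """Build a gdb add-symbol-file command from {section name: load address}."""
    if ".text" not in sections:
        raise ValueError("the .text address is required")
    # The .text address is the positional ADDR argument.
    parts = ["add-symbol-file", obj_path, hex(sections[".text"])]
    # Every other section becomes a "-s SECT SECT_ADDR" pair.
    for name, addr in sections.items():
        if name != ".text":
            parts += ["-s", name, hex(addr)]
    return " ".join(parts)


# Decimal addresses as printed by "p *(mod->sect_attrs->attrs+n)".
sections = {
    ".text": 3770195968,
    ".exit.text": 3770196136,
    ".init.text": 3770208256,
}
print(build_add_symbol_file("/home/sword/se/my_proc/my_dir.o", sections))
```

The hex form matches what gdb echoes back when it asks for confirmation (.text_addr = 0xe0b8a000 and so on), which makes a mistyped address easier to spot.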
- (gdb) l sys_init_module
- 2562 }
- 2563
- 2564 /* This is where the real work happens */
- 2565 SYSCALL_DEFINE3(init_module, void __user *, umod,
- 2566 unsigned long, len, const char __user *, uargs)
- 2567 {
- 2568 struct module *mod;
- 2569 int ret = 0;
- 2570
- 2571 /* Must have permission */
- (gdb) l
- 2572 if (!capable(CAP_SYS_MODULE) || modules_disabled)
- 2573 return -EPERM;
- 2574
- 2575 /* Only one module load at a time, please */
- 2576 if (mutex_lock_interruptible(&module_mutex) != 0)
- 2577 return -EINTR;
- 2578
- 2579 /* Do all the hard work */
- 2580 mod = load_module(umod, len, uargs);
- 2581 if (IS_ERR(mod)) {
- (gdb)
- 2582 mutex_unlock(&module_mutex);
- 2583 return PTR_ERR(mod);
- 2584 }
- 2585
- 2586 /* Drop lock so they can recurse */
- 2587 mutex_unlock(&module_mutex);
- 2588
- 2589 blocking_notifier_call_chain(&module_notify_list,
- 2590 MODULE_STATE_COMING, mod);
- 2591
- (gdb)
- 2592 do_mod_ctors(mod);
- 2593 /* Start the module */
- 2594 if (mod->init != NULL)
- (gdb) b 2580
- Breakpoint 4 at 0xc049fd9c: file kernel/module.c, line 2580.
- (gdb) c
- Breakpoint 4, sys_init_module (umod=0x9002018, len=99891, uargs=0x9002008 "")
- at kernel/module.c:2580
- 2580 mod = load_module(umod, len, uargs);
- (gdb) p *mod
- Cannot access memory at address 0x0
- (gdb) n
- 2581 if (IS_ERR(mod)) {
- (gdb) p *mod
- Cannot access memory at address 0x0
- (gdb) n
- 2580 mod = load_module(umod, len, uargs);
- (gdb) p *mod
- $5 = {state = MODULE_STATE_COMING, list = {next = 0xc08cfce4,
- prev = 0xc08cfce4}, name = "my_dir", '\0' <repeats 53 times>, mkobj = {
- kobj = {name = 0xdef4cd20 "my_dir", entry = {next = 0xdf859800,
- prev = 0xdf8b0a04}, parent = 0xdf85980c, kset = 0xdf859800,
- ktype = 0xc08cf480, sd = 0xdef4d8fc, kref = {refcount = {counter = 3}},
- state_initialized = 1, state_in_sysfs = 1, state_add_uevent_sent = 1,
- state_remove_uevent_sent = 0, uevent_suppress = 0}, mod = 0xe0b8a160,
- drivers_dir = 0x0, mp = 0x0}, modinfo_attrs = 0xdeeb1080, version = 0x0,
- srcversion = 0xdf801200 "A8DC41FB925141E6B17A45D", holders_dir = 0xdf371a80,
- syms = 0x0, crcs = 0x0, num_syms = 0, kp = 0x0, num_kp = 0,
- num_gpl_syms = 0, gpl_syms = 0x0, gpl_crcs = 0x0, unused_syms = 0x0,
- unused_crcs = 0x0, num_unused_syms = 0, num_unused_gpl_syms = 0,
- unused_gpl_syms = 0x0, unused_gpl_crcs = 0x0, gpl_future_syms = 0x0,
- gpl_future_crcs = 0x0, num_gpl_future_syms = 0, num_exentries = 0,
- extable = 0x0, init = 0xe0b8d000, module_init = 0xe0b8d000,
- module_core = 0xe0b8a000, init_size = 1291, core_size = 830,
- init_text_size = 295, core_text_size = 173, arch = {<No data fields>},
- taints = 0, num_bugs = 0, bug_list = {next = 0xc08dd054, prev = 0xc08dd054},
- bug_table = 0x0, symtab = 0xe0b8d128, core_symtab = 0xe0b8a2c0,
- num_symtab = 44, core_num_syms = 5, strtab = 0xe0b8d3e8 "",
- core_strtab = 0xe0b8a310 "", sect_attrs = 0xdee0a000,
- notes_attrs = 0xdfa4f340, percpu = 0x0, args = 0xdef4cd80 "",
- tracepoints = 0x0, num_tracepoints = 0, trace_bprintk_fmt_start = 0x0,
- num_trace_bprintk_fmt = 0, trace_events = 0x0, num_trace_events = 0,
- modules_which_use_me = {next = 0xe0b8a2a4, prev = 0xe0b8a2a4},
- waiter = 0xdf8e4170, exit = 0xe0b8a0a8, refptr = 0xc0977220 "", ctors = 0x0,
- num_ctors = 0}
- (gdb)
- (gdb) p *(mod->sect_attrs->attrs+1)
- $6 = {mattr = {attr = {name = 0xdeeab460 ".text", owner = 0x0, mode = 292},
- show = 0xc049d3e0 <module_sect_show>, store = 0, setup = 0, test = 0,
- free = 0}, name = 0xdeeab460 ".text", address = 3770195968}
- (gdb) p *(mod->sect_attrs->attrs+2)
- $7 = {mattr = {attr = {name = 0xdeeab4e0 ".exit.text", owner = 0x0,
- mode = 292}, show = 0xc049d3e0 <module_sect_show>, store = 0, setup = 0,
- test = 0, free = 0}, name = 0xdeeab4e0 ".exit.text", address = 3770196136}
- (gdb) p *(mod->sect_attrs->attrs+3)
- $8 = {mattr = {attr = {name = 0xdeeab5a0 ".init.text", owner = 0x0,
- mode = 292}, show = 0xc049d3e0 <module_sect_show>, store = 0, setup = 0,
- test = 0, free = 0}, name = 0xdeeab5a0 ".init.text", address = 3770208256}
- (gdb) add-symbol-file /home/sword/se/my_proc/my_dir.o 3770195968 -s .exit.text 3770196136 -s .init.text 3770208256
- add symbol table from file "/home/sword/se/my_proc/my_dir.o" at
- .text_addr = 0xe0b8a000
- .exit.text_addr = 0xe0b8a0a8
- .init.text_addr = 0xe0b8d000
- (y or n) y
- Reading symbols from /home/sword/se/my_proc/my_dir.o...done.
- (gdb) b my_init
- Breakpoint 5 at 0xe0b8d00e: file /home/sword/se/my_proc/my_dir.c, line 84.
- (gdb) c
- Continuing.
- Breakpoint 5, kmem_cache_alloc_notrace () at include/linux/slab_def.h:120
- 120 return kmem_cache_alloc(cachep, flags);
- (gdb) bt
- #0 kmem_cache_alloc_notrace () at include/linux/slab_def.h:120
- #1 kmalloc () at include/linux/slab_def.h:155
- #2 my_init () at /home/sword/se/my_proc/my_dir.c:84
- #3 0xc0401033 in do_one_initcall (fn=0xe0b8d000 <my_init>) at init/main.c:721
- #4 0xc049fe01 in sys_init_module (umod=0x9002018, len=99891,
- uargs=0x9002008 "") at kernel/module.c:2595
- #5 0xc042d595 in system_call () at arch/x86/kernel/entry_32.S:529
- #6 0x00018633 in ?? ()
- #7 0x09002008 in ?? ()
- #8 0x00000000 in ?? ()
- (gdb)
- (gdb) help add-symbol-file
- Load symbols from FILE, assuming FILE has been dynamically loaded.
- Usage: add-symbol-file FILE ADDR [-s <SECT> <SECT_ADDR> -s <SECT> <SECT_ADDR> ...]
- ADDR is the starting address of the file's text.
- The optional arguments are section-name section-address pairs and
- should be specified if the data and bss segments are not contiguous
- with the text. SECT is a section name to be loaded at SECT_ADDR.
- (gdb)
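For a module that stays loaded, the same section addresses can usually be read from the guest instead of walking sect_attrs in gdb. The sketch below assumes the kernel exposes them as one small file per section under /sys/module/<name>/sections/, each holding a hex address; I believe this depends on the kernel being built with sysfs and kallsyms support, so treat it as an assumption. It does not help with an init function that returns -1, because the module is gone again before you can read anything. The function name is hypothetical, and the demo uses a temporary directory that only mimics the layout:

```python
import os
import tempfile


def read_section_addresses(sections_dir):
    """Map section name -> load address from files like <dir>/.text."""
    addresses = {}
    for name in sorted(os.listdir(sections_dir)):
        path = os.path.join(sections_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            # Each file is assumed to hold a single hex value such as 0xe0b8a000.
            addresses[name] = int(f.read().strip(), 16)
    return addresses


# Demo against a fake directory; on a real guest you would pass something
# like /sys/module/my_dir/sections (assumed path, usually needs root).
with tempfile.TemporaryDirectory() as demo_dir:
    for name, value in ((".text", "0xe0b8a000"), (".init.text", "0xe0b8d000")):
        with open(os.path.join(demo_dir, name), "w") as f:
            f.write(value + "\n")
    for name, addr in read_section_addresses(demo_dir).items():
        print(name, hex(addr))
```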