Author: 洪斌
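Perhaps the greatest advantage of MySQL is that you can learn its internals directly by stepping through the source in a debugger. No question or doubt can hide once you debug the source, and the exercise also deepens your understanding of the database kernel. Doesn't that feel like having everything under control?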
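A Mac has always been my main machine, and the few times I tried gdb on it the experience was poor. Two problems stood out: when gdb loads a mysqld built from source, "Reading symbols" takes a long time and the CPU spikes (lldb handles symbols more efficiently), and gdb cannot recognize the macOS system symbols. lldb has neither problem and is smooth to use.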
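LLDB aims to be a next-generation, high-performance debugger platform. Since Xcode 5, lldb has completely replaced gdb, and together with Xcode it offers very friendly visual debugging, whereas gdb has always lacked a good GUI front end.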
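It also has these features:
- High performance and efficient memory use
- Strong multithreaded debugging support
- Plugin architecture, scriptable and extensible with Python
- Solid support for C, C++, and Objective-C
- Multi-platform: Mac OS X, iOS, Linux, FreeBSD, Windows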
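gdb is like a seasoned elder, while lldb is like an energetic youngster keeping up with the times. The same could be said of the figures behind them: Richard Stallman, the soul of the free software movement, who started gdb, and Chris Lattner, who started LLVM (the project lldb belongs to) and later carried the banner for Apple's Swift. Both are legendary figures. Salute!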
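Building MySQL from source
Before debugging MySQL you need a mysqld binary with complete symbol information. Official release binaries are usually stripped and are not built with debugging enabled, so they lack the symbols needed to show source code while debugging. I therefore build from the MySQL source repository, where you can check out any branch and build any version, for example: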
$ git clone https://github.com/mysql/mysql-server.git
$ cd mysql-server
$ git checkout mysql-5.7.17
$ cd BUILD; cmake .. -DWITH_DEBUG=1 -DWITH_BOOST=/usr/local/Cellar/boost@1.59/1.59.0/ -DWITH_UNIT_TESTS=off
$ make
$ make install DESTDIR="/Users/hongbin/mysql"
$ git clean -df
On Linux, install these packages first:
$ sudo yum -y install gcc gcc-c++ gcc-g77 autoconf automake zlib* flex* libxml* ncurses-devel libmcrypt* libtool-ltdl-devel* make cmake readline-devel
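Debugging with LLDB
macOS comes with lldb; on Linux it has to be installed first.
First start lldb, load the binary built from source, and pass the MySQL configuration file as an argument to launch MySQL: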
(lldb) file /Users/hongbin/bin/mysqld
(lldb) process launch -- --defaults-file=/Users/hongbin/.my.cnf
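Setting good breakpoints is an essential debugging skill, and lldb offers several flexible ways to set them.
For example, set a breakpoint by function name; pressing Tab completes the function name: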
(lldb) br set -n do_comm
Available completions:
XA_prepare_log_event::do_commit(THD*)
Xid_log_event::do_commit(THD*)
do_command(THD*)
(lldb) br set -n do_command
Breakpoint 3: where = mysqld`do_command(THD*) + 15 at sql_parse.cc:874, address = 0x0000000100be53ef
Or set one by file name and line number:
(lldb) br s -f mysql
Available completions:
mysqld.cc
mysqld_thd_manager.cc
mysqld_daemon.cc
mysql_malloc_service.c
mysql_string_service.c
(lldb) br s -f mysqld.cc -l 6973
Breakpoint 5: where = mysqld`mysql_init_variables() + 47 at mysqld.cc:6973, address = 0x0000000100da789f
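The launch and breakpoint commands shown so far can be kept in a command file and replayed with `lldb -s <file>`, so a session does not have to be retyped every time. The Python sketch below only builds the text of such a file. The helper name `build_lldb_commands` is made up for illustration, and the paths and breakpoint locations are simply the ones used in this article; adjust them to your own build.

```python
def build_lldb_commands(binary, defaults_file, functions=(), file_lines=()):
    """Return the lines of an lldb command file (hypothetical helper)."""
    lines = [f"file {binary}"]
    # Breakpoints by function name, as with `br set -n`.
    for name in functions:
        lines.append(f"breakpoint set -n {name}")
    # Breakpoints by source file and line, as with `br s -f ... -l ...`.
    for source, line in file_lines:
        lines.append(f"breakpoint set -f {source} -l {line}")
    # Launch last so the breakpoints are already in place.
    lines.append(f"process launch -- --defaults-file={defaults_file}")
    return lines


commands = build_lldb_commands(
    "/Users/hongbin/bin/mysqld",
    "/Users/hongbin/.my.cnf",
    functions=["do_command"],
    file_lines=[("mysqld.cc", 6973)],
)
script_text = "\n".join(commands) + "\n"
print(script_text)
```

Writing `script_text` to a file such as `mysqld.lldb` (an arbitrary name) and starting `lldb -s mysqld.lldb` should reproduce the setup above.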
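Can't remember which breakpoints you have set? List them all. `pending` means that no location has been found for the breakpoint yet. (In gdb, `set breakpoint pending off` turns pending breakpoints off.)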
(lldb) br l
Current breakpoints:
3: name = 'do_command', locations = 1, resolved = 1, hit count = 1
3.1: where = mysqld`do_command(THD*) + 15 at sql_parse.cc:874, address = 0x0000000100be53ef, resolved, hit count = 1
4: file = 'select_lex_visitor.cc', line = 300, exact_match = 0, locations = 0 (pending)
5: file = 'mysqld.cc', line = 6973, exact_match = 0, locations = 1, resolved = 1, hit count = 0
5.1: where = mysqld`mysql_init_variables() + 47 at mysqld.cc:6973, address = 0x0000000100da789f, resolved, hit count = 0
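When the list gets long, the pending entries are easy to miss. As a rough sketch, the listing can be pasted into a small Python script that picks out the breakpoints with no resolved location. The function name `pending_breakpoints` is hypothetical, and the parsing assumes the output format shown above, which may differ between lldb versions.

```python
import re

# A shortened copy of the listing above.
BREAKPOINT_LIST = """\
Current breakpoints:
3: name = 'do_command', locations = 1, resolved = 1, hit count = 1
3.1: where = mysqld`do_command(THD*) + 15 at sql_parse.cc:874, address = 0x0000000100be53ef, resolved, hit count = 1
4: file = 'select_lex_visitor.cc', line = 300, exact_match = 0, locations = 0 (pending)
5: file = 'mysqld.cc', line = 6973, exact_match = 0, locations = 1, resolved = 1, hit count = 0
"""


def pending_breakpoints(text):
    """Return the ids of breakpoints that lldb reports as pending."""
    ids = []
    for line in text.splitlines():
        # Top-level entries look like "4: ..."; locations look like "3.1: ...".
        match = re.match(r"\s*(\d+): ", line)
        if match and "locations = 0 (pending)" in line:
            ids.append(int(match.group(1)))
    return ids


print(pending_breakpoints(BREAKPOINT_LIST))
```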
Want to delete a breakpoint? Just do it.
(lldb) br de 4
1 breakpoints deleted; 0 breakpoint locations disabled.
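When a breakpoint is hit, the process is suspended so that you can look around; lldb reports the current thread id, the frame id, and the stop reason.
Type c to resume the program and wait for the next breakpoint hit.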
(lldb) c
Process 34336 resuming
Process 34336 stopped
* thread #28, stop reason = breakpoint 3.1
frame #0: 0x0000000100be53ef mysqld`do_command(thd=0x00000001040efa00) at sql_parse.cc:874
871 bool return_value;
872 int rc;
873 const bool classic=
-> 874 (thd->get_protocol()->type() == Protocol::PROTOCOL_TEXT ||
875 thd->get_protocol()->type() == Protocol::PROTOCOL_BINARY);
876
877 NET *net= NULL;
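To step through the code from a breakpoint, type n (source-level step over), which is an alias for thread step-over; ni (instruction-level step over) is an alias for thread step-inst-over.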
(lldb) n
Process 34336 stopped
* thread #29, stop reason = step over
frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
877 NET *net= NULL;
878 enum enum_server_command command;
879 COM_DATA com_data;
-> 880 DBUG_ENTER("do_command");
881
882 /*
883 indicator of uninitialized lex => normal flow of errors handling
Knowing the current value of a variable is essential when debugging; type p to print whatever you want to inspect.
(lldb) p com_data
(COM_DATA) $8 = {
com_init_db = (db_name = , length = 4491639808)
com_refresh = (options = '\x06')
com_shutdown = (level = 6)
com_kill = (id = 72057594037927942)
com_set_option = (opt_command = 6)
com_stmt_execute = (stmt_id = 72057594037927942, flags = 4491639808, params = "\b\xffffffd8\x01\x01", params_length = 123145543216640)
com_stmt_fetch = (stmt_id = 72057594037927942, num_rows = 4491639808)
com_stmt_send_long_data = (stmt_id = 72057594037927942, param_number = 196672512, longdata = "\b\xffffffd8\x01\x01", length = 123145543216640)
com_stmt_prepare = (query = , length = 196672512)
com_stmt_close = (stmt_id = 6)
com_stmt_reset = (stmt_id = 6)
com_query = (query = , length = 196672512)
com_field_list = (table_name = , table_name_length = 196672512, query = "\b\xffffffd8\x01\x01", query_length = 240905728)
}
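The same few numbers (6, 72057594037927942, 4491639808) show up under several members in this dump. That is what one would expect if `COM_DATA` is a union whose members overlay the same bytes and the variable has not been filled in yet at line 880; this reading is an assumption drawn from the output, not something the session itself states. A small Python sketch of the effect, reading the same eight bytes at different widths in little-endian order:

```python
import struct

# The value lldb printed for com_kill.id, laid out as 8 little-endian bytes.
raw = struct.pack("<Q", 72057594037927942)

as_u64 = struct.unpack_from("<Q", raw)[0]  # read all 8 bytes
as_u32 = struct.unpack_from("<I", raw)[0]  # read only the first 4 bytes
as_u8 = raw[0]                             # read only the first byte

print(as_u64, as_u32, as_u8)
```

Read as 4 bytes or as 1 byte, the value is 6, which matches what the dump shows for `com_shutdown`, `com_stmt_close`, and `com_refresh`.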
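lldb also forgives small mistakes: here -> is used on a non-pointer, and a fix-it rewrites it to a dot. Thoughtful, isn't it?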
(lldb) p com_data->com_query
(COM_QUERY_DATA) $9 = (query = , length = 196672512)
Fix-it applied, fixed expression was:
com_data.com_query
(lldb) p com_data.com_query
(COM_QUERY_DATA) $10 = (query = , length = 196672512)
When debugging a multithreaded program it is important to see which threads exist, so list them all:
(lldb) th list
Process 34336 stopped
thread #1: tid = 0x152a4ff, 0x00007fff9318d19e libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread'
thread #2: tid = 0x152a516, 0x00007fff9318cd96 libsystem_kernel.dylib`kevent + 10
thread #3: tid = 0x152a518, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #4: tid = 0x152a519, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #5: tid = 0x152a51a, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #6: tid = 0x152a51b, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #7: tid = 0x152a51c, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #8: tid = 0x152a51d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #9: tid = 0x152a51e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #10: tid = 0x152a51f, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #11: tid = 0x152a520, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #12: tid = 0x152a521, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #13: tid = 0x152a522, 0x00007fff9318c47e libsystem_kernel.dylib`__write_nocancel + 10
thread #14: tid = 0x152a525, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #15: tid = 0x152a526, 0x00007fff9318bc22 libsystem_kernel.dylib`__psynch_mutexwait + 10
thread #16: tid = 0x152a527, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #17: tid = 0x152a528, 0x00007fff931eebf3 libsystem_malloc.dylib`default_zone_free_definite_size + 58
thread #18: tid = 0x152a529, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #19: tid = 0x152a52a, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #20: tid = 0x152a52b, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #21: tid = 0x152a52c, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #22: tid = 0x152a52d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #23: tid = 0x152a52e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #24: tid = 0x152a52f, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #25: tid = 0x152a530, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #26: tid = 0x152a53b, 0x00007fff9318c1fe libsystem_kernel.dylib`__sigwait + 10
thread #27: tid = 0x152a53d, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #28: tid = 0x152a62e, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
* thread #29: tid = 0x152b38e, 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880, stop reason = step over
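With dozens of threads, most of them parked in the same wait call, a quick tally by top frame makes the interesting ones stand out. The sketch below parses a shortened copy of the listing above. The regular expression assumes this particular output format, and `top_frames` is a hypothetical helper name.

```python
import re
from collections import Counter

# A few lines copied from the listing above.
THREAD_LIST = """\
thread #1: tid = 0x152a4ff, 0x00007fff9318d19e libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread'
thread #2: tid = 0x152a516, 0x00007fff9318cd96 libsystem_kernel.dylib`kevent + 10
thread #3: tid = 0x152a518, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
thread #4: tid = 0x152a519, 0x00007fff9318bbf2 libsystem_kernel.dylib`__psynch_cvwait + 10
* thread #29: tid = 0x152b38e, 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880, stop reason = step over
"""

# thread index, then the function name that follows "module`".
LINE = re.compile(r"thread #(\d+): tid = 0x[0-9a-f]+, 0x[0-9a-f]+ [^`]+`([^\s(]+)")


def top_frames(text):
    """Map thread index to the function name of its top frame."""
    frames = {}
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            frames[int(match.group(1))] = match.group(2)
    return frames


frames = top_frames(THREAD_LIST)
counts = Counter(frames.values())
for name, n in counts.most_common():
    print(f"{n:3d}  {name}")
```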
Now select the thread you want to inspect:
(lldb) th se 29
* thread #29, stop reason = step over
frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
877 NET *net= NULL;
878 enum enum_server_command command;
879 COM_DATA com_data;
-> 880 DBUG_ENTER("do_command");
881
882 /*
883 indicator of uninitialized lex => normal flow of errors handling
What is the call stack of the current thread? bt prints the backtrace, another command you will use all the time.
(lldb) bt
* thread #29, stop reason = step over
* frame #0: 0x0000000100be5461 mysqld`do_command(thd=0x000000010bb8fc00) at sql_parse.cc:880
frame #1: 0x0000000100d7f7e0 mysqld`::handle_connection(arg=0x000000010b73b980) at connection_handler_per_thread.cc:300
frame #2: 0x000000010155698c mysqld`::pfs_spawn_thread(arg=0x000000010b73bfe0) at pfs.cc:2188
frame #3: 0x00007fff9327693b libsystem_pthread.dylib`_pthread_body + 180
frame #4: 0x00007fff93276887 libsystem_pthread.dylib`_pthread_start + 286
frame #5: 0x00007fff9327608d libsystem_pthread.dylib`thread_start + 13
To see the call stacks of all threads:
(lldb) bt all
To look at a particular frame:
(lldb) fr s 1
frame #1: 0x0000000100d7f7e0 mysqld`::handle_connection(arg=0x000000010b73b980) at connection_handler_per_thread.cc:300
297 {
298 while (thd_connection_alive(thd))
299 {
-> 300 if (do_command(thd))
301 break;
302 }
303 end_connection(thd);
To see all arguments and local variables of the current frame:
(lldb) fr v
(void *) arg = 0x000000010b73b980
(Global_THD_manager *) thd_manager = 0x000000010b829200
(Connection_handler_manager *) handler_manager = 0x0000000103d000c0
(Channel_info *) channel_info = 0x000000011bbb2b70
(bool) pthread_reused = true
(THD *) thd = 0x000000010bb8fc00
(PSI_thread *) psi = 0x0000000108f1c180
To see the global variables of the current source file (so easy!):
(lldb) tar v
Global variables for /Users/hongbin/workbench/mysql-server/sql/conn_handler/connection_handler_per_thread.cc in /Users/hongbin/mysql/bin/mysqld:
(Error_log_throttle) create_thd_err_log_throttle = {
Log_throttle = (window_end = 0, window_size = 60000000, count = 0, summary_template = "Error log throttle: %10lu 'Can't create thread to handle new connection' error(s) suppressed")
log_summary = 0x00000001009efda0
}
(ulong) Per_thread_connection_handler::max_blocked_pthreads = 9
(mysql_mutex_t) Per_thread_connection_handler::LOCK_thread_cache = {
m_mutex = {
global = (__sig = 1297437786, __opaque = "")
mutex = (__sig = 1297437786, __opaque = "")
file = 0x0000000101bd31a1 "/Users/hongbin/workbench/mysql-server/sql/conn_handler/connection_handler_per_thread.cc"
line = 145
count = 0
thread = 0x0000000000000000
}
m_psi = 0x0000000108dcc600
}
To see the disassembled instructions of a function:
(lldb) di -n get_instance
mysqld`Global_THD_manager::get_instance:
0x100db3780 : pushq %rbp
0x100db3781 : movq %rsp, %rbp
0x100db3784 : leaq 0x10d195d(%rip), %rax ; Global_THD_manager::thd_manager
0x100db378b : cmpq $0x0, (%rax)
0x100db378f : setne %cl
0x100db3792 : xorb $-0x1, %cl
0x100db3795 : testb $0x1, %cl
0x100db3798 : jne 0x100db37a3 ; at mysqld_thd_manager.h:98
0x100db379e : jmp 0x100db37c2 ; at mysqld_thd_manager.h:98
0x100db37a3 : leaq 0xe298c4(%rip), %rdi ; "get_instance"
0x100db37aa : leaq 0xe42de5(%rip), %rsi ; "/Users/hongbin/workbench/mysql-server/sql/mysqld_thd_manager.h"
0x100db37b1 : movl $0x62, %edx
0x100db37b6 : leaq 0xe2a24f(%rip), %rcx ; "thd_manager != __null"
0x100db37bd : callq 0x1016cb0d6 ; symbol stub for: __assert_rtn
0x100db37c2 : jmp 0x100db37c7 ; at mysqld_thd_manager.h:98
0x100db37c7 : leaq 0x10d191a(%rip), %rax ; Global_THD_manager::thd_manager
0x100db37ce : movq (%rax), %rax
0x100db37d1 : popq %rbp
0x100db37d2 : retq
To dump all symbols of mysqld:
(lldb) image dump symtab mysqld
To find out what a given memory address refers to:
(lldb) image lookup -a 0x0000000100d86667
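To control each thread independently at breakpoints, turn on non-stop mode (this setting is not available in every lldb version):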
(lldb) settings set target.non-stop-mode true
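These basic commands are much the same as in gdb, and the whole experience is smooth and matches what users are used to. There is no built-in pager, though, so paging through long output is a bit painful. There is plenty more fun to dig into. Enjoy!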