原标题:MySQL DBA如何利用strace/pstack/gdb来定位问题
strace简介
strace是Linux环境下的一款程序调试工具,用来监察一个应用程序所使用的系统调用。
Strace是一个简单的跟踪系统调用执行的工具。在其最简单的形式中,它可以从开始到结束跟踪二进制的执行,并在进程的生命周期中输出一行具有系统调用名称,每个系统调用的参数和返回值的文本行。
常用选项
-T:strace输出显示时间
-t:可以在每行的输出之前添加时间戳,-tt可以精确到微妙级别,-ttt也可以精确到微妙级,但是它并不是打印当前时间,而是显示自从epoch以来的所经过的秒数
-s:指定输出的字符串的最大长度,默认为32,如果输出到文件的话会全部输出
-o:指定将strace输出到文件
-ff:如果提供-o filename,则所有进程的跟踪结果输出到相应的filename.pid中,pid是各进程的进程号.
-e:指定跟踪某个行为,例如-e trace=open指定只跟踪open行为
-y:将文件句柄用文件路径代替显示
正确姿势
简单使用strace -T -tt -o /tmp/strace.
logCMD
strace -T -tt CMD 2>& 1|tee /tmp/strace. log
strace -T -tt -s 100-o /tmp/strace. logCMD
strace -T -tt -s 100-ff -o /tmp/strace. logCMD
strace -T -tt -s 100-e trace=XXXX -o /tmp/strace. logCMD
使用案例
利用strace观察客户端client执行SQL
#通过sys.processlist表中pid可以知道客户端连接pid
MySQL [sys]> selectthd_id,conn_id,user,pid,program_name,command,current_statement fromprocesslist whereconn_id> 0andpid> 0;
+--------+---------+------------------+------+--------------+---------+-------------------+
| thd_id | conn_id | user | pid | program_name | command | current_statement |
+--------+---------+------------------+------+--------------+---------+-------------------+
| 78| 22| xucl@ 172.17.0.11| 8656| mysql | Sleep | select* fromt1 |
+--------+---------+------------------+------+--------------+---------+-------------------+
1row inset( 0.05sec)
#session1
[root@VM_0_11_centos ~] # strace -T -tt -s 100 -o /tmp/strace.log -p 5841
strace: Process 5841attached
^Cstrace: Process 5841detached
#session进行数据查询
MySQL [(none)]> use xucl
Reading table information forcompletion oftable andcolumn names
You can turn offthis feature togeta quicker startup with-A
Database changed
MySQL [xucl]> select* fromt1;
+----+----+
| id | c1 |
+----+----+
| 1| |
+----+----+
1row inset( 0.00sec)
需要提醒一下的是从客户端连接到MySQL的时候需要注意客户端的版本,例如我用mariadb的客户端连接官方版本8.0就无法获取到pid
查看strace文件
....
20: 54: 16.901980poll([{fd= 3, events=POLLIN|POLLPRI}], 1, 0) = 0(Timeout) < 0.000006>
20: 54: 16.902001write( 3, "210003select * from t1", 21) = 21< 0.000011>
20: 54: 16.902026read( 3, "10012"0023def4xucl2t12t12id2idf?0v00033P000"0033def4xucl2t12t12c12c1f!036000375t@00050043760020300500110500"..., 16384) = 106< 0.000355>
20: 54: 16.902405times({tms_utime= 0, tms_stime= 0, tms_cutime= 0, tms_cstime= 0}) = 429630696< 0.000005>
20: 54: 16.902441write( 1, "+----+----+n", 12) = 12< 0.000026>
20: 54: 16.902482write( 1, "| id | c1 |n", 12) = 12< 0.000018>
20: 54: 16.902514write( 1, "+----+----+n", 12) = 12< 0.000018>
20: 54: 16.902549write( 1, "| 1 | |n", 12) = 12< 0.000019>
20: 54: 16.902582write( 1, "+----+----+n", 12) = 12< 0.000019>
20: 54: 16.902631write( 1, "33(B33[0;1m1 row in set (0.00 sec)n", 33) = 33< 0.000019>
20: 54: 16.902666write( 1, "33(B33[m33(B33[0;1mn", 16) = 16< 0.000019>
20: 54: 16.902702stat( "/etc/localtime", {st_mode=S_IFREG| 0644, st_size= 528, ...}) = 0< 0.000006>
20: 54: 16.902752rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0< 0.000004>
20: 54: 16.902771ioctl( 0, TIOCGWINSZ, {ws_row= 42, ws_col= 163, ws_xpixel= 0, ws_ypixel= 0}) = 0< 0.000005>
20: 54: 16.902789ioctl( 0, TIOCSWINSZ, {ws_row= 42, ws_col= 163, ws_xpixel= 0, ws_ypixel= 0}) = 0< 0.000004>
20: 54: 16.902806ioctl( 0, TCGETS, {B38400 opost isig icanon echo ...}) = 0< 0.000005>
20: 54: 16.902824ioctl( 0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig -icanon -echo ...}) = 0< 0.000006>
20: 54: 16.902841write( 1, "33(B33[m", 6) = 6< 0.000022>
20: 54: 16.902877rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0< 0.000005>
20: 54: 16.902895rt_sigprocmask(SIG_BLOCK, [INT QUIT ALRM TSTP TTIN TTOU], [], 8) = 0< 0.000005>
20: 54: 16.902912rt_sigaction(SIGINT, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, { 0x55e19fee21d0, [INT], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0< 0.000005>
20: 54: 16.902931rt_sigaction(SIGTERM, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0< 0.000005>
20: 54: 16.902948rt_sigaction(SIGQUIT, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, { 0x55e19fee1fa0, [QUIT], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0< 0.000005>
20: 54: 16.902966rt_sigaction(SIGALRM, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0< 0.000005>
20: 54: 16.902983rt_sigaction(SIGTSTP, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0< 0.000005>
20: 54: 16.903001rt_sigaction(SIGTTOU, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0< 0.000004>
20: 54: 16.903018rt_sigaction(SIGTTIN, { 0x55e19ff48a00, [], SA_RESTORER, 0x7fc3e63585f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc3e63585f0}, 8) = 0< 0.000006>
20: 54: 16.903037rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0< 0.000005>
20: 54: 16.903055rt_sigaction(SIGWINCH, { 0x55e19ff481a0, [], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, { 0x55e19fee0760, [WINCH], SA_RESTORER|SA_RESTART, 0x7fc3e63585f0}, 8) = 0< 0.000006>
20: 54: 16.903088write( 1, "MySQL [xucl]> ", 14) = 14< 0.000032>
20: 54: 16.903139read( 0,
利用strace观察server端执行#命令
strace -o /tmp/strace.log -T -tt -f -p `pidof mysqld`
#假设你想看跟IO相关的行为,可以用-e trace来指定
# mysql相关的IO操作:
# mysql: read, write, open
# innodb: pread64, pwrite64
strace -o /tmp/strace.log -T -tt -f -e trace= read, open, write,pwrite64,pread64 -p `pidof mysqld`
示例:
strace -o /tmp/strace.log -T -tt -ff -p `pidof mysqld`
#然后在本地通过socket登录执行命令
MySQL [performance_schema]> use xucl;
Reading table information forcompletion of table andcolumn names
You can turn off this feature to get a quicker startup with -A
Database changed
MySQL [xucl]> select * from t1;
+------+
| id |
+------+
| 1 |
| 2 |
| 4 |
| 3 |
| 5 |
| 6 |
+------+
6rows inset ( 0. 01sec)
#查看本线程对应的LWP
MySQL [xucl]> show processlist;
+----+------+-----------+------+---------+------+----------+------------------+
| Id |User | Host |db | Command |Time | State |Info |
+----+------+-----------+------+---------+------+----------+------------------+
| 28| root |localhost | xucl |Query | 0 |starting | show processlist |
+----+------+-----------+------+---------+------+----------+------------------+
1row inset ( 0. 00sec)
MySQL [xucl]> select * from performance_schema.threads where processlist_id= 28G
*************************** 1. row ***************************
THREAD_ID:62
NAME:thread/sql/one_connection
TYPE:FOREGROUND
PROCESSLIST_ID:28
PROCESSLIST_USER:root
PROCESSLIST_HOST:localhost
PROCESSLIST_DB:xucl
PROCESSLIST_COMMAND:Query
PROCESSLIST_TIME:0
PROCESSLIST_STATE:Sending data
PROCESSLIST_INFO:select * from performance_schema.threads where processlist_id= 28
PARENT_THREAD_ID:NULL
ROLE:NULL
INSTRUMENTED:YES
HISTORY:YES
CONNECTION_TYPE:Socket
THREAD_OS_ID:8208
1row inset ( 0. 00sec)
strace会在/tmp文件夹下生成strace.log.xxx,xxx对应的就是THREAD_OS_ID,我们直接看strace.log.8208文件
t@VM_0_9_centos tmp] # cat strace.log.8208
16: 44: 14.447709restart_syscall(< ...resuming interrupted restart_syscall ...>) = 1< 3.268331>
16: 44: 17.716076recvfrom( 134, "22000", 4, MSG_DONTWAIT, NULL, NULL) = 4< 0.000015>
16: 44: 17.716179recvfrom( 134, "3SELECT DATABASE", 18, MSG_DONTWAIT, NULL, NULL) = 18< 0.000017>
16: 44: 17.716318sendto( 134, "10011 0023def000nDATABASE0f!0f"..., 68, MSG_DONTWAIT, NULL, 0) = 68< 0.000033>
16: 44: 17.716420recvfrom( 134, "5000", 4, MSG_DONTWAIT, NULL, NULL) = 4< 0.000015>
16: 44: 17.716476recvfrom( 134, "2xucl", 5, MSG_DONTWAIT, NULL, NULL) = 5< 0.000017>
16: 44: 17.716538access( "./xucl", F_OK) = 0< 0.000016>
16: 44: 17.716604sendto( 134, "70010002000", 11, MSG_DONTWAIT, NULL, 0) = 11< 0.000090>
16: 44: 17.716738recvfrom( 134, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = - 1EAGAIN (Resource temporarily unavailable) < 0.000014>
16: 44: 17.716793poll([{fd= 134, events=POLLIN|POLLPRI}], 1, 600000) = 1([{fd= 134, revents=POLLIN}]) < 1.326139>
16: 44: 19.042985recvfrom( 134, "21000", 4, MSG_DONTWAIT, NULL, NULL) = 4< 0.000014>
16: 44: 19.043048recvfrom( 134, "3select * from t1", 17, MSG_DONTWAIT, NULL, NULL) = 17< 0.000017>
16: 44: 19.043256sendto( 134, "10011"0023def4xucl2t12t12id2idf?"..., 97, MSG_DONTWAIT, NULL, 0) = 97< 0.000137>
16: 44: 19.043448recvfrom( 134, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = - 1EAGAIN (Resource temporarily unavailable) < 0.000014>
16: 44: 19.043501poll([{fd= 134, events=POLLIN|POLLPRI}], 1, 600000) = 1([{fd= 134, revents=POLLIN}]) < 11.279983>
16: 44: 30.323556recvfrom( 134, "21000", 4, MSG_DONTWAIT, NULL, NULL) = 4< 0.000050>
16: 44: 30.323682recvfrom( 134, "3show processlist", 17, MSG_DONTWAIT, NULL, NULL) = 17< 0.000017>
16: 44: 30.323788sendto( 134, "100110300023def0002Id0f?02500010201000"..., 324, MSG_DONTWAIT, NULL, 0) = 324< 0.000221>
16: 44: 30.324066recvfrom( 134, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = - 1EAGAIN (Resource temporarily unavailable) < 0.000014>
16: 44: 30.324120poll([{fd= 134, events=POLLIN|POLLPRI}], 1, 600000) = 1([{fd= 134, revents=POLLIN}]) < 2.943989>
16: 44: 33.268173recvfrom( 134, "A000", 4, MSG_DONTWAIT, NULL, NULL) = 4< 0.000055>
16: 44: 33.268309recvfrom( 134, "3select * from performance_schem"..., 65, MSG_DONTWAIT, NULL, NULL) = 65< 0.000017>
16: 44: 33.268592sendto( 134, "100121H0023def22performance_schema"..., 1606, MSG_DONTWAIT, NULL, 0) = 1606< 0.000142>
16: 44: 33.268790recvfrom( 134, 0x7f06ec009fe0, 4, MSG_DONTWAIT, NULL, NULL) = - 1EAGAIN (Resource temporarily unavailable) < 0.000018>
16: 44: 33.268853poll([{fd= 134, events=POLLIN|POLLPRI}], 1, 600000
可以清楚地看到有对应的SQL输出
pstack简介
pstack用来跟踪进程栈,这个命令在排查进程问题时非常有用,比如我们发现一个服务一直处于work状态(如假死状态,好似死循环),使用这个命令就能轻松定位问题所在;
可以在一段时间内,多执行几次pstack,若发现代码栈总是停在同一个位置,那个位置就需要重点关注,很可能就是出问题的地方;
正确使用姿势
pstack使用起来非常简单,直接pstack -pid即可
[root@VM_0_9_centos mysql3306] # netstat -lntp|grep 3306
tcp6 00:::33060:::*LISTEN 13499/mysqld
tcp6 00:::3306:::*LISTEN 13499/mysqld
[root@VM_0_9_centos mysql3306] # ps -Lf 13499
UID PID PPID LWP C NLWP STIME TTY STAT TIME CMD
mysql 13499113161349905320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161350605320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161350705320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161350805320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161350905320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161351005320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
mysql 13499113161351105320:31pts/ 0Sl 0:00/usr/local/mysql/bin/mysqld --defaults-file= /data/mysql/mysql3306/my3306.cnf --user=mysql
....
mysqld的pid为13499,fork了几十个线程,有的是用户连接,有的是MySQL自己的线程,例如read_thread、write_thread等等,这里的LWP对应的就是performance_schema.threads表的THREAD_OS_ID。
全部抓取
假如发生了mysqld hang住的情况,可以直接pstack mysqldpid来抓取所有的线程堆栈,这也是通常使用的方式,例如
pstack 13499>>/tmp/pstack.log
这里简单随便看下抓取到的pstack
Thread 27(Thread 0x7ff1bc29f700 (LWP 13535)):
#00x00007ff1f6368da2 inpthread_cond_timedwait@@GLIBC_2. 3.2from /lib64/libpthread.so. 0
#10x00000000020ace25 inos_event::timed_wait(timespec const*) at ../../../mysql- 8.0. 19/storage/innobase/include/sync0types.h: 540
#20x00000000020ad9e1 inos_event::wait_time_low (this= 0x7ff1dc022458, time_in_usec=, reset_sig_count= 1) at ../../../mysql- 8.0. 19/storage/innobase/os/os0event.cc: 495
#30x000000000215fe0c insrv_error_monitor_thread at ../../../mysql- 8.0. 19/storage/innobase/srv/srv0srv.cc: 1835
#40x0000000002076da5 in__invoke_impl (__f=) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:89
#5 __invoke (__fn=) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:95
#6 __call (__args=..., this=) at /opt/rh/devtoolset-8/root/usr/include/c++/8/functional:400
#7 operator<> (this=) at /opt/rh/devtoolset-8/root/usr/include/c++/8/functional:484
#8 operator (f=@ 0x7ff1dc4aec18: 0x215fd20 , this= 0x7ff1dc4aec20) at ../../../mysql- 8.0. 19/storage/innobase/include/os0thread-create.h: 101
#9__invoke_impl (__f=...) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/invoke.h:60
#10 __invoke (__fn=...) at /opt/rh/devtoolset- 8/root/usr/include/c++/ 8/bits/invoke.h: 95
#11_M_invoke< 0, 1> (this= 0x7ff1dc4aec18) at /opt/rh/devtoolset- 8/root/usr/include/c++/ 8/thread: 244
#12operator(this= 0x7ff1dc4aec18) at /opt/rh/devtoolset- 8/root/usr/include/c++/ 8/thread: 253
#13std::thread::_State_impl<:thread::_invoker void> > >::_M_run (this=0x7ff1dc4aec10) at /opt/rh/devtoolset-8/root/usr/include/c++/8/thread:196
#14 0x000000000265d64f in execute_native_thread_routine
#15 0x00007ff1f6364e65 in start_thread from /lib64/libpthread.so.0
#16 0x00007ff1f44a188d in clone from /lib64/libc.so.6
从堆栈可以看出这其实就是MySQL的error monitor线程
单个抓取
假设你运气好,你现在可以通过performance_schema.threads可以查到对应的线程的thread_os_id,你可以指定单个线程的lwp进行抓取
MySQL [performance_schema]> show processlist;
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
| Id |User | Host |db | Command |Time | State |Info |
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
| 4| event_scheduler |localhost | NULL |Daemon | 4319 |Waiting on empty queue | NULL |
| 23 |root | localhost |performance_schema | Query |0| starting |show processlist |
| 24| root |localhost | xucl |Query | 22 |User sleep | select sleep(600) from t1 |
+----+-----------------+-----------+--------------------+---------+------+------------------------+---------------------------+
3rows inset ( 0. 00sec)
MySQL [performance_schema]> select THREAD_OS_ID from threads where processlist_id= 24G
*************************** 1. row ***************************
THREAD_OS_ID:15020
1row inset ( 0. 00sec)
你可以这样操作
pstack 15020> /tmp/single_pstack. log
查看pstack结果,只抓取了15020线程的堆栈
Thread 1(process 15020):
# 00x00007ff1f6368da2inpthread_cond_timedwait@@GLIBC_2.3.2from /lib64/libpthread.so .0
# 10x0000000001189b53innative_cond_timedwait (abstime= 0x7ff1b810d7f0, mutex= 0x3b96760, cond= 0x7ff1b810d8e0) at ../../mysql -8.0.19/include/thr_cond.h: 149
# 2my_cond_timedwait (abstime= 0x7ff1b810d7f0, mp= 0x3b96760, cond= 0x7ff1b810d8e0) at ../../mysql -8.0.19/include/thr_cond.h: 149
# 3inline_mysql_cond_timedwait (src_file= 0x2de5538"../../mysql-8.0.19/sql/item_func.cc", src_line= 4786, abstime= 0x7ff1b810d7f0, mutex= 0x3b96760, that= 0x7ff1b810d8e0) at ../../mysql -8.0.19/include/mysql/psi/mysql_cond.h: 217
# 4Interruptible_wait::wait ( this= this@entry= 0x7ff1b810d890, cond= cond@entry= 0x7ff1b810d8e0, mutex= mutex@entry= 0x3b96760) at ../../mysql -8.0.19/sql/item_func.cc: 4786
# 50x0000000001189e26inItem_func_sleep::val_int ( this=) at ../../mysql -8.0.19/sql/item_func.cc: 5435
# 60x00000000010ff567inItem::send ( this= 0x7ff1d45d9380, protocol= 0x7ff1d4392240, buffer=) at ../../mysql -8.0.19/sql/item.cc: 6638
# 70x0000000000e3d36binTHD::send_result_set_row(List*) at ../../mysql -8.0.19/sql/sql_class.cc: 2530
# 80x000000000127e8f1inQuery_result_send::send_data ( this=, thd= 0x7ff1d457b7b0, items=...) at ../../mysql -8.0.19/sql/query_result.cc: 93
# 90x0000000000f5d29ainSELECT_LEX_UNIT::ExecuteIteratorQuery(THD*) at ../../mysql -8.0.19/sql/sql_union.cc: 1530
# 100x0000000000f5f3f9inSELECT_LEX_UNIT::execute(THD*) at ../../mysql -8.0.19/sql/sql_union.cc: 1588
# 110x0000000000eedf8binSql_cmd_dml::execute_inner(THD*) at ../../mysql -8.0.19/sql/sql_select.cc: 910
# 120x0000000000ef7418inSql_cmd_dml::execute ( this= 0x7ff1d45d9a38, thd= 0x7ff1d457b7b0) at ../../mysql -8.0.19/sql/sql_select.cc: 715
# 130x0000000000eab3a9inmysql_execute_command(THD*, bool) at ../../mysql -8.0.19/sql/sql_parse.cc: 4478
# 140x0000000000ead0ccinmysql_parse (thd= thd@entry= 0x7ff1d457b7b0, parser_state= parser_state@entry= 0x7ff1b810f610) at ../../mysql -8.0.19/sql/sql_parse.cc: 5288
# 150x0000000000eaeb6bindispatch_command(THD*, COM_DATA const*, enum_server_command) at ../../mysql -8.0.19/sql/sql_parse.cc: 1777
# 160x0000000000eb0104indo_command (thd= thd@entry= 0x7ff1d457b7b0) at ../../mysql -8.0.19/sql/sql_parse.cc: 1275
# 170x0000000000fc1a08inhandle_connection (arg= arg@entry= 0x43cae90) at ../../mysql -8.0.19/sql/conn_handler/connection_handler_per_thread.cc: 302
# 180x00000000023ffdecinpfs_spawn_thread (arg= 0x44c2a30) at ../../../mysql -8.0.19/storage/perfschema/pfs.cc: 2854
# 190x00007ff1f6364e65instart_thread from /lib64/libpthread.so .0
# 200x00007ff1f44a188dinclone from /lib64/libc.so .6
gdb简介
GDB 是 linux 环境下的一般功能强大的调试器,用来调试 C 或 C++ 写的程序。它可以做这些事情
Start your program, specifying anything that might affect its behavior.
Make your program stop on specified conditions.
Examine what has happened, when your program has stopped.
Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another.
常用命令
info threads:查看全部线程
thread n:指定某个线程
b:在某处打断点
c:继续往下走
s:执行一行代码,如果代码函数调用,则进入函数
n:执行一行代码,函数调用不进入
p:打印某个变量值
list:打印代码的文本信息
bt:查看某个线程的栈帧
info b:查看当前所有断点信息
实际案例
进入gdb界面
[root@VM_0_9_centos ~] # gdb /usr/local/mysql5.7/bin/mysqld
进入调试
run --defaults-file= /data/mysql/mysql3308/my3308.cnf --user=mysql --gdb
通过info thread查看所有线程,这个时候我从另外一个会话连接到了MySQL,我从pfs.threads查寻到这个会话的LWP值为513,那么我可以通过thread 41切换到这个线程
(gdb) info thread
Id Target Id Frame
41Thread 0x7fff819cf700(LWP 513) "mysqld"0x00007ffff61abbedinpoll from/lib64/libc.so .6
40Thread 0x7fff5affd700(LWP 416) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
39Thread 0x7fff81a51700(LWP 415) "mysqld"0x00007ffff7bce381insigwait from/lib64/libpthread.so .0
38Thread 0x7fff5b7fe700(LWP 414) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
37Thread 0x7fff5bfff700(LWP 413) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
36Thread 0x7fff6cff9700(LWP 412) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
35Thread 0x7fff6d7fa700(LWP 411) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
34Thread 0x7fff6dffb700(LWP 410) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
33Thread 0x7fff6e7fc700(LWP 409) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
32Thread 0x7fff6effd700(LWP 408) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
31Thread 0x7fff6f7fe700(LWP 407) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
30Thread 0x7fff6ffff700(LWP 406) "mysqld"0x00007ffff7bcde5dinnanosleep from/lib64/libpthread.so .0
29Thread 0x7fff8084e700(LWP 405) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
28Thread 0x7fff8104f700(LWP 404) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
27Thread 0x7fff81850700(LWP 403) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
24Thread 0x7fff74c24700(LWP 398) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
23Thread 0x7fff75425700(LWP 397) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
22Thread 0x7fff75c26700(LWP 396) "mysqld"0x00007ffff7bca9f5inpthread_cond_wait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
21Thread 0x7fff76427700(LWP 395) "mysqld"0x00007ffff7bcada2inpthread_cond_timedwait@@GLIBC_2 .3.2from/lib64/libpthread.so .0
20Thread 0x7fff76c28700(LWP 394) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
19Thread 0x7fff77429700(LWP 393) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
18Thread 0x7fff77c2a700(LWP 392) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
17Thread 0x7fff7842b700(LWP 391) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
16Thread 0x7fff78c2c700(LWP 390) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
15Thread 0x7fff7942d700(LWP 389) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
14Thread 0x7fff79c2e700(LWP 388) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
13Thread 0x7fff7a42f700(LWP 387) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
12Thread 0x7fff7ac30700(LWP 386) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
11Thread 0x7fff7b431700(LWP 385) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
10Thread 0x7fff7bc32700(LWP 384) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
9Thread 0x7fff7c433700(LWP 383) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
8Thread 0x7fff7cc34700(LWP 382) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
7Thread 0x7fff7d435700(LWP 381) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
6Thread 0x7fff7dc36700(LWP 380) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
5Thread 0x7fff7e437700(LWP 379) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
4Thread 0x7fff7ec38700(LWP 378) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
3Thread 0x7fff7f439700(LWP 377) "mysqld"0x00007ffff79bd644in__io_getevents_0_4 from/lib64/libaio.so .1
2Thread 0x7fffec8c4700(LWP 376) "mysqld"0x00007ffff60ef53ainsigwaitinfo from/lib64/libc.so .6
* 1Thread 0x7ffff7fe4880(LWP 371) "mysqld"0x00007ffff61abbedinpoll from/lib64/libc.so .6
(gdb) thread 41
[Switching to thread 41(Thread 0x7fff819cf700(LWP 513))]
#0 0x00007ffff61abbed in poll from /lib64/libc.so.6
演示一下打断点:
(gdb) b do_command
Breakpoint 1at 0xc8d060: file /usr/local/src/mysql -5.7.30/sql/sql_parse.cc, line 903.
(gdb) c
Continuing.
(gdb) b do_command
Breakpoint 1at 0xc8d060: file /usr/local/src/mysql -5.7.30/sql/sql_parse.cc, line 903.
(gdb) n
Single stepping until exitfrom functionpoll,
whichhasnolinenumberinformation.
vio_io_wait(vio=vio@entry= 0x54fd330, event=event@entry=VIO_IO_EVENT_READ, timeout=)at/ usr/ local/ src/ mysql-5.7.30/ vio/ viosocket. c:815
815 MYSQL_END_SOCKET_WAIT(locker, 0);
(gdb) n
817}
(gdb) n
vio_socket_io_wait (vio=vio@entry= 0x54fd330, event=event@entry=VIO_IO_EVENT_READ) at /usr/local/src/mysql -5.7.30/vio/viosocket.c: 92
92ret= -1;
(gdb) n
105}
(gdb) n
vio_read (vio= 0x54fd330, buf= 0x54e3260"001", size= 4) at /usr/local/src/mysql -5.7.30/vio/viosocket.c: 134
134while((ret= mysql_socket_recv(vio->mysql_socket, (SOCKBUF_T *)buf, size, flags)) == -1)
(gdb) n
148}
(gdb) n
net_read_raw_loop (net=net@entry= 0x54841e8, count=count@entry= 4) at /usr/local/src/mysql -5.7.30/sql/net_serv.cc: 697
697if(recvcnt == VIO_SOCKET_ERROR)
(gdb) n
706elseif(!recvcnt)
(gdb) n
715thd_increment_bytes_received(recvcnt);
(gdb) n
712count-= recvcnt;
(gdb) n
713buf+= recvcnt;
(gdb) n
715thd_increment_bytes_received(recvcnt);
(gdb) n
692while(count)
(gdb) n
737}
(gdb) n
736returnMY_TEST(count);
(gdb) n
737}
(gdb) n
net_read_packet_header (net= 0x54841e8) at /usr/local/src/mysql -5.7.30/sql/net_serv.cc: 779
779server_extension->m_after_header(net, user_data, count, rc);
(gdb) n
778rc= net_read_raw_loop(net, count);
(gdb) n
779server_extension->m_after_header(net, user_data, count, rc);
(gdb) n
787if(rc)
(gdb) n
792pkt_nr= net->buff[net->where_b + 3];
(gdb) n
798if(pkt_nr != (uchar) net->pkt_nr)
(gdb) n
818net->pkt_nr++;
(gdb) n
net_read_packet (net= 0x54841e8, complen= 0x7fff819cec00) at /usr/local/src/mysql -5.7.30/sql/net_serv.cc: 850
850if(net->compress)
(gdb) n
844if(net_read_packet_header(net))
(gdb) n
847net->compress_pkt_nr= net->pkt_nr;
(gdb) n
850if(net->compress)
(gdb) n
868pkt_len= uint3korr(net->buff+net->where_b);
(gdb) n
871if(!pkt_len)
(gdb) n
868pkt_len= uint3korr(net->buff+net->where_b);
(gdb) n
871if(!pkt_len)
(gdb) p pkt_len
$ 1= 22
(gdb) n
874pkt_data_len = max(pkt_len, *complen) + net->where_b;
(gdb) n
877if((pkt_data_len >= net->max_packet) && net_realloc(net, pkt_data_len))
(gdb) n
881if(net_read_raw_loop(net, pkt_len))
(gdb) n
868pkt_len= uint3korr(net->buff+net->where_b);
(gdb) n
886returnpkt_len;
(gdb) n
891}
(gdb) n
my_net_read (net= 0x54841e8) at /usr/local/src/mysql -5.7.30/sql/net_serv.cc: 922
922if(len == MAX_PACKET_LENGTH)
(gdb) n
925ulong save_pos = net->where_b;
(gdb) n
922if(len == MAX_PACKET_LENGTH)
(gdb) n
937net->read_pos = net->buff + net->where_b;
(gdb) n
938if(len != packet_error)
(gdb) n
937net->read_pos = net->buff + net->where_b;
(gdb) n
938if(len != packet_error)
(gdb) n
939net->read_pos[len]= 0; /* Safeguard for mysql_use_result */
(gdb) n
1056MYSQL_NET_READ_DONE( 0, len);
(gdb) n
1058}
(gdb) n
Protocol_classic::read_packet (this= 0x5483a88) at /usr/local/src/mysql -5.7.30/sql/protocol_classic.cc: 820
820raw_packet= m_thd->net.read_pos;
(gdb) n
819bad_packet= false;
(gdb) n
820raw_packet= m_thd->net.read_pos;
(gdb) n
829}
(gdb) n
820raw_packet= m_thd->net.read_pos;
(gdb) n
829}
(gdb) n
Protocol_classic::get_command (this= 0x5483a88, com_data= 0x7fff819cecc0, cmd= 0x7fff819cecb0) at /usr/local/src/mysql -5.7.30/sql/protocol_classic.cc: 983
983if(packet_length == 0) /* safety */
(gdb) n
990raw_packet[packet_length]= '0'; /* safety */
(gdb) n
1002returnparse_packet(com_data, *cmd);
(gdb) n
990raw_packet[packet_length]= '0'; /* safety */
(gdb) n
992*cmd= (enum enum_server_command) raw_packet[ 0];
(gdb) n
994if(*cmd >= COM_END)
(gdb) n
992*cmd= (enum enum_server_command) raw_packet[ 0];
(gdb) n
1002returnparse_packet(com_data, *cmd);
(gdb) n
999packet_length--;
(gdb) n
1000raw_packet++;
(gdb) n
1002returnparse_packet(com_data, *cmd);
(gdb) n
1003}
(gdb) n
do_command (thd=thd@entry= 0x5482a30) at /usr/local/src/mysql -5.7.30/sql/sql_parse.cc: 974
974if(rc)
(gdb) s
971rc= thd->get_protocol->get_command(&com_data, &command);
(gdb) bt
#0 do_command (thd=thd@entry=0x5482a30) at /usr/local/src/mysql-5.7.30/sql/sql_parse.cc:971
#1 0x0000000000d4abf8 in handle_connection (arg=arg@entry=0x5418ff0) at /usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313
#2 0x000000000120a451 in pfs_spawn_thread (arg=0x5452410) at /usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197
#3 0x00007ffff7bc6e65 in start_thread from /lib64/libpthread.so.0
#4 0x00007ffff61b688d in clone from /lib64/libc.so.6
(gdb) s
972thd->m_server_idle= false;
(gdb) s
974if(rc)
(gdb) s
1027thd->variables.net_buffer_length);
(gdb) s
Protocol_classic::get_packet (this= 0x5483a88) at /usr/local/src/mysql -5.7.30/sql/protocol_classic.cc: 807
807{
(gdb) bt
#0 Protocol_classic::get_packet (this=0x5483a88) at /usr/local/src/mysql-5.7.30/sql/protocol_classic.cc:807
#1 0x0000000000c8d1b3 in do_command (thd=thd@entry=0x5482a30) at /usr/local/src/mysql-5.7.30/sql/sql_parse.cc:1027
#2 0x0000000000d4abf8 in handle_connection (arg=arg@entry=0x5418ff0) at /usr/local/src/mysql-5.7.30/sql/conn_handler/connection_handler_per_thread.cc:313
#3 0x000000000120a451 in pfs_spawn_thread (arg=0x5452410) at /usr/local/src/mysql-5.7.30/storage/perfschema/pfs.cc:2197
#4 0x00007ffff7bc6e65 in start_thread from /lib64/libpthread.so.0
#5 0x00007ffff61b688d in clone from /lib64/libc.so.6
我们可以在调试过程中随时打印某个变量的值,例如:
(gdb) p pkt_len
$1= 22
甚至gdb可以在紧急情况下救你一命,例如,当MySQL数据库连接打满又没有后台线程可以连接到MySQL的时候,你可以通过gdb来修改MySQL的连接数,例如
gdb -p $(pidof mysqld) -ex "set max_connections=1500"-batch
本文内容比较多,给大家介绍了几个MySQL DBA(可能其他岗位同样适用)应该掌握的必备工具
strace可以用来跟踪某个线程的调用情况,例如可以适用strace跟踪客户端SQL执行情况,如果开发说有很多慢SQL,而MySQL却没有任何记录,那么就可以用strace来跟踪把锅甩给开发(案例来自有赞王航威)
pstack是平时用的比较多的工具,尤其是诊断MySQL hang住的情况,例如主从延迟特别高等
gdb可能用的不是太多,但是我认为是MySQL DBA进阶必会的技能之一,尤其是抽丝剥茧某些疑难case的时候非常有用
参考文章
《深入主从复制》专栏第29节
责任编辑: