Asterisk Debug

Latest recommended article published on 2024-02-23 14:29:30

nitweihong


Call Log

By default, Asterisk writes a complete call detail record (CDR) for every call to Master.csv in /var/log/asterisk/cdr-csv, for example:
> cd /var/log/asterisk/cdr-csv
> tail -f Master.csv
"","5032273698","9714042975","sip","""DORGAN M"" <5032273698>","SIP/147.135.0.129-08100358","","AGI","/usr/local/mipl/agnese|http://www.nextbus.com/nextbus3.mipl","2004-12-09 13:14:41","2004-12-09 13:14:44","2004-12-09 13:14:53",12,9,"ANSWERED","DOCUMENTATION"
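To make a record like the one above easier to read, it can be split into named fields. The following is a minimal sketch, not part of Asterisk: the column order in CDR_FIELDS is an assumption inferred from the sample record (16 columns), and parse_cdr_line is a hypothetical helper name. Check the order against your own Master.csv before relying on it.

```python
import csv
import io

# Assumed column order for cdr-csv records; verify against your Asterisk version.
CDR_FIELDS = [
    "accountcode", "src", "dst", "dcontext", "clid", "channel",
    "dstchannel", "lastapp", "lastdata", "start", "answer", "end",
    "duration", "billsec", "disposition", "amaflags",
]


def parse_cdr_line(line):
    """Parse one Master.csv line into a dict keyed by the assumed field names."""
    # csv.reader handles the doubled quotes inside the caller ID field.
    row = next(csv.reader(io.StringIO(line)))
    record = dict(zip(CDR_FIELDS, row))
    # duration and billsec are unquoted integers (seconds) in the CSV.
    record["duration"] = int(record["duration"])
    record["billsec"] = int(record["billsec"])
    return record


# The sample record from the text, joined into a single line.
SAMPLE = (
    '"","5032273698","9714042975","sip","""DORGAN M"" <5032273698>",'
    '"SIP/147.135.0.129-08100358","","AGI",'
    '"/usr/local/mipl/agnese|http://www.nextbus.com/nextbus3.mipl",'
    '"2004-12-09 13:14:41","2004-12-09 13:14:44","2004-12-09 13:14:53",'
    '12,9,"ANSWERED","DOCUMENTATION"'
)

record = parse_cdr_line(SAMPLE)
print(record["clid"], record["disposition"], record["billsec"])
```

Using the csv module rather than splitting on commas matters here, because the caller ID and the application data can themselves contain quotes and commas.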

Message Log

If you are having trouble catching intermittent problems on your system, consider adding more information to the Asterisk message log. In logger.conf you will see something like:
messages => notice,warning,error
Consider changing this to:
messages => notice,warning,error,debug,verbose
Do this only for short periods, because the extra output can really eat disk space. Note that you will need to restart Asterisk or type LOGGER ROTATE at the CLI for the change to take effect (a reload doesn't do it).

Backtracing a core dump file in /tmp

Start Asterisk with safe_asterisk.
Enter "gdb asterisk core.xxxx".
Enter "bt" while in gdb (or "bt full", which also prints local variables).
Enter "thread apply all bt" to get a backtrace of every thread.

Naturally, you'll need to have gdb installed on your system.

CONSOLE=no

Are you running safe_asterisk? If so, try editing the safe_asterisk script and changing CONSOLE=yes to CONSOLE=no.

Debugging a running Asterisk

List all the Asterisk threads with ps axum -C asterisk to find the thread that takes the most CPU. Then connect with gdb:

gdb /usr/sbin/asterisk pid

Run "bt" and post the last few lines to the mailing list.

ulimit

If Asterisk is crashing (that is, the process exits), run

ulimit -c unlimited

in the shell that starts Asterisk. This should allow Asterisk to drop a core file if it can.
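The core file size limit is a per-process setting that child processes inherit, so it helps to confirm what limit is actually in effect. The sketch below reads the limit of the current process with Python's standard resource module (Unix only). describe_core_limit is a hypothetical helper name, and the script reports the limit of the shell it is run from, not of an Asterisk process that is already running.

```python
import resource


def describe_core_limit(soft_limit):
    """Turn a soft RLIMIT_CORE value into a short human-readable status."""
    if soft_limit == 0:
        return "disabled"
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"limited to {soft_limit} bytes"


# The soft limit is the one the kernel enforces; the hard limit is its ceiling.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core dumps:", describe_core_limit(soft))
```

If this prints "disabled" in the shell that launches Asterisk, a crash will not leave a core file behind.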

HowTo Debug a DeadLock in Asterisk

1) In the Asterisk Makefile, uncomment the thread-debugging flags

# Optional debugging parameters

DEBUG_THREADS = #-DDEBUG_THREADS #-DDO_CRASH

NB: the DO_CRASH flag forces a core dump on certain conditions that indicate a possible deadlock and that would otherwise only generate a verbose warning message. You probably don't want to do this on a production box, but for testing it is a useful option because the core shows what was happening between the threads. Grep the sources for the CRASH macro to see under what conditions this might occur.

2) Turn on verbose logging
CVS as of Feb 1 2004 allows verbose messages to be logged: add verbose to the messages entry in logger.conf.

This writes ast_verbose messages to the logs, so that we can

see what the threads in the bt output were doing, in time sequence order
re-create the situation that led to the core dump or deadlock

3) When you deadlock, don't reboot the box or restart Asterisk
Instead, take the five minutes while everyone is freaking out to attach gdb to the running Asterisk process:
gdb /usr/sbin/asterisk <pid of main Asterisk process>
You can get the Asterisk process ID (PID) from asterisk -r, which reports something like "-> currently running on blah (pid =9075)". Note: if the box is truly hosed and blocked on all I/O, this will fail too; in that case use ps ax, or look for the lowest PID in the output of ps ax -C asterisk.

4) After gdb loads, run
info threads
thread apply all bt
At the very least, save that bt output to a file and post it to bugs.digium.com.
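Because the backtrace has to be captured quickly and saved to a file, the gdb session can be scripted. This is a rough sketch that assumes gdb is on the PATH and that the binary lives at /usr/sbin/asterisk as in the text; gdb_backtrace_command and save_backtrace are hypothetical helper names. It uses gdb's -batch and -ex options to run the two commands and exit.

```python
import shlex
import subprocess


def gdb_backtrace_command(pid, binary="/usr/sbin/asterisk"):
    """Build a gdb command line that dumps all thread backtraces and exits."""
    return [
        "gdb", "-batch",
        "-ex", "info threads",
        "-ex", "thread apply all bt",
        binary, str(pid),
    ]


def save_backtrace(pid, out_path):
    """Attach gdb to the live process and write everything it prints to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(gdb_backtrace_command(pid), stdout=out,
                       stderr=subprocess.STDOUT, check=False)


# Show the command that would be run for the example PID from the text.
print(shlex.join(gdb_backtrace_command(9075)))
```

Calling save_backtrace(9075, "asterisk-bt.txt") would then leave a file that can be attached to a bug report; the PID and file name here are only examples.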

5) Identify deadlocked threads by this pattern
Note the "__pthread_wait_for_restart_signal". It means the thread is in a wait loop, wanting the mutex lock.

Thread 23 (Thread 3576854 (LWP 2910)):

#0 0x400c787e in sigsuspend () from /lib/libc.so.6
#1 0x40022879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x40024a36 in __pthread_alt_lock () from /lib/libpthread.so.0
#3 0x40020fd2 in pthread_mutex_lock () from /lib/libpthread.so.0

Note that apparently not all systems implement "__pthread_wait_for_restart_signal", so you may want to scan for at least "pthread_mutex_lock". You will usually find more than one of these patterns, because once a thread is deadlocked on a mutex lock, other threads that want the same lock pile up quickly.
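Scanning a long "thread apply all bt" dump by eye is error-prone, so the pattern search can be automated. The sketch below lists every thread whose stack contains pthread_mutex_lock. It assumes thread headers of the form shown in this step; threads_waiting_on_mutex is a hypothetical helper, and the second thread in the sample is made up to show a thread that is not blocked.

```python
import re

# Header such as "Thread 23 (Thread 3576854 (LWP 2910)):"; the inner id may
# also be printed in hex on other systems.
THREAD_HEADER = re.compile(
    r"^Thread (\d+) \(Thread (0x[0-9a-fA-F]+|\d+) \(LWP (\d+)\)\):"
)


def threads_waiting_on_mutex(backtrace_text):
    """Return (gdb sequence number, thread id) for stacks containing pthread_mutex_lock."""
    waiting = []
    current = None
    for line in backtrace_text.splitlines():
        header = THREAD_HEADER.match(line.strip())
        if header:
            current = (int(header.group(1)), header.group(2))
        elif current and "pthread_mutex_lock" in line and current not in waiting:
            waiting.append(current)
    return waiting


# Thread 23 is taken from the text; thread 22 is an invented, non-blocked thread.
SAMPLE_BT = """\
Thread 23 (Thread 3576854 (LWP 2910)):
#0  0x400c787e in sigsuspend () from /lib/libc.so.6
#1  0x40022879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0x40024a36 in __pthread_alt_lock () from /lib/libpthread.so.0
#3  0x40020fd2 in pthread_mutex_lock () from /lib/libpthread.so.0

Thread 22 (Thread 3560469 (LWP 2909)):
#0  0x4015a1a4 in poll () from /lib/libc.so.6
"""

print(threads_waiting_on_mutex(SAMPLE_BT))
```

A thread appearing in this list is only waiting for a lock; that is not by itself proof of a deadlock, so the result is a shortlist for the next steps rather than a verdict.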

6) Try to identify the first thread that is deadlocked
The sequence number of the threads in the bt output is not relevant, because threads are re-used.

Look at the time stamps in your log files and try to correlate the THREAD number (e.g. "Thread 23 (Thread 3576854 (LWP 2910))") with the earliest entry in the log file that carries the same THREAD number (e.g. "VERBOSE[3576854]"). Note the FRAME number of the frame that called "pthread_mutex_lock()" (frame numbers are the #0, #1, #2 at the start of each line under the bt THREAD header).

Log files are usually in /var/log/asterisk/. Check the files messages and debug in this directory.
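The correlation between thread numbers and log entries can be sketched as a small filter over the log. The only thing taken from the text is the "VERBOSE[3576854]" tag shape; the rest of each sample log line (timestamp and message) is invented for illustration, and first_log_line_per_thread is a hypothetical helper name.

```python
import re

# Matches a bracketed decimal thread id such as the one in "VERBOSE[3576854]".
THREAD_TAG = re.compile(r"\[(\d+)\]")


def first_log_line_per_thread(log_lines, thread_ids):
    """Map each wanted thread id to the first log line tagged with "[<thread id>]"."""
    wanted = {str(t) for t in thread_ids}
    first_seen = {}
    for line in log_lines:
        for match in THREAD_TAG.finditer(line):
            tid = match.group(1)
            if tid in wanted and tid not in first_seen:
                first_seen[tid] = line.rstrip("\n")
    return first_seen


# Invented log lines; only the VERBOSE[<thread id>] tag follows the text.
SAMPLE_LOG = [
    "Feb  1 10:15:01 VERBOSE[3560469]: -- Executing Dial\n",
    "Feb  1 10:15:02 VERBOSE[3576854]: -- Executing AgentLogin\n",
    "Feb  1 10:15:03 VERBOSE[3576854]: -- Playing 'beep'\n",
]

for tid, line in first_log_line_per_thread(SAMPLE_LOG, ["3576854"]).items():
    print(tid, "->", line)
```

In practice the lines would come from opening /var/log/asterisk/messages, and the thread ids from the backtrace saved in step 4.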

7) Find the position
Now that we have our potential guilty party, the first in line for the lock, run
thread <sequence number> for the THREAD of interest
frame <frame number> for the frame that called pthread_mutex_lock ()
This should now put us in the Asterisk sources, right where ast_mutex_lock() is called. Record the name of the lock it was trying to get, e.g. ast_mutex_lock(&agentlock).

8) Check who has the lock
If thread debugging has been turned on properly, we can see into the ast_mutex_t struct in include/asterisk/lock.h, which looks like this

pthread_mutex_t mutex;
char *file;
int lineno;
char *func;
pthread_t thread;

Now that we have our lock, we can see who holds it and what we are waiting on. Run the following gdb commands

p somelockIjustFound->thread
p somelockIjustFound->file
p somelockIjustFound->func
p somelockIjustFound->lineno

This is the guilty code that is holding the lock we want, and it is the code to look at.

9) Now comes the hard part ...:)
Why is the code in that thread, file, function, and line number not releasing our lock? We now have to scour the code, checking every place where that lock is set and released, looking for