This article summarizes the key points of the debugging methods in Android, based on https://source.android.com/devices/tech/debug.
1. Core dump:
For crash dumps, remember:
The crash dumper can attach only if nothing else is already attached, which means that using tools such as strace or lldb prevents crash dumps from occurring.
A Python script is provided to facilitate parsing the dump log:
development/scripts/stack
After you run "lunch", you can also use it without the folder name: stack < FS/data/tombstones/tombstone_05
debuggerd:
You can use the debuggerd tool to get a stack dump from a running process. From the command line, invoke debuggerd using a process ID (PID) to dump a full tombstone to stdout. To get just the stack for every thread in the process, include the -b or --backtrace flag.
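As a minimal sketch of the invocation described above, the following Python helper assembles the adb command line for debuggerd. The function name build_debuggerd_cmd is hypothetical, and the sketch assumes debuggerd is run on the device through "adb shell"; it only builds the argument list and does not run anything.

```python
def build_debuggerd_cmd(pid, backtrace=False):
    """Return the adb argv that asks debuggerd to dump the process with this PID.

    build_debuggerd_cmd is an illustrative helper, not part of any Android tool.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    cmd = ["adb", "shell", "debuggerd"]
    if backtrace:
        # -b: only the stack of every thread, instead of a full tombstone
        cmd.append("-b")
    cmd.append(str(pid))
    return cmd


if __name__ == "__main__":
    print(" ".join(build_debuggerd_cmd(1234, backtrace=True)))
```

The resulting list can be passed to subprocess.run if you want to capture the dump from a script.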
2. Reading Bug Reports
Android bug reports contain dumpsys, dumpstate, and logcat data in text (.txt) format.
The logcat log is a string-based dump of all logcat information. The system part is reserved for the framework and has a longer history than main, which contains everything else. Each line typically starts with timestamp UID PID TID log-level
example:
logcat -v threadtime -d *:v
The basic format of the event log (for example, dumped by "logcat -b events -v threadtime -d *:v") is: timestamp PID TID log-level log-tag tag-values
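To make the line layout concrete, here is a rough Python sketch that splits one threadtime line into its fields. It assumes the variant without a UID column (timestamp PID TID log-level, then "tag: message"); the sample line is made up for illustration, and parse_threadtime is a hypothetical helper name.

```python
import re

# Assumed layout: "MM-DD HH:MM:SS.mmm  PID  TID L tag: message" (no UID column).
THREADTIME_RE = re.compile(
    r"^(?P<timestamp>\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+"
    r"(?P<level>[VDIWEFA])\s+"
    r"(?P<tag>.*?)\s*: (?P<message>.*)$"
)


def parse_threadtime(line):
    """Return a dict of fields for one threadtime line, or None if it does not match."""
    match = THREADTIME_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["tid"] = int(record["tid"])
    return record


if __name__ == "__main__":
    # Made-up sample line, not taken from a real device.
    sample = "01-02 03:04:05.678  1234  5678 I ExampleTag: hello world"
    print(parse_threadtime(sample))
```

Lines that carry a UID column, or any other format, return None here; the pattern would need an extra group to handle them.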
ANRs and deadlocks
https://developer.android.com/training/articles/perf-anr.html:
In Android, application responsiveness is monitored by the Activity Manager and Window Manager system services. Android will display the ANR dialog for a particular application when it detects one of the following conditions:
- No response to an input event (such as key press or screen touch events) within 5 seconds.
- A BroadcastReceiver hasn't finished executing within 10 seconds.
When an application does not respond within a certain time, usually due to a blocked or busy main thread, the system kills the process and dumps the stack to /data/anr. To discover the culprit behind an ANR, grep for am_anr in the binary event log.
You can also grep for "ANR in" in the logcat log, which contains more information about what was using CPU at the time of the ANR.
Finding deadlocks
Deadlocks often first appear as ANRs because threads are getting stuck. If the deadlock hits the system server, the watchdog will eventually kill it, leading to an entry in the log similar to: WATCHDOG KILLING SYSTEM PROCESS. From the user perspective, the device reboots, although technically this is a runtime restart rather than a true reboot.
- In a runtime restart, the system server dies and is restarted; the user sees the device return to the boot animation.
- In a reboot, the kernel has crashed; the user sees the device return to the Google boot logo.
From a bug report perspective, an activity is a single, focused thing a user can do, which makes locating the activity that was in focus during a crash very important. Activities (via ActivityManager) run processes, so locating all process stops and starts for a given activity can also aid troubleshooting.
To view a history of focused activities, search for am_focused_activity
To view a history of process starts, search for Start proc
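The searches mentioned in this section (am_anr, "ANR in", the watchdog message, am_focused_activity, and Start proc) can be run in one pass over a bug report. The following is only a sketch: scan_markers and the MARKERS table are hypothetical names, the matching is plain substring search, and the sample lines are invented placeholders rather than real log output.

```python
# Search strings named in the text above; the dictionary keys are illustrative labels.
MARKERS = {
    "anr_event": "am_anr",
    "anr_logcat": "ANR in",
    "watchdog": "WATCHDOG KILLING SYSTEM PROCESS",
    "focus": "am_focused_activity",
    "proc_start": "Start proc",
}


def scan_markers(lines):
    """Return {label: [1-based line numbers]} for every marker found in lines."""
    hits = {label: [] for label in MARKERS}
    for lineno, line in enumerate(lines, start=1):
        for label, needle in MARKERS.items():
            if needle in line:
                hits[label].append(lineno)
    return hits


if __name__ == "__main__":
    # Invented placeholder lines; real entries carry timestamps, PIDs, and payloads.
    sample = [
        "... am_focused_activity: <placeholder>",
        "... Start proc <placeholder>",
        "... am_anr: <placeholder>",
        "... unrelated line",
    ]
    for label, numbers in scan_markers(sample).items():
        print(label, numbers)
```

For a real bug report, pass an open file object instead of the sample list; the function only needs an iterable of strings.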
========================
Using Debuggers
GDB is deprecated and will be removed soon.
Debugging running apps or processes
To connect to a running app or native daemon, use gdbclient.py with a PID. For example, to debug the process with PID 1234, run:
gdbclient.py -p 1234
Debugging native process startup
To debug a process as it starts, use gdbclient.py with the -r option:
gdbclient.py -r /system/bin/MY_TEST_APP
For the "Dumping user & kernel stacks" section, the link is https://source.android.com/devices/tech/debug/native_stack_dump, but only the Chinese version works.
Debugging Native Memory Use
Address Sanitizer: HWASan/ASan
Android platform developers use HWAddressSanitizer (HWASan) to find memory bugs in C/C++.
You can flash prebuilt HWASan images to supported Pixel devices from ci.android.com (detailed setup instructions).
Since Android 8.0 (Oreo) it's also possible to use ASan to debug apps on non-rooted production devices. You can find instructions on the ASan wiki.
Heapprofd
Android 10 supports heapprofd, a low-overhead, sampling heap profiler. heapprofd lets you attribute native memory usage to callstacks in your program. See heapprofd - Android Heap Profiler on the Perfetto documentation site for more information.
Malloc debug
See Malloc Debug and Native Memory Tracking using libc Callbacks for a thorough description of the debugging options available for native memory issues.
libmemunreachable
Android's libmemunreachable is a zero-overhead native memory leak detector. It uses an imprecise mark-and-sweep garbage collector pass over all native memory, reporting any unreachable blocks as leaks. See the libmemunreachable documentation for usage instructions.
Malloc hooks
If you want to build your own tools, Android's libc also supports intercepting all allocation/free calls that happen during program execution. See the malloc_hooks documentation for usage instructions.
Malloc statistics
Android supports the mallinfo(3) and malloc_info(3) extensions to <malloc.h>. The malloc_info function is available in Android 6.0 (Marshmallow) and higher and its XML schema is documented in Bionic's <malloc.h>.
Dalvik Debug Monitor Server
You can also use the Dalvik Debug Monitor Server (DDMS) to obtain a graphical view of Malloc Debug output.
To use DDMS, first turn on its native memory UI:
- Open ~/.android/ddms.cfg
- Add the line: native=true
Upon relaunching DDMS and selecting a process, you can switch to the new native allocation tab and populate it with a list of allocations. This is especially useful for debugging memory leaks.
Rescue Party
Validation
All rescue events are suppressed when the device has an active USB data connection because that's a strong signal that someone is debugging the device.
To override this suppression, run:
adb shell setprop persist.sys.enable_rescue 1
From there, you can trigger a system or UI crash loop.
To trigger a low-level system_server crash loop, run:
adb shell setprop debug.crash_system 1
To trigger a mid-level SystemUI crash loop, run:
adb shell setprop debug.crash_sysui 1
Both crash loops initiate the rescue logic. All rescue operations are also logged to the persistent PackageManager logs stored at /data/system/uiderrors.txt for later inspection and debugging. These persistent logs are also included in every bug report under the "Package warning messages" section.
Getting I/O status from the kernel
To dump I/O usage from the kernel, use the storaged command with the -u option.
Command: storaged -u
Command output format: name/uid fg_rchar fg_wchar fg_rbytes fg_wbytes bg_rchar bg_wchar bg_rbytes bg_wbytes fg_fsync bg_fsync
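As an illustration of the column layout above, this Python sketch turns one line of "storaged -u" output into a dictionary. It assumes the columns are whitespace-separated, that the name/uid column contains no spaces, and that every counter is a decimal integer; these are assumptions, not guarantees from the documentation. parse_storaged_line is a hypothetical helper and the sample line is made up.

```python
# Counter columns, in the order given by the documented output format.
FIELDS = [
    "fg_rchar", "fg_wchar", "fg_rbytes", "fg_wbytes",
    "bg_rchar", "bg_wchar", "bg_rbytes", "bg_wbytes",
    "fg_fsync", "bg_fsync",
]


def parse_storaged_line(line):
    """Return {"name": ..., <counter>: int, ...} or None if the line does not fit."""
    parts = line.split()
    if len(parts) != len(FIELDS) + 1:
        return None
    try:
        values = [int(part) for part in parts[1:]]
    except ValueError:
        return None
    record = {"name": parts[0]}
    record.update(zip(FIELDS, values))
    return record


if __name__ == "__main__":
    # Made-up sample line for illustration only.
    sample = "com.example.app 100 200 4096 8192 0 0 0 0 3 0"
    print(parse_storaged_line(sample))
```

Summing fg_wbytes and bg_wbytes per record is one simple way to rank processes by bytes written.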
Using Strace
Building strace
To build strace, run the following:
mmma -j6 external/strace
Attaching to a running process
The simplest and most common use case for strace is to attach it to a running process, which you can do with:
adb shell strace -f -p PID
The -f flag tells strace to attach to all the threads in the process, plus any new threads spawned later.
Using on an application
To use strace on an application:
- Set up the device so that you can run strace. You need to be root, disable SELinux, and restart the runtime to remove the seccomp filter that will otherwise prevent strace from running:
adb root
adb shell setenforce 0
adb shell stop
adb shell start
- Set up a world-writable directory for strace logs, because strace will be running under the app's uid:
adb shell mkdir -m 777 /data/local/tmp/strace
- Choose the process to trace and launch it:
adb shell setprop wrap.com.android.calendar '"logwrapper strace -f -o /data/local/tmp/strace/strace.com.android.calendar.txt"'
- Launch the process normally.
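The wrap property step above has fiddly nested quoting, so a small generator can help when tracing a different package. This is a sketch under assumptions: build_wrap_command is a hypothetical helper, the log directory default mirrors the one created above, and the package-name check is a conservative guess meant only to keep shell metacharacters out of the command.

```python
import re


def build_wrap_command(package, log_dir="/data/local/tmp/strace"):
    """Return the 'adb shell setprop wrap.<package> ...' command as a string.

    build_wrap_command is an illustrative helper, not part of the Android tools.
    """
    # Conservative check (an assumption): letters, digits, dots, and underscores only.
    if not re.fullmatch(r"[A-Za-z0-9_.]+", package):
        raise ValueError("unexpected characters in package name")
    wrapper = "logwrapper strace -f -o " + log_dir + "/strace." + package + ".txt"
    # Single quotes protect the inner double quotes from the host shell,
    # matching the quoting used in the command shown above.
    return "adb shell setprop wrap." + package + " '\"" + wrapper + "\"'"


if __name__ == "__main__":
    print(build_wrap_command("com.android.calendar"))
```

For com.android.calendar this reproduces the command shown in the steps above; other package names only change the property name and the log file name.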
Using on the zygote
To use strace on the zygote, fix the relevant init.rc zygote line (requires adb shell setenforce 0):
cd system/core/
patch -p1 <<EOF
--- a/rootdir/init.zygote32.rc
+++ b/rootdir/init.zygote32.rc
@@ -1,4 +1,4 @@
-service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
+service zygote /system/bin/strace -o /data/local/tmp/zygote.strace /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
     class main
     socket zygote stream 660 root system
     onrestart write /sys/android_power/request_state wake
EOF
Getting strace logs during Android boot
To get strace logs during Android boot, make the following changes:
- Since the process name changes from zygote to strace, the given service may fail to start due to the missing SELinux file_context for strace. The solution is to add a new line for strace in system/sepolicy/private/file_contexts and copy the original file context over. Example:
/dev/socket/zygote u:object_r:zygote_socket:s0
+ /system/bin/strace u:object_r:zygote_socket:s0
- Add a kernel command-line option, then boot the device in SELinux permissive mode. You can do this by adding androidboot.selinux=permissive to BOARD_KERNEL_CMDLINE. (This variable becomes read-only in build/core/Makefile but is always available under /device/*/BoardConfig.)
Example for the Pixel (sailfish) device in /device/google/marlin/sailfish/BoardConfig.mk:
-BOARD_KERNEL_CMDLINE := .... androidboot.hardware=sailfish ...
+BOARD_KERNEL_CMDLINE := .... androidboot.hardware=sailfish ... androidboot.selinux=permissive
After making the change, build and flash the boot image and the device will boot in permissive mode.