IBM utrace, ptrace, and uprobes paper

Ptrace, Utrace, Uprobes: Lightweight, Dynamic Tracing of User Apps

Jim Keniston, IBM
Ananth Mavinakayanahalli, IBM
Prasanna Panchamukhi, IBM
Vara Prasad, IBM

Abstract

The ptrace system-call API, though useful for many tools such as gdb and strace, generally proves unsatisfactory when tracing multithreaded or multi-process applications, especially in timing-dependent debugging scenarios. With the utrace kernel API, a kernel-side instrumentation module can track interesting events in traced processes. The uprobes kernel API exploits and extends utrace to provide kprobes-like, breakpoint-based probing of user applications.

We describe how utrace, uprobes, and kprobes together provide an instrumentation facility that overcomes some limitations of ptrace. For attendees, familiarity with a tracing API such as ptrace or kprobes will be helpful but not essential.

1 Introduction

For a long time now, debugging user-space applications has been dependent on the ptrace system call. Though ptrace has been very useful and will almost certainly continue to prove its worth, some of the requirements it imposes on its clients are considered limiting. One important limitation is performance, which is influenced by the high context-switch overheads inherent in the ptrace approach.

The utrace patchset [3] mitigates this to a large extent. Utrace provides in-kernel callbacks for the same sorts of events reported by ptrace. The utrace patchset reimplements ptrace as a client of utrace.

Uprobes is another utrace client. Analogous to kprobes for the Linux® kernel, uprobes provides a simple, easy-to-use API to dynamically instrument user applications.

Details of the design and implementation of uprobes form the major portion of this paper.

We start by discussing the current situation in the user-space tracing world. Sections 2 and 3 discuss the various instrumentation approaches possible and/or available. Section 4 goes on to discuss the goals that led to the current uprobes design, while Section 5 details the implementation. In the later sections, we put forth some of the challenges, especially with regard to modifying text and handling of multithreaded applications. Further, there is a brief discussion on how and where we envision this infrastructure can be put to use. We finally conclude with a discussion on where this work is headed.

2 Ptrace-based Application Tracing

Like many other flavors of UNIX, Linux provides the ptrace system-call interface for tracing a running process. This interface was designed mainly for debugging, but it has been used for tracing purposes as well. This section surveys some of the ptrace-based tracing tools and presents limitations of the ptrace approach for low-impact tracing.

Ptrace supports the following types of requests:

• Attach to, or detach from, the process being traced (the “tracee”).
• Read or write the process’s memory, saved registers, or user area.
• Continue execution of the process, possibly until a particular type of event occurs (e.g., a system call is called or returns).
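The three request types listed above can be illustrated with a short C sketch for Linux with glibc. It is not taken from any of the tools surveyed here. The names trace_child, struct trace_result, abandon, and patched_word are hypothetical, and the sketch assumes the tracee is a forked child that requests tracing with PTRACE_TRACEME rather than an already running process attached with PTRACE_ATTACH. The tracer reads and writes one word of the tracee's memory, then resumes the tracee with PTRACE_SYSCALL so that it stops again at each system-call entry or exit. Each stop is collected with waitpid().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Word that the tracer overwrites in the tracee's copy of the address space. */
static volatile long patched_word = 1;

/* Hypothetical result record for one tracing run. */
struct trace_result {
    long peeked;        /* value read from the tracee before patching */
    int  syscall_stops; /* syscall entry/exit stops seen via PTRACE_SYSCALL */
    int  exit_code;     /* exit status of the tracee after patching */
};

/* Kill and reap a tracee that we can no longer drive; always returns -1. */
static int abandon(pid_t child)
{
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
}

/* Fork a tracee, patch one word of its memory, then run it syscall by syscall. */
int trace_child(struct trace_result *res)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        /* Tracee: ask to be traced by the parent, then stop and wait for it. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);
        syscall(SYS_getpid);        /* one system call for the tracer to observe */
        _exit((int)patched_word);   /* report the (patched) word as exit status */
    }

    /* The tracee's stop is reported to the parent and collected here. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return abandon(child);

    /* Read one word; -1 is a legal value, so errno distinguishes real errors. */
    errno = 0;
    res->peeked = ptrace(PTRACE_PEEKDATA, child, (void *)&patched_word, NULL);
    if (res->peeked == -1 && errno != 0)
        return abandon(child);

    /* Write one word into the tracee. */
    if (ptrace(PTRACE_POKEDATA, child, (void *)&patched_word, (void *)42L) == -1)
        return abandon(child);

    /* Report syscall stops as SIGTRAP|0x80 so they are easy to tell apart. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return abandon(child);

    res->syscall_stops = 0;
    for (;;) {
        /* Continue until the next system-call entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) == -1)
            return abandon(child);
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status)) {
            res->exit_code = WEXITSTATUS(status);
            return 0;
        }
        if (WIFSIGNALED(status))
            return -1;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            res->syscall_stops++;
    }
}
```

Every stop in this sketch costs the tracer at least one ptrace() call and one waitpid() call. A caller would declare a struct trace_result, pass its address to trace_child(), and inspect the fields when the function returns 0.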

Events in the tracee turn into SIGCHLD signals that are delivered to the tracing process. The associated siginfo_t specifies the type of event.

2.1 Gdb

gdb is the most widely used application debugger in Linux, and it runs on other flavors of UNIX as well. gdb controls the program to be debugged using ptrace requests. gdb is used mostly as an interactive debugger, but also provides a batch option through which a series of gdb commands can be executed, without user intervention, each time a breakpoint is hit. This method of tracing has significant performance overhead. gdb’s approach to tracing multithreaded applications is to stop all threads whenever any thread hits a breakpoint.

2.2 Strace

The strace command provides the ability to trace calls and returns from all the system calls executed by the traced process. strace exploits ptrace’s PTRACE_SYSCALL request, which directs ptrace to continue execution of the tracee until the next entry or exit from a system call. strace handles multithreaded applications well, and it has significantly less performance overhead than the gdb scripting method; but performance is still the number-one complaint about strace. Using strace to trace itself shows that each system call in the tracee yields several system calls in strace.

2.3 Ltrace

The ltrace command is similar to strace, but it traces calls and returns from dynamic library functions. It can also trace system calls, and extern functions in the traced program itself. ltrace uses ptrace to place breakpoints at the entry point and return address of each probed function. ltrace is a useful tool, but it suffers from the performance limitations inherent in ptrace-based tools. It also appears not to work for multithreaded programs.

2.4 Ptrace Limitations

If gdb, strace, and ltrace don’t give you the type of information you’re looking for, you might consider writing your own ptrace-based tracing tool. But consider the following ptrace limitations first:

• Ptrace is not a POSIX system call. Its behavior varies from operating system to operating system, and has even varied from version to version in Linux. Vital operational details (“Why do I get two SIGCHLDs here? Am I supposed to pass the process a SIGCONT or no signal at all here?”) are not documented, and are not easily gleaned from the kernel source.
• The amount of perseverance and/or luck you need to get a working program goes up as you try to monitor more than one process or more than one thread.
• Overheads associated with accessing the tracee’s memory and registers are enormous, on the order of 10x to 100x or more, compared with equivalent in-kernel access. Ptrace’s PEEK-and-POKE interface provides very low bandwidth and incurs numerous context switches.
• In order to trace a process, the tracer must become the tracee’s parent. To attach to an already running process, then, the tracer must muck with the tracee’s lineage. Also, if you decide you want to apply more instrumentation to the same process, you have to detach the tracer already in place.
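The low bandwidth of the PEEK-and-POKE interface follows from the fact that, on Linux, one PTRACE_PEEKDATA request returns a single word. The following C sketch counts the requests a tracer would need to copy a buffer out of the tracee. The function name peek_requests_needed is hypothetical, and the sketch assumes a word is sizeof(long) bytes and that the tracer issues one request per word with no other transfer mechanism.

```c
#include <stddef.h>

/*
 * Number of PTRACE_PEEKDATA requests needed to copy nbytes out of a tracee,
 * assuming each request transfers exactly one word of sizeof(long) bytes.
 * A trailing partial word still costs one full request.
 */
size_t peek_requests_needed(size_t nbytes)
{
    const size_t word = sizeof(long);

    return nbytes / word + (nbytes % word != 0);
}
```

On a system where long is 8 bytes, copying one 4096-byte page this way takes 512 requests, each of which is a separate system call made by the tracer.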

3 Kernel-based Tracing

In the early days of Linux, the kernel code base was manageable and most people working on the kernel knew their core areas intimately. There was definite pushback from the kernel community against including any tracing and/or debugging features in the mainline kernel.

Over time, Linux became more popular and the number of kernel contributors increased. A need for a flexible tracing infrastructure was recognized. To that end, a number of projects sprang up and have achieved varied degrees of success.

We will look at a few of these projects in this section. Most of them are based on the kernel-module approach.

3.1 Kernel-module approach

The common thread among the following approaches is that the instrumentation code needs to run in kernel mode. Since we wouldn’t want to burden the kernel at times when the instrumentation isn’t in use, such code is introduced only when needed, in the form of a kernel module.

2007 Linux Symposium, Volume One • 217

3.1.1

• Event reporting: Utrace clients register callbacks to be run when the thread encounters specific events of interest. These include system call entry/exit, signals, exec, clone, exit, etc.
• Thread control: Utrace clients can inject signals, request that a thread be stopped fr
