Intel Pin Study Notes

无名氏a

Last modified: 2022-07-14 10:00:40

First published: 2020-07-23 08:39:39

Article link: https://blog.csdn.net/shanlijia/article/details/107524369

Pin
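What is Pin? As a first approximation, Pin can be thought of as a compiler. Unlike a traditional compiler, however, its input is an executable file rather than source code. Pin recompiles the executable according to our requirements and produces new executable code; it does so just in time, in memory, while the program runs, rather than writing a new file to disk. To let us express those requirements, Pin provides a set of APIs.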

Pintool
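To implement our requirements, we need to specify two things:

Where in the original executable to insert code (instrumentation)
What code to insert (analysis)
Together these make up a Pintool. In other words, a Pintool tells Pin, while Pin regenerates the executable code, to insert certain code at certain locations. zsim, which I am currently using, is one example of a Pintool.

Because Pin, the Pintool, and the executable all live in the same address space, the Pintool can operate on the entire executable.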
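
The split between instrumentation and analysis can be illustrated without Pin at all. The sketch below is a toy model in plain C++, not Pin code: ToyInstruction, ToyCounters, instrument, and runToyProgram are names made up for this illustration. It only aims to show that the instrumentation step runs once per instruction and decides where to insert a call, while the inserted analysis call runs every time that instruction executes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Toy model only: none of these types come from Pin.
struct ToyInstruction {
    std::string mnemonic;
    std::vector<std::function<void()>> beforeCalls;  // analysis calls inserted here
};

// Analysis state, updated each time an instrumented instruction executes.
struct ToyCounters {
    std::uint64_t memoryOps = 0;
};

// "Instrumentation": runs once per instruction and decides WHERE to insert code.
void instrument(ToyInstruction& ins, ToyCounters& counters) {
    if (ins.mnemonic == "load" || ins.mnemonic == "store") {
        // "Analysis": WHAT runs every time this instruction executes.
        ins.beforeCalls.push_back([&counters] { ++counters.memoryOps; });
    }
}

// Instruments the toy program once, then executes it `iterations` times.
// Returns how many times the inserted analysis call ran.
std::uint64_t runToyProgram(std::vector<ToyInstruction> program, int iterations) {
    assert(iterations >= 0);
    ToyCounters counters;
    for (ToyInstruction& ins : program) {
        instrument(ins, counters);  // once per instruction
    }
    for (int i = 0; i < iterations; ++i) {
        for (const ToyInstruction& ins : program) {
            for (const auto& call : ins.beforeCalls) {
                call();  // once per execution
            }
        }
    }
    return counters.memoryOps;
}
```

Under these assumptions, a three-instruction program (load, add, store) executed 10 times calls instrument 3 times, while the inserted counter runs 20 times.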

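Instrumentation granularity
1. Trace Instrumentation
API: TRACE_AddInstrumentFunction

First, the concepts of trace and BBL in Pin need to be understood. A trace begins at an entry point (a branch target) and ends at an unconditional branch, such as a call or a return. A trace has a single entry but may have several exits. If a branch target lies inside a trace, the code starting at that target forms a new trace. A BBL (basic block) is a single-entry, single-exit sequence of instructions. Likewise, a branch into the middle of a BBL starts a new BBL. Pin's BBL differs slightly from the basic block used in compilers; see the official Pin documentation for details. Note that in this mode the same instruction can appear in more than one trace or BBL.

Trace instrumentation takes place immediately before a trace or BBL is executed for the first time.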
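
The trace and BBL definitions above can be made concrete with a small model. This is only a sketch in plain C++ under simplifying assumptions: Kind and traceBblSizes are hypothetical names, every branch closes a BBL, only an unconditional branch closes the trace, and any other limit Pin may put on trace length is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; not Pin types.
enum class Kind { Plain, CondBranch, UncondBranch };

// Returns the sizes of the BBLs in the toy trace that starts at index `entry`.
std::vector<std::size_t> traceBblSizes(const std::vector<Kind>& code, std::size_t entry) {
    assert(entry < code.size());
    std::vector<std::size_t> sizes;
    std::size_t current = 0;  // instructions in the BBL being built
    for (std::size_t i = entry; i < code.size(); ++i) {
        ++current;
        if (code[i] == Kind::Plain) {
            continue;
        }
        sizes.push_back(current);  // any branch closes the current BBL
        current = 0;
        if (code[i] == Kind::UncondBranch) {
            break;  // an unconditional branch also closes the trace
        }
    }
    if (current > 0) {
        sizes.push_back(current);  // the code ended before another branch
    }
    return sizes;
}
```

For the sequence Plain, Plain, CondBranch, Plain, Plain, UncondBranch, the trace entered at index 0 has BBLs of 3 and 3 instructions. A branch that targets index 1 starts a new trace with BBLs of 2 and 3 instructions, so the instructions at indices 1 to 5 belong to both traces.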

2. Instruction Instrumentation

API: INS_AddInstrumentFunction

Instruments one instruction at a time.

3. Image Instrumentation and Routine Instrumentation

A Pintool can walk the sections of an image, the routines of a section, and the instructions of a routine.

Image instrumentation lets a Pintool instrument an entire image when it is first loaded. Because image instrumentation relies on symbol information to determine routine boundaries, PIN_InitSymbols must be called before PIN_Init.

Image instrumentation API: IMG_AddInstrumentFunction

Routine instrumentation lets a Pintool instrument an entire routine when the image containing it is first loaded. API: RTN_AddInstrumentFunction. It is a more convenient alternative to walking the sections and routines yourself during image instrumentation.

Symbols
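Pin exposes function names through the symbol object (SYM). SYM only provides information about function symbols; information about other kinds of symbols (for example data symbols) must be obtained separately by the tool.

On Linux, libelf.so or libdwarf.so can be used to obtain symbol information.

PIN_InitSymbols must be called before functions can be looked up by name.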

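Instrumenting multi-threaded applications and avoiding deadlock
For multi-threaded programs, Pin provides its own locking and thread-management APIs, and a Pintool must use only these APIs.

Pin provides callbacks at the start and end of each thread (PIN_AddThreadStartFunction and PIN_AddThreadFiniFunction). These callbacks make it easy for a Pintool to allocate and manipulate per-thread data and to keep that data in thread-local storage (TLS).

Pin also provides a Pin-specific thread ID, which is distinct from the operating system's thread ID. It can be used as an index into per-thread data or as the value used with Pin user locks.

To avoid deadlock, Pin requires locks to be acquired in a fixed order (assuming the application also uses locks). The order is:

application lock -> Pin internal lock -> Pintool lock

Before a Pintool acquires its own lock (for example through PIN_GetLock), Pin has usually already acquired its internal lock.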
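
The lock order above can be expressed as a small rank check. The following is a sketch in plain C++ that tracks the locks held by a single thread; LockRank and LockOrderChecker are hypothetical names rather than Pin APIs, and for simplicity each level is treated as one lock.

```cpp
#include <cassert>
#include <vector>

// Ranks mirror the order in the text; the names are illustrative, not Pin API.
enum class LockRank { Application = 0, PinInternal = 1, Pintool = 2 };

// Tracks the locks held by ONE thread and rejects out-of-order acquisition.
class LockOrderChecker {
public:
    // Returns true and records the lock if taking `rank` now respects the order.
    bool tryAcquire(LockRank rank) {
        if (!held_.empty() && rank <= held_.back()) {
            return false;  // would go against application -> Pin -> Pintool
        }
        held_.push_back(rank);
        return true;
    }

    // Releases the most recently acquired lock.
    void release() {
        assert(!held_.empty());
        held_.pop_back();
    }

private:
    std::vector<LockRank> held_;  // ranks currently held, in acquisition order
};
```

In this model a thread that already holds the Pintool lock is refused the Pin internal lock. If one thread takes the locks in the allowed order while another takes them in the reverse order, the two can deadlock, which is the situation the rule is meant to prevent.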

The multi-threading part of Pin will be covered in more detail later.

Note: everything here is based on Pin 3.2; details differ between Pin versions.

The following is based on the examples shipped in source/tools/ManualExamples.

Building the tools

Build all tool examples for the Intel64 architecture:
$ cd source/tools/ManualExamples
$ make all TARGET=intel64
Build one example and run it:
$ cd source/tools/ManualExamples
$ make inscount0.test TARGET=intel64
Build one example without running it:
$ cd source/tools/ManualExamples
$ make obj-intel64/inscount0.so TARGET=intel64

Example 1: counting instructions (instruction instrumentation)
$ ../../../pin -t obj-intel64/inscount0.so -- /bin/ls
Makefile atrace.o imageload.out itrace proccount
Makefile.example imageload inscount0 itrace.o proccount.o
atrace imageload.o inscount0.o itrace.out
$ cat inscount.out
Count 422838
$
The first command above runs /bin/ls under the Pintool inscount0, which instruments and analyzes it and writes the result to inscount.out. An option can be added to write the result to a chosen file, for example:
$ ../../../pin -t obj-intel64/inscount0.so -o inscount0.log -- /bin/ls
This writes the result to inscount0.log. The source code follows; it is located at source/tools/ManualExamples/inscount0.cpp
#include <iostream>
#include <fstream>
#include "pin.H"

// Explicit using-declarations: redundant if pin.H already exposes these
// std names, required if it does not.
using std::cerr;
using std::endl;
using std::ios;
using std::ofstream;
using std::string;

ofstream OutFile;

// The running count of instructions is kept here
// make it static to help the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every instruction is executed
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v)
{
    // Insert a call to docount before every instruction, no arguments are passed
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
    "o", "inscount.out", "specify output file name");

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << icount << endl;
    OutFile.close();
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This tool counts the number of dynamic instructions executed" << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */
/*   argc, argv are the entire command line: pin -t <toolname> -- ...    */
/* ===================================================================== */

int main(int argc, char * argv[])
{
    // Initialize pin
    if (PIN_Init(argc, argv)) return Usage();

    OutFile.open(KnobOutputFile.Value().c_str());

    // Register Instruction to be called to instrument instructions
    // This line is the key; Instruction is defined above
    INS_AddInstrumentFunction(Instruction, 0);

    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);

    // Start the program, never returns
    PIN_StartProgram();

    return 0;
}

Example 2: instruction address trace
In the example above, docount (the analysis part) takes no arguments. This example shows how to pass arguments to an analysis routine. Pin can pass many kinds of arguments (see IARG_TYPE for the full list), such as the instruction pointer, the current value of registers, the effective address of memory operations, constants, and so on. Run the example and inspect the output file itrace.out:
$ ../../../pin -t obj-intel64/itrace.so -- /bin/ls
Makefile atrace.o imageload.out itrace proccount
Makefile.example imageload inscount0 itrace.o proccount.o
atrace imageload.o inscount0.o itrace.out
$ head itrace.out
0x40001e90
0x40001e91
0x40001ee4
0x40001ee5
0x40001ee7
0x40001ee8
0x40001ee9
0x40001eea
0x40001ef0
0x40001ee0
$
Source file: source/tools/ManualExamples/itrace.cpp
#include <stdio.h>
#include "pin.H"

FILE * trace;

// This function is called before every instruction is executed
// and prints the IP
// The instruction address is passed in as an argument
VOID printip(VOID *ip) { fprintf(trace, "%p\n", ip); }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID *v)
{
    // Insert a call to printip before every instruction, and pass it the IP
    // docount from Example 1 is replaced by printip
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printip, IARG_INST_PTR, IARG_END);
}

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    fprintf(trace, "#eof\n");
    fclose(trace);
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    PIN_ERROR("This Pintool prints the IPs of every instruction executed\n"
              + KNOB_BASE::StringKnobSummary() + "\n");
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char * argv[])
{
    trace = fopen("itrace.out", "w");

    // Initialize pin
    if (PIN_Init(argc, argv)) return Usage();

    // Register Instruction to be called to instrument instructions
    INS_AddInstrumentFunction(Instruction, 0);

    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);

    // Start the program, never returns
    PIN_StartProgram();

    return 0;
}

Example 3: memory reference trace
The previous examples instrument every instruction, but sometimes only one class of instructions is of interest, for example memory operations. For this, Pin provides APIs to classify and examine instructions (a basic API common to all instruction sets, plus APIs specific to each instruction set). This example instruments instructions selectively by examining them: the tool traces the memory addresses referenced by the program and prints them. It also calls INS_InsertPredicatedCall instead of INS_InsertCall, so that no reference is recorded for a predicated instruction whose predicate is false. Usage:
$ ../../../pin -t obj-intel64/pinatrace.so -- /bin/ls
Makefile atrace.o imageload.o inscount0.o itrace.out
Makefile.example atrace.out imageload.out itrace proccount
atrace imageload inscount0 itrace.o proccount.o
$ head pinatrace.out
0x40001ee0: R 0xbfffe798
0x40001efd: W 0xbfffe7d4
0x40001f09: W 0xbfffe7d8
0x40001f20: W 0xbfffe864
0x40001f20: W 0xbfffe868
0x40001f20: W 0xbfffe86c
0x40001f20: W 0xbfffe870
0x40001f20: W 0xbfffe874
0x40001f20: W 0xbfffe878
0x40001f20: W 0xbfffe87c
$
Source code: source/tools/ManualExamples/pinatrace.cpp
/*
 * This file contains an ISA-portable PIN tool for tracing memory accesses.
 */

#include <stdio.h>
#include "pin.H"

FILE * trace;

// Print a memory read record
VOID RecordMemRead(VOID * ip, VOID * addr)
{
    fprintf(trace, "%p: R %p\n", ip, addr);
}

// Print a memory write record
VOID RecordMemWrite(VOID * ip, VOID * addr)
{
    fprintf(trace, "%p: W %p\n", ip, addr);
}

// Is called for every instruction and instruments reads and writes
VOID Instruction(INS ins, VOID *v)
{
    // Instruments memory accesses using a predicated call, i.e.
    // the instrumentation is called iff the instruction will actually be executed.
    //
    // On the IA-32 and Intel(R) 64 architectures conditional moves and REP
    // prefixed instructions appear as predicated instructions in Pin.
    UINT32 memOperands = INS_MemoryOperandCount(ins);

    // Iterate over each memory operand of the instruction.
    for (UINT32 memOp = 0; memOp < memOperands; memOp++)
    {
        if (INS_MemoryOperandIsRead(ins, memOp))
        {
            // Instrument only this memory operand and attach the
            // analysis routine RecordMemRead
            INS_InsertPredicatedCall(
                ins, IPOINT_BEFORE, (AFUNPTR)RecordMemRead,
                IARG_INST_PTR,
                IARG_MEMORYOP_EA, memOp,
                IARG_END);
        }
        // Note that in some architectures a single memory operand can be
        // both read and written (for instance incl (%eax) on IA-32)
        // In that case we instrument it once for read and once for write.
        if (INS_MemoryOperandIsWritten(ins, memOp))
        {
            INS_InsertPredicatedCall(
                ins, IPOINT_BEFORE, (AFUNPTR)RecordMemWrite,
                IARG_INST_PTR,
                IARG_MEMORYOP_EA, memOp,
                IARG_END);
        }
    }
}

VOID Fini(INT32 code, VOID *v)
{
    fprintf(trace, "#eof\n");
    fclose(trace);
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    PIN_ERROR("This Pintool prints a trace of memory addresses\n"
              + KNOB_BASE::StringKnobSummary() + "\n");
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char *argv[])
{
    if (PIN_Init(argc, argv)) return Usage();

    trace = fopen("pinatrace.out", "w");

    // Instruction is called for every instruction
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);

    // Never returns
    PIN_StartProgram();

    return 0;
}

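Example 4: more efficient instruction counting (trace instrumentation)

Example 1 counts executed instructions by inserting a call before every instruction. This example instead determines the number of instructions in each BBL at instrumentation time and inserts a single call per BBL, which adds that number to the counter each time the BBL executes. Source: source/tools/ManualExamples/inscount1.cpp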

#include <iostream>
#include <fstream>
#include "pin.H"

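// Explicit using-declarations: redundant if pin.H already exposes these
// std names, required if it does not.
using std::cerr;
using std::endl;
using std::ios;
using std::ofstream;
using std::string;

ofstream OutFile;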

// The running count of instructions is kept here
// make it static to help the compiler optimize docount
static UINT64 icount = 0;

// This function is called before every block
VOID docount(UINT32 c) { icount += c; }
    
// Pin calls this function every time a new basic block is encountered
// It inserts a call to docount
VOID Trace(TRACE trace, VOID *v)
{
    // Visit every basic block (BBL) in the trace
    for (BBL bbl = TRACE_BblHead(trace); BBL_Valid(bbl); bbl = BBL_Next(bbl))
    {
        // Insert a call to docount before every bbl, passing the number of instructions
        BBL_InsertCall(bbl, IPOINT_BEFORE, (AFUNPTR)docount, IARG_UINT32, BBL_NumIns(bbl), IARG_END);
    }
}

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
    "o", "inscount.out", "specify output file name");

// This function is called when the application exits
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << icount << endl;
    OutFile.close();
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This tool counts the number of dynamic instructions executed" << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char * argv[])
{
    // Initialize pin
    if (PIN_Init(argc, argv)) return Usage();

    OutFile.open(KnobOutputFile.Value().c_str());

    // Register Trace to be called to instrument traces
    // (this is the trace instrumentation API)
    TRACE_AddInstrumentFunction(Trace, 0);

    // Register Fini to be called when the application exits
    PIN_AddFiniFunction(Fini, 0);
    
    // Start the program, never returns
    PIN_StartProgram();
    
    return 0;
}
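
Why the per-BBL version is cheaper can be checked with a short model. The sketch below is plain C++ rather than a Pintool; CountResult, countPerInstruction, and countPerBbl are hypothetical names, and the input is simply the list of sizes of the BBLs that were executed. It compares how often the analysis routine would run under the strategy of Example 1 and under that of Example 4.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct CountResult {
    std::uint64_t instructions = 0;   // total instructions counted
    std::uint64_t analysisCalls = 0;  // how many times the counting routine ran
};

// Example 1 style: one analysis call before every executed instruction.
CountResult countPerInstruction(const std::vector<std::uint32_t>& executedBblSizes) {
    CountResult result;
    for (std::uint32_t size : executedBblSizes) {
        assert(size > 0);  // a BBL holds at least one instruction
        for (std::uint32_t i = 0; i < size; ++i) {
            result.instructions += 1;
            result.analysisCalls += 1;
        }
    }
    return result;
}

// Example 4 style: one analysis call per executed BBL, passing its size.
CountResult countPerBbl(const std::vector<std::uint32_t>& executedBblSizes) {
    CountResult result;
    for (std::uint32_t size : executedBblSizes) {
        assert(size > 0);
        result.instructions += size;
        result.analysisCalls += 1;
    }
    return result;
}
```

For executed BBLs of sizes 5, 3, 8, and 5, both functions count 21 instructions, but the per-instruction variant makes 21 analysis calls while the per-BBL variant makes only 4.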

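For examples of image instrumentation and routine instrumentation, see the official documentation.

Instrumentation order
Pin offers several ways for a Pintool to control the order in which analysis routines are called. The execution order depends mainly on the insertion action (IPOINT) and the call order (CALL_ORDER).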
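
As a rough mental model for a single insertion point, the ordering can be pictured as a stable sort. The sketch below is plain C++, not Pin code: ToyCall and executionOrder are hypothetical names, the integer callOrder only plays the role of CALL_ORDER, and the sketch assumes that calls with a smaller call order run first and that ties keep their insertion order.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for one inserted analysis call; not a Pin type.
struct ToyCall {
    std::string name;
    int callOrder;  // smaller runs earlier; plays the role of CALL_ORDER
};

// Returns the call names in execution order for ONE insertion point.
// `inserted` lists the calls in the order the tool inserted them.
std::vector<std::string> executionOrder(std::vector<ToyCall> inserted) {
    // stable_sort keeps insertion order among calls with equal callOrder.
    std::stable_sort(inserted.begin(), inserted.end(),
                     [](const ToyCall& a, const ToyCall& b) {
                         return a.callOrder < b.callOrder;
                     });
    std::vector<std::string> names;
    for (const ToyCall& call : inserted) {
        names.push_back(call.name);
    }
    assert(names.size() == inserted.size());
    return names;
}
```

With two calls inserted at the same call order, the one inserted first runs first; a call inserted later but given a smaller call order moves ahead of both.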

The example below demonstrates execution-order control by instrumenting every return instruction in three different ways. Running the example:
$ ../../../pin -t obj-ia32/invocation.so -- obj-ia32/little_malloc
$ head invocation.out
After: IP = 0x64bc5e
Before: IP = 0x64bc5e
Taken: IP = 0x63a12e
After: IP = 0x64bc5e
Before: IP = 0x64bc5e
Taken: IP = 0x641c76
After: IP = 0x641ca6
After: IP = 0x64bc5e
Before: IP = 0x64bc5e
Taken: IP = 0x648b02
Source: source/tools/ManualExamples/invocation.cpp
In the source, pay particular attention to RTN_InsertCall and INS_InsertCall and to their arguments IPOINT_AFTER, IPOINT_BEFORE, and IPOINT_TAKEN_BRANCH.
#include "pin.H"
#include <iostream>
#include <fstream>
using namespace std;

KNOB<string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool",
    "o", "invocation.out", "specify output file name");

ofstream OutFile;

/*
 * Analysis routines
 */
VOID Taken(const CONTEXT * ctxt)
{
    ADDRINT TakenIP = (ADDRINT)PIN_GetContextReg(ctxt, REG_INST_PTR);
    OutFile << "Taken: IP = " << hex << TakenIP << dec << endl;
}

VOID Before(CONTEXT * ctxt)
{
    ADDRINT BeforeIP = (ADDRINT)PIN_GetContextReg(ctxt, REG_INST_PTR);
    OutFile << "Before: IP = " << hex << BeforeIP << dec << endl;
}

VOID After(CONTEXT * ctxt)
{
    ADDRINT AfterIP = (ADDRINT)PIN_GetContextReg(ctxt, REG_INST_PTR);
    OutFile << "After: IP = " << hex << AfterIP << dec << endl;
}

/*
 * Instrumentation routines
 */
VOID ImageLoad(IMG img, VOID *v)
{
    for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec))
    {
        // RTN_InsertCall() and INS_InsertCall() are executed in order of
        // appearance. In the code sequence below, the IPOINT_AFTER is
        // executed before the IPOINT_BEFORE.
        for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn))
        {
            // Open the RTN.
            RTN_Open(rtn);

            // IPOINT_AFTER is implemented by instrumenting each return
            // instruction in a routine.  Pin tries to find all return
            // instructions, but success is not guaranteed.
            RTN_InsertCall(rtn, IPOINT_AFTER, (AFUNPTR)After,
                           IARG_CONTEXT, IARG_END);

            // Examine each instruction in the routine.
            for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
            {
                if (INS_IsRet(ins))
                {
                    // instrument each return instruction.
                    // IPOINT_TAKEN_BRANCH always occurs last.
                    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)Before,
                                   IARG_CONTEXT, IARG_END);
                    INS_InsertCall(ins, IPOINT_TAKEN_BRANCH, (AFUNPTR)Taken,
                                   IARG_CONTEXT, IARG_END);
                }
            }
            // Close the RTN.
            RTN_Close(rtn);
        }
    }
}

VOID Fini(INT32 code, VOID *v)
{
    OutFile.close();
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This is the invocation pintool" << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char * argv[])
{
    // Initialize pin & symbol manager
    if (PIN_Init(argc, argv)) return Usage();
    PIN_InitSymbols();

    // Register ImageLoad to be called to instrument instructions
    IMG_AddInstrumentFunction(ImageLoad, 0);
    PIN_AddFiniFunction(Fini, 0);

    // Write to a file since cout and cerr maybe closed by the application
    OutFile.open(KnobOutputFile.Value().c_str());
    OutFile.setf(ios::showbase);

    // Start the program, never returns
    PIN_StartProgram();

    return 0;
}
