Using Intel Pin to Probe Functions and Alter the Original Code Path

Configure-Handler

Last modified: 2022-08-13 09:54:05

First published: 2022-06-29 08:33:28

Article link: https://blog.csdn.net/qq_42931917/article/details/125513299

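This article shows how to use Intel Pin for dynamic tracing of binary programs: downloading the kit, building and running the examples, counting executed instructions, writing a custom probe that captures a function's arguments and return value, and replacing a function at run time. It also looks at the limits of applying Pin to Go binaries and at problems that may come up while tracing, with possible solutions.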

1. Downloading the tool

Download page: Pin - A Binary Instrumentation Tool - Downloads (intel.com)

Pin user guide: Pin: Pin 3.23 User Guide (intel.com)

# Extract the archive; no installation step is needed
tar zxvf pin-3.23-98579-gb15ab7903-gcc-linux.tar.gz

2. Building the samples

Building the Example Tools

To build all examples in a directory for ia32 architecture:

$ cd source/tools/ManualExamples
$ make all TARGET=ia32

To build all examples in a directory for intel64 architecture:

$ cd source/tools/ManualExamples
$ make all TARGET=intel64

To build and run a specific example (e.g., inscount0):

$ cd source/tools/ManualExamples
$ make inscount0.test TARGET=intel64

To build a specific example without running it (e.g., inscount0):

$ cd source/tools/ManualExamples
$ make obj-intel64/inscount0.so TARGET=intel64

The above applies to the Intel® 64 architecture. For the IA-32 architecture, use TARGET=ia32 instead.

$ cd source/tools/ManualExamples
$ make obj-ia32/inscount0.so TARGET=ia32

3. Using the tool

Simple Instruction Count (Instruction Instrumentation)

The example below instruments a program to count the total number of instructions executed. It inserts a call to docount before every instruction. When the program exits, it saves the count in the file inscount.out.

Here is how to run it and display its output (note that the file list is the ls output, so it may be different on your machine, similarly the instruction count will depend on the implementation of ls):

$ ../../../pin -t obj-intel64/inscount0.so -- /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
# The count is written to the output file
$ cat inscount.out
Count 422838

The KNOB exhibited in the example below overwrites the default name for the output file. To use this feature, add "-o <file_name>" to the command line. Tool command line options should be inserted between the tool name and the double dash ("--"). For more information on how to add command line options to your tool, please see KNOB: Commandline Option Handling.

# Write the count to a file of your choice
$ ../../../pin -t obj-intel64/inscount0.so -o inscount0.log -- /bin/ls

4. Probing a user-defined function

Goal: capture the arguments and return value of a user-space C function.

#include <stdio.h>
#include <stdlib.h>

static int max(int num1, int num2)
{
    int result = 0;
    if (num1 > num2) {
        result = num1;
    } else {
        result = num2;
    }
    return result;
}

int main(void)
{
    int ret = 0;
    int a = 100, b = 200;
    
    ret = max(a, b);
    printf("最大值为 %d\n", ret);
}

The Pin tool:

#include "pin.H"
#include <iostream>
#include <fstream>
using std::cerr;
using std::endl;
using std::hex;
using std::ios;
using std::string;

/* ===================================================================== */
/* Name of the function to trace */
/* ===================================================================== */
#define FUNC_MAX "max"

/* ===================================================================== */
/* Global Variables */
/* ===================================================================== */

std::ofstream TraceFile;

/* ===================================================================== */
/* Commandline Switches */
/* ===================================================================== */

KNOB< string > KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "mypintool.out", "specify trace file name");

/* ===================================================================== */

/* ===================================================================== */
/* Analysis routines                                                     */
/* ===================================================================== */

VOID MaxBefore(CHAR* name, ADDRINT num1, ADDRINT num2) { TraceFile << name << "(" << num1 << " " << num2 << ")" << endl; }

VOID MaxAfter(ADDRINT ret) { TraceFile << "  returns " << ret << endl; }

/* ===================================================================== */
/* Instrumentation routines                                              */
/* ===================================================================== */

VOID Image(IMG img, VOID* v)
{
    //  Find the max() function.
    RTN maxRtn = RTN_FindByName(img, FUNC_MAX);
    if (RTN_Valid(maxRtn))
    {
        RTN_Open(maxRtn);

        // Instrument max() to print both input arguments and the return value.
        RTN_InsertCall(maxRtn, IPOINT_BEFORE, (AFUNPTR)MaxBefore, IARG_ADDRINT, FUNC_MAX, IARG_FUNCARG_ENTRYPOINT_VALUE, 0,
                       IARG_FUNCARG_ENTRYPOINT_VALUE, 1, IARG_END);
        RTN_InsertCall(maxRtn, IPOINT_AFTER, (AFUNPTR)MaxAfter, IARG_FUNCRET_EXITPOINT_VALUE, IARG_END);

        RTN_Close(maxRtn);
    }
}

/* ===================================================================== */

VOID Fini(INT32 code, VOID* v) { TraceFile.close(); }

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This tool produces a trace of calls to max." << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main                                                                  */
/* ===================================================================== */

int main(int argc, char* argv[])
{
    // Initialize pin & symbol manager
    PIN_InitSymbols();
    if (PIN_Init(argc, argv))
    {
        return Usage();
    }

    // Write to a file since cout and cerr maybe closed by the application
    TraceFile.open(KnobOutputFile.Value().c_str());
    TraceFile << hex;
    TraceFile.setf(ios::showbase);

    // Register Image to be called to instrument functions.
    IMG_AddInstrumentFunction(Image, 0);
    PIN_AddFiniFunction(Fini, 0);

    // Never returns
    PIN_StartProgram();

    return 0;
}

/* ===================================================================== */
/* eof */
/* ===================================================================== */

$ cd pin-3.23-98579-gb15ab7903-gcc-linux/source/tools/ManualExamples
# test.c and mypintool.cpp are assumed to be in this directory
$ gcc test.c -o max
$ make obj-intel64/mypintool.so TARGET=intel64
$ ../../../pin -t obj-intel64/mypintool.so -- ./max
$ cat mypintool.out
max(0x64 0xc8)
  returns 0xc8

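In this example Pin captured the arguments 100 and 200 (0x64 and 0xc8 in the hexadecimal trace) and the return value 200, the larger of the two. This is similar to what kprobes offer inside the kernel.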
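The trace shows 0x64 and 0xc8 rather than 100 and 200 because the tool switches its output stream to hexadecimal with a base prefix before writing. A minimal standalone sketch of that formatting follows; FormatCall is a hypothetical helper written for illustration and is not part of the tool.

```cpp
#include <cstdint>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper that mirrors the stream setup used by the tool:
// hexadecimal output with a "0x" prefix (std::hex plus showbase).
std::string FormatCall(const std::string& name, std::uint64_t num1, std::uint64_t num2)
{
    std::ostringstream out;
    out << std::hex;
    out.setf(std::ios::showbase);
    out << name << "(" << num1 << " " << num2 << ")";
    return out.str();
}
```

Calling FormatCall("max", 100, 200) yields the same text as the first trace line. Dropping the std::hex and showbase settings from the tool would make it print decimal values instead.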

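The example above launches the target program under Pin. In practice the process is often already running, and we would like to attach to it dynamically. The following briefly shows how to attach to a running process.
The test program can be changed to call max in an infinite loop, after which Pin is attached to the running process.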

#include <stdio.h>
#include <stdlib.h>

static int max(int num1, int num2)
{
    int result = 0;
    if (num1 > num2) {
        result = num1;
    } else {
        result = num2;
    }
    return result;
}

int main(void)
{
    int ret = 0;
    int a = 100, b = 200;
    while(1) {
    	ret = max(a, b);
    	printf("最大值为 %d\n", ret);
    }
}

# Replace <pid> with the process ID of the running max program
$ ../../../pin -pid <pid> -t obj-intel64/mypintool.so

5. Changing a function's execution flow

/*
 * Copyright (C) 2006-2021 Intel Corporation.
 * SPDX-License-Identifier: MIT
 */

//  Replace an original function with a custom function defined in the tool using
//  probes.  The replacement function has a different signature from that of the
//  original replaced function.

#include "pin.H"
#include <iostream>
using std::cerr;
using std::cout;
using std::dec;
using std::endl;
using std::flush;
using std::hex;

typedef INT32 (*FP_MAX)(INT32, INT32);

// This is the replacement routine.
//
INT32 NewMax(FP_MAX orgFuncptr, INT32 arg0, INT32 arg1, ADDRINT returnIp)
{
    // Normally one would do something more interesting with this data.
    //
    cout << "NewMax (" << hex << ADDRINT(orgFuncptr) << ", " << dec << arg0 << ", " << arg1 << ", " << hex << returnIp << ")" << endl << flush;

    // Call the relocated entry point of the original (replaced) routine.
    //
    INT32 v = orgFuncptr(arg0, arg1);

    return v;
}

// Pin calls this function every time a new img is loaded.
// It is best to do probe replacement when the image is loaded,
// because only one thread knows about the image at this time.
//
VOID ImageLoad(IMG img, VOID* v)
{
    // See if max() is present in the image.  If so, replace it.
    //
    RTN rtn = RTN_FindByName(img, "max");

    if (RTN_Valid(rtn))
    {
        if (RTN_IsSafeForProbedReplacement(rtn))
        {
            cout << "Replacing malloc in " << IMG_Name(img) << endl;

            // Define a function prototype that describes the application routine
            // that will be replaced: int max(int, int).
            //
            PROTO proto_max = PROTO_Allocate(PIN_PARG(int), CALLINGSTD_DEFAULT, "max", PIN_PARG(int), PIN_PARG(int), PIN_PARG_END());

            // Replace the application routine with the replacement function.
            // Additional arguments have been added to the replacement routine.
            //
            RTN_ReplaceSignatureProbed(rtn, AFUNPTR(NewMax), IARG_PROTOTYPE, proto_max, IARG_ORIG_FUNCPTR,
                                       IARG_FUNCARG_ENTRYPOINT_VALUE, 0, IARG_FUNCARG_ENTRYPOINT_VALUE, 1, IARG_RETURN_IP, IARG_END);

            // Free the function prototype.
            //
            PROTO_Free(proto_max);
        }
        else
        {
            cout << "Skip replacing malloc in " << IMG_Name(img) << " since it is not safe." << endl;
        }
    }
}

/* ===================================================================== */
/* Print Help Message                                                    */
/* ===================================================================== */

INT32 Usage()
{
    cerr << "This tool demonstrates how to replace an original" << endl;
    cerr << " function with a custom function defined in the tool " << endl;
    cerr << " using probes.  The replacement function has a different " << endl;
    cerr << " signature from that of the original replaced function." << endl;
    cerr << endl << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

/* ===================================================================== */
/* Main: Initialize and start Pin in Probe mode.                         */
/* ===================================================================== */

int main(INT32 argc, CHAR* argv[])
{
    // Initialize symbol processing
    //
    PIN_InitSymbols();

    // Initialize pin
    //
    if (PIN_Init(argc, argv)) return Usage();

    // Register ImageLoad to be called when an image is loaded
    //
    IMG_AddInstrumentFunction(ImageLoad, 0);

    // Start the program in probe mode, never returns
    //
    PIN_StartProgramProbed();

    return 0;
}

$ cd pin-3.23-98579-gb15ab7903-gcc-linux/source/tools/ManualExamples
$ ../../../pin -t obj-intel64/mypintool.so -- ./max
Replacing malloc in /home/curtis/pin-3.23-98579-gb15ab7903-gcc-linux/source/tools/ManualExamples/max
NewMax (7fb7f4c02000, 100, 200, 55a0c12d01a9)
最大值为 200

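The output shows that the user-defined function max was probed successfully and that its arguments are available to the replacement routine. The replacement can also return a custom value if needed.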
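As an illustration of returning a custom value, the sketch below calls the original function through the pointer it receives and then caps the result. It is a standalone simulation that does not use Pin: OriginalMax stands in for the application's max, and CappedMax and kLimit are hypothetical names. In the real tool the pointer would be supplied by Pin through IARG_ORIG_FUNCPTR.

```cpp
#include <cstdint>

// Same shape as FP_MAX in the tool, written with a standard integer type.
typedef std::int32_t (*FpMax)(std::int32_t, std::int32_t);

// Stand-in for the application's max(); in the real tool Pin passes the
// relocated entry point of the original routine instead.
std::int32_t OriginalMax(std::int32_t a, std::int32_t b) { return a > b ? a : b; }

// Hypothetical upper bound applied by the replacement.
const std::int32_t kLimit = 150;

// Replacement routine: run the original, then override the result when it
// exceeds kLimit.
std::int32_t CappedMax(FpMax orgFuncptr, std::int32_t arg0, std::int32_t arg1)
{
    std::int32_t v = orgFuncptr(arg0, arg1);
    return v > kLimit ? kLimit : v;
}
```

With the arguments from the test program, CappedMax(OriginalMax, 100, 200) returns 150 instead of 200, so the application would print the overridden value.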

6. Restoring the traced function's normal flow

To detach Pin from the application, call PIN_Detach. An example:

// This tool shows how to detach Pin from an
// application that is under Pin's control.
 
UINT64 icount = 0;
 
#define N 10000
VOID docount()
{
    icount++;
 
    // Release control of application if 10000
    // instructions have been executed
    if ((icount % N) == 0)
    {
        PIN_Detach();
    }
}

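As the example shows, PIN_Detach() has to be called explicitly from a routine in the tool; once it is called, the application returns to its normal execution flow.
In practice we would want to trigger the detach dynamically. Some possible approaches:

- Keep the flag in a global variable and set it from an environment variable.
- Keep the flag in a configuration file and read that file.
- Define a global flag variable, attach gdb to the process under test, and change the variable to trigger the detach.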
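The configuration-file approach could look roughly like the sketch below. It is a standalone helper that does not use Pin; the name DetachRequested and the file format (a first token of "1" means detach) are assumptions made for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true when the first token in the flag file
// is "1". A missing or unreadable file counts as "do not detach".
bool DetachRequested(const std::string& flagPath)
{
    std::ifstream in(flagPath.c_str());
    std::string token;
    return (in >> token) && token == "1";
}
```

Inside a tool, a routine such as docount above could call PIN_Detach() when DetachRequested returns true, ideally checking only every N calls so that the file is not reopened on each instruction. A file can be changed while the process runs, whereas an environment variable is fixed when the process starts, which makes the file the more practical of the first two options. For a tool started with PIN_StartProgramProbed, the Pin API provides PIN_DetachProbed for the same purpose; this combination has not been tested here.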

7. Tracing a binary compiled from Go

Since a compiled Go program is also an ELF file, does Pin work for Go as well? A small test case:

package main

import "fmt"

func max(num1 , num2 int) int {
        var result int = 0
        if num1 > num2 {
                result = num1
        } else {
                result = num2
        }
        return result
}

func main() {
        a , b := 100, 200
        var ret int = 0
        ret = max(a, b)
        fmt.Printf("max = %d\n", ret)
}

# Build the Go program; -gcflags=-l disables inlining
$ go build -gcflags=-l main.go
$ nm main | grep max
000000000047f5a0 T main.max

Reading the arguments the C/C++ way gives wrong values for the Go function, although the target function is still intercepted.

INT32 NewMax(FP_MAX orgFuncptr, INT32 arg0, INT32 arg1, ADDRINT returnIp)
{
    // Normally one would do something more interesting with this data.
    //
    cout << "NewMax (" << hex << ADDRINT(orgFuncptr) << ", " << dec << arg0 << ", " << arg1 << ", " << hex << returnIp << ")" << endl << flush;

    // Call the relocated entry point of the original (replaced) routine.
    //
    INT32 v = orgFuncptr(arg0, arg1);

    return v;
}

//NewMax (7f6abbda9000, 0, 1, 47f5e5)
//max = 2000

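Root cause: C/C++ code compiled with gcc normally passes arguments in registers. On AMD64 Linux (the System V ABI) the first six integer arguments go in rdi, rsi, rdx, rcx, r8 and r9; r10 takes the place of rcx only for system calls. Go does not follow this convention. With Go's stack-based calling convention, arguments are passed to the callee on the stack: they sit in the caller's stack frame with the first argument at the lowest address, and the callee reads them at fixed offsets from rsp. A tool therefore has to read the stack to obtain them. Note that on amd64, Go 1.17 and later pass arguments in registers, but in a different order that starts with rax and rbx, so rdi and rsi still do not hold the first two arguments.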
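For the stack-based case, reading an argument comes down to offset arithmetic on the stack pointer at function entry. The sketch below is a standalone illustration that does not use Pin; ReadStackArg is a hypothetical helper, and the layout it assumes (return address at sp, then 8-byte arguments in order) matches Go's stack-based convention on amd64 only.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the index-th 8-byte argument of a call that
// passes its arguments on the stack.
// Assumed layout at function entry (amd64):
//   [sp + 0]  return address
//   [sp + 8]  argument 0
//   [sp + 16] argument 1, and so on
std::uint64_t ReadStackArg(const void* sp, unsigned index)
{
    const unsigned char* base = static_cast<const unsigned char*>(sp);
    std::uint64_t value = 0;
    std::memcpy(&value, base + 8 * (index + 1), sizeof(value));
    return value;
}
```

For a simulated frame holding a return address followed by 100 and 200, ReadStackArg(frame, 0) gives 100 and ReadStackArg(frame, 1) gives 200. In a Pin tool the stack pointer could be passed to an analysis routine with IARG_REG_VALUE and REG_STACK_PTR, and the memory read with PIN_SafeCopy; that combination is an untested suggestion rather than something verified against the Go binary above.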