Exploring lk_main.c with Valgrind, running Memcheck, and extracting the VEX intermediate language


咫尺or天涯


Tags: VEX, Valgrind, Linux, intermediate language

Article link: https://blog.csdn.net/qq_41124933/article/details/80432335


1. Read "Writing a New Valgrind Tool" (section 2.2.4)

http://www.valgrind.org/docs/manual/writing-tools.html#writing-tools.writingcode

2.2.4. Writing the code

A tool must define at least these four functions:

pre_clo_init(), post_clo_init(), instrument(), fini()

The names can be different to the above, but these are the usual names. The first one is registered using the macro VG_DETERMINE_INTERFACE_VERSION. The last three are registered using the VG_(basic_tool_funcs) function.

In addition, if a tool wants to use some of the optional services provided by the core, it may have to define other functions and tell the core about them.

2.2.5. Initialisation

Most of the initialisation should be done in pre_clo_init. Only use post_clo_init if a tool provides command line options and must do some initialisation after option processing takes place ("clo" stands for "command line options").

First of all, various "details" need to be set for a tool, using the functions VG_(details_*). Some are compulsory, some aren’t. Some are used when constructing the startup message, detail_bug_reports_to is used if VG_(tool_panic) is ever called, or a tool assertion fails. Others have other uses.

Second, various "needs" can be set for a tool, using the functions VG_(needs_*). They are mostly booleans, and can be left untouched (they default to False). They determine whether a tool can do various things such as: record, report and suppress errors; process command line options; wrap system calls; record extra information about heap blocks; etc.

For example, if a tool wants the core’s help in recording and reporting errors, it must call VG_(needs_tool_errors) and provide definitions of eight functions for comparing errors, printing out errors, reading suppressions from a suppressions file, etc. While writing these functions requires some work, it’s much less than doing error handling from scratch because the core is doing most of the work.

Third, the tool can indicate which events in core it wants to be notified about, using the functions VG_(track_*). These include things such as heap blocks being allocated, the stack pointer changing, a mutex being locked, etc. If a tool wants to know about this, it should provide a pointer to a function, which will be called when that event happens.

For example, if the tool wants to be notified when a new heap block is allocated, it should call VG_(track_new_mem_heap) with an appropriate function pointer, and the assigned function will be called each time this happens.

More information about "details", "needs" and "trackable events" can be found in include/pub_tool_tooliface.h.

2.2.6. Instrumentation

instrument is the interesting one. It allows you to instrument VEX IR, which is Valgrind’s RISC-like intermediate language. VEX IR is described in the comments of the header file VEX/pub/libvex_ir.h.

The easiest way to instrument VEX IR is to insert calls to C functions when interesting things happen. See the tool "Lackey" (lackey/lk_main.c) for a simple example of this, or Cachegrind (cachegrind/cg_main.c) for a more complex example.

2.2.7. Finalisation

This is where you can present the final results, such as a summary of the information collected. Any log files should be written out at this point.

2.2.8. Other Important Information

Please note that the core/tool split infrastructure is quite complex and not brilliantly documented. Here are some important points, but there are undoubtedly many others that I should note but haven’t thought of.

The files include/pub_tool_*.h contain all the types, macros, functions, etc. that a tool should (hopefully) need, and are the only .h files a tool should need to #include. They have a reasonable amount of documentation in them that should hopefully be enough to get you going.

Note that you can’t use anything from the C library (there are deep reasons for this, trust us). Valgrind provides an implementation of a reasonable subset of the C library, details of which are in pub_tool_libc_*.h.

When writing a tool, in theory you shouldn’t need to look at any of the code in Valgrind’s core, but in practice it might be useful sometimes to help understand something.

The include/pub_tool_basics.h and VEX/pub/libvex_basictypes.h files have some basic types that are widely used.

Ultimately, the tools distributed (Memcheck, Cachegrind, Lackey, etc.) are probably the best documentation of all, for the moment.

The VG_ macro is used heavily. This just prepends a longer string in front of names to avoid potential namespace clashes. It is defined in include/pub_tool_basics.h.

There are some assorted notes about various aspects of the implementation in docs/internals/. Much of it isn’t that relevant to tool-writers, however.

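2. Read the two papers

The VEX material starts on page 35.

It covers some basic syntax and briefly explains the difference between IRStmt and IRExpr.

These two are the data structures in which the translated code is stored: an IRStmt is a statement with a side effect, an IRExpr is a value used inside a statement.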

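3. Browse lk_main.c in the lackey folder in VS2015 (with the header files copied over)

This file has no main function. The last line of the file,

VG_DETERMINE_INTERFACE_VERSION(lk_pre_clo_init)

plays the role that main would normally play.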

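Right-click lk_pre_clo_init and choose "Go to Definition".

The first part of that function only registers descriptive details (name, version and so on) and can be skipped.

The part that matters for the VEX translation is the call to VG_(basic_tool_funcs).

Its three arguments are three functions: lk_post_clo_init, lk_instrument and lk_fini.

Each of them can be opened with "Go to Definition" as well.

The most important one is lk_instrument.

Its input parameter has type IRSB; jump to the definition of IRSB to have a look.

The paper mentioned above discusses it too.

An IRSB is a superblock: the container that holds a stretch of translated IR.

The sbOut variable defined inside the function holds the output of the current call.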

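Below that comes a switch statement.

Roughly speaking, the function walks over the statements of the incoming superblock, and the switch dispatches on the tag of each statement to decide how that statement is instrumented.

Related events can be merged along the way; for example, a write that follows a read of the same address and size can be recorded as a single "modify" event.

When the switch has run for every statement, there is some wrap-up code.

The if after it flushes the events that are still pending for the IRSB.

lk_main.c buffers events in a fixed-size array of Event entries (16 in the version examined here). Once every slot is in use the buffer is flushed: the pending events are emitted, and the slots are emptied and reused.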

Finally, the function returns sbOut.

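At that point sbOut contains all of the IR for the block.

The Lackey source does not print the IR itself, but API functions are provided that let you print it.

Look at the definitions of IRExpr and IRStmt: the comments there describe the IR syntax and name the printing functions.

For IRExpr the function is ppIRExpr and for IRStmt it is ppIRStmt; "pp" stands for "pretty print".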

4. Extracting VEX directly

IRStmt :== NoOp

| IMark of Addr64 * Int

| AbiHint of IRExpr * Int

| Put of Int * IRExpr

| PutI of IRArray * IRExpr * Int * IRExpr

| WrTmp of IRTemp * IRExpr

| Store of IREndness * IRExpr * IRExpr

| Exit of IRExpr * IRJumpKind * IRConst

| Dirty of IRDirty

| CAS of IRCAS

| MBE

IRExpr :== Binder of Int

| Get of Int * IRType

| GetI of IRArray * IRExpr * Int

| RdTmp of IRTemp

| Qop of IROp * IRExpr * IRExpr * IRExpr * IRExpr

| Triop of IROp * IRExpr * IRExpr * IRExpr

| Binop of IROp * IRExpr * IRExpr

| Unop of IROp * IRExpr

| Load of IREndness * IRType * IRExpr

| Const of IRConst

| Mux0X of IRExpr * IRExpr * IRExpr

| CCall of IRCallee * IRType * IRExprVec

IRExprVec :== IRExpr | IRExprVec

IREndness :== LittleEndian | BigEndian

IRArray :== Int * IRType * Int

IRTemp :== UInt

IRJumpKind :== Ijk_Boring | Ijk_Call | Ijk_Ret

| Ijk_ClientReq | Ijk_Yield

| Ijk_EmWarn | Ijk_NoDecode | Ijk_MapFail | Ijk_TInval

| Ijk_NoRedir | Ijk_Trap | Ijk_Sys_syscall | Ijk_Sys_int32

| Ijk_Sys_int128 | Ijk_Sys_sysenter

IRType :== Ity_INVALID | Ity_I1 | Ity_I8 | Ity_I16 | Ity_I32

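The listing above is a BNF description of the VEX intermediate code.

Example program:

gcc -g -Wall hello.c -o hello    # compile into an executable

valgrind --leak-check=full ./hello    # run it under Memcheck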

==4832== Memcheck, a memory error detector
==4832== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==4832== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==4832== Command: ./tmp
==4832==
==4832== Invalid write of size 4      // out-of-bounds write
==4832==    at 0x804843F: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832== Address 0x41a6050 is 0 bytes after a block of size 40 alloc'd
==4832==    at 0x4026864: malloc (vg_replace_malloc.c:236)
==4832==    by 0x8048435: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==
==4832== Source and destination overlap in memcpy(0x41a602c, 0x41a6028, 5) // overlapping copy
==4832==    at 0x4027BD6: memcpy (mc_replace_strmem.c:635)
==4832==    by 0x8048461: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==
==4832== Invalid free() / delete / delete[] // double free
==4832==    at 0x4025BF0: free (vg_replace_malloc.c:366)
==4832==    by 0x8048477: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832== Address 0x41a6028 is 0 bytes inside a block of size 40 free'd
==4832==    at 0x4025BF0: free (vg_replace_malloc.c:366)
==4832==    by 0x804846C: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==
==4832== Use of uninitialised value of size 4 // invalid pointer
==4832==    at 0x804847B: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==
==4832==
==4832== Process terminating with default action of signal 11 (SIGSEGV) // crash caused by writing through the invalid pointer
==4832== Bad permissions for mapped region at address 0x419FFF4
==4832==    at 0x804847B: test (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==    by 0x804848D: main (in /home/yanghao/Desktop/testC/testmem/tmp)
==4832==
==4832== HEAP SUMMARY:
==4832==     in use at exit: 0 bytes in 0 blocks
==4832==   total heap usage: 1 allocs, 2 frees, 40 bytes allocated
==4832==
==4832== All heap blocks were freed -- no leaks are possible
==4832==
==4832== For counts of detected and suppressed errors, rerun with: -v
==4832== Use --track-origins=yes to see where uninitialised values come from
==4832== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 11 from 6)
Segmentation fault

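Judging from valgrind's output (shown here for a test binary named tmp), every one of these errors was detected.

Memcheck classifies the errors it reports with the following tags: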

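typedef
enum {
    Err_Value,          // use of an uninitialised value
    Err_Cond,           // conditional jump depends on an uninitialised value
    Err_CoreMem,        // address contains uninitialised bytes
    Err_Addr,           // address is not readable or not writable
    Err_Jump,           // execution jumps to an invalid address
    Err_RegParam,       // syscall register parameter contains uninitialised bytes
    Err_MemParam,       // syscall memory parameter contains uninitialised or unaddressable bytes
    Err_User,
    Err_Free,           // free of an invalid block
    Err_FreeMismatch,   // new/new[]/malloc memory not released with the matching delete/delete[]/free
    Err_Overlap,        // src and dest overlap in strcpy/memcpy and similar functions
    Err_Leak,           // memory leak
    Err_IllegalMempool, // invalid memory pool address
    Err_FishyValue,     // suspicious argument value, e.g. a negative number passed as size_t
}
MC_ErrorTag;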

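Sample code

/* Deliberately buggy test program: each marked line is an error for Memcheck to find. */
#include <stdlib.h>
#include <string.h>   /* memcpy */

int main(){
    int a, b[1];                    /* both left uninitialised on purpose */
    char* c = malloc(4);
    char* d = malloc(4);
    a = ( a == 0 ? b[0] : b[1] );   /* branch on an uninitialised value; b[1] is out of bounds */
    memcpy(c, c+1, 2);              /* source and destination overlap */
    d[4] = 1;                       /* write one byte past the 4-byte block */
    free(d); free(d);               /* double free */
    return 0;                       /* c is never freed */
}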

# For convenience, redirect the output to test.txt

valgrind --trace-flags=10000000 ./hello > test.txt 2>&1

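Find the sequence numbers of the superblocks that contain main and note them down.

In this program those superblocks are numbered 1337 to 1375, so the following command saves just that range to test.txt: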

valgrind --trace-flags=10000000 --trace-notbelow=1337 --trace-notabove=1375 ./hello > test.txt 2>&1

0x4005DD: pushq %rbp

t0 = GET:I64(56)

t1 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t1

STle(t1) = t0

PUT(184) = 0x4005DE:I64

0x4005DE: movq %rsp,%rbp

PUT(56) = GET:I64(48)

PUT(184) = 0x4005E1:I64

0x4005E1: subq $32, %rsp

t4 = GET:I64(48)

t3 = 0x20:I64

t2 = Sub64(t4,t3)

PUT(144) = 0x8:I64

PUT(152) = t4

PUT(160) = t3

PUT(48) = t2

PUT(184) = 0x4005E5:I64

0x4005E5: movl $4,%edi

PUT(72) = 32Uto64(0x4:I32)

PUT(184) = 0x4005EA:I64

0x4005EA: call 0x4004E0

t5 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t5

STle(t5) = 0x4005EF:I64

t6 = 0x4004E0:I64

====== AbiHint(Sub64(t5,0x80:I64), 128, t6) ======

PUT(184) = 0x4004E0:I64

0x4004E0: jmp* 2100050(%rip)

t8 = Add64(0x4004E6:I64,0x200B52:I64)

t9 = LDle:I64(t8)

PUT(184) = t9

PUT(184) = GET:I64(184); exit-Boring

0x4005EF: movq %rax,-16(%rbp)

t0 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF0:I64)

STle(t0) = GET:I64(16)

PUT(184) = 0x4005F3:I64

0x4005F3: movl $4,%edi

PUT(72) = 32Uto64(0x4:I32)

PUT(184) = 0x4005F8:I64

0x4005F8: call 0x4004E0

t1 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t1

STle(t1) = 0x4005FD:I64

t2 = 0x4004E0:I64

====== AbiHint(Sub64(t1,0x80:I64), 128, t2) ======

PUT(184) = 0x4004E0:I64

0x4004E0: jmp* 2100050(%rip)

t4 = Add64(0x4004E6:I64,0x200B52:I64)

t5 = LDle:I64(t4)

PUT(184) = t5

PUT(184) = GET:I64(184); exit-Boring

0x4005FD: movq %rax,-8(%rbp)

t0 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF8:I64)

STle(t0) = GET:I64(16)

PUT(184) = 0x400601:I64

0x400601: cmpl $0, -20(%rbp)

t4 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFEC:I64)

t3 = LDle:I32(t4)

t2 = 0x0:I32

t1 = Sub32(t3,t2)

PUT(144) = 0x7:I64

PUT(152) = 32Uto64(t3)

PUT(160) = 32Uto64(t2)

PUT(184) = 0x400605:I64

0x400605: jne-8 0x40060C

if (64to1(amd64g_calculate_condition[mcx=0x13]{0x581243c0}(0x4:I64,GET:I64(144),GET:I64(152),GET:I64(160),GET:I64(168)):I64)) { PUT(184) = 0x400607:I64; exit-Boring }

PUT(184) = 0x40060C:I64

PUT(184) = GET:I64(184); exit-Boring

0x400607: movl -28(%rbp),%eax

t0 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFE4:I64)

PUT(16) = 32Uto64(LDle:I32(t0))

PUT(184) = 0x40060A:I64

0x40060A: jmp-8 0x40060F

PUT(184) = 0x40060F:I64

0x40060F: movl %eax,-20(%rbp)

t1 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFEC:I64)

STle(t1) = 64to32(GET:I64(16))

PUT(184) = 0x400612:I64

0x400612: movq -16(%rbp),%rax

t2 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF0:I64)

PUT(16) = LDle:I64(t2)

PUT(184) = 0x400616:I64

0x400616: leaq 1(%rax), %rcx

t3 = Add64(GET:I64(16),0x1:I64)

PUT(24) = t3

PUT(184) = 0x40061A:I64

0x40061A: movq -16(%rbp),%rax

t4 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF0:I64)

PUT(16) = LDle:I64(t4)

PUT(184) = 0x40061E:I64

0x40061E: movl $2,%edx

PUT(32) = 32Uto64(0x2:I32)

PUT(184) = 0x400623:I64

0x400623: movq %rcx,%rsi

PUT(64) = GET:I64(24)

PUT(184) = 0x400626:I64

0x400626: movq %rax,%rdi

PUT(72) = GET:I64(16)

PUT(184) = 0x400629:I64

0x400629: call 0x4004D0

t5 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t5

STle(t5) = 0x40062E:I64

t6 = 0x4004D0:I64

====== AbiHint(Sub64(t5,0x80:I64), 128, t6) ======

PUT(184) = 0x4004D0:I64

0x4004D0: jmp* 2100058(%rip)

t8 = Add64(0x4004D6:I64,0x200B5A:I64)

t9 = LDle:I64(t8)

PUT(184) = t9

PUT(184) = GET:I64(184); exit-Boring

0x40062E: movq -8(%rbp),%rax

t0 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF8:I64)

PUT(16) = LDle:I64(t0)

PUT(184) = 0x400632:I64

0x400632: addq $4, %rax

t3 = GET:I64(16)

t2 = 0x4:I64

t1 = Add64(t3,t2)

PUT(144) = 0x4:I64

PUT(152) = t3

PUT(160) = t2

PUT(16) = t1

PUT(184) = 0x400636:I64

0x400636: movb $1, (%rax)

t4 = GET:I64(16)

STle(t4) = 0x1:I8

PUT(184) = 0x400639:I64

0x400639: movq -8(%rbp),%rax

t5 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF8:I64)

PUT(16) = LDle:I64(t5)

PUT(184) = 0x40063D:I64

0x40063D: movq %rax,%rdi

PUT(72) = GET:I64(16)

PUT(184) = 0x400640:I64

0x400640: call 0x4004A0

t6 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t6

STle(t6) = 0x400645:I64

t7 = 0x4004A0:I64

====== AbiHint(Sub64(t6,0x80:I64), 128, t7) ======

PUT(184) = 0x4004A0:I64

0x4004A0: jmp* 2100082(%rip)

t9 = Add64(0x4004A6:I64,0x200B72:I64)

t10 = LDle:I64(t9)

PUT(184) = t10

PUT(184) = GET:I64(184); exit-Boring

0x400645: movq -8(%rbp),%rax

t0 = Add64(GET:I64(56),0xFFFFFFFFFFFFFFF8:I64)

PUT(16) = LDle:I64(t0)

PUT(184) = 0x400649:I64

0x400649: movq %rax,%rdi

PUT(72) = GET:I64(16)

PUT(184) = 0x40064C:I64

0x40064C: call 0x4004A0

t1 = Sub64(GET:I64(48),0x8:I64)

PUT(48) = t1

STle(t1) = 0x400651:I64

t2 = 0x4004A0:I64

====== AbiHint(Sub64(t1,0x80:I64), 128, t2) ======

PUT(184) = 0x4004A0:I64

0x4004A0: jmp* 2100082(%rip)

t4 = Add64(0x4004A6:I64,0x200B72:I64)

t5 = LDle:I64(t4)

PUT(184) = t5

PUT(184) = GET:I64(184); exit-Boring

0x400651: movl $0,%eax

PUT(16) = 32Uto64(0x0:I32)

PUT(184) = 0x400656:I64

0x400656: leave

t0 = GET:I64(56)

PUT(48) = t0

t1 = LDle:I64(t0)

PUT(56) = t1

PUT(48) = Add64(t0,0x8:I64)

PUT(184) = 0x400657:I64

0x400657: ret

t2 = GET:I64(48)

t3 = LDle:I64(t2)

t4 = Add64(t2,0x8:I64)

PUT(48) = t4

====== AbiHint(Sub64(t4,0x80:I64), 128, t3) ======

PUT(184) = t3

PUT(184) = GET:I64(184); exit-Return
