[Linux] A Detailed Guide to the Valgrind Tool Suite

熠熠微光

Original link: https://blog.csdn.net/u010168781/category_6998350.html

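Contents

I. Introduction
II. Getting Started
III. Understanding the Output
IV. Suppressing Errors
V. Command-Line Options in Detail
VI. Debugging with the Valgrind gdbserver and GDB
VII. Memcheck (Memory Error Detector)
VIII. Memcheck Command-Line Options in Detail
IX. What Memcheck Checks and How
X. SGCheck (Stack and Global Array Overrun Detector)
XI. Massif (Heap Profiler)
XII. DHAT: Dynamic Heap Analysis Tool
XIII. Helgrind (Thread Error Detector)
XIV. Cachegrind (Cache and Branch-Prediction Profiler)
XV. Callgrind (Call-Graph Profiler)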

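I. Introduction

1. Valgrind overview
Valgrind is an instrumentation framework for building dynamic analysis tools. It ships with a set of tools, each of which performs some kind of debugging, profiling, or similar task that helps you improve your programs. Valgrind's architecture is modular, so new tools can be created easily without disturbing the existing structure.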

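2. Tool list
1) Memcheck is a memory error detector. It helps you make your programs, particularly those written in C and C++, more correct.

2) Cachegrind is a cache and branch-prediction profiler. It helps you make your programs run faster.

3) Callgrind is a call-graph generating cache profiler. It overlaps somewhat with Cachegrind, but also gathers information that Cachegrind does not.

4) Helgrind is a thread error detector. It helps you make your multithreaded programs more correct.

5) DRD is also a thread error detector. It is similar to Helgrind but uses different analysis techniques, so it may find different problems.

6) Massif is a heap profiler. It helps you make your programs use less memory.

7) DHAT is a different kind of heap profiler. It helps you understand issues of block lifetimes, block utilisation, and layout inefficiencies.

8) SGcheck is an experimental tool that can detect overruns of stack and global arrays. Its functionality is complementary to that of Memcheck: SGcheck finds problems that Memcheck cannot, and vice versa.

9) BBV is an experimental SimPoint basic block vector generator. It is useful to people doing computer architecture research and development.

There are also a couple of minor tools that are of little use to most users:
10) Lackey is an example tool that illustrates some instrumentation basics.

11) Nulgrind is the minimal Valgrind tool; it does no analysis or instrumentation and is useful only for testing.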

Original link: https://blog.csdn.net/u010168781/article/details/83546402

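II. Getting Started

1. Using valgrind
1) Installation
Installation is straightforward on Debian/Ubuntu systems:
sudo apt-get install valgrind

2) Usage
Run valgrind -h to see the full usage. The command format is:

valgrind [valgrind-options] your-program [your-program-options]

The most important option is --tool, which selects the Valgrind tool to run.
For example, to run the command "ls -l" under the memory checker Memcheck:

valgrind --tool=memcheck ls -l

Memcheck is the default, so the --tool option can be omitted if you want to use it:

valgrind ls -l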
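3) How it works
Whichever tool is used, Valgrind takes control of the program before it starts. It reads debugging information from the executable and its associated libraries so that error messages and other output can be expressed in terms of source code locations where appropriate.

The program then runs on a synthetic CPU provided by the Valgrind core. As new code is executed for the first time, the core hands the code to the selected tool. The tool adds its own instrumentation code and hands the result back to the core, which coordinates the continued execution of this instrumented code.

The amount of instrumentation added varies widely between tools. Memcheck adds code to check every memory access and every value computed, making the program run 10-50 times slower than natively. Nulgrind, the minimal tool, adds no instrumentation at all and still runs about 4 times slower than native.

Valgrind simulates every instruction the program executes. The active tool therefore checks or profiles not only the application's own code but also the code in all supporting dynamically linked libraries, including the C library, graphical libraries, and so on.

If you are using an error-detection tool, Valgrind may detect errors in system libraries, such as the GNU C or X11 libraries. You are probably not interested in these errors, since you have no control over that code. Valgrind therefore lets you selectively suppress errors by recording them in a suppressions file that is read when Valgrind starts (covered in its own section later). The Valgrind build mechanism selects default suppressions that give reasonable behaviour for the OS and libraries detected on your machine. To make suppressions easier to write, use the --gen-suppressions=yes option, which tells Valgrind to print a suppression for each reported error; you can then copy it into a suppressions file.

Different error-checking tools report different kinds of errors, so the suppression mechanism lets you state which tool or tools each suppression applies to.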

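4) A note on gcc options when compiling the program
Build with debugging information (gcc -g) so that error messages and profiling output refer to exact source lines.
For C++, consider treating inline functions as ordinary functions (gcc -fno-inline); this makes the function-call chain easier to see, which reduces confusion when navigating large C++ applications.
Avoid optimisation where possible. On rare occasions, optimised code (gcc -O2 and above, and sometimes gcc -O1) causes Memcheck to wrongly report uninitialised-value errors or to miss them.
Enable all warnings when compiling (gcc -Wall).

Original link: https://blog.csdn.net/u010168781/article/details/83546628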

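III. Understanding the Output

1. Message format
Valgrind's messages have the following format, which makes them easy to tell apart from the program's own output:

==process ID== message from Valgrind

2. Where the output goes
1) To a file descriptor
This is mainly used to print to the terminal. The default is 2 (stderr). To send the output to another file descriptor, for example 9, specify --log-fd=9.

2) To a file
Use the option --log-file=filename.
An empty filename causes an abort. The filename may contain three format specifiers:

%p is replaced with the current process ID. With --trace-children=yes and no %p in the name, the output of all processes goes to the same file, which is confusing and possibly incomplete, so it is best to include %p in the filename.
%q{FOO} is replaced with the contents of the environment variable FOO. If the {FOO} part is malformed, Valgrind aborts. This specifier is rarely needed, except in a few cases such as MPI-based programs (MPI is a library for parallel programming). If it is used, FOO must be set, otherwise Valgrind also aborts. In some shells the { and } characters may need to be escaped with a backslash.
%% is replaced with %. A % followed by any other character causes an abort.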
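3) To a network socket
Use the option --log-socket=IP:port.
The receiving end runs valgrind-listener, which can accept simultaneous connections from up to 50 Valgrinded processes. In front of each line of output it prints the current number of active connections in round brackets.
valgrind-listener [--exit-at-zero|-e] [port-number]
valgrind-listener accepts three command-line options:

-e --exit-at-zero
Exit when the number of connected processes falls back to zero. Without this option it runs forever and has to be stopped with Ctrl+C.
--max-connect=INTEGER
By default the listener accepts up to 50 connected processes. Occasionally that number is too small; use this option to set a different limit, for example --max-connect=100.
portnumber
Changes the port it listens on from the default (1500). The port must be in the range 1024 to 65535. The same restriction applies to port numbers specified with Valgrind's own --log-socket option.
If a Valgrinded process cannot connect to the listener for any reason (the listener is not running, the host or port is invalid or unreachable, and so on), Valgrind falls back to writing to stderr.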

3. Analysing an error message
An example of an error message reported by Memcheck:

==25832== Invalid read of size 4
==25832==    at 0x8048724: BandMatrix::ReSize(int, int, int) (bogon.cpp:45)
==25832==    by 0x80487AF: main (bogon.cpp:66)
==25832==  Address 0xBFFFF74C is not stack'd, malloc'd or free'd

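Analysis of the error message:
The program performed an invalid 4-byte read at address 0xBFFFF74C. The read happened at line 45 of bogon.cpp, in BandMatrix::ReSize(int, int, int), which was called from main at line 66 of bogon.cpp. The last line says that, as far as Memcheck can tell, the address is neither on the stack nor inside any block that has been malloc'd or free'd.

Valgrind remembers all error reports. When an error is detected, it is compared against earlier reports to see whether it is a duplicate. If so, the error is counted but no further message is emitted. This keeps you from being swamped by huge numbers of duplicate reports.

To find out how many times each error occurred, run with the -v option. When execution finishes, all reports are printed together with their occurrence counts, sorted by count, which makes it easy to see which errors occur most often.

In general, try to fix errors in the order they are reported. For example, under Memcheck a program that copies uninitialised values to several memory locations and uses them later will generate several error messages; the first of them is likely to give the most direct clue to the root cause.

Detecting duplicate errors is expensive and can become a significant overhead if the program generates huge numbers of errors. To avoid serious problems, Valgrind stops collecting errors after it has seen 1,000 different errors or 10,000,000 errors in total. At that point you should stop the program and fix it, because Valgrind will not tell you anything else useful.

If you do not want these limits to apply, use the --error-limit=no option. Valgrind will then always show errors, however many there are, at some cost in performance.

Original link: https://blog.csdn.net/u010168781/article/details/83547083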

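IV. Suppressing Errors

1. What error suppression is
Error-checking tools detect many problems in system libraries, such as the C library, that come pre-installed with the OS. You cannot easily fix these errors, there are many of them, and you do not want to see them. Hiding such errors is called "suppressing" them.

2. How to use it
1) Using the default suppressions
The valgrind option is:
--default-suppressions=yes|no load default suppressions [yes]
--default-suppressions controls whether the default suppressions are loaded. In the installation shown here the default file is /usr/lib/valgrind/default.supp; the location can differ between distributions and versions. You can add your own entries to this file, although keeping them in a separate file loaded with --suppressions is usually easier to maintain, since default.supp is a generated file.
The contents of default.supp look like this: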

# This is a generated file, composed of the following suppression rules:
#exp-sgcheck.supp xfree-3.supp xfree-4.supp glibc-2.X-drd.supp glibc-2.34567-NPTL-helgrind.supp glibc-2.X.supp
{
  ld-2.X possibly applying relocations
  exp-sgcheck:SorG
  obj:*/*lib*/ld-2.*so*
  obj:*/*lib*/ld-2.*so*
}

#I'm pretty sure this is a false positive caused by the sg_ stuff
{
  glibc realpath false positive
  exp-sgcheck:SorG
  fun:realpath
  fun:*
}

{
  I think this is glibc's ultra optimised getenv doing 2 byte reads
  exp-sgcheck:SorG
  fun:getenv
}
(remainder omitted)

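2) Adding suppressions from a specified file
--suppressions=<filename> suppress errors described in <filename>
--suppressions=/path/to/file.supp

3) Suppression file format

# Lines starting with # are comments.
# Each suppression is enclosed in a pair of braces, "{" and "}".
# First line: a name for the suppression. Any identifying string will do; it is used in the summary of used suppressions printed when the program finishes.
# Second line: the name of the tool(s) the suppression applies to (comma-separated if more than one) and the suppression kind, separated by a colon, with no spaces allowed.
# Remaining lines: the calling context to match. fun: lines match function names and obj: lines match object (library or executable) names; both may use the wildcards * and ?.
{
    name
    Memcheck:Value8
    fun:_itoa_word
}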

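4) Suppression kinds
Memcheck has the following suppression kinds:
Value1, Value2, Value4, Value8, Value16: an uninitialised-value error when using a value of 1, 2, 4, 8 or 16 bytes.
Cond (or its old name, Value0): a conditional jump or move that depends on an uninitialised value (an uninitialised CPU condition code).
Addr1, Addr2, Addr4, Addr8, Addr16: an invalid address during a memory access of 1, 2, 4, 8 or 16 bytes respectively.
Jump: a jump to an unaddressable location.
Param: an invalid system call parameter.
Free: an invalid or mismatching free.
Overlap: overlapping src and dst pointers in memcpy or a similar function.
Leak: a memory leak.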

3. An example of suppressing errors
1) Source code, main.c
Compile with debugging information: gcc -g main.c

#include <stdio.h>

int main()
{
    int x;
    printf("x = %d\n",x);
}

2) Errors reported without suppressions
2.1 Run: valgrind --tool=memcheck --log-file=valgrind%p.txt ./a.out
2.2 The errors are saved to valgrind*.txt, with the following contents:

==3467== Memcheck, a memory error detector
==3467== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3467== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==3467== Command: ./a.out
==3467== Parent PID: 2870
==3467== 
==3467== Conditional jump or move depends on uninitialised value(s)
==3467==    at 0x4E814CE: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== Use of uninitialised value of size 8
==3467==    at 0x4E8099B: _itoa_word (_itoa.c:179)
==3467==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== Conditional jump or move depends on uninitialised value(s)
==3467==    at 0x4E809A5: _itoa_word (_itoa.c:179)
==3467==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== Conditional jump or move depends on uninitialised value(s)
==3467==    at 0x4E84682: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== Conditional jump or move depends on uninitialised value(s)
==3467==    at 0x4E81599: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== Conditional jump or move depends on uninitialised value(s)
==3467==    at 0x4E8161C: vfprintf (vfprintf.c:1660)
==3467==    by 0x4E8B3D8: printf (printf.c:33)
==3467==    by 0x400548: main (main.c:6)
==3467== 
==3467== 
==3467== HEAP SUMMARY:
==3467==     in use at exit: 0 bytes in 0 blocks
==3467==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3467== 
==3467== All heap blocks were freed -- no leaks are possible
==3467== 
==3467== For counts of detected and suppressed errors, rerun with: -v
==3467== Use --track-origins=yes to see where uninitialised values come from
==3467== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)

3) Using suppressions
3.1 Write your own suppression file
The file below targets the errors shown above so that they are no longer printed.

$ cat gw.supp 
{
	1
	Memcheck:Value8
	fun:_itoa_word
}
{
	2
	Memcheck:Cond
	fun:vfprintf
}
{
	3
	Memcheck:Cond
	fun:_itoa_word
}

3.2 Run the command

valgrind --tool=memcheck  --suppressions=./gw.supp --log-file=valgrind%p.txt ./a.out

$ cat valgrind3488.txt 
==3488== Memcheck, a memory error detector
==3488== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3488== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==3488== Command: ./a.out
==3488== Parent PID: 2870
==3488== 
==3488== 
==3488== HEAP SUMMARY:
==3488==     in use at exit: 0 bytes in 0 blocks
==3488==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3488== 
==3488== All heap blocks were freed -- no leaks are possible
==3488== 
==3488== For counts of detected and suppressed errors, rerun with: -v
==3488== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)

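Comparing the two runs shows that the errors have been suppressed and are no longer printed.

4. Generating a suppression file automatically
1) When you first meet suppressions, the big question is how to write the suppression file. Valgrind can generate the suppression for each error itself.
The valgrind option --gen-suppressions does this:
--gen-suppressions=no|yes|all print suppressions for errors? [no]
With yes, Valgrind pauses after every error shown and prints the line
---- Print suppression ? --- [Return/N/n/Y/y/C/c] ----
The prompt behaves in the same way as the one for the --db-attach option described later: Return, N or n means no; Y or y prints a suppression for this error; C or c means no and do not ask again. With all, Valgrind prints a suppression for every reported error without asking. This option is particularly useful with C++ programs, because it prints the suppressions with mangled names, as required.
2) Example
Using the same main.c as above.
Run: valgrind --tool=memcheck --gen-suppressions=all --log-file=valgrind%p.txt ./a.out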

$ cat valgrind3668.txt 
==3668== Memcheck, a memory error detector
==3668== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3668== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==3668== Command: ./a.out
==3668== Parent PID: 2870
==3668== 
==3668== Conditional jump or move depends on uninitialised value(s)
==3668==    at 0x4E814CE: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== Use of uninitialised value of size 8
==3668==    at 0x4E8099B: _itoa_word (_itoa.c:179)
==3668==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Value8
   fun:_itoa_word
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== Conditional jump or move depends on uninitialised value(s)
==3668==    at 0x4E809A5: _itoa_word (_itoa.c:179)
==3668==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:_itoa_word
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== Conditional jump or move depends on uninitialised value(s)
==3668==    at 0x4E84682: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== Conditional jump or move depends on uninitialised value(s)
==3668==    at 0x4E81599: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== Conditional jump or move depends on uninitialised value(s)
==3668==    at 0x4E8161C: vfprintf (vfprintf.c:1660)
==3668==    by 0x4E8B3D8: printf (printf.c:33)
==3668==    by 0x400548: main (main.c:6)
==3668== 
{
   <insert_a_suppression_name_here>
   Memcheck:Cond
   fun:vfprintf
   fun:printf
   fun:main
}
==3668== 
==3668== HEAP SUMMARY:
==3668==     in use at exit: 0 bytes in 0 blocks
==3668==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3668== 
==3668== All heap blocks were freed -- no leaks are possible
==3668== 
==3668== For counts of detected and suppressed errors, rerun with: -v
==3668== Use --track-origins=yes to see where uninitialised values come from
==3668== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)

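Deleting the lines that start with ==3668== leaves a valid suppression file; if those lines are left in, Valgrind reports an error when loading the file.
Alternatively, turn them into comments in vim with the command :%s/==3668==/#/g

Original link: https://blog.csdn.net/u010168781/article/details/83547788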

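V. Command-Line Options in Detail

1. Usage
usage: valgrind [options] prog-and-args
That is: valgrind [options] program-and-its-arguments

2. Selecting a tool
tool-selection option, with default in [ ]:
--tool=<name> use the Valgrind tool named <name> [memcheck]
<name> can be one of:
1) memcheck: detects memory problems in a program, such as leaks, out-of-bounds accesses and invalid pointers.
2) callgrind: a call-graph generating profiler for analysing program performance.
3) cachegrind: analyses CPU cache hit and miss rates, for code optimisation.
4) helgrind: detects race conditions in multithreaded programs.
5) massif: a heap profiler that shows how much heap memory the program uses, among other things.
6) lackey: a small example tool that demonstrates how to write a tool; rarely used.
7) nulgrind: the minimal tool, which does no instrumentation and is useful only for testing.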

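3. Basic options for all tools
basic user options for all Valgrind tools, with defaults in [ ]:
-h --help show this message
--help-debug show this message, plus debugging options
--version show version
-q --quiet run silently; only print error msgs
-v --verbose be more verbose -- show misc extra info // Gives extra information on various aspects of the program, such as the shared objects loaded, the suppressions used, the progress of the instrumentation and execution engines, and warnings about unusual behaviour. Repeating the option increases the verbosity level.
--trace-children=no|yes [no] // With yes, Valgrind traces into sub-processes that the program starts via exec.
The default is no. Either way, Valgrind follows the child of a fork, because fork makes an identical copy of the process.
--trace-children-skip=patt1,patt2,... // Only has an effect with --trace-children=yes. Lists the executables that should not be traced into; the patterns patt1, patt2, ... are matched against the child executable's name and may contain the wildcards ? and *. Note that Valgrind then also does not trace the children of a skipped process.
--trace-children-skip-by-arg=patt1,patt2,... // The same as --trace-children-skip, except that the decision is made by matching the patterns against the child's arguments rather than its executable name.
--child-silent-after-fork=no|yes omit child output between fork & exec? [no] // With yes, Valgrind shows no debugging or logging output for the child of a fork call. Strongly recommended when requesting XML output (--xml=yes). The default is no.
--vgdb=no|yes|full activate gdbserver? [yes] // With yes or full, Valgrind provides gdbserver functionality so that the program running on it can be debugged with GDB. The default is yes. full is slower but provides precise watchpoints and stepping.
--vgdb-error=<number> invoke gdbserver after <number> errors [999999999]
to get started quickly, use --vgdb-error=0
and follow the on-screen directions
// Useful when the gdbserver is enabled. Tools that report errors wait until <number> errors have been reported, then freeze the program and wait for you to connect with GDB. A value of 0 therefore starts the gdbserver before the program executes; this is typically used to insert GDB breakpoints before execution, and also works with tools that do not report errors, such as Massif. The default is 999999999.
--vgdb-stop-at=event1,event2,... invoke gdbserver for given events [none]
where event is one of:
startup exit valgrindabexit all none
--track-fds=no|yes track open file descriptors? [no] // With yes, Valgrind prints a list of open file descriptors on exit, including where each file was opened and details such as the file name or socket. The default is no.
--time-stamp=no|yes add timestamps to log messages? [no] // With yes, each message is preceded by the time elapsed since the program started.
--log-fd=<number> log messages to file descriptor [2=stderr] // Sends Valgrind's messages to the file descriptor <number>. The default is 2, that is stderr. Note that the messages may then be interleaved with the program's own output to stderr.
--log-file=<file> log messages to <file> // Sends the messages to the given file. An empty file name causes an abort. The %p, %q{FOO} and %% format specifiers may be used in the name, as described in the section on where the output goes.
--log-socket=ipaddr:port log messages to socket ipaddr:port // Sends the messages to the given port at the given IP address. If the port is omitted, port 1500 is used. If no connection can be made to the socket, Valgrind falls back to writing to stderr.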

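4. Error-related options
user options for Valgrind tools that report errors:
The following options apply to all tools that can report errors, such as Memcheck; they do not apply to tools such as Cachegrind.
--xml=yes emit error output in XML (some tools only) // When enabled, the important parts of the output (for example tool error messages) are in XML rather than plain text. Less important, non-error output is still printed as plain text. The XML goes to the destination given by --xml-fd, --xml-file or --xml-socket, and the plain text to the destination given by --log-fd, --log-file or --log-socket. The format is specified in docs/internals/xml-output-protocol4.txt.
--xml-fd=<number> XML output to file descriptor // Sends the XML output to file descriptor <number>. Requires --xml=yes.
--xml-file=<file> XML output to <file> // Like --log-file. Also requires --xml=yes.
--xml-socket=ipaddr:port XML output to socket ipaddr:port // Like --log-socket. Also requires --xml=yes.
--xml-user-comment=STR copy STR verbatim into XML output // Embeds an extra user comment at the start of the XML output. Ignored without --xml=yes.
--demangle=no|yes automatically demangle C++ names? [yes] // When enabled, Valgrind tries to translate encoded C++ symbol names back to something close to the names in the source. The default is yes. An important caveat is that function names in suppression files must be in their mangled form. Valgrind does not demangle function names when searching for applicable suppressions, because otherwise the suppression file contents would depend on the state of Valgrind's demangling machinery, and suppression matching would also be slower.
--num-callers=<number> show <number> callers in stack traces [12] // Sets the maximum number of entries shown in the stack traces that identify program locations. Note that duplicate errors are detected using only the top four function locations, so this option does not affect the total number of errors reported. The default is 12. The maximum is 500 in recent versions (older versions capped it at 50).
--error-limit=no|yes stop showing new errors if too many? [yes] // With yes, Valgrind stops reporting errors after 10,000,000 errors in total or 1,000 different ones. A program with that many errors needs fixing before further analysis is worthwhile. The default is yes.
--error-exitcode=<number> exit code to return if errors found [0=disable] // Specifies an alternative exit code to return if Valgrind reported any errors during the run.
With the default value (zero), Valgrind's return value is always the return value of the process being simulated. With a non-zero value, that value is returned instead if Valgrind detected any errors. This is useful when Valgrind runs as part of an automated test suite, because test cases for which Valgrind reports errors can be detected simply by inspecting the return code.
--exit-on-first-error=<yes|no> [default: no]
With this option enabled, Valgrind exits on the first error. A non-zero exit value must be defined with the --error-exitcode option. Useful when running regression tests or other automated test machinery.
--show-below-main=no|yes continue stack traces below main() [no] // By default (no), stack traces for errors do not show any functions that appear beneath main.
--default-suppressions=yes|no load default suppressions [yes] // Whether to load the default suppressions.
--suppressions=<filename> suppress errors described in <filename> // Adds suppressions from another file.
--gen-suppressions=no|yes|all print suppressions for errors? [no] // With yes, Valgrind pauses after every error shown and prints the line ---- Print suppression ? --- [Return/N/n/Y/y/C/c] ---- . The prompt behaves like the one for --db-attach: Y or y prints a suppression for this error, Return, N or n does not, and C or c means no and do not ask again. With all, a suppression is printed for every error without asking. This option is particularly useful with C++ programs, because it prints the suppressions with mangled names, as required.
--db-attach=no|yes start debugger when errors detected? [no] // Deprecated. With yes, Valgrind pauses after every error shown and prints the line ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- . Answering y starts the debugger at that point; Valgrind resumes only after you finish debugging and quit the debugger. If you use GDB, --vgdb=yes or --vgdb=full gives more powerful debugging, because it activates Valgrind's built-in gdbserver, which supports almost all of GDB's functionality.
--db-command=<command> command to start debugger [/usr/bin/gdb -nw %f %p] // The command used when --db-attach is enabled. The default is "gdb -nw %f %p", where %f is the name of the executable being debugged and %p is the process ID. The default debugger is the one Valgrind detected when it was built, usually /usr/bin/gdb. The command should be enclosed in double quotes.
--input-fd=<number> file descriptor for input [0=stdin] // With --db-attach=yes or --gen-suppressions=yes, Valgrind may stop and wait for keyboard input when an error occurs. By default it reads from stdin (0). This option makes Valgrind read from file descriptor <number> instead, which is useful for programs that close stdin.
--dsymutil=no|yes run dsymutil on Mac OS X when helpful? [no] // Only relevant when running Valgrind on Mac OS X.
--max-stackframe=<number> assume stack switch for SP changes larger than <number> bytes [2000000] // The maximum size of a stack frame. If the stack pointer moves by more than this amount, Valgrind assumes that the program is switching to a different stack. The default is 2000000. Use this option only if Valgrind's debugging output tells you to. If it does, the program is probably allocating large structures on the stack; large data is better allocated on the heap.
--main-stacksize=<number> set size of main thread's stack (in bytes) [min(max(current 'ulimit' value,1MB),16MB)] // Sets the size of the main thread's stack. By default Valgrind uses the current ulimit value or 16 MB, whichever is lower, which in many cases gives a stack of 8 to 16 MB, enough for most applications. On Linux you may request up to 2 GB. Valgrind stops with an error if the stack cannot be allocated. The option affects only the initial thread's stack, not the stacks of other threads.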

5. Options related to malloc()
user options for Valgrind tools that replace malloc:
--alignment=<number> set minimum alignment of heap allocations [16]
--redzone-size=<number> set minimum size of redzones added before/after
heap blocks (in bytes). [16]

6. Uncommon options
uncommon user options for all Valgrind tools:
--fullpath-after= (with nothing after the '=')
show full source paths in call stacks
--fullpath-after=string like --fullpath-after=, but only show the
part of the path after 'string'. Allows removal
of path prefixes. Use this flag multiple times
to specify a set of prefixes to remove.
--extra-debuginfo-path=path absolute path to search for additional
debug symbols, in addition to existing default
well known search paths.
--debuginfo-server=ipaddr:port also query this server
(valgrind-di-server) for debug symbols
--allow-mismatched-debuginfo=no|yes [no]
for the above two flags only, accept debuginfo
objects that don't "match" the main object
--smc-check=none|stack|all|all-non-file [stack]
checks for self-modifying code: none, only for
code found in stacks, for all code, or for all
code except that from file-backed mappings
--read-inline-info=yes|no read debug info about inlined function calls
and use it to do better stack traces. [yes]
on Linux/Android for Memcheck/Helgrind/DRD
only. [no] for all other tools and platforms.
--read-var-info=yes|no read debug info on stack and global variables
and use it to print better error messages in
tools that make use of it (Memcheck, Helgrind,
DRD) [no]
--vgdb-poll=<number> gdbserver poll max every <number> basic blocks [5000]
--vgdb-shadow-registers=no|yes let gdb see the shadow registers [no]
--vgdb-prefix=<prefix> prefix for vgdb FIFOs [/tmp/vgdb-pipe]
--run-libc-freeres=no|yes free up glibc memory at exit on Linux? [yes]
--sim-hints=hint1,hint2,... activate unusual sim behaviours [none]
where hint is one of:
lax-ioctls fuse-compatible enable-outer
no-inner-prefix no-nptl-pthread-stackcache none
--fair-sched=no|yes|try schedule threads fairly on multicore systems [no]
--kernel-variant=variant1,variant2,...
handle non-standard kernel variants [none]
where variant is one of:
bproc android-no-hw-tls
android-gpu-sgx5xx android-gpu-adreno3xx none
--merge-recursive-frames=<number> merge frames between identical
program counters in max <number> frames) [0]
--num-transtab-sectors=<number> size of translated code cache [16]
more sectors may increase performance, but use more memory.
--aspace-minaddr=0xPP avoid mapping memory below 0xPP [guessed]
--show-emwarns=no|yes show warnings about emulation limits? [no]
--require-text-symbol=:sonamepattern:symbolpattern abort run if the
stated shared object doesn't have the stated
text symbol. Patterns can contain ? and *.
--soname-synonyms=syn1=pattern1,syn2=pattern2,... synonym soname
specify patterns for function wrapping or replacement.
To use a non-libc malloc library that is
in the main exe: --soname-synonyms=somalloc=NONE
in libxyzzy.so: --soname-synonyms=somalloc=libxyzzy.so
--sigill-diagnostics=yes|no warn about illegal instructions? [yes]
--unw-stack-scan-thresh=<number> Enable stack-scan unwind if fewer
than <number> good frames found [0, meaning "disabled"]
NOTE: stack scanning is only available on arm-linux.
--unw-stack-scan-frames=<number> Max number of frames that can be
recovered by stack scanning [5]
7. Memcheck options
user options for Memcheck:
--leak-check=no|summary|full search for memory leaks at exit? [summary] // Whether to search for memory leaks when the program exits.
--leak-resolution=low|med|high differentiation of leak stack traces [high]
// When checking for leaks, determines how willing Memcheck is to consider different stack traces to be the same. With low, only the first two entries need to match; with med, four entries must match; with high, all entries must match.
For hardcore leak debugging you probably want --leak-resolution=high together with --num-callers=40 or some such large number. Note that this can produce a very large amount of output. The --leak-resolution setting does not affect Memcheck's ability to find leaks; it only changes how the results are presented.
--show-leak-kinds=kind1,kind2,... which leak kinds to show? [definite,possible]
--errors-for-leak-kinds=kind1,kind2,... which leak kinds are errors? [definite,possible]
where kind is one of:
definite indirect possible reachable all none
--leak-check-heuristics=heur1,heur2,... which heuristics to use for
improving leak search false positive [none]
where heur is one of:
stdstring length64 newarray multipleinheritance all none
--show-reachable=yes same as --show-leak-kinds=all
--show-reachable=no --show-possibly-lost=yes same as --show-leak-kinds=definite,possible
--show-reachable=no --show-possibly-lost=no same as --show-leak-kinds=definite
--undef-value-errors=no|yes check for undefined value errors [yes] // Whether Memcheck reports uses of undefined values.
--track-origins=no|yes show origins of undefined values? [no] // Whether Memcheck tracks and shows where undefined values came from.
--partial-loads-ok=no|yes too hard to explain here; see manual [no]
--freelist-vol=<number> volume of freed blocks queue [20000000] // Total size, in bytes, of the queue of freed blocks that Memcheck holds back from reallocation.
--freelist-big-blocks=<number> releases first blocks with size>= [1000000] // When the queue has to shrink, blocks of at least <number> bytes are released first.
--workaround-gcc296-bugs=no|yes self explanatory [no]
--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] assume given addresses are OK // Memcheck does not check accesses within the listed address ranges.
--malloc-fill=<hexnumber> fill malloc'd areas with given value // Fills newly allocated blocks with the given byte value.
--free-fill=<hexnumber> fill free'd areas with given value // Fills freed blocks with the given byte value.
--keep-stacktraces=alloc|free|alloc-and-free|alloc-then-free|none
stack trace(s) to keep for malloc'd/free'd areas [alloc-then-free] // Which stack traces to keep for malloc'd and free'd blocks.
--show-mismatched-frees=no|yes show frees that don't match the allocator? [yes] // Whether to report deallocations that do not match the allocation function.

Extra options read from ~/.valgrindrc, $VALGRIND_OPTS, ./.valgrindrc

Memcheck is Copyright © 2002-2013, and GNU GPL’d, by Julian Seward et al.
Valgrind is Copyright © 2000-2013, and GNU GPL’d, by Julian Seward et al.
LibVEX is Copyright © 2004-2013, and GNU GPL’d, by OpenWorks LLP et al.

Bug reports, feedback, admiration, abuse, etc, to: www.valgrind.org.

Original link: https://blog.csdn.net/u010168781/article/details/83744778

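VI. Debugging with the Valgrind gdbserver and GDB

1. Overview
A program running under Valgrind is not executed directly by the CPU. Instead it runs on a synthetic CPU provided by Valgrind. This is why a debugger cannot debug the program in the usual way while it runs on Valgrind.

2. Quick start
To debug a program with GDB while it runs under Memcheck, start it as follows:
1) valgrind --vgdb=yes --vgdb-error=0 prog
2) In another shell, start GDB: gdb prog
3) Give GDB the following command: (gdb) target remote | vgdb
You can now debug the program, for example by inserting a breakpoint and then using the GDB continue command.

3. Remote debugging with gdbserver
1) How gdbserver works
Local debugging: the GNU GDB debugger is typically used to debug a process running on the same machine. In this mode GDB uses system calls to control and query the program being debugged.

Remote debugging: GDB can also debug processes running on a different computer. To do this, GDB defines a protocol (a set of query and reply packets) for fetching the values of memory or registers, setting breakpoints, and so on. A gdbserver is an implementation of this "GDB remote debugging" protocol. To debug a process running on a remote computer, a gdbserver must run on the remote side.

2) The gdbserver inside Valgrind
The Valgrind core provides a built-in gdbserver implementation, activated with --vgdb=yes or --vgdb=full. This gdbserver allows a process running on Valgrind's synthetic CPU to be debugged remotely. GDB sends protocol query packets (such as "get register contents") to the gdbserver inside Valgrind. The gdbserver executes the query (for example, it fetches the register values of the synthetic CPU) and returns the result to GDB.

GDB can use various channels (TCP/IP, serial line, and so on) to communicate with a gdbserver. In the case of Valgrind's gdbserver, communication goes through a pipe and a small helper program called vgdb, which acts as an intermediary. If GDB is not used, vgdb can also send monitor commands to the Valgrind gdbserver from a shell command line.

3) Debugging steps
3.1 Start the program under Valgrind with the gdbserver enabled:
valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
The options:
--vgdb=yes: enables the gdbserver;
--vgdb-error=0: start debugging after 0 errors, that is before the program executes; typically used to insert breakpoints first

3.2 Start GDB and connect through vgdb
(vgdb is the intermediary between GDB and the gdbserver inside Valgrind; it talks to Valgrind through local FIFOs, so run it in another shell on the same machine)

gdb prog
(gdb) target remote | vgdb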
If only one Valgrind gdbserver is running, the output looks like this:

Remote debugging using | vgdb
relaying data between gdb and process 2418
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2418]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)

If several gdbservers are running, the process ID must be specified:
target remote | vgdb --pid=PID

(gdb) target remote | vgdb
Remote debugging using | vgdb
no --pid= arg given and multiple valgrind pids found:
use --pid=2479 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog 
use --pid=2481 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog 
use --pid=2483 for valgrind --vgdb=yes --vgdb-error=0 ./another_prog 
Remote communication error: Resource temporarily unavailable.

(gdb)  target remote | vgdb --pid=2479
Remote debugging using | vgdb --pid=2479
relaying data between gdb and process 2479
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2479]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)

Original link: https://blog.csdn.net/u010168781/article/details/83748757

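VII. Memcheck (Memory Error Detector)

1. Overview
Memcheck is a memory error detector. It can detect the following problems, which are common in C and C++ programs:
1) Accessing memory you shouldn't, such as overrunning a heap block or accessing memory after it has been freed;
2) Using uninitialised values;
3) Incorrect freeing of memory, such as double-freeing a heap block (calling free twice on the same memory), or mismatched use of malloc/new/new[] versus free/delete/delete[];
4) Overlapping src and dst pointers in memcpy and related functions;
5) Passing an invalid size to a memory allocation function, for example a negative value;
6) Memory leaks.

Problems like these are hard to find by other means, often stay undetected for a long time, and then cause occasional, hard-to-diagnose crashes.

2. What Memcheck's error messages mean
1) Invalid read of size 4
Meaning: an illegal read or write (the example below triggers the write variant, "Invalid write of size 4").
Example, with the following main.c: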

#include <stdio.h>
#include <stdlib.h>

int main()
{
	int *x = (int *)malloc(sizeof(int)*10);
	int i;
	for(i=0; i<=10; ++i)
	{
		x[i] = i; // when i == 10 this writes past the end of the block: invalid access
	}
}

Compile: gcc -g main.c
Check for memory errors: valgrind --tool=memcheck ./a.out
The errors reported are:

==20979== Memcheck, a memory error detector
==20979== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==20979== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==20979== Command: ./a.out
==20979== Parent PID: 17485
==20979== 
==20979== Invalid write of size 4
==20979==    at 0x400563: main (main.c:10)
==20979==  Address 0x5200068 is 0 bytes after a block of size 40 alloc'd
==20979==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20979==    by 0x40053E: main (main.c:6)
==20979== 
==20979== 
==20979== HEAP SUMMARY:
==20979==     in use at exit: 40 bytes in 1 blocks
==20979==   total heap usage: 1 allocs, 0 frees, 40 bytes allocated
==20979== 
==20979== LEAK SUMMARY:
==20979==    definitely lost: 40 bytes in 1 blocks
==20979==    indirectly lost: 0 bytes in 0 blocks
==20979==      possibly lost: 0 bytes in 0 blocks
==20979==    still reachable: 0 bytes in 0 blocks
==20979==         suppressed: 0 bytes in 0 blocks
==20979== Rerun with --leak-check=full to see details of leaked memory
==20979== 
==20979== For counts of detected and suppressed errors, rerun with: -v
==20979== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

2) Conditional jump or move depends on uninitialised value(s)
Meaning: an uninitialised value is used.
This error is encountered frequently.
Example
main.c:

#include <stdio.h>

int main()
{
	int x;
	printf("x = %d\n",x); // reads an uninitialised value here: error
}

Compile: gcc -g main.c
Check for memory errors: valgrind --tool=memcheck ./a.out
The errors reported are:

==21182== Memcheck, a memory error detector
==21182== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21182== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21182== Command: ./a.out
==21182== Parent PID: 17485
==21182== 
==21182== Conditional jump or move depends on uninitialised value(s)
==21182==    at 0x4E814CE: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== Use of uninitialised value of size 8
==21182==    at 0x4E8099B: _itoa_word (_itoa.c:179)
==21182==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== Conditional jump or move depends on uninitialised value(s)
==21182==    at 0x4E809A5: _itoa_word (_itoa.c:179)
==21182==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== Conditional jump or move depends on uninitialised value(s)
==21182==    at 0x4E84682: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== Conditional jump or move depends on uninitialised value(s)
==21182==    at 0x4E81599: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== Conditional jump or move depends on uninitialised value(s)
==21182==    at 0x4E8161C: vfprintf (vfprintf.c:1660)
==21182==    by 0x4E8B3D8: printf (printf.c:33)
==21182==    by 0x400548: main (main.c:6)
==21182== 
==21182== 
==21182== HEAP SUMMARY:
==21182==     in use at exit: 0 bytes in 0 blocks
==21182==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==21182== 
==21182== All heap blocks were freed -- no leaks are possible
==21182== 
==21182== For counts of detected and suppressed errors, rerun with: -v
==21182== Use --track-origins=yes to see where uninitialised values come from
==21182== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)

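Adding the --track-origins=yes option helps locate where the uninitialised value came from, at the cost of making Memcheck run more slowly.
With --track-origins=yes added, the output is as follows: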

==21210== Memcheck, a memory error detector
==21210== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21210== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21210== Command: ./a.out
==21210== Parent PID: 17485
==21210== 
==21210== Conditional jump or move depends on uninitialised value(s)
==21210==    at 0x4E814CE: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== Use of uninitialised value of size 8
==21210==    at 0x4E8099B: _itoa_word (_itoa.c:179)
==21210==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== Conditional jump or move depends on uninitialised value(s)
==21210==    at 0x4E809A5: _itoa_word (_itoa.c:179)
==21210==    by 0x4E84636: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== Conditional jump or move depends on uninitialised value(s)
==21210==    at 0x4E84682: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== Conditional jump or move depends on uninitialised value(s)
==21210==    at 0x4E81599: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== Conditional jump or move depends on uninitialised value(s)
==21210==    at 0x4E8161C: vfprintf (vfprintf.c:1660)
==21210==    by 0x4E8B3D8: printf (printf.c:33)
==21210==    by 0x400548: main (main.c:6)
==21210==  Uninitialised value was created by a stack allocation
==21210==    at 0x40052D: main (main.c:4)
==21210== 
==21210== 
==21210== HEAP SUMMARY:
==21210==     in use at exit: 0 bytes in 0 blocks
==21210==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==21210== 
==21210== All heap blocks were freed -- no leaks are possible
==21210== 
==21210== For counts of detected and suppressed errors, rerun with: -v
==21210== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)

The lines that identify where the uninitialised value was created are:

==21210== Uninitialised value was created by a stack allocation
==21210== at 0x40052D: main (main.c:4)
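
As a hedged illustration of the kind of fix this report points to: the origin line names the stack frame where the uninitialised variable lives, so the remedy is usually to give that local an initial value. The function below is a hypothetical example (sum_values is not part of the original program); its accumulator is initialised, so nothing undefined can reach a later printf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds up n ints.
 * The accumulator is a stack variable; initialising it at its declaration
 * is what removes a "created by a stack allocation" report. */
long sum_values(const int *values, size_t n)
{
    long total = 0; /* without "= 0", total would be an uninitialised stack value */
    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```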

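3. Syscall param * uninitialised byte(s)

Syscall param write(buf) points to uninitialised byte(s)

Meaning: an uninitialised or unaddressable value was passed to a system call.
Memcheck checks all parameters of system calls:
It checks that every direct parameter is itself initialised.
If the system call needs to read from a buffer supplied by the program, Memcheck checks that the entire buffer is addressable and that its contents are initialised.
If the system call needs to write to a user-supplied buffer, Memcheck checks that the buffer is addressable.
After the system call, Memcheck updates its tracking information to reflect precisely any changes in memory state caused by the call.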
Example
Source of main.c:

  #include <stdlib.h>
  #include <unistd.h>
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }

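Compile: gcc -g main.c
Memory check: valgrind --tool=memcheck ./a.out
The errors reported are shown below (the "Uninitialised value was created by" lines appear only when --track-origins=yes is also given):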

==21355== Memcheck, a memory error detector
==21355== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21355== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21355== Command: ./a.out
==21355== Parent PID: 17485
==21355== 
==21355== Syscall param write(buf) points to uninitialised byte(s)
==21355==    at 0x4F263C0: __write_nocancel (syscall-template.S:81)
==21355==    by 0x4005F6: main (main.c:7)
==21355==  Address 0x5200040 is 0 bytes inside a block of size 10 alloc'd
==21355==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21355==    by 0x4005CE: main (main.c:5)
==21355==  Uninitialised value was created by a heap allocation
==21355==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21355==    by 0x4005CE: main (main.c:5)
==21355== 
==21355== Syscall param exit_group(status) contains uninitialised byte(s)
==21355==    at 0x4EFC109: _Exit (_exit.c:32)
==21355==    by 0x4E7316A: __run_exit_handlers (exit.c:97)
==21355==    by 0x4E731F4: exit (exit.c:104)
==21355==    by 0x400603: main (main.c:8)
==21355==  Uninitialised value was created by a heap allocation
==21355==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21355==    by 0x4005DC: main (main.c:6)
==21355== 
==21355== 
==21355== HEAP SUMMARY:
==21355==     in use at exit: 14 bytes in 2 blocks
==21355==   total heap usage: 2 allocs, 0 frees, 14 bytes allocated
==21355== 
==21355== LEAK SUMMARY:
==21355==    definitely lost: 0 bytes in 0 blocks
==21355==    indirectly lost: 0 bytes in 0 blocks
==21355==      possibly lost: 0 bytes in 0 blocks
==21355==    still reachable: 14 bytes in 2 blocks
==21355==         suppressed: 0 bytes in 0 blocks
==21355== Rerun with --leak-check=full to see details of leaked memory
==21355== 
==21355== For counts of detected and suppressed errors, rerun with: -v
==21355== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
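
Both reports disappear once the buffer handed to write is initialised and the exit status is a defined value. Here is a minimal sketch of the first half, assuming a helper named write_filled (hypothetical, not part of the original example):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes n copies of `fill` to fd.
 * Every byte of the buffer is initialised by memset before write() reads it,
 * so the write(buf) parameter check has nothing to complain about. */
ssize_t write_filled(int fd, size_t n, char fill)
{
    assert(n > 0);
    char *buf = malloc(n);
    if (buf == NULL) {
        return -1;
    }
    memset(buf, fill, n);
    ssize_t written = write(fd, buf, n);
    free(buf); /* freeing also avoids a "still reachable" block at exit */
    return written;
}
```

Calling write_filled(1, 10, '.') from main and then returning a constant gives a program with the same shape as the example above.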

4. Invalid free() / delete / delete[] / realloc()
Meaning: an invalid deallocation, such as freeing a block that has already been freed.
Example
Source of main.c:

#include <stdlib.h>
#include <unistd.h>
int main( void )
{
	char* arr  = malloc(10);
	free(arr);
	free(arr);
	return 0;
}

Compile: gcc -g main.c
Memory check: valgrind --tool=memcheck ./a.out
The errors reported are as follows:

==21442== Memcheck, a memory error detector
==21442== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21442== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21442== Command: ./a.out
==21442== Parent PID: 17485
==21442== 
==21442== Invalid free() / delete / delete[] / realloc()
==21442==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21442==    by 0x4005AA: main (main.c:7)
==21442==  Address 0x5200040 is 0 bytes inside a block of size 10 free'd
==21442==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21442==    by 0x40059E: main (main.c:6)
==21442== 
==21442== 
==21442== HEAP SUMMARY:
==21442==     in use at exit: 0 bytes in 0 blocks
==21442==   total heap usage: 1 allocs, 2 frees, 10 bytes allocated
==21442== 
==21442== All heap blocks were freed -- no leaks are possible
==21442== 
==21442== For counts of detected and suppressed errors, rerun with: -v
==21442== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
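
One common way to make a repeated free harmless is to reset the pointer after freeing it, because free(NULL) is defined to do nothing. A minimal sketch follows; release_block is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees *block and clears the caller's pointer.
 * A second call then passes NULL to free(), which is a no-op, instead of
 * freeing the same block twice. */
void release_block(char **block)
{
    assert(block != NULL);
    free(*block);
    *block = NULL;
}
```

This protects only the one pointer variable that is reset; other copies of the pointer still dangle, which is why running under Memcheck remains useful.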

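5. Mismatched free() / delete / delete []
Meaning: memory was released with a function that does not match the one that allocated it.
Example: allocating with malloc and then releasing with delete triggers this error.
Source of main.c: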

#include <stdlib.h>
#include <unistd.h>
int main( void )
{
	char* arr  = (char*)malloc(10);
	delete arr;
	return 0;
}

Compile with g++ (which treats main.c as C++): g++ -g main.c
Memory check: valgrind --tool=memcheck ./a.out
The errors reported are as follows:

==21579== Memcheck, a memory error detector
==21579== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21579== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21579== Command: ./a.out
==21579== Parent PID: 17485
==21579== 
==21579== Mismatched free() / delete / delete []
==21579==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21579==    by 0x40066E: main (main.c:6)
==21579==  Address 0x5a20040 is 0 bytes inside a block of size 10 alloc'd
==21579==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21579==    by 0x40065E: main (main.c:5)
==21579== 
==21579== 
==21579== HEAP SUMMARY:
==21579==     in use at exit: 0 bytes in 0 blocks
==21579==   total heap usage: 1 allocs, 1 frees, 10 bytes allocated
==21579== 
==21579== All heap blocks were freed -- no leaks are possible
==21579== 
==21579== For counts of detected and suppressed errors, rerun with: -v
==21579== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

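6. Source and destination overlap in memcpy
Meaning: the C library functions memcpy, strcpy, strncpy, strcat and strncat copy data from one memory block to another, and the blocks pointed to by their src and dst arguments must not overlap. The POSIX standard words it as follows: "If copying takes place between objects that overlap, the behavior is undefined." Memcheck therefore checks for this.
Example
Source of main.c: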

#include<string.h>

int main( void )
{
	char a[12] = {'h','e','l', 'l','o','\0'};
	memcpy(a+3,a,6);
	return 0;
}

Compile: gcc -g main.c
Memory check: valgrind --tool=memcheck ./a.out
The errors reported are as follows:

==22017== Memcheck, a memory error detector
==22017== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==22017== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==22017== Command: ./a.out
==22017== 
==22017== Source and destination overlap in memcpy(0xffefffbc3, 0xffefffbc0, 6)
==22017==    at 0x4C2F71C: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22017==    by 0x400613: main (main.c:6)
==22017== 
==22017== 
==22017== HEAP SUMMARY:
==22017==     in use at exit: 0 bytes in 0 blocks
==22017==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==22017== 
==22017== All heap blocks were freed -- no leaks are possible
==22017== 
==22017== For counts of detected and suppressed errors, rerun with: -v
==22017== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

Open question: when I call memcpy(a, a, 6); instead, no error is reported. Can anyone explain why?
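
When the source and destination may overlap, memmove is the function specified to handle it. The sketch below assumes two hypothetical helpers: ranges_overlap expresses the usual overlap condition for two equal-length ranges, and copy_maybe_overlapping simply uses memmove.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [dst, dst+n) and [src, src+n) share a byte. */
int ranges_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    if (n == 0) {
        return 0;
    }
    return (d < s) ? (s - d < n) : (d - s < n);
}

/* Hypothetical helper: copies n bytes; overlap is allowed with memmove. */
void *copy_maybe_overlapping(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return memmove(dst, src, n);
}
```

With the array from the example, copy_maybe_overlapping(a + 3, a, 6) performs the same copy without relying on undefined behaviour.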

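7. Argument 'size' of function malloc has a fishy (possibly negative) value:
Meaning: a suspicious argument value.
Every memory allocation function takes an argument giving the size of the block to allocate. The requested size should obviously be non-negative and is normally not excessively large. For example, a request for more than 2**63 bytes on a 64-bit machine is extremely unlikely to be intended; it is far more likely to be a negative number produced by a faulty size calculation and then interpreted as unsigned. Such values are called "fishy values".
The size argument of the following functions is checked: malloc, calloc, realloc, memalign, new, new [], __builtin_new and __builtin_vec_new. For calloc, both arguments are checked.
Example
Source of main.c: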

#include <stdlib.h>
#include <unistd.h>
int main( void )
{
	char* arr  = (char *)malloc(-1);
	free(arr);
	return 0;
}

Compile: gcc -g main.c
Memory check: valgrind --tool=memcheck ./a.out
The errors reported are as follows:

==22177== Memcheck, a memory error detector
==22177== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==22177== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==22177== Command: ./a.out
==22177== 
==22177== Argument 'size' of function malloc has a fishy (possibly negative) value: -1
==22177==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22177==    by 0x400590: main (main.c:5)
==22177== 
==22177== 
==22177== HEAP SUMMARY:
==22177==     in use at exit: 0 bytes in 0 blocks
==22177==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==22177== 
==22177== All heap blocks were freed -- no leaks are possible
==22177== 
==22177== For counts of detected and suppressed errors, rerun with: -v
==22177== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
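
A fishy size usually comes from a signed value that went negative before being converted to size_t. Here is a minimal sketch of a guard, assuming a hypothetical wrapper named alloc_items:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocates count items of item_size bytes.
 * Rejects a negative count (which would convert to a huge size_t, the
 * "fishy" value Memcheck warns about) and a product that would overflow. */
void *alloc_items(long count, size_t item_size)
{
    assert(item_size > 0);
    if (count < 0) {
        return NULL;
    }
    if ((unsigned long)count > SIZE_MAX / item_size) {
        return NULL;
    }
    return malloc((size_t)count * item_size);
}
```

The caller still has to check for NULL, but a negative count can no longer reach malloc disguised as a huge request.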

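8. LEAK SUMMARY:
Meaning: memory leak detection.
Memcheck tracks every malloc/new call and the matching free/delete call, so when the program exits it knows which blocks have not been freed.
The leak check is controlled by --leak-check: summary (the default) prints only the totals, while full also lists each leak individually.
Example
Source of main.c: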

#include <stdlib.h>
#include <unistd.h>
int main( void )
{
	char* arr  = (char *)malloc(4);
	//free(arr); // not freed here
	return 0;
}

Compile: gcc -g main.c
Memory check: valgrind --tool=memcheck --leak-check=full ./a.out
The errors reported are as follows:

==22289== Memcheck, a memory error detector
==22289== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==22289== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==22289== Command: ./a.out
==22289== 
==22289== 
==22289== HEAP SUMMARY:
==22289==     in use at exit: 4 bytes in 1 blocks
==22289==   total heap usage: 1 allocs, 0 frees, 4 bytes allocated
==22289== 
==22289== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==22289==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==22289==    by 0x40053E: main (main.c:5)
==22289== 
==22289== LEAK SUMMARY:
==22289==    definitely lost: 4 bytes in 1 blocks
==22289==    indirectly lost: 0 bytes in 0 blocks
==22289==      possibly lost: 0 bytes in 0 blocks
==22289==    still reachable: 0 bytes in 0 blocks
==22289==         suppressed: 0 bytes in 0 blocks
==22289== 
==22289== For counts of detected and suppressed errors, rerun with: -v
==22289== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

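LEAK SUMMARY: the leaked blocks, grouped by category
definitely lost: 4 bytes in 1 blocks: no pointer to the block remains. The programmer should fix these leaks; the remaining categories are mainly for reference.
indirectly lost: 0 bytes in 0 blocks: blocks that are reachable only through a block that is itself lost.
possibly lost: 0 bytes in 0 blocks: blocks to which only a pointer into their interior remains.
still reachable: 0 bytes in 0 blocks: blocks that were not freed but are still pointed to at exit.
suppressed: 0 bytes in 0 blocks: leaks matched by a suppression.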
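
To see how the categories differ, the sketch below builds one block of each of three kinds. It is an illustration under assumptions: struct node, g_keep and make_leak_kinds are hypothetical names, and the classification in the comments is what Memcheck would be expected to report with --leak-check=full --show-leak-kinds=all, not captured output.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
};

/* A global pointer keeps its block findable at exit: "still reachable". */
static struct node *g_keep;

void make_leak_kinds(void)
{
    g_keep = calloc(1, sizeof *g_keep);
    assert(g_keep != NULL);

    struct node *head = calloc(1, sizeof *head);
    assert(head != NULL);
    head->next = calloc(1, sizeof *head->next);
    assert(head->next != NULL);

    /* Dropping the only pointer to head: the head block should become
     * "definitely lost", and the child that only head pointed to
     * "indirectly lost". */
    head = NULL;
    (void)head;
}
```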

Original link: https://blog.csdn.net/u010168781/article/details/83749609

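VIII. Memcheck command-line options in detail

Memcheck command-line options
--leak-check=<no|summary|yes|full> [default: summary]
When enabled, search for memory leaks when the program finishes. With summary, Memcheck only reports how many leaks occurred. With full or yes, each individual leak is shown in detail and/or counted as an error, as specified by --show-leak-kinds and --errors-for-leak-kinds.

--leak-resolution=<low|med|high> [default: high]
When doing leak checking, determines how willing Memcheck is to treat different backtraces as the same, so that multiple leaks can be merged into a single leak report. With low, only the first two entries need to match. With med, four entries have to match. With high, all entries need to match.
For hardcore leak debugging, you probably want to use --leak-resolution=high together with --num-callers=40 or some similarly large number.
Note that the --leak-resolution setting does not affect Memcheck's ability to find leaks. It only changes how the results are presented.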

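--show-leak-kinds=<set> [default: definite,possible]
Specifies the leak kinds to show in a --leak-check=full search. The set is given in one of the following ways:
a comma-separated list of one or more of definite indirect possible reachable.
all to specify the complete set (all leak kinds). It is equivalent to --show-leak-kinds=definite,indirect,possible,reachable.
none for the empty set.

--errors-for-leak-kinds=<set> [default: definite,possible]
Specifies the leak kinds to count as errors in a --leak-check=full search. The set is specified in the same way as for --show-leak-kinds.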

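--leak-check-heuristics=<set> [default: all]
Specifies the heuristics used during leak searches. Each heuristic lets Memcheck recognise one pattern of interior pointer and treat the block it points into as "still reachable" instead of "possibly lost". The heuristic set is given in one of the following ways:
a comma-separated list of one or more of stdstring length64 newarray multipleinheritance.
all to activate the complete set of heuristics. It is equivalent to --leak-check-heuristics=stdstring,length64,newarray,multipleinheritance.
none for the empty set.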

--show-reachable=<yes|no>, --show-possibly-lost=<yes|no>
These options provide an alternative way to specify the leak kinds to show:
--show-reachable=no --show-possibly-lost=yes is equivalent to --show-leak-kinds=definite,possible.
--show-reachable=no --show-possibly-lost=no is equivalent to --show-leak-kinds=definite.
--show-reachable=yes is equivalent to --show-leak-kinds=all.

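--xtree-leak=<no|yes> [no]
If set to yes, the results of the leak search done at exit are written to an execution-tree file in Callgrind format. Note that this automatically sets --leak-check=full. The file contains the following events:
RB: Reachable Bytes
PB: Possibly lost Bytes
IB: Indirectly lost Bytes
DB: Definitely lost Bytes (direct plus indirect)
DIB: Definitely Indirectly lost Bytes (subset of DB)
RBk: reachable Blocks
PBk: Possibly lost Blocks
IBk: Indirectly lost Blocks
DBk: Definitely lost Blocks
The increase and decrease of each event above are also written to the file, giving the delta between two successive leak searches. For example, iRB is the increase of the RB event and dPBk is the decrease of the PBk event. For the first leak search, the increase and decrease values are zero.

--xtree-leak-file=<filename> [default: xtleak.kcg.%p]
Writes the xtree leak report to the specified file. %p is replaced by the current process ID.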

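--undef-value-errors=<yes|no> [default: yes]
Controls whether Memcheck reports uses of undefined values.

--track-origins=<yes|no> [default: no]
Controls whether Memcheck tracks the origin of uninitialised values. It is off by default.
When set to yes, Memcheck keeps track of the origins of all uninitialised values. Then, when an uninitialised value error is reported, Memcheck will try to show where the value came from.
Performance overhead: origin tracking halves Memcheck's speed and increases memory use by at least 100MB, possibly more.
Accuracy: Memcheck tracks origins quite accurately. To avoid very large space and time overheads, some approximations are made, so it is possible, although unlikely, that Memcheck reports an incorrect origin or cannot identify any origin.
Note that --track-origins=yes and --undef-value-errors=no cannot be combined. Memcheck checks for this combination at startup and rejects it.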

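--partial-loads-ok=<yes|no> [default: yes]
Controls how Memcheck handles 32-, 64-, 128- and 256-bit naturally aligned loads from addresses for which some bytes are addressable and others are not. When yes, such loads do not produce an address error. Instead, loaded bytes originating from illegal addresses are marked as uninitialised, and those corresponding to legal addresses are handled in the normal way.
When no, loads from partially invalid addresses are treated the same as loads from completely invalid addresses: an illegal-address error is issued, and the resulting bytes are marked as initialised.
Note that code that behaves in this way violates the ISO C/C++ standards and should be considered broken. If at all possible, such code should be fixed.

--expensive-definedness-checks=<no|auto|yes> [default: auto]
Controls whether Memcheck should use more precise but also more expensive (time-consuming) instrumentation when checking the definedness of certain values. In particular, this affects the instrumentation of integer adds, subtracts and equality comparisons.
--expensive-definedness-checks=yes minimises the false error rate but can cause up to 30% performance degradation.
--expensive-definedness-checks=no maximises performance but normally gives a very high false error rate.
The default, --expensive-definedness-checks=auto, is strongly recommended.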

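--keep-stacktraces=alloc|free|alloc-and-free|alloc-then-free|none [default: alloc-and-free]
Controls which stack traces to keep for malloc'd and/or free'd blocks.
With alloc-then-free, a stack trace is recorded at allocation time and associated with the block. When the block is freed, a second stack trace is recorded, and this replaces the allocation stack trace. As a result, any "use after free" error relating to this block can only show a stack trace for where the block was freed.
With alloc-and-free, both the allocation and the deallocation stack traces for the block are stored. Hence a "use after free" error will show both, which may make the error easier to diagnose. Compared to alloc-then-free, this setting slightly increases Valgrind's memory use, as the block contains two references instead of one.
With alloc, only the allocation stack trace is recorded (and reported). With free, only the deallocation stack trace is recorded (and reported). These values somewhat decrease Valgrind's memory and CPU usage. They can be useful depending on the error types you are searching for and the level of detail you need to analyse them. For example, if you are only interested in memory leak errors, it is sufficient to record the allocation stack traces.
With none, no stack traces are recorded for malloc and free operations. If your program allocates a lot of blocks and/or allocates and frees from many different stack traces, this can significantly decrease the CPU and/or memory required. Of course, few details will be reported for errors related to heap blocks.
Note that once a stack trace is recorded, Valgrind keeps it in memory even if it is not referenced by any block. Some programs (for example, recursive algorithms) can generate a huge number of stack traces. If Valgrind uses too much memory in such circumstances, you can reduce the memory required with the option --keep-stacktraces and/or by using a smaller value for the option --num-callers.
If you want to use --xtree-memory=full memory profiling (see Execution Trees), you cannot specify --keep-stacktraces=free or --keep-stacktraces=none.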

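--freelist-vol=<number> [default: 20000000]
When the client program releases memory using free (in C) or delete (in C++), that memory is not immediately made available for re-allocation. Instead, it is marked inaccessible and placed in a queue of freed blocks. The purpose is to defer as long as possible the point at which freed memory comes back into circulation. This increases the chance that Memcheck will be able to detect invalid accesses to blocks for some significant period of time after they have been freed.

This option specifies the maximum total size, in bytes, of the blocks in the queue. The default value is twenty million bytes. Increasing it increases the total amount of memory used by Memcheck but may detect invalid uses of freed blocks that would otherwise go undetected.

--freelist-big-blocks=<number> [default: 1000000]
When making blocks from the queue of freed blocks available for re-allocation, Memcheck recycles blocks whose size is greater than or equal to --freelist-big-blocks first. This ensures that freeing big blocks (in particular, blocks bigger than --freelist-vol) does not immediately cause all (or many) of the small blocks in the free list to be recycled. In other words, this option increases the likelihood of discovering dangling pointers to "small" blocks even when big blocks are freed.

Setting a value of 0 means that all blocks are recycled in FIFO order.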

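--workaround-gcc296-bugs=<yes|no> [default: no]
When enabled, assume that reads and writes some small distance below the stack pointer are due to bugs in GCC 2.96, and do not report them. The "small distance" is 256 bytes by default. Note that GCC 2.96 is the default compiler on some ancient Linux distributions (RedHat 7.X), so you may need to use this option. Do not use it if you do not have to, as it can cause real errors to be overlooked. A better alternative is to use a more recent GCC in which this bug is fixed.

You may also need to use this option when working with GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because GCC generates code that occasionally accesses memory below the stack pointer, particularly for floating-point to/from integer conversions. This violates the 32-bit PowerPC ELF specification, which makes no provision for locations below the stack pointer to be accessible.

This option is deprecated as of version 3.12 and may be removed from future versions. You should instead use --ignore-range-below-sp to specify the exact range of offsets below the stack pointer that should be ignored. A suitable equivalent is --ignore-range-below-sp=1024-1.

--ignore-range-below-sp=<number>-<number>
This is a more general replacement for the deprecated --workaround-gcc296-bugs option. When specified, it causes Memcheck not to report errors for accesses at the specified offsets below the stack pointer. The two offsets must be positive decimal numbers and, somewhat counterintuitively, the first one must be larger, in order to imply a non-wraparound address range to ignore. For example, to ignore 4-byte accesses at 8192 bytes below the stack pointer, use --ignore-range-below-sp=8192-8189. Only one range may be specified.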

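--show-mismatched-frees=<yes|no> [default: yes]
When enabled, Memcheck checks that heap blocks are deallocated using a function that matches the allocating function. That is, it expects free to be used to deallocate blocks allocated by malloc, delete for blocks allocated by new, and delete[] for blocks allocated by new[]. If a mismatch is detected, an error is reported. This is in general important because in some environments, freeing with a non-matching function can cause crashes.

There is, however, a scenario where such mismatches cannot be avoided: when the user provides implementations of new/new[] that call malloc and of delete/delete[] that call free, and these functions are asymmetrically inlined. For example, imagine that delete[] is inlined but new[] is not. The result is that Memcheck "sees" all delete[] calls as direct calls to free, even when the program source contains no mismatched calls.

This causes a lot of confusing and irrelevant error reports. --show-mismatched-frees=no disables these checks. It is not generally advisable to disable them, though, because you may miss real errors as a result.

--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS]
Any ranges listed in this option (multiple ranges can be specified, separated by commas) will be ignored by Memcheck's addressability checking.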

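--malloc-fill=<hexnumber>
Fills blocks allocated by malloc, new, etc., but not by calloc, with the specified byte. This can be useful when trying to shake out obscure memory corruption problems. The allocated area is still regarded by Memcheck as undefined; this option only affects its contents. Note that --malloc-fill does not affect a block of memory when it is used as an argument to the client requests VALGRIND_MEMPOOL_ALLOC or VALGRIND_MALLOCLIKE_BLOCK.

--free-fill=<hexnumber>
Fills blocks freed by free, delete, etc., with the specified byte value. This can be useful when trying to shake out obscure memory corruption problems. The freed area is still regarded by Memcheck as not valid for access; this option only affects its contents. Note that --free-fill does not affect a block of memory when it is used as an argument to the client requests VALGRIND_MEMPOOL_FREE or VALGRIND_FREELIKE_BLOCK.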

Original link: https://blog.csdn.net/u010168781/article/details/83753283

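IX. What Memcheck checks and how

I. Validity of values
1. What is the validity of a value?
The original English term is "Valid-value (V) bits".
I read it as the validity of a value: whether the data stored at a location in memory or in a CPU register is valid. For example, if the storage that a variable (int i) stands for (the location itself, not its address) has never been initialised, is it legal to use it? See the rules below.

2. Merely copying an uninitialised value without using it is not reported by Memcheck; it is treated as valid.
Example code: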

int i, j;
int a[10], b[10];
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}

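In the code above, array a is never assigned, and it is copied into array b. The code is pointless, but Memcheck does not report an error.

3. Memcheck checks the validity of a value in the following three situations:
when the value is used to generate a memory address;
when a control-flow decision has to be made;
when a system call is detected.
Example code: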

for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 ) 
  printf("hello world!\n");

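Because j is uninitialised (as is the a[] data added into it) and j is used in the if, which is a control-flow decision, Memcheck reports an error here.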
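
The two fragments can be put side by side as functions. In this sketch (copy_values and sum_is are hypothetical names), passing an uninitialised array to copy_values would not be reported, whereas passing one to sum_is would typically be reported at the if, where the value first decides a branch.

```c
#include <assert.h>
#include <stddef.h>

/* Only moves values around: Memcheck propagates their validity but does
 * not complain here, even if src holds uninitialised data. */
void copy_values(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

/* Uses the values to take a decision: if any element of a is undefined,
 * the if below is where an unoptimised build would trigger the report. */
int sum_is(const int *a, size_t n, int expected)
{
    int j = 0; /* initialised, unlike the fragment above */
    assert(a != NULL);
    for (size_t i = 0; i < n; ++i) {
        j += a[i];
    }
    if (j == expected) {
        return 1;
    }
    return 0;
}
```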

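II. Validity of addresses
The original English term is "Valid-address (A) bits".
Whether the data at a memory location is valid is what we call the validity of the value; whether that location may legally be read or written (that is, whether it is accessible) is the validity of the address.
Which addresses are valid:
1. When the program starts, all global data areas are marked as accessible (address valid).
2. When the program calls malloc or new, the allocated area is marked as accessible (address valid), while memory that has not been allocated remains invalid. After the area is freed, it is marked as inaccessible (address invalid).
3. Data on the stack, that is, local variables, has valid addresses. This is implemented by watching the stack pointer register (SP): each time SP moves, the affected area is marked as valid or invalid. The rule is that the area from SP up to the base of the stack is marked as accessible, and the area below SP is inaccessible.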

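III. Memcheck's checking machinery can be summarised as follows:
1. Every byte in memory has two properties: whether the value in it is valid, and whether the byte may be accessed.
2. When memory is read or written, the address is checked; if it is invalid, Memcheck emits an invalid read or invalid write error.
3. When memory is loaded into a CPU register, or a register is written to memory, the validity of the value is not checked.
4. When a value in a CPU register is used to generate a memory address or to decide a conditional branch, its validity is checked, and an error is emitted if any part of it is undefined.
5. Once a value has been checked, it is marked as valid so that later checks treat it as defined, which avoids long chains of repeated errors.
6. When a value is loaded from memory, Memcheck checks that the address is valid and emits an illegal-address warning if necessary. In that case the loaded value is nevertheless marked as valid, to reduce the amount of confusing information shown to the user. This avoids reporting the same access as both an invalid address and an invalid value, so the real cause is easier to pin down.
7. For multibyte loads from addresses that are partly valid and partly invalid there is a hazy boundary case. See the --partial-loads-ok option for details.
8. Memcheck intercepts the following functions: malloc, calloc, realloc, valloc, memalign, free, new, new[], delete and delete[].
8.1. malloc, new, new[]: the allocated memory is marked as addressable but without valid values. This means it has to be written before it can be read.
8.2. calloc: the allocated memory is marked as both addressable and valid, because calloc clears the area to zero.
8.3. realloc: if the new block is larger than the old one, the extra part is marked as addressable but invalid, as with malloc. If the new block is smaller, the part that was cut off is marked as unaddressable (address invalid).
8.4. free, delete, delete[]: the pointer passed to these functions must have been returned earlier by malloc, new, new[] and so on; otherwise Memcheck reports an error. If the pointer is valid, Memcheck marks the whole area it points to as unaddressable (the address is invalid after the free) and places the block in the freed-blocks queue, so as to delay its reallocation for as long as possible. Any access to the block after it has been freed triggers an invalid address error.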
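
Rule 8.3 means that after a growing realloc the new tail must be written before it is read. Here is a minimal sketch, assuming a hypothetical helper grow_zeroed that clears the tail so that every element of the result is defined:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resizes p from old_n to new_n ints.
 * realloc leaves any added elements addressable but undefined, so they are
 * zeroed here. Returns NULL (and leaves p untouched) on failure. */
int *grow_zeroed(int *p, size_t old_n, size_t new_n)
{
    assert(new_n > 0);
    int *q = realloc(p, new_n * sizeof *q);
    if (q == NULL) {
        return NULL;
    }
    if (new_n > old_n) {
        memset(q + old_n, 0, (new_n - old_n) * sizeof *q);
    }
    return q;
}
```

Allocating the initial block with calloc instead of malloc gives the same guarantee for the original elements (rule 8.2).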

Original link: https://blog.csdn.net/u010168781/article/details/83781852

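X. SGCheck (stack and global array overrun detector)

I. Overview
SGCheck is a tool for finding overruns of stack and global arrays. It works by using a heuristic approach derived from an observation about the likely forms of stack and global array accesses.
Data on the stack: for example, an array int a[10] declared inside a function, as opposed to memory allocated with malloc, which lives on the heap.
SGCheck and Memcheck are complementary: their capabilities do not overlap.
Memcheck performs bounds checks and use-after-free checks for heap arrays (such as memory allocated with malloc). It also finds uses of uninitialised values created by heap or stack allocations (value validity checks). But it does not perform bounds checking for stack or global arrays.
SGCheck performs bounds checking for stack and global arrays only, and nothing else.

II. Usage
1. SGCheck has no tool-specific command-line options. It is run as follows (this applies to the Valgrind 3.10.1 used here; the tool was removed in Valgrind 3.16):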

valgrind --tool=exp-sgcheck ./a.out

2. Example
Source of main.c:

int main()
{
	int i;
	int a[10];
	for (i=0; i<=10; ++i) // out of bounds when i == 10
	{
		a[i] = i;
	}
	return 0;
}

Compile: gcc -g main.c
Check: $ valgrind --tool=exp-sgcheck ./a.out
Error output:

==3228== exp-sgcheck, a stack and global array overrun detector
==3228== NOTE: This is an Experimental-Class Valgrind Tool
==3228== Copyright (C) 2003-2013, and GNU GPL'd, by OpenWorks Ltd et al.
==3228== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==3228== Command: ./a.out
==3228== 
==3228== Invalid write of size 4
==3228==    at 0x400502: main (main.c:7)
==3228==  Address 0xfff0003b8 expected vs actual:
==3228==  Expected: stack array "a" of size 40 in this frame
==3228==  Actual:   unknown
==3228==  Actual:   is 0 after Expected
==3228== 
==3228== 
==3228== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)

Memcheck, by contrast, cannot detect this overrun of an array on the stack:

$ valgrind ./a.out
==4212== Memcheck, a memory error detector
==4212== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4212== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==4212== Command: ./a.out
==4212== 
==4212== 
==4212== HEAP SUMMARY:
==4212==     in use at exit: 0 bytes in 0 blocks
==4212==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==4212== 
==4212== All heap blocks were freed -- no leaks are possible
==4212== 
==4212== For counts of detected and suppressed errors, rerun with: -v
==4212== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
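
For reference, the overrun in the example is fixed by letting the loop bound follow the array length. Here is a minimal sketch with a hypothetical helper fill_indices:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores i into a[i] for every valid index.
 * The bound is i < n, never i <= n, so a[n] is not touched. */
void fill_indices(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        a[i] = (int)i;
    }
}
```

For int a[10], calling fill_indices(a, sizeof a / sizeof a[0]) keeps the bound tied to the declaration.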

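III. Limitations of SGCheck
SGCheck is an experimental tool and is far from complete. It has the following limitations:
1. Missed errors
An overrun on the very first access to a stack or global array goes undetected. The first time a memory-referencing instruction accesses a stack or global array, SGCheck creates an association between that instruction and the array; only subsequent accesses by that instruction are checked against it, until the function exits. So the first access itself is never checked for overrun.
2. False positives
The code below is certainly correct, yet SGCheck reports errors for it. The workaround is to suppress them.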

int main()
{
        int a[10], b[10], *p, i;
        int q=0;
        for(i=0; i<10; i++)
        {
                p=(q==0)?&a[i]:&b[i];
                if (q==0)
                        q = 1;
                else
                        q = 0;
                *p = 42;
        }
}

Compile: gcc -g main.c
Check: $ valgrind --tool=exp-sgcheck ./a.out
Error output:

$ valgrind --tool=exp-sgcheck ./a.out
==10724== exp-sgcheck, a stack and global array overrun detector
==10724== NOTE: This is an Experimental-Class Valgrind Tool
==10724== Copyright (C) 2003-2013, and GNU GPL'd, by OpenWorks Ltd et al.
==10724== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==10724== Command: ./a.out
==10724== 
==10724== Invalid write of size 4
==10724==    at 0x400549: main (main.c:12)
==10724==  Address 0xfff000394 expected vs actual:
==10724==  Expected: stack array "a" of size 40 in this frame
==10724==  Actual:   stack array "b" of size 40 in this frame
==10724==  Actual:   is 12 after Expected
==10724== 
==10724== Invalid write of size 4
==10724==    at 0x400549: main (main.c:12)
==10724==  Address 0xfff000368 expected vs actual:
==10724==  Expected: stack array "b" of size 40 in this frame
==10724==  Actual:   stack array "a" of size 40 in this frame
==10724==  Actual:   is 40 before Expected
==10724== 
==10724== 
==10724== ERROR SUMMARY: 9 errors from 2 contexts (suppressed: 4 from 4)

3. Performance
SGCheck runs slower than Memcheck.

4. Platforms
Stack and global array checking does not work properly on PowerPC, ARM or S390X; it works only on x86 and AMD64.

Original link: https://blog.csdn.net/u010168781/article/details/83784492

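XI. Massif (heap profiler)

I. Overview
Massif is a heap profiler. It measures how much heap memory a program uses (memory allocated by malloc and similar functions). By default it does not measure all the memory used by the program; to do that, use the option --pages-as-heap=yes.

Heap profiling can help you reduce the amount of memory your program uses. If allocated memory is never freed but a pointer to it still exists, Memcheck (the leak checker) does not count it as an error. Yet if such memory keeps growing over time, it is effectively a leak too, and Massif can help identify it.

Importantly, Massif reports not only how much heap memory the program is using but also, in great detail, which parts of the program allocated it.

II. Usage
0. Source code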

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *x = (int *)malloc(sizeof(int)*10);
    free(x);
    x = (int *)malloc(sizeof(int)*10);
    int *y = (int *)malloc(sizeof(int)*10);
    free(y);
    free(x);
    return 0;
}

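1. Compile
Compile the source with gcc and the -g option. Massif places no requirements on the optimisation level.

2. Profile
Run valgrind --tool=massif --time-unit=B ./a.out, where ./a.out is the executable. When it finishes, Massif saves the profile data in the current directory in a file named massif.out.PID, where PID is the process ID.

3. View
Use ms_print massif.out.PID to view the results, which look like this: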

--------------------------------------------------------------------------------
Command:            ./a.out
Massif arguments:   --time-unit=B
ms_print arguments: massif.out.5262
--------------------------------------------------------------------------------
     B
  112^                                                ############            
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |                                                #                       
     |            @@@@@@@@@@@@            ::::::::::::#           ::::::::::: 
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
     |            @                       :           #           :           
   0 +----------------------------------------------------------------------->B
     0                                                                     336

Number of snapshots: 9
 Detailed snapshots: [2, 6 (peak)]
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1             56               56               40            16            0
  2             56               56               40            16            0
71.43% (40B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->71.43% (40B) 0x40058D: main (main.c:6)
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  3            112                0                0             0            0
  4            168               56               40            16            0
  5            224              112               80            32            0
  6            224              112               80            32            0
71.43% (80B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->35.71% (40B) 0x4005A7: main (main.c:8)
| 
->35.71% (40B) 0x4005B5: main (main.c:9)
| 
->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  7            280               56               40            16            0
  8            336                0                0             0            0

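4. Explanation
<1> The graph
In the graph, ":" marks a normal snapshot, "@" a detailed snapshot and "#" the peak snapshot. Below the graph, "Number of snapshots: 9" gives the total number of snapshots and
"Detailed snapshots: [2, 6 (peak)]" lists the detailed snapshots, with peak marking the peak snapshot.
<2> Normal snapshots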

-<a>-------<b>------------ <c>----------<d>---------------<e>-------------<f>--
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0          0                0                0             0            0
  1          56               56               40            16           0

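a. The snapshot number.
b. The time the snapshot was taken. The B in (B) means the time unit is bytes, because --time-unit=B was passed to Massif.
c. The total memory consumption.
d. The number of useful heap bytes, that is, the amount the program asked for when allocating.
e. The number of extra heap bytes. This includes the administrative bytes added per block (8 by default; this can be changed with --heap-admin) and the padding added for alignment (typically to a multiple of 8 or 16; this can be changed with --alignment).
f. The size of the stack. Stack profiling is off by default because it slows Massif down greatly, so the stack column in the example is zero (use --stacks=yes to turn it on).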
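
The numbers in the table can be reproduced with a little arithmetic. The sketch below is a simplified model and an assumption, not taken from Massif's source: it rounds the request up to the alignment and adds the per-block administrative bytes. With a 40-byte request, 8 administrative bytes and 16-byte alignment it gives 16 extra bytes and 56 bytes in total, matching snapshot 1.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model (an assumption, not Massif's implementation):
 * extra-heap = alignment padding + administrative bytes per block. */
size_t extra_heap_bytes(size_t request, size_t admin, size_t alignment)
{
    assert(alignment > 0);
    size_t padded = (request + alignment - 1) / alignment * alignment;
    return (padded - request) + admin;
}

/* total(B) for one block = useful-heap + extra-heap. */
size_t total_block_bytes(size_t request, size_t admin, size_t alignment)
{
    return request + extra_heap_bytes(request, admin, alignment);
}
```

Two live 40-byte blocks give 2 * 56 = 112 bytes under this model, which is the peak shown in snapshots 5 and 6.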

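<3> Detailed snapshots
In addition to the basic counts (the same as in a normal snapshot), a detailed snapshot provides an allocation tree that shows exactly which code allocated the heap memory: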

71.43% (40B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->71.43% (40B) 0x40058D: main (main.c:6)

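The first line names the allocation functions;
the second line gives the location in the source code.
<4> The peak snapshot
The peak snapshot has the same format as a detailed snapshot, so it is not described again.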

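5. Warning
If the program forks (so there are parent and child processes) and you set the output file name with --massif-out-file, include "%p" in the name so that the parent and child profiles are saved separately (the default name massif.out.%p already does this). Otherwise both are written to one file, which ms_print cannot read.

6. Measuring all memory
By default Massif profiles only heap memory, that is, memory allocated by malloc, calloc, realloc, memalign, new, new[] and a few similar functions. This means it does not directly measure lower-level system calls such as mmap, mremap and brk. Nor does it measure the size of the code, data and BSS segments (roughly, the global area). The numbers Massif reports may therefore be much smaller than those reported by tools such as top.
To measure all the memory the program uses, pass --pages-as-heap=yes. With this option, Massif treats every "page" allocated by mmap and similar system calls as a distinct block, in place of the normal heap blocks. This means the code, data and BSS segments and the stack are all measured.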
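
As an illustration of memory that the default heap profile does not see, the sketch below obtains a region directly from mmap. map_anonymous and unmap_region are hypothetical helper names; the mmap flags are the usual Linux ones. A program that allocates this way would be expected to show the region in Massif's output only when run with --pages-as-heap=yes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: maps len bytes of zero-filled private memory.
 * No malloc-family function is called, so the default heap profile does
 * not count these bytes. Returns NULL on failure. */
void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Hypothetical helper: releases a region obtained from map_anonymous. */
int unmap_region(void *p, size_t len)
{
    assert(p != NULL);
    return munmap(p, len);
}
```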
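Note: --stacks=yes and --pages-as-heap=yes cannot be used together.
With --pages-as-heap=yes, the ms_print output is mostly unchanged. One difference is that the start of each detailed snapshot changes from:

(heap allocation functions) malloc/new/new[], --alloc-fns, etc.

to: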

(page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.

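III. Massif command-line options
--heap=<yes|no> [default: yes]
Whether to profile the heap. The default is yes (enabled).

--heap-admin=<size> [default: 8]
The number of administrative bytes per block; these count towards the extra heap bytes described above.

--stacks=<yes|no> [default: no]
Whether to profile stack memory. Enabling it slows Massif down; it is off by default.

--pages-as-heap=<yes|no> [default: no]
Whether to measure all the memory the program uses, that is, the code, data and BSS segments and the stack; see the discussion above.
Note: --stacks=yes and --pages-as-heap=yes cannot be used together.

--depth=<number> [default: 30]
The maximum depth of the allocation trees in detailed snapshots. Increasing it makes Massif run somewhat more slowly, use more memory and produce bigger output files.

--alloc-fn=<name>
Names a function that wraps a heap allocation function.
Notes:
If malloc1 wraps malloc and malloc2 in turn wraps malloc1, specifying only --alloc-fn=malloc2 has no effect. You need to specify --alloc-fn=malloc1 as well.
For C++, overloaded function names must be written out in full and enclosed in single quotes, for example:

--alloc-fn='operator new(unsigned, std::nothrow_t const&)'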
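
A typical candidate for --alloc-fn is a checked allocation wrapper. In the sketch below, xmalloc is a hypothetical wrapper name; without the option, every allocation would be attributed to the malloc call inside xmalloc, and with --alloc-fn=xmalloc the allocation tree would be expected to start at xmalloc's callers instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper around malloc that aborts on failure.
 * Run Massif with --alloc-fn=xmalloc so that this frame is treated like
 * malloc itself and left out of the allocation tree. */
void *xmalloc(size_t size)
{
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();
    }
    return p;
}
```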

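--ignore-fn=<name>
Names a function to ignore during heap profiling: any direct heap allocation (a call to malloc, new, etc., or to a function named with --alloc-fn) made inside the named function is ignored. The rules for writing C++ function names are the same as for --alloc-fn.

--threshold=<m.n> [default: 1.0]
The significance threshold for entries in the allocation tree of a detailed snapshot, as a percentage. Entries below the threshold are not printed individually. The default is 1.0%, for example: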

99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->79.81% (8,000B) 0x80483C2: g (example.c:5)
| ->39.90% (4,000B) 0x80483E2: f (example.c:11)
| | ->39.90% (4,000B) 0x8048431: main (example.c:23)
| |   
| ->39.90% (4,000B) 0x8048436: main (example.c:25)
|   
->19.95% (2,000B) 0x80483DA: f (example.c:10)
| ->19.95% (2,000B) 0x8048431: main (example.c:23)
|   
->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)

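--peak-inaccuracy=<m.n> [default: 1.0]
Applies to the peak snapshot. Massif takes a detailed snapshot and records a new peak only when memory use exceeds the previous peak by at least 1.0% (the default). Recording a peak for every tiny increase would hurt Massif's performance.

--time-unit=<i|ms|B> [default: i]
The time unit used for profiling. There are three choices: instructions executed (i), real time in milliseconds (ms), and bytes allocated and deallocated on the heap (B). The last is very useful for small programs and for testing, because it is the most reproducible.

--detailed-freq=<n> [default: 10]
The frequency of detailed snapshots. With --detailed-freq=1, every snapshot is detailed.

--max-snapshots=<n> [default: 100]
The maximum number of snapshots recorded. If set to N, the final number of snapshots will be between N/2 and N for all but very short-running programs (those that never take more snapshots than the limit).

--massif-out-file=<file> [default: massif.out.%p]
Writes the profile data to file instead of the default output file, massif.out.<pid>.
The file name may contain %p and %q, which insert the process ID and the contents of the environment variable named after %q.
For example, given an environment variable "HELLO", the following command puts both the process ID and the value of that variable into the file name: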

valgrind --tool=massif --massif-out-file=hello.%p.%q{HELLO} ./a.out

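If %q is not followed by an environment variable name in braces, Valgrind reports the error valgrind: --massif-out-file: expected '{' after '%q'. An example of such a malformed command: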

valgrind --tool=massif --massif-out-file=hello.%p.%q ./a.out

If the environment variable named after %q does not exist, Valgrind reports: valgrind: --massif-out-file: environment variable HELLO is not set

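IV. ms_print command-line options
-h --help
Show the help message.

--version
Show the version number.

--threshold=<m.n> [default: 1.0]
Same as Massif's --threshold option, but applied after profiling instead of during it.

--x=<4..1000> [default: 72]
The width of the graph, in columns.

--y=<4..1000> [default: 20]
The height of the graph, in rows.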

Original link: https://blog.csdn.net/u010168781/article/details/83788559

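XII. DHAT: dynamic heap analysis tool

I. Overview
DHAT is a dynamic heap analysis tool: it observes the program while it runs. Massif records only how much heap memory is allocated and freed, whereas DHAT also tracks every access to heap blocks, so it can additionally report how heavily blocks are used and how long they live.
How DHAT works: it records the blocks allocated on the heap and, for every memory access, works out which recorded block (if any) is being accessed. From this it collects statistics and finally outputs the following:

the total allocation (number of bytes and blocks);
the maximum live volume during the run (number of bytes and blocks);
the average block lifetime (number of instructions between allocation and freeing);
the average number of reads and writes to each byte in the block (the "access ratios");
for allocation points that always allocate blocks of a single size, and that size is 4096 bytes or less: counts showing how often each byte offset inside the block is accessed.
These statistics make it possible to identify allocation points with the following characteristics:

potential process-lifetime leaks: blocks allocated by the point just accumulate and are freed only at the end of the run;
excessive turnover: points that chew through a lot of heap, even if the memory is freed and not held for very long;
excessively transient allocations: points whose blocks are freed very shortly after being allocated;
useless or underused allocations: blocks that are allocated but never completely filled in, or that are written but never subsequently read;
blocks with inefficient layout: areas that are never accessed, or hot fields scattered throughout the block.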
二、使用
1、例子源码

#include <stdio.h>
#include <stdlib.h>

int main()
{
	char *x = (char *)malloc(100000);

	int *i = (int*) malloc(sizeof(int));
	int *arr[1000];
	for(*i=0; *i<1000; ++(*i))
	{
		arr[*i] = (int*)malloc(sizeof(int));
		*arr[*i] = *i;
	}
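	int *j = (int*) malloc(sizeof(int)); /* never initialized before the += below; left as-is so the line numbers and counts match the log */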
	for(*i=0; *i<1000; ++(*i))
	{
		(*j) += *arr[*i];
		free(arr[*i]);
	}

	int *arr1[1000];
	for(*i=0; *i<1000; ++(*i))
	{
		arr1[*i] = (int*)malloc(sizeof(int));
		*arr1[*i] = *i;
	}
	for(*i=0; *i<1000; ++(*i))
	{
		free(arr1[*i]);
	}

	free(i);
	free(j);
	free(x);

	return 0;
}

2. Compile

gcc -g main.c

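3. Analysis
Run the command valgrind --tool=exp-dhat ./a.out (exp-dhat is the tool name in the Valgrind 3.10.1 release used here; since Valgrind 3.15 the tool is selected with --tool=dhat).
The output is as follows: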

$ valgrind --tool=exp-dhat ./a.out
==14000== DHAT, a dynamic heap analysis tool
==14000== NOTE: This is an Experimental-Class Valgrind Tool
==14000== Copyright (C) 2010-2013, and GNU GPL'd, by Mozilla Inc
==14000== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==14000== Command: ./a.out
==14000== 
==14000== 
==14000== ======== SUMMARY STATISTICS ========
==14000== 
==14000== guest_insns:  333,340
==14000== 
==14000== max_live:     104,008 in 1,003 blocks
==14000== 
==14000== tot_alloc:    108,008 in 2,003 blocks
==14000== 
==14000== insns per allocated byte: 3
==14000== 
==14000== 
==14000== ======== ORDERED BY decreasing "max-bytes-live": top 10 allocators ========
==14000== 
==14000== -------------------- 1 of 10 --------------------
==14000== max-live:    100,000 in 1 blocks
==14000== tot-alloc:   100,000 in 1 blocks (avg size 100000.00)
==14000== deaths:      1, at avg age 209,049 (62.71% of prog lifetime)
==14000== acc-ratios:  0.00 rd, 0.00 wr  (0 b-read, 0 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x400592: main (main.c:6)
==14000== 
==14000== -------------------- 2 of 10 --------------------
==14000== max-live:    4,000 in 1,000 blocks
==14000== tot-alloc:   4,000 in 1,000 blocks (avg size 4.00)
==14000== deaths:      1,000, at avg age 55,405 (16.62% of prog lifetime)
==14000== acc-ratios:  1.00 rd, 1.00 wr  (4,000 b-read, 4,000 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x4005CC: main (main.c:12)
==14000== 
==14000== Aggregated access counts by offset:
==14000== 
==14000== [   0]  2000 2000 2000 2000 
==14000== 
==14000== -------------------- 3 of 10 --------------------
==14000== max-live:    4,000 in 1,000 blocks
==14000== tot-alloc:   4,000 in 1,000 blocks (avg size 4.00)
==14000== deaths:      1,000, at avg age 49,514 (14.85% of prog lifetime)
==14000== acc-ratios:  0.00 rd, 1.00 wr  (0 b-read, 4,000 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x4006C8: main (main.c:25)
==14000== 
==14000== Aggregated access counts by offset:
==14000== 
==14000== [   0]  1000 1000 1000 1000 
==14000== 
==14000== -------------------- 4 of 10 --------------------
==14000== max-live:    4 in 1 blocks
==14000== tot-alloc:   4 in 1 blocks (avg size 4.00)
==14000== deaths:      1, at avg age 208,950 (62.68% of prog lifetime)
==14000== acc-ratios:  17004.00 rd, 4004.00 wr  (68,016 b-read, 16,016 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x4005A3: main (main.c:8)
==14000== 
==14000== Aggregated access counts by offset:
==14000== 
==14000== [   0]  21008 21008 21008 21008 
==14000== 
==14000== -------------------- 5 of 10 --------------------
==14000== max-live:    4 in 1 blocks
==14000== tot-alloc:   4 in 1 blocks (avg size 4.00)
==14000== deaths:      1, at avg age 153,940 (46.18% of prog lifetime)
==14000== acc-ratios:  1000.00 rd, 1000.00 wr  (4,000 b-read, 4,000 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x400627: main (main.c:15)
==14000== 
==14000== Aggregated access counts by offset:
==14000== 
==14000== [   0]  2000 2000 2000 2000 
==14000== 
==14000== 
==14000== 
==14000== ==============================================================
==14000== 
==14000== Some hints: (see --help for command line option details):
==14000== 
==14000== * summary stats for whole program are at the top of this output
==14000== 
==14000== * --show-top-n=  controls how many alloc points are shown.
==14000==                  You probably want to set it much higher than
==14000==                  the default value (10)
==14000== 
==14000== * --sort-by=     specifies the sort key for output.
==14000==                  See --help for details.
==14000== 
==14000== * Each allocation stack, by default 12 frames, counts as
==14000==   a separate alloc point.  This causes the data to be spread out
==14000==   over far too many alloc points.  I strongly suggest using
==14000==   --num-callers=4 or some such, to reduce the spreading.
==14000==

III. Interpreting the results
1. Summary statistics

==14000== ======== SUMMARY STATISTICS ========
==14000== guest_insns:  333,340
==14000== max_live:     104,008 in 1,003 blocks
==14000== tot_alloc:    108,008 in 2,003 blocks
==14000== insns per allocated byte: 3

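guest_insns: total number of instructions executed by the program;
max_live: maximum amount of heap memory live at any one time during the run;
tot_alloc: cumulative amount of heap memory allocated over the whole run;
insns per allocated byte: average number of instructions executed per byte of heap allocated (here 333,340 / 108,008 ≈ 3).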

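2. Per-allocation-point information
A large program has many allocation points, and DHAT generally does not list them all. It sorts the points in decreasing order of the chosen metric (max-bytes-live by default) and shows the top n. n defaults to 10 and can be set with --show-top-n; a value of a few hundred is typical.
Here is the detailed information for one allocation point: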

==14000== ======== ORDERED BY decreasing "max-bytes-live": top 10 allocators ========
==14000== -------------------- 1 of 10 --------------------
==14000== max-live:    100,000 in 1 blocks
==14000== tot-alloc:   100,000 in 1 blocks (avg size 100000.00)
==14000== deaths:      1, at avg age 209,049 (62.71% of prog lifetime)
==14000== acc-ratios:  0.00 rd, 0.00 wr  (0 b-read, 0 b-written)
==14000==    at 0x4C28EF0: malloc (vg_replace_malloc.c:296)
==14000==    by 0x400592: main (main.c:6)
==14000==

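max-live: maximum amount of memory from this allocation point live at any one time;
tot-alloc: cumulative amount of memory allocated at this point over the whole run;
deaths: number of blocks freed, together with their average age at freeing, measured in instructions;
acc-ratios: average number of reads and writes per byte of the blocks allocated at this point.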

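IV. Command-line options
--show-top-n=<number> [default: 10]
Sets how many allocation points are shown. The default of 10 is rather small; a value of at least a few hundred is usually more useful.

--sort-by=<string> [default: max-bytes-live]
At the end of the run, DHAT sorts the accumulated allocation points by some metric and shows the highest-scoring entries. --sort-by selects the metric used for sorting:

max-bytes-live: maximum live heap memory, in bytes [the default];
tot-bytes-allocd: cumulative heap memory allocated over the run, in bytes;
max-blocks-live: maximum live heap memory, in blocks;
tot-blocks-allocd: cumulative heap memory allocated over the run, in blocks.
Tips:

Sorting by max-blocks-live tends to bring out allocation points that create large numbers of small objects.
Using --num-callers=4 or lower makes the allocation points easier to analyze.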

Original link: https://blog.csdn.net/u010168781/article/details/83861592

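XIII. Helgrind (thread error detector)

I. Overview
Helgrind detects synchronization errors in C, C++ and Fortran programs that use the POSIX pthreads threading primitives.

The main abstractions in POSIX pthreads are: a set of threads sharing a common address space, thread creation, thread joining, thread exit, mutexes (locks), condition variables (inter-thread event notification), reader-writer locks, spinlocks, semaphores and barriers.

Helgrind can detect three classes of errors:

Misuse of the POSIX pthreads API;
Potential deadlocks;
Data races: accessing memory without adequate locking or synchronization.
Problems like these often result in unreproducible, timing-dependent crashes, deadlocks and other misbehaviour, and can be difficult to find by other means.

II. Usage
Compile: gcc -g -pthread main.c
Run: valgrind --tool=helgrind ./a.out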

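III. Error messages in detail
1. Misuse of the POSIX pthreads API
Helgrind intercepts calls to many POSIX pthreads functions and is therefore able to report on various common problems. Although many of these errors look harmless, they can lead to undefined program behaviour and to bugs that are hard to find later. The detected errors are:

a. Unlocking an invalid mutex. The error message looks like this: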

==10045== Thread #1 unlocked an invalid lock at 0xFFEFFFBA0
==10045==    at 0x4C329D6: pthread_mutex_unlock (hg_intercepts.c:707)
==10045==    by 0x4009D7: main (main.c:28)

b. Unlocking a not-locked mutex

==10045== Thread #1 unlocked a not-locked lock at 0x6010E0
==10045==    at 0x4C329D6: pthread_mutex_unlock (hg_intercepts.c:707)
==10045==    by 0x4009F0: main (main.c:31)
==10045==  Lock at 0x6010E0 was first observed
==10045==    at 0x4C321AA: pthread_mutex_init (hg_intercepts.c:518)
==10045==    by 0x4009E6: main (main.c:30)
==10045==  Address 0x6010e0 is 0 bytes inside data symbol "g_lock"

c. Unlocking a mutex held by a different thread
d. Destroying an invalid or a locked mutex

==10073== Thread #1: pthread_mutex_destroy of a locked mutex
==10073==    at 0x4C32268: pthread_mutex_destroy (hg_intercepts.c:553)
==10073==    by 0x4009D9: main (main.c:43)

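e. Recursively locking a non-recursive mutex
f. Deallocating memory that contains a locked mutex
g. Passing a pthread_mutex_t argument to a function expecting a pthread_rwlock_t argument, and vice versa
h. A POSIX pthreads function failing with an error code that must be handled
i. A thread exiting while still holding locked locks
j. Calling pthread_cond_wait with a not-locked mutex, an invalid mutex, or a mutex locked by a different thread
k. Inconsistent bindings between a condition variable and its associated mutex
l. Invalid or duplicate initialisation of a pthread barrier (pthread_barrier_init)
m. Calling pthread_barrier_destroy on a barrier that was never initialised with pthread_barrier_init, or on which threads are still waiting in pthread_barrier_wait
n. Calling pthread_barrier_wait on a barrier that has not been initialised with pthread_barrier_init
o. Any error code returned by the system threading library, even if Helgrind itself detected no error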

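2. Deadlocks
The most common cause of deadlock is as follows:
Suppose a shared resource R may only be accessed while both locks L1 and L2 are held. If threads T1 and T2 both want to access R, and T1 acquires L1 first while T2 acquires L2 first, then T1 blocks waiting for L2 and T2 blocks waiting for L1. Neither can proceed, and the program is deadlocked.
How Helgrind checks for deadlocks:
Helgrind builds a directed graph indicating the order in which locks have been acquired in the past. When a thread acquires a new lock, the graph is updated and then checked for a cycle. A cycle indicates that the locks involved could deadlock.
Typically, Helgrind picks two locks involved in the potential deadlock and prints the relevant details: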

==8039== Thread #1: lock order "0x6010E0 before 0x601120" violated
==8039== 
==8039== Observed (incorrect) order is: acquisition of lock at 0x601120
==8039==    at 0x4C32536: pthread_mutex_lock (hg_intercepts.c:593)
==8039==    by 0x4009BB: main (main.c:41)
==8039== 
==8039==  followed by a later acquisition of lock at 0x6010E0
==8039==    at 0x4C32536: pthread_mutex_lock (hg_intercepts.c:593)
==8039==    by 0x4009C5: main (main.c:42)
==8039== 
==8039== Required order was established by acquisition of lock at 0x6010E0
==8039==    at 0x4C32536: pthread_mutex_lock (hg_intercepts.c:593)
==8039==    by 0x4008EC: fun (main.c:16)
==8039==    by 0x4C30FA6: mythread_wrapper (hg_intercepts.c:234)
==8039==    by 0x4E45183: start_thread (pthread_create.c:312)
==8039==    by 0x515903C: clone (clone.S:111)
==8039== 
==8039==  followed by a later acquisition of lock at 0x601120
==8039==    at 0x4C32536: pthread_mutex_lock (hg_intercepts.c:593)
==8039==    by 0x4008F6: fun (main.c:17)
==8039==    by 0x4C30FA6: mythread_wrapper (hg_intercepts.c:234)
==8039==    by 0x4E45183: start_thread (pthread_create.c:312)
==8039==    by 0x515903C: clone (clone.S:111)
==8039== 
==8039==  Lock at 0x6010E0 was first observed
==8039==    at 0x4C321AA: pthread_mutex_init (hg_intercepts.c:518)
==8039==    by 0x40096F: main (main.c:30)
==8039==  Address 0x6010e0 is 0 bytes inside data symbol "g_lock1"
==8039== 
==8039==  Lock at 0x601120 was first observed
==8039==    at 0x4C321AA: pthread_mutex_init (hg_intercepts.c:518)
==8039==    by 0x40097E: main (main.c:31)
==8039==  Address 0x601120 is 0 bytes inside data symbol "g_lock2"

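3. Data races
A data race can occur when two threads access a shared memory location without using suitable locks or other synchronization to ensure that only one thread accesses it at a time.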

Original link: https://blog.csdn.net/u010168781/article/details/83865659

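XIV. Cachegrind (cache and branch-prediction profiler)

I. Overview
Cachegrind simulates the first-level caches (I1 and D1) and the second-level cache of a CPU, and can pinpoint exactly where a program misses and hits in the cache. It can also report the number of cache misses, memory references and instructions executed for each line of source code, each function, each module and the whole program, which is a great help when optimizing.

Cachegrind simulates how a program interacts with the machine's cache hierarchy and, optionally, its branch predictor. It models a CPU with independent first-level instruction and data caches (I1 and D1), backed by a second-level cache (L2). This exactly matches the configuration of many CPUs.

However, some newer CPUs have three or four levels of cache. In that case Cachegrind simulates the first-level and last-level caches. The reason is that the last-level cache has the most influence on run time, while the L1 caches often have low associativity, so simulating them can detect cases where the code interacts badly with them.

Cachegrind gathers the following statistics (the abbreviation used for each is given in parentheses):
I cache reads (Ir, which equals the number of instructions executed), I1 cache read misses (I1mr) and LL cache instruction read misses (ILmr).
D cache reads (Dr, which equals the number of memory reads), D1 cache read misses (D1mr) and LL cache data read misses (DLmr).
D cache writes (Dw, which equals the number of memory writes), D1 cache write misses (D1mw) and LL cache data write misses (DLmw).
Conditional branches executed (Bc) and conditional branches mispredicted (Bcm).
Indirect branches executed (Bi) and indirect branches mispredicted (Bim).
Note that the total number of D1 misses is given by D1mr + D1mw, and the total number of LL misses by ILmr + DLmr + DLmw.

On a modern CPU, an L1 miss typically costs around 10 cycles, an LL miss can cost as much as 200 cycles, and a mispredicted branch costs in the region of 10 to 30 cycles. Detailed cache and branch profiling is therefore very useful for understanding how a program interacts with the CPU and how to make it run faster.

Also, since one instruction cache read is performed per instruction executed, the number of instructions executed per line of code can be found as well.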

II. Usage
1. Example source code, main.c

#include <stdio.h>

int main()
{
	printf("hello\n");
	return 0;
}

2. Compile

gcc -g main.c

3. Profiling command

$ valgrind --tool=cachegrind  ./a.out

4. Output

==5968== Cachegrind, a cache and branch-prediction profiler
==5968== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote et al.
==5968== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==5968== Command: ./a.out
==5968== 
--5968-- warning: L3 cache found, using its data for the LL simulation.
hello
==5968== 
==5968== I   refs:      106,417
==5968== I1  misses:        777
==5968== LLi misses:        765
==5968== I1  miss rate:    0.73%
==5968== LLi miss rate:    0.71%
==5968== 
==5968== D   refs:       40,230  (26,892 rd   + 13,338 wr)
==5968== D1  misses:      1,760  ( 1,253 rd   +    507 wr)
==5968== LLd misses:      1,555  ( 1,075 rd   +    480 wr)
==5968== D1  miss rate:     4.3% (   4.6%     +    3.8%  )
==5968== LLd miss rate:     3.8% (   3.9%     +    3.5%  )
==5968== 
==5968== LL refs:         2,537  ( 2,030 rd   +    507 wr)
==5968== LL misses:       2,320  ( 1,840 rd   +    480 wr)
==5968== LL miss rate:      1.5% (   1.3%     +    3.5%  )

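5. Cachegrind command-line options
--I1=<size>,<associativity>,<line size>
Specifies the size, associativity and line size of the level 1 instruction cache.

--D1=<size>,<associativity>,<line size>
Specifies the size, associativity and line size of the level 1 data cache.

--LL=<size>,<associativity>,<line size>
Specifies the size, associativity and line size of the last-level cache.

--cache-sim=no|yes [yes]
Enables or disables collection of cache access and miss counts.

--branch-sim=no|yes [no]
Enables or disables collection of branch instruction and misprediction counts. It is disabled by default because it slows Cachegrind down by approximately 25%. Note that --cache-sim=no and --branch-sim=no cannot be used together, since that would leave Cachegrind with nothing to collect.

--cachegrind-out-file=<file>
Writes the profile data to file instead of the default output file, cachegrind.out.<pid>.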

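6. cg_annotate command-line options
-h --help
Show the help message.

--version
Show the version number.

--show=A,B,C [default: all, using order in cachegrind.out.<pid>]
Specifies which events to show (and the column order). The default is to use all events present in the cachegrind.out.<pid> file, in the order they appear there. Useful in combination with --sort.

--sort=A,B,C [default: order in cachegrind.out.<pid>]
Specifies the events on which the sorting of the function-by-function entries is based.

--threshold=X [default: 0.1%]
Sets the threshold for the function-by-function summary: a function is shown if it accounts for more than X% of the counts for the primary sort event.

--auto=<no|yes> [default: no]
When enabled, automatically annotates every file mentioned in the function-by-function summary that can be found. Also lists those that could not be found.

--context=N [default: 8]
Prints N lines of context before and after each annotated line. This avoids printing large sections of source files that were not executed. Use a large number (e.g. 100000) to show all source lines.

-I<dir> --include=<dir> [default: none]
Adds a directory to the list in which to search for files. Multiple -I/--include options can be given to add multiple directories.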
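
7. cg_merge command-line options
-o outfile
Writes the profile data to outfile rather than to standard output.

8. cg_diff command-line options
-h --help
Show the help message.

--version
Show the version number.

--mod-filename=<expr> [default: none]
Specifies a search-and-replace expression that is applied to all file names. Useful for removing minor differences in file names.

--mod-funcname=<expr> [default: none]
Like --mod-filename, but applied to function names. Useful for removing minor differences in the randomized names of auto-generated functions produced by some compilers.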

Original link: https://blog.csdn.net/u010168781/article/details/84137730

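XV. Callgrind (call-graph profiler)

I. Overview
1. Callgrind
Callgrind records the call history among the functions of a program in order to profile its performance. By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationships between functions, and the number of such calls. Optionally, cache simulation and branch prediction (similar to Cachegrind) can be enabled as well.

2. callgrind_annotate and callgrind_control
The profile data is written out to a file when the program terminates. Two command-line tools are provided for presenting the data and for interactive control of the profiling:

callgrind_annotate

This command reads in the profile data and prints a sorted list of functions, optionally with source annotation.
For graphical visualization of the data, KCachegrind can be used.

callgrind_control

This command lets you interactively observe and control the status of a program currently running under Callgrind's control, without stopping the program.

3. Functionality
The previous chapter introduced Cachegrind, which collects event counts (data reads, cache misses and so on). Callgrind additionally attributes costs across function calls: if function foo calls bar, the cost of bar is added to the cost of foo (its inclusive cost). With callgrind_annotate or KCachegrind you can view the call graph starting from main together with the cost at each node, which helps with optimizing the code.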

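Callgrind's ability to detect function calls and returns depends on the instruction set of the platform it runs on. It works best on x86 and amd64, and unfortunately currently does not work so well on PowerPC, ARM, Thumb or MIPS code. This is because those instruction sets have no explicit call or return instructions, so Callgrind has to rely on heuristics to detect calls and returns.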

II. Usage
1. Example source code, main.c

#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

static int g_i = 0;
static pthread_mutex_t g_lock1;

static int num=1;

/* Worker thread: decrements the shared counter while holding the lock. */
static void *fun(void * arg)
{
	(void)arg;
	int i;
	for(i=0; i<num; ++i)
	{
		pthread_mutex_lock(&g_lock1);
		printf("pthread[%d]:i=%d\n",getpid(), --g_i);
		pthread_mutex_unlock(&g_lock1);

		sleep(1);
	}
	return NULL;	/* a function returning void * must return a value */
}

int main()
{
	pthread_t t;
	int err;

	pthread_mutex_init(&g_lock1, NULL);

	/* pthread functions return 0 on success and an error number on failure, not -1 */
	if ((err = pthread_create(&t, NULL, fun, NULL)) != 0)
	{
		printf("pthread_create error: %s\n", strerror(err));
		return 1;
	}

	int i;
	for(i=0; i<num; ++i)
	{
		pthread_mutex_lock(&g_lock1);
		printf("main[%d]:i=%d\n",getpid(), ++g_i);
		pthread_mutex_unlock(&g_lock1);
		
		sleep(1);
	}

	if ((err = pthread_join(t, NULL)) != 0)
	{
		printf("pthread_join error: %s\n", strerror(err));
	}
	pthread_mutex_destroy(&g_lock1);
	return 0;
}

2. Compile

gcc -g main.c -pthread

3. Profiling command

valgrind --tool=callgrind  ./a.out

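4. Viewing the results
After the command above finishes, a file named callgrind.out.<pid> is created in the current directory. Open it with the KCachegrind visualizer: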

kcachegrind callgrind.out.21479

5. Generating a call-graph image
Use gprof2dot.py and dot to generate an image:

python gprof2dot.py -f callgrind -n10 -s callgrind.out.21479> valgrind.dot
dot -Tpng valgrind.dot -o valgrind.png

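III. Command-line options in detail
--callgrind-out-file=<file>
Writes the profile data to the specified file instead of the default output file, callgrind.out.<pid>. The %p and %q format specifiers can be used; see --log-file.

--dump-line=<no|yes> [default: yes]
Performs event counting at source-line granularity. This requires debug information, so compile with gcc -g.

--dump-instr=<no|yes> [default: no]
Performs event counting at per-instruction granularity. The results can only be displayed in KCachegrind.

--compress-strings=<no|yes> [default: yes]
Replaces identifiers (file and function names) with numbers in the output file.

--compress-pos=<no|yes> [default: yes]
Selects whether positions in the output file are written as absolute values or as values relative to the previous position.

--combine-dumps=<no|yes> [default: no]
When enabled, all dumps are written to the same output file. Not recommended.

--dump-every-bb=<count> [default: 0, never]
Dumps the profile data to a new file every count basic blocks executed.

--dump-before=<function>
Dumps the profile data to a new file when entering the specified function.

--zero-before=<function>
Zeroes all costs when entering the specified function.

--dump-after=<function>
Dumps the profile data to a new file when leaving the specified function.

--instr-atstart=<yes|no> [default: yes]
Specifies whether Callgrind should start simulation and profiling from the beginning of the program.

--collect-atstart=<yes|no> [default: yes]
Specifies whether event collection is enabled at the beginning of the profile run.

--toggle-collect=<function>
Toggles event collection on entering and on leaving the specified function.

--collect-jumps=<no|yes> [default: no]
Specifies whether information on conditional jumps should be collected.

--collect-systime=<no|yes> [default: no]
Specifies whether information on system call times should be collected.

--collect-bus=<no|yes> [default: no]
Specifies whether the number of global bus events executed should be collected.

--separate-threads=<no|yes> [default: no]
Specifies whether profile data should be generated separately for each thread. If so, "-threadID" is appended to the file name.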

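--separate-callers=<callers> [default: 0]
Separates contexts by at most <callers> functions in the call chain.

--separate-callers<number>=<function>
Separates <number> callers for the given function.

--separate-recs=<level> [default: 2]
Separates function recursions by at most <level> levels.

--separate-recs<number>=<function>
Separates <number> recursion levels for the given function.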

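--skip-plt=<no|yes> [default: yes]
Ignores calls to and from PLT sections.

--skip-direct-rec=<no|yes> [default: yes]
Ignores direct recursions.

--fn-skip=<function>
Ignores calls to and from the given function. For example, given a call chain A > B > C, if function B is specified to be ignored, you will only see A > C.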

Original link: https://blog.csdn.net/u010168781/article/details/84303954