Service memory usage analysis with jeprof - CSDN Blog

Article link: https://blog.csdn.net/XYY_CN/article/details/137058638

Process memory usage analysis

1 Common commands for checking process memory
2 Detailed memory breakdown

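Sometimes we need to find the hot spots in a process's memory usage so that we can optimize them, or to track how its memory changes over time to judge whether it leaks. Both tasks start at the process level: how much memory is the service actually using, and is it approaching the maximum the system allows the user? From there we usually need to dig further into which code consumes the most memory. This information is the prerequisite for any later analysis and optimization. This article describes how to check a process's memory size on Linux, and how to use valgrind massif and jeprof to analyze memory hot spots and trends.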

1 Common commands for checking process memory

1.1 ps

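ps aux | grep redis: lists the matching processes; columns 5 and 6 of the output are VSZ (virtual memory size) and RSS (resident set size), both in KiB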

root      260337  0.0  0.1  68836 11640 ?        Ssl   2023 444:56 redis-server *:6380 [cluster]
root      260343  0.0  0.1  68476 11284 ?        Ssl   2023 440:08 redis-server *:6381 [cluster]
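The raw KiB figures are easier to compare once converted. Below is a minimal sketch, assuming the default `ps aux` column layout (PID in field 2, RSS in field 6, command in field 11). The helper name rss_report is made up for this illustration.

```shell
# rss_report: read `ps aux`-style lines on stdin and print the PID, the RSS
# converted from KiB to MiB, and the command name.
# Assumes the default `ps aux` layout: field 2 = PID, field 6 = RSS (KiB),
# field 11 = start of the command.
rss_report() {
    awk '$6 ~ /^[0-9]+$/ { printf "%s %.1f MiB %s\n", $2, $6 / 1024, $11 }'
}

# Sample lines copied from the ps output above.
rss_report <<'EOF'
root      260337  0.0  0.1  68836 11640 ?        Ssl   2023 444:56 redis-server *:6380 [cluster]
root      260343  0.0  0.1  68476 11284 ?        Ssl   2023 440:08 redis-server *:6381 [cluster]
EOF
```

On a live system the same helper can be fed directly, for example with ps aux | grep redis | rss_report.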

1.2 top

top -p $pid

Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.3 us,  1.3 sy,  0.0 ni, 95.7 id,  0.5 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :   7768.7 total,    259.9 free,   4510.2 used,   2998.6 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   2959.8 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                                                                 
 260337 root      20   0   68836  11640   3048 S   0.0   0.1 444:56.89 redis-server

1.3 pidstat

pidstat -p $pid 1 -r: prints the process's memory usage once every second

Linux 4.18.0-305.3.1.el8.x86_64 (VM-32-17-centos)       2024年03月26日  _x86_64_        (2 CPU)
22时24分01秒   UID       PID  minflt/s  majflt/s     VSZ     RSS   %MEM  Command
22时24分02秒     0    260337      0.00      0.00   68836   11640   0.15  redis-server
22时24分03秒     0    260337      0.00      0.00   68836   11640   0.15  redis-server
22时24分04秒     0    260337      0.00      0.00   68836   11640   0.15  redis-server
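When the goal is to spot a leak, what matters is whether RSS keeps growing across samples. The following is a rough sketch, assuming the `pidstat -r` column order shown above and a command name without spaces. The helper name rss_delta is hypothetical.

```shell
# rss_delta PID: read `pidstat -r` output on stdin and print the change in RSS
# (KiB) between the first and the last sample for PID.
# Columns are counted from the end of the line (Command, %MEM, RSS, VSZ, ...)
# because the width of the timestamp depends on the locale.
# Assumption: the command name contains no spaces.
rss_delta() {
    awk -v pid="$1" '
        NF >= 8 && $1 != "Average:" && $(NF-6) == pid && $(NF-2) ~ /^[0-9]+$/ {
            if (!seen) { first = $(NF-2); seen = 1 }
            last = $(NF-2)
        }
        END { if (seen) print last - first; else print "no samples" }
    '
}
```

A possible invocation is LC_ALL=C pidstat -r -p "$pid" 1 60 | rss_delta "$pid", which samples once per second for a minute. A result that stays clearly positive over repeated runs suggests growth that deserves a closer look.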

2 Detailed memory breakdown

2.1 valgrind massif

valgrind provides massif, a heap profiler that helps us analyze a process's memory usage and how it changes over time.

How to use it

For example, suppose the program is named demo and is started with ./bin/demo conf.json. The memory analysis report can then be produced in two steps:

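Start the process under massif to begin profiling its memory:
valgrind -v --tool=massif --detailed-freq=10 --depth=10 --threshold=1 --massif-out-file=./massif.out ./bin/demo conf.json
After startup the tool takes memory snapshots. Every 10th snapshot is a detailed one, the allocation tree is at most 10 levels deep, and entries that account for less than 1% of memory are not shown in the tree. When we then stop the process, the massif.out file is written.
ms_print massif.out: prints the analysis report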
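The two steps can be wrapped in one small helper. This is only a sketch: it reuses exactly the flags from the command above, and the function name run_massif is made up here.

```shell
# run_massif OUT_FILE COMMAND [ARGS...]: profile COMMAND under massif, then
# print the report with ms_print.
# Note: plain POSIX sh has no local variables, so out_file is a global here.
run_massif() {
    out_file=$1
    shift
    valgrind --tool=massif --detailed-freq=10 --depth=10 --threshold=1 \
        --massif-out-file="$out_file" "$@"
    # The profiled program is often stopped by hand, so its exit status is
    # ignored and the report is printed from whatever was recorded.
    ms_print "$out_file"
}
```

With the demo program from the text, the call would be run_massif ./massif.out ./bin/demo conf.json.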

How to read the report

The report begins with some information about the process and a chart of memory usage

--------------------------------------------------------------------------------
Command:            ./bin/demo  conf.json
Massif arguments:   (none)
ms_print arguments: massif.out
--------------------------------------------------------------------------------
    GB
25.66^                                                                       :
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                    ::@#
     |                              @@:::::::@:@:::::::::::::::::@::::::::: @#
     |                          ::::@ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |                   @::::::: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |                  @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |               :::@@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |               :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |             :::: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |          :::: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |       @@@: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |     ::@ @: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |    @: @ @: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     |  @@@: @ @: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     | @@ @: @ @: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
     | @@ @: @ @: :: :: @@:: : :: : @ :: :: :@:@: : : :: :  ::: :@: :: :: : @#
   0 +----------------------------------------------------------------------->Ti
     0                                                                   2.065

Number of snapshots: 75
 Detailed snapshots: [1, 2, 3, 5, 6, 12, 13, 22, 28, 30, 41, 48, 54, 64, 68, 69, 70, 71 (peak)]

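The table at the top records the startup parameters. Next comes a chart whose vertical axis is memory size (GB in the chart above) and whose horizontal axis is, by default, the number of instructions executed. The --time-unit option can change the horizontal axis to ms (time in milliseconds) or B (bytes allocated, which shows the intermediate stages more clearly when the process runs only briefly). The chart uses three kinds of symbols: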

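: marks a normal snapshot, which records only summary information
@ marks a detailed snapshot, which records where the memory is being used
# marks the snapshot taken at the memory peak.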

Number of snapshots: how many memory snapshots the tool took in total
Detailed snapshots: which of those snapshots contain detailed information

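Normal snapshots
A normal snapshot records the total memory requested, the memory actually usable by the program, and the extra memory allocated for alignment and allocator header information.
total = useful-heap + extra-heap (when stack space is not recorded)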

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  4 185,986,340,777   16,683,855,416   15,855,594,450   828,260,966            0
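The identity total = useful-heap + extra-heap can be checked against this row, and the share of allocator overhead falls out of the same numbers. A small sketch follows. The helper name heap_overhead is hypothetical, and the byte counts are the ones from snapshot 4 with the commas removed.

```shell
# heap_overhead USEFUL EXTRA: print the total heap size (useful + extra) and
# the percentage of that total that is allocator overhead (extra-heap).
heap_overhead() {
    awk -v useful="$1" -v extra="$2" 'BEGIN {
        total = useful + extra
        printf "total=%.0f extra=%.2f%%\n", total, 100 * extra / total
    }'
}

# Snapshot 4 from the table above.
heap_overhead 15855594450 828260966
# prints: total=16683855416 extra=4.96%
```

The printed total matches the total(B) column, so about 5% of the heap in this snapshot is alignment and header overhead rather than data the program asked for.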

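Detailed snapshots
The output below was trimmed by hand so that it stays readable. Snapshot 5 is a detailed snapshot, which records where each part of the memory is consumed. In the output below, the useful heap is about 17.6 GB (94.60% of the 18.6 GB total). 19.89% is spread over places that each fall below the 1% threshold and are therefore not listed individually. 15.16% comes from CAhoCorasick::initialize(), which is called from loaddata and loadDict. This makes it easy to pinpoint which functions and which operations use the most memory, so that optimization can be targeted.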

  5 231,717,477,787   18,627,986,072   17,621,534,931 1,006,451,141            0
94.60% (17,621,534,931B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->19.89% (3,705,401,934B) in 1580 places, all below massif's threshold (1.00%)
| 
->15.16% (2,824,867,416B) 0xA5DA1F7: CAhoCorasick::initialize(int, int const*, char const**) (in /usr/lib64/xx.so.0.0.0)
| ->08.11% (1,510,635,112B) 0x6D7ED50: loaddata(char const*) (xx.cpp:28)
| | ->08.11% (1,510,635,112B) 0x6DC6D2D: operator() (xx.cpp:1177)       
| ->07.06% (1,314,232,304B) 0x5A910A: loadDict(char const*) (yy.cpp:29)
|         
->14.38% (2,679,624,312B) 0x16DCDE5C: re_node_set_merge (in /usr/lib64/libc-2.17.so)
| ->14.38% (2,679,624,312B) 0x16DDA7B4: calc_eclosure_iter (in /usr/lib64/libc-2.17.so)
|
->08.95% (1,666,926,376B) 0x16DDE339: re_compile_internal (in /usr/lib64/libc-2.17.so)
| ->08.95% (1,666,926,376B) 0x16DDF36F: regcomp (in /usr/lib64/libc-2.17.so)
|   ->08.37% (1,558,788,552B) 0x6EAE61C: InitJob(char const*) (a.cpp:4513)
|   | ->08.37% (1,558,788,552B) 0x6EAF1B2: InitRules(char const*) (a.cpp:229)
|   
->07.85% (1,461,641,324B) 0x1659CA18: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
| ->02.74% (509,998,183B) 0xC6DE2E0: char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag) (in /usr/lib64/libboost_regex.so.1.53.0)
| | ->02.25% (418,530,522B) 0x1659E6D7: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib64/libstdc++.so.6.0.19)
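For a long report it can help to list only the top-level allocation sites, largest first. One possible sketch, assuming the ms_print layout shown above, where top-level entries start with "->" in the first column and nested callers start with "|". The helper name massif_top_sites is made up for this example.

```shell
# massif_top_sites: read the tree of one detailed snapshot on stdin and print
# its top-level entries sorted by their share of memory, largest first.
# Nested callers (lines starting with "|") are dropped.
massif_top_sites() {
    grep '^->' | sed 's/^->//' | LC_ALL=C sort -rn
}
```

Feeding it the tree of snapshot 5 above would list the 19.89% below-threshold bucket first, followed by CAhoCorasick::initialize at 15.16%, re_node_set_merge at 14.38%, and so on. If the whole ms_print output is piped in, entries from all detailed snapshots are mixed together, so it is better to cut out one snapshot first.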

Some important options

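--max-snapshots: the maximum number of snapshots; once it is exceeded, older snapshots are discarded
--main-stacksize: sets the stack size of the main thread
--depth: the depth of the allocation tree, 30 by default
--threshold: entries below this threshold are no longer recorded in the tree; the default is 1.0, i.e. 1%
--detailed-freq: how often a detailed snapshot is recorded; the default is 10, i.e. every 10th snapshot is a detailed one
--massif-out-file: the location of the output file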

2.2 jeprof

jeprof is a tool that ships with jemalloc and can likewise be used to analyze memory hot spots.

Installing jeprof

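When building jemalloc, the profiling support that jeprof relies on has to be enabled explicitly: add the --enable-prof option to the configure line in autogen.sh. Passing --enable-prof to ./configure directly should have the same effect.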

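Usage without code changes

Early in a service's design, people rarely plan to call jemalloc's mallctl interface to analyze memory. More often, jeprof comes to mind only after a memory problem has appeared, and at that point we want to analyze without modifying the code. This is done through an environment variable: jemalloc parses MALLOC_CONF when it initializes.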
export MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17
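Some common configuration options:

prof: enables profiling
lg_prof_interval: a profile is dumped roughly every 2^30 bytes of allocation activity (for the value 30 used above)
lg_prof_sample: one allocation sample is taken roughly every 2^17 bytes allocated (for the value 17 used above); the samples are recorded in the dump files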
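Both lg_ options take a base-2 logarithm, so it is worth converting them back to bytes when choosing values. A tiny sketch with a made-up helper name, lg_to_bytes:

```shell
# lg_to_bytes N: print 2^N, the byte count that an lg_* option value N stands for.
lg_to_bytes() {
    echo $((1 << $1))
}

lg_to_bytes 30   # lg_prof_interval:30 -> 1073741824 bytes (1 GiB)
lg_to_bytes 17   # lg_prof_sample:17   -> 131072 bytes (128 KiB)
```

So with the settings above, a dump is written about once per GiB of allocation activity, and roughly one allocation is sampled per 128 KiB allocated. Smaller values give finer detail at the cost of more overhead and more dump files.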

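Usage with code changes

The mallctl interface exposes controls, such as prof.dump, that the program can call to dump heap profiles, for example at regular intervals.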

Analyzing the output

Either of the two approaches above produces a number of heap profile files.

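View interactively:
jeprof program heap_file
Generate a call graph file:
pdf, svg, and other formats are available; the output is written to stdout, so redirect it to a file
jeprof --pdf program heap_file > heap_file.pdf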
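For the leak question raised at the start, comparing two dumps is often more telling than looking at one. A minimal sketch using the --text and --base options from jeprof's help output; the function name jeprof_diff and the dump file names in the usage note are hypothetical.

```shell
# jeprof_diff PROGRAM BASE_HEAP NEW_HEAP: print a text report of NEW_HEAP with
# BASE_HEAP subtracted, i.e. what has grown between the two dumps.
jeprof_diff() {
    jeprof --text --base="$2" "$1" "$3"
}
```

For example, jeprof_diff ./bin/demo first.heap later.heap, where first.heap and later.heap stand for an early and a late dump of the same run. Call sites that keep showing large positive values across several such comparisons are the usual leak suspects.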

Some jeprof options

Options:
   --cum               Sort by cumulative data
   --base=<base>       Subtract <base> from <profile> before display
   --interactive       Run in interactive mode (interactive "help" gives help) [default]
   --seconds=<n>       Length of time for dynamic profiles [default=30 secs]
   --add_lib=<file>    Read additional symbols and line info from the given library
   --lib_prefix=<dir>  Comma separated list of library path prefixes

Reporting Granularity:
   --addresses         Report at address level
   --lines             Report at source line level
   --functions         Report at function level [default]
   --files             Report at source file level

Output type:
   --text              Generate text report
   --callgrind         Generate callgrind format to stdout
   --gv                Generate Postscript and display
   --evince            Generate PDF and display
   --web               Generate SVG and display
   --list=<regexp>     Generate source listing of matching routines
   --disasm=<regexp>   Generate disassembly of matching routines
   --symbols           Print demangled symbol names found at given addresses
   --dot               Generate DOT file to stdout
   --ps                Generate Postcript to stdout
   --pdf               Generate PDF to stdout
   --svg               Generate SVG to stdout
   --gif               Generate GIF to stdout
   --raw               Generate symbolized jeprof data (useful with remote fetch)
   --collapsed         Generate collapsed stacks for building flame graphs
                       (see http://www.brendangregg.com/flamegraphs.html)

Heap-Profile Options:
   --inuse_space       Display in-use (mega)bytes [default]
   --inuse_objects     Display in-use objects
   --alloc_space       Display allocated (mega)bytes
   --alloc_objects     Display allocated objects
   --show_bytes        Display space in bytes
   --drop_negative     Ignore negative differences

Contention-profile options:
   --total_delay       Display total delay at each region [default]
   --contentions       Display number of delays at each region
   --mean_delay        Display mean delay at each region

Call-graph Options:
   --nodecount=<n>     Show at most so many nodes [default=80]
   --nodefraction=<f>  Hide nodes below <f>*total [default=.005]
   --edgefraction=<f>  Hide edges below <f>*total [default=.001]
   --maxdegree=<n>     Max incoming/outgoing edges per node [default=8]
   --focus=<regexp>    Focus on backtraces with nodes matching <regexp>
   --thread=<n>        Show profile for thread <n>
   --ignore=<regexp>   Ignore backtraces with nodes matching <regexp>
   --scale=<n>         Set GV scaling [default=0]
   --heapcheck         Make nodes with non-0 object counts
                       (i.e. direct leak generators) more visible
   --retain=<regexp>   Retain only nodes that match <regexp>
   --exclude=<regexp>  Exclude all nodes that match <regexp>

Miscellaneous:
   --tools=<prefix or binary:fullpath>[,...]   $PATH for object tool pathnames
   --test              Run unit tests
   --help              This message
   --version           Version information
   --debug-syms-by-id  (Linux only) Find debug symbol files by build ID as well as by name