Profiling Ceph memcpy Usage


 

Contents

  1. Profiling Ceph memcpy Usage
    1. Introduction
    2. Placing the uprobe on memcpy
    3. Simple Python Perf script to analyse memcpy sizes
    4. Distribution of memcpy sizes
    5. Steps to analyse `std::__ostream_insert<char, std::char_traits<char> >`
    6. Locating ceph-osd callers that lead to size 1 memcpy
    7. Being tricked by perf
    8. Flame graph of call stack for size 1 memcpy's
    9. Actual hot memcpy's

 

Introduction

This page was originally a brain dump of my investigation into the Ceph memcpy size distributions. I've kept the content mostly as is (after this introduction section). 

The take-away messages from this page are essentially: 

  • Yes, Ceph has a lot of size 1 memcpy's, ... but 
  • these only really crop up via a probe point rather than via natural PC sampling, ... meaning 
  • they are not likely worth worrying about :-(. 

This page should hopefully serve as a warning for those who rely too much on probes and fail to measure actual CPU impact. 

 

Placing the uprobe on memcpy

First, find out where in the C library memcpy resides: 

$ nm -D /lib64/libc.so.6 | grep memcpy
0000000000084fc0 T memcpy
00000000000ebaf0 T __memcpy_chk
000000000008c70c W wmemcpy
00000000000ed41c T __wmemcpy_chk

Now place a uprobe at that address (the `T memcpy` entry above). The `len=%x2` fetch argument captures register x2, which on aarch64 holds the third argument to memcpy, i.e. the copy size: 

# echo "p:memcpy /lib64/libc.so.6:0x84fc0 len=%x2" > /sys/kernel/debug/tracing/uprobe_events

The uprobe is recorded via: 

perf record -e uprobes:memcpy -a -g sleep 20

This records system-wide (-a), with call graphs (-g), for twenty seconds. Individual processes can then be analysed via a perf script. 

 

Simple Python Perf script to analyse memcpy sizes

 

from __future__ import division
import os
import sys

import collections

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
        '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *
from Core import *


# Per-process (comm) list of observed memcpy sizes.
sizes = dict()

# Per-process list of the callers responsible for size 1 memcpy's.
sizeones = dict()

def trace_begin():
        print "in trace_begin"

def trace_end():
        print "in trace_end\n\n"

        for key in sizes.keys():
                print "\n\nTop 10 memcpy sizes of %s:" % (key)
                total = len(sizes[key])

                for pair in collections.Counter(sizes[key]).most_common(10):
                        print "%s\t%d\t(%f%%)" % (pair[0], pair[1], 100.0*pair[1]/total)

        for key in sizeones.keys():
                print "\n\nTop 10 memcpy size 1 callers of %s:" % (key)

                for pair in collections.Counter(sizeones[key]).most_common(10):
                        print "%d\t%s" % (pair[1], pair[0])

def uprobes__memcpy(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm,
        common_callchain, __probe_ip, len):

                if (common_comm not in sizes):
                        sizes[common_comm] = list()

                sizes[common_comm].append(len)

                if (len == 1):
                        # common_callchain[0] is memcpy itself; [1] is its caller.
                        node = common_callchain[1]
                        if ('sym' in node):
                                caller = format("[%x] %s %s" % (node['ip'], node['dso'], node['sym']['name']))
                        else:
                                caller = format("[%x]" % (node['ip']))

                        if (common_comm not in sizeones):
                                sizeones[common_comm] = list()

                        sizeones[common_comm].append(caller)

def trace_unhandled(event_name, context, event_fields_dict):
                print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())])
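
The script above is fed the recorded data through perf's scripting interface, e.g. via `perf script -s memcpy-sizes.py` (the filename is just an assumed name for the script; any name works).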

 

Distribution of memcpy sizes

These were obtained via the uprobe on memcpy described above. 

 

Top 10 memcpy sizes of ceph-mds:
1       3349    (30.542636%)
4       1411    (12.868217%)
8       1007    (9.183767%)
2       663     (6.046512%)
46      350     (3.191974%)
6       313     (2.854537%)
5       307     (2.799818%)
85      280     (2.553580%)
3       270     (2.462380%)
80      231     (2.106703%)


Top 10 memcpy sizes of sshd:
4       80      (30.769231%)
16      40      (15.384615%)
100     40      (15.384615%)
76      38      (14.615385%)
6       20      (7.692308%)
60      20      (7.692308%)
40      19      (7.307692%)
231     2       (0.769231%)
8       1       (0.384615%)


Top 10 memcpy sizes of perf:
16      1       (20.000000%)
4       1       (20.000000%)
30      1       (20.000000%)
6       1       (20.000000%)
5       1       (20.000000%)


Top 10 memcpy sizes of chronyd:
24      5       (50.000000%)
128     2       (20.000000%)
72      2       (20.000000%)
192     1       (10.000000%)


Top 10 memcpy sizes of systemd:
10      1       (25.000000%)
39      1       (25.000000%)
54      1       (25.000000%)
7       1       (25.000000%)


Top 10 memcpy sizes of lt-rados:
1       6138    (35.294118%)
4       3350    (19.262837%)
8       1549    (8.906906%)
2       1025    (5.893853%)
16      890     (5.117590%)
3       556     (3.197056%)
7       442     (2.541544%)
13      437     (2.512794%)
6       329     (1.891783%)
5       314     (1.805532%)


Top 10 memcpy sizes of ceph-mon:
1       56354   (28.211418%)
4       34085   (17.063317%)
8       20027   (10.025731%)
3       9962    (4.987084%)
2       9248    (4.629648%)
6       7521    (3.765093%)
5       6256    (3.131821%)
14      4419    (2.212199%)
80      4063    (2.033981%)
9       3999    (2.001942%)


Top 10 memcpy sizes of ceph-osd:
1       354007  (39.020946%)
4       115021  (12.678360%)
2       109050  (12.020198%)
8       51911   (5.721967%)
3       40176   (4.428459%)
6       36285   (3.999568%)
80      25808   (2.844725%)
5       19347   (2.132552%)
7       18729   (2.064432%)
11      15813   (1.743011%)


Top 10 memcpy sizes of sleep:
12      14      (73.684211%)
10      2       (10.526316%)
1       1       (5.263158%)
4       1       (5.263158%)
6       1       (5.263158%)


Top 10 memcpy size 1 callers of lt-rados:
2716    [98db8] /usr/lib64/libstdc++.so.6.0.20 std::__ostream_insert<char, std::char_traits<char> >
1666    [1df68] /usr/lib64/libsoftokn3.so sftk_AddAttributeType
522     [8ce30] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>
245     [48984] /home/steven/ceph/src/.libs/librados.so.2.0.0 ceph::buffer::list::append
198     [8c868] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
125     [266944] /home/steven/ceph/src/.libs/librados.so.2.0.0 Pipe::tcp_read_nonblocking
122     [8d554] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long long>
77      [1c7298] /home/steven/ceph/src/.libs/librados.so.2.0.0 request_redirect_t::decode
77      [1c72a8] /home/steven/ceph/src/.libs/librados.so.2.0.0 request_redirect_t::decode
77      [229b98] /home/steven/ceph/src/.libs/librados.so.2.0.0 pg_t::decode


Top 10 memcpy size 1 callers of ceph-osd:
231102  [98db8] /usr/lib64/libstdc++.so.6.0.20 std::__ostream_insert<char, std::char_traits<char> >
39477   [8c868] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
35877   [8ce30] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>
19712   [6ccbf8] /home/steven/ceph/src/ceph-osd ceph::buffer::list::append
6734    [1df68] /usr/lib64/libsoftokn3.so sftk_AddAttributeType
4496    [8ccb0] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put
1496    [8d554] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long long>
883     [68e628] /home/steven/ceph/src/ceph-osd ghobject_t::decode
883     [68e770] /home/steven/ceph/src/ceph-osd ghobject_t::decode
883     [68e6c8] /home/steven/ceph/src/ceph-osd ghobject_t::decode


Top 10 memcpy size 1 callers of sleep:
1       [2a698] /usr/lib64/libc-2.20.so _nl_find_locale


Top 10 memcpy size 1 callers of ceph-mds:
1403    [98db8] /usr/lib64/libstdc++.so.6.0.20 std::__ostream_insert<char, std::char_traits<char> >
352     [8c868] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
351     [8ce30] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>
241     [557f78] /home/steven/ceph/src/ceph-mds ceph::buffer::list::append
214     [8dde0] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put
211     [8dc1c] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>
196     [1df68] /usr/lib64/libsoftokn3.so sftk_AddAttributeType
77      [8ccb0] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put
70      [43989c] /home/steven/ceph/src/ceph-mds DecayCounter::decode
70      [4397fc] /home/steven/ceph/src/ceph-mds DecayCounter::decode


Top 10 memcpy size 1 callers of ceph-mon:
29454   [98db8] /usr/lib64/libstdc++.so.6.0.20 std::__ostream_insert<char, std::char_traits<char> >
9161    [8ce30] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<unsigned long>
5614    [1df68] /usr/lib64/libsoftokn3.so sftk_AddAttributeType
5226    [8c868] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_int<long>
2473    [3cc1d0] /home/steven/ceph/src/ceph-mon ceph::buffer::list::append
463     [455ef4] /home/steven/ceph/src/ceph-mon Pipe::tcp_read_nonblocking
387     [1948e8] /home/steven/ceph/src/ceph-mon MonitorDBStore::Transaction::decode
387     [194944] /home/steven/ceph/src/ceph-mon MonitorDBStore::Transaction::decode
387     [1948f8] /home/steven/ceph/src/ceph-mon MonitorDBStore::Transaction::decode
385     [8ccb0] /usr/lib64/libstdc++.so.6.0.20 std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::do_put
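
Across all of the Ceph daemons the dominant size 1 caller is `std::__ostream_insert<char, std::char_traits<char> >`, so that is the obvious first thing to dig into.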

 

Steps to analyse `std::__ostream_insert<char, std::char_traits<char> >`

 

$ addr2line -f -e /usr/lib64/libstdc++.so.6.0.20 0x98db8
__ostream_write<char, std::char_traits<char> >
/usr/src/debug/gcc-4.9.2-20141101/obj-aarch64-redhat-linux/aarch64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:51

This leads to: 

  template<typename _CharT, typename _Traits>
    inline void
    __ostream_write(basic_ostream<_CharT, _Traits>& __out,
                    const _CharT* __s, streamsize __n)
    {
      typedef basic_ostream<_CharT, _Traits>       __ostream_type;
      typedef typename __ostream_type::ios_base    __ios_base;

      const streamsize __put = __out.rdbuf()->sputn(__s, __n);
      if (__put != __n) // <=== line 51
        __out.setstate(__ios_base::badbit);
    }

So we need to investigate `sputn()`.

This leads to: 

      streamsize
      sputn(const char_type* __s, streamsize __n)
      { return this->xsputn(__s, __n); }

Let's investigate `__streambuf_type::xsputn`. This leads to: 

  template<typename _CharT, typename _Traits>
    streamsize
    basic_streambuf<_CharT, _Traits>::
    xsputn(const char_type* __s, streamsize __n)
    {
      streamsize __ret = 0;
      while (__ret < __n)
        {
          const streamsize __buf_len = this->epptr() - this->pptr();
          if (__buf_len)
            {
              const streamsize __remaining = __n - __ret;
              const streamsize __len = std::min(__buf_len, __remaining);
              traits_type::copy(this->pptr(), __s, __len);
              __ret += __len;
              __s += __len;
              this->__safe_pbump(__len);
            }

          if (__ret < __n)
            {
              int_type __c = this->overflow(traits_type::to_int_type(*__s));
              if (!traits_type::eq_int_type(__c, traits_type::eof()))
                {
                  ++__ret;
                  ++__s;
                }
              else
                break;
            }
        }
      return __ret;
    }

Let's investigate traits_type::copy. 

      static char_type*
      copy(char_type* __s1, const char_type* __s2, size_t __n)
      { return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n)); }
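
So a single character streamed through `operator<<` reaches `traits_type::copy` with `__n == 1`, i.e. a `__builtin_memcpy` of size 1. That matches the size 1 calls attributed to `std::__ostream_insert` in the tables above.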

 

Locating ceph-osd callers that lead to size 1 memcpy

Here is the output: 

Top ten ceph callers leading to memcpy size 1
19712   [6ccbf8] ceph::buffer::list::append
13346   [334f04] operator<<
9571    [25e724] operator<< <int>
7426    [308410] operator<<
7426    [3083fc] operator<<
6734    [5a85f4] nss_aes_operation
6657    [68e2b0] operator<<
6647    [68e33c] operator<<
6635    [68e2d0] operator<<
6577    [68e368] operator<<

Investigating the first sample: 

$ addr2line -Cf -e /home/steven/ceph/src/ceph-osd 0x6ccbf4
Objecter::_session_linger_op_remove(Objecter::OSDSession*, Objecter::LingerOp*)
/home/steven/ceph/src/osdc/Objecter.cc:2570

Which gives the following function: 

void Objecter::_session_linger_op_remove(OSDSession *from, LingerOp *op)
{
  assert(from == op->session);
  assert(from->lock.is_locked());

  if (from->is_homeless()) {
    num_homeless_ops.dec();
  }

  from->linger_ops.erase(op->linger_id);
  put_session(from);
  op->session = NULL;

  ldout(cct, 15) << __func__ << " " << from->osd << " " << op->linger_id << dendl;
}

Looks like this is due to debugging messages! 

 

Being tricked by perf

The ip's need to have 0x400000 added to them for ceph-osd (perf bug?), 0x400000 presumably being the binary's link-time base address, which perf leaves out of the probe ip.... Let's try again.... 

 

$ addr2line -Cf -e ./src/ceph-osd 0xaccbf8
ceph::buffer::ptr::end() const
/home/steven/ceph/src/./include/buffer.h:207

$ addr2line -Cf -e ./src/ceph-osd 0xaccbf4
ceph::buffer::list::append(char const*, unsigned int)
/home/steven/ceph/src/common/buffer.cc:1310
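
Rebasing each address by hand is tedious and error-prone, so something like the following Python 2 sketch (kept in the style of the perf scripts above) can do it automatically; the base address and binary path are assumptions taken from this particular build: 

import subprocess

BASE = 0x400000            # assumed link-time base of this ceph-osd build
BINARY = './src/ceph-osd'  # assumed path to a binary with debug info

def symbolise(ip):
        # Rebase the perf-reported probe ip, then ask addr2line for the
        # demangled function name and the source location.
        out = subprocess.check_output(
                ['addr2line', '-Cf', '-e', BINARY, hex(ip + BASE)])
        func, location = out.splitlines()[:2]
        return func, location

print symbolise(0x6ccbf4)  # resolves to ceph::buffer::list::append, as above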

The `ceph::buffer::list::append` result leads to: 

  void buffer::list::append(const char *data, unsigned len)
  {
    while (len > 0) {
      // put what we can into the existing append_buffer.
      unsigned gap = append_buffer.unused_tail_length();
      if (gap > 0) {
        if (gap > len) gap = len;
        //cout << "append first char is " << data[0] << ", last char is " << data[len-1] << std::endl;
        append_buffer.append(data, gap); // <--- this is line 1310
        append(append_buffer, append_buffer.end() - gap, gap);  // add segment to the list
        len -= gap;
        data += gap;
      }
      if (len == 0)
        break;  // done!

      // make a new append_buffer!
      unsigned alen = CEPH_PAGE_SIZE * (((len-1) / CEPH_PAGE_SIZE) + 1);
      append_buffer = create_page_aligned(alen);
      append_buffer.set_length(0);   // unused, so far.
    }
  }

I think the problem can be attributed to `buffer::ptr::append`, which issues a memcpy for every appended chunk, however small: 

  void buffer::ptr::append(const char *p, unsigned l)
  {
    assert(_raw);
    assert(l <= unused_tail_length());
    memcpy(c_str() + _len, p, l);
    _len += l;
  }

 

Flame graph of call stack for size 1 memcpy's

A flame graph of size 1 memcpy call stacks can be found here: ceph-size1.svg

And the inverted graph here: ceph-size1-rev.svg

The stack data was generated via the following Python script: 

import os
import sys

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
        '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *
from Core import *


# Occurrence count for each semicolon-separated call stack, printed in
# the "folded" format expected by flame graph tooling.
callframes = dict()

def trace_begin():
        pass

def trace_end():
        for callframe in callframes.keys():
                print "%s %d" % (callframe, callframes[callframe])

def uprobes__memcpy(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm,
        common_callchain, __probe_ip, len):

                # Only size 1 memcpy's are of interest here.
                if (len != 1):
                        return

                # Build "comm;outermost;...;innermost" from the callchain.
                callframe = common_comm

                for node in reversed(common_callchain):
                        callframe += ';'

                        if 'sym' in node:
                                callframe += node['sym']['name']
                        else:
                                callframe += '[unknown]'

                if (callframe not in callframes):
                        callframes[callframe] = 1
                else:
                        callframes[callframe] += 1

def trace_unhandled(event_name, context, event_fields_dict):
                print ' '.join(['%s=%s'%(k,str(v))for k,v in sorted(event_fields_dict.items())])
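
The printed lines are already in the folded stack format, so the SVGs above can be produced by piping the output through Brendan Gregg's FlameGraph tooling, along the lines of `perf script -s <script>.py | flamegraph.pl > ceph-size1.svg` (the exact pipeline used is an assumption; the original page only links the resulting SVGs).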

 

Actual hot memcpy's

Analysing via uprobe has one significant disadvantage that I overlooked: it gives us the proportion of all memcpy's that are size 1, but it does not take into account which memcpy's are expensive and which are cheap, i.e. which memcpy's are taking up the most time. 
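
A halfway step, while still using the probe, is to weight each hit by its size instead of just counting calls. Below is a hypothetical variant of the earlier script that totals bytes copied per process; bytes are still only a proxy for time, but they at least stop size 1 calls from dominating by sheer count: 

import os
import sys

import collections

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
        '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *
from Core import *


# Total bytes passed to memcpy, per process (comm).
bytes_copied = collections.Counter()

def trace_begin():
        pass

def trace_end():
        for comm, total in bytes_copied.most_common(10):
                print "%s\t%d bytes" % (comm, total)

def uprobes__memcpy(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm,
        common_callchain, __probe_ip, len):
                bytes_copied[common_comm] += len

def trace_unhandled(event_name, context, event_fields_dict):
                pass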

If we look at the call graph for the hot memcpy calls under normal cycles-based sampling, we get: 

    10.55%  ceph-osd         /usr/lib64/libc-2.20.so                        0x8503c            B [.] memcpy                                                           
                   |
                   ---memcpy
                      |          
                      |--98.83%-- ceph::buffer::list::rebuild_aligned_size_and_memory(unsigned int, unsigned int)
                      |          FileJournal::align_bl(long, ceph::buffer::list&)
                      |          FileJournal::do_write(ceph::buffer::list&)
                      |          FileJournal::write_thread_entry()
                      |          FileJournal::Writer::entry()
                      |          Thread::entry_wrapper()
                      |          start_thread
                      |          thread_start

i.e. we see this as buffer management rather than single-character log appends. 
