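One of the new features in Hadoop 2.6 is tracing: a distributed tracing tool, similar to Google Dapper, that provides request tracing and performance analysis for applications in the Hadoop family. Hadoop 2.6 still uses the pre-Apache version of HTrace, but Hadoop 2.7 supports the Apache version of HTrace. HTrace is also integrated into HBase; HBase 1.0.0 uses the Apache 3.1.0 release. See http://events.linuxfoundation.org/sites/events/files/slides/2015-03-05_apachecon2015__introducing_apache_htrace.pdf.
This article describes how to enable HTrace in HDFS, HBase, and an HBase client (using the HBase client program in YCSB as the example).
In the Maven repository, htrace is published under several group IDs, for example org.htrace (latest version 3.0.4), org.cloudera.htrace (latest version 2.05), and org.apache.htrace (3.1.0, 3.2.0). We use org.apache.htrace (htrace-core-3.1.0-incubating.jar). See http://search.maven.org/#browse%7C-1954581700.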
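Prerequisites: Hadoop (both HDFS and YARN) and HBase (including ZooKeeper) are up and running normally.
The following configuration must be applied on every Hadoop and HBase node.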
1. Hadoop 2.7.0
Download htrace-core-3.1.0-incubating.jar into the HDFS library directory.
Library directory: hadoop-2.7.0/share/hadoop/hdfs/lib/
Add the hadoop.htrace properties to hadoop-2.7.0/etc/hadoop/core-site.xml:
<property>
<name>hadoop.htrace.spanreceiver.classes</name>
<value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>hadoop.htrace.local-file-span-receiver.path</name>
<value>/var/log/hadoop/htrace.out</value>
</property>
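Because these properties have to be present on every node, it can help to check each core-site.xml before restarting. The following is a minimal sketch of such a check, not part of Hadoop or HTrace: the function names read_properties and missing_htrace_keys are hypothetical, the sample XML is embedded so the script runs on its own, and it assumes the properties sit inside the usual <configuration> root element.

```python
import xml.etree.ElementTree as ET

# Embedded sample (an assumption for illustration); in practice you would
# read hadoop-2.7.0/etc/hadoop/core-site.xml from each node instead.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.htrace.spanreceiver.classes</name>
    <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
  </property>
  <property>
    <name>hadoop.htrace.local-file-span-receiver.path</name>
    <value>/var/log/hadoop/htrace.out</value>
  </property>
</configuration>
"""

# The two property names used in this article.
REQUIRED_KEYS = (
    "hadoop.htrace.spanreceiver.classes",
    "hadoop.htrace.local-file-span-receiver.path",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_htrace_keys(props):
    """Return the required HTrace keys that are absent or have an empty value."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


if __name__ == "__main__":
    props = read_properties(SAMPLE_CORE_SITE)
    missing = missing_htrace_keys(props)
    if missing:
        print("missing HTrace properties:", ", ".join(missing))
    else:
        print("HTrace properties present")
```

To check a real node, replace SAMPLE_CORE_SITE with the contents of that node's core-site.xml, for example by reading the file with open(path).read().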
Restart Hadoop. If all goes well, the file htrace.out now exists in /var/log/hadoop/. Run hadoop fs -ls / to exercise tracing; Hadoop writes HTrace output such as the following:
{"i":"f19bfdb594269132","s":"befdd7ab18bc944f","b":1434990777657,"e":1434990777700,"d":"org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo","r":"NameNode","p":["e65fa1aa1d4ea5f9"]}
{"i":"f19bfdb594269132","s":"e65fa1aa1d4ea5f9","b":1434990777462,"e":1434990777708,"d":"ClientNamenodeProtocol#getFileInfo","r":"FsShell","p":["3266f445372f0b7d"],"t":[{"t":1434990777481,"m":"IPC client connecting to centos6-1/10.10.10.20:8020"},{"t":1434990777507,"m":"IPC client connected to centos6-1/10.10.10.20:8020"}]}
{"i":"f19bfdb594269132","s":"3266f445372f0b7d","b":1434990777453,"e":1434990777793,"d":"getFileInfo","r":"FsShell","p":[],"n":{"path":"/"}}
{"i":"d36e237682b818fa","s":"72d96c8b2749892b","b":1434990777798,"e":1434990777804,"d":"org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing","r":"NameNode","p&#