Integrating htrace-zipkin with Hadoop

There are even more pitfalls here.

Hadoop Zipkin configuration

The write-ups on configuring Zipkin for Hadoop are terrible, the official documentation included.
I am using

Hadoop 2.7.1

https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/Tracing.html
Take a look at the official documentation. Follow it to the letter and it will not work.

Why? Let's look at this JIRA issue:

DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
https://issues.apache.org/jira/browse/HDFS-8213

You have got to be kidding.

Fortunately, someone on Stack Overflow wrote:

Exactly, in the code, I see the configuration key's prefix is dfs.htrace, not the hadoop.htrace. And in dfsclient, it's dfs.client.htrace. You can change the prefix to dfs.htrace, then restart the cluster and it take effect. The code is in class org.apache.hadoop.tracing.SpanReceiverHost. Hope this help!

I felt like the world was falling apart: after half a day of work, I was told it had all been a misunderstanding.

  <property>
    <name>hdfs.client.htrace.sampler</name>
    <value>AlwaysSampler</value>
  </property>
  <property>
    <name>hdfs.client.htrace.spanreceiver.classes</name>
    <value>ZipkinSpanReceiver</value>
  </property>
  <property>
    <name>hdfs.client.htrace.zipkin.collector-hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hdfs.client.htrace.zipkin.collector-port</name>
    <value>9410</value>
  </property>

  <property>
    <name>dfs.htrace.sampler</name>
    <value>AlwaysSampler</value>
  </property>
  <property>
    <name>dfs.htrace.spanreceiver.classes</name>
    <value>ZipkinSpanReceiver</value>
  </property>
  <property>
    <name>dfs.htrace.zipkin.collector-hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>dfs.htrace.zipkin.collector-port</name>
    <value>9410</value>
  </property>

  <property>
    <name>dfs.client.htrace.sampler</name>
    <value>AlwaysSampler</value>
  </property>
  <property>
    <name>dfs.client.htrace.spanreceiver.classes</name>
    <value>ZipkinSpanReceiver</value>
  </property>
  <property>
    <name>dfs.client.htrace.zipkin.collector-hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>dfs.client.htrace.zipkin.collector-port</name>
    <value>9410</value>
  </property>

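My approach was deliberately aggressive: I set the properties under all three prefixes. I did not check which prefix is actually read; the source code would tell you. This will do for now, since I had already wasted a lot of time.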
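Since all three prefixes carry the same four settings, the block above can be generated rather than copied by hand. The following is only a sketch: the function names `emit_property` and `emit_htrace_properties` are made up for illustration, and the prefixes, class names, host and port are simply the ones used in this post.

```shell
#!/bin/sh
# Print one Hadoop <property> element for the given name and value.
emit_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print the four HTrace/Zipkin settings under each candidate prefix.
# $1 = Zipkin collector host, $2 = Zipkin collector port.
# The prefix list is an assumption taken from this post, not from the Hadoop source.
emit_htrace_properties() {
  host=$1
  port=$2
  for prefix in hdfs.client.htrace dfs.htrace dfs.client.htrace; do
    emit_property "$prefix.sampler" AlwaysSampler
    emit_property "$prefix.spanreceiver.classes" ZipkinSpanReceiver
    emit_property "$prefix.zipkin.collector-hostname" "$host"
    emit_property "$prefix.zipkin.collector-port" "$port"
  done
}
```

Calling `emit_htrace_properties localhost 9410` prints the twelve `<property>` elements, which can then be pasted into the Hadoop configuration file being edited. Once it is known which prefix the running version reads, the loop can be cut down to that single prefix.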

There is also an interesting command worth remembering:

hdfs dfs -Dfs.shell.htrace.span.receiver.classes=org.apache.htrace.impl.ZipkinSpanReceiver \
           -Dfs.shell.htrace.sampler.classes=AlwaysSampler \
           -Dhadoop.htrace.zipkin.collector-hostname=localhost \
           -Dhadoop.htrace.zipkin.collector-port=9410 \
           -ls /
$ javac -cp `hadoop classpath` TracingFsShell.java
$ HADOOP_CLASSPATH=. hdfs TracingFsShell -put sample.txt /tmp/

The way these are run is also nice. I used to assemble the classpath by hand, which was silly.

Testing Hadoop tracing

As the saying goes, dead before the battle even began.

-sh-4.1$ hdfs dfs -ls /
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/thrift/protocol/TProtocolFactory
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
    at java.lang.Class.getConstructor0(Class.java:3075)
    at java.lang.Class.getConstructor(Class.java:1825)
    at org.apache.htrace.SpanReceiverBuilder.build(SpanReceiverBuilder.java:107)
    at org.apache.hadoop.tracing.SpanReceiverHost.loadInstance(SpanReceiverHost.java:169)
    at org.apache.hadoop.tracing.SpanReceiverHost.loadSpanReceivers(SpanReceiverHost.java:154)
    at org.apache.hadoop.tracing.SpanReceiverHost.get(SpanReceiverHost.java:78)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:634)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:150)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:355)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
    at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
    at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Caused by: java.lang.ClassNotFoundException: org.apache.thrift.protocol.TProtocolFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 28 more
-sh-4.1$ ll

Hmm, at this point I figured it was worth going back to the official documentation:

$ git clone https://github.com/cloudera/htrace
  $ cd htrace/htrace-zipkin
  $ mvn compile assembly:single
  $ cp target/htrace-zipkin-*-jar-with-dependencies.jar $HADOOP_HOME/share/hadoop/hdfs/lib/

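That is the official instruction: a fat jar is required, so let's build one. It is only packaging, after all. What I had uploaded earlier was the thin jar, which does not include the dependencies.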
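A quick way to check whether a jar really bundles the class named in the NoClassDefFoundError is to search the jars for its entry path. This is a rough sketch, not an official tool: `find_class_in_jars` is a hypothetical helper, and it relies on ZIP archives storing entry names uncompressed, so a plain `grep` over the jar bytes usually finds them. Treat a hit as a hint and confirm it with `unzip -l` or `jar tf`.

```shell
#!/bin/sh
# find_class_in_jars DIR CLASS
# Print every *.jar under DIR whose bytes contain the entry path of CLASS.
# Heuristic only: it matches the entry name stored in the ZIP headers.
find_class_in_jars() {
  dir=$1
  # org.apache.Foo -> org/apache/Foo.class
  entry=$(printf '%s' "$2" | tr . /).class
  find "$dir" -name '*.jar' -exec grep -l -F -- "$entry" {} +
}
```

For the error above, one might run `find_class_in_jars "$HADOOP_HOME/share/hadoop/hdfs/lib" org.apache.thrift.protocol.TProtocolFactory`. If nothing is printed, no jar in that directory contains the class, which would match the thin-jar situation described here.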

Building and packaging htrace-zipkin

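I have done this kind of packaging many times.
First of all, the command below does not run as is: the pom in the code I downloaded is set to skip assembly (a comment there says so), so pom.xml has to be modified.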

mvn compile assembly:single
  1. Simply deleting or commenting out the skip-assembly section is not enough; the build then keeps failing with:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.5.3:single (default) on project htrace-zipkin: Error reading assemblies: Error locating assembly descriptor: src/main/assembly/src.xml

Well, if you cannot find it, I will give you one:

<?xml version="1.0" encoding="utf-8"?>
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
  <id>jar-with-dependencies</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
  </dependencySets>
  <fileSets>
    <fileSet>
      <directory>${project.build.outputDirectory}</directory>
    </fileSet>
  </fileSets>
</assembly>

That still does not work:

[ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (default) on project htrace-zipkin: Too many files with unapproved license: 1 See RAT report in: /Users/xxx/code/incubator-htrace/htrace-zipkin/target/rat.txt -> [Help 1]

What on earth is this? I looked at the RAT report:

1 Unknown Licenses

*******************************

Unapproved licenses:

  src/main/assembly/src.xml

*******************************

It turns out the XML file I wrote carries no Apache license header, so the build is rejected. How sophisticated...
Fine, here is your license:

<?xml version="1.0" encoding="utf-8"?>
<!-- Licensed to the Apache Software Foundation (ASF) under one or more contributor
license agreements. See the NOTICE file distributed with this work for additional
information regarding copyright ownership. The ASF licenses this file to
You under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of
the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
by applicable law or agreed to in writing, software distributed under the
License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
OF ANY KIND, either express or implied. See the License for the specific
language governing permissions and limitations under the License. -->
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
  <id>jar-with-dependencies</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
  </dependencySets>
  <fileSets>
    <fileSet>
      <directory>${project.build.outputDirectory}</directory>
    </fileSet>
  </fileSets>
</assembly>

I copied the Apache license header from an arbitrary pom file and put it at the top of the file, and that did the trick.
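Before running the build again, it can save a round trip to look for files that lack the header. The sketch below is only an approximation of what the RAT plugin checks: `missing_asf_header` is a hypothetical helper that searches XML files for one fixed phrase from the header, whereas RAT itself recognizes many license forms. It also uses `grep -L`, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh
# missing_asf_header DIR
# Print every *.xml file under DIR that does not contain the opening phrase
# of the ASF license header. A crude stand-in for the RAT check, not a replacement.
missing_asf_header() {
  find "$1" -type f -name '*.xml' \
    -exec grep -L -F 'Licensed to the Apache Software Foundation' {} +
}
```

For example, `missing_asf_header src/main/assembly` would list `src/main/assembly/src.xml` for as long as that file has no header.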

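The build also ran into a Javadoc compilation problem along the way. I did not fix it; I simply deleted the Javadoc plugin section from the pom so that the step is skipped.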

In the end the build succeeded.
