
Hadoop “Failed to set setXIncludeAware(true) for parser” error and how to resolve it


Original article: http://caffeinbean.wordpress.com/2011/03/01/hadoop-failed-to-set-setxincludeawaretrue-for-parser-error-and-how-to-resolve-it/


Hadoop is a great piece of technology. But it’s not the technology that helps you solve the great problems. It’s the attitude you gain after absorbing the knowledge, and the courage to attack the problems.

For Hadoop, the “hello world” application is WordCount. You feed it a document, on the assumption that it may be huge, and the map reduce program outputs each unique word with its count. In real life, however, the challenges you face are not as trivial. Some are not yet answered and are subject to active exploration and development; dependency injection is a hot topic, for instance. But for this post I’ll focus on one specific problem and present the solution.

If you ever have to deal with XML in a map reduce environment, you may get a stack trace similar to the one below.

ERROR conf.Configuration: Failed to set setXIncludeAware(true) for parser org.apache.xerces.jaxp.DocumentBuilderFactoryImpl@47315d34:java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
java.lang.UnsupportedOperationException: This parser does not support specification "null" version "null"
    at javax.xml.parsers.DocumentBuilderFactory.setXIncludeAware(DocumentBuilderFactory.java:590)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1054)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1030)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:980)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:405)
    at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:585)
    at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:290)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:375)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:138)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:59)
    at my.job.MapReduce.main(MyJob.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

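The reason is that the XML parser being loaded is out of date: the trace shows the call landing in the default javax.xml.parsers.DocumentBuilderFactory.setXIncludeAware, which throws because the Xerces implementation on the classpath does not override it. To get rid of this error, you’ll need to provide recent versions of both Xalan and Xerces with your job, which means making them available on your classpath.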
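To see whether the parser on your classpath has this problem before submitting a job, a small check along these lines may help. It is only a sketch: the class name XIncludeCheck, the helper supportsXInclude and the stub LegacyFactory are made up for illustration and are not part of Hadoop. The stub imitates an older parser by not overriding setXIncludeAware, so the call falls through to the JAXP base class, which throws UnsupportedOperationException.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

final class XIncludeCheck {

    /** Returns true if the factory accepts setXIncludeAware(true) without throwing. */
    static boolean supportsXInclude(DocumentBuilderFactory factory) {
        try {
            factory.setXIncludeAware(true);
            return true;
        } catch (UnsupportedOperationException e) {
            // Same exception type as in the Hadoop log above.
            return false;
        }
    }

    /**
     * Hypothetical stand-in for an outdated parser: it implements only the
     * abstract methods and inherits the default setXIncludeAware, which throws.
     */
    static class LegacyFactory extends DocumentBuilderFactory {
        @Override
        public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
            throw new ParserConfigurationException("stub only, cannot build documents");
        }
        @Override public void setAttribute(String name, Object value) { }
        @Override public Object getAttribute(String name) { return null; }
        @Override public void setFeature(String name, boolean value) { }
        @Override public boolean getFeature(String name) { return false; }
    }

    public static void main(String[] args) {
        // Whatever implementation JAXP resolves from the current classpath.
        DocumentBuilderFactory current = DocumentBuilderFactory.newInstance();
        System.out.println(current.getClass().getName()
                + " supports XInclude: " + supportsXInclude(current));
        System.out.println("LegacyFactory supports XInclude: "
                + supportsXInclude(new LegacyFactory()));
    }
}
```

Run it with the same classpath as your job. If the first line reports false, the parser that JAXP resolves is one that would trigger the error.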

If you’re using Maven (you are using Maven for map reduce jobs, right?), it’s just a couple of lines to include in the pom file.

<dependency>
    <groupId>xerces</groupId>
    <artifactId>xercesImpl</artifactId>
    <version>2.9.1</version>
</dependency>
<dependency>
    <groupId>xalan</groupId>
    <artifactId>xalan</artifactId>
    <version>2.7.1</version>
</dependency>

The versions of Xalan and Xerces matter. You need to supply the versions listed or above.
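If you want to confirm which jar actually supplies the parser at runtime, one option is to print where the factory class was loaded from. This is a rough sketch using only standard JDK calls; the class name ParserOrigin and the helper describeOrigin are hypothetical names chosen for this example.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

final class ParserOrigin {

    /** Describes where a class was loaded from, such as a jar URL. */
    static String describeOrigin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes that ship with the JDK usually report no code source.
            return "no code source (typically a class bundled with the JDK)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        System.out.println("Factory class: " + factory.getClass().getName());
        System.out.println("Loaded from:   " + describeOrigin(factory.getClass()));
    }
}
```

When the Xerces dependency is picked up, the location should point at the xercesImpl jar. If it points at another jar, an older copy is probably earlier on the classpath.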

