Hadoop jobs with third-party dependencies: solving the Hadoop MapReduce jar dependency problem

One of the disadvantages of setting up a Hadoop development environment in Eclipse is that I had been relying on Eclipse to take care of job submission for me, so I had never worried about doing it by hand. I had also been developing mostly on a single-node cluster (i.e. my laptop), which meant I never needed to submit a job to an actual cluster, in this case a remote one. Furthermore, the first MapReduce programs I wrote and ran on the cluster (more on those to follow) did not depend on third-party jars. However, the program I am working on now depends on a third-party XML parser, which in turn depends on another jar.

As it turns out, I had to specify three external jars every time I submitted a job. I knew there was a -libjars option, as I had seen it mentioned somewhere (including in the hadoop command help when you don't supply all the arguments for a command), but I had not paid attention to it since I did not need it at the time. Googling around, I found a suggestion to copy the jars into the lib folder of the Hadoop installation. That seems like a good solution until you think about a multi-node cluster, where you would have to copy the libraries to every node. And what if you do not have complete control of the cluster? Would you even have write permission on its lib folder?
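For the record, -libjars only takes effect if the driver lets Hadoop's GenericOptionsParser see the command line, which in practice means implementing Tool and launching through ToolRunner. Below is a minimal sketch of such a driver; the class name, job name, and paths are made up for illustration, and on very old Hadoop releases new Job(conf, name) would replace Job.getInstance.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Hypothetical driver; class, job, and path names are placeholders.
    public class XmlParseDriver extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() already reflects any -libjars / -D options that
            // GenericOptionsParser stripped from the command line.
            Job job = Job.getInstance(getConf(), "xml parse");
            job.setJarByClass(XmlParseDriver.class);
            // setMapperClass / setReducerClass / output types would go here.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner invokes GenericOptionsParser, which is what makes
            // -libjars (and -D, -files, etc.) work.
            System.exit(ToolRunner.run(new Configuration(), new XmlParseDriver(), args));
        }
    }

With a driver like that, the three jars can be passed as one comma-separated list at submission time, along the lines of

    hadoop jar xmljob.jar XmlParseDriver -libjars xmlparser.jar,parser-dep.jar,other-dep.jar /input /output

where the jar and class names are placeholders for whatever your project actually uses.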

Luckily, I bumped into a solution suggested by Doug Cutting as an answer to someone with a similar predicament: create a "lib" folder in your project and copy all the external jars into it. According to Doug, Hadoop will look for third-party jars in this folder. It works great!
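To make it concrete, the point is that the dependencies end up inside the job JAR itself under a lib/ directory, so the packaged archive looks roughly like this (file and package names are placeholders):

    xmljob.jar
    |-- META-INF/MANIFEST.MF
    |-- com/example/XmlParseDriver.class
    |-- com/example/XmlParseMapper.class
    |-- lib/
        |-- xmlparser.jar
        |-- parser-dep.jar

When the job runs, Hadoop ships the job JAR to each task node, unpacks it, and adds everything under lib/ to the task classpath, which is why nothing needs to be copied into the cluster's own installation.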

Hadoop: The Definitive Guide also describes how to handle jar packaging; it is worth looking up. The relevant passage says:

"Any dependent JAR files must be packaged into the lib directory of the job JAR file. (This is similar to a Java web application archive, or WAR file, except that in a WAR file the JARs go into the WEB-INF/lib subdirectory.)"
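Assuming the compiled classes live in build/classes and the third-party jars have been copied into a lib/ directory at the project root (both paths are assumptions; adjust them to your layout), a job JAR with that structure can be built with the standard jar tool:

    jar cvf xmljob.jar -C build/classes . lib

The "-C build/classes ." part puts the compiled classes at the root of the archive, and the trailing lib argument adds the lib directory, and the jars inside it, under lib/ in the JAR, which is exactly the layout the book describes.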
