Problem:
After writing a UDF and packaging it into a jar, I ran the following in the Hive CLI:
add jar udfs-1.0-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION strlen AS 'com.pingan.pbear.udf.StrLen';
select name, strlen(name), score from stu order by score;
The query above contains an ORDER BY clause, so Hive generates a MapReduce job to execute it. The job fails with the following error:
`Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/Users/lovelife/git/pbear-offline/news/udfs/target/udfs-1.0-SNAPSHOT.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
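Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://localhost:9000/Users/lovelife/git/pbear-offline/news/udfs/target/udfs-1.0-SNAPSHOT.jar)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask`
Note that the path in the exception is the jar's location on the local file system, yet the job looks for it on HDFS (hdfs://localhost:9000).
A simple statement that does not trigger a MapReduce job runs without error, for example:
select name, strlen(name), score from stu;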
Solution:
Edit Hadoop's mapred-site.xml configuration file and add the following:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
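The mapreduce.framework.name property selects the runtime framework for MapReduce jobs. Setting it to yarn runs them on YARN; when the property is not set, Hadoop falls back to the default value local.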
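To try the setting before editing mapred-site.xml, the property can also be overridden for the current Hive session. The sketch below is an assumption and was not part of the original troubleshooting: it relies on Hive passing session-level `set` values through to the job configuration and on a running YARN cluster.

```sql
-- Print the value currently in effect for this session
set mapreduce.framework.name;

-- Override it for this session only (assumes YARN is up and reachable)
set mapreduce.framework.name=yarn;

-- Re-register the UDF and rerun the query that previously failed
add jar udfs-1.0-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION strlen AS 'com.pingan.pbear.udf.StrLen';
select name, strlen(name), score from stu order by score;
```

If the query succeeds with the session override, making the change permanent in mapred-site.xml as described above should have the same effect for every session.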