Hive 0.8.1 with MapReduce 0.23.1: Single-Node Test Installation
A lot has changed since MR v2 came out, and the installation procedure is different from before; it took quite a bit of fiddling before everything was working.
1. Installing Hadoop
1) Download the hadoop-0.23.1.tar.gz package, put it in the target directory, and extract it.
2) Edit the configuration and add the environment variables
$hadoop_home/etc/hadoop/yarn-env.sh
Add the following at the very top:
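Step 1) can be sketched as below. The Apache archive URL is an assumption (not from the original post); the commands are demonstrated against a scratch tarball in a temp directory so they are safe to run as-is, with the real download left as a comment:

```shell
cd "$(mktemp -d)"                  # stand-in for the install dir, e.g. /opt/Hadoop
# In practice (URL assumed from Apache's archive layout):
# wget http://archive.apache.org/dist/hadoop/common/hadoop-0.23.1/hadoop-0.23.1.tar.gz
mkdir -p hadoop-0.23.1/etc/hadoop  # fake a minimal release tree for the demo
tar czf hadoop-0.23.1.tar.gz hadoop-0.23.1
rm -r hadoop-0.23.1
tar xzf hadoop-0.23.1.tar.gz       # the actual "extract" step
ls hadoop-0.23.1/etc               # prints: hadoop
```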
export JAVA_HOME=/opt/JDK/jdk1.6.0_31
export YARN_HOME=/opt/Hadoop/hadoop-0.23.1
export HADOOP_COMMON_HOME=/opt/Hadoop/hadoop-0.23.1
export HADOOP_HDFS_HOME=$HADOOP_COMMON_HOME
3) $hadoop_home/etc/hadoop/yarn-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>user.name</name>
<value>root</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:54311</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:54312</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:54313</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:54314</value>
</property>
<property>
<name>yarn.web-proxy.address</name>
<value>localhost:54315</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost</value>
</property>
</configuration>
4) $hadoop_home/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/opt/Hadoop/hadoop-0.23.1/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/Hadoop/hadoop-0.23.1/dfs/data</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/opt/Hadoop/hadoop-0.23.1/dfs/secondaryname</value>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>/opt/Hadoop/hadoop-0.23.1/dfs/secedits</value>
</property>
</configuration>
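Creating the four dfs.* directories above before the first format is a reasonable precaution (it avoids permission surprises, although the NameNode and DataNode may create some of them on their own). A minimal sketch, using a scratch prefix in place of the real /opt/Hadoop/hadoop-0.23.1:

```shell
# PREFIX is a scratch stand-in here; point it at the real install tree for real use.
PREFIX=$(mktemp -d)
mkdir -p "$PREFIX/dfs/name" "$PREFIX/dfs/data" \
         "$PREFIX/dfs/secondaryname" "$PREFIX/dfs/secedits"
ls "$PREFIX/dfs" | wc -l    # prints: 4
```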
5) $hadoop_home/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:54310/</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/Hadoop/hadoop-0.23.1/hadoop-root</value>
</property>
</configuration>
6) Copy yarn-env.sh into the same directory under the name hadoop-env.sh; the HDFS start scripts seem to need it.
Problems encountered during installation:
1. Hive could not find hive-builtins-*.jar. The error was:
2012-03-22 00:10:22,022 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1180)) - PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /opt/Hive/hive-0.8.1/lib/hive-builtins-0.8.1.jar
2012-03-22 00:10:22,035 ERROR exec.ExecDriver (SessionState.java:printError(380)) - Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /opt/Hive/hive-0.8.1/lib/hive-builtins-0.8.1.jar)'
java.io.FileNotFoundException: File does not exist: /opt/Hive/hive-0.8.1/lib/hive-builtins-0.8.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:729)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1218)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:609)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:604)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:604)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:710)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
After a lot of fiddling, and some help from colleagues, it turned out to be a configuration-file problem, though exactly which property was hard to pin down: the NameNode and MapReduce both started fine, but Hive jobs simply would not run.
In the end I went through and revised every configuration property, and that fixed it; my guess is it was one of the properties in core-site.xml.
It may have been the "mapred.job.tracker" property:
<property>
<name>mapred.job.tracker</name>
<value>localhost</value>
</property>
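When the resolution is "rewrite every property until it works", grepping the conf directory at least shows which site file defines each suspect property. A sketch, demonstrated on a scratch conf directory so it runs anywhere; for real use, point grep at $hadoop_home/etc/hadoop/*.xml instead:

```shell
conf=$(mktemp -d)                  # stand-in for $hadoop_home/etc/hadoop
printf '<name>mapred.job.tracker</name>\n' > "$conf/yarn-site.xml"
printf '<name>fs.defaultFS</name>\n'       > "$conf/core-site.xml"
# Which site file defines the old-MR jobtracker property?
grep -l 'mapred.job.tracker' "$conf"/*.xml   # prints the yarn-site.xml path
```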
2. Another issue: because the install was done as root, the startup script failed with a JVM error (I no longer have the exact log). The fix only requires editing $hadoop_home/bin/yarn:
elif [ "$COMMAND" = "nodemanager" ] ; then
CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
if [[ $EUID -eq 0 ]]; then
YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"
else
YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
fi
The culprit is the -jvm flag here; just delete jvm. Note that after deleting it, server must keep its leading -:
elif [ "$COMMAND" = "nodemanager" ] ; then
CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
if [[ $EUID -eq 0 ]]; then
YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
else
YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
fi
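The same edit can be made non-interactively with sed. Shown here on the offending line piped through stdin, so the command is safe to try; for the real file you would run something like `sed -i.bak 's/-jvm server/-server/' $hadoop_home/bin/yarn` (the .bak keeps a backup):

```shell
# Replace "-jvm server" with "-server", demonstrated on the offending line itself.
echo 'YARN_OPTS="$YARN_OPTS -jvm server $YARN_NODEMANAGER_OPTS"' |
  sed 's/-jvm server/-server/'
# prints: YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
```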
Finally, run "./hadoop namenode -format" once to format HDFS.
Then start everything by running start-dfs.sh and start-yarn.sh from the sbin directory. If prompted for a password, enter root's password (i.e. the current user's password).
2. Installing Hive
Once the hive-builtins-*.jar problem was solved, installing Hive was trivial.
Just set the environment variables in /etc/profile; if you would rather not touch it, add them directly to the hive launch script instead.
export JAVA_HOME=/opt/JDK/jdk1.6.0_31
export ANT_HOME=/opt/Ant/apache-ant-1.8.2
export IVY_HOME=/opt/Ivy/apache-ivy-2.2.0
export HADOOP_HOME=/opt/Hadoop/hadoop-0.23.1
export HIVE_HOME=/opt/Hive/hive-0.8.1
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_LIB=$HIVE_HOME/lib
export CLASSPATH=$JAVA_HOME/lib:$HIVE_LIB:$CLASSPATH
export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin/:$PATH
These are all of my environment variables; add only the ones you need. After adding them, log in again for them to take effect.
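After re-logging in, a quick sanity check that the variables took effect (the paths are this post's layout; the check only inspects PATH, so it is safe to run):

```shell
# Simulate the /etc/profile additions and confirm the Hive bin dir is on PATH.
export JAVA_HOME=/opt/JDK/jdk1.6.0_31
export HIVE_HOME=/opt/Hive/hive-0.8.1
export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
echo "$PATH" | tr ':' '\n' | grep -c 'hive-0.8.1/bin'   # prints: 1
```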