1. Prepare the software
Hadoop 2.5.2 + Eclipse + hadoop-eclipse-plugin
You can download them here (extraction password: yxy4).
2. Create a user
*** My Ubuntu system has a lot of other things installed, so to keep them out of the way I created a dedicated user.
I recommend you do the same, because permissions can otherwise cause some rather troublesome problems later on.
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop   # set a new password; enter Y to confirm the details
Grant sudo privileges:
sudo vi /etc/sudoers
After line 20 or so (the root entry), add a line: hadoop ALL=(ALL:ALL) ALL
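A safer alternative (my suggestion, not part of the original steps) is visudo, which syntax-checks /etc/sudoers before saving, so a typo cannot lock you out of sudo:
sudo visudo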
***********************
Switch to the hadoop user:
su hadoop
***********************
3. Install SSH
sudo apt-get install openssh-server
Start the SSH service:
sudo /etc/init.d/ssh start
Check that it is running:
ps -e | grep ssh
Set up passwordless login:
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
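To verify the key setup (an optional check, not in the original steps), SSH into localhost; it should log you in without asking for a password (answer yes to the host-key prompt the first time):
ssh localhost
exit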
4. Install Hadoop
Extract the archive:
tar zxvf hadoop-2.5.2.tar.gz
sudo mv hadoop-2.5.2 /usr/local/hadoop
sudo chmod 774 /usr/local/hadoop
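Since everything from here on runs as the hadoop user, and the capacity check after startup (below) depends on correct ownership, it may also help to make the ownership explicit (my addition, not part of the original steps):
sudo chown -R hadoop:hadoop /usr/local/hadoop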
Configure the .bashrc file (no sudo needed, since you are editing your own file):
vi ~/.bashrc
Append the following:
export JAVA_HOME=<your Java installation path>
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
source ~/.bashrc
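To confirm the environment took effect (an optional check, not in the original), hadoop should now be on the PATH:
hadoop version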
*************** /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Change JAVA_HOME:
export JAVA_HOME=<your Java installation path>
*************** /usr/local/hadoop/etc/hadoop/core-site.xml
Add the following between <configuration> and </configuration>:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
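A side note (based on the Hadoop 2.x documentation, not on the original post): fs.default.name is the deprecated spelling of this property; in Hadoop 2.x the preferred key is fs.defaultFS with the same value, and either one works here:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>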
*************** Configure yarn-site.xml
Add the following between <configuration> and </configuration>:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
*************** Configure mapred-site.xml
In /usr/local/hadoop/etc/hadoop, create it from the template first:
cp mapred-site.xml.template mapred-site.xml
Add the following between <configuration> and </configuration>:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
*************** Configure hdfs-site.xml
First create the namenode and datanode directories:
cd /usr/local/hadoop
mkdir -p hdfs/name
mkdir -p hdfs/data
Open hdfs-site.xml and add the following between <configuration> and </configuration> (dfs.replication is 1 because this is a single-node, pseudo-distributed setup):
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hdfs/data</value>
</property>
Format HDFS:
hdfs namenode -format
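If formatting succeeds, the output should include a line similar to "Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted." (the exact wording may vary by version; this check is my addition, not from the original).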
Start Hadoop:
cd /usr/local/hadoop
sbin/start-dfs.sh
sbin/start-yarn.sh
Run the jps command to check the related processes.
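For this pseudo-distributed setup, jps should list roughly the following processes (PIDs will differ; this expected output is my summary, not from the original post):
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps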
Open http://localhost:50070/ in a browser to view the HDFS admin page.
##### Check whether the Hadoop filesystem capacity shown there is zero; if it is, some earlier step was done with the wrong permissions — every file and directory must be owned by the hadoop user.
Open http://localhost:8088 for the Hadoop (YARN) cluster management page.
Verify with WordCount:
cd /usr/local/hadoop
hadoop fs -mkdir -p input
hadoop fs -put README.txt input
**********
If the upload fails here with the error java.io.IOException: File /user/root/input could only be replicated to 0 nodes, instead of 1 ......
the cause is that the DataNode is not running.
**********
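A common remedy (my suggestion, not from the original post) is to stop HDFS, clear the datanode directory, re-format, and restart — note that re-formatting erases everything stored in HDFS:
sbin/stop-dfs.sh
rm -rf /usr/local/hadoop/hdfs/data/*
hdfs namenode -format
sbin/start-dfs.sh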
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount input output
(Use the compiled examples jar, not the -sources jar under share/hadoop/mapreduce/sources/, which contains only .java source files.)
hadoop fs -cat output/*
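To see what the job produced (an optional check, not in the original), list the output directory; a successful run leaves an empty _SUCCESS marker plus the results in part-r-00000:
hadoop fs -ls output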
5. Install Eclipse
After downloading, extract it:
tar zxvf eclipse-SDK-4.3.1-linux-gtk-x86_64.tar.gz
sudo mv eclipse /usr/local/eclipse
6. Build the eclipse-hadoop plugin
unzip hadoop2x-eclipse-plugin-master.zip
cd hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin
***************
First download htrace-core-3.0.4.jar from http://mvnrepository.com/artifact/org.htrace/htrace-core/3.0.4
and copy it to hadoop/share/hadoop/common/lib/
***************
ant jar -Dversion=2.5.2 -Dhadoop.version=2.5.2 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop
-Dversion: the plugin version
-Dhadoop.version: the Hadoop version
-Declipse.home: the Eclipse installation directory
-Dhadoop.home: the Hadoop installation directory
7. Install the plugin
1) Copy hadoop-eclipse-plugin-2.5.2.jar into your eclipse/plugins directory, then restart Eclipse.
2) Open Window > Preferences; a Hadoop Map/Reduce entry appears on the left side of the window. Click it and set the Hadoop installation path on the right.
3) Open Window > Open Perspective > Other, find the "Map/Reduce" item, and click OK.
4) Right-click in the blank area of the "Map/Reduce Locations" tab and choose "New Hadoop location..."; the "New Hadoop location..." dialog appears.
Set the Location Name to anything you like. The Map/Reduce Master port is 9001,
and the DFS Master port is 9000 (it must match the port in fs.default.name in core-site.xml).
Nothing else needs to be set; just save and close.
Click DFS Locations on the left, then the location name you configured; if you can see user, the installation succeeded!
Error 1:
Permission denied: user=root, access=WRITE
Fix:
Add the following to hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
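Note (my addition, not from the original post): dfs.permissions disables HDFS permission checking entirely, which is fine for a local test setup but not for anything shared, and HDFS must be restarted for the change to take effect:
sbin/stop-dfs.sh
sbin/start-dfs.sh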
Error 2:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
Fix:
Copy the log4j.properties file from the etc/hadoop/ directory into the src folder of the MapReduce project.
Error 3:
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/user/hadoop/output already exists
Fix:
Delete the output directory on HDFS:
hadoop fs -rm -r output