HADOOP 2.X Installation:
Install the JDK first, then install Hadoop.
-
Extract the archive (create the target directory first):
mkdir -p /usr/hadoop
tar -zxvf hadoop-2.7.1.tar.gz -C /usr/hadoop
- Configure environment variables:
vim /etc/profile
Append the following:
# HADOOP_HOME
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
:wq to save and quit
-
Reload the environment variables:
source /etc/profile
-
Verify the installation: hadoop version
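If `hadoop version` is not found, a quick PATH check narrows the problem down. A minimal sketch (the `check_cmd` helper name is made up for illustration; `command -v` is standard POSIX shell):

```shell
# check_cmd NAME: report whether NAME resolves on the current PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 not on PATH" >&2
    return 1
  fi
}

# If this fails, re-run 'source /etc/profile' and re-check HADOOP_HOME.
check_cmd hadoop || true
```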
-
Configure the cluster files (the important part)
Change to the configuration directory:
cd $HADOOP_HOME/etc/hadoop
-
Configure core-site.xml:
vim core-site.xml

<configuration>
    <!-- HDFS NameNode address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>
    <!-- Directory for data Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/hadoop-2.7.1/data</value>
    </property>
    <!-- Default static user for the web UI -->
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>yu</value>
    </property>
</configuration>
-
Configure hadoop-env.sh:
vim hadoop-env.sh
Set:
export JAVA_HOME=/usr/java/jdk1.8.0_152
-
Configure hdfs-site.xml:
vim hdfs-site.xml

<configuration>
    <!-- HDFS replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- NameNode web UI -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>master:9870</value>
    </property>
    <!-- SecondaryNameNode web UI -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>slave2:9868</value>
    </property>
</configuration>

(9870 and 9868 are the Hadoop 3.x defaults; on 2.7.1 they apply only because they are set explicitly here.)
-
Configure yarn-site.xml:
vim yarn-site.xml

<configuration>
    <!-- Auxiliary shuffle service for MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- ResourceManager host -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>slave1</value>
    </property>
    <!-- Environment variables inherited by containers -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <!-- Enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <!-- Log aggregation server address -->
    <property>
        <name>yarn.log.server.url</name>
        <value>http://master:19888/jobhistory/logs</value>
    </property>
    <!-- Keep aggregated logs for 7 days -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <!-- Largest / smallest container the scheduler may allocate (MB) -->
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>10000</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1000</value>
    </property>
    <!-- Disable virtual-memory checking; keep physical-memory checking -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>5</value>
    </property>
</configuration>
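The memory limits above can be sanity-checked with plain shell arithmetic; the figures below come straight from the configured values (10000 MB maximum container allocation, 1000 MB minimum, virtual/physical memory ratio 5):

```shell
# Pure arithmetic on the values configured above (all in MB).
max_alloc_mb=10000   # maximum container allocation
min_alloc_mb=1000    # minimum container allocation
vmem_ratio=5         # virtual-to-physical memory ratio

echo "minimum-size containers per maximum allocation: $(( max_alloc_mb / min_alloc_mb ))"
echo "virtual memory allowed for a minimum container: $(( min_alloc_mb * vmem_ratio )) MB"
```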
-
Configure mapred-site.xml:
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml

<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory server address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <!-- JobHistory server web UI -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

Start the history server:
sbin/mr-jobhistory-daemon.sh start historyserver
-
Set the worker nodes:
vim $HADOOP_HOME/etc/hadoop/slaves
(In Hadoop 3.x this file is named workers.)
Add:
master
slave1
slave2
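The same three hostnames can also be written non-interactively; this sketch writes to a scratch file so it is safe to run anywhere (copy it over $HADOOP_HOME/etc/hadoop/slaves yourself):

```shell
# Build the worker list in one step (hostnames from this guide).
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
master
slave1
slave2
EOF
wc -l < "$scratch"   # 3 hosts
# cp "$scratch" "$HADOOP_HOME/etc/hadoop/slaves"   # install for real
```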
-
Distribute the files:
scp -r /usr/hadoop root@slave1:/usr/
scp -r /usr/hadoop root@slave2:/usr/
scp /etc/profile root@slave1:/etc/
scp /etc/profile root@slave2:/etc/
Then on master, slave1, and slave2 run:
source /etc/profile
-
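With more workers, the individual scp commands scale poorly; a small loop does the same job (the `DRY_RUN` switch is a made-up convenience for previewing the commands; the hosts are assumed reachable over SSH as root, as above):

```shell
# run CMD...: print the command under DRY_RUN=1, otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Distribute the Hadoop tree and /etc/profile to every worker.
distribute() {
  for host in slave1 slave2; do
    run scp -r /usr/hadoop "root@${host}:/usr/"
    run scp /etc/profile "root@${host}:/etc/"
  done
}

DRY_RUN=1 distribute   # preview; drop DRY_RUN to copy for real
```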
Format the NameNode (on master, first run only):
hdfs namenode -format
-
Start HDFS (on master):
sbin/start-dfs.sh
- Check that the services are up in a browser:
- HDFS NameNode: master:9870
- SecondaryNameNode: slave2:9868
- YARN ResourceManager (after start-yarn.sh below): slave1:8088
-
Start YARN on slave1 (the ResourceManager node):
$HADOOP_HOME/sbin/start-yarn.sh
-
Start the JobHistory server (on master):
sbin/mr-jobhistory-daemon.sh start historyserver
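A final sanity check: `jps` on each node should list that node's daemons. The helper below (hypothetical name `check_daemons`; the per-node layout follows this guide's roles, with the NameNode on master, the ResourceManager on slave1, and the SecondaryNameNode on slave2) greps a jps listing for an expected set:

```shell
# check_daemons "<jps output>" Daemon...: fail on the first missing daemon.
check_daemons() {
  out=$1; shift
  for d in "$@"; do
    case "$out" in
      *"$d"*) ;;
      *) echo "missing: $d"; return 1 ;;
    esac
  done
  echo "all expected daemons running"
}

# On master (per this guide's layout), for example:
#   check_daemons "$(jps)" NameNode DataNode NodeManager JobHistoryServer
```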