Hadoop Installation and Environment Configuration
I. Set up passwordless SSH trust for the local machine
- Change the hostname:
hostnamectl set-hostname hadooptest1
- Before setting up trust, it is best to stop and disable the firewall, otherwise later steps may fail with errors:
systemctl stop firewalld
systemctl disable firewalld
- Add an "IP hostname" entry in the hosts file:
vi /etc/hosts
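For example, using the IP address and hostname from this guide (adjust to your own machine):
192.168.198.111 hadooptest1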
- Generate an SSH key pair (press Enter at each prompt to accept the defaults):
ssh-keygen
- Append the public key to authorized_keys (note it is the public key, id_rsa.pub, that is appended; the private key never leaves the machine):
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
- Copy the public key to the target machine (here, this machine itself):
ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@hadooptest1
- Verify with a remote login; it should succeed without asking for a password:
ssh root@hadooptest1
Extract the tarball and rename the directory:
tar -zxvf hadoop-2.6.0-cdh5.14.2.tar.gz -C /opt
mv /opt/hadoop-2.6.0-cdh5.14.2 /opt/hadoop2.6.0
II. Configure the files under /opt/hadoop2.6.0/etc/hadoop/
cd /opt/hadoop2.6.0/etc/hadoop/
1. vi hadoop-env.sh
Set JAVA_HOME to the JDK installation path.
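For example (the JDK path below is an assumption; point it at wherever your JDK is actually installed):
export JAVA_HOME=/opt/jdk1.8.0_211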
2. vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://IP-address:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop2.6.0/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>false</value>
<description>Should native hadoop libraries, if present, be used.
</description>
</property>
<!-- HDFS trash feature: the default 0 disables it; any value greater than 0 keeps deleted files in the trash and purges them after that many minutes (1440 = 24 hours) -->
<property>
<name>fs.trash.interval</name>
<value>1440</value>
<description>Number of minutes between trash checkpoints.
If zero, the trash feature is disabled.
</description>
</property>
</configuration>
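With fs.trash.interval enabled, a deletion through the HDFS shell moves the file into the current user's trash directory instead of removing it immediately. For example (run after HDFS is up; the path is illustrative):
hdfs dfs -rm /tmp/test.txt            # moved to /user/root/.Trash/Current/... rather than deleted
hdfs dfs -rm -skipTrash /tmp/test.txt # bypasses the trash and deletes immediately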
3. vi hdfs-site.xml
<configuration>
<!-- single-node setup, so one replica is enough -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>IP-address:50090</value>
</property>
</configuration>
4. vi mapred-site.xml
First rename the template with mv mapred-site.xml.template mapred-site.xml, then edit it:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>IP-address:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>IP-address:19888</value>
</property>
</configuration>
5. vi yarn-site.xml
<configuration>
<!-- how reducers fetch data (the shuffle service) -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<!-- hostname of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hostname</value>
</property>
<!-- enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- retain aggregated logs for 7 days (604800 seconds) -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
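With log aggregation enabled, the logs of a finished application can be fetched from the command line (the application ID below is illustrative; take the real one from the YARN web page):
yarn logs -applicationId application_1234567890123_0001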
6. vi ./slaves
Replace localhost with the actual hostname.
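For the single-node setup in this guide, the slaves file would then contain just:
hadooptest1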
III. Configure the Hadoop environment variables
vi /etc/profile
Insert the following between the JDK's JAVA_HOME line and the PATH line:
export HADOOP_HOME=/opt/hadoop2.6.0
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_INSTALL=$HADOOP_HOME
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
Then change the JDK's PATH line to:
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
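Reload the profile and confirm that both the JDK and Hadoop resolve on the new PATH:
source /etc/profile
java -version
hadoop version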
IV. Format HDFS
hadoop namenode -format
(This form still works in Hadoop 2.x but is deprecated; hdfs namenode -format is the preferred equivalent.)
V. Start Hadoop
start-all.sh
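start-all.sh is deprecated in Hadoop 2.x (it simply calls start-dfs.sh and start-yarn.sh), and it does not start the JobHistory server that backs the history page in the next section, so start that daemon separately and then check the running processes with jps (the expected daemon list follows from the single-node configuration above; PIDs will differ):
mr-jobhistory-daemon.sh start historyserver
jps
# expected on this single node: NameNode, DataNode, SecondaryNameNode,
# ResourceManager, NodeManager, JobHistoryServer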
VI. Access the Hadoop web UIs
1. HDFS (NameNode) page
192.168.198.111:50070
2. YARN ResourceManager page
192.168.198.111:8088
3. JobHistory server page
192.168.198.111:19888
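As a final smoke test, a small example job can be submitted (the jar path below follows the usual CDH tarball layout and is an assumption; adjust the wildcard to your version):
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10
The job should show up on the YARN page (8088) while it runs and on the history page (19888) once it finishes.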