1. Before installing Hadoop, make sure the JDK and the SSH service are both installed and working.
1.1
Transfer the required files from the local machine to each of the three virtual-machine hosts.
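For example, a minimal sketch using scp (the user name1 and the Downloads path are assumptions taken from the copy command in 1.2; adjust to your setup, and repeat for slave1 and slave2):
scp hadoop-2.9.0.tar.gz name1@master:/home/name1/Downloads/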
1.2
Copy the file to /usr/local/
Run: cp /home/name1/Downloads/hadoop-2.9.0.tar.gz /usr/local/
1.3 Run: tar -zxvf /usr/local/hadoop-2.9.0.tar.gz -C /usr/local/
1.4
To make future upgrades easier, rename the directory uniformly to hadoop.
Run: mv /usr/local/hadoop-2.9.0 /usr/local/hadoop
Before setting up the Hadoop master and nodes, make sure the hostname, environment variables, and SSH are all configured:
sudo gedit /etc/hostname, and set the hostname to master, slave1 (node1), or slave2 (node2) on the corresponding machine.
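For the names master, slave1, and slave2 to resolve, each machine's /etc/hosts also needs the name-to-IP mappings. A sketch with placeholder addresses (substitute your VMs' real IPs):
sudo gedit /etc/hosts
192.168.1.100 master
192.168.1.101 slave1
192.168.1.102 slave2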
1.5
On master, create the directories for the HDFS NameNode, DataNode, and temporary files:
/usr/local/hadoop/tmp/dfs/data
/usr/local/hadoop/tmp/dfs/name
Do the same on slave1 and slave2, or simply scp the directories from master to the corresponding path on each slave.
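Both directories (and their parents) can be created in one command on each machine, e.g.:
mkdir -p /usr/local/hadoop/tmp/dfs/name /usr/local/hadoop/tmp/dfs/data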
1.6
Configure the environment variables:
Add the Hadoop variables to .bashrc:
sudo gedit ~/.bashrc
export HADOOP_HOME=/usr/local/hadoop
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin
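After saving, reload the file so the variables take effect in the current shell:
source ~/.bashrc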
1.7
Go to the hadoop/etc/hadoop directory:
Modify the hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves files:
a. hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_144
b. core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
c. hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave1:50090</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>master:8020</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
d. mapred-site.xml
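Note: in Hadoop 2.x only mapred-site.xml.template ships by default; create the file first (run inside hadoop/etc/hadoop):
cp mapred-site.xml.template mapred-site.xml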
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
e. yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
f. slaves
slave1
slave2 (or use IP addresses directly)
Once the above configuration is complete, you can copy it straight to the corresponding directory on each node (or log in to each machine and edit the files by hand), for example with the scp command below.
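A sketch of pushing the whole configuration directory from master (assuming the login user is hadoop, as in the scp command in 1.8; repeat for slave2):
scp -r /usr/local/hadoop/etc/hadoop hadoop@slave1:/usr/local/hadoop/etc/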
1.8 Passwordless SSH login from master to the nodes
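If master has no key pair yet, generate one first (accept the defaults at the prompts):
ssh-keygen -t rsa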
a. Run scp ~/.ssh/id_rsa.pub hadoop@node1:~/ (on master; repeat for node2)
b. Append the key to the authorized_keys file (on the node):
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
Verify from master with ssh node1; it should log in without prompting for a password.
1.9 Starting and stopping Hadoop
a. Format the NameNode before the first start
Run: hdfs namenode -format (the older hadoop namenode -format still works but is deprecated)
b. Start
Run the corresponding scripts in the sbin directory:
start-dfs.sh and start-yarn.sh (or start-all.sh)
c. Stop
Run the corresponding scripts in the sbin directory:
stop-dfs.sh and stop-yarn.sh (or stop-all.sh)
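To check that the cluster is up, run jps on each machine. With the configuration above you should see roughly NameNode and ResourceManager on master, SecondaryNameNode on slave1, and DataNode plus NodeManager on slave1 and slave2 (a rough expectation; exact output varies):
jps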