Prerequisites
1. Hadoop: hadoop-2.4.1.tar.gz
2. Spark: download the prebuilt package for the matching Hadoop version: spark-1.3.0-bin-hadoop2.4.tgz
3. Java: jdk-7u80-linux-x64.tar.gz
4. Scala: scala-2.11.8.tgz
Environment Setup
Connect to the server from MobaXterm, run sudo virt-manager to open the Virtual Machine Manager window, create a new virtual machine from the ubuntu16.04 image, and then perform the following steps.
- Set a static IP address
```shell
## Edit the IP configuration: replace "#iface ens3 inet dhcp" with the following
sudo vim /etc/network/interfaces

# The primary network interface
auto ens3
iface ens3 inet static
address 192.168.122.54
netmask 255.255.255.0
gateway 192.168.122.1

## Restart networking
sudo /etc/init.d/networking restart
```
Note: cloned machines still need their own address changed afterwards. The static IPs used here are: master 192.168.122.54, slave1 192.168.122.55, slave2 192.168.122.56, slave3 192.168.122.57, slave4 192.168.122.58.
- Configure the hosts file
```shell
## Change the hostname to "master" (the clones get their own names later)
sudo vim /etc/hostname
## Edit the hosts file: keep only the localhost line and append the entries below
## (leaving the other default entries in place causes problems, discussed later)
sudo vim /etc/hosts

192.168.122.54 master
192.168.122.55 slave1
192.168.122.56 slave2
192.168.122.57 slave3
192.168.122.58 slave4
```
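The five host entries follow a simple pattern, so as a sanity check they can be generated rather than typed by hand. This throwaway helper (`print_hosts` is hypothetical, not part of the setup) prints exactly the lines to append:

```shell
# Hypothetical helper: print the /etc/hosts entries for the five-node cluster,
# derived from the 192.168.122.54-58 addresses chosen above.
print_hosts() {
  base=192.168.122
  i=54
  for host in master slave1 slave2 slave3 slave4; do
    printf '%s.%d %s\n' "$base" "$i" "$host"
    i=$((i + 1))
  done
}
print_hosts
```

Piping its output through `sudo tee -a /etc/hosts` would append the entries in one step.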
- Set up the Java environment
```shell
## Unpack the JDK
sudo tar -zxvf jdk-7u80-linux-x64.tar.gz -C ./software/
sudo mv jdkxxxx jdk   # rename the extracted directory to "jdk"
## Configure environment variables
sudo vim /etc/profile

export JAVA_HOME=/home/zmx/software/jdk
export JRE_HOME=/home/zmx/software/jdk/jre
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
```
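After running `source /etc/profile`, a quick check confirms the JDK actually landed where JAVA_HOME points. This is a hypothetical helper, not part of the original steps; the fallback path matches the exports above:

```shell
# Hypothetical check: confirm a java binary exists under the configured JAVA_HOME.
check_java() {
  JAVA_HOME=${JAVA_HOME:-/home/zmx/software/jdk}
  if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
  else
    echo "no java binary under $JAVA_HOME - re-check the tar/mv steps"
  fi
}
check_java
```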
- Set up the Scala environment
```shell
## Unpack
sudo tar -zxvf scala-2.11.8.tgz -C software/
sudo mv scala-2.11.8/ scala
## Append to the environment variables
sudo vim /etc/profile
## The file should end up as follows:

export JAVA_HOME=/home/zmx/software/jdk
export JRE_HOME=/home/zmx/software/jdk/jre
export SCALA_HOME=/home/zmx/software/scala
export PATH=$SCALA_HOME/bin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
```
- Disable the firewall
```shell
sudo ufw disable
# Firewall stopped and disabled on system startup
```
- Clone the virtual machine: run sudo virt-manager from the MobaXterm session to open the VM manager, then clone.
- Build the cluster and verify that the machines can reach each other
```shell
## After cloning, remember to change each machine's IP address and hosts entry;
## this gives five machines in total: master and slave1-slave4.
## On master (and likewise on every other machine):
ping slave1
ping slave2
ping slave3
ping slave4
```
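The four manual pings can be rolled into one loop that is easy to rerun on each machine. A sketch (`check_cluster` is hypothetical; it assumes the hostnames above resolve via /etc/hosts):

```shell
# Sketch: probe every node once and report reachability; run on each machine in turn.
check_cluster() {
  for host in master slave1 slave2 slave3 slave4; do
    if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
      echo "$host reachable"
    else
      echo "$host unreachable"
    fi
  done
}
check_cluster
```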
- Configure passwordless SSH login between master and slaves
```shell
## Generate a private/public key pair on every machine
ssh-keygen -t rsa
## On each slave, send its id_rsa.pub to master with scp
scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave1   # run on slave1
scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave2   # run on slave2
scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave3   # run on slave3
scp ./.ssh/id_rsa.pub zmx@master:~/.ssh/id_rsa.pub.slave4   # run on slave4
## On master, append all public keys to the authentication file authorized_keys
zmx@master:~$ cat .ssh/id_rsa.pub* >> ~/.ssh/authorized_keys
## Distribute the combined key file back to the slaves
scp .ssh/authorized_keys zmx@slave1:~/.ssh/
scp .ssh/authorized_keys zmx@slave2:~/.ssh/
scp .ssh/authorized_keys zmx@slave3:~/.ssh/
scp .ssh/authorized_keys zmx@slave4:~/.ssh/
## Finally, on every host, verify that password-less login works
ssh slave1
ssh slave2
ssh slave3
ssh slave4
```
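The four distribution commands on master differ only in the slave number, so they can be generated with a loop. This dry-run sketch (hypothetical `distribute_keys`; username zmx as above) echoes the commands instead of running them, so the list can be eyeballed before removing the `echo`:

```shell
# Dry-run sketch: print the scp commands that push authorized_keys to each slave.
# Delete the leading "echo" inside the loop to actually perform the copies.
distribute_keys() {
  for n in 1 2 3 4; do
    echo scp .ssh/authorized_keys "zmx@slave$n:~/.ssh/"
  done
}
distribute_keys
```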
Installing Hadoop
Configure the environment
- Configuration files
- hadoop-env.sh
```shell
## Append at the end
export HADOOP_IDENT_STRING=$USER
export JAVA_HOME=/home/zmx/software/jdk
export HADOOP_PREFIX=/home/zmx/software/hadoop-2.4.1
```
- yarn-env.sh
```shell
## Append at the end
export JAVA_HOME=/home/zmx/software/jdk
```
- slaves: adding master here makes master double as a slave
```
master
slave1
slave2
slave3
slave4
```
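The same file can be written non-interactively with a here-document. In the real setup the target is `$HADOOP_PREFIX/etc/hadoop/slaves` (path assumed from the exports above); the sketch writes to ./slaves for illustration:

```shell
# Sketch: generate the slaves file in one step instead of editing it by hand.
# Real target path (assumed): $HADOOP_PREFIX/etc/hadoop/slaves
cat > slaves <<'EOF'
master
slave1
slave2
slave3
slave4
EOF
cat slaves
```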
- core-site.xml
```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zmx/software/hadoop-2.4.1/tmp</value>
  </property>
</configuration>
```
- hdfs-site.xml
```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:50020</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:50075</value>
  </property>
</configuration>
```