1 Preparation
1.1 Download the installation packages
- Download Hadoop
wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
- jdk1.8.0_121
- Download MySQL
wget https://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-community-server-5.7.17-1.el6.x86_64.rpm --no-check-certificate
- Download Hive
wget http://mirrors.cnnic.cn/apache/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
wget http://mirrors.cnnic.cn/apache/hive/hive-2.1.1/apache-hive-2.1.1-src.tar.gz
- Download ZooKeeper
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.5.2-alpha/zookeeper-3.5.2-alpha.tar.gz --no-check-certificate
- Download HBase
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.3.0/hbase-1.3.0-bin.tar.gz --no-check-certificate
- Download Storm
wget https://mirrors.cnnic.cn/apache/storm/apache-storm-1.1.0/apache-storm-1.1.0.tar.gz --no-check-certificate
- Download Sqoop
wget https://mirrors.cnnic.cn/apache/sqoop/1.99.7/sqoop-1.99.7-bin-hadoop200.tar.gz --no-check-certificate
- Download Spark
wget https://mirrors.cnnic.cn/apache/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz --no-check-certificate
wget https://mirrors.cnnic.cn/apache/spark/spark-2.1.0/spark-2.1.0-bin-without-hadoop.tgz --no-check-certificate
- Download Kafka
wget https://mirrors.cnnic.cn/apache/kafka/0.10.1.1/kafka_2.11-0.10.1.1.tgz --no-check-certificate
- Download Flume
wget https://mirrors.cnnic.cn/apache/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz --no-check-certificate
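Each downloaded archive is then extracted into the /app installation directory described in 1.2. A minimal sketch for the Hadoop tarball (the other tar.gz/tgz archives follow the same pattern):
mkdir -p /app
tar -xzf hadoop-2.7.3.tar.gz -C /app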
1.2 Prepare four nodes:
- node01, node02, node03, node04, with passwordless SSH (mutual trust) already configured between them and the firewall disabled on each node (see the sketch after this list)
- Applications are extracted and installed under /app
- The HDFS directory is /app/dirhadoop/
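A minimal sketch of the SSH trust and firewall setup, assuming a CentOS 6 style system with iptables (the MySQL RPM above targets el6); run on each of the four nodes:
# generate a key pair and push the public key to every node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for h in node01 node02 node03 node04; do ssh-copy-id $h; done
# stop the firewall now and keep it disabled across reboots
service iptables stop
chkconfig iptables off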
1.3 Configure environment variables
Run on every node:
echo 'export JAVA_HOME=/app/jdk1.8.0_121' >> /etc/profile
echo 'export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar' >> /etc/profile
echo 'export HADOOP_HOME=/app/hadoop-2.7.3' >>/etc/profile
echo 'export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin' >> /etc/profile
source /etc/profile
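A quick sanity check that the variables took effect (assumes the JDK and Hadoop archives are already extracted under /app):
java -version
hadoop version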
1.4 Create the directories used by HDFS
Run on every node:
# hadoop.tmp.dir in core-site.xml
mkdir -p /app/dirhadoop/tmp
# dfs.namenode.name.dir in hdfs-site.xml
mkdir -p /app/dirhadoop/dfs/name
# dfs.datanode.data.dir in hdfs-site.xml
mkdir -p /app/dirhadoop/dfs/data
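Since every node needs the same directories, they can also be created from node01 in one pass over SSH (a sketch relying on the SSH trust set up in 1.2):
for h in node01 node02 node03 node04; do
  ssh $h "mkdir -p /app/dirhadoop/tmp /app/dirhadoop/dfs/name /app/dirhadoop/dfs/data"
done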
2 Hadoop 2.7.3 configuration files
Configure the files on one node first, then scp them to the other nodes (a sketch follows).
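A sketch of distributing the finished configuration directory from node01 to the other nodes (assumes Hadoop is already unpacked to /app/hadoop-2.7.3 everywhere):
for h in node02 node03 node04; do
  scp -r /app/hadoop-2.7.3/etc/hadoop/* $h:/app/hadoop-2.7.3/etc/hadoop/
done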
2.1 Configure hadoop-env.sh
echo 'export JAVA_HOME=/app/jdk1.8.0_121' >>/app/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
2.2 Configure core-site.xml
2.2.1 Parameter explanation
- hadoop.tmp.dir defines the directory where Hadoop stores its temporary files
- fs.default.name (a deprecated alias of fs.defaultFS in Hadoop 2.x) specifies how HDFS is addressed; for example, HBase references it in hbase.rootdir:
<property>
<name>hbase.rootdir</name>
<value>hdfs://node01:9000/hbase</value>
</property>
2.2.2 Configuration file
Open /app/hadoop-2.7.3/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://node01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/dirhadoop/tmp</value>
</property>
</configuration>
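After saving the file, the effective value can be checked from the command line (a quick check; fs.defaultFS is the non-deprecated key that fs.default.name maps to in Hadoop 2.x):
hdfs getconf -confKey fs.defaultFS
# expected output: hdfs://node01:9000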
2.3 Configure hdfs-site.xml
Important parameters:
- dfs.namenode.name.dir
- dfs.datanode.data.dir
Open /app/hadoop-2.7.3/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/app/dirhadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/app/dirhadoop/dfs/data</value>
</property>
</configuration>
2.4 Configure mapred-site.xml
Important parameters:
- mapreduce.jobhistory.webapp.address defines the web UI address of the MapReduce JobHistory server
- After deployment, the JobHistory server can be started with sbin/mr-jobhistory-daemon.sh start historyserver (see the sketch below)
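For reference, the start command with its full path, run on node01 once the cluster is deployed; the web UI should then answer on the configured webapp address (19888 is the default port, assumed here):
/app/hadoop-2.7.3/sbin/mr-jobhistory-daemon.sh start historyserver
curl http://node01:19888/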
Open /app/hadoop-2.7.3/etc/hadoop/mapred-site.xml (copy it from mapred-site.xml.template if it does not yet exist):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>node01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>