1. Install Linux and the JDK, disable the firewall, and configure the hostname
Extract: tar -zxvf hadoop-2.7.3.tar.gz -C ~/training/
Set the Hadoop environment variables: vi ~/.bash_profile
HADOOP_HOME=/root/training/hadoop-2.7.3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
Apply the changes:
source ~/.bash_profile
Note: the JDK must be installed before Hadoop.
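After sourcing the profile, it can help to confirm that the variables really took effect. The following is a minimal sketch, not part of the original steps; check_hadoop_env is a hypothetical helper name, and it only inspects the current shell's environment.

```shell
# Hypothetical helper: report which parts of the Hadoop environment are missing.
# Prints one line per problem and returns 1 if anything is wrong, 0 otherwise.
check_hadoop_env() {
  missing=0
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
    missing=1
  fi
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set"
    missing=1
  fi
  # Look for $HADOOP_HOME/bin as a whole PATH entry
  case ":$PATH:" in
    *":${HADOOP_HOME:-/nonexistent}/bin:"*) ;;
    *) echo "HADOOP_HOME/bin is not on PATH"; missing=1 ;;
  esac
  return "$missing"
}
```

If check_hadoop_env prints nothing, running hadoop version is a reasonable next check that the installation itself works.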
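2. Hadoop directory structure
Three installation modes:
1. Local mode (one machine)
Characteristics: no HDFS; can only be used to test MapReduce programs
MapReduce processes files on the local Linux file system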
vi hadoop-env.sh (in $HADOOP_HOME/etc/hadoop)
Line 25: export JAVA_HOME=/root/training/jdk1.8.0_144 (check the current value first with echo $JAVA_HOME)
Test a MapReduce program:
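1. Create the input directory: mkdir ~/input
2. Run the example (local mode can only test MapReduce)
Examples jar: /root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar
From that directory, with data.txt already placed in ~/input and no existing ~/output:
hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount ~/input/data.txt ~/output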
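The local-mode test above can be wrapped in a small script. This is a sketch under stated assumptions: run_local_wordcount is a hypothetical helper, the sample text is made up, and the jar path is the one given in these notes. The job is only launched if the hadoop command and the jar are actually present.

```shell
# Hypothetical helper: prepare sample input and run the bundled wordcount example.
# $1 is the base directory that will hold input/ and output/ (the notes use ~).
run_local_wordcount() {
  base_dir=${1:?base directory required}
  examples_jar=/root/training/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar

  mkdir -p "$base_dir/input"
  # Made-up sample text, only for illustration
  printf 'hello hadoop\nhello mapreduce\n' > "$base_dir/input/data.txt"

  # The job aborts if the output directory already exists, so clear it first
  rm -rf "$base_dir/output"

  if command -v hadoop >/dev/null 2>&1 && [ -f "$examples_jar" ]; then
    hadoop jar "$examples_jar" wordcount "$base_dir/input/data.txt" "$base_dir/output"
    # In local mode the result files land on the local file system
    cat "$base_dir"/output/part-r-*
  else
    echo "hadoop or the examples jar was not found; skipping the job" >&2
  fi
}
```

Calling run_local_wordcount "$HOME" mirrors the paths used in the notes; the result should list each distinct word with its count.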
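2. Pseudo-distributed mode (one machine)
Characteristics: simulates a distributed environment on a single machine
Provides all of Hadoop's main features
HDFS: NameNode + DataNode + SecondaryNameNode
YARN: ResourceManager + NodeManager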
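Once the pseudo-distributed daemons have been started, the five processes listed above should appear in the output of the JDK's jps tool. A minimal sketch of that check follows; missing_daemons is a hypothetical helper that reads jps-style output ("<pid> <class name>" per line) on standard input.

```shell
# Hypothetical helper: print every expected Hadoop daemon that is absent
# from jps-style output supplied on stdin.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second field exactly, so NameNode does not match SecondaryNameNode
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Typical use would be jps | missing_daemons; empty output means all five daemons are running.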
Configuration files:
hadoop-env.sh
export JAVA_HOME=/root/training/jdk1.8.0_144 (the JDK path must be set here as well)
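Editing hadoop-env.sh in vi works fine; the same change can also be scripted. The sketch below assumes the file already contains a line starting with export JAVA_HOME= (and appends one if it does not); set_java_home is a hypothetical helper, not a Hadoop command.

```shell
# Hypothetical helper: point JAVA_HOME in a hadoop-env.sh file at a given JDK.
# $1: path to hadoop-env.sh, $2: JDK installation directory
set_java_home() {
  env_file=$1
  jdk_dir=$2
  tmp_file=$(mktemp)
  # Rewrite only the existing "export JAVA_HOME=..." line
  sed "s|^export JAVA_HOME=.*|export JAVA_HOME=${jdk_dir}|" "$env_file" > "$tmp_file"
  # Copy the content back so the original file keeps its permissions
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
  # If no such line existed, add one at the end
  grep -q '^export JAVA_HOME=' "$env_file" || echo "export JAVA_HOME=${jdk_dir}" >> "$env_file"
}
```

With the paths from these notes, the call would be set_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" /root/training/jdk1.8.0_144.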
hdfs-site.xml
(Rule of thumb: the block replication factor usually equals the number of DataNodes, up to a maximum of 3)
<!-- Block replication factor; default: 3 -->
<property>
<name>dfs.replication</name>
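For reference, a complete dfs.replication entry can be generated as shown below. This is only a sketch: print_replication_property is a hypothetical helper, and the value 1 is an assumption that follows from the rule of thumb above for a single DataNode, not a value stated in these notes. The resulting property belongs inside the <configuration> element of hdfs-site.xml.

```shell
# Hypothetical helper: print a dfs.replication property for hdfs-site.xml.
# $1 is the replication factor; 1 is assumed here for a single DataNode.
print_replication_property() {
  replication=${1:-1}
  cat <<EOF
<property>
  <name>dfs.replication</name>
  <value>${replication}</value>
</property>
EOF
}

print_replication_property 1
```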