Note: the official site only ships 32-bit binary packages; a 64-bit package has to be compiled yourself.
192.168.100.200 master
192.168.100.201 slave1
192.168.100.202 slave2
2 Install the JDK on all three nodes
bin etc games hadoop-1.2.1.tar.gz include jdk-7u79-linux-x64.tar.gz lib lib64 libexec sbin share src
Append to /etc/profile:
JAVA_HOME=/usr/local/jdk1.7.0_79
PATH=$JAVA_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export PATH
export CLASSPATH
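After appending these lines to /etc/profile, they can be applied with `source /etc/profile`. A minimal sketch of applying and checking the variables in the current shell, using the JDK path from above:

```shell
# Same values as the /etc/profile entries above.
export JAVA_HOME=/usr/local/jdk1.7.0_79
export PATH="$JAVA_HOME/bin:$PATH"
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# On a real node, `java -version` should now report 1.7.0_79.
```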
3 Fully distributed installation and deployment
3.1 Configure SSH (all three nodes)
Run ssh-keygen and press Enter twice to accept the defaults.
Copy each node's authorized_keys content into the other nodes' files, so that every node can log in to every other node without a password.
Edit /etc/hosts on every node so it contains:
192.168.100.200 master
192.168.100.201 slave1
192.168.100.202 slave2
192.168.100.200 localhost.localdomain   -- this line differs on each node
localhost.localdomain
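The key setup can be sketched as follows. For a self-contained demo the keys land in a temporary directory; on a real node you would use ~/.ssh and push each node's id_rsa.pub to the other two, for example with `ssh-copy-id root@slave1` (user and host here are assumptions):

```shell
# Generate a passphrase-less RSA key pair non-interactively
# (the "press Enter twice" step). A temp dir stands in for ~/.ssh.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$keydir/id_rsa"
# Authorize the public key; across nodes you would append each
# node's id_rsa.pub to the other nodes' authorized_keys instead.
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"
```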
3.2 Install Hadoop
Configure hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves:
hadoop-env.sh:
export JAVA_HOME=/usr/local/jdk1.7.0_79
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.100.200:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>192.168.100.200:9001</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.100.200:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.100.200:19888</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.100.200:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.100.200:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.100.200:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>192.168.100.200:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.100.200:8088</value>
</property>
</configuration>
slaves:
192.168.100.201
192.168.100.202
3.3 Copy Hadoop to each node
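Assuming Hadoop was unpacked to /usr/local/hadoop-2.7.1 on master (the path is an assumption), distribution is a pair of scp runs. The sketch below only prints the commands (a dry run); drop the `echo` to execute them:

```shell
HADOOP_DIR=/usr/local/hadoop-2.7.1       # assumed install path on master
for node in slave1 slave2; do            # hostnames from /etc/hosts
  echo scp -r "$HADOOP_DIR" "root@${node}:/usr/local/"   # dry run
done
```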
3.4 Format HDFS
Run the format command on the NameNode, i.e. master.
If the output contains "successfully formatted", the format succeeded.
Before reformatting the distributed filesystem, clear the existing filesystem data first; otherwise the DataNodes will fail to start.
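A sketch of the format step, run on master only. The install path is an assumption; since hadoop.tmp.dir is not set in core-site.xml above, HDFS data defaults to /tmp/hadoop-${USER}, which is what must be cleared before a re-format. The command is printed rather than executed:

```shell
HADOOP_HOME=/usr/local/hadoop-2.7.1            # assumed install path
format_cmd="$HADOOP_HOME/bin/hdfs namenode -format"
# Before a re-format, clear the old data on every node first, e.g.:
#   rm -rf /tmp/hadoop-*       # default hadoop.tmp.dir location
echo "$format_cmd"   # dry run; look for "successfully formatted"
```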
3.5 Start the cluster
Disable the firewall on all nodes first.
The start scripts can be run on the NameNode (master).
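The startup can be sketched as below (commands are printed, not executed). The firewall commands assume a CentOS 6-style system with iptables; adjust for systemd-based distributions:

```shell
HADOOP_HOME=/usr/local/hadoop-2.7.1       # assumed install path
# On every node: stop the firewall (CentOS 6 style; an assumption).
echo service iptables stop
echo chkconfig iptables off
# On master: bring up HDFS, then YARN.
start_cmds="$HADOOP_HOME/sbin/start-dfs.sh $HADOOP_HOME/sbin/start-yarn.sh"
echo "$start_cmds"    # dry run
```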
3.6 Check the running daemons with jps
On master: NameNode, SecondaryNameNode, and ResourceManager should be running.
On each slave: DataNode and NodeManager.
Web UIs:
http://192.168.100.200:8088/ -- ResourceManager
http://192.168.100.200:50070/ -- NameNode; Live Nodes should show 2
4 Integrating Hadoop with the Eclipse development environment
Copy hadoop-eclipse-plugin-2.7.1.jar to ${eclipse}\dropins\plugins.
Open Window --> Show View --> find Map/Reduce Locations --> New Hadoop location.
Extract a copy of hadoop-2.7.1 locally, e.g. under d:\.
Copy the matching winutils.exe and hadoop.dll files into hadoop\bin.
Then add the Hadoop path to the environment variables (HADOOP_HOME).
Create log4j.properties under the project's src directory:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n