The official installation requirements:
Required Software
Required software for Linux include:
1. Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
2. ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
1. Configure passwordless SSH login
CentOS does not enable passwordless SSH login by default. Uncomment the following lines in /etc/ssh/sshd_config:
#RSAAuthentication yes
#PubkeyAuthentication yes
# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Verify:
# ssh localhost
Last login: Thu Jul 27 22:53:44 2017 from localhost
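If ssh localhost still asks for a password after these steps, one common cause is file permissions: with its default StrictModes setting, sshd ignores an authorized_keys file that group or others can write to. The sketch below tightens the permissions. The function name fix_ssh_perms is hypothetical and not part of any tool, and the directory defaults to ~/.ssh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tighten permissions on an SSH directory so that
# sshd (StrictModes) accepts the authorized_keys file inside it.
fix_ssh_perms() {
    local dir="${1:-$HOME/.ssh}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    chmod 700 "$dir"
    # authorized_keys may not exist yet; only change it if it is there.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"
    fi
}
```

Run fix_ssh_perms with no arguments, then try ssh localhost again.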
Check whether rsync is installed:
# rpm -qa | grep -i rsync
rsync-3.0.6-12.el6.i686
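2. Install Hadoop
Download it from the official site: http://hadoop.apache.org/releases.html
I downloaded version 2.7.3 (3.0 is already at alpha4). The downloaded tar.gz is 205 MB.
Extract it under /usr/local:
$ tar -zxvf hadoop-2.7.3.tar.gz
The remaining commands are run from the extracted hadoop-2.7.3 directory. Running bin/hadoop without arguments prints its usage:
$ bin/hadoop
Hadoop supports three modes:
Local (Standalone) Mode, Pseudo-Distributed Mode, Fully-Distributed Mode
By default Hadoop is configured to run in non-distributed mode, as a single Java process, which is convenient for debugging.
1) Create the directories tmp, hdfs, hdfs/data and hdfs/name: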
$ mkdir tmp hdfs
$ mkdir hdfs/data hdfs/name
3. Hadoop configuration files
The files involved:
hadoop-env.sh
core-site.xml
yarn-env.sh
hdfs-site.xml
mapred-site.xml
yarn-site.xml
1) Configuration file etc/hadoop/hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/local/jdk
2) Configuration file etc/hadoop/yarn-env.sh
#export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/local/jdk/
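A wrong JAVA_HOME in these two files only shows up later, when the daemons fail to start. A quick way to catch it early is to check that the directory really contains an executable bin/java. This is a minimal sketch; check_java_home is a hypothetical name, and /usr/local/jdk is simply the path used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given directory looks like a
# Java home, i.e. it contains an executable bin/java.
check_java_home() {
    local home="${1%/}"   # drop one trailing slash, e.g. /usr/local/jdk/
    if [ -x "$home/bin/java" ]; then
        echo "ok: $home/bin/java"
    else
        echo "JAVA_HOME invalid: $home/bin/java not found or not executable" >&2
        return 1
    fi
}
```

For example, check_java_home /usr/local/jdk/ before editing hadoop-env.sh and yarn-env.sh.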
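3) Configuration file etc/hadoop/core-site.xml:
fs.defaultFS is the HDFS URI, in the form filesystem://namenode-identifier:port
hadoop.tmp.dir is the local Hadoop temporary directory on the namenode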
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.7.3/tmp</value>
</property>
</configuration>
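If the host, port or install path differ on another machine, the same file can be generated instead of edited by hand. The following is only a sketch: render_core_site is a hypothetical helper, and its three arguments stand for the namenode host, the namenode port and the hadoop.tmp.dir path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a core-site.xml for a single-node setup.
#   $1 = namenode host, $2 = namenode port, $3 = hadoop.tmp.dir
render_core_site() {
    local host="$1" port="$2" tmp_dir="$3"
    cat <<EOF
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://${host}:${port}</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>${tmp_dir}</value>
</property>
</configuration>
EOF
}
```

With the values used here: render_core_site localhost 9000 /usr/local/hadoop-2.7.3/tmp > etc/hadoop/core-site.xml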
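4) Configuration file etc/hadoop/hdfs-site.xml:
dfs.replication is the number of replicas. The default is 3, and it should not exceed the number of datanode machines, so it is set to 1 here. dfs.name.dir and dfs.data.dir point at the directories created earlier; Hadoop 2.x still accepts these names but treats them as deprecated aliases of dfs.namenode.name.dir and dfs.datanode.data.dir.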
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/usr/local/hadoop-2.7.3/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/hadoop-2.7.3/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
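5) Configuration file etc/hadoop/mapred-site.xml (the 2.7.3 tarball ships only mapred-site.xml.template; copy it to mapred-site.xml first)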
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
6) Configuration file etc/hadoop/yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.253.119:8099</value>
</property>
</configuration>
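Before formatting, it helps to read the values back out of the files and confirm they are what was intended. On a working installation bin/hdfs getconf -confKey fs.defaultFS does this. The sketch below does the same with plain awk and no Hadoop involved. get_prop is a hypothetical helper and assumes one tag per line, as in the files above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> that follows <name>KEY</name>
# in a Hadoop *-site.xml file. Assumes one tag per line.
get_prop() {
    local file="$1" key="$2"
    awk -v key="$key" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/^.*<value>/, "")      # strip everything up to the opening tag
            sub(/<\/value>.*$/, "")    # strip the closing tag and what follows
            print
            exit
        }
    ' "$file"
}
```

For example, get_prop etc/hadoop/hdfs-site.xml dfs.replication prints the configured replica count, and an unknown key prints nothing.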
Format the namenode:
$ bin/hdfs namenode -format
Start Hadoop (HDFS):
$ sbin/start-dfs.sh
Start YARN:
$ sbin/start-yarn.sh
7/07/28 00:03:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
A WARN appears, but the daemons did start, so it does not affect use.
$ jps
4593 NodeManager
3571 DataNode
4677 Jps
4494 ResourceManager
3742 SecondaryNameNode
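There should be five daemons besides Jps, but NameNode is missing.
Stop HDFS with sbin/stop-dfs.sh and format the namenode again:
$ bin/hadoop namenode -format
Then start it again with sbin/start-dfs.sh. Now all of them show up: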
$ jps
6037 SecondaryNameNode
6647 Jps
6186 ResourceManager
5837 DataNode
6286 NodeManager
5743 NameNode
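Spotting a missing daemon by eye is easy to get wrong, so the comparison can be scripted. This is a minimal sketch: missing_daemons is a hypothetical helper, and the list of five expected names is taken from the jps output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `jps` output on stdin and print every expected
# Hadoop daemon that is NOT listed. Returns non-zero if any is missing.
missing_daemons() {
    local expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"
    local names name missing=0
    # Keep only the second column (the process name) of each jps line.
    names="$(awk '{ print $2 }')"
    for name in $expected; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$names" | grep -qx -- "$name"; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: jps | missing_daemons. With the first jps listing it prints NameNode; with the second it prints nothing.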
netstat -ant shows ports 50070 and 9000 listening.
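The same port check can be done without reading the whole netstat listing. The sketch below assumes the Linux netstat -ant column layout (local address in column 4, state in the last column); is_listening is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `netstat -ant` output on stdin and succeed if
# some socket is in LISTEN state on the given TCP port.
is_listening() {
    local port="$1"
    awk -v port="$port" '
        $NF == "LISTEN" {
            # Local address looks like 0.0.0.0:50070 or :::50070;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

Usage: netstat -ant | is_listening 50070 && echo "web UI port is up"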
Stop Hadoop:
$ sbin/stop-dfs.sh
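After installation, the management web UI on port 50070 could not be reached from Windows. Two things fixed it:
1) Adjust the firewall. Since this is a test on an internal network, I simply turned it off.
2) Set up a NAT mapping for port 50070.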
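Outside a throwaway test network, opening just the one port is safer than turning the firewall off. The sketch below assumes a CentOS 6 style system where the firewall is the iptables service (the el6 rsync package above suggests this, but it is an assumption). print_open_port_cmds is a hypothetical helper that only prints the commands so they can be reviewed before running them as root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the iptables commands that would
# open one TCP port on a CentOS 6 style system.
print_open_port_cmds() {
    local port="$1"
    case "$port" in
        ''|*[!0-9]*) echo "port must be a number: $port" >&2; return 1 ;;
    esac
    # Insert an ACCEPT rule at the top of the INPUT chain ...
    echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
    # ... and persist the current rules across reboots.
    echo "service iptables save"
}
```

Usage: print_open_port_cmds 50070, then run the two printed lines as root if they look right.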