Client environment setup
Edit the Windows 7 hosts file so the web UIs can be reached by hostname
--------------------
[C:\Windows\System32\drivers\etc\hosts]
127.0.0.1       localhost
192.168.238.128 s100
192.168.238.129 s102
192.168.238.130 s103
192.168.238.131 s104
192.168.238.132 s105
--------------------
IPs: 100 - 104
Hostname: s100
Prepare 5 hosts or virtual machines and set their hostnames
--------------
[/etc/hostname]
s100
Set up name resolution
-------------------
[/etc/hosts]
127.0.0.1       localhost
192.168.231.100 s100
192.168.231.101 s101
192.168.231.102 s102
192.168.231.103 s103
192.168.231.104 s104
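Once /etc/hosts is in place, the name-to-IP mapping can be sanity-checked without logging in anywhere. A minimal sketch (the ip_of helper is made up for illustration, and it reads a throwaway copy rather than the real /etc/hosts):

```shell
#!/bin/bash
# Sketch: look up the IP recorded for a hostname in a hosts-format file.
# Works on a throwaway copy so the real /etc/hosts is never touched.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
127.0.0.1       localhost
192.168.231.100 s100
192.168.231.101 s101
EOF

ip_of() {
    # $1 = hosts file, $2 = hostname; prints the first matching IP
    awk -v host="$2" '$2 == host { print $1; exit }' "$1"
}

ip_of "$tmp" s100    # prints 192.168.231.100
```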
Configure the IP address
-----------------------
[/etc/network/interfaces]
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.231.100
netmask 255.255.255.0
gateway 192.168.231.2
dns-nameservers 192.168.231.2
Installing Hadoop
------------------
0. Create the /soft directory and change its owner and group
$>sudo mkdir /soft
$>sudo chown ubuntu:ubuntu /soft
1. Install the jdk
a. Copy jdk-8u65-linux-x64.tar.gz to ~/Downloads
$>cp /mnt/hgfs/downloads/bigdata/jdk-8u65-linux-x64.tar.gz ~/Downloads
b. Untar jdk-8u65-linux-x64.tar.gz
$>cd ~/Downloads
$>tar -xzvf jdk-8u65-linux-x64.tar.gz
c. Move jdk1.8.0_65 to /soft
$>mv ~/Downloads/jdk1.8.0_65 /soft
$>ln -s /soft/jdk-xxx /soft/jdk		//create a symbolic link
d. Configure environment variables
[/etc/environment]
JAVA_HOME=/soft/jdk
PATH="...:/soft/jdk/bin"
e. Apply the environment variables
$>source /etc/environment
f. Verify the installation
$>java -version
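After sourcing /etc/environment it is easy to assume PATH was updated when it was not; the small check below makes the test explicit (on_path is a hypothetical helper, shown with a hard-coded PATH for illustration):

```shell
#!/bin/bash
# Sketch: report whether a directory is present on PATH.
on_path() {
    case ":$PATH:" in
        *":$1:"*) echo yes ;;
        *)        echo no ;;
    esac
}

# Hard-coded here for illustration; normally PATH comes from the environment.
PATH="/usr/local/bin:/usr/bin:/soft/jdk/bin"
on_path /soft/jdk/bin    # prints yes
```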
2. Install hadoop
a. Copy and untar hadoop-2.7.2.tar.gz
$>cp /mnt/hgfs/downloads/bigdata/hadoop-2.7.2.tar.gz ~/Downloads/
$>cd ~/Downloads
$>tar -xzvf hadoop-2.7.2.tar.gz
$>mv ~/Downloads/hadoop-2.7.2 /soft		//move it under /soft
$>cd /soft
$>ln -s hadoop-2.7.2 hadoop		//create the hadoop symbolic link
b. Configure environment variables
$>sudo nano /etc/environment
[/etc/environment]
JAVA_HOME=/soft/jdk
HADOOP_HOME=/soft/hadoop
PATH="...:/soft/jdk/bin:/soft/hadoop/bin:/soft/hadoop/sbin"
c. Reboot the system
$>sudo reboot
d. Verify the hadoop installation
$>hadoop version
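Both installs rely on the same convention: a versioned directory plus an unversioned symlink (/soft/hadoop -> /soft/hadoop-2.7.2), so an upgrade only has to retarget the link. The sketch below rehearses the convention in a scratch directory instead of the real /soft:

```shell
#!/bin/bash
# Sketch: reproduce the versioned-dir-plus-symlink layout in a scratch
# directory and verify where the link points.
root=$(mktemp -d)
mkdir "$root/hadoop-2.7.2"
ln -s "$root/hadoop-2.7.2" "$root/hadoop"

[ -L "$root/hadoop" ] && echo "hadoop is a symlink"
readlink "$root/hadoop"    # prints the versioned path
```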
Configuring Hadoop
----------------
1. Standalone/local mode
Standalone (local) mode uses the local file system.
Nothing to configure!
To browse the file system:
$>hadoop fs -ls
No Java processes are started.
Used for testing and development.
2. Pseudo-distributed mode
[Configuration steps]
a. core-site.xml
<?xml version="1.0" ?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
b. hdfs-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
c. mapred-site.xml (create this file yourself)
<?xml version="1.0"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
d. yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
e. Configure SSH
Passwordless secure login.
1) Install ssh
$>sudo apt-get install ssh
2) Generate a key pair
$>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$>cd ~/.ssh		//inspect the generated public/private keys
3) Append the public key to the authorized-keys file
$>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
4) Log in to localhost
$>ssh localhost
$>....yes
$>exit
$>ssh localhost		//no password required this time
5) Format the hdfs file system
$>hadoop namenode -format
6) Start all processes
$>start-all.sh
7) Check the processes
$>jps		//5 processes: RM, NM, NN, DN, 2NN
8) Browse the file system
$>hadoop fs -ls
9) Create a directory in the file system
$>hadoop fs -mkdir -p /user/ubuntu/data
$>hadoop fs -ls -R /		//same as the old -lsr
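Steps 2) and 3) above overwrite files under ~/.ssh, so it can be worth rehearsing them against a scratch directory first. A minimal sketch, assuming the openssh client tools are installed:

```shell
#!/bin/bash
# Sketch: generate a throwaway RSA key pair and build an authorized_keys
# file from it, mirroring steps 2) and 3) without touching ~/.ssh.
dir=$(mktemp -d)
ssh-keygen -t rsa -P '' -f "$dir/id_rsa" > /dev/null
cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
head -c 7 "$dir/authorized_keys"    # prints ssh-rsa
```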
3. Fully distributed mode
Notes on installing SSH
---------------
1. Disable wifi
2. Turn off the firewall
3. Make sure the client can reach the external network
$>ping www.baidu.com
4. Switch the ubuntu package sources
[/etc/apt/sources.list]
...
[163 source]
[aliyun source]
deb http://mirrors.aliyun.com/ubuntu/ precise main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ precise-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ precise-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ precise-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ precise-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ precise main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ precise-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ precise-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ precise-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ precise-backports main restricted universe multiverse
5. Install ssh
$>sudo apt-get install ssh
6. Check whether the sshd service is running
$>ps -Af | grep ssh
7. Generate a key pair
$>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
8. Append the public key to the authorized-keys file
$>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
9. Log in to localhost
$>ssh localhost
$>type yes
10. Exit, then log in again (no password this time)
$>ssh localhost
Using nc to transfer a file between two clients
------------------------------
0. Description
Host 100 sends a file to host 101.
1. On host 101 (receiver)
$>nc -l 8888 > ~/.ssh/id_rsa.pub.100
2. On host 100 (sender)
$>nc 192.168.231.101 8888 < ~/.ssh/id_rsa.pub
3. On 101, append the received public key
$>cat ~/.ssh/id_rsa.pub.100 >> ~/.ssh/authorized_keys
Starting hadoop for the first time
-------------------
1. Format the file system
$>hadoop namenode -format
2. Start all processes
$>start-all.sh
3. Check the processes
$>jps
4. Stop all processes
$>stop-all.sh
Accessing hadoop hdfs through the web UI
----------------------------
1. hdfs web UI (namenode)
http://localhost:50070/
2. datanode
http://localhost:50075/
3. 2nn (secondary namenode)
http://localhost:50090/
Fully distributed mode
Custom script xsync: distribute a file across the cluster
-------------------------------------
Loops over the nodes, copying the file to the same directory on each.
rsync -rvl /home/ubuntu ubuntu@s101:
Usage: xsync hello.txt
[/usr/local/bin/xsync]
#!/bin/bash
# Distribute a file or directory to the same path on every node.
pcount=$#
if (( pcount < 1 )) ; then
	echo no args;
	exit;
fi
p1=$1;
fname=`basename $p1`
#echo fname=$fname;
pdir=`cd -P $(dirname $p1) ; pwd`
#echo pdir=$pdir
cuser=`whoami`
for (( host=100 ; host<105 ; host=host+1 )) ; do
	echo ------------ s$host ---------------
	rsync -rvl $pdir/$fname $cuser@s$host:$pdir
done
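The fragile part of xsync is resolving the argument into a directory plus file name; the dry-run sketch below exercises exactly that logic on a temp file, printing the rsync commands instead of running them (no cluster needed):

```shell
#!/bin/bash
# Sketch: resolve a path the way xsync does, then echo the rsync
# commands instead of executing them (a dry run).
p1="$(mktemp -d)/hello.txt"
touch "$p1"

fname=$(basename "$p1")
pdir=$(cd -P "$(dirname "$p1")"; pwd)
cuser=$(whoami)

for (( host=100; host<105; host=host+1 )); do
    echo "rsync -rvl $pdir/$fname $cuser@s$host:$pdir"
done
```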
Write the /usr/local/bin/xcall script to run the same command on every host
---------------------------------------------------------
[/usr/local/bin/xcall]
#!/bin/bash
# Run the given command locally, then on s101 through s104 over ssh.
pcount=$#
if (( pcount < 1 )) ; then
	echo no args;
	exit;
fi
echo -------- localhost --------
$@
for (( host=101 ; host<105 ; host=host+1 )) ; do
	echo -------- s$host --------
	ssh s$host $@
done
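A dry-run twin of xcall (xcall_dry is a made-up name for this sketch) shows the per-host fan-out without opening any ssh connections:

```shell
#!/bin/bash
# Sketch: xcall with ssh replaced by echo, so the fan-out can be
# inspected locally. xcall_dry is a hypothetical name.
xcall_dry() {
    if (( $# < 1 )); then
        echo "no args"
        return 1
    fi
    echo "localhost: $*"
    for (( host=101; host<105; host=host+1 )); do
        echo "s$host: $*"
    done
}

xcall_dry jps
```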
1. Prepare 5 client machines
2. Install the jdk
(omitted)
3. Configure environment variables
JAVA_HOME
PATH
4. Install hadoop
(omitted)
5. Configure environment variables
HADOOP_HOME
PATH
6. Install ssh
7. Configuration files
[/soft/hadoop/etc/hadoop/core-site.xml]
fs.defaultFS=hdfs://s100/
[/soft/hadoop/etc/hadoop/hdfs-site.xml]
<?xml version="1.0"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>s104:50090</value>
</property>
</configuration>
[/soft/hadoop/etc/hadoop/yarn-site.xml]
yarn.resourcemanager.hostname=s100
[/soft/hadoop/etc/hadoop/slaves]
s101
s102
s103
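The shorthand lines for core-site.xml and yarn-site.xml above stand for ordinary property entries; written out in full (a sketch that mirrors the pseudo-distributed files shown earlier, including the mapreduce_shuffle aux-service from that yarn-site.xml) they would look like:

```xml
[/soft/hadoop/etc/hadoop/core-site.xml]
<?xml version="1.0"?>
<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://s100/</value>
	</property>
</configuration>

[/soft/hadoop/etc/hadoop/yarn-site.xml]
<?xml version="1.0"?>
<configuration>
	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>s100</value>
	</property>
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
</configuration>
```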
8. Distribute these configuration files across the cluster
$>cd /soft/hadoop/etc/hadoop
$>xsync core-site.xml
$>xsync yarn-site.xml
$>xsync slaves
$>xsync hdfs-site.xml
Changing the local temporary directory
-----------------------
1. Add hadoop.tmp.dir
Add the following setting to [core-site.xml]:
<property>
	<name>hadoop.tmp.dir</name>
	<value>/home/ubuntu/hadoop/</value>
</property>
2. Distribute core-site.xml
$>xsync core-site.xml
3. Stop all processes
$>stop-all.sh
4. Format the file system
$>hadoop namenode -format
5. Start all processes
$>start-all.sh
6. Reboot the system
$>sudo reboot		//then check that everything still works.
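To confirm which hadoop.tmp.dir actually landed on a node after the xsync, the value can be pulled straight out of core-site.xml. A sketch (prop_value is a made-up helper, run here against a throwaway copy of the file):

```shell
#!/bin/bash
# Sketch: extract a property value from a Hadoop-style XML config file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<?xml version="1.0"?>
<configuration>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/home/ubuntu/hadoop/</value>
	</property>
</configuration>
EOF

prop_value() {
    # $1 = config file, $2 = property name; assumes <value> is on the
    # line after the matching <name> line, as in the files above.
    grep -A1 "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

prop_value "$tmp" hadoop.tmp.dir    # prints /home/ubuntu/hadoop/
```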