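Fully distributed mode
--------------------
1.Clone 3 clients (centos7)
Right-click centos-7 --> Manage -> Clone -> ... -> Full clone
2.Start the client
3.Enable shared folders for the guest.
4.Edit the hostname and IP address files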
[/etc/hostname]
s202
[/etc/sysconfig/network-scripts/ifcfg-ethxxxx]
...
IPADDR=..
5.Restart the network service
$>sudo service network restart
6.Edit the /etc/resolv.conf file
nameserver 192.168.231.2
7.Repeat steps 3 ~ 6 above on each of the other clients.
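Steps 4 and 6 are the same edits on every clone, so they can be wrapped in one function. The following is a minimal sketch, not a tested provisioning script. The function name configure_clone and its ROOT argument are hypothetical; ROOT is only there so the function can be tried against a scratch directory before it touches the real /etc. The ifcfg file name and the IP address are left as arguments because the notes above do not give them.

```shell
#!/usr/bin/env bash
# configure_clone ROOT HOSTNAME IFCFG_FILE IPADDR
#   ROOT       - path prefix (hypothetical helper argument); "" means the real /
#   HOSTNAME   - e.g. s202
#   IFCFG_FILE - name of the ifcfg-... file on this machine
#   IPADDR     - static address for this clone
configure_clone() {
    local root="$1" host="$2" ifcfg="$3" ip="$4"
    local cfg="$root/etc/sysconfig/network-scripts/$ifcfg"

    # [/etc/hostname] holds only the host name
    printf '%s\n' "$host" > "$root/etc/hostname"

    # Rewrite only the IPADDR= line; every other setting is kept as it is
    sed "s/^IPADDR=.*/IPADDR=$ip/" "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"

    # nameserver value from step 6
    printf 'nameserver %s\n' "192.168.231.2" > "$root/etc/resolv.conf"
}
```

On a real clone this would be run as root with ROOT set to "", followed by step 5 (sudo service network restart).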
Prepare ssh for the fully distributed hosts
-------------------------
1.Delete /home/centos/.ssh/* on all hosts
2.Generate a key pair on host s201
$>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
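3.Remotely copy s201's public key file id_rsa.pub to hosts 202 ~ 204
and place it at /home/centos/.ssh/authorized_keys. s201 is included too, so that it can log in to itself without a password. Run the commands from ~/.ssh.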
$>scp id_rsa.pub centos@s201:/home/centos/.ssh/authorized_keys
$>scp id_rsa.pub centos@s202:/home/centos/.ssh/authorized_keys
$>scp id_rsa.pub centos@s203:/home/centos/.ssh/authorized_keys
$>scp id_rsa.pub centos@s204:/home/centos/.ssh/authorized_keys
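The four scp commands differ only in the host name, so they can be written as a loop. This is a sketch under assumptions: the helper names run and push_pubkey and the DRY_RUN variable are made up for illustration. With DRY_RUN unset the commands are only printed; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
HOSTS="s201 s202 s203 s204"

# Print the command by default; execute it only when DRY_RUN=0
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi
}

# Copy the public key generated in step 2 to every host
push_pubkey() {
    local host
    for host in $HOSTS; do
        run scp "$HOME/.ssh/id_rsa.pub" \
            "centos@$host:/home/centos/.ssh/authorized_keys"
    done
}
```

Afterwards, ssh s202 from s201 should log in without asking for a password.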
4.Configure fully distributed mode (${hadoop_home}/etc/hadoop/)
[core-site.xml]
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://s201:9000</value>
</property>
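<!-- Directory where hadoop stores the files it generates at runtime. If this property is not set, those files go under /tmp by default. Files under /tmp are lost when the machine reboots, and after that the NameNode process does not show up in jps following start-all.sh. Always set this property when configuring a cluster. Once it is set, the temporary folder is created under the specified directory. -->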
<property>
<name>hadoop.tmp.dir</name>
<value>/soft/hadoop/hadoop_tmp</value>
</property>
</configuration>
[hdfs-site.xml]
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>s204:50090</value>
</property>
</configuration>
[mapred-site.xml]
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[yarn-site.xml]
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>s201</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
<!-- slaves lists only the DataNode hostnames or IP addresses -->
[slaves]
s202
s203
s204
[hadoop-env.sh]
...
export JAVA_HOME=/soft/jdk
...
[yarn-env.sh]
...
export JAVA_HOME=/soft/jdk
...
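Before distributing the files it can help to read a property back and confirm it holds the intended value. The sketch below assumes the layout used above, with <name> before <value> inside each property. The function name get_prop is hypothetical, and the awk matching is a simple text scan, not a real XML parser.

```shell
#!/usr/bin/env bash
# get_prop FILE NAME
# Print the <value> of the first property whose <name> equals NAME.
get_prop() {
    awk -v key="$2" '
        /<name>/ {
            n = $0
            gsub(/.*<name>|<\/name>.*/, "", n)    # keep only the text inside <name>
            hit = (n == key)
        }
        /<value>/ && hit {
            v = $0
            gsub(/.*<value>|<\/value>.*/, "", v)  # keep only the text inside <value>
            print v
            exit
        }
    ' "$1"
}
```

For example, get_prop /soft/hadoop/etc/hadoop/core-site.xml fs.defaultFS would print hdfs://s201:9000 for the file shown above. Once hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value hadoop itself resolves.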
5.Distribute the configuration
$>cd /soft/hadoop/etc/
$>scp -r full centos@s202:/soft/hadoop/etc/
$>scp -r full centos@s203:/soft/hadoop/etc/
$>scp -r full centos@s204:/soft/hadoop/etc/
6.Delete the symbolic links
$>cd /soft/hadoop/etc
$>rm hadoop
$>ssh s202 rm /soft/hadoop/etc/hadoop
$>ssh s203 rm /soft/hadoop/etc/hadoop
$>ssh s204 rm /soft/hadoop/etc/hadoop
7.Create the symbolic links
$>cd /soft/hadoop/etc/
$>ln -s full hadoop
$>ssh s202 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
$>ssh s203 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
$>ssh s204 ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop
8.Delete the temporary directory files
$>cd /tmp
$>rm -rf hadoop-centos
$>ssh s202 rm -rf /tmp/hadoop-centos
$>ssh s203 rm -rf /tmp/hadoop-centos
$>ssh s204 rm -rf /tmp/hadoop-centos
9.Delete the hadoop logs
$>cd /soft/hadoop/logs
$>rm -rf *
$>ssh s202 'rm -rf /soft/hadoop/logs/*'
$>ssh s203 'rm -rf /soft/hadoop/logs/*'
$>ssh s204 'rm -rf /soft/hadoop/logs/*'
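The remote half of steps 5 ~ 9 repeats the same commands for s202, s203 and s204, so it can be collapsed into one loop. A minimal sketch, assuming the paths used above; the names run, sync_conf and DRY_RUN are hypothetical. By default it only prints the commands, and DRY_RUN=0 executes them. The local commands on s201 are not included.

```shell
#!/usr/bin/env bash
WORKERS="s202 s203 s204"

# Print the command by default; execute it only when DRY_RUN=0
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi
}

sync_conf() {
    local h
    for h in $WORKERS; do
        # step 5: copy the "full" configuration directory
        run scp -r /soft/hadoop/etc/full "centos@$h:/soft/hadoop/etc/"
        # steps 6 and 7: replace the "hadoop" symbolic link so it points at "full"
        run ssh "$h" "rm -f /soft/hadoop/etc/hadoop && ln -s /soft/hadoop/etc/full /soft/hadoop/etc/hadoop"
        # steps 8 and 9: clear old temporary data and logs
        # (the glob sits inside the quoted string, so the remote shell expands it)
        run ssh "$h" "rm -rf /tmp/hadoop-centos /soft/hadoop/logs/*"
    done
}
```

Running sync_conf once in dry-run mode and reading the printed commands is a cheap check before setting DRY_RUN=0.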
10.Format the file system
$>hadoop namenode -format
11.Start the hadoop processes
$>start-all.sh
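After start-all.sh, jps on each host shows which daemons came up. With the configuration above, s201 is expected to run NameNode and ResourceManager, s202 and s203 run DataNode and NodeManager, and s204 additionally runs SecondaryNameNode. The helper below is a sketch with a hypothetical name, missing_daemons; it takes jps output as text and prints every expected daemon that is absent.

```shell
#!/usr/bin/env bash
# missing_daemons JPS_OUTPUT DAEMON...
# Print each DAEMON that does not appear in JPS_OUTPUT, one per line.
missing_daemons() {
    local jps_out="$1" d
    shift
    for d in "$@"; do
        # jps prints "<pid> <ClassName>"; compare the second field exactly,
        # so that NameNode is not satisfied by SecondaryNameNode
        printf '%s\n' "$jps_out" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}
```

For example, missing_daemons "$(ssh s204 jps)" DataNode NodeManager SecondaryNameNode prints nothing when all three are running. This assumes jps is on the PATH of a non-interactive ssh session, which may not hold on every setup.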
rsync
------------------
Install the rsync command on all four machines.
rsync performs remote synchronization.
$>sudo yum install rsync
Set up passwordless login for the root user
------------------------
1.Same procedure as for the centos user above.
Write the scripts
---------------
1.xcall.sh
2.xsync.sh
xsync.sh /home/etc/a.txt
rsync -lr /home/etc/a.txt centos@s202:/home/etc
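The notes give only the script names and one rsync example, so the following is one possible shape for them, not the original scripts. It assumes xcall runs a command on all four hosts and xsync copies a file or directory to the same parent directory on s202 ~ s204, as the rsync line above suggests. They are written as functions so both fit in one listing; run and DRY_RUN are hypothetical helpers, and by default the commands are only printed.

```shell
#!/usr/bin/env bash
WORKERS="s202 s203 s204"
ALL_HOSTS="s201 $WORKERS"

# Print the command by default; execute it only when DRY_RUN=0
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi
}

# xcall CMD [ARG...]: run the same command on every host
xcall() {
    local h
    for h in $ALL_HOSTS; do
        echo "==== $h ===="
        run ssh "$h" "$@"
    done
}

# xsync PATH: copy PATH to the same parent directory on every worker
xsync() {
    local src="$1" dir name h
    # Resolve the parent directory to an absolute path
    dir=$(cd "$(dirname "$src")" && pwd) || return 1
    name=$(basename "$src")
    for h in $WORKERS; do
        # -l keeps symbolic links as links, -r recurses into directories
        run rsync -lr "$dir/$name" "centos@$h:$dir"
    done
}
```

Keeping symbolic links as links (-l) matters here because /soft/hadoop/etc/hadoop is itself a link to full.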