Hadoop Pseudo-Distributed and Fully Distributed Setup (Notes)
Part 1. Firewall
I. Permanently disable the firewall: chkconfig iptables off
II. Check the firewall status: service iptables status
Part 2. JDK setup
1. Unpack the JDK: tar -zxvf jdk-8u11-linux-x64.tar.gz
2. Create a soft directory under /usr: mkdir soft
3. Enter the mount directory: cd /mnt/hgfs/Ubuntu/
4. Copy (cp) or move the unpacked JDK: mv jdk1.8.0_11 /usr/soft/
5. Enter the soft directory: cd /usr/soft/
6. Check that the JDK is there: ls
7. Enter the JDK directory: cd ./jdk1.8.0_11/
8. Print the JDK path and copy it for the configuration step: pwd
9. Go back up one level: cd ..
10. Configure the JDK with the gedit editor: gedit /etc/profile
I. Add: export JAVA_HOME=/usr/soft/jdk1.8.0_11
II. Configure PATH: export PATH=$JAVA_HOME/bin:$PATH
11. Apply the changes immediately: source /etc/profile
12. Run the following commands to test that the JDK is configured correctly:
I. java
II. javac
III. java -version
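The two profile lines from step 10 can also be checked non-interactively; a minimal sketch, assuming the /usr/soft/jdk1.8.0_11 path used above:

```shell
# Sketch: reproduce the two /etc/profile additions and confirm that the
# JDK's bin directory is now the first PATH entry, so `java` resolves there.
export JAVA_HOME=/usr/soft/jdk1.8.0_11
export PATH="$JAVA_HOME/bin:$PATH"
# Print the first PATH entry; since we just prepended it, this is $JAVA_HOME/bin.
echo "$PATH" | tr ':' '\n' | head -n 1
```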
Part 3. Passwordless SSH
13. Go to the current user's home directory: cd ~
14. Check whether ssh is installed on this machine: ssh
15. Generate an SSH key pair (run the command and press Enter): ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
16. Enter the ~/.ssh directory: cd ~/.ssh
17. List the directory contents: ls
18. Append id_rsa.pub to authorized_keys: cat id_rsa.pub >> authorized_keys
19. Test that passwordless login works: ssh localhost
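Steps 13–18 can be collapsed into one idempotent script; a sketch that only generates a key if ~/.ssh/id_rsa does not exist yet:

```shell
# Sketch of steps 13-18: create the key pair once and authorize it locally.
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P '' -f "$HOME/.ssh/id_rsa" >/dev/null
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```

After this, `ssh localhost` (step 19) should log in without prompting for a password.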
Part 4. Hadoop pseudo-distributed configuration
20. Enter the mounted partition: cd /mnt/hgfs/Ubuntu/
21. Check whether the hadoop archive is there: ls
22. Unpack hadoop: tar -zxvf hadoop-2.7.2.tar.gz
23. Check the unpack result: ls
24. Copy or move the unpacked hadoop and rename it: mv hadoop-2.7.2 /usr/soft/hadoop
25. Enter the /usr/soft/ directory: cd /usr/soft/
26. List the contents of soft: ls
27. Show detailed file information: ls -la
28. Enter the hadoop directory: cd ./hadoop/
29. Print the path and copy it for the configuration step: pwd
30. Configure hadoop: gedit /etc/profile
I. Add: export HADOOP_INSTALL=/usr/soft/hadoop
II. Configure PATH: export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
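After both edits, the tail of /etc/profile should read as follows (a sketch, with the paths chosen above):

```shell
# Combined /etc/profile additions for the JDK and Hadoop
export JAVA_HOME=/usr/soft/jdk1.8.0_11
export HADOOP_INSTALL=/usr/soft/hadoop
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
```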
31. Enter the hadoop directory: cd /usr/soft/hadoop/
32. Enter hadoop's etc/hadoop/ directory: cd ./etc/hadoop/
33. Edit the following configuration files: I. gedit core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000/</value>
</property>
</configuration>
II. gedit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
III. Copy the template first (Hadoop only reads mapred-site.xml, not the .template file), then edit it: cp mapred-site.xml.template mapred-site.xml && gedit mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
IV. gedit yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value><!-- pseudo-distributed: the ResourceManager runs on this machine -->
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
34. Configure the hadoop environment: gedit hadoop-env.sh
I. Set JAVA_HOME: export JAVA_HOME=/usr/soft/jdk1.8.0_11
35. Go back up one level: cd ..
36. Go up another level: cd ..
37. List: ls
38. Go up another level: cd ..
39. List: ls
40. Enter the hadoop directory: cd ./hadoop/
41. Format the namenode: ./bin/hadoop namenode -format
42. Start the hadoop cluster: ./sbin/start-all.sh
43. Check the running processes: jps
44. Stop the cluster: ./sbin/stop-all.sh
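After step 42, the `jps` listing should show all of the HDFS and YARN daemons. A small helper that reads `jps` output on stdin and flags anything missing (the daemon list is an assumption based on what start-all.sh launches in Hadoop 2.7.x):

```shell
# Reads `jps` output on stdin and reports any expected daemon that is absent.
# Returns 0 when all five daemons are present, 1 otherwise.
check_daemons() {
  out="$(cat)"
  missing=0
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    echo "$out" | grep -q "$d" || { echo "missing: $d"; missing=1; }
  done
  return $missing
}
```

Usage: `jps | check_daemons`.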
************************************ Pseudo-distributed setup complete ************************************
Part 5. Hadoop fully distributed configuration
45. Shut down the virtual machine: halt
46. Clone the virtual machine: right-click the VM → Manage → Clone
47. Change the hostname: gedit /etc/sysconfig/network
I. On the master, change it to s0: HOSTNAME=s0
II. Likewise change slave1 to s1: HOSTNAME=s1 (machine 2)
III. Likewise change slave2 to s2: HOSTNAME=s2 (machine 3)
IV. Likewise change slave3 to s3: HOSTNAME=s3 (machine 4)
48. Check the modified hostname: hostname
49. Edit the hosts file: gedit /etc/hosts
I. master: 192.168.72.129 (master's IP) s0 (master's hostname)
II. slave1: 192.168.72.128 (slave1's IP) s1 (slave1's hostname)
III. slave2: 192.168.72.127 (slave2's IP) s2 (slave2's hostname)
IV. slave3: 192.168.72.123 (slave3's IP) s3 (slave3's hostname)
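With the example addresses above, the /etc/hosts entries, identical on every node, would look like:

```
192.168.72.129 s0
192.168.72.128 s1
192.168.72.127 s2
192.168.72.123 s3
```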
50. Test connectivity: ping s0
51. Distribute /etc/hosts to s1, s2, and s3: scp /etc/hosts root@192.168.72.128:/etc/ (repeat for each slave)
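Step 51 has to be repeated for every slave; a sketch that prints the command for each host (remove `echo` to actually run the copies):

```shell
# Print (or, without `echo`, run) the scp command for each slave host.
for h in s1 s2 s3; do
  echo scp /etc/hosts "root@$h:/etc/"
done
```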
52. Test: ping s1
53. Log in to s1 remotely: ssh s1
54. Enter the hadoop directory: cd /usr/soft/hadoop/
55. Enter the etc/hadoop/ directory: cd ./etc/hadoop/
56. Edit the same .xml configuration files as in the pseudo-distributed setup: gedit *.xml
I. gedit core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://s0/</value>
</property>
</configuration>
II. gedit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
III. Copy the template first, then edit it: cp mapred-site.xml.template mapred-site.xml && gedit mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
IV. gedit yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>s0</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
57. Edit the slaves file, one hostname per line: gedit slaves
I. s1
II. s2
58. Go back up one level: cd ..
59. Enter the etc directory: cd ./etc/
60. Copy the hadoop configuration directory to each slave host:
scp -r hadoop root@s1:/usr/soft/hadoop/etc/ (repeat for s2 and s3)
61. Log in to s1 to verify the copy: ssh s1
62. Go back up one level: cd ..
63. Format hadoop: hadoop namenode -format
64. Start hadoop: ./sbin/start-all.sh
65. Check the processes: jps
66. Log in to s1 and s2 and check their processes: ssh s1
67. Stop hadoop: ./sbin/stop-all.sh
68. Check the processes: jps
**************************************** Fully distributed setup complete ****************************************