Note!
Before reading this article, you should already know how to set up a virtual machine and use basic Linux commands.
Environment Preparation
1.Stop the firewall and disable it at boot
/* As a regular user: switch to root */
$ su - root
/* As root: stop the firewall and disable it at boot */
# systemctl stop firewalld.service
# systemctl disable firewalld.service
2.View and edit the hostname and the hostname-to-IP mapping
# vim /etc/hostname
master
# vim /etc/hosts
192.168.XXX.XXX master
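A quick way to confirm the mapping took effect is to look for the hostname in the hosts file. The following is only a minimal sketch: has_host_entry is a hypothetical helper name, and it only inspects the file you pass in, not DNS.

```shell
# Hypothetical helper: succeeds if the hosts file ($1) contains a
# non-comment line that lists the hostname ($2) after the IP address.
has_host_entry() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }              # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # stop at a trailing comment
                if ($i == name) ok = 1
            }
        }
        END { exit(ok ? 0 : 1) }
    ' "$1"
}

# Example:
# has_host_entry /etc/hosts master && echo "mapping found"
```

If the helper reports nothing, re-check /etc/hosts for typos before moving on, because every later step addresses the machine as master.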
3.Configure passwordless SSH login
# su - yan
$ cd
$ ssh-keygen -t rsa
$ ssh master
$ ssh-copy-id master
$ ssh master    # if no password prompt appears, passwordless login is configured
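The manual check above can also be done non-interactively. This is a sketch under assumptions: can_login_without_password and the SSH_CMD override are hypothetical names introduced here so the function can be dry-run; only the ssh options BatchMode and ConnectTimeout are standard.

```shell
# Hypothetical helper: returns 0 if key-based login to host $1 works.
# SSH_CMD is an assumed override for dry runs; it defaults to ssh.
can_login_without_password() {
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so a missing key shows up as a non-zero exit status.
    "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example:
# can_login_without_password master && echo "passwordless login works"
```

This matters because start-all.sh logs in over SSH to start the daemons, and a password prompt there would stall the startup.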
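4.Upload and extract the installation files
Step 1: Create the directories
/* As a regular user: go to the home directory and create the directories */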
$ cd
$ mkdir hadoop
$ mkdir hadoopdata
$ cd hadoop
Step 2: Drag the installation packages into the SecureCRT window, or upload them with rz
/* Files to upload */
hadoop-2.7.7.tar.gz
jdk-8u144-linux-x64.tar.gz
Step 3: Extract the packages
$ tar -zxvf hadoop-2.7.7.tar.gz
$ tar -zxvf jdk-8u144-linux-x64.tar.gz
$ ll
total 394764
drwxr-xr-x. 10 yan yan 161 Feb 12 16:36 hadoop-2.7.7
-rw-r--r--. 1 yan yan 218720521 Dec 17 2018 hadoop-2.7.7.tar.gz
drwxr-xr-x. 8 yan yan 255 Jul 22 2017 jdk1.8.0_144
-rw-r--r--. 1 yan yan 185515842 Oct 17 2017 jdk-8u144-linux-x64.tar.gz
Step 4: Configure the environment variables
$ cd
$ vim .bash_profile
export JAVA_HOME=/home/yan/hadoop/jdk1.8.0_144
export HADOOP_HOME=/home/yan/hadoop/hadoop-2.7.7
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
$ source .bash_profile
/* Check that the settings took effect */
$ java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
$ hadoop version
Hadoop 2.7.7
Subversion Unknown -r c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac
Compiled by stevel on 2018-07-18T22:47Z
Compiled with protoc 2.5.0
From source with checksum 792e15d20b12c74bd6f19a1fb886490
This command was run using /home/yan/hadoop/hadoop-2.7.7/share/hadoop/common/hadoop-common-2.7.7.jar
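If either version command fails with "command not found", the usual cause is a wrong path in .bash_profile. A small sketch for checking that, where check_home is a hypothetical helper and not part of Hadoop or the JDK:

```shell
# Hypothetical helper.
#   $1 = variable name (used only in messages)
#   $2 = directory the variable points to
#   $3 = binary expected under bin/ in that directory
check_home() {
    if [ -x "$2/bin/$3" ]; then
        echo "$1 OK: $2"
    else
        echo "$1 looks wrong: $2/bin/$3 is missing or not executable" >&2
        return 1
    fi
}

# Example:
# check_home JAVA_HOME "$JAVA_HOME" java
# check_home HADOOP_HOME "$HADOOP_HOME" hadoop
```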
Cluster Configuration
/* Switch to the Hadoop configuration directory */
$ cd /home/yan/hadoop/hadoop-2.7.7/etc/hadoop
1.hadoop-env.sh
$ vim hadoop-env.sh
export JAVA_HOME=/home/yan/hadoop/jdk1.8.0_144
2.mapred-env.sh
$ vim mapred-env.sh
export JAVA_HOME=/home/yan/hadoop/jdk1.8.0_144
3.yarn-env.sh
$ vim yarn-env.sh
export JAVA_HOME=/home/yan/hadoop/jdk1.8.0_144
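The same line goes into all three env scripts, so the edit can be scripted instead of repeated in vim. The following is a sketch only: set_java_home is a hypothetical helper, and it assumes that putting the export on the first line is acceptable, so that the variable is set before anything later in the script reads it.

```shell
# Hypothetical helper: $1 = env script, $2 = JDK directory.
set_java_home() {
    [ -f "$1" ] || return 1
    tmp=$(mktemp) || return 1
    {
        # New export first, then the old file minus any active
        # JAVA_HOME export (commented-out lines are kept).
        echo "export JAVA_HOME=$2"
        grep -v '^[[:space:]]*export JAVA_HOME=' "$1" || true
    } > "$tmp"
    # Write back with cat so the original file keeps its permissions.
    cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example, run from the etc/hadoop directory:
# for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
#     set_java_home "$f" /home/yan/hadoop/jdk1.8.0_144
# done
```

Running it twice leaves a single export line, so it is safe to re-run after moving the JDK.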
4.core-site.xml
$ vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/yan/hadoopdata</value>
</property>
</configuration>
5.hdfs-site.xml
$ vim hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
6.yarn-site.xml
$ vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
</property>
</configuration>
7.mapred-site.xml
/* Create a copy from the template */
$ cp mapred-site.xml.template mapred-site.xml
$ vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
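After editing four XML files by hand, it is worth reading the values back before starting anything. Here is a rough sketch: get_prop is a hypothetical helper that only works for the one-tag-per-line layout used in this article, and it is not a real XML parser.

```shell
# Hypothetical helper: $1 = *-site.xml file, $2 = property name.
# Prints the <value> that follows the matching <name> line.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")       # strip everything up to the tag
            sub(/<\/value>.*/, "")     # strip the closing tag onward
            print
            exit
        }
    ' "$1"
}

# Example, run from the etc/hadoop directory:
# get_prop core-site.xml fs.defaultFS
# get_prop mapred-site.xml mapreduce.framework.name
```

Once the environment variables are in place, hdfs getconf -confKey fs.defaultFS shows the value Hadoop itself resolves, which is the more reliable check.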
8.slaves file
$ vim slaves
master
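9.Format the file system and start the cluster
$ cd
/* Format the file system (only needed once, before the first start) */
$ hdfs namenode -format
/* Start the cluster (start-all.sh is deprecated in Hadoop 2.x; running start-dfs.sh and then start-yarn.sh is the equivalent) */
$ start-all.sh
10.Verify that startup succeeded
Method 1: check the processes with jps (all 6 must appear, Jps itself included)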
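Formatting a second time gives the NameNode a new cluster ID, and a DataNode that still holds the old one will then refuse to start. A small guard can prevent that. This is a sketch: namenode_formatted is a hypothetical helper, and it assumes the default layout in which the NameNode metadata lives under dfs/name inside hadoop.tmp.dir.

```shell
# Hypothetical helper: $1 = the hadoop.tmp.dir value from core-site.xml.
# Returns 0 if a formatted NameNode directory already exists there.
namenode_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# Example: format only when nothing has been formatted yet.
# namenode_formatted /home/yan/hadoopdata || hdfs namenode -format
```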
$ jps
9713 DataNode
10071 ResourceManager
10505 Jps
9915 SecondaryNameNode
9596 NameNode
10175 NodeManager
Method 2: check the web UIs (confirm that the pages load)
http://master:50070/
http://master:8088/
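Counting the jps lines by eye is easy to get wrong, so the check can be scripted. A minimal sketch, where missing_daemons is a hypothetical helper that reads jps output on standard input:

```shell
# Hypothetical helper: prints every expected daemon that does not
# appear in the jps output supplied on stdin.
missing_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the whole second column, so that
        # SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$out" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit(found ? 0 : 1) }' ||
            echo "$d"
    done
}

# Example:
# jps | missing_daemons
```

No output means all five daemons are up. If one is listed, its log file under the logs directory of the Hadoop installation is the place to look next.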
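A Quick Test
1.Compute the value of PI (the two arguments are the number of map tasks and the number of samples per map)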
$ cd /home/yan/hadoop/hadoop-2.7.7/share/hadoop/mapreduce
$ hadoop jar hadoop-mapreduce-examples-2.7.7.jar pi 5 5
2.Word count
$ cd
$ vim word.txt
Hello Yan
Wuhan Win
I love U
$ hadoop fs -mkdir /test
$ hadoop fs -put word.txt /test
$ cd /home/yan/hadoop/hadoop-2.7.7/share/hadoop/mapreduce
$ hadoop jar hadoop-mapreduce-examples-2.7.7.jar wordcount /test/word.txt /output
$ hadoop fs -cat /output/part-r-00000
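To know what the job should print, the same count can be done locally on word.txt. This is a sketch rather than what the Hadoop example runs internally: local_wordcount is a hypothetical helper, and it assumes words are separated by whitespace and sorted in byte order.

```shell
# Hypothetical helper: $1 = text file.
# Prints "word<TAB>count", one word per line, sorted in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" |
        LC_ALL=C sort |
        uniq -c |
        awk 'NF == 2 { print $2 "\t" $1 }'
}

# Example:
# local_wordcount word.txt
```

For the three-line word.txt above, this lists seven words with a count of 1 each. Note that the job refuses to run if /output already exists, so remove it with hadoop fs -rm -r /output before running wordcount again.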
My computer is lagging too badly, so I'll stop here.
---- See you next time ----