1. Preparation
Install the Java environment, JDK 1.8.0_171.
First download the JDK from Oracle and copy it to the virtual machine:
[root@hadoopmaster module]# pwd
/opt/module
[root@hadoopmaster module]# ll
total 186424
drwxr-xr-x. 8 root root 4096 Mar 28 17:18 jdk1.8.0_171
-rwxrw-rw-. 1 root root 190890122 Jun 1 06:10 jdk-8u171-linux-x64.tar.gz
[root@hadoopmaster module]#
Then configure the environment variables:
[root@hadoopmaster module]# vi /etc/profile
Append at the end of the file:
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_171
export PATH=${PATH}:${JAVA_HOME}/bin
Save, then reload the configuration:
[root@hadoopmaster module]# source /etc/profile
Test:
[root@hadoopmaster module]# java -version
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
The JDK environment is ready; next, install Hadoop.
2. Standalone Mode
1. Download hadoop-2.6.5.tar.gz
Download it from the official Apache Hadoop release archive.
2. Install
Copy it to the virtual machine:
[root@hadoopmaster module]# pwd
/opt/module
[root@hadoopmaster module]# ll
total 194964
-rwxrw-rw-. 1 root root 199635269 Aug 8 23:19 hadoop-2.6.5.tar.gz
drwxr-xr-x. 8 root root 4096 Mar 28 17:18 jdk1.8.0_171
[root@hadoopmaster module]#
Extract it:
[root@hadoopmaster module]# tar -zxvf hadoop-2.6.5.tar.gz
Add Hadoop to the environment:
[root@hadoopmaster module]# vi /etc/profile
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.6.5
export PATH=${PATH}:${HADOOP_HOME}/bin
[root@hadoopmaster module]# source /etc/profile
Check whether it was set up successfully:
[root@hadoopmaster module]# hadoop version
Hadoop 2.6.5
Subversion https://github.com/apache/hadoop.git -r e8c9fe0b4c252caf2ebf1464220599650f119997
Compiled by sjlee on 2016-10-02T23:43Z
Compiled with protoc 2.5.0
From source with checksum f05c9fa095a395faa9db9f7ba5d754
This command was run using /opt/module/hadoop-2.6.5/share/hadoop/common/hadoop-common-2.6.5.jar
The configuration works. One caveat: in Hadoop's environment file, replace the original ${JAVA_HOME} reference with the absolute JDK path; otherwise the variable may be unset in the non-interactive shells used in a cluster environment and the daemons will fail to start:
[root@hadoopmaster module]# vi /opt/module/hadoop-2.6.5/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_171
Save, then test:
[root@hadoopmaster module]# hadoop version
Hadoop 2.6.5
Subversion https://github.com/apache/hadoop.git -r e8c9fe0b4c252caf2ebf1464220599650f119997
Compiled by sjlee on 2016-10-02T23:43Z
Compiled with protoc 2.5.0
From source with checksum f05c9fa095a395faa9db9f7ba5d754
This command was run using /opt/module/hadoop-2.6.5/share/hadoop/common/hadoop-common-2.6.5.jar
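The JAVA_HOME edit above can also be scripted, which is handy when repeating it on several machines. A minimal sketch; `set_java_home` is a hypothetical helper name, and the paths are the ones used in this tutorial:

```shell
# Overwrite the JAVA_HOME line in a Hadoop env script with an absolute path.
set_java_home() {
    local env_file="$1" jdk_path="$2"
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=${jdk_path}|" "$env_file"
}

# e.g. set_java_home /opt/module/hadoop-2.6.5/etc/hadoop/hadoop-env.sh /opt/module/jdk1.8.0_171
```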
3. Test
Next, run an example to verify the installation.
Change into the Hadoop root directory:
[root@hadoopmaster hadoop-2.6.5]# pwd
/opt/module/hadoop-2.6.5
[root@hadoopmaster hadoop-2.6.5]# ll
total 116
drwxrwxr-x. 2 j j 4096 Oct 2 2016 bin
drwxrwxr-x. 3 j j 19 Oct 2 2016 etc
drwxrwxr-x. 2 j j 101 Oct 2 2016 include
drwxrwxr-x. 3 j j 19 Oct 2 2016 lib
drwxrwxr-x. 2 j j 4096 Oct 2 2016 libexec
-rw-rw-r--. 1 j j 84853 Oct 2 2016 LICENSE.txt
-rw-rw-r--. 1 j j 14978 Oct 2 2016 NOTICE.txt
-rw-rw-r--. 1 j j 1366 Oct 2 2016 README.txt
drwxrwxr-x. 2 j j 4096 Oct 2 2016 sbin
drwxrwxr-x. 4 j j 29 Oct 2 2016 share
Create a folder named input:
[root@hadoopmaster hadoop-2.6.5]# mkdir input
[root@hadoopmaster hadoop-2.6.5]# ll
total 116
drwxrwxr-x. 2 j j 4096 Oct 2 2016 bin
drwxrwxr-x. 3 j j 19 Oct 2 2016 etc
drwxrwxr-x. 2 j j 101 Oct 2 2016 include
drwxr-xr-x. 2 root root 6 Aug 10 23:18 input
drwxrwxr-x. 3 j j 19 Oct 2 2016 lib
drwxrwxr-x. 2 j j 4096 Oct 2 2016 libexec
-rw-rw-r--. 1 j j 84853 Oct 2 2016 LICENSE.txt
-rw-rw-r--. 1 j j 14978 Oct 2 2016 NOTICE.txt
-rw-rw-r--. 1 j j 1366 Oct 2 2016 README.txt
drwxrwxr-x. 2 j j 4096 Oct 2 2016 sbin
drwxrwxr-x. 4 j j 29 Oct 2 2016 share
As test data, I copied a Tomcat log file, log.txt, into it:
[root@hadoopmaster hadoop-2.6.5]# cd input
[root@hadoopmaster input]# ll
total 40
-rwxrw-rw-. 1 root root 39654 Aug 9 08:28 log.txt
Run one of the official Hadoop examples, the word-count job:
[root@hadoopmaster hadoop-2.6.5]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount input output
Here input is the input folder and output is the folder the results are written to.
Note that the output folder must not already exist; if it does, the job fails with an error.
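Because the job refuses to overwrite an existing output directory, a small guard like the following can be run before each attempt. This is a hypothetical helper, not part of Hadoop:

```shell
# Remove a previous local output directory so the example job can re-run;
# the job fails with a FileAlreadyExistsException if the directory exists.
clean_output() {
    local dir="$1"
    [ -d "$dir" ] && rm -rf "$dir"
    return 0
}

# e.g. clean_output output   # then re-run the wordcount command
```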
The job runs successfully, and the output lands in the output folder under the Hadoop directory.
View the results:
[root@hadoopmaster hadoop-2.6.5]# cd output
[root@hadoopmaster output]# ll
total 12
-rw-r--r--. 1 root root 9295 Aug 10 23:22 part-r-00000
-rw-r--r--. 1 root root 0 Aug 10 23:22 _SUCCESS
[root@hadoopmaster output]# vi part-r-00000
# Results: 61 GET requests, 282 POST requests.
"-" 17
"CONNECT 2
"GET 61
"OPTIONS 10
"POST 282
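To see the most frequent tokens first, the part-r-00000 file (token, then a tab, then the count) can be sorted by the count column. `top_words` is just a hypothetical helper name:

```shell
# Print the N most frequent tokens from a wordcount output file.
# Column 2 holds the count; sort it numerically, descending.
top_words() {
    local file="$1" n="${2:-5}"
    sort -t$'\t' -k2,2nr "$file" | head -n "$n"
}

# e.g. top_words output/part-r-00000 3
```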
That completes the standalone setup!
3. Pseudo-Distributed Mode
The installation is exactly the same as in standalone mode, but a few configuration files need to be modified.
Directory: /opt/module/hadoop-2.6.5/etc/hadoop/
Edit core-site.xml:
[root@hadoopmaster hadoop]# vi core-site.xml
<configuration>
<!-- Address of the HDFS NameNode -->
<property>
<name>fs.defaultFS</name>
<!-- hadoopmaster is my hostname; an IP or localhost also works -->
<value>hdfs://hadoopmaster:9000</value>
</property>
<property>
<!-- Move Hadoop's temporary directory to a custom location -->
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.6.5/data/tmp</value>
</property>
</configuration>
Edit hdfs-site.xml:
[root@hadoopmaster hadoop]# vi hdfs-site.xml
<configuration>
<!-- Number of HDFS replicas -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Format the NameNode:
[root@hadoopmaster hadoop]# hdfs namenode -format
On success, the log reports that the storage directory has been successfully formatted.
Start the daemons
Change into the sbin directory under the Hadoop root.
Start the NameNode:
[root@hadoopmaster sbin]# ./hadoop-daemon.sh start namenode
starting namenode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-namenode-hadoopmaster.out
Check whether the NameNode started successfully:
[root@hadoopmaster sbin]# jps
2862 NameNode
2943 Jps
It started; next, start the DataNode:
[root@hadoopmaster sbin]# ./hadoop-daemon.sh start datanode
starting datanode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-datanode-hadoopmaster.out
Check whether the DataNode started successfully:
[root@hadoopmaster sbin]# jps
2980 DataNode
2862 NameNode
3054 Jps
Success!
Working with the cluster
Create an input folder in the distributed file system:
[root@hadoopmaster sbin]# hadoop fs -mkdir -p /user/data/input
[root@hadoopmaster sbin]# hadoop fs -ls -R /
drwxr-xr-x - root supergroup 0 2018-08-11 00:18 /user
drwxr-xr-x - root supergroup 0 2018-08-11 00:19 /user/data
drwxr-xr-x - root supergroup 0 2018-08-11 00:19 /user/data/input
Upload the log.txt from standalone mode to the input directory in HDFS:
[root@hadoopmaster hadoop-2.6.5]# hadoop fs -put input/log.txt /user/data/input/
[root@hadoopmaster hadoop-2.6.5]# hadoop fs -ls -R /
drwxr-xr-x - root supergroup 0 2018-08-11 00:18 /user
drwxr-xr-x - root supergroup 0 2018-08-11 00:19 /user/data
drwxr-xr-x - root supergroup 0 2018-08-11 00:21 /user/data/input
-rw-r--r-- 1 root supergroup 39654 2018-08-11 00:21 /user/data/input/log.txt
Test
Run the same test as in standalone mode:
[root@hadoopmaster hadoop-2.6.5]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /user/data/input/ /user/data/output
Success.
View the results:
[root@hadoopmaster hadoop-2.6.5]# hadoop fs -ls -R /
drwxr-xr-x - root supergroup 0 2018-08-11 00:18 /user
drwxr-xr-x - root supergroup 0 2018-08-11 00:24 /user/data
drwxr-xr-x - root supergroup 0 2018-08-11 00:21 /user/data/input
-rw-r--r-- 1 root supergroup 39654 2018-08-11 00:21 /user/data/input/log.txt
drwxr-xr-x - root supergroup 0 2018-08-11 00:24 /user/data/output
-rw-r--r-- 1 root supergroup 0 2018-08-11 00:24 /user/data/output/_SUCCESS
-rw-r--r-- 1 root supergroup 9295 2018-08-11 00:24 /user/data/output/part-r-00000
[root@hadoopmaster hadoop-2.6.5]# hadoop fs -cat /user/data/output/part-r-00000
"-" 17
"CONNECT 2
"GET 61
"OPTIONS 10
"POST 282
The pseudo-distributed setup is complete!
Troubleshooting
The first setup often works, but the cluster fails to run after a restart. Most of the time the cause is the following:
- DataNode not recognized by the NameNode
When the NameNode is formatted, it generates two identifiers: a blockPoolId and a clusterId. A DataNode joining the cluster records these identifiers in its own working directory.
Once the NameNode is reformatted, its identity changes, and a DataNode still holding the old IDs is no longer recognized.
Fix: delete the data in the DataNode's storage directory, then reformat the NameNode.
Configuring hadoop.tmp.dir earlier makes the problem directory easy to locate: delete its contents, reformat the NameNode, and the cluster restarts normally.
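The fix above can be sketched as a tiny script. Warning: this destroys all HDFS data. `reset_hdfs_storage` is a hypothetical helper, and the path is the hadoop.tmp.dir configured earlier:

```shell
# Wipe hadoop.tmp.dir so the DataNode drops its stale clusterID/blockPoolId.
# ALL HDFS DATA IS LOST.
reset_hdfs_storage() {
    local tmp_dir="$1"
    rm -rf "$tmp_dir"   # removes the VERSION files holding the old IDs
}

# On every node:          reset_hdfs_storage /opt/module/hadoop-2.6.5/data/tmp
# Then on the NameNode:   hdfs namenode -format
```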
4. Fully Distributed Mode on YARN
Environment
First complete the environment setup from standalone mode, then clone the virtual machine; three VMs are used here to build the distributed cluster.
Set static IPs
Boot all three VMs and configure static IPs.
Enter the network-scripts directory and find the interface configuration file:
[root@hadoop001 hadoop-2.6.5]# cd /etc/sysconfig/network-scripts
[root@hadoop001 network-scripts]# ll
total 248
-rw-r--r--. 1 root root 603 Aug 9 03:44 ifcfg-eno16777736
-rw-r--r--. 1 root root 254 Jan 2 2018 ifcfg-lo
lrwxrwxrwx. 1 root root 24 Jun 2 03:01 ifdown -> ../../../usr/sbin/ifdown
Here ifcfg-eno16777736 is the configuration file for the network interface; edit it:
[root@hadoop001 network-scripts]# vi ifcfg-eno16777736
TYPE=Ethernet
BOOTPROTO=static # set this to static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
#IPV6INIT=yes
#IPV6_AUTOCONF=yes
#IPV6_DEFROUTE=yes
#IPV6_FAILURE_FATAL=no
NAME=eno1
UUID=ad2ec8bd-7dc2-4c2c-95a9-de5c967bde8f
DEVICE=eno1
ONBOOT=yes
PEERDNS=yes
PEERROUTES=yes
#IPV6_PEERDNS=yes
#IPV6_PEERROUTES=yes
DNS1=114.114.114.114 # a public DNS server in China; Google's 8.8.8.8 also works
IPADDR=192.168.170.131 # an IP in the same subnet as the previously assigned dynamic one
NETMASK=255.255.255.0 # subnet mask
GATEWAY=192.168.170.2 # gateway
Save, then restart the network service:
service network restart
Do this on all three VMs; each must have a different IP.
Set the hostname
Run hostname to view the current hostname:
[root@hadoop001 network-scripts]# hostname
hadoop001
It shows hadoop001 because I have already set it; on a fresh system it may be localhost.
Change it as follows:
[root@hadoop001 network-scripts]# vi /etc/sysconfig/network
# add these two lines:
# Created by anaconda
NETWORKING=yes
HOSTNAME=hadoop001
# on CentOS 7, edit this file instead
[root@hadoop001 network-scripts]# vi /etc/hostname
hadoop001
# save, then edit the hosts file
[root@hadoop001 network-scripts]# vi /etc/hosts
# add the IPs and names of the cluster machines
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.170.130 hadoopmaster
192.168.170.131 hadoop001
192.168.170.132 hadoop002
192.168.170.133 hadoop003
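Since the worker IPs are consecutive on this subnet, the three entries can also be generated with a loop (IPs assumed from this tutorial; append the output to /etc/hosts):

```shell
# Print hosts-file entries for hadoop001..hadoop003 on 192.168.170.131-133.
for i in 1 2 3; do
    printf '192.168.170.13%d hadoop00%d\n' "$i" "$i"
done
```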
Reboot the three VMs after the change and verify:
[root@hadoop001 network-scripts]# hostname
hadoop001
[root@hadoop002 ~]# hostname
hadoop002
[root@hadoop003 ~]# hostname
hadoop003
Disable the firewall (lab setup only; to make this persist across reboots, also run systemctl disable firewalld.service):
[root@hadoop001 ~]# systemctl stop firewalld.service
[root@hadoop002 ~]# systemctl stop firewalld.service
[root@hadoop003 ~]# systemctl stop firewalld.service
Passwordless SSH login
First test whether SSH is installed:
[root@hadoopmaster hadoop-2.6.5]# ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:yaQsgy2y8sMHLxhhIALyr6EOIwfk8wH5K/9w3hoeXv4.
ECDSA key fingerprint is MD5:48:0c:3f:cc:8f:f4:c6:b7:54:13:0a:ad:fb:05:48:41.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
Last login: Sat Aug 11 00:08:48 2018
[root@hadoopmaster ~]#
If you see output like the above, SSH is already installed; otherwise install it with:
yum install openssh-server -y
Change into the .ssh directory under the login user's home directory:
[root@hadoopmaster ~]# cd ~/.ssh
[root@hadoopmaster .ssh]# ll
total 4
-rw-r--r--. 1 root root 171 Aug 11 00:34 known_hosts
The directory does not yet contain a key pair; generate one:
[root@hadoopmaster .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:nBpUboShzdkLbjkRNIFsY930q16HHPwfohG9/mJy8uQ root@hadoopmaster
The key's randomart image is:
+---[RSA 2048]----+
| . +**+ |
| *+oO.. |
| o..B + . |
| o * + o |
| * S = . |
| . + o = . |
| . . = * . |
| . .oB+o . |
| . .*Eoo |
+----[SHA256]-----+
[root@hadoopmaster .ssh]# ll
total 12
-rw-------. 1 root root 1675 Aug 11 00:38 id_rsa
-rw-r--r--. 1 root root 399 Aug 11 00:38 id_rsa.pub
-rw-r--r--. 1 root root 171 Aug 11 00:34 known_hosts
After pressing Enter through all the prompts, ll shows the public and private keys.
Copy the public key to each machine that should allow passwordless login:
ssh-copy-id hadoop001
ssh-copy-id hadoop002
ssh-copy-id hadoop003
Run these commands on all three VMs.
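After copying the keys, passwordless access can be verified non-interactively. `check_ssh` is a hypothetical helper; BatchMode makes ssh fail instead of prompting when the key was not installed:

```shell
# Try a passwordless login to each host and report any that still prompt.
check_ssh() {
    local h
    for h in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "$h: ok"
        else
            echo "$h: passwordless login NOT working"
        fi
    done
}

# Run on each VM:   check_ssh hadoop001 hadoop002 hadoop003
```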
Cluster configuration
Deployment plan:

       | hadoop001          | hadoop002                    | hadoop003
  HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode
  YARN | NodeManager        | ResourceManager, NodeManager | NodeManager
Edit core-site.xml
[root@hadoop001 hadoop]# vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop001:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.6.5/data/tmp</value>
</property>
</configuration>
Edit hdfs-site.xml
[root@hadoop001 hadoop]# vi hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop003:50090</value>
</property>
</configuration>
Edit slaves
[root@hadoop001 hadoop]# vi slaves
hadoop001
hadoop002
hadoop003
Edit yarn-env.sh (set the same JDK path as before)
[root@hadoop001 hadoop]# vi yarn-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_171
Edit yarn-site.xml
[root@hadoop001 hadoop]# vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop002</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Edit mapred-env.sh (set the same JDK path as before)
[root@hadoop001 hadoop]# vi mapred-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_171
Edit mapred-site.xml
[root@hadoop001 hadoop]# vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Then copy all of the modified files to the corresponding directories on hadoop002 and hadoop003.
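Copying the edited files by hand is error-prone; a loop over scp does it in one go. `sync_conf` is a hypothetical helper and assumes root SSH access to the other nodes and the paths used in this tutorial:

```shell
# Push the whole Hadoop config directory to the other cluster nodes.
sync_conf() {
    local conf_dir=/opt/module/hadoop-2.6.5/etc/hadoop
    local h
    for h in hadoop002 hadoop003; do
        scp -r "$conf_dir"/* "root@$h:$conf_dir/" || echo "copy to $h failed"
    done
}

# sync_conf
```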
Next, format the NameNode. Since the NameNode is placed on hadoop001, format it there:
[root@hadoop001 hadoop]# hdfs namenode -format
Test
Start HDFS:
# change into the sbin directory
[root@hadoop001 sbin]# ./start-dfs.sh
Starting namenodes on [hadoop001]
hadoop001: starting namenode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-namenode-hadoop001.out
hadoop001: starting datanode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-datanode-hadoop001.out
hadoop003: starting datanode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-datanode-hadoop003.out
hadoop002: starting datanode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-datanode-hadoop002.out
Starting secondary namenodes [hadoop003]
hadoop003: starting secondarynamenode, logging to /opt/module/hadoop-2.6.5/logs/hadoop-root-secondarynamenode-hadoop003.out
[root@hadoop001 sbin]# jps
4327 DataNode
4200 NameNode
4527 Jps
[root@hadoop001 sbin]#
HDFS started successfully; the next step is to start YARN.
Note that the ResourceManager was assigned to hadoop002, so YARN must be started from hadoop002:
# change into the sbin directory
[root@hadoop002 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/hadoop-2.6.5/logs/yarn-root-resourcemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /opt/module/hadoop-2.6.5/logs/yarn-root-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /opt/module/hadoop-2.6.5/logs/yarn-root-nodemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /opt/module/hadoop-2.6.5/logs/yarn-root-nodemanager-hadoop002.out
[root@hadoop002 sbin]# jps
3091 NodeManager
3478 Jps
2985 ResourceManager
Started successfully!
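To confirm that the daemons landed on the right machines (per the deployment plan above), jps can be run on every node over SSH. `cluster_jps` is a hypothetical helper:

```shell
# Show the running Java processes on every cluster node.
cluster_jps() {
    local h
    for h in hadoop001 hadoop002 hadoop003; do
        echo "== $h =="
        ssh -o BatchMode=yes "$h" jps 2>/dev/null || echo "(unreachable)"
    done
}

# cluster_jps
```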
Run an example
Create a directory in HDFS:
[root@hadoop001 sbin]# hadoop fs -mkdir -p /user/data/input/
[root@hadoop001 sbin]# hadoop fs -ls -R /
drwxr-xr-x - root supergroup 0 2018-08-11 02:08 /user
drwxr-xr-x - root supergroup 0 2018-08-11 02:08 /user/data
drwxr-xr-x - root supergroup 0 2018-08-11 02:08 /user/data/input
Upload the input file:
[root@hadoop001 hadoop-2.6.5]# hadoop fs -put input/log.txt /user/data/input/
[root@hadoop001 hadoop-2.6.5]# hadoop fs -ls -R /
drwxr-xr-x - root supergroup 0 2018-08-11 02:08 /user
drwxr-xr-x - root supergroup 0 2018-08-11 02:08 /user/data
drwxr-xr-x - root supergroup 0 2018-08-11 02:09 /user/data/input
-rw-r--r-- 3 root supergroup 39654 2018-08-11 02:09 /user/data/input/log.txt
Run wordcount:
[root@hadoop001 hadoop-2.6.5]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /user/data/input/ /user/data/output
18/08/11 02:11:45 INFO client.RMProxy: Connecting to ResourceManager at hadoop002/192.168.170.132:8032
18/08/11 02:11:46 INFO input.FileInputFormat: Total input paths to process : 1
18/08/11 02:11:46 INFO mapreduce.JobSubmitter: number of splits:1
18/08/11 02:11:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1533978355521_0001
18/08/11 02:11:47 INFO impl.YarnClientImpl: Submitted application application_1533978355521_0001
18/08/11 02:11:47 INFO mapreduce.Job: The url to track the job: http://hadoop002:8088/proxy/application_1533978355521_0001/
18/08/11 02:11:47 INFO mapreduce.Job: Running job: job_1533978355521_0001
18/08/11 02:12:22 INFO mapreduce.Job: Job job_1533978355521_0001 running in uber mode : false
18/08/11 02:12:22 INFO mapreduce.Job: map 0% reduce 0%
18/08/11 02:12:45 INFO mapreduce.Job: map 100% reduce 0%
18/08/11 02:12:55 INFO mapreduce.Job: map 100% reduce 100%
18/08/11 02:12:55 INFO mapreduce.Job: Job job_1533978355521_0001 completed successfully
18/08/11 02:12:55 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=10812
FILE: Number of bytes written=236323
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=39764
HDFS: Number of bytes written=9295
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=19409
Total time spent by all reduces in occupied slots (ms)=6661
Total time spent by all map tasks (ms)=19409
Total time spent by all reduce tasks (ms)=6661
Total vcore-milliseconds taken by all map tasks=19409
Total vcore-milliseconds taken by all reduce tasks=6661
Total megabyte-milliseconds taken by all map tasks=19874816
Total megabyte-milliseconds taken by all reduce tasks=6820864
Map-Reduce Framework
Map input records=372
Map output records=3686
Map output bytes=54398
Map output materialized bytes=10812
Input split bytes=110
Combine input records=3686
Combine output records=388
Reduce input groups=388
Reduce shuffle bytes=10812
Reduce input records=388
Reduce output records=388
Spilled Records=776
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=225
CPU time spent (ms)=2320
Physical memory (bytes) snapshot=397717504
Virtual memory (bytes) snapshot=4202889216
Total committed heap usage (bytes)=276824064
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=39654
File Output Format Counters
Bytes Written=9295
OK, the fully distributed setup is complete!