Experimental environment: Red Hat 6.5, Hadoop 2.6.5, JDK 1.7.0
This document is only a supplement; refer to the teaching lab manual for the full procedure and detailed commands.
1. Hadoop Installation Experiment
- Preparation
- Configure host names
How to change the host names:
Check the IP address of each virtual machine.
Edit the /etc/hosts file on every server and add the host-name entries.
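For example, with one master and three slaves, the /etc/hosts entries on every node might look like this (the IP addresses below are placeholders; substitute the actual addresses of your own virtual machines):

```
192.168.1.100   master
192.168.1.101   slave1
192.168.1.102   slave2
192.168.1.103   slave3
```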
- Install and configure the JDK
yum on Red Hat requires registration, so we download the JDK manually.
JDK download version: jdk-7u67-linux-x64.tar.gz
Configuration: add the following to ~/.bashrc:
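A sketch of the lines to append, assuming the archive is unpacked to /usr/java/jdk1.7.0_67 (an assumed location; adjust to wherever you actually installed the JDK):

```shell
# JDK environment variables for ~/.bashrc (install path assumed)
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
```

Run `source ~/.bashrc` afterwards and verify with `java -version`.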
- Create a user to run Hadoop
Add a user hadoop for running Hadoop programs and set its login password to hadoop.
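A sketch of the commands (run as root; `passwd --stdin` is available on Red Hat systems):

```
# create the hadoop user and set its password to "hadoop"
useradd hadoop
echo hadoop | passwd --stdin hadoop
```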
————————————
Granting a normal user passwordless sudo on Linux:
vim /etc/sudoers
Find the line: root ALL=(ALL) ALL
and add below it: xxx ALL=(ALL) ALL
Any one of the following lines can be added to sudoers:
root ALL=(ALL) ALL
user ALL=(ALL) NOPASSWD: ALL
user ALL=(root) ALL    # grants the user the privileges of root
————————————
- Configure passwordless SSH login
To enable passwordless SSH login, public/private key authentication must be configured on the hosts. Log in as root and edit /etc/ssh/sshd_config on every host, removing the leading "#" from the lines RSAAuthentication yes and PubkeyAuthentication yes.
Log back in as the newly created hadoop user and run ssh-keygen -t rsa on every host to generate a local key pair.
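The generation and collection steps can be sketched as follows (slave1 shown; repeat with slave2.pub and slave3.pub on the other slaves):

```
# on every host, as hadoop: accept the default file name and an empty passphrase
ssh-keygen -t rsa
# on each slave, copy the public key to the master under a distinct name
scp ~/.ssh/id_rsa.pub master:/home/hadoop/slave1.pub
```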
The same procedure must be carried out on the 3 slave hosts; then copy the 3 slaves' public keys to the master with scp.
The contents of id_rsa.pub shown below are the hadoop user's public key on the master, while slave1.pub, slave2.pub and slave3.pub are the slaves' public keys.
Next, append the public keys of all 4 hosts to the master's authorized_keys file [1], after which it looks like this:
Then scp the authorized_keys file from the master to the 3 slave hosts:
Command: scp authorized_keys slave1:/home/hadoop/.ssh/
In addition, the permissions of the relevant directories and files must be fixed. On every server, also:
disable the firewall, and
set selinux to disabled.
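The permission and lockdown commands can be sketched as follows (RHEL 6 service names assumed; the SELinux edit is demonstrated against a temporary copy of the file so the sketch can be run safely anywhere):

```shell
# On a real node, as the respective users, you would run:
#   chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys   # as hadoop, on every host
#   service iptables stop && chkconfig iptables off        # as root: stop the firewall
#   setenforce 0                                           # as root: SELinux permissive immediately
# The /etc/selinux/config edit, shown here on a temporary copy:
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"   # on a real node the target is /etc/selinux/config
grep '^SELINUX=' "$cfg"
rm -f "$cfg"
```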
- Install the Hadoop Cluster
- Install on master
Upload hadoop-2.6.5.tar.gz to the master node and unpack it:
[hadoop@master ~]$ tar zxvf hadoop-2.6.5.tar.gz
Create a symbolic link:
[hadoop@master ~]$ ln -s hadoop-2.6.5 hadoop
To simplify configuration, create a hadoop-config directory and set the environment variable:
[hadoop@master ~]$ mkdir hadoop-config
[hadoop@master ~]$ cp hadoop/etc/hadoop/* ./hadoop-config/
[hadoop@master ~]$ export HADOOP_CONF_DIR=/home/hadoop/hadoop-config/
Then edit the configuration files copied from etc/hadoop:
core-site.xml
hdfs-site.xml
masters
slaves
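A minimal sketch of the two XML files: the NameNode address hdfs://master:9000 matches the job log later in this document, and the storage paths match the directories created on the slaves; treat the exact values as assumptions to be checked against the lab manual.

```
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/cassandra/data/hadoop/tmp</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/var/lib/cassandra/data/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/var/lib/cassandra/data/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```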
- Configure the slave hosts
Copy the configuration and installation files from the master host:
[hadoop@master ~]$ scp -r .bashrc hadoop-config/ hadoop-2.6.5.tar.gz slave1:/home/hadoop/
[hadoop@slave1 ~]$ mkdir /var/lib/cassandra/data/hadoop
[hadoop@slave1 ~]$ mkdir /var/lib/cassandra/data/hadoop/tmp
[hadoop@slave1 ~]$ mkdir /var/lib/cassandra/data/hadoop/data
[hadoop@slave1 ~]$ mkdir /var/lib/cassandra/data/hadoop/name
Unpack the installation file and create the symbolic link:
tar zxvf hadoop-2.6.5.tar.gz && ln -s hadoop-2.6.5 hadoop
- Start the Services
When starting the services for the first time, the NameNode must be formatted:
[hadoop@master ~]$ hadoop namenode -format
If this reports an error, the cause is insufficient permission to create files in the directory; run
sudo chmod -R a+w /var/lib/
and retry.
If jps shows that the DataNode has not started, run:
${HADOOP_HOME}/sbin/hadoop-daemon.sh start datanode
Daemons that failed to start on other nodes can be started the same way.
Stop Hadoop with: stop-all.sh
Upload files:
[hadoop@master ~]$ hadoop fs -put hadoop-config/ config
List them:
hadoop fs -ls /usr/hadoop/
Run the example program:
[hadoop@master hadoop]$ hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /usr/hadoop/config usr/hadoop/results
Result:
…
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000027_0 decomp: 41 len: 45 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 41 bytes from map-output for attempt_local2077883811_0001_m_000027_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 41, inMemoryMapOutputs.size() -> 11, commitMemory -> 25025, usedMemory ->25066
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000003_0 decomp: 3942 len: 3946 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 3942 bytes from map-output for attempt_local2077883811_0001_m_000003_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 3942, inMemoryMapOutputs.size() -> 12, commitMemory -> 25066, usedMemory ->29008
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000016_0 decomp: 1723 len: 1727 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1723 bytes from map-output for attempt_local2077883811_0001_m_000016_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1723, inMemoryMapOutputs.size() -> 13, commitMemory -> 29008, usedMemory ->30731
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000002_0 decomp: 4795 len: 4799 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 4795 bytes from map-output for attempt_local2077883811_0001_m_000002_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4795, inMemoryMapOutputs.size() -> 14, commitMemory -> 30731, usedMemory ->35526
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000029_0 decomp: 15 len: 19 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 15 bytes from map-output for attempt_local2077883811_0001_m_000029_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 15, inMemoryMapOutputs.size() -> 15, commitMemory -> 35526, usedMemory ->35541
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000017_0 decomp: 1777 len: 1781 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1777 bytes from map-output for attempt_local2077883811_0001_m_000017_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1777, inMemoryMapOutputs.size() -> 16, commitMemory -> 35541, usedMemory ->37318
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000011_0 decomp: 2140 len: 2144 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2140 bytes from map-output for attempt_local2077883811_0001_m_000011_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2140, inMemoryMapOutputs.size() -> 17, commitMemory -> 37318, usedMemory ->39458
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000001_0 decomp: 4637 len: 4641 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 4637 bytes from map-output for attempt_local2077883811_0001_m_000001_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4637, inMemoryMapOutputs.size() -> 18, commitMemory -> 39458, usedMemory ->44095
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000025_0 decomp: 938 len: 942 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 938 bytes from map-output for attempt_local2077883811_0001_m_000025_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 938, inMemoryMapOutputs.size() -> 19, commitMemory -> 44095, usedMemory ->45033
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000024_0 decomp: 1019 len: 1023 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1019 bytes from map-output for attempt_local2077883811_0001_m_000024_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1019, inMemoryMapOutputs.size() -> 20, commitMemory -> 45033, usedMemory ->46052
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000012_0 decomp: 2144 len: 2148 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2144 bytes from map-output for attempt_local2077883811_0001_m_000012_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2144, inMemoryMapOutputs.size() -> 21, commitMemory -> 46052, usedMemory ->48196
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000000_0 decomp: 12150 len: 12154 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 12150 bytes from map-output for attempt_local2077883811_0001_m_000000_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 12150, inMemoryMapOutputs.size() -> 22, commitMemory -> 48196, usedMemory ->60346
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000026_0 decomp: 386 len: 390 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 386 bytes from map-output for attempt_local2077883811_0001_m_000026_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 386, inMemoryMapOutputs.size() -> 23, commitMemory -> 60346, usedMemory ->60732
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000013_0 decomp: 2240 len: 2244 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2240 bytes from map-output for attempt_local2077883811_0001_m_000013_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2240, inMemoryMapOutputs.size() -> 24, commitMemory -> 60732, usedMemory ->62972
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000008_0 decomp: 2387 len: 2391 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2387 bytes from map-output for attempt_local2077883811_0001_m_000008_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2387, inMemoryMapOutputs.size() -> 25, commitMemory -> 62972, usedMemory ->65359
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000021_0 decomp: 1323 len: 1327 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1323 bytes from map-output for attempt_local2077883811_0001_m_000021_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1323, inMemoryMapOutputs.size() -> 26, commitMemory -> 65359, usedMemory ->66682
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000009_0 decomp: 2992 len: 2996 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2992 bytes from map-output for attempt_local2077883811_0001_m_000009_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2992, inMemoryMapOutputs.size() -> 27, commitMemory -> 66682, usedMemory ->69674
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000023_0 decomp: 1212 len: 1216 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1212 bytes from map-output for attempt_local2077883811_0001_m_000023_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1212, inMemoryMapOutputs.size() -> 28, commitMemory -> 69674, usedMemory ->70886
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000022_0 decomp: 1202 len: 1206 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1202 bytes from map-output for attempt_local2077883811_0001_m_000022_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1202, inMemoryMapOutputs.size() -> 29, commitMemory -> 70886, usedMemory ->72088
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000010_0 decomp: 2111 len: 2115 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2111 bytes from map-output for attempt_local2077883811_0001_m_000010_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2111, inMemoryMapOutputs.size() -> 30, commitMemory -> 72088, usedMemory ->74199
19/06/16 21:15:08 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: finalMerge called with 30 in-memory map-outputs and 0 on-disk map-outputs
19/06/16 21:15:08 INFO mapred.Merger: Merging 30 sorted segments
19/06/16 21:15:08 INFO mapred.Merger: Down to the last merge-pass, with 30 segments left of total size: 73995 bytes
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merged 30 segments, 74199 bytes to disk to satisfy reduce memory limit
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merging 1 files, 74145 bytes from disk
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
19/06/16 21:15:08 INFO mapred.Merger: Merging 1 sorted segments
19/06/16 21:15:08 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 74136 bytes
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
19/06/16 21:15:08 INFO mapred.Task: Task:attempt_local2077883811_0001_r_000000_0 is done. And is in the process of committing
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO mapred.Task: Task attempt_local2077883811_0001_r_000000_0 is allowed to commit now
19/06/16 21:15:08 INFO output.FileOutputCommitter: Saved output of task 'attempt_local2077883811_0001_r_000000_0' to hdfs://master:9000/user/hadoop/usr/hadoop/results/_temporary/0/task_local2077883811_0001_r_000000
19/06/16 21:15:08 INFO mapred.LocalJobRunner: reduce > reduce
19/06/16 21:15:08 INFO mapred.Task: Task 'attempt_local2077883811_0001_r_000000_0' done.
19/06/16 21:15:08 INFO mapred.LocalJobRunner: Finishing task: attempt_local2077883811_0001_r_000000_0
19/06/16 21:15:08 INFO mapred.LocalJobRunner: reduce task executor complete.
19/06/16 21:15:09 INFO mapreduce.Job: map 100% reduce 100%
19/06/16 21:15:09 INFO mapreduce.Job: Job job_local2077883811_0001 completed successfully
19/06/16 21:15:09 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=10542805
FILE: Number of bytes written=19367044
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1847773
HDFS: Number of bytes written=36492
HDFS: Number of read operations=1117
HDFS: Number of large read operations=0
HDFS: Number of write operations=33
Map-Reduce Framework
Map input records=2087
Map output records=7887
Map output bytes=105178
Map output materialized bytes=74319
Input split bytes=3547
Combine input records=7887
Combine output records=3940
Reduce input groups=1570
Reduce shuffle bytes=74319
Reduce input records=3940
Reduce output records=1570
Spilled Records=7880
Shuffled Maps =30
Failed Shuffles=0
Merged Map outputs=30
GC time elapsed (ms)=1088
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=4331405312
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=77522
File Output Format Counters
Bytes Written=36492
View the contents of the part-r-00000 file directly in HDFS:
hadoop fs -cat /user/hadoop/usr/hadoop/results/part-r-00000
The output looks like this:
…
java 3
javadoc 1
job 6
jobs 10
jobs, 1
jobs. 2
jsvc 2
jvm 3
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
jvm.period=10 1
jvm.servers=localhost:8649 1
key 10
key. 1
key="capacity" 1
key="user-limit" 1
keys 1
keystore 9
keytab 2
killing 1
kms 2
kms-audit 1
language 24
last 1
law 24
leaf 2
level 6
levels 2
library 1
license 12
licenses 12
like 3
limit 1
limitations 24
line 1
links 1
list 44
location 2
log 14
log4j.additivity.kms-audit=false 1
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false 1
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false 1
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false 1
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false 1
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd 1
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} 1
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd 1
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} 1
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout 1
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter 1
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} 1
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} 1
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} 1
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd 1
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.JSA=org.apache.log4j.RollingFileAppender 1
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log 1
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} 1
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} 1
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout 1
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender 1
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender 1
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} 1
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} 1
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} 1
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFA=org.apache.log4j.RollingFileAppender 1
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log 1
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} 1
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} 1
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender 1
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} 1
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} 1
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} 1
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender 1
log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} 1
log4j.appender.RMSUMMARY.MaxBackupIndex=20 1
log4j.appender.RMSUMMARY.MaxFileSize=256MB 1
log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender 1
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} 1
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} 1
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} 1
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender 1
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd 1
log4j.appender.console.layout=org.apache.log4j.PatternLayout 1
log4j.appender.console.target=System.err 1
log4j.appender.console=org.apache.log4j.ConsoleAppender 1
log4j.appender.httpfs.Append=true 1
log4j.appender.httpfs.DatePattern='.'yyyy-MM-dd 1
log4j.appender.httpfs.File=${httpfs.log.dir}/httpfs.log 1
log4j.appender.httpfs.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.httpfs.layout=org.apache.log4j.PatternLayout 1
log4j.appender.httpfs=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.httpfsaudit.Append=true 1
log4j.appender.httpfsaudit.DatePattern='.'yyyy-MM-dd 1
log4j.appender.httpfsaudit.File=${httpfs.log.dir}/httpfs-audit.log 1
log4j.appender.httpfsaudit.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.httpfsaudit.layout=org.apache.log4j.PatternLayout 1
log4j.appender.httpfsaudit=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.kms-audit.Append=true 1
log4j.appender.kms-audit.DatePattern='.'yyyy-MM-dd 1
log4j.appender.kms-audit.File=${kms.log.dir}/kms-audit.log 1
log4j.appender.kms-audit.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.kms-audit.layout=org.apache.log4j.PatternLayout 1
log4j.appender.kms-audit=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.kms.Append=true 1
log4j.appender.kms.DatePattern='.'yyyy-MM-dd 1
log4j.appender.kms.File=${kms.log.dir}/kms.log 1
log4j.appender.kms.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.kms.layout=org.apache.log4j.PatternLayout 1
log4j.appender.kms=org.apache.log4j.DailyRollingFileAppender 1
log4j.category.SecurityLogger=${hadoop.security.logger} 1
log4j.logger.com.amazonaws.http.AmazonHttpClient=ERROR 1
log4j.logger.com.amazonaws=ERROR 1
log4j.logger.com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator=OFF 1
log4j.logger.httpfsaudit=INFO, 1
log4j.logger.kms-audit=INFO, 1
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN 1
log4j.logger.org.apache.hadoop.conf=ERROR 1
log4j.logger.org.apache.hadoop.fs.http.server=INFO, 1
log4j.logger.org.apache.hadoop.fs.s3a.S3AFileSystem=WARN 1
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} 1
log4j.logger.org.apache.hadoop.lib=INFO, 1
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} 1
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} 1
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} 1
log4j.logger.org.apache.hadoop=INFO 1
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR 1
log4j.rootLogger=${hadoop.root.logger}, 1
log4j.rootLogger=ALL, 1
log4j.threshold=ALL 1
logger 2
logger. 1
logging 4
logs 2
logs.</description> 1
loops 1
manage 1
manager 3
map 2
mapping 1
mapping]* 1
mappings 1
mappings. 1
mapred 1
mapred.audit.log.maxbackupindex=20 1
mapred.audit.log.maxfilesize=256MB 1
mapred.audit.logger=INFO,NullAppender 1
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
mapred.class=org.apache.hadoop.metrics.spi.NullContext 1
mapred.period=10 1
mapred.servers=localhost:8649 1
mapreduce.cluster.acls.enabled 3
mapreduce.cluster.administrators 2
maps 1
master 1
masters 1
match="configuration"> 1
material 3
max 3
maximum 4
may 52
means 18
message 1
messages 2
metadata 1
method="html"/> 1
metrics 3
midnight 1
might 1
milliseconds. 4
min.user.id=1000#Prevent 1
missed 1
modified. 1
modify 1
modifying 1
more 12
mradmin 1
ms) 1
multi-dimensional 1
multiple 4
must 1
name 3
name. 1
name="{name}"><xsl:value-of 1
namenode 1
namenode-metrics.out 1
namenode. 2
namenode.</description> 1
namenode: 1
names. 19
nesting 1
new 2
no 4
nodes 2
nodes. 2
non-privileged 2
normal 1
not 51
null 5
number 6
numerical 3
obtain 24
of 138
off 4
on 30
one 15
one: 2
only 9
operation. 3
operations 8
operations. 9
opportunities 1
option 6
optional. 2
options 11
options. 2
or 71
ordinary 1
org.apache.hadoop.metrics2 1
other 1
other. 6
others 2
overridden 4
override 5
overrides 4
owner 1
ownership. 12
package-info.java 1
parameters 4
parent 1
part 2
password 3
path 2
pending 1
per 1
percent 1
percentage 1
period 1
period, 1
permissions 24
picked 3
pid 3
place 1
please 1
policy 3
port 5
ports 2
ports. 2
potential 2
preferred 3
prefix. 1
present, 1
principal 4
principal. 1
printed 1
priorities. 1
priority 1
privileged 2
privileges 1
privileges. 1
properties 7
property 11
protocol 6
protocol, 2
protocol. 2
provide 3
q1. 1
q2 2
q2. 1
quashed 1
query 1
queue 12
queue). 1
queue, 1
queue. 10
queues 9
queues, 1
queues. 3
rack 1
rack-local 1
recovery. 1
reduce 2
refresh 2
regarding 12
reload 2
remote 2
representing 3
required 29
resolve 2
resources 2
response. 2
restore 1
retrieve 1
return 1
returned 2
rolling 1
rollover-key 1
root 2
rootlogger 1
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
rpc.class=org.apache.hadoop.metrics.spi.NullContext 1
rpc.period=10 1
rpc.servers=localhost:8649 1
run 10
running 3
running, 1
running. 1
runs 3
runtime 2
same 1
sample 1
sampling 2
scale 3
schedule 1
scheduler. 1
schedulers, 1
scheduling 2
secondary 1
seconds 1
seconds). 2
secret 3
secure 7
security 1
segment 1
select="description"/></td> 1
select="name"/></a></td> 1
select="property"> 1
select="value"/></td> 1
send 1
sending 1
separate 2
separated 20
separated. 1
server 3
service 2
service-level 2
set 60
set." 1
sets 2
setting 7
setup 1
severity 1
should 6
sign 1
signature 2
similar 3
single 1
sinks 1
site-specific 4
sizes 1
slave1 1
slave2 1
slave3 1
so 3
softlink 1
software 24
some 2
sometimes 1
source 2
space 1
space), 2
spaces 1
special 19
specific 33
specification. 1
specified 13
specified, 3
specified. 6
specifiying 1
specify 3
specifying 2
split 1
stand-by 1
start 3
started 2
starting 3
state 4
states 1
status 2
stopped, 1
store 1
stored 2
stored. 7
string 3
string, 1
submission 1
submit 2
submitting 1
such 1
summary 5
super-users 1
support 2
supported 1
supports 1
supportsparse 1
suppress 1
symlink 2
syntax 1
syntax: 1
system 3
tag 1
tags 3
target 1
tasks 2
tasktracker. 1
template 1
temporary 2
than 1
that 19
the 370
them 1
then 8
there 2
therefore 3
this 78
threads 1
time 4
timeline 2
timestamp. 1
to 163
top 1
traffic. 1
transfer 4
true. 3
turn 1
two. 3
type 1
type, 1
type="text/xsl" 6
typically 1
u:%user:%user 1
ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
ugi.class=org.apache.hadoop.metrics.spi.NullContext 1
ugi.period=10 1
ugi.servers=localhost:8649 1
uncommented 1
under 84
unset 1
up 2
updating 1
usage 2
use 30
use, 2
use. 4
used 36
used. 3
user 48
user. 2
user1,user2 2
user? 1
users 27
users,wheel". 18
uses 2
using 14
value 45
value="20"/> 1
value="30"/> 1
values 4
variable 4
variables 4
version 1
version="1.0" 5
version="1.0"> 1
version="1.0"?> 7
via 3
view, 1
viewing 1
w/ 1
want 1
warnings. 1
when 9
where 4
which 7
while 1
who 6
will 23
window 1
window, 1
with 59
within 4
without 1
work 12
writing, 24
written 2
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 1
yarn.nodemanager.linux-container-executor.group 1
yarn.nodemanager.linux-container-executor.group=#configured 1
yarn.server.resourcemanager.appsummary.log.file 1
yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log 1
yarn.server.resourcemanager.appsummary.logger 2
yarn.server.resourcemanager.appsummary.logger=${hadoop.root.logger} 1
you 28
Download the results to the local machine:
hadoop fs -get /user/hadoop/usr/hadoop/results count-results
ls
Inspect the downloaded files; if they are correct,
Experiment 1 is complete.
2. MapReduce Programming Experiment
Objectives
Understand how MapReduce works
Master basic MapReduce programming techniques
Learn MapReduce programming under Eclipse
Learn to design, implement and run MapReduce programs
Contents
Note: because of version differences, the teaching lab manual is rather dated. Hadoop has since moved to 2.x/3.x, and many of the classes, methods and APIs of the Hadoop 0.20 version used in the manual have changed. This experiment therefore consults the manual but follows the steps below.
The example program WordCount counts the frequency of each word in a set of text files; the complete code ships with the Hadoop distribution (in the src/examples directory). (This has changed in Hadoop 2.x and the directory can no longer be found there.)
WordCount code:
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
}
public static class IntSumReducer
extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable();
public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length < 2) {
System.err.println("Usage: wordcount <in> [<in>...] <out>");
System.exit(2);
}
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
for (int i = 0; i < otherArgs.length - 1; ++i) {
FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
}
FileOutputFormat.setOutputPath(job,
new Path(otherArgs[otherArgs.length - 1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
- Install and Configure Eclipse
- Install the Hadoop Eclipse Plugin
Here we use hadoop-eclipse-plugin-2.7.3.jar. Copy the compiled hadoop-eclipse-plugin jar into the plugins directory of the Eclipse installation and restart Eclipse. Open Windows -> Preferences; if the plugin installed successfully, the dialog shown in the figure below appears, and the Hadoop installation directory is selected in the circled area.
Click Windows -> Show View -> Others -> Map/Reduce Locations.
In the Map/Reduce Locations view at the bottom, right-click the empty area and choose New Hadoop location, or click the blue elephant icon with the plus sign, then fill in the connection details.
See the figure below for the settings:
- Develop and Debug in Eclipse
Hadoop parallel programs can be developed and debugged conveniently in Eclipse. IBM MapReduce Tools for Eclipse is recommended: this Eclipse plugin simplifies developing and deploying Hadoop parallel programs. With it you can create a Hadoop MapReduce application in Eclipse, use wizards for developing classes based on the MapReduce framework, package the application into a JAR file, deploy it to a Hadoop server (local or remote), and monitor the Hadoop server, the Hadoop distributed file system (DFS) and the status of the currently running tasks through a dedicated view (perspective).
The MapReduce Tools can be downloaded from the IBM alphaWorks site or from the download list of this document. Unpack the downloaded archive into your Eclipse installation directory and restart Eclipse. Open Windows -> Preferences, select Hadoop Home Directory on the left and set your Hadoop home directory.
With configuration done, run the example in Eclipse:
File -> New Project, choose Map/Reduce Project and enter the project name, e.g. WordCount.
Add the WordCount code.
Right-click WordCount.java, choose Run As -> Run Configurations, and set the run arguments to: hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output, the input and output paths respectively.
When the run finishes, view the results:
Method 1: with a command in the terminal.
Method 2: directly in Eclipse: under DFS Locations, double-click part-r-00000 to view the result.
- An Improved WordCount Program
Below we improve the WordCount program, with two goals:
- The original program splits words only on whitespace, so punctuation marks end up mixed in with the words. The improved program should extract words correctly and treat them case-insensitively.
- The final output should be sorted by word frequency in descending order.
Descending-order WordCount code:
package desc;
/**
 * WordCount
 * Counts the frequency of each word in the input files.
 * "Stop words" (read from a text file) are excluded from the count.
 * The final counts are output sorted by frequency from high to low.
 *
 * Note: when importing a class, make sure it comes from the intended package.
 */
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.Map.Entry;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.map.InverseMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
public class WordCount {
/**
 * Map: turns the input text into <word, 1> key-value pairs
 * */
public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
String regex = "[.,\"!--;:?'\\]]"; //strip common punctuation characters
Text word = new Text();
final static IntWritable one = new IntWritable(1);
HashSet<String> stopWordSet = new HashSet<String>();
/**
 * Reads the stop words from a file into the HashSet
 * */
private void parseStopWordFile(String path){
try {
String word = null;
BufferedReader reader = new BufferedReader(new FileReader(path));
while((word = reader.readLine()) != null){
stopWordSet.add(word);
}
reader.close();
} catch (IOException e) {
e.printStackTrace();
}
}
/**
 * Map initialization: reads the stop-word files that main()
 * placed in the DistributedCache
 * */
public void setup(Context context) throws IOException {
Path[] patternsFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
if(patternsFiles == null || patternsFiles.length == 0){
System.out.println("have no stopfile\n");
return;
}
//read stop-words into HashSet
for (Path patternsFile : patternsFiles) {
parseStopWordFile(patternsFile.toString());
}
}
/**
* map
* */
public void map(LongWritable key, Text value, Context context)
throws IOException, InterruptedException {
String s = null;
String line = value.toString().toLowerCase();
line = line.replaceAll(regex, " "); //remove all punctuation
//split all words of line
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens()) {
s = tokenizer.nextToken();
if(!stopWordSet.contains(s)){
word.set(s);
context.write(word, one);
}
}
}
}
/**
* Reduce: add all word-counts for a key
* */
public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
int min_num = 0;
/**
 * Reads the minimum frequency a word must have to appear in the output
 * */
public void setup(Context context) {
min_num = Integer.parseInt(context.getConfiguration().get("min_num"));
System.out.println(min_num);
}
/**
* reduce
* */
public void reduce(Text key, Iterable<IntWritable> values, Context context)
throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
if(sum < min_num) return;
context.write(key, new IntWritable(sum));
}
}
/**
* IntWritable comparator
* */
private static class IntWritableDecreasingComparator extends IntWritable.Comparator {
public int compare(WritableComparable a, WritableComparable b) {
return -super.compare(a, b);
}
public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
return -super.compare(b1, s1, l1, b2, s2, l2);
}
}
/**
 * main: runs the two jobs (count, then sort)
 * */
public static void main(String[] args){
boolean exit = false;
int min_num = 0;
String tempDir = "wordcount-temp-" + Integer.toString(new Random().nextInt(Integer.MAX_VALUE));
Configuration conf = new Configuration();
//get the stop-word file paths and add them to the DistributedCache
for(int i=0;i<args.length;i++)
{
if("-skip".equals(args[i]))
{
DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
System.out.println(args[i]);
}
}
//get the minimum word frequency to display
for(int i=0;i<args.length;i++)
{
if("-greater".equals(args[i])){
min_num = Integer.parseInt(args[++i]);
System.out.println(args[i]);
}
}
//share the minimum frequency via the Configuration
conf.set("min_num", String.valueOf(min_num)); //set global parameter
try{
/**
* run first-round to count
* */
Job job = Job.getInstance(conf, "jiq-wordcountjob-1");
job.setJarByClass(WordCount.class);
//set format of input-output
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(SequenceFileOutputFormat.class);
//set class of output's key-value of MAP
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
//set mapper and reducer
job.setMapperClass(WordCountMap.class);
job.setReducerClass(WordCountReduce.class);
//set path of input-output
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(tempDir));
if(job.waitForCompletion(true)){
/**
 * run the second round to sort
 * */
//Configuration conf2 = new Configuration();
Job job2 = Job.getInstance(conf, "jiq-wordcountjob-2");
job2.setJarByClass(WordCount.class);
//set format of input-output
job2.setInputFormatClass(SequenceFileInputFormat.class);
job2.setOutputFormatClass(TextOutputFormat.class);
//set class of output's key-value
job2.setOutputKeyClass(IntWritable.class);
job2.setOutputValueClass(Text.class);
//set mapper and reducer
//InverseMapper swaps the key and value of each pair produced by the first job
//Limit the number of reducers to 1 so that there is a single output file
/**
 * Note: setting the number of reducers to 1 here matters a lot.
 * Hadoop does not sort keys globally across reducers; each reducer
 * only sorts its own input locally. The most direct way to get output
 * globally sorted by key is therefore to use exactly one reducer.
 */
job2.setMapperClass(InverseMapper.class);
job2.setNumReduceTasks(1); //only one reducer
//set path of input-output
FileInputFormat.addInputPath(job2, new Path(tempDir));
FileOutputFormat.setOutputPath(job2, new Path(args[1]));
/**
 * Hadoop sorts IntWritable keys in ascending order by default,
 * but we need descending order. We therefore implement an
 * IntWritableDecreasingComparator class and tell the job to use
 * this custom Comparator to sort the output keys (the word counts).
 * */
job2.setSortComparatorClass(IntWritableDecreasingComparator.class);
exit = job2.waitForCompletion(true);
}
}catch(Exception e){
e.printStackTrace();
}finally{
try {
//delete the temporary directory
FileSystem.get(conf).deleteOnExit(new Path(tempDir));
System.exit(exit ? 0 : 1); //0 on success, 1 on failure
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
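To see what the two chained jobs compute, the count-then-invert-then-sort data flow can be mimicked with ordinary Java collections. This is an analogy only; the class and method names are made up for illustration and nothing here is Hadoop code:

```java
import java.util.*;

public class CountThenSortSketch {
    // Returns "word=count" strings sorted by descending count,
    // mirroring job 1 (count) followed by job 2 (invert and sort).
    static List<String> topWords(String[] words) {
        // Round 1 analogue: build word -> count, like the first MapReduce job
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        // Round 2 analogue: order by count descending, like InverseMapper
        // plus the decreasing comparator in the single-reducer second job
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"fox", "dog", "fox", "cat", "fox", "dog"};
        System.out.println(topWords(words)); // → [fox=3, dog=2, cat=1]
    }
}
```

In the real program the one-reducer restriction is what makes the second job's sort global, since Hadoop only sorts keys within each reducer.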
Run results:
This completes Experiment 1 and Experiment 2.
Acknowledgements:
[1]https://blog.csdn.net/u010223431/article/details/51191978
[2]https://blog.csdn.net/xingyyn78/article/details/81085100
[3]https://blog.csdn.net/abcjennifer/article/details/22393197
[4]https://www.cnblogs.com/StevenSun1991/p/6931500.html
[5]https://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop2/index.html