Test environment: a Linux system with Hadoop installed
1. YARN setup
Hadoop 2.0 and later ships with YARN, so this walkthrough runs a word-count demo on YARN as a record of learning it.
2. Start Hadoop
Enter the following command and press Enter to change into the /apps/hadoop/sbin directory:
cd /apps/hadoop/sbin
After pressing Enter, the prompt looks like this:
root:/apps/hadoop/sbin$
In the current directory, enter the following command and press Enter to start Hadoop:
./start-all.sh
A normal startup prints output like the following:
WARNING: Attempting to start all Apache Hadoop daemons as dolphin in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [0.0.0.0]
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
Starting datanodes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Starting secondary namenodes [tools.hadoop-s.ambari]
tools.hadoop-s.ambari: Warning: Permanently added 'tools.hadoop-s.ambari,172.25.0.2' (ECDSA) to the list of known hosts.
Starting resourcemanager
Starting nodemanagers
Enter the jps command and press Enter to check whether everything started successfully.
Output similar to the following means the startup succeeded:
931 ResourceManager
1349 Jps
1046 NodeManager
646 SecondaryNameNode
391 NameNode
490 DataNode
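As a quick sanity check, the jps output above can be scanned for the daemons the demo needs. A minimal sketch; the `check_daemons` helper and its daemon list are my own illustration, not part of Hadoop:

```shell
# Hypothetical helper: given jps output as a string, report whether all
# expected Hadoop daemons appear in it.
check_daemons() {
  local jps_output="$1"
  local d
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    if ! printf '%s\n' "$jps_output" | grep -q "$d"; then
      echo "missing: $d"
      return 1
    fi
  done
  echo "all daemons running"
}

# On a live cluster you would call it as: check_daemons "$(jps)"
```

Feeding it the jps listing shown above reports that all daemons are running.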
3. Prepare a test file locally
First, enter the following command and press Enter to create the yarn directory:
mkdir /apps/yarn
Enter the following command and press Enter to create the words.txt file under /apps/yarn and open it:
vim /apps/yarn/words.txt
Then write the following content into the file:
hello tom
hello ethan
hello jony
hello tom
hello tom
Once the content is written, save and exit.
Enter the following command and press Enter to check whether the test file words.txt was created under /apps/yarn/:
ls -l /apps/yarn/
The /apps/yarn/ directory listing looks like this:
-rw-r--r-- 1 root root 54 7月 20 10:49 words.txt
The words.txt file now exists under /apps/yarn/, so the test file was created successfully.
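If you prefer not to use an interactive editor, the same file can be created non-interactively with a heredoc. A sketch; it writes to /tmp/yarn-demo rather than /apps/yarn only so it can run anywhere:

```shell
# Create the test data without opening vim.
# /tmp/yarn-demo is an illustrative path; the tutorial itself uses /apps/yarn.
mkdir -p /tmp/yarn-demo
cat > /tmp/yarn-demo/words.txt <<'EOF'
hello tom
hello ethan
hello jony
hello tom
hello tom
EOF

wc -l /tmp/yarn-demo/words.txt   # expect 5 lines
```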
4. Upload the local test file to HDFS
First, enter the following command and press Enter to create the wordcount directory on HDFS:
hdfs dfs -mkdir -p /wordcount/input
Next, enter the following command and press Enter to upload the local test file /apps/yarn/words.txt to the /wordcount/input directory on HDFS:
hdfs dfs -put /apps/yarn/words.txt /wordcount/input
Finally, enter the following command and press Enter to check whether the test file was uploaded successfully:
hdfs dfs -ls /wordcount/input
After pressing Enter, the output looks like this:
Found 1 items
-rw-r--r-- 1 dolphin supergroup 54 2018-07-20 03:02 /wordcount/input/words.txt
This output means the test data was uploaded successfully.
5. Run the test
Enter the following command. Each trailing backslash continues the command onto the next line, so the shell shows a > prompt after each Enter until the final line:
yarn jar \
/apps/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar \
wordcount \
/wordcount/input \
/wordcount/output
Note that the job fails if /wordcount/output already exists from an earlier run; remove it first with hdfs dfs -rm -r /wordcount/output.
After the final Enter, output similar to the following appears:
18/05/29 20:40:59 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/05/29 20:40:59 INFO input.FileInputFormat: Total input files to process : 1
18/05/29 20:41:00 INFO mapreduce.JobSubmitter: number of splits:1
18/05/29 20:41:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1526991305992_0001
18/05/29 20:41:01 INFO impl.YarnClientImpl: Submitted application application_1526991305992_0001
18/05/29 20:41:01 INFO mapreduce.Job: The url to track the job: http://hadoop000:8088/proxy/application_1526991305992_0001/
18/05/29 20:41:01 INFO mapreduce.Job: Running job: job_1526991305992_0001
18/05/29 20:41:14 INFO mapreduce.Job: Job job_1526991305992_0001 running in uber mode : false
18/05/29 20:41:14 INFO mapreduce.Job: map 0% reduce 0%
18/05/29 20:41:23 INFO mapreduce.Job: map 100% reduce 0%
18/05/29 20:41:29 INFO mapreduce.Job: map 100% reduce 100%
18/05/29 20:41:30 INFO mapreduce.Job: Job job_1526991305992_0001 completed successfully
18/05/29 20:41:30 INFO mapreduce.Job: Counters: 49
Next, enter the following command and press Enter to list the /wordcount/output directory on HDFS:
hdfs dfs -ls /wordcount/output
The directory listing is:
Found 2 items
-rw-r--r-- 1 dolphin supergroup 0 2018-07-20 03:19 /wordcount/output/_SUCCESS
-rw-r--r-- 1 dolphin supergroup 29 2018-07-20 03:19 /wordcount/output/part-r-00000
Enter the following command and press Enter to view the part-r-00000 file in that directory:
hdfs dfs -cat /wordcount/output/part-r-00000
The file contents are:
ethan 1
hello 5
jony 1
tom 3
The frequency of each word in the test data has been counted correctly, so the test succeeded.
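The same counts can be reproduced locally with standard shell tools, which is a handy cross-check of the MapReduce output. This pipeline is my own illustration, not part of the Hadoop example jar:

```shell
# Split the test data into one word per line, count duplicates, and print
# word<TAB>count pairs in the same shape as part-r-00000.
printf 'hello tom\nhello ethan\nhello jony\nhello tom\nhello tom\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}' \
  | sort
```

Running it yields the same four word/count pairs as the job output above (ethan 1, hello 5, jony 1, tom 3).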