1. MapReduce and YARN
MapReduce: the compute framework. Jobs ship as jar packages and the code is verbose, so companies rarely write raw MapReduce today, preferring Hive SQL and similar tools instead. MapReduce itself needs no separate deployment; jobs run on YARN.
YARN: resource management and job scheduling. This does need to be deployed. Its two daemons:
resourcemanager (RM): the resource manager
nodemanager (NM): the node manager
2. Configure YARN following the official documentation
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$ mv mapred-site.xml.template20190711 mapred-site.xml
[hadoop@hadoop001 hadoop]$ ll
total 144
-rw-r--r-- 1 hadoop hadoop 4436 Mar 24 2016 capacity-scheduler.xml
-rw-r--r-- 1 hadoop hadoop 1335 Mar 24 2016 configuration.xsl
-rw-r--r-- 1 hadoop hadoop 318 Mar 24 2016 container-executor.cfg
-rw-r--r-- 1 hadoop hadoop 880 Jul 11 14:01 core-site.xml
-rw-r--r-- 1 hadoop hadoop 4267 Jul 11 14:43 hadoop-env.sh
-rw-r--r-- 1 hadoop hadoop 2598 Mar 24 2016 hadoop-metrics2.properties
-rw-r--r-- 1 hadoop hadoop 2490 Mar 24 2016 hadoop-metrics.properties
-rw-r--r-- 1 hadoop hadoop 9683 Mar 24 2016 hadoop-policy.xml
-rw-r--r-- 1 hadoop hadoop 1112 Jul 11 00:16 hdfs-site.xml
-rw-r--r-- 1 hadoop hadoop 1449 Mar 24 2016 httpfs-env.sh
-rw-r--r-- 1 hadoop hadoop 1657 Mar 24 2016 httpfs-log4j.properties
-rw-r--r-- 1 hadoop hadoop 21 Mar 24 2016 httpfs-signature.secret
-rw-r--r-- 1 hadoop hadoop 620 Mar 24 2016 httpfs-site.xml
-rw-r--r-- 1 hadoop hadoop 3523 Mar 24 2016 kms-acls.xml
-rw-r--r-- 1 hadoop hadoop 1611 Mar 24 2016 kms-env.sh
-rw-r--r-- 1 hadoop hadoop 1631 Mar 24 2016 kms-log4j.properties
-rw-r--r-- 1 hadoop hadoop 5511 Mar 24 2016 kms-site.xml
-rw-r--r-- 1 hadoop hadoop 11291 Mar 24 2016 log4j.properties
-rw-r--r-- 1 hadoop hadoop 1383 Mar 24 2016 mapred-env.sh
-rw-r--r-- 1 hadoop hadoop 4113 Mar 24 2016 mapred-queues.xml.template
-rw-r--r-- 1 hadoop hadoop 758 Jul 11 19:38 mapred-site.xml
-rw-r--r-- 1 hadoop hadoop 758 Mar 24 2016 mapred-site.xml.template
-rw-r--r-- 1 hadoop hadoop 10 Jul 10 19:19 slaves
-rw-r--r-- 1 hadoop hadoop 2316 Mar 24 2016 ssl-client.xml.example
-rw-r--r-- 1 hadoop hadoop 2268 Mar 24 2016 ssl-server.xml.example
-rw-r--r-- 1 hadoop hadoop 4567 Mar 24 2016 yarn-env.sh
-rw-r--r-- 1 hadoop hadoop 690 Mar 24 2016 yarn-site.xml
[hadoop@hadoop001 hadoop]$ vi mapred-site.xml edit it following the official docs
[hadoop@hadoop001 hadoop]$ vi yarn-site.xml edit it following the official docs
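For reference, the mapred-site.xml from the official pseudo-distributed setup guide needs only one property, which tells MapReduce to submit jobs to YARN (the matching yarn-site.xml is printed in full in section 3 below):

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```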
[hadoop@hadoop001 hadoop]$ cd ../../
[hadoop@hadoop001 hadoop]$ ll
total 84
drwxr-xr-x 2 hadoop hadoop 4096 Jul 9 17:34 bin
drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 bin-mapreduce1
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 cloudera
drwxr-xr-x 6 hadoop hadoop 4096 Mar 24 2016 etc
drwxr-xr-x 5 hadoop hadoop 4096 Mar 24 2016 examples
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 examples-mapreduce1
drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 include
drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 lib
drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 libexec
-rw-r--r-- 1 hadoop hadoop 17087 Mar 24 2016 LICENSE.txt
drwxrwxr-x 2 hadoop hadoop 4096 Jul 11 14:44 logs
-rw-r--r-- 1 hadoop hadoop 101 Mar 24 2016 NOTICE.txt
drwxrwxr-x 3 hadoop hadoop 4096 Jul 11 13:57 output
-rw-r--r-- 1 hadoop hadoop 1366 Mar 24 2016 README.txt
drwxr-xr-x 3 hadoop hadoop 4096 Jul 9 02:45 sbin
drwxr-xr-x 4 hadoop hadoop 4096 Mar 24 2016 share
drwxr-xr-x 17 hadoop hadoop 4096 Mar 24 2016 src
[hadoop@hadoop001 hadoop]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop001: starting nodemanager, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop001.out
[hadoop@hadoop001 hadoop]$ jps
13696 Jps
13399 NodeManager
13309 ResourceManager
[hadoop@hadoop001 hadoop]$ netstat -nlp|grep 8088
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:8088 0.0.0.0:* LISTEN 13309/java
3. Change the YARN web UI port
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$ vi yarn-site.xml
[hadoop@hadoop001 hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop001:8081</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.https.address</name>
        <value>hadoop001:8090</value>
    </property>
</configuration>
[hadoop@hadoop001 hadoop]$
Restart YARN for the new port to take effect, then open the web UI at http://hadoop001:8081 in a browser.
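Every Hadoop *-site.xml file uses the same <configuration>/<property> layout, so it can be read programmatically. A minimal sketch with Python's standard library, using the yarn-site.xml contents shown above (the `parse_hadoop_conf` helper name is my own):

```python
import xml.etree.ElementTree as ET

# The yarn-site.xml contents from above, embedded for a self-contained demo.
YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop001:8081</value>
  </property>
</configuration>"""

def parse_hadoop_conf(xml_text):
    """Parse Hadoop *-site.xml content into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.findall("property")}

conf = parse_hadoop_conf(YARN_SITE)
print(conf["yarn.resourcemanager.webapp.address"])  # hadoop001:8081
```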
4. Run an example job
[hadoop@hadoop001 hadoop]$ find ./ -name '*example*.jar'
./share/hadoop/mapreduce1/hadoop-examples-2.6.0-mr1-cdh5.7.0.jar
./share/hadoop/mapreduce2/sources/hadoop-mapreduce-examples-2.6.0-cdh5.7.0-test-sources.jar
./share/hadoop/mapreduce2/sources/hadoop-mapreduce-examples-2.6.0-cdh5.7.0-sources.jar
./share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar
[hadoop@hadoop001 hadoop]$
[hadoop@hadoop001 hadoop]$ bin/hadoop jar ./share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files. the small example we will run
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[hadoop@hadoop001 hadoop]$ ll first create two small input files and edit some words into them
total 92
-rw-rw-r-- 1 hadoop hadoop 32 Jul 11 21:11 1.log
-rw-rw-r-- 1 hadoop hadoop 32 Jul 11 21:11 2.log
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -mkdir /examples create a directory
19/07/13 13:32:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -mkdir /examples/input create the input directory
19/07/13 13:32:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -put *.log /examples/input upload the two files we created to /examples/input
19/07/13 13:34:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -ls /examples/input check
19/07/13 13:34:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 1 hadoop supergroup 32 2019-07-13 13:34 /examples/input/1.log
-rw-r--r-- 1 hadoop supergroup 32 2019-07-13 13:34 /examples/input/2.log
[hadoop@hadoop001 hadoop]$ bin/hadoop jar ./share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar wordcount /examples/input /examples/output1 run the job
19/07/13 13:37:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/07/13 13:37:13 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/07/13 13:37:15 INFO input.FileInputFormat: Total input paths to process : 2
19/07/13 13:37:15 INFO mapreduce.JobSubmitter: number of splits:2
19/07/13 13:37:16 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1562986579443_0001
19/07/13 13:37:16 INFO impl.YarnClientImpl: Submitted application application_1562986579443_0001
19/07/13 13:37:17 INFO mapreduce.Job: The url to track the job: http://hadoop001:8081/proxy/application_1562986579443_0001/
19/07/13 13:37:17 INFO mapreduce.Job: Running job: job_1562986579443_0001
19/07/13 13:37:31 INFO mapreduce.Job: Job job_1562986579443_0001 running in uber mode : false
19/07/13 13:37:31 INFO mapreduce.Job: map 0% reduce 0%
19/07/13 13:37:40 INFO mapreduce.Job: map 50% reduce 0%
19/07/13 13:37:41 INFO mapreduce.Job: map 100% reduce 0%
19/07/13 13:37:50 INFO mapreduce.Job: map 100% reduce 100% success
19/07/13 13:37:51 INFO mapreduce.Job: Job job_1562986579443_0001 completed successfully
19/07/13 13:37:52 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=142
FILE: Number of bytes written=334480
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=278
HDFS: Number of bytes written=44
HDFS: Number of read operations=9
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=14642
Total time spent by all reduces in occupied slots (ms)=7272
Total time spent by all map tasks (ms)=14642
Total time spent by all reduce tasks (ms)=7272
Total vcore-seconds taken by all map tasks=14642
Total vcore-seconds taken by all reduce tasks=7272
Total megabyte-seconds taken by all map tasks=14993408
Total megabyte-seconds taken by all reduce tasks=7446528
Map-Reduce Framework
Map input records=12
Map output records=12
Map output bytes=112
Map output materialized bytes=148
Input split bytes=214
Combine input records=12
Combine output records=12
Reduce input groups=6
Reduce shuffle bytes=148
Reduce input records=12
Reduce output records=6
Spilled Records=24
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=493
CPU time spent (ms)=1690
Physical memory (bytes) snapshot=685453312
Virtual memory (bytes) snapshot=8411357184
Total committed heap usage (bytes)=512229376
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=64
File Output Format Counters
Bytes Written=44
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -ls /examples/output1 list the output
19/07/13 14:52:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 1 hadoop supergroup 0 2019-07-13 13:37 /examples/output1/_SUCCESS the success marker
-rw-r--r-- 1 hadoop supergroup 44 2019-07-13 13:37 /examples/output1/part-r-00000 the results are in this file
[hadoop@hadoop001 hadoop]$
[hadoop@hadoop001 hadoop]$ bin/hdfs dfs -cat /examples/output1/part-r-00000 view the results
19/07/13 14:55:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11 2
22 2
33 2
44 2
5 2
www.ruozedata.com 2
[hadoop@hadoop001 hadoop]$
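Conceptually, wordcount is a map → shuffle → reduce pipeline: each map task emits (word, 1) pairs, the shuffle groups pairs by word, and the reduce task sums each group. A minimal Python sketch of that logic (the exact file contents are an assumption, chosen so each token appears twice, matching the part-r-00000 output above):

```python
from collections import Counter
from itertools import chain

# Assumed contents of 1.log and 2.log, one token per line, reproducing
# the counts printed by part-r-00000 above.
files = {
    "1.log": ["11", "22", "33", "44", "5", "www.ruozedata.com"],
    "2.log": ["11", "22", "33", "44", "5", "www.ruozedata.com"],
}

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    return [(word, 1) for line in lines for word in line.split()]

def reduce_phase(pairs):
    # Shuffle + reduce: group the pairs by word and sum the counts.
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

pairs = chain.from_iterable(map_phase(lines) for lines in files.values())
result = reduce_phase(pairs)
print(result)
# {'11': 2, '22': 2, '33': 2, '44': 2, '5': 2, 'www.ruozedata.com': 2}
```

In the real job, the two map tasks (one per input split, hence "number of splits:2" in the log) run `map_phase` in parallel, and the single reduce task runs `reduce_phase` after the shuffle.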
[hadoop@hadoop001 hadoop]$ bin/hadoop note: bin/hadoop fs is equivalent to bin/hdfs dfs
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client
version print the version
jar <jar> run a jar file
checknative [-a|-h] check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the Hadoop jar and the required libraries
credential interact with credential providers
daemonlog get/set the log level for each daemon
trace view and modify Hadoop tracing settings
or
CLASSNAME run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
[hadoop@hadoop001 hadoop]$