Computing the Highest Score with MapReduce

I. Writing the Code

1. Create the Maven project

mvn archetype:generate
[WBQ@westgisB064 one]$ mvn archetype:generate
[INFO] Scanning for projects...
[INFO] 
[INFO] ------------------< org.apache.maven:standalone-pom >-------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] --------------------------------[ pom ]---------------------------------
[INFO] 
[INFO] >>> maven-archetype-plugin:3.2.1:generate (default-cli) > generate-sources @ standalone-pom >>>
[INFO] 
[INFO] <<< maven-archetype-plugin:3.2.1:generate (default-cli) < generate-sources @ standalone-pom <<<
[INFO] 
[INFO] 
[INFO] --- maven-archetype-plugin:3.2.1:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Interactive mode
[WARNING] No archetype found in remote catalog. Defaulting to internal catalog
[INFO] No archetype defined. Using maven-archetype-quickstart (org.apache.maven.archetypes:maven-archetype-quickstart:1.0)
Choose archetype:
1: internal -> org.apache.maven.archetypes:maven-archetype-archetype (An archetype which contains a sample archetype.)
2: internal -> org.apache.maven.archetypes:maven-archetype-j2ee-simple (An archetype which contains a simplifed sample J2EE application.)
3: internal -> org.apache.maven.archetypes:maven-archetype-plugin (An archetype which contains a sample Maven plugin.)
4: internal -> org.apache.maven.archetypes:maven-archetype-plugin-site (An archetype which contains a sample Maven plugin site.
      This archetype can be layered upon an existing Maven plugin project.)
5: internal -> org.apache.maven.archetypes:maven-archetype-portlet (An archetype which contains a sample JSR-268 Portlet.)
6: internal -> org.apache.maven.archetypes:maven-archetype-profiles ()
7: internal -> org.apache.maven.archetypes:maven-archetype-quickstart (An archetype which contains a sample Maven project.)
8: internal -> org.apache.maven.archetypes:maven-archetype-site (An archetype which contains a sample Maven site which demonstrates
      some of the supported document types like APT, XDoc, and FML and demonstrates how
      to i18n your site. This archetype can be layered upon an existing Maven project.)
9: internal -> org.apache.maven.archetypes:maven-archetype-site-simple (An archetype which contains a sample Maven site.)
10: internal -> org.apache.maven.archetypes:maven-archetype-webapp (An archetype which contains a sample Maven Webapp project.)
Choose a number or apply filter (format: [groupId:]artifactId, case sensitive contains): 7: 
Define value for property 'groupId': ten
Define value for property 'artifactId': nine
Define value for property 'version' 1.0-SNAPSHOT: : 1.0
Define value for property 'package' ten: : eight
Confirm properties configuration:
groupId: ten
artifactId: nine
version: 1.0
package: eight
 Y: : 
[INFO] ----------------------------------------------------------------------------
[INFO] Using following parameters for creating project from Old (1.x) Archetype: maven-archetype-quickstart:1.1
[INFO] ----------------------------------------------------------------------------
[INFO] Parameter: basedir, Value: /home/WBQ/code/maven/one
[INFO] Parameter: package, Value: eight
[INFO] Parameter: groupId, Value: ten
[INFO] Parameter: artifactId, Value: nine
[INFO] Parameter: packageName, Value: eight
[INFO] Parameter: version, Value: 1.0
[INFO] project created from Old (1.x) Archetype in dir: /home/WBQ/code/maven/one/nine
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  39.874 s
[INFO] Finished at: 2023-05-30T22:30:42+08:00
[INFO] ------------------------------------------------------------------------
[WBQ@westgisB064 one]$
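The interactive prompts above can also be skipped entirely by passing the coordinates on the command line in batch mode; the flags below simply mirror the values chosen interactively (groupId `ten`, artifactId `nine`, version `1.0`, package `eight`):

```shell
mvn archetype:generate -B \
  -DarchetypeGroupId=org.apache.maven.archetypes \
  -DarchetypeArtifactId=maven-archetype-quickstart \
  -DgroupId=ten -DartifactId=nine \
  -Dversion=1.0 -Dpackage=eight
```

This is convenient when scripting project creation, since no prompts appear.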

2. Configure the project's pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>ten</groupId>
  <artifactId>nine</artifactId>
  <version>1.0</version>
  <packaging>jar</packaging>

  <name>nine</name>
  <url>http://maven.apache.org</url>


 <!-- Dependency version management -->
    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <java.version>1.8</java.version>
        <hadoop.version>3.1.3</hadoop.version>
        <log4j.version>1.2.14</log4j.version>
        <junit.version>4.8.2</junit.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>${log4j.version}</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>eight.dailyAccessCount</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

3. Write the code

package eight;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;


import java.io.IOException;

public class dailyAccessCount {
    // Mapper: parse each "course score" line and emit (course, score).
    public static class FindMaxMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text course = new Text();
        private final IntWritable score = new IntWritable();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split on any run of whitespace, so tabs or extra spaces also work.
            String[] values = value.toString().trim().split("\\s+");
            course.set(values[0]);
            score.set(Integer.parseInt(values[1]));
            context.write(course, score);
        }
    }

    // Reducer: for each course, keep the highest score seen.
    public static class FindMaxReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxScore = Integer.MIN_VALUE;
            for (IntWritable score : values) {
                if (score.get() > maxScore) {
                    maxScore = score.get();
                }
            }
            context.write(key, new IntWritable(maxScore));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: FindMax <input> <output>");
            System.exit(-1);
        }
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "findmax");
        job.setJarByClass(dailyAccessCount.class);
        job.setMapperClass(FindMaxMapper.class);
        job.setReducerClass(FindMaxReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setNumReduceTasks(1);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // Delete the output path if it already exists, so reruns don't fail.
        FileSystem.get(conf).delete(new Path(args[1]), true);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Prints 0 on success, 1 on failure.
        System.out.println(job.waitForCompletion(true) ? 0 : 1);
    }
}
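The group-by-course / keep-the-max logic of the reducer can be checked without a cluster. The sketch below is plain Java (the class name `MaxScoreLocal` is made up for illustration, not part of the job) that applies the same parsing and aggregation to in-memory lines:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MaxScoreLocal {
    // Same logic as FindMaxMapper + FindMaxReducer, run in memory:
    // parse "course score" lines, group by course, keep the maximum score.
    static Map<String, Integer> maxPerCourse(List<String> lines) {
        Map<String, Integer> max = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            // merge() keeps the larger of the stored and incoming score.
            max.merge(parts[0], Integer.parseInt(parts[1]), Math::max);
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList(
                "语文 102", "数学 30", "英语 88",
                "语文 120", "数学 100", "英语 67");
        maxPerCourse(data).forEach((course, score) ->
                System.out.println(course + "\t" + score));
    }
}
```

Running it on the sample data from score.txt should reproduce the per-course maxima that the cluster job computes.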

4. Compile

mvn compile

5. Package

mvn install

(mvn install runs the package phase and then copies the JAR into the local repository; `mvn package` alone also produces target/nine-1.0.jar.)

II. Submitting the JAR to the Cluster

1. Prepare the data

vi score.txt
语文 102
数学 30
英语 88
语文 120
数学 100
英语 67
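Before submitting the job, the expected answer can be sanity-checked locally; this awk one-liner (a local check only, not part of the MapReduce job) reproduces the reduce logic on the same six records:

```shell
printf '语文 102\n数学 30\n英语 88\n语文 120\n数学 100\n英语 67\n' |
  awk '{ if (!($1 in max) || $2 > max[$1]) max[$1] = $2 }
       END { for (c in max) print c "\t" max[c] }' |
  LC_ALL=C sort
```

Its output should match what the cluster job later writes to /output.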

2. Start the cluster

[WBQ@westgisB064 ~]$ $HADOOP_HOME/sbin/start-dfs.sh
Starting namenodes on [westgisB064]
Starting datanodes
Starting secondary namenodes [westgisB064]
[WBQ@westgisB064 ~]$ $HADOOP_HOME/sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
[WBQ@westgisB064 ~]$

3. Upload the data

[WBQ@westgisB064 ~]$ hdfs dfs -mkdir /input
[WBQ@westgisB064 ~]$ hdfs dfs -put /home/WBQ/code/maven/one/nine/score.txt /input
2023-05-30 22:51:56,341 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[WBQ@westgisB064 ~]$

4. Run the JAR

[WBQ@westgisB064 ~]$ hadoop jar /home/WBQ/code/maven/one/nine/target/nine-1.0.jar /input /output
2023-05-30 23:00:18,488 INFO client.RMProxy: Connecting to ResourceManager at westgisB064/10.103.105.64:8032
2023-05-30 23:00:19,047 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2023-05-30 23:00:19,078 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/WBQ/.staging/job_1685458797877_0001
2023-05-30 23:00:19,222 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2023-05-30 23:00:19,679 INFO input.FileInputFormat: Total input files to process : 1
2023-05-30 23:00:19,711 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2023-05-30 23:00:20,029 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2023-05-30 23:00:20,055 INFO mapreduce.JobSubmitter: number of splits:1
2023-05-30 23:00:20,163 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2023-05-30 23:00:20,195 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1685458797877_0001
2023-05-30 23:00:20,195 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-05-30 23:00:20,385 INFO conf.Configuration: resource-types.xml not found
2023-05-30 23:00:20,386 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-05-30 23:00:20,659 INFO impl.YarnClientImpl: Submitted application application_1685458797877_0001
2023-05-30 23:00:20,700 INFO mapreduce.Job: The url to track the job: http://westgisB064:8088/proxy/application_1685458797877_0001/
2023-05-30 23:00:20,701 INFO mapreduce.Job: Running job: job_1685458797877_0001
2023-05-30 23:00:28,860 INFO mapreduce.Job: Job job_1685458797877_0001 running in uber mode : false
2023-05-30 23:00:28,861 INFO mapreduce.Job:  map 0% reduce 0%
2023-05-30 23:00:33,927 INFO mapreduce.Job:  map 100% reduce 0%
2023-05-30 23:00:39,968 INFO mapreduce.Job:  map 100% reduce 100%
2023-05-30 23:00:40,984 INFO mapreduce.Job: Job job_1685458797877_0001 completed successfully
2023-05-30 23:00:41,094 INFO mapreduce.Job: Counters: 53
	File System Counters
		FILE: Number of bytes read=84
		FILE: Number of bytes written=436247
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=167
		HDFS: Number of bytes written=32
		HDFS: Number of read operations=8
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2778
		Total time spent by all reduces in occupied slots (ms)=3010
		Total time spent by all map tasks (ms)=2778
		Total time spent by all reduce tasks (ms)=3010
		Total vcore-milliseconds taken by all map tasks=2778
		Total vcore-milliseconds taken by all reduce tasks=3010
		Total megabyte-milliseconds taken by all map tasks=2844672
		Total megabyte-milliseconds taken by all reduce tasks=3082240
	Map-Reduce Framework
		Map input records=6
		Map output records=6
		Map output bytes=66
		Map output materialized bytes=84
		Input split bytes=104
		Combine input records=0
		Combine output records=0
		Reduce input groups=3
		Reduce shuffle bytes=84
		Reduce input records=6
		Reduce output records=3
		Spilled Records=12
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=142
		CPU time spent (ms)=1890
		Physical memory (bytes) snapshot=658104320
		Virtual memory (bytes) snapshot=5683085312
		Total committed heap usage (bytes)=857735168
		Peak Map Physical memory (bytes)=358383616
		Peak Map Virtual memory (bytes)=2836721664
		Peak Reduce Physical memory (bytes)=299720704
		Peak Reduce Virtual memory (bytes)=2846363648
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=63
	File Output Format Counters 
		Bytes Written=32
0
[WBQ@westgisB064 ~]$

5. View the results

[WBQ@westgisB064 ~]$ hdfs dfs -cat /output/*
2023-05-30 23:01:27,454 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
数学	100
英语	88
语文	120
[WBQ@westgisB064 ~]$

6. Stop the cluster when finished

[WBQ@westgisB064 ~]$ $HADOOP_HOME/sbin/stop-yarn.sh
Stopping nodemanagers
westgisB065: WARNING: nodemanager did not stop gracefully after 5 seconds: Trying to kill with kill -9
Stopping resourcemanager
[WBQ@westgisB064 ~]$ $HADOOP_HOME/sbin/stop-dfs.sh
Stopping namenodes on [westgisB064]
Stopping datanodes
Stopping secondary namenodes [westgisB064]
[WBQ@westgisB064 ~]$ ps aux|grep java
WBQ      27055  0.0  0.0 112712   980 pts/2    R+   23:02   0:00 grep --color=auto java
[WBQ@westgisB064 ~]$

Note: part of the code comes from 不争气大王.
