1. Running a Local Program with Hadoop
(1) Compile (SequenceFileWriterDemo.java)
javac -classpath /usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar SequenceFileWriterDemo.java -d classes
Where:
-classpath: the jar(s) the compiler needs on its classpath
/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar: the jar from the locally installed Hadoop (the local environment is macOS)
-d: where to put the generated class files; "classes" here means the .class files go into the classes directory (the directory must already exist)
Note: the classes directory is /Users/zhuqiuhui/Downloads/classes
(2) Run
export HADOOP_CLASSPATH=/Users/zhuqiuhui/Downloads/classes (HADOOP_CLASSPATH adds the application's classes to the path; this is a local filesystem path, and the classes directory holds the SequenceFileWriterDemo.class compiled above)
hadoop SequenceFileWriterDemo res.txt (run from inside the classes directory; res.txt is written to the pseudo-distributed HDFS)
Note: core-site.xml sets the default HDFS filesystem:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
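Because fs.defaultFS is hdfs://localhost:9000, a path argument with no scheme (like res.txt above) is handed to the HDFS filesystem rather than the local one. As a loose illustration using plain java.net.URI resolution (the class name DefaultFsResolution is illustrative, and note that real HDFS additionally resolves relative paths against the user's home directory such as /user/<name>, which this sketch ignores):

```java
import java.net.URI;

public class DefaultFsResolution {
    public static void main(String[] args) {
        // "res.txt" carries no scheme, so Hadoop falls back to fs.defaultFS.
        // Plain URI resolution against the default filesystem root shows the idea:
        URI defaultFs = URI.create("hdfs://localhost:9000/");
        System.out.println(defaultFs.resolve("res.txt"));  // hdfs://localhost:9000/res.txt
    }
}
```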
(3) Code and output
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import java.io.IOException;
import java.net.URI;

/**
 * Created by zhuqiuhui on 2017/7/13.
 */
public class SequenceFileWriterDemo {
    private static final String[] DATA = {
        "One, two, buckle my shoe",
        "Three, four, shut the door",
        "Five, six, pick up sticks",
        "Seven, eight, lay them straight",
        "Nine, ten, a big fat hen"
    };

    public static void main(String[] args) throws IOException {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path path = new Path(uri);
        IntWritable key = new IntWritable();
        Text value = new Text();
        SequenceFile.Writer writer = null;
        writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass());
        for (int i = 0; i < 100; ++i) {
            key.set(100 - i);
            value.set(DATA[i % DATA.length]);
            System.out.printf("[%s]\t%s\t%s\n", writer.getLength(), key, value);
            writer.append(key, value);
        }
        IOUtils.closeStream(writer);
    }
}
Output:
[128] 100 One, two, buckle my shoe
[173] 99 Three, four, shut the door
[220] 98 Five, six, pick up sticks
[264] 97 Seven, eight, lay them straight
[314] 96 Nine, ten, a big fat hen
[359] 95 One, two, buckle my shoe
[404] 94 Three, four, shut the door
[451] 93 Five, six, pick up sticks
[495] 92 Seven, eight, lay them straight
[545] 91 Nine, ten, a big fat hen
[590] 90 One, two, buckle my shoe
[635] 89 Three, four, shut the door
[682] 88 Five, six, pick up sticks
[726] 87 Seven, eight, lay them straight
[776] 86 Nine, ten, a big fat hen
[821] 85 One, two, buckle my shoe
[866] 84 Three, four, shut the door
[913] 83 Five, six, pick up sticks
[957] 82 Seven, eight, lay them straight
[1007] 81 Nine, ten, a big fat hen
[1052] 80 One, two, buckle my shoe
[1097] 79 Three, four, shut the door
[1144] 78 Five, six, pick up sticks
[1188] 77 Seven, eight, lay them straight
[1238] 76 Nine, ten, a big fat hen
[1283] 75 One, two, buckle my shoe
[1328] 74 Three, four, shut the door
[1375] 73 Five, six, pick up sticks
[1419] 72 Seven, eight, lay them straight
[1469] 71 Nine, ten, a big fat hen
[1514] 70 One, two, buckle my shoe
[1559] 69 Three, four, shut the door
[1606] 68 Five, six, pick up sticks
[1650] 67 Seven, eight, lay them straight
[1700] 66 Nine, ten, a big fat hen
[1745] 65 One, two, buckle my shoe
[1790] 64 Three, four, shut the door
[1837] 63 Five, six, pick up sticks
[1881] 62 Seven, eight, lay them straight
[1931] 61 Nine, ten, a big fat hen
[1976] 60 One, two, buckle my shoe
[2021] 59 Three, four, shut the door
[2088] 58 Five, six, pick up sticks
[2132] 57 Seven, eight, lay them straight
[2182] 56 Nine, ten, a big fat hen
[2227] 55 One, two, buckle my shoe
[2272] 54 Three, four, shut the door
[2319] 53 Five, six, pick up sticks
[2363] 52 Seven, eight, lay them straight
[2413] 51 Nine, ten, a big fat hen
[2458] 50 One, two, buckle my shoe
[2503] 49 Three, four, shut the door
[2550] 48 Five, six, pick up sticks
[2594] 47 Seven, eight, lay them straight
[2644] 46 Nine, ten, a big fat hen
[2689] 45 One, two, buckle my shoe
[2734] 44 Three, four, shut the door
[2781] 43 Five, six, pick up sticks
[2825] 42 Seven, eight, lay them straight
[2875] 41 Nine, ten, a big fat hen
[2920] 40 One, two, buckle my shoe
[2965] 39 Three, four, shut the door
[3012] 38 Five, six, pick up sticks
[3056] 37 Seven, eight, lay them straight
[3106] 36 Nine, ten, a big fat hen
[3151] 35 One, two, buckle my shoe
[3196] 34 Three, four, shut the door
[3243] 33 Five, six, pick up sticks
[3287] 32 Seven, eight, lay them straight
[3337] 31 Nine, ten, a big fat hen
[3382] 30 One, two, buckle my shoe
[3427] 29 Three, four, shut the door
[3474] 28 Five, six, pick up sticks
[3518] 27 Seven, eight, lay them straight
[3568] 26 Nine, ten, a big fat hen
[3613] 25 One, two, buckle my shoe
[3658] 24 Three, four, shut the door
[3705] 23 Five, six, pick up sticks
[3749] 22 Seven, eight, lay them straight
[3799] 21 Nine, ten, a big fat hen
[3844] 20 One, two, buckle my shoe
[3889] 19 Three, four, shut the door
[3936] 18 Five, six, pick up sticks
[3980] 17 Seven, eight, lay them straight
[4030] 16 Nine, ten, a big fat hen
[4075] 15 One, two, buckle my shoe
[4140] 14 Three, four, shut the door
[4187] 13 Five, six, pick up sticks
[4231] 12 Seven, eight, lay them straight
[4281] 11 Nine, ten, a big fat hen
[4326] 10 One, two, buckle my shoe
[4371] 9 Three, four, shut the door
[4418] 8 Five, six, pick up sticks
[4462] 7 Seven, eight, lay them straight
[4512] 6 Nine, ten, a big fat hen
[4557] 5 One, two, buckle my shoe
[4602] 4 Three, four, shut the door
[4649] 3 Five, six, pick up sticks
[4693] 2 Seven, eight, lay them straight
[4743] 1 Nine, ten, a big fat hen
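For reference, the bracketed number in each line is writer.getLength(), the byte position in the file before that record is appended: the first record starts at offset 128, just after the SequenceFile header, and the occasional 20-byte jumps (e.g. 2021 to 2088) appear to be sync markers. The key/value pattern itself is just the loop logic, which can be replayed without any Hadoop I/O (KeyValuePattern is an illustrative name, not part of the original program):

```java
public class KeyValuePattern {
    static final String[] DATA = {
        "One, two, buckle my shoe",
        "Three, four, shut the door",
        "Five, six, pick up sticks",
        "Seven, eight, lay them straight",
        "Nine, ten, a big fat hen"
    };

    // Same logic as the writer loop, minus the Hadoop I/O: the key counts
    // down from 100 while the value cycles through the five DATA lines.
    static String line(int i) {
        return (100 - i) + "\t" + DATA[i % DATA.length];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println(line(i));
        }
    }
}
```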
2. Running a Pseudo-Distributed Program with Hadoop (WordCount)
(1) Compile
javac -classpath /usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.8.0.jar WordCount.java -d classes
This produces WordCount.class, WordCount$IntSumReducer.class, and WordCount$TokenizerMapper.class in the classes directory.
(2) Package
jar -cvf wordCount.jar classes (packages the classes into wordCount.jar)
(3) Run
hadoop jar wordCount.jar WordCount hdfs://localhost:9000/count.txt /output
The main program takes two arguments: an input path (the input file) and an output path (for the results).
The input file, count.txt, is already in the HDFS root directory; its contents:
hadoop mapreduce
hadoop yarn
hadoop hdfs
hadoop mapreduce
hadoop yarn
hadoop hdfs
zqh gkn
lzy zqh
(4) Code and output
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import java.io.IOException;
import java.util.StringTokenizer;

/**
 * Created by zhuqiuhui on 2017/7/14.
 */
public class WordCount extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "wordcount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    // Note: declared here as a non-static inner class -- this is the bug
    // diagnosed after the job log that follows.
    public class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        IntWritable one = new IntWritable(1);
        Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Note: also a non-static inner class, with the same problem.
    public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WordCount(), args);
        System.exit(exitCode);
    }
}
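The TokenizerMapper/IntSumReducer pipeline above can be checked without a cluster: this plain-Java sketch (the class name LocalWordCount is illustrative) applies the same tokenization and summing to the sample input, showing the counts the job should eventually produce.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {
    // Simulates TokenizerMapper + IntSumReducer in-process: tokenize on
    // whitespace (same as StringTokenizer in the mapper), then sum per word
    // (same as the reducer's loop over values).
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "hadoop mapreduce\nhadoop yarn\nhadoop hdfs\n"
                     + "hadoop mapreduce\nhadoop yarn\nhadoop hdfs\n"
                     + "zqh gkn\nlzy zqh\n";
        count(input).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```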
17/07/14 14:33:44 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/07/14 14:33:45 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/07/14 14:33:45 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
17/07/14 14:33:45 INFO input.FileInputFormat: Total input files to process : 1
17/07/14 14:33:45 INFO mapreduce.JobSubmitter: number of splits:1
17/07/14 14:33:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1499913113300_0006
17/07/14 14:33:45 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
17/07/14 14:33:45 INFO impl.YarnClientImpl: Submitted application application_1499913113300_0006
17/07/14 14:33:45 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1499913113300_0006/
17/07/14 14:33:45 INFO mapreduce.Job: Running job: job_1499913113300_0006
17/07/14 14:33:51 INFO mapreduce.Job: Job job_1499913113300_0006 running in uber mode : false
17/07/14 14:33:51 INFO mapreduce.Job: map 0% reduce 0%
17/07/14 14:33:54 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
... 8 more
17/07/14 14:33:58 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
... 8 more
17/07/14 14:34:01 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
... 8 more
17/07/14 14:34:06 INFO mapreduce.Job: map 100% reduce 100%
17/07/14 14:34:06 INFO mapreduce.Job: Job job_1499913113300_0006 failed with state FAILED due to: Task failed task_1499913113300_0006_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
17/07/14 14:34:07 INFO mapreduce.Job: Counters: 13
Job Counters
Failed map tasks=4
Killed reduce tasks=1
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=7720
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7720
Total time spent by all reduce tasks (ms)=0
Total vcore-milliseconds taken by all map tasks=7720
Total vcore-milliseconds taken by all reduce tasks=0
Total megabyte-milliseconds taken by all map tasks=7905280
Total megabyte-milliseconds taken by all reduce tasks=0
The ClassNotFoundException above occurred while running the MapReduce job because the map and reduce classes were not declared static. Hadoop instantiates the mapper and reducer classes through reflection, and a non-static inner class cannot be created that way: it requires an instance of the enclosing class, which the framework does not have. Adding the static modifier to both inner classes fixes the error.
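The root cause is visible with plain reflection (InnerClassDemo is an illustrative stand-in for WordCount): every constructor of a non-static inner class takes the enclosing instance as a hidden first parameter, so a reflective no-arg instantiation, which is what Hadoop attempts, has no constructor to call.

```java
public class InnerClassDemo {
    class NonStaticMapper {}       // like the buggy TokenizerMapper
    static class StaticMapper {}   // the fixed version

    public static void main(String[] args) {
        // The inner class's implicit constructor takes one hidden parameter
        // (the enclosing InnerClassDemo instance); the static nested class's
        // constructor takes none, so it can be instantiated reflectively.
        System.out.println(NonStaticMapper.class.getDeclaredConstructors()[0].getParameterCount()); // 1
        System.out.println(StaticMapper.class.getDeclaredConstructors()[0].getParameterCount());    // 0
    }
}
```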
3. Notes
When running a job on a Hadoop cluster, the program must be packaged as a jar file.
In local and pseudo-distributed modes you can run either a jar file or a class file directly; running a class file directly only works for programs with no mapper or reducer, i.e. programs that obtain a FileSystem and operate on it directly.