Running Local and Pseudo-Distributed Programs on Hadoop

1. Running a Local Program on Hadoop


(1) Compile (SequenceFileWriterDemo.java)

javac -classpath /usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar SequenceFileWriterDemo.java -d classes

Where:

-classpath: specifies the jar(s) the compiler needs on its classpath.

/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar: the jar of the locally installed Hadoop (the local environment here is macOS, with Hadoop installed via Homebrew).

-d: specifies where the generated class files go; classes means the compiled .class files are written to the classes directory (which must already exist).


Note: the classes directory here is /Users/zhuqiuhui/Downloads/classes.
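
A version-independent variant of the compile step (a sketch; the hadoop classpath command prints the full client classpath, so the jar path does not have to be hard-coded and the command survives Hadoop upgrades):

javac -classpath "$(hadoop classpath)" SequenceFileWriterDemo.java -d classes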

(2) Run

export HADOOP_CLASSPATH=/Users/zhuqiuhui/Downloads/classes (sets the HADOOP_CLASSPATH environment variable, which adds the application's classes to the runtime classpath; this is a local filesystem path, and the classes directory holds the SequenceFileWriterDemo.class compiled above)

hadoop SequenceFileWriterDemo res.txt (run from inside the classes directory; res.txt is created in the pseudo-distributed HDFS)

Note: core-site.xml sets the default HDFS URI:

<configuration>   
    <property>   
        <name>fs.defaultFS</name>   
        <value>hdfs://localhost:9000</value>   
    </property>   
</configuration> 
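
To confirm the write, the file can be inspected directly in HDFS. hadoop fs -text understands SequenceFiles and prints each record as key, tab, value (the relative path resolves against the HDFS home directory, /user/<username>):

hadoop fs -ls res.txt
hadoop fs -text res.txt | head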

(3) Code and output

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

import java.io.IOException;
import java.net.URI;

/**
 * Created by zhuqiuhui on 2017/7/13.
 */
public class SequenceFileWriterDemo {
    private static final String[] DATA = {
            "One, two, buckle my shoe",
            "Three, four, shut the door",
            "Five, six, pick up sticks",
            "Seven, eight, lay them straight",
            "Nine, ten, a big fat hen"
    };

    public static void main(String[] args) throws IOException {
        String uri = args[0];   // e.g. res.txt, resolved against the HDFS home directory
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path path = new Path(uri);

        IntWritable key = new IntWritable();
        Text value = new Text();
        SequenceFile.Writer writer = null;
        try {
            // createWriter takes the key/value classes so the file header records the types
            writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass());
            for (int i = 0; i < 100; i++) {
                key.set(100 - i);
                value.set(DATA[i % DATA.length]);
                // getLength() is the byte offset at which this record will start
                System.out.printf("[%s]\t%s\t%s\n", writer.getLength(), key, value);
                writer.append(key, value);
            }
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}


Output:
[128]   100 One, two, buckle my shoe
[173]   99  Three, four, shut the door
[220]   98  Five, six, pick up sticks
[264]   97  Seven, eight, lay them straight
[314]   96  Nine, ten, a big fat hen
[359]   95  One, two, buckle my shoe
[404]   94  Three, four, shut the door
[451]   93  Five, six, pick up sticks
[495]   92  Seven, eight, lay them straight
[545]   91  Nine, ten, a big fat hen
[590]   90  One, two, buckle my shoe
[635]   89  Three, four, shut the door
[682]   88  Five, six, pick up sticks
[726]   87  Seven, eight, lay them straight
[776]   86  Nine, ten, a big fat hen
[821]   85  One, two, buckle my shoe
[866]   84  Three, four, shut the door
[913]   83  Five, six, pick up sticks
[957]   82  Seven, eight, lay them straight
[1007]  81  Nine, ten, a big fat hen
[1052]  80  One, two, buckle my shoe
[1097]  79  Three, four, shut the door
[1144]  78  Five, six, pick up sticks
[1188]  77  Seven, eight, lay them straight
[1238]  76  Nine, ten, a big fat hen
[1283]  75  One, two, buckle my shoe
[1328]  74  Three, four, shut the door
[1375]  73  Five, six, pick up sticks
[1419]  72  Seven, eight, lay them straight
[1469]  71  Nine, ten, a big fat hen
[1514]  70  One, two, buckle my shoe
[1559]  69  Three, four, shut the door
[1606]  68  Five, six, pick up sticks
[1650]  67  Seven, eight, lay them straight
[1700]  66  Nine, ten, a big fat hen
[1745]  65  One, two, buckle my shoe
[1790]  64  Three, four, shut the door
[1837]  63  Five, six, pick up sticks
[1881]  62  Seven, eight, lay them straight
[1931]  61  Nine, ten, a big fat hen
[1976]  60  One, two, buckle my shoe
[2021]  59  Three, four, shut the door
[2088]  58  Five, six, pick up sticks
[2132]  57  Seven, eight, lay them straight
[2182]  56  Nine, ten, a big fat hen
[2227]  55  One, two, buckle my shoe
[2272]  54  Three, four, shut the door
[2319]  53  Five, six, pick up sticks
[2363]  52  Seven, eight, lay them straight
[2413]  51  Nine, ten, a big fat hen
[2458]  50  One, two, buckle my shoe
[2503]  49  Three, four, shut the door
[2550]  48  Five, six, pick up sticks
[2594]  47  Seven, eight, lay them straight
[2644]  46  Nine, ten, a big fat hen
[2689]  45  One, two, buckle my shoe
[2734]  44  Three, four, shut the door
[2781]  43  Five, six, pick up sticks
[2825]  42  Seven, eight, lay them straight
[2875]  41  Nine, ten, a big fat hen
[2920]  40  One, two, buckle my shoe
[2965]  39  Three, four, shut the door
[3012]  38  Five, six, pick up sticks
[3056]  37  Seven, eight, lay them straight
[3106]  36  Nine, ten, a big fat hen
[3151]  35  One, two, buckle my shoe
[3196]  34  Three, four, shut the door
[3243]  33  Five, six, pick up sticks
[3287]  32  Seven, eight, lay them straight
[3337]  31  Nine, ten, a big fat hen
[3382]  30  One, two, buckle my shoe
[3427]  29  Three, four, shut the door
[3474]  28  Five, six, pick up sticks
[3518]  27  Seven, eight, lay them straight
[3568]  26  Nine, ten, a big fat hen
[3613]  25  One, two, buckle my shoe
[3658]  24  Three, four, shut the door
[3705]  23  Five, six, pick up sticks
[3749]  22  Seven, eight, lay them straight
[3799]  21  Nine, ten, a big fat hen
[3844]  20  One, two, buckle my shoe
[3889]  19  Three, four, shut the door
[3936]  18  Five, six, pick up sticks
[3980]  17  Seven, eight, lay them straight
[4030]  16  Nine, ten, a big fat hen
[4075]  15  One, two, buckle my shoe
[4140]  14  Three, four, shut the door
[4187]  13  Five, six, pick up sticks
[4231]  12  Seven, eight, lay them straight
[4281]  11  Nine, ten, a big fat hen
[4326]  10  One, two, buckle my shoe
[4371]  9   Three, four, shut the door
[4418]  8   Five, six, pick up sticks
[4462]  7   Seven, eight, lay them straight
[4512]  6   Nine, ten, a big fat hen
[4557]  5   One, two, buckle my shoe
[4602]  4   Three, four, shut the door
[4649]  3   Five, six, pick up sticks
[4693]  2   Seven, eight, lay them straight
[4743]  1   Nine, ten, a big fat hen
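
For completeness, a minimal reader counterpart (a sketch; the class name SequenceFileReaderDemo is hypothetical, and it is compiled and run the same way as the writer). SequenceFile.Reader walks the file back, and next(key, value) returns false at end of file:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

import java.io.IOException;
import java.net.URI;

public class SequenceFileReaderDemo {
    public static void main(String[] args) throws IOException {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path path = new Path(uri);

        SequenceFile.Reader reader = null;
        try {
            reader = new SequenceFile.Reader(fs, path, conf);
            IntWritable key = new IntWritable();   // matches the writer's key type
            Text value = new Text();               // matches the writer's value type
            while (reader.next(key, value)) {
                System.out.printf("%s\t%s\n", key, value);
            }
        } finally {
            IOUtils.closeStream(reader);
        }
    }
}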


2. Running a Pseudo-Distributed Program on Hadoop (WordCount)


(1) Compile

javac -classpath /usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/hadoop-common-2.8.0.jar:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.8.0.jar  WordCount.java -d classes

After compilation, the classes directory contains WordCount.class, WordCount$IntSumReducer.class, and WordCount$TokenizerMapper.class.

(2) Package

jar -cvf wordCount.jar -C classes . (packages the compiled classes into wordCount.jar; the -C classes . form puts the .class files at the root of the jar. Note that jar -cvf wordCount.jar classes would instead nest them under a classes/ prefix, where the driver class WordCount cannot be resolved by name.)
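
A quick sanity check on the jar layout (the listing below is what should appear, not captured output):

jar -tf wordCount.jar
META-INF/
META-INF/MANIFEST.MF
WordCount.class
WordCount$IntSumReducer.class
WordCount$TokenizerMapper.class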

(3) Run

hadoop jar wordCount.jar WordCount hdfs://localhost:9000/count.txt /output

The driver takes two arguments: an input path and an output path.
The input file is already in the HDFS root directory, i.e. count.txt; its contents:


hadoop mapreduce   
hadoop yarn
hadoop hdfs
hadoop mapreduce   
hadoop yarn
hadoop hdfs
zqh gkn
lzy zqh


(4) Code and output


import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;
import java.util.StringTokenizer;

/**
 * Created by zhuqiuhui on 2017/7/14.
 */
public class WordCount extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        // NOTE: creating a fresh Configuration here instead of using getConf()
        // bypasses ToolRunner's option parsing, hence the warning in the log below
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "wordcount"); // deprecated in 2.x; Job.getInstance(conf, ...) is preferred
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true)?0:1;
    }

    // NOTE: non-static inner class, the cause of the failure diagnosed below
    public class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        IntWritable one = new IntWritable(1);
        Text word = new Text();

        public void map(Object key, Text value, Context context) throws IOException,InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while(itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // NOTE: non-static inner class, same problem as TokenizerMapper
    public class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException,InterruptedException {
            int sum = 0;
            for(IntWritable val:values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key,result);
        }
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WordCount(), args);
        System.exit(exitCode);
    }
}


17/07/14 14:33:44 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/07/14 14:33:45 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/07/14 14:33:45 WARN mapreduce.JobResourceUploader: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
17/07/14 14:33:45 INFO input.FileInputFormat: Total input files to process : 1
17/07/14 14:33:45 INFO mapreduce.JobSubmitter: number of splits:1
17/07/14 14:33:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1499913113300_0006
17/07/14 14:33:45 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
17/07/14 14:33:45 INFO impl.YarnClientImpl: Submitted application application_1499913113300_0006
17/07/14 14:33:45 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1499913113300_0006/
17/07/14 14:33:45 INFO mapreduce.Job: Running job: job_1499913113300_0006
17/07/14 14:33:51 INFO mapreduce.Job: Job job_1499913113300_0006 running in uber mode : false
17/07/14 14:33:51 INFO mapreduce.Job:  map 0% reduce 0%
17/07/14 14:33:54 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
	at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
	... 8 more

17/07/14 14:33:58 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
	at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
	... 8 more

17/07/14 14:34:01 INFO mapreduce.Job: Task Id : attempt_1499913113300_0006_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
	at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.ClassNotFoundException: Class WordCount$TokenizerMapper not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
	... 8 more

17/07/14 14:34:06 INFO mapreduce.Job:  map 100% reduce 100%
17/07/14 14:34:06 INFO mapreduce.Job: Job job_1499913113300_0006 failed with state FAILED due to: Task failed task_1499913113300_0006_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/07/14 14:34:07 INFO mapreduce.Job: Counters: 13
	Job Counters
		Failed map tasks=4
		Killed reduce tasks=1
		Launched map tasks=4
		Other local map tasks=3
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=7720
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=7720
		Total time spent by all reduce tasks (ms)=0
		Total vcore-milliseconds taken by all map tasks=7720
		Total vcore-milliseconds taken by all reduce tasks=0
		Total megabyte-milliseconds taken by all map tasks=7905280
		Total megabyte-milliseconds taken by all reduce tasks=0

The ClassNotFoundException above occurred because the Mapper and Reducer inner classes are not declared static. Hadoop instantiates the map and reduce classes by reflection, and a non-static inner class cannot be constructed without an enclosing WordCount instance, so the framework fails to create one. Declaring both inner classes static fixes the job.
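
The corrected declarations (a sketch of the fix; only the changed lines are shown, the method bodies stay the same):

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        // map() unchanged
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // reduce() unchanged
    }

After rebuilding the jar and rerunning, hadoop fs -cat /output/part-r-00000 would show the following counts (derived by hand from the input above, not captured output):

gkn	1
hadoop	6
hdfs	2
lzy	1
mapreduce	2
yarn	2
zqh	2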


3. Notes


Note: when running a job on a Hadoop cluster, the program must be packaged as a jar file.
In local and pseudo-distributed mode you can run either a jar or plain class files. To run class files directly, however, the program must contain no Mapper or Reducer; it should obtain a FileSystem and operate on it directly, as SequenceFileWriterDemo above does.
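
As another example of the class-file style (a minimal sketch; the class name ListHome is hypothetical), any driver that only touches FileSystem can be compiled and run exactly like SequenceFileWriterDemo:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

public class ListHome {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml, i.e. hdfs://localhost:9000
        FileSystem fs = FileSystem.get(new Configuration());
        // List the HDFS home directory (/user/<username>) with file sizes
        for (FileStatus stat : fs.listStatus(fs.getHomeDirectory())) {
            System.out.println(stat.getPath() + "\t" + stat.getLen());
        }
    }
}

Compile it with hadoop-common on the classpath, add its directory to HADOOP_CLASSPATH, and run hadoop ListHome.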
