MapReduce Course Lab

Algorithm 1: Create a file "words.txt" and upload it to HDFS
Code:
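import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateFile {

public static void main(String[] args) throws Exception {

  // Configuration that tells the client where the cluster is
  Configuration conf = new Configuration();
  // Address of the HDFS NameNode on the Linux host
  conf.set("fs.defaultFS", "hdfs://master:8020");
  // Obtain the HDFS file-system handle from the configuration
  FileSystem hdfs = FileSystem.get(conf);
  // Text to be written to the file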
  byte[] buf = ("Asia is better than the rest of the world. We should continue to develop and build" +
        " Asia well, show Asia's resilience, wisdom and strength, and build an anchor of peace " +
        "and stability, a source of growth and a new highland of cooperation in the world." +
        "First, firmly safeguard peace in Asia. Today, the five principles of peaceful " +
        "coexistence and the \"Bandung spirit\" initiated by Asia are of more practical " +
        "significance. We should uphold the principles of mutual respect, equality and mutual " +
        "benefit and peaceful coexistence, pursue a policy of good neighborliness and friendship," +
        " and firmly hold our destiny in our own hands." +
        "Second, actively promote Asian cooperation. The agreement on regional comprehensive " +
        "economic partnership has officially entered into force, and the railway between China " +
        "and old fellow has been opened to traffic, effectively enhancing the level of regional " +
        "hard interconnection and soft Unicom. We should take this opportunity to promote the " +
        "formation of a larger and more open market in Asia and take new steps in promoting " +
        "win-win cooperation in Asia." +
        "Third, jointly promote Asian solidarity. We should consolidate the central position of " +
        "ASEAN in the regional structure and maintain a regional order that takes into account" +
        " the demands of all parties and embraces the interests of all parties. Countries, big" +
        " or small, strong or weak, both within and outside the region, should add luster to " +
        "Asia without adding chaos. They should jointly follow the path of peaceful development" +
        ", seek win-win cooperation and create a united and progressive Asian family." +
        "China will fully implement the new development concept, accelerate the construction " +
        "of a new development pattern and strive to promote high-quality development. No matter" +
        " what changes take place in the world, China's confidence and will in reform and opening" +
        " up will not waver. China will unswervingly follow the path of peaceful development " +
        "and always be a builder of world peace, a contributor to global development and a " +
        "defender of the international order." +
        "Often do, not afraid of thousands of things. As long as we join hands and work " +
        "together, we will be able to gather the great strength of win-win cooperation, " +
        "overcome various challenges on the way forward and usher in a brighter and better" +
        " future for mankind.").getBytes();

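  // Target path in HDFS
  Path dst = new Path("/neusoftin/words.txt");
  // Create the file (an existing file is overwritten) and open an output stream
  FSDataOutputStream out = hdfs.create(dst);

  out.write(buf, 0, buf.length); // write the text into the file
  out.close();
  // Verify that the file was created
  System.out.println(hdfs.exists(dst));

}
}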
Algorithm 2: The main method of the MapReduce word count, implemented with the Tool utility class
Code:
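import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountTool extends Configured implements Tool {

public static void main(String[] args) throws Exception {
    // Configuration used to connect to the cluster
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://master:8020");
    FileSystem hdfs = FileSystem.get(conf);
    // Input files and output directory
    String input = "/neusoftin/*.txt";
    String output = "/neusoftout"; // final MapReduce result; this path must not exist beforehand
    Path outputpath = new Path(output);
    // Delete the result directory before running, if it already exists
    if (hdfs.exists(outputpath)) {
        hdfs.delete(outputpath, true); // true = delete recursively
    }
    // Launch the job through ToolRunner
    args = new String[]{input, output};
    int re = ToolRunner.run(conf, new WordCountTool(), args);

    System.exit(re);
}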

@Override
public int run(String[] strings) throws Exception {
    // Build the job from the configuration passed in by ToolRunner
    Job job = Job.getInstance(getConf());

    job.setJarByClass(WordCountTool.class); // class used to locate the job jar

    job.setInputFormatClass(TextInputFormat.class);
    TextInputFormat.setInputPaths(job, strings[0]); // input path
    // Mapper
    job.setMapperClass(WordCountMapper.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(IntWritable.class);

    // Combiner: summing is associative and commutative, so the reducer can be reused
    job.setCombinerClass(WordCountReducer.class);
    // Reducer
    job.setReducerClass(WordCountReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Output of the result files
    job.setOutputFormatClass(TextOutputFormat.class);
    TextOutputFormat.setOutputPath(job, new Path(strings[1])); // one record per line

    // Run the job and wait for it to finish
    boolean result = job.waitForCompletion(true);
    FileSystem hdfs = FileSystem.get(getConf());
    if (result) {
        // Print every file under the output directory to the console
        for (FileStatus fs : hdfs.listStatus(new Path(strings[1]))) {
            // Wrap the byte stream in a reader (bytes to characters) and print it line by line
            try (FSDataInputStream dis = hdfs.open(fs.getPath());
                 BufferedReader reader = new BufferedReader(new InputStreamReader(dis))) {
                String line = reader.readLine();
                while (line != null) {
                    System.out.println(line);
                    line = reader.readLine();
                }
            }
        }
    }
    return result ? 0 : 1; // non-zero exit code if the job failed
}
}
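The job above registers a combiner, which pre-aggregates map output on each map task before the data is shuffled to the reducers. The following is a minimal plain-Java sketch of why that is safe for word counting: summing partial sums gives the same totals as summing the raw 1s. It does not use Hadoop, and the class name CombinerSketch, the method sumByWord, and the sample pairs are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of what a combiner does; plain Java, no Hadoop.
class CombinerSketch {

    // Sums the counts per word. The same logic serves as "combiner" and "reducer".
    static Map<String, Integer> sumByWord(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new HashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            sums.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Assumed output of two map tasks: one (word, 1) pair per token.
        List<Map.Entry<String, Integer>> task1 =
                List.of(Map.entry("Asia", 1), Map.entry("Asia", 1), Map.entry("peace", 1));
        List<Map.Entry<String, Integer>> task2 =
                List.of(Map.entry("Asia", 1));

        // Combine locally on each task, then reduce the partial sums.
        List<Map.Entry<String, Integer>> partial = new ArrayList<>();
        partial.addAll(sumByWord(task1).entrySet());
        partial.addAll(sumByWord(task2).entrySet());
        Map<String, Integer> combined = sumByWord(partial);

        // Reduce the raw pairs directly, without a combiner.
        List<Map.Entry<String, Integer>> raw = new ArrayList<>(task1);
        raw.addAll(task2);
        Map<String, Integer> direct = sumByWord(raw);

        System.out.println(combined.equals(direct));
    }
}
```

In this sketch fewer pairs cross from the "map side" to the "reduce side" when the partial sums are computed first, which is the point of a combiner.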
Algorithm 3: The MapReduce mapper
Code:
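import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.commons.lang3.StringUtils; // assumed: Commons Lang 3, as bundled with Hadoop 3.x
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

// Optimization: reuse the output objects instead of allocating new ones per record
private Text outMapKey = new Text();

private static final IntWritable outMapValue = new IntWritable(1);

/**
 * @param key     byte offset of the line within the input file
 * @param value   the line of text passed in
 * @param context used to emit the map output
 * @throws IOException
 * @throws InterruptedException
 */
@Override
protected void map(LongWritable key, Text value,
        Mapper<LongWritable, Text, Text, IntWritable>.Context context)
        throws IOException, InterruptedException {
    // The line of text whose words are to be counted
    String line = value.toString();
    // Skip empty or whitespace-only lines
    if (StringUtils.isBlank(line)) {
        return;
    }
    // Split the line into words on whitespace
    StringTokenizer st = new StringTokenizer(line);
    while (st.hasMoreTokens()) { // loop while more words are available
        String word = st.nextToken(); // fetch the next word
        outMapKey.set(word);
        context.write(outMapKey, outMapValue); // pass the key and value on to the reducer
    }
}
}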
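To see which keys the mapper emits, the tokenization step can be tried outside Hadoop. The sketch below is a stand-alone illustration, not part of the lab code: the class name TokenizerDemo and the method tokenize are hypothetical. It applies StringTokenizer with its default whitespace delimiters, as the mapper does, which shows that punctuation stays attached to a word, so "Asia." and "Asia" end up as different keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-alone demo; not part of the lab's Hadoop job.
class TokenizerDemo {

    // Splits a line the same way the mapper does: on whitespace only.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            words.add(st.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // Each token printed here corresponds to one (token, 1) pair from the mapper.
        for (String word : tokenize("firmly safeguard peace in Asia. Asia is better")) {
            System.out.println(word + "\t1");
        }
    }
}
```

If counts per dictionary word are wanted rather than per raw token, the token could be normalized (for example lower-cased and stripped of punctuation) before it is emitted; the lab code does not do this.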
Algorithm 4: The MapReduce reducer
Code:
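import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

/**
 * @param key     the word
 * @param values  all counts emitted for this word
 * @param context used to emit the reduce output
 * @throws IOException
 * @throws InterruptedException
 */
@Override
protected void reduce(Text key, Iterable<IntWritable> values,
        Reducer<Text, IntWritable, Text, IntWritable>.Context context)
        throws IOException, InterruptedException {

    int sum = 0; // running total for this word
    for (IntWritable value : values) { // iterate over the values
        sum += value.get(); // add each count (1 per occurrence, or a partial sum from a combiner)
    }

    context.write(key, new IntWritable(sum)); // emit (word, total)
}

}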
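The mapper and reducer together can be modeled in memory to check the expected result before running on the cluster. This is only a rough sketch under stated assumptions: it uses no Hadoop classes, the class LocalWordCount and its methods are hypothetical names, and the two input lines are made-up samples. It follows the same three steps: emit (word, 1) per token, group the values by word, and sum each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory model of the word count job; it does not use Hadoop.
class LocalWordCount {

    // Map + shuffle: emit a 1 for every token and group the 1s by word.
    // A TreeMap keeps the keys sorted, as reducer input is sorted by key.
    static Map<String, List<Integer>> mapAndShuffle(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
                grouped.computeIfAbsent(st.nextToken(), k -> new ArrayList<>()).add(1);
            }
        }
        return grouped;
    }

    // Reduce: sum the grouped values of each word.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            counts.put(e.getKey(), sum);
        }
        return counts;
    }

    static Map<String, Integer> count(List<String> lines) {
        return reduce(mapAndShuffle(lines));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Asia is better", "build Asia well");
        // Print word and count separated by a tab, like TextOutputFormat's default.
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```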
