Reading the Hadoop Mapper and Reducer Classes, Hadoop (1)

jbt2008bg

The Hadoop Mapper class has four main methods: setup, map, cleanup, and run.

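public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

  // Called once at the beginning of the task.
  protected void setup(Context context) throws IOException, InterruptedException {
    // NOTHING
  }

  // Called once for each key/value pair in the input.
  // The default implementation is the identity function.
  @SuppressWarnings("unchecked")
  protected void map(KEYIN key, VALUEIN value,
                     Context context) throws IOException, InterruptedException {
    context.write((KEYOUT) key, (VALUEOUT) value);
  }

  // Called once at the end of the task.
  protected void cleanup(Context context) throws IOException, InterruptedException {
    // NOTHING
  }

  // Drives the task: setup once, map once per record, cleanup once.
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }
}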
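
To make the call order in run concrete, here is a minimal sketch that runs without Hadoop. SketchMapper, TokenizingMapper, and MapperLifecycleDemo are hypothetical stand-ins invented for this illustration, not Hadoop classes, and a plain List takes the place of the real Context. The sketch only assumes the same shape as the listing above: setup once, map once per record, cleanup once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Hadoop's Mapper, reduced to its four lifecycle methods.
// NOT the Hadoop API: the "context" here is just an output list.
abstract class SketchMapper<IN, OUT> {
    protected void setup(List<OUT> out) {}
    protected abstract void map(IN record, List<OUT> out);
    protected void cleanup(List<OUT> out) {}

    // Same shape as the run method above: setup once, map per record, cleanup once.
    public List<OUT> run(Iterator<IN> input) {
        List<OUT> out = new ArrayList<>();
        setup(out);
        while (input.hasNext()) {
            map(input.next(), out);
        }
        cleanup(out);
        return out;
    }
}

// Example subclass: emits every word, then a record count from cleanup.
class TokenizingMapper extends SketchMapper<String, String> {
    private int records;

    @Override
    protected void setup(List<String> out) {
        records = 0; // per-task initialisation happens here
    }

    @Override
    protected void map(String line, List<String> out) {
        records++;
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word);
            }
        }
    }

    @Override
    protected void cleanup(List<String> out) {
        out.add("records=" + records); // emitted once, after the last record
    }
}

class MapperLifecycleDemo {
    public static void main(String[] args) {
        List<String> out =
            new TokenizingMapper().run(Arrays.asList("a b", "c").iterator());
        System.out.println(out); // prints [a, b, c, records=2]
    }
}
```

In a real job the subclass would extend org.apache.hadoop.mapreduce.Mapper and write through the Context instead; the point here is only that state initialised in setup is visible to every map call and can be flushed in cleanup.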

The Hadoop Reducer class has three main methods, setup, reduce, and cleanup, plus a run method that calls them in order.

public class Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {

  /**
   * Called once at the start of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * This method is called once for each key. Most applications will define
   * their reduce class by overriding this method. The default implementation
   * is an identity function.
   */
  @SuppressWarnings("unchecked")
  protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
                        ) throws IOException, InterruptedException {
    for(VALUEIN value: values) {
      context.write((KEYOUT) key, (VALUEOUT) value);
    }
  }

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Controls how the reduce task works: calls setup once, reduce once
   * per key, and cleanup once at the end.
   */
  @SuppressWarnings("unchecked")
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
      reduce(context.getCurrentKey(), context.getValues(), context);
      // If a back up store is used, reset it
      ((ReduceContext.ValueIterator)
          (context.getValues().iterator())).resetBackupStore();
    }
    cleanup(context);
  }
}
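
The reducer lifecycle can be sketched the same way without Hadoop. SketchReducer, SumReducer, and ReducerLifecycleDemo are hypothetical names made up for this example, and a sorted map of key to value list stands in for the grouped input that the framework would normally supply through the Context. Under those assumptions, reduce is invoked once per key with all of that key's values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for Hadoop's Reducer. NOT the Hadoop API:
// grouped input is a plain Map and the "context" is an output list.
abstract class SketchReducer<K, V, OUT> {
    protected void setup(List<OUT> out) {}
    protected abstract void reduce(K key, Iterable<V> values, List<OUT> out);
    protected void cleanup(List<OUT> out) {}

    // Same shape as the run method above: setup once, reduce per key, cleanup once.
    public List<OUT> run(Map<K, List<V>> grouped) {
        List<OUT> out = new ArrayList<>();
        setup(out);
        for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
            reduce(entry.getKey(), entry.getValue(), out);
        }
        cleanup(out);
        return out;
    }
}

// Example subclass: sums the values of each key instead of passing them through.
class SumReducer extends SketchReducer<String, Integer, String> {
    @Override
    protected void reduce(String key, Iterable<Integer> values, List<String> out) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        out.add(key + "=" + sum);
    }
}

class ReducerLifecycleDemo {
    public static void main(String[] args) {
        // TreeMap iterates keys in sorted order, loosely mimicking sorted reducer input.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        grouped.put("b", Arrays.asList(1, 1));
        grouped.put("a", Arrays.asList(1));
        System.out.println(new SumReducer().run(grouped)); // prints [a=1, b=2]
    }
}
```

Overriding only reduce, as SumReducer does, is the common case; setup and cleanup stay as no-ops unless the task needs per-task state.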
