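<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.8.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.8.3</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.8.3</version>
</dependency>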
2. Write the Java code for the word count.
Main class WordCountMain.java:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountMain {

    public WordCountMain(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration, "word_count");
        job.setJarByClass(WordCountMain.class);
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);
        // Key/value types emitted by the mapper.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        // Key/value types of the final job output.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        // args[0] is the input path; args[1] is the output path, which must not exist yet.
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Submit the job and block until it finishes.
        System.out.println(job.waitForCompletion(true) ? "Job succeeded" : "Job failed");
    }

    public static void main(String[] args) {
        try {
            new WordCountMain(args);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
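Mapper class:
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    // Reused across calls to avoid allocating new objects for every word.
    private static final LongWritable ONE = new LongWritable(1);
    private final Text outKey = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the input line on runs of whitespace.
        String[] words = value.toString().trim().split("\\s+");
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // skip blank lines
            }
            // Emit the word as the key and a count of 1 as the value.
            outKey.set(word);
            context.write(outKey, ONE);
        }
    }
}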
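To make the data flow of the job easier to follow, here is a minimal sketch in plain Java that imitates the word count locally, without Hadoop. It is not part of the original project: the class name LocalWordCount and the methods mapLine and countWords are hypothetical, and the summing step assumes that the reducer adds up the counts for each word, which this section does not show.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free illustration of the word-count data flow.
public class LocalWordCount {

    // "Map" step: emit one (word, 1) pair per word in the line.
    static List<Map.Entry<String, Long>> mapLine(String line) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1L));
            }
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum the counts
    // (assumed reducer behavior). A TreeMap keeps the words in sorted order.
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> totals = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Long> pair : mapLine(line)) {
                totals.merge(pair.getKey(), pair.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world", "hello hadoop");
        System.out.println(countWords(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

In the real job, the grouping step is done by the framework between the map and reduce phases, so the mapper and reducer only ever see one record or one key at a time.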