Hadoop Single-Node Environment Setup and a First Example

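1, Install the JDK (1.8) and Hadoop (2.7.2)

1> Put the JDK and Hadoop archives in /opt/tool (a directory you create yourself)

2> Extract both archives with tar zxvf

3> Configure the JDK and Hadoop environment variables

vim /etc/profile

Add the following: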

export JAVA_HOME=/opt/tool/jdk1.8.0_101
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export HADOOP_HOME=/opt/tool/hadoop-2.7.2
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

source /etc/profile

Test: run java -version and hadoop (or hadoop version) to check that both are installed correctly
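The same check can be done from Java, which is also what the Hadoop scripts ultimately depend on. The sketch below is only an illustration: the class name EnvCheck and its methods are hypothetical, and it assumes the layout from step 3, where the launchers sit at bin/java and bin/hadoop under the two home directories.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reports problems with the variables set in /etc/profile.
final class EnvCheck {

    // Returns one message per problem found; an empty list means both look fine.
    static List<String> problems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();
        checkTool(env, "JAVA_HOME", "bin/java", problems);
        checkTool(env, "HADOOP_HOME", "bin/hadoop", problems);
        return problems;
    }

    // Checks that the variable is set and that the launcher exists below it.
    private static void checkTool(Map<String, String> env, String var,
                                  String relativePath, List<String> problems) {
        String home = env.get(var);
        if (home == null || home.isEmpty()) {
            problems.add(var + " is not set");
            return;
        }
        File tool = new File(home, relativePath);
        if (!tool.isFile()) {
            problems.add(var + "=" + home + " but " + tool + " was not found");
        }
    }

    public static void main(String[] args) {
        List<String> problems = problems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("JAVA_HOME and HADOOP_HOME look fine");
        } else {
            for (String problem : problems) {
                System.out.println(problem);
            }
        }
    }
}
```

Run it in a new shell after source /etc/profile; if it reports that a variable is not set, the profile was probably not reloaded.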

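4> Tell Hadoop where the JDK is

From the Hadoop directory, vim etc/hadoop/hadoop-env.sh and set export JAVA_HOME=/opt/tool/jdk1.8.0_101

5> Start Hadoop

Run $HADOOP_HOME/sbin/start-all.sh and follow the prompts, entering the password and answering yes where asked

Test: run jps and check that every Hadoop process has started.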
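As a rough aid for reading the jps output, the sketch below looks for the daemons one would usually expect on a pseudo-distributed Hadoop 2.x node. The daemon list (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) is an assumption about the setup, not something stated above, and the class JpsCheck and its sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds which expected daemons are absent from jps output.
final class JpsCheck {

    // Assumed daemon set for a pseudo-distributed Hadoop 2.x node.
    static final List<String> EXPECTED = Arrays.asList(
            "NameNode", "DataNode", "SecondaryNameNode",
            "ResourceManager", "NodeManager");

    // Each jps line has the form "<pid> <main class name>".
    static List<String> missingDaemons(String jpsOutput) {
        Set<String> running = new HashSet<>();
        for (String line : jpsOutput.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String daemon : EXPECTED) {
            if (!running.contains(daemon)) {
                missing.add(daemon);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample text for illustration only; paste real jps output here.
        String sample = "101 NameNode\n102 DataNode\n103 Jps";
        System.out.println("missing: " + missingDaemons(sample));
    }
}
```

If a daemon is missing, its log file under the logs directory of the Hadoop installation is usually the first place to look.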

6> This step is important:

vim /etc/sysconfig/network to check the configured hostname

Run hostname to check the current hostname

If the two differ, change it with hostname localhost.localdomain
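The hostname matters because Hadoop, like any Java program, asks the JVM to resolve the local host name, and a name that does not resolve typically surfaces as java.net.UnknownHostException. A minimal sketch for checking what the JVM sees follows; the class name HostnameCheck is hypothetical, and the check is an illustration rather than something Hadoop itself ships.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: shows how the JVM resolves the local host name.
final class HostnameCheck {

    // Returns "resolved <name> -> <address>" or "unresolved: <reason>".
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "resolved " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolved: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If the result starts with "unresolved", the name reported by hostname probably has no matching entry in /etc/hosts.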

2, Test demo

1> Write the Java code, that is, the MapReduce program

1>>

package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Emits (word, 1) for every whitespace-separated token of each input line.
public class WordMapper extends Mapper<Object,Text,Text,IntWritable>{
	private final static IntWritable one = new IntWritable(1);
	// Reused for every token so that no new Text is allocated per word.
	private Text word = new Text();
	@Override
	public void map(Object key,Text value,Context context) throws IOException,InterruptedException{
		StringTokenizer itr = new StringTokenizer(value.toString());
		while(itr.hasMoreTokens()){
			word.set(itr.nextToken());
			context.write(word, one);
		}
	}
}

2>>

package wordcount;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the counts received for one word and emits (word, total).
public class WordReducer extends Reducer<Text,IntWritable,Text,IntWritable>{
	private IntWritable result = new IntWritable();
	@Override
	public void reduce(Text key,Iterable<IntWritable> values,Context context) throws IOException,InterruptedException{
		int sum = 0;
		for(IntWritable val:values){
			sum += val.get();
		}
		result.set(sum);
		context.write(key, result);
	}
}

3>>

package wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

// Driver: configures the word count job and submits it.
public class WordMain {
	public static void main(String[] args) throws Exception{
		Configuration conf = new Configuration();
		// Strip generic Hadoop options (-D, -fs, ...) and keep the job's own arguments.
		String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
		// Echo the arguments, which helps when debugging the command line.
		for(int i=0;i<otherArgs.length;i++){
			System.out.println(otherArgs[i]);
		}
		// Fail early with a usage message instead of an ArrayIndexOutOfBoundsException.
		if(otherArgs.length != 2){
			System.err.println("Usage: wordcount <in> <out>");
			System.exit(2);
		}
		// Job.getInstance replaces the deprecated Job constructor.
		Job job = Job.getInstance(conf,"word count");
		job.setJarByClass(WordMain.class);
		job.setMapperClass(WordMapper.class);
		// The reducer doubles as a combiner because summing is associative.
		job.setCombinerClass(WordReducer.class);
		job.setReducerClass(WordReducer.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(IntWritable.class);
		FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
		FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
		System.exit(job.waitForCompletion(true) ? 0:1);
	}
}
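To see what WordMapper and WordReducer compute together without starting a cluster, the sketch below reproduces the same logic in plain Java with no Hadoop dependency. It is an approximation under stated assumptions: the class LocalWordCount and the sample lines are hypothetical, and a sorted map stands in for the shuffle that groups equal keys before the reduce phase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the job: tokenizes like the mapper, sums like the reducer.
final class LocalWordCount {

    static Map<String, Integer> count(List<String> lines) {
        // TreeMap keeps the words sorted, similar to the sorted keys a reducer sees.
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            // Same whitespace tokenization as WordMapper.
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                // Adding 1 per token is the map output; merging is the reduce step.
                totals.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = count(Arrays.asList("hello hadoop", "hello world"));
        // Print word and count separated by a tab.
        for (Map.Entry<String, Integer> entry : totals.entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
        // prints:
        // hadoop	1
        // hello	2
        // world	1
    }
}
```

Comparing this output with the result of the real job on the same small file is a quick way to confirm that the cluster run behaved as expected.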

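2> Package the classes as a jar file

3> Copy the jar to the Hadoop node

4> Create a directory in HDFS and upload the file to be counted into it

hadoop fs -mkdir -p /opt/tool/input
hadoop fs -put /opt/tool/file.txt /opt/tool/input/

Test: check that the upload succeeded with hadoop fs -ls /opt/tool/input/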

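5> Run the analysis job:

hadoop jar wordcount.jar wordcount.WordMain /opt/tool/input/file.txt /opt/tool/output

Pay attention to the path of the jar being executed: the path can be omitted only when the jar is in the current directory, otherwise give its full path. The output directory /opt/tool/output must not exist before the job runs, or the job fails at submission.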



