Configuring Hadoop on Ubuntu

Environment: Ubuntu 12.10, Hadoop 1.1.1, JDK 1.7

Preparation:

    1. Download Ubuntu 12.10: http://www.ubuntu.com/download

    2. Download the JDK: http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

    3. Download Hadoop: http://apache.dataguru.cn/hadoop/common/hadoop-1.1.1/

Detailed steps:

    1. Install and configure the JDK

        1) Create a work folder under /home/hotye/ (note: hotye is your own username)

        2) Open a terminal with Ctrl+Alt+T and copy the downloaded JDK into /home/hotye/work

        3) Unpack the JDK

              #sudo tar -xzvf jdk-7u17-linux-i586.tar.gz

        4) Rename the JDK directory (the 7u17 tarball unpacks to jdk1.7.0_17)

              #sudo mv jdk1.7.0_17 jdk1.7

        5) Configure the JDK environment variables

              #sudo vi /etc/profile

              Add the following lines:

                      export JAVA_HOME=/home/hotye/work/jdk1.7
                      export PATH=$JAVA_HOME/bin:$PATH

                After saving, reload the profile and verify the configuration.

                 #source /etc/profile

                 #java -version

                      The version shown should be the JDK 1.7 you just installed.

                 #type java

                        If the path shown is your unpacked JDK directory, the installation succeeded.
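                If either check still shows a different Java, a quick sanity check (assuming the profile values above) is:

                 #echo $JAVA_HOME

                 #which java

                The first should print /home/hotye/work/jdk1.7 and the second /home/hotye/work/jdk1.7/bin/java.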

    2. Install and configure Hadoop

        1) Copy the downloaded Hadoop 1.1.1 into the /home/hotye/work folder

        2) Unpack it under /home/hotye/work; #sudo tar -xzvf hadoop-1.1.1.tar.gz

        3) Rename the unpacked folder to hadoop; #sudo mv hadoop-1.1.1 hadoop

        4) Enter the conf folder inside the hadoop folder; #cd /home/hotye/work/hadoop/conf
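        While you are in conf, also set JAVA_HOME in hadoop-env.sh: Hadoop 1.x reads it from this file, and the start scripts will complain that JAVA_HOME is not set if the commented-out line is left as is. The path below assumes the JDK location from step 1:

            #sudo vi hadoop-env.sh

            export JAVA_HOME=/home/hotye/work/jdk1.7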

        5) Edit core-site.xml as follows (fs.default.name tells clients where the NameNode listens):

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost:9000</value>
	</property>
</configuration>

        6) Edit hdfs-site.xml as follows (replication is set to 1 for a single-node setup):

<configuration>
	<property>
		<name>dfs.replication</name>
		<value>1</value>
	</property>
</configuration>


        7) Edit mapred-site.xml as follows (the JobTracker needs its own port; it must not reuse port 9000, which HDFS already uses in core-site.xml):

<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>localhost:9001</value>
	</property>
</configuration>

        8) Format a new distributed file system

            #cd /home/hotye/work/hadoop

            #sudo bin/hadoop namenode -format

        9) Start Hadoop (the start/stop scripts log in over SSH, so if you are prompted for a password, complete step 3 below first)

            #sudo bin/start-all.sh

        10) Check whether Hadoop started successfully

            http://localhost:50030     JobTracker (MapReduce administration) page

            http://localhost:50070     NameNode (HDFS status) page

            http://localhost:50060     TaskTracker page

           If all three pages open, Hadoop is configured correctly.
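           Another quick check is the jps tool shipped with the JDK; on a working single-node Hadoop 1.x installation it lists one Java process per daemon (since the daemons were started with sudo, run jps as root too; the path assumes the JDK location from step 1):

            #sudo /home/hotye/work/jdk1.7/bin/jps

           You should see NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, each preceded by its process id, plus Jps itself.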

        11) Stop Hadoop

            #sudo bin/stop-all.sh

    3. Configure SSH

        1) Install SSH; #sudo apt-get install ssh

        2) Create a new SSH key with an empty passphrase to enable passwordless login

            #ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

            #cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

        3) Test with the following command

            #ssh localhost

            If it works, you will be logged in without being asked for a password.

    4. Run an example

        1) Create a workspace:

              #cd /home/hotye/work

              #mkdir workspace

              #cd workspace

              #mkdir wordcount

              #cd wordcount

              #mkdir wordcount_classes

              #mkdir input

        2) Create a WordCount.java file in the wordcount folder

             #vi WordCount.java


package org.myorg;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emits (word, 1) for every token in each input line.
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);  // the reducer doubles as a combiner for local aggregation
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

        3) Compile WordCount.java (Hadoop 1.1.1 ships its core classes as hadoop-core-1.1.1.jar in the installation root)

            #javac -classpath /home/hotye/work/hadoop/hadoop-core-1.1.1.jar -d wordcount_classes WordCount.java

        4) Package the compiled class files

            #jar -cvf /home/hotye/work/workspace/wordcount/wordcount.jar -C wordcount_classes/ .
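            To double-check the packaging before running the job, list the jar's contents:

            #jar -tf /home/hotye/work/workspace/wordcount/wordcount.jar

            It should list org/myorg/WordCount.class, org/myorg/WordCount$Map.class and org/myorg/WordCount$Reduce.class.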

        5) Create an example.txt in the input folder

             #cd /home/hotye/work/workspace/wordcount/input

             #vi example.txt

            Enter the following content:

             hello world hello bye hello world bye

             good hello world good

        6) Go back to the hadoop folder and copy the example.txt you just created into HDFS

            #cd /home/hotye/work/hadoop

            #sudo bin/hadoop dfs -put /home/hotye/work/workspace/wordcount/input input
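            You can confirm the upload with a directory listing:

            #sudo bin/hadoop dfs -ls input

            The listing should include input/example.txt.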

        7) Run the WordCount program

            #sudo bin/hadoop jar /home/hotye/work/workspace/wordcount/wordcount.jar org.myorg.WordCount input output

        8) View the results (output is the relative path given as the second argument above)

             #sudo bin/hadoop dfs -cat output/part-00000

             For the example.txt above this prints:

              bye    2
              good   2
              hello  4
              world  3

