Hadoop Study Notes (1): Installing Hadoop on Ubuntu

1. Install SSH

$ sudo apt-get install openssh-client
$ sudo apt-get install openssh-server

2. Check the value of JAVA_HOME

$ echo $JAVA_HOME
/opt/jdk1.8.0_91
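Before touching hadoop-env.sh it is worth failing fast on a bad Java path. A minimal sketch (check_java_home is a hypothetical helper name, not part of Hadoop):

```shell
# Hypothetical helper: complain unless JAVA_HOME points at a
# directory that actually contains an executable bin/java.
check_java_home() {
  if [ -z "$JAVA_HOME" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no executable java under $JAVA_HOME"
    return 1
  fi
  echo "JAVA_HOME ok: $JAVA_HOME"
}
```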

3. Install hadoop-2.7.2
Download it from a mirror (http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.2/) and extract it to hadoop-2.7.2.

4. Edit etc/hadoop/hadoop-env.sh in hadoop-2.7.2 and set JAVA_HOME

export JAVA_HOME=/opt/jdk1.8.0_91

Run the following command; if it prints Hadoop's usage message, the configuration is working:

$ bin/hadoop

Hadoop supports three modes: standalone, pseudo-distributed, and fully distributed. This post covers the first two.

5. Standalone Operation
Start the SSH service:

$ sudo /etc/init.d/ssh start

Passwordless login:

# Generate a key pair on the client:
$ ssh-keygen -t rsa
# On the server, append the public key (run inside ~/.ssh):
$ cat id_rsa.pub >> authorized_keys
$ chmod 600 authorized_keys
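If you set machines up repeatedly, the key steps above can be wrapped into an idempotent function. A sketch assuming the standard ~/.ssh layout (setup_passwordless_ssh is a hypothetical name; the directory is parameterized for testing):

```shell
# Hypothetical helper: create a key only if none exists yet, then
# append the public key to authorized_keys with strict permissions.
setup_passwordless_ssh() {
  dir=${1:-$HOME/.ssh}
  mkdir -p "$dir" && chmod 700 "$dir"
  [ -f "$dir/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$dir/id_rsa" -q
  cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"
  chmod 600 "$dir/authorized_keys"
}
```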

Test: the following example copies the unpacked conf directory to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.

$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
$ cat output/*
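One gotcha when repeating the test: Hadoop refuses to write into an existing output directory, so a second run fails until the previous results are cleared. In standalone mode the output lives on the local filesystem:

```shell
# Remove the previous run's results before re-running the grep example;
# Hadoop will not overwrite an existing output directory.
rm -rf output
```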

6. Pseudo-Distributed Operation
Edit two configuration files.
etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
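Both files use the same name/value layout, so a quick way to sanity-check an edit is to read a property back with a small sed pipeline. This is a sketch (get_hadoop_prop is a hypothetical helper) that assumes the formatting shown above, with `<value>` on the line after its `<name>`:

```shell
# Hypothetical helper: print the value of a named property from a
# Hadoop *-site.xml file, relying on the one-tag-per-line layout.
get_hadoop_prop() {
  sed -n "/<name>$2<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$1"
}
# e.g. get_hadoop_prop etc/hadoop/core-site.xml fs.defaultFS
```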

Log in over SSH (it should no longer prompt for a password):

$ ssh localhost

Then run:

# Format the filesystem:
$ bin/hdfs namenode -format

# Start the NameNode and DataNode daemons:
$ sbin/start-dfs.sh

Browse the web interface for the NameNode; by default it is available at http://localhost:50070/

# Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
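The <username> placeholder is just your Linux login, so command substitution can fill it in, and -p collapses the two mkdir calls into one. A sketch (hdfs_user_dir is a hypothetical wrapper, not an HDFS command):

```shell
# Hypothetical wrapper: build the per-user HDFS home path from the
# current Linux user; use it as: bin/hdfs dfs -mkdir -p "$(hdfs_user_dir)"
hdfs_user_dir() {
  printf '/user/%s\n' "$(whoami)"
}
```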

# Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input

# Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'

# Examine the output: copy the files from the distributed filesystem to the local filesystem and inspect them:
$ bin/hdfs dfs -get output output
$ cat output/*
# or view them directly on the distributed filesystem:
$ bin/hdfs dfs -cat output/*

# When you're done, stop the daemons:
$ sbin/stop-dfs.sh