Hadoop Installation and Configuration

The overall Hadoop framework

Hadoop is composed of HDFS, MapReduce, HBase, Hive, ZooKeeper, and other members. Its most fundamental and important elements are HDFS (Hadoop Distributed File System), the underlying file system that stores the files of every storage node in the cluster, and the MapReduce engine that executes MapReduce programs on top of it.

  1. Pig is a Hadoop-based platform for large-scale data analysis; it provides a simple operating and programming interface for complex parallel computation over massive data;
  2. Hive is a Hadoop-based tool that provides a complete SQL query interface and translates SQL statements into MapReduce jobs for execution;
  3. ZooKeeper: an efficient, scalable coordination system that stores and coordinates critical shared state;
  4. HBase is an open-source distributed database built on a column-oriented storage model;
  5. HDFS is a distributed file system with high fault tolerance, suited to applications with very large data sets;
  6. MapReduce is a programming model for parallel computation over large data sets (larger than 1 TB); see the shell sketch below.
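As a loose shell analogy for the map/shuffle/reduce flow in item 6 (this illustrates only the data flow, not how Hadoop actually executes jobs; the input/ path is hypothetical):

cat input/* | tr -s ' ' '\n' | sort | uniq -c
## map: split each line into words; shuffle: sort brings identical words together; reduce: uniq -c counts each group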
The following figure shows the deployment structure of a typical Hadoop cluster:

[Figure: deployment structure of a typical Hadoop cluster]

Next, the dependency and coexistence relationships among the Hadoop components:

[Figure: dependency and coexistence relationships among Hadoop components]

Installation:

JDK installation:

[root@server1 ~]#  useradd -u 800 hadoop
[root@server1 ~]# id hadoop
uid=800(hadoop) gid=800(hadoop) groups=800(hadoop)
[root@server1 ~]# 
[root@server1 hadoop]# su - hadoop
[hadoop@server1 ~]$ ls
hadoop-2.7.3.tar.gz  jdk-7u79-linux-x64.tar.gz
[hadoop@server1 ~]$ tar -zxf jdk-7u79-linux-x64.tar.gz 
[hadoop@server1 ~]$ tar -zxf hadoop-2.7.3.tar.gz 
[hadoop@server1 ~]$ ln -s jdk1.7.0_79/ java
[hadoop@server1 ~]$ ln -s hadoop-2.7.3 hadoop
[hadoop@server1 ~]$ ll
total 359004
lrwxrwxrwx 1 hadoop hadoop        12 Aug 26 07:47 hadoop -> hadoop-2.7.3
drwxr-xr-x 9 hadoop hadoop      4096 Aug 18  2016 hadoop-2.7.3
-rwxr-xr-x 1 root   root   214092195 Aug 25 23:26 hadoop-2.7.3.tar.gz
lrwxrwxrwx 1 hadoop hadoop        12 Aug 26 07:47 java -> jdk1.7.0_79/
drwxr-xr-x 8 hadoop hadoop      4096 Apr 11  2015 jdk1.7.0_79
-rwxr-xr-x 1 root   root   153512879 Aug 25 23:26 jdk-7u79-linux-x64.tar.gz
[hadoop@server1 ~]$ 
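Before pointing Hadoop at this JDK, a quick sanity check (assumed here, not part of the original session) confirms that the unpacked Java actually runs:

[hadoop@server1 ~]$ ~/java/bin/java -version
## should report java version "1.7.0_79"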
Edit the configuration:
[hadoop@server1 ~]$ cd hadoop/etc/hadoop/
[hadoop@server1 hadoop]$ vim hadoop-env.sh

---------
export JAVA_HOME=/home/hadoop/java
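With JAVA_HOME set, asking Hadoop for its version is a cheap smoke test (an assumed step, not from the original session); it fails immediately if JAVA_HOME is wrong:

[hadoop@server1 hadoop]$ cd ~/hadoop
[hadoop@server1 hadoop]$ bin/hadoop version
## should print Hadoop 2.7.3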
Test:
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server1 hadoop]$ mkdir input
[hadoop@server1 hadoop]$ cp etc/hadoop/* input/
[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input  output 'dfs[a-z.]+'
-----
## Run the example grep job: extract strings matching the regex 'dfs[a-z.]+' from the files under input/ and write the matches to output/
-----
[hadoop@server1 hadoop]$ ls output/
part-r-00000  _SUCCESS
[hadoop@server1 hadoop]$ 
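To see the matches themselves rather than just the file listing (an extra check, not in the original run), print the result file; each line is a count and a matched string, tab-separated:

[hadoop@server1 hadoop]$ cat output/part-r-00000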
Pseudo-distributed Hadoop configuration
Configure Hadoop
[hadoop@server1 hadoop]$ cd etc/hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml 
------------
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.25.5.1:9000</value>
    </property>
</configuration>
-------------
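fs.defaultFS makes hdfs://172.25.5.1:9000 the default file system, so once DFS is running, paths in dfs commands resolve against it; for illustration, these two commands are equivalent:

[hadoop@server1 hadoop]$ bin/hdfs dfs -ls /
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls hdfs://172.25.5.1:9000/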
[hadoop@server1 hadoop]$ vim slaves 
------------
172.25.5.1
-------------
[hadoop@server1 hadoop]$ vim hdfs-site.xml 
--------------
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
--------------
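dfs.replication is set to 1 because this pseudo-distributed setup has only a single DataNode. The effective value can be read back with getconf (illustrative):

[hadoop@server1 hadoop]$ bin/hdfs getconf -confKey dfs.replication
## should print 1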
Passwordless SSH
[hadoop@server1 hadoop]$ ssh-keygen 
-------------
-------------
[hadoop@server1 hadoop]$ cd ~/.ssh/
[hadoop@server1 .ssh]$ cp id_rsa.pub authorized_keys
[hadoop@server1 .ssh]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 6e:e3:30:10:e9:ec:a3:ab:f7:92:ce:3b:73:c1:b8:3f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
[hadoop@server1 ~]$ logout
Connection to localhost closed.
[hadoop@server1 .ssh]$ ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
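If ssh still asks for a password at this point, the usual culprit is permissions, since sshd ignores keys in world-readable locations; a common fix (shown here as a hint, not part of the original session):

[hadoop@server1 .ssh]$ chmod 700 ~/.ssh
[hadoop@server1 .ssh]$ chmod 600 ~/.ssh/authorized_keys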
---- Verify
--------
[hadoop@server1 hadoop]$ ssh 0.0.0.0
Last login: Sun Aug 26 08:05:04 2018 from localhost
[hadoop@server1 ~]$ logout
Connection to 0.0.0.0 closed.
[hadoop@server1 hadoop]$ ssh 172.25.5.1
Last login: Sun Aug 26 08:16:11 2018 from localhost
[hadoop@server1 ~]$ logout
Connection to 172.25.5.1 closed.
[hadoop@server1 hadoop]$ ssh server1
Last login: Sun Aug 26 08:16:20 2018 from server1
[hadoop@server1 ~]$ logout
Connection to server1 closed.
[hadoop@server1 ~]$ 

Format HDFS
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server1 hadoop]$ bin/hdfs namenode -format
18/08/26 08:10:08 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = server1/172.25.5.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.3
-------
-------
[hadoop@server1 hadoop]$ ls /tmp/
hadoop-hadoop  hsperfdata_hadoop  yum.log
[hadoop@server1 hadoop]$ 
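/tmp/hadoop-hadoop appears because hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, and the NameNode metadata is formatted under ${hadoop.tmp.dir}/dfs/name. To peek at it (illustrative):

[hadoop@server1 hadoop]$ ls /tmp/hadoop-hadoop/dfs/name/current
## expect files like fsimage_0000000000000000000, seen_txid, VERSION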
Start DFS
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Starting namenodes on [server1]
server1: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-namenode-server1.out
172.25.5.1: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-datanode-server1.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-hadoop-secondarynamenode-server1.out
[hadoop@server1 hadoop]$ 
Configure environment variables
[hadoop@server1 hadoop]$ cd ~
[hadoop@server1 ~]$ vim .bash_profile 
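The post does not show the edit itself; a minimal sketch of what to append to .bash_profile, assuming the ~/java and ~/hadoop symlinks created earlier (this is what makes jps and the hadoop scripts resolvable below):

---------
export JAVA_HOME=/home/hadoop/java
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
---------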
[hadoop@server1 ~]$ source .bash_profile 
[hadoop@server1 ~]$ jps
2657 NameNode
2766 DataNode
3059 Jps
2945 SecondaryNameNode
[hadoop@server1 ~]$ 


Access the NameNode web UI in a browser (http://172.25.5.1:50070 by default in Hadoop 2.x):

[Screenshot: NameNode web UI]

[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir  /user
[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir  /user/hadoop
[hadoop@server1 hadoop]$ mkdir input
mkdir: cannot create directory `input': File exists
[hadoop@server1 hadoop]$ cp etc/hadoop/* input/
[hadoop@server1 hadoop]$  bin/hdfs dfs -put input/
-----------
-----------
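Because -put was given no target, input/ lands in the HDFS home directory /user/hadoop created above; this can be confirmed with (illustrative):

[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
## should show /user/hadoop/input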
[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar  wordcount input output   
## Count the occurrences of each word in the HDFS input directory and write the results to the HDFS output directory
-----------
-----------
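The job output lives in HDFS, not in the local output/ directory from the earlier standalone test; to read the counts directly, or pull them to local disk (illustrative):

[hadoop@server1 hadoop]$ bin/hdfs dfs -cat output/part-r-00000
[hadoop@server1 hadoop]$ bin/hdfs dfs -get output local-output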
Browsing the web UI again now shows the new content:

[Screenshot: the output visible in the web UI]
