Three Linux virtual machines:
192.169.200.101 cpyftest-1 (hostname or domain name) Namenode (alias)
192.169.200.161 cpyftest-2 (hostname or domain name) Datanode1 (alias)
192.169.200.191 cpyftest-3 (hostname or domain name) Datanode2 (alias)
Known-working Hadoop build:
Hadoop 1.0.2 with its core jar replaced (verified to run), bundled with HBase 0.94.0
>1. Configure /etc/hosts (the same entries on every machine):
192.169.200.101 cpyftest-1 Namenode
192.169.200.161 cpyftest-2 Datanode1
192.169.200.191 cpyftest-3 Datanode2
>2. Set up accounts:
Use the same group name, username, and password on all three VMs.
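For example, a minimal sketch of creating a matching account on each VM (the user and group name "hadoop" below are assumptions; any names work as long as they are identical on all three machines):
# run as root on each of the three VMs
groupadd hadoop              # create the shared group
useradd -m -g hadoop hadoop  # create the user in that group, with a home directory
passwd hadoop                # set the same password everywhere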
>3. Set up passwordless SSH login between the three VMs
Run on each of the three VMs: ssh-keygen -t rsa
Append the three generated id_rsa.pub files into a single authorized_keys file, and give every VM a copy of that file.
Reference command: cat id_rsa.pub >> authorized_keys (appends the key contents)
On 101, run: scp authorized_keys cpyftest-2:~/.ssh/ (if that fails, just open the file and copy-paste the contents)
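Putting the whole step together, a sketch of the commands run from cpyftest-1 (assuming the key pairs were generated with an empty passphrase and ~/.ssh exists on every machine):
# on cpyftest-1, collect all three public keys into one file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh cpyftest-3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# distribute the combined file back to the other two VMs
scp ~/.ssh/authorized_keys cpyftest-2:~/.ssh/
scp ~/.ssh/authorized_keys cpyftest-3:~/.ssh/
# verify: this should print the remote hostname without prompting for a password
ssh cpyftest-2 hostname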
>4. Configure the Hadoop files
hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_07
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.169.200.101:9000</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value> <!-- keep 2 replicas of each block -->
</property>
<property>
<name>dfs.data.dir</name>
<value>/root/hadoop/data</value>
</property>
<property>
<name>dfs.web.ugi</name>
<value>username,group</value> <!-- fill in your username and group; grants web UI access, otherwise you get: no id webuser -->
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.supergroup</name>
<value>supergroup</value>
</property>
</configuration>
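The dfs.data.dir path above has to be writable on each datanode; creating it up front avoids permission surprises (a sketch using the path from the config above):
# run on each datanode (161 and 191)
mkdir -p /root/hadoop/data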
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.169.200.101:9001</value>
</property>
</configuration>
>5. Configure masters and slaves
In the hadoop/conf folder, find the two files "masters" and "slaves".
masters contains one line:
192.169.200.101 or cpyftest-1 or Namenode (pick any one of the three forms)
slaves lists the other two machines:
Datanode1
Datanode2
>6. Copy the configured Hadoop directory to every machine, keeping the same path, as in the sketch below.
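A sketch of the copy (the installation path /root/hadoop-1.0.2 is an assumption; use your actual path, identical on every machine):
# run on 101 after editing the conf files
scp -r /root/hadoop-1.0.2 cpyftest-2:/root/
scp -r /root/hadoop-1.0.2 cpyftest-3:/root/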
>7. On 101, enter the hadoop/bin directory and run: ./start-all.sh (before the very first start, format HDFS once with: ./hadoop namenode -format)
After it finishes, run: jps to list the Java processes (jps ships with the JDK; if the command is not found, check that your Java environment variables are set, or run it directly from the java/bin directory):
28037 NameNode           (NameNode process; 28037 is the process ID)
28950 Jps
28220 SecondaryNameNode  (SecondaryNameNode process; 28220 is the process ID)
28259 JobTracker         (JobTracker process; 28259 is the process ID)
>8. On 161 and 191, run jps
Each should show the following processes:
DataNode     (HDFS data node)
Jps          (the jps tool itself)
TaskTracker  (MapReduce task tracker)
At this point Hadoop is up and running.
Open http://namenode:50070 to confirm it is healthy, and check the logs for errors.
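A quick HDFS smoke test from the command line (the file and directory names are just examples):
# run on 101 from the hadoop/bin directory
echo hello > /tmp/test.txt
./hadoop fs -mkdir /test
./hadoop fs -put /tmp/test.txt /test/
./hadoop fs -ls /test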
>9. Configure HBase. This build also has its Hadoop core jar replaced and is verified to run. Like Hadoop, it must be copied to the other two machines at the same path, and the clocks on all machines must agree (a time-sync sketch follows).
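Clock sync matters because HBase region servers will refuse to join the cluster if their time drifts too far from the master's. A sketch using ntpdate (the server pool.ntp.org is an assumption; any reachable NTP server works):
# run as root on each of the three VMs
ntpdate pool.ntp.org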
hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_07
The jar entries in the default file are wrong; set the classpath as follows:
# Extra Java CLASSPATH elements. Optional.
export HBASE_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.94.0.jar:$HBASE_HOME/hbase-0.94.0-tests.jar:$HBASE_HOME/conf:${HBASE_HOME}/lib/zookeeper-3.4.3.jar
export HBASE_MANAGES_ZK=true   # true for a fully distributed cluster using the bundled ZooKeeper
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://192.169.200.101:9000/hbase</value>
<description>The directory shared by region servers and into
which HBase persists. The URL should be 'fully-qualified'
to include the filesystem scheme. For example, to specify the
HDFS directory '/hbase' where the HDFS instance's namenode is
running at namenode.example.org on port 9000, set this value to:
hdfs://namenode.example.org:9000/hbase. By default HBase writes
into /tmp. Change this configuration else all data will be lost
on machine restart.
</description>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>hdfs://192.169.200.101:9000/tmp</value>
<description>Temporary directory on the local filesystem.
Change this setting to point to a location more permanent
than '/tmp' (The '/tmp' directory is often cleared on
machine restart).
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false for standalone mode and true for distributed mode. If
false, startup will run all HBase and ZooKeeper daemons together
in the one JVM.
</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>192.169.200.101,192.169.200.161,192.169.200.191</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
</configuration>
Configure regionservers with:
Namenode
Datanode1
Datanode2
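As with Hadoop, copy the configured HBase directory to the other two machines at the same path (the installation path /root/hbase-0.94.0 is an assumption):
# run on 101 after editing the conf files
scp -r /root/hbase-0.94.0 cpyftest-2:/root/
scp -r /root/hbase-0.94.0 cpyftest-3:/root/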
>10. Enter the hbase/bin directory
Run: ./start-hbase.sh
Run: ./hbase shell
In the shell, run "list"; if it completes without errors, HBase is working.
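A slightly fuller smoke test in the shell (the table name 'test' and column family 'cf' are just examples):
create 'test', 'cf'                    # create a table with one column family
put 'test', 'row1', 'cf:a', 'value1'   # write one cell
scan 'test'                            # should print row1 back
disable 'test'
drop 'test'                            # clean up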
Check the Master:       http://192.169.200.101:60010/master.jsp
Check a Region Server:  http://192.169.200.101:60030/regionserver.jsp
Check the ZK tree:      http://192.169.200.101:60010/zk.jsp