HBase Study Notes 1

Introduction to HBase

Official site: Apache HBase – Apache HBase™ Home

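NoSQL should not simply be described as "non-relational database"; the term is only used loosely to refer to non-relational data stores.

NoSQL stands for "not only SQL".

More on NoSQL

NoSQL covers more than databases, so it cannot be equated with "non-relational database".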

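--HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed database with real-time reads and writes

--It uses Hadoop HDFS as its file storage system, Hadoop MapReduce to process the massive data held in HBase, and ZooKeeper as its distributed coordination service

--It is mainly used to store unstructured and semi-structured, loosely organized data (a column-store NoSQL database)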

HBase cells have no data type; every value is stored as a byte array
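Since cells are untyped, the client decides how a value becomes bytes and how to read it back. The following is a minimal Python sketch of that idea, not HBase client code; the helper names encode_long and decode_long are hypothetical, and the 8-byte big-endian layout is an assumption chosen for illustration.

```python
import struct

def encode_long(n: int) -> bytes:
    """Encode an integer as 8 big-endian bytes (illustrative layout)."""
    return struct.pack(">q", n)

def decode_long(raw: bytes) -> int:
    """Decode 8 big-endian bytes back into an integer."""
    return struct.unpack(">q", raw)[0]

# The same logical value 12 can be stored in different byte forms;
# the store itself cannot tell them apart, only the reader knows.
name_cell = "zhangsan".encode("utf-8")
age_as_text = "12".encode("utf-8")
age_as_long = encode_long(12)
```

A reader that expects text but receives the 8-byte form will see garbage, which is why the writer and reader must agree on the encoding.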

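 HBase data model

One row record, a timestamp (generated automatically), and 3 column families (the smallest unit of control)

The row key identifies one row; CF1 to CF3 are column families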

ROW KEY

 

 HBase data model

 Timestamp

 Cell
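The model above (row key, column family, timestamp, cell) can be pictured as a nested map. This is a toy in-memory sketch under that assumption, not how HBase stores data internally; the class ToyTable and the second value "lisi" are made up for illustration, and unlike a table with VERSIONS => '1' the toy keeps every version.

```python
class ToyTable:
    """Toy model: row key -> column -> {timestamp: value}."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        self.rows = {}

    def put(self, row, column, value, ts):
        # A column is written as b"family:qualifier".
        family = column.split(b":", 1)[0]
        if family not in self.families:
            raise KeyError("unknown column family: %r" % family)
        versions = self.rows.setdefault(row, {}).setdefault(column, {})
        versions[ts] = value

    def get(self, row, column):
        # Return the value with the newest timestamp, or None.
        versions = self.rows.get(row, {}).get(column, {})
        if not versions:
            return None
        return versions[max(versions)]

table = ToyTable([b"cf"])
table.put(b"1", b"cf:name", b"zhangsan", ts=1655049403593)
table.put(b"1", b"cf:name", b"lisi", ts=1655049403600)
```

Reading b"cf:name" for row b"1" returns the value with the larger timestamp, which mirrors how a cell is addressed by row key, column, and timestamp.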

 HBase architecture diagram

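Client: the client

ZooKeeper: information is submitted to ZooKeeper (zk)

HMaster: the HBase master node; active/standby (high availability: two machines or servers doing the same job) or master/slave (one master node with many slave nodes)

HRegionServer: the HBase worker (slave) node

HLog: the WAL (write-ahead log)

HRegion: a region, i.e. a contiguous range of rows of one table (a table is split into one or more regions, so a region is a slice of a table rather than a whole table)

Store: corresponds to one column family within a region

MemStore: the in-memory write buffer

StoreFile: the file on disk

HFile: a StoreFile is essentially a wrapper around an HFile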

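On its first access the Client contacts ZooKeeper to look up the table, and obtains from ZooKeeper the address where the metadata is stored. Writes accumulate in the MemStore; once the MemStore reaches its flush threshold (64 MB in these notes; hbase.hregion.memstore.flush.size defaults to 128 MB in recent releases) it is flushed to disk as a StoreFile. If you flush manually and too many small files pile up, HBase merges them automatically (compaction).

A write goes to the HLog first, so that data can be recovered from the HLog after a crash. HLog entries are themselves written to memory first and synced to disk every second.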
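The write path described above (log first, then memory, then flush to a file once a threshold is reached) can be sketched as a toy Python class. This is a simplified model under stated assumptions, not HBase's implementation: ToyStore and its method names are hypothetical, sizes are counted naively, and the log is simply cleared after a flush.

```python
class ToyStore:
    """Toy write path: WAL -> MemStore -> flushed store files."""

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.wal = []            # entries not yet covered by a flush
        self.memstore = {}       # in-memory buffer
        self.memstore_bytes = 0
        self.store_files = []    # each flush produces one sorted "file"

    def put(self, key, value):
        self.wal.append((key, value))   # 1. write-ahead log first
        self._apply(key, value)         # 2. then the in-memory buffer
        if self.memstore_bytes >= self.flush_threshold_bytes:
            self.flush()                # 3. flush when the threshold is hit

    def _apply(self, key, value):
        self.memstore[key] = value
        self.memstore_bytes += len(key) + len(value)  # naive size estimate

    def flush(self):
        if self.memstore:
            self.store_files.append(dict(sorted(self.memstore.items())))
        self.memstore = {}
        self.memstore_bytes = 0
        self.wal = []   # flushed data no longer needs the log

    def crash_and_recover(self):
        # Memory is lost on a crash; replay the log to rebuild it.
        self.memstore = {}
        self.memstore_bytes = 0
        for key, value in self.wal:
            self._apply(key, value)
```

With a threshold of 10 bytes, two 6-byte puts trigger one flush; a later put that has not been flushed survives crash_and_recover because it is still in the log.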

 

 

  

 

 

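 HBase standalone installation and configuration

  1. Choose a download site from this list of Apache Download Mirrors. Click the suggested top link. This takes you to a mirror of HBase releases. Click the folder named stable, then download the binary file that ends in .tar.gz to your local filesystem. Do not download the file ending in src.tar.gz for now.

  2. Extract the downloaded file, then change to the newly created directory.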

    $ tar xzvf hbase-2.0.6-bin.tar.gz
    $ cd hbase-2.0.6/
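  3. You must set the JAVA_HOME environment variable before starting HBase. You can set it through your operating system's usual mechanism, but HBase also provides a central mechanism, conf/hbase-env.sh. Edit this file, uncomment the line starting with JAVA_HOME, and set it to the appropriate location for your operating system. JAVA_HOME should be set to a directory that contains the executable file bin/java. Most modern Linux operating systems provide a mechanism, such as /usr/bin/alternatives on RHEL or CentOS, for transparently switching between versions of executables such as Java. In that case you can set JAVA_HOME to the directory containing the symbolic link to bin/java, which is usually /usr.

    JAVA_HOME=/usr

  4. Edit conf/hbase-site.xml, which is the main HBase configuration file. At this point you need to specify the directories on the local filesystem where HBase and ZooKeeper write data, and acknowledge some risks. By default a new directory is created under /tmp. Many servers are configured to delete the contents of /tmp on reboot, so you should store the data elsewhere. The following configuration stores HBase's data in the hbase directory, in the home directory of the user called testuser. Paste the <property> tags beneath the <configuration> tags, which should be empty in a new HBase install.

    Example 1. Example hbase-site.xml for standalone HBase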
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>file:///home/testuser/hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/testuser/zookeeper</value>
      </property>
      <property>
        <name>hbase.unsafe.stream.capability.enforce</name>
        <value>false</value>
        <description>
          Controls whether HBase will check for stream capabilities (hflush/hsync).
    
          Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
          with the 'file://' scheme, but be mindful of the NOTE below.
    
          WARNING: Setting this to false blinds you to potential data loss and
          inconsistent system state in the event of process and/or node failures. If
          HBase is complaining of an inability to use hsync or hflush it's most
          likely not a false positive.
        </description>
      </property>
    </configuration>

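 You do not need to start ZooKeeper before starting HBase, because HBase ships with its own ZooKeeper. If a separate ZooKeeper is already running, startup may fail with a port-in-use error on port 2181.

Configuration

1. Extract the archive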

tar -zxvf hbase.tar.gz -C /opt
mv /opt/hbase-2.0.5 /opt/hbase

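2. Delete the docs directory (optional). It only contains the HBase documentation; removing it keeps the installation small when it is copied to other machines.

3. Configure the environment variables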

vim /etc/profile

export HBASE_HOME=/opt/hbase
export PATH=$PATH:$HBASE_HOME/bin

source /etc/profile

4. Configure conf/hbase-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_162

5. Configure conf/hbase-site.xml

<configuration>
  <property>
    <!-- HBase root directory -->
    <name>hbase.rootdir</name>
    <value>file:///home/testuser/hbase</value>
  </property>
  <property>
    <!-- Directory where ZooKeeper stores its data -->
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/testuser/zookeeper</value>
  </property>
  <property>
    <!-- On a single node this property must be false -->
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
    <description>
      Controls whether HBase will check for stream capabilities (hflush/hsync).

      Disable this if you intend to run on LocalFileSystem, denoted by a rootdir
      with the 'file://' scheme, but be mindful of the NOTE below.

      WARNING: Setting this to false blinds you to potential data loss and
      inconsistent system state in the event of process and/or node failures. If
      HBase is complaining of an inability to use hsync or hflush it's most
      likely not a false positive.
    </description>
  </property>
</configuration>
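To double-check a configuration like the one above before starting HBase, the property names and values can be read back with a few lines of Python. This is only a sketch using the standard library; the function names read_properties and uses_local_filesystem are hypothetical, and the XML is inlined here rather than read from conf/hbase-site.xml.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/testuser/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/testuser/zookeeper</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>"""

def read_properties(xml_text):
    """Return {name: value} for every <property> element."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props

def uses_local_filesystem(props):
    """True when hbase.rootdir uses the file:// scheme."""
    return props.get("hbase.rootdir", "").startswith("file://")

props = read_properties(SAMPLE)
```

For the sample above, uses_local_filesystem(props) is true, which is the case in which these notes set hbase.unsafe.stream.capability.enforce to false.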

Start HBase

start-hbase.sh

[root@master hbase]# start-hbase.sh 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /opt/hbase/logs/hbase-root-master-master.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

jps
3790 HMaster

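A standalone instance runs all HBase daemons (the Master, the RegionServers, and ZooKeeper) in a single JVM, persisting to the local filesystem.

Note:

Before 1.x, the default web UI port was 60010

From 1.x onward, the default web UI port is 16010

Web UI: http://192.168.23.94:16010/

 HBase shell

Because the environment variables were configured earlier, you can run it directly: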

hbase shell


[root@master hbase]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.0.5, r76458dd074df17520ad451ded198cd832138e929, Mon Mar 18 00:41:49 UTC 2019
Took 0.0025 seconds                                                                                                                                                                                                                             
hbase(main):001:0> 

Or see: HBase搭建--单机 (HBase standalone setup), qq_42722387's blog, CSDN

Command reference

Version 2.x

hbase(main):006:0> help

COMMAND GROUPS:
  Group name: general (general-purpose commands)
      Commands: 
            processlist, 
            status,
            table_help, 
            version,
            whoami

  Group name: ddl (data definition language)
      Commands: 
            alter, 
            alter_async,
            alter_status, 
            create (create a table),
            describe (describe a table), 
            disable (disable a table), 
            disable_all, 
            drop (drop a table), 
            drop_all, 
            enable, 
            enable_all, 
            exists, 
            get_table, 
            is_disabled, 
            is_enabled, 
            list (list tables), 
            list_regions, 
            locate_region, 
            show_filters

  Group name: namespace (similar to the concept of a database; the two built-in namespaces are hbase and default)
      Commands:
            alter_namespace, 
            create_namespace, 
            describe_namespace, 
            drop_namespace, 
            list_namespace, 
            list_namespace_tables

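  Group name: dml (data manipulation: insert, delete, update)
      Commands: 
            append (append to a cell value), 
            count (count the rows in a table), 
            delete (delete a cell value at a given table, row, and column), 
            deleteall (delete all cells in a given row), 
            get (get a row or cell), 
            get_counter (read a counter value), 
            get_splits (get the table's split points), 
            incr (increment a counter cell, like an accumulator), 
            put (write a cell value), 
            scan (scan through the data in a table), 
            truncate (remove all data from a table), 
            truncate_preserve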

  Group name: tools (tool commands)
  Commands: assign, balance_switch, balancer (trigger load balancing), balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush (flush the MemStore to files on disk), is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, trace, unassign, wal_roll, zk_dump

  Group name: replication (replication between clusters)
  Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_tableCFs, show_peer_tableCFs, update_peer_config

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot

  Group name: configuration (configuration files)
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota

  Group name: security (security settings)
  Commands: grant, list_security_capabilities, revoke, user_permission

  Group name: procedures (list running procedures and locks)
  Commands: list_locks, list_procedures


Added in 2.0 ↓

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

  Group name: rsgroup
  Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup

List the namespaces. When no namespace is specified, a table is placed in default.
hbase(main):007:0> list_namespace
NAMESPACE                                                                                                                                                                           
default                                                                                                                                                                             
hbase                                                                                                                                                                               
2 row(s)
Took 0.0369 seconds   

List the tables in the default namespace
hbase(main):008:0> list
TABLE                                                                                                                                                                               
0 row(s)
Took 0.0084 seconds                                                                                                                                                                 
=> []

List the tables in a specific namespace
hbase(main):011:0> list_namespace_tables 'hbase'
TABLE                                                                                                                                                                               
meta                                                                                                                                                                                
namespace                                                                                                                                                                           
2 row(s)
Took 0.0352 seconds                                                                                                                                                                 
=> ["meta", "namespace"]
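The namespace rule shown above (a table named without a namespace lives in default, while hbase:meta lives in hbase) can be written down as a small helper. This is a sketch for illustration only; split_table_name is a hypothetical function, not part of any HBase API.

```python
def split_table_name(full_name):
    """Split 'namespace:table' into (namespace, table).

    A name without a namespace prefix is assumed to be in 'default'.
    """
    namespace, sep, table = full_name.partition(":")
    if not sep:
        return "default", full_name
    return namespace, table
```

For example, split_table_name('psn') gives ('default', 'psn') and split_table_name('hbase:meta') gives ('hbase', 'meta').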


Scan all the data in a table
hbase(main):014:0> scan 'hbase:meta'
ROW                                            COLUMN+CELL                                                                                                                          
 hbase:namespace                               column=table:state, timestamp=1655045340831, value=\x08\x00                                                                          
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:regioninfo, timestamp=1655045340820, value={ENCODED => 3c789cff68aa91b63c653278b31d1bcb, NAME => 'hbase:namespace,,165504
 b63c653278b31d1bcb.                           5339932.3c789cff68aa91b63c653278b31d1bcb.', STARTKEY => '', ENDKEY => ''}                                                            
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:seqnumDuringOpen, timestamp=1655045340820, value=\x00\x00\x00\x00\x00\x00\x00\x02                                        
 b63c653278b31d1bcb.                                                                                                                                                                
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:server, timestamp=1655045340820, value=master:16020                                                                      
 b63c653278b31d1bcb.                                                                                                                                                                
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:serverstartcode, timestamp=1655045340820, value=1655045327420                                                            
 b63c653278b31d1bcb.                                                                                                                                                                
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:sn, timestamp=1655045340614, value=master,16020,1655045327420                                                            
 b63c653278b31d1bcb.                                                                                                                                                                
 hbase:namespace,,1655045339932.3c789cff68aa91 column=info:state, timestamp=1655045340820, value=OPEN                                                                               
 b63c653278b31d1bcb.                                                                                                                                                                
2 row(s)
Took 0.0641 seconds 
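The row keys in the scan output above are region names. Judging from the output, a region name appears to have the form table,start key,region id.encoded name. with a trailing dot. The sketch below pulls those parts apart under that assumption; parse_region_name is a hypothetical helper, and the layout is inferred from this output rather than taken from a specification.

```python
def parse_region_name(name):
    """Split a region name of the assumed form
    '<table>,<start key>,<region id>.<encoded name>.' into its parts."""
    # The table name ends at the first comma; the region id starts
    # after the last comma, so a start key may itself contain commas.
    table, rest = name.split(",", 1)
    start_key, tail = rest.rsplit(",", 1)
    region_id, encoded = tail.rstrip(".").split(".", 1)
    return {
        "table": table,
        "start_key": start_key,
        "region_id": int(region_id),
        "encoded_name": encoded,
    }

info = parse_region_name(
    "hbase:namespace,,1655045339932.3c789cff68aa91b63c653278b31d1bcb."
)
```

Here the start key is empty, which matches STARTKEY => '' in the regioninfo column, and the encoded name matches the ENCODED value.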


Create a table
hbase(main):019:0> create 'psn',   # table name
hbase(main):020:0* 'cf'            # column family
Created table psn
Took 0.8356 seconds                                                                                                                                                                 
=> Hbase::Table - psn

hbase(main):022:0> create 'psn1','cf1','cf2'   # psn1 is the table name; cf1 and cf2 are column families
Created table psn1
Took 0.7475 seconds                                                                                                                                                                 
=> Hbase::Table - psn1

Show the column family description
hbase(main):023:0> describe 'psn'   # or: desc 'psn'
Table psn is ENABLED                                                                                                                                                                
psn                                                                                                                                                                                 
COLUMN FAMILIES DESCRIPTION                                                                                                                                                         
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODIN
G => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 
'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                     
1 row(s)
Took 0.0898 seconds 

Insert data
hbase(main):026:0> put 'psn','1','cf:name','zhangsan'
Took 0.0679 seconds  

hbase(main):027:0> scan 'psn'
ROW                                            COLUMN+CELL                                                                                                                          
 1                                             column=cf:name, timestamp=1655049403593, value=zhangsan                                                                              
1 row(s)
Took 0.0085 seconds  

hbase(main):028:0> put 'psn','2','cf:gae','12'
Took 0.0049 seconds                                                                                                                                                                 
hbase(main):029:0> scan 'psn'
ROW                                            COLUMN+CELL                                                                                                                          
 1                                             column=cf:name, timestamp=1655049403593, value=zhangsan                                                                              
 2                                             column=cf:gae, timestamp=1655049489586, value=12                                                                                     
2 row(s)
Took 0.0078 seconds 
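The timestamp printed for each cell in the scans above looks like milliseconds since the Unix epoch. Assuming that, the following sketch converts such a value into a readable UTC time; cell_timestamp_to_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def cell_timestamp_to_utc(ts_millis):
    """Convert an epoch-milliseconds timestamp to an aware UTC datetime."""
    seconds, millis = divmod(ts_millis, 1000)
    # Add the milliseconds separately to avoid float rounding.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)

written_at = cell_timestamp_to_utc(1655049403593)
```

For the first put above this lands on 12 June 2022 in UTC, which is consistent with the file dates shown elsewhere in these notes once the local time zone is taken into account.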

Drop a table
1. Disable the table first
hbase(main):037:0> disable 'psn'
Took 0.4813 seconds 
2. Drop the table
hbase(main):038:0> drop 'psn'
Took 0.2619 seconds  

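Flush data to disk. Without a manual flush, data is written to disk only when the MemStore reaches its flush threshold (64 MB according to these notes).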
hbase(main):043:0> flush 'psn'
Took 0.2378 seconds 

[root@master cf]# pwd
/home/testuser/hbase/data/default/psn/bcd00127bbc4db96f791acd1c0c22d72/cf
[root@master cf]# ll
total 8
-rw-rw-rw-. 1 root root 4832 Jun 13 00:05 38915b69b55245b599121f5aba49f1cf

Inspect the table's HFile
hbase hfile -p -f file:///home/testuser/hbase/data/default/psn/bcd00127bbc4db96f791acd1c0c22d72/cf/38915b69b55245b599121f5aba49f1cf 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2022-06-13 00:08:55,842 INFO  [main] metrics.MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
K: 1/cf:name/1655049916196/Put/vlen=8/seqid=4 V: zhangsan
Scanned kv count -> 1
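The K: ... V: ... line printed above packs the row, column, timestamp, operation type, value length, and sequence id into one string. The sketch below splits such a line with a regular expression; the pattern is inferred from this single line of output, so treat it as an assumption, and parse_kv_line is a hypothetical helper.

```python
import re

KV_LINE = re.compile(
    r"^K: (?P<row>.*)/(?P<family>[^/:]+):(?P<qualifier>[^/]*)"
    r"/(?P<timestamp>\d+)/(?P<type>\w+)"
    r"/vlen=(?P<vlen>\d+)/seqid=(?P<seqid>\d+) V: (?P<value>.*)$"
)

def parse_kv_line(line):
    """Parse one 'K: ... V: ...' line into a dict, or return None."""
    match = KV_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("timestamp", "vlen", "seqid"):
        fields[key] = int(fields[key])
    return fields

kv = parse_kv_line("K: 1/cf:name/1655049916196/Put/vlen=8/seqid=4 V: zhangsan")
```

In the parsed result, vlen is 8, which is the length of the value zhangsan, and the row, family, and qualifier match the put that wrote the cell.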

Stop HBase
stop-hbase.sh

Setting up fully distributed mode

Start zkCli.sh

[zk: localhost:2181(CONNECTED) 0] ls /
[cluster, controller_epoch, brokers, zookeeper, admin, isr_change_notification, consumers, log_dir_event_notification, latest_producer_id_block, config, hbase]
[zk: localhost:2181(CONNECTED) 1] rmr /hbase

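This removes the /hbase znode from ZooKeeper.

Stop ZooKeeper.

See: Hbase分布式 (HBase distributed setup), qq_42722387's blog, CSDN

Copy core-site.xml and hdfs-site.xml from the /hadoop/etc/hadoop/ directory into the /hbase/conf/ directory: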

cd /hadoop/etc/hadoop
cp core-site.xml hdfs-site.xml /hbase/conf

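If all three machines in the cluster run ZooKeeper, you can start HBase directly.

In this setup only the master node has ZooKeeper, so the configuration is changed as follows: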

 vim hbase-env.sh

# Let HBase manage the ZooKeeper instance on this host (true = HBase starts and stops ZooKeeper itself)
export HBASE_MANAGES_ZK=true

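Change the compaction check timing so that small files are compacted about once per second (hbase.server.thread.wakefrequency is in milliseconds, so the values below actually give a check interval of 1 ms):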

  <property>
    <!-- Time to sleep between searches for work (in milliseconds). Used as the sleep interval by service threads such as the log roller -->
    <name>hbase.server.thread.wakefrequency</name>
    <value>1</value>
  </property>

  <property>
    <!-- How often HBase checks whether a compaction is needed -->
    <!-- The check interval is hbase.server.compactchecker.interval.multiplier multiplied by hbase.server.thread.wakefrequency -->
    <name>hbase.server.compactchecker.interval.multiplier</name>
    <value>1</value>
  </property>
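The comments in the two properties above say the check interval is the multiplier times the wake frequency. The arithmetic can be sketched as follows; the function name is hypothetical, and the default values quoted in the code (10000 ms and 1000) are my recollection of hbase-default.xml and should be treated as an assumption.

```python
def compaction_check_interval_ms(wakefrequency_ms, multiplier):
    """Interval between compaction checks, in milliseconds."""
    return wakefrequency_ms * multiplier

# Values configured in these notes: 1 ms x 1.
configured = compaction_check_interval_ms(1, 1)

# One way to get a one-second interval instead: 1000 ms x 1.
one_second = compaction_check_interval_ms(1000, 1)

# Assumed defaults (10000 ms x 1000), for comparison.
assumed_default = compaction_check_interval_ms(10000, 1000)
```

With both values set to 1 the result is 1 ms, not 1 s; under the assumed defaults it is 10,000,000 ms, a little under three hours.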

Note: there seems to be a bug here; files are only merged when HBase is restarted.
