Modifying the HBase TTL Attribute to Free Up Space


40. Time To Live (TTL)

ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in the HBase for the row is specified in UTC.

Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to a value other than 0 also disables it.

See HColumnDescriptor for more information.

Recent versions of HBase also support setting time to live on a per cell basis. See HBASE-10560 for more information. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and ColumnFamily TTLs:

  • Cell TTLs are expressed in units of milliseconds instead of seconds.

  • A cell TTL cannot extend the effective lifetime of a cell beyond a ColumnFamily level TTL setting.


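Differences between a cell TTL and a column family (CF) TTL:

  • A CF TTL is expressed in seconds; a cell TTL is expressed in milliseconds.
  • If a cell-level TTL is set, it overrides the CF TTL for that cell, but it cannot extend the cell's lifetime beyond the CF-level TTL.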
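
The unit difference and the cap described in the second point can be made concrete with a little arithmetic. The sketch below only illustrates the rule as stated above; it is not HBase code, and the function name effective_ttl_ms is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper (not part of HBase): given a column family TTL in
# seconds and a cell TTL in milliseconds, print the lifetime that applies
# to the cell, in milliseconds. The cell TTL wins only when it is shorter.
effective_ttl_ms() {
    local cf_ttl_s=$1 cell_ttl_ms=$2
    local cf_ttl_ms=$(( cf_ttl_s * 1000 ))   # CF TTLs are given in seconds
    if (( cell_ttl_ms < cf_ttl_ms )); then
        echo "$cell_ttl_ms"
    else
        echo "$cf_ttl_ms"
    fi
}

effective_ttl_ms 600 30000     # cell TTL of 30 s is shorter: prints 30000
effective_ttl_ms 600 900000    # cell TTL of 15 min is capped: prints 600000
```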

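The material above comes from the official Apache HBase documentation and is included for reference. The rest of this post walks through it in practice.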

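Create the table (the dc namespace must already exist; create it with create_namespace 'dc' if necessary):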
    create 'dc:event',{NAME => 'f1'},{NAME => 'cf'},{NAME => 'f2'}
Inspect the table schema:
    desc "dc:event"
    'dc:event',
    {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
    {NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
    {NAME => 'f2', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}


Put some data into each column family:
    put  'dc:event','866925023233621','f1:eventid','866925023233621'
    put  'dc:event','866925023233622','f1:eventid','866925023233621'
    put  'dc:event','866925023233623','f1:eventid','866925023233621'
    put  'dc:event','866925023233624','f1:eventid','866925023233621'
    put  'dc:event','866925023233625','f1:eventid','866925023233621'
    put  'dc:event','866925023233626','f1:eventid','866925023233621'
    put  'dc:event','866925023233627','f1:eventid','866925023233621'
    put  'dc:event','866925023233628','f1:eventid','866925023233621'
    put  'dc:event','866925023233629','f1:eventid','866925023233621'
    put  'dc:event','866925023233630','f1:eventid','866925023233621'
    put  'dc:event','8669250232336-21','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-22','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-23','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-24','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-25','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-26','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-27','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-28','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-29','cf:eventid','866925023233621'
    put  'dc:event','8669250232336-30','cf:eventid','866925023233621'
    put  'dc:event','866925023233-6-21','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-22','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-23','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-24','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-25','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-26','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-27','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-28','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-29','f2:eventid','866925023233621'
    put  'dc:event','866925023233-6-30','f2:eventid','866925023233621'


Scan the table to confirm that all 30 rows are present:
	hbase(main):048:0> scan 'dc:event'
	ROW                                              COLUMN+CELL                                                                                                                                   
	 866925023233-6-21                               column=f2:eventid, timestamp=1536805384815, value=866925023233621                                                                             
	 866925023233-6-22                               column=f2:eventid, timestamp=1536805384873, value=866925023233621                                                                             
	 866925023233-6-23                               column=f2:eventid, timestamp=1536805384881, value=866925023233621                                                                             
	 866925023233-6-24                               column=f2:eventid, timestamp=1536805384890, value=866925023233621                                                                             
	 866925023233-6-25                               column=f2:eventid, timestamp=1536805384898, value=866925023233621                                                                             
	 866925023233-6-26                               column=f2:eventid, timestamp=1536805384907, value=866925023233621                                                                             
	 866925023233-6-27                               column=f2:eventid, timestamp=1536805384922, value=866925023233621                                                                             
	 866925023233-6-28                               column=f2:eventid, timestamp=1536805384936, value=866925023233621                                                                             
	 866925023233-6-29                               column=f2:eventid, timestamp=1536805384946, value=866925023233621                                                                             
	 866925023233-6-30                               column=f2:eventid, timestamp=1536805384958, value=866925023233621                                                                             
	 8669250232336-21                                column=cf:eventid, timestamp=1536805310816, value=866925023233621                                                                             
	 8669250232336-22                                column=cf:eventid, timestamp=1536805310850, value=866925023233621                                                                             
	 8669250232336-23                                column=cf:eventid, timestamp=1536805310861, value=866925023233621                                                                             
	 8669250232336-24                                column=cf:eventid, timestamp=1536805310870, value=866925023233621                                                                             
	 8669250232336-25                                column=cf:eventid, timestamp=1536805310881, value=866925023233621                                                                             
	 8669250232336-26                                column=cf:eventid, timestamp=1536805310890, value=866925023233621                                                                             
	 8669250232336-27                                column=cf:eventid, timestamp=1536805310911, value=866925023233621                                                                             
	 8669250232336-28                                column=cf:eventid, timestamp=1536805310918, value=866925023233621                                                                             
	 8669250232336-29                                column=cf:eventid, timestamp=1536805310930, value=866925023233621                                                                             
	 8669250232336-30                                column=cf:eventid, timestamp=1536805310937, value=866925023233621                                                                             
	 866925023233621                                 column=f1:eventid, timestamp=1536805258985, value=866925023233621                                                                             
	 866925023233622                                 column=f1:eventid, timestamp=1536805259053, value=866925023233621                                                                             
	 866925023233623                                 column=f1:eventid, timestamp=1536805259060, value=866925023233621                                                                             
	 866925023233624                                 column=f1:eventid, timestamp=1536805259070, value=866925023233621                                                                             
	 866925023233625                                 column=f1:eventid, timestamp=1536805259078, value=866925023233621                                                                             
	 866925023233626                                 column=f1:eventid, timestamp=1536805259084, value=866925023233621                                                                             
	 866925023233627                                 column=f1:eventid, timestamp=1536805259112, value=866925023233621                                                                             
	 866925023233628                                 column=f1:eventid, timestamp=1536805259119, value=866925023233621                                                                             
	 866925023233629                                 column=f1:eventid, timestamp=1536805259127, value=866925023233621                                                                             
	 866925023233630                                 column=f1:eventid, timestamp=1536805259143, value=866925023233621                                                                             
	30 row(s) in 0.0920 seconds


Next, set a TTL of 600 seconds on column families cf and f1:
1. disable 'dc:event'
2. alter 'dc:event', NAME => 'cf', TTL => 600
   alter 'dc:event', NAME => 'f1', TTL => 600
3. enable 'dc:event'
4. scan 'dc:event' once the 600 seconds have passed:
    ROW                                              COLUMN+CELL                                                                                                                                   
 866925023233-6-21                               column=f2:eventid, timestamp=1536805384815, value=866925023233621                                                                             
 866925023233-6-22                               column=f2:eventid, timestamp=1536805384873, value=866925023233621                                                                             
 866925023233-6-23                               column=f2:eventid, timestamp=1536805384881, value=866925023233621                                                                             
 866925023233-6-24                               column=f2:eventid, timestamp=1536805384890, value=866925023233621                                                                             
 866925023233-6-25                               column=f2:eventid, timestamp=1536805384898, value=866925023233621                                                                             
 866925023233-6-26                               column=f2:eventid, timestamp=1536805384907, value=866925023233621                                                                             
 866925023233-6-27                               column=f2:eventid, timestamp=1536805384922, value=866925023233621                                                                             
 866925023233-6-28                               column=f2:eventid, timestamp=1536805384936, value=866925023233621                                                                             
 866925023233-6-29                               column=f2:eventid, timestamp=1536805384946, value=866925023233621                                                                             
 866925023233-6-30                               column=f2:eventid, timestamp=1536805384958, value=866925023233621                                                                             
10 row(s) in 0.0740 seconds

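Of the table's three column families (cf, f1, and f2), only cf and f1 were given a TTL. Once the TTL elapsed, the data in cf and f1 was cleared automatically; f2 has no TTL set, so its data is still there.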
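
Why cf and f1 were empty by the time of the second scan follows from the cell timestamps in the first scan. The sketch below reproduces that arithmetic. The helper is_expired is hypothetical, and the "now" value is an assumed example time, not one taken from the session above.

```shell
#!/bin/bash
# Hypothetical helper: a cell counts as expired once its age exceeds the TTL.
# Arguments: cell timestamp (ms), column family TTL (s), current time (ms).
is_expired() {
    local ts_ms=$1 ttl_s=$2 now_ms=$3
    (( now_ms - ts_ms > ttl_s * 1000 ))
}

# Earliest f1 cell from the first scan, with TTL => 600.
ts=1536805258985
echo $(( ts + 600 * 1000 ))    # expiry time in ms: prints 1536805858985

# Assumed "now": 11 minutes after the cell was written.
now=$(( ts + 11 * 60 * 1000 ))
if is_expired "$ts" 600 "$now"; then echo "expired"; else echo "live"; fi
```

Expired cells stop being returned by reads right away, but the disk space is only reclaimed when a compaction rewrites or drops the store files, for example after running major_compact 'dc:event' in the HBase shell.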

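The table's TTL before and after the change: cf and f1 go from TTL => 'FOREVER' to 600 seconds, while f2 stays at 'FOREVER'.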

Shell script for changing the TTL of several HBase tables:

#!/bin/bash -l

# Decide before running whether this change needs a rollback plan.
# If it does, record the current TTL of each production table first:
# the TTL setting can be restored, but data that has already expired cannot.

WB_DIR=$(cd "$(dirname "$0")" && pwd)
HBASE_NAMESPACE='hochoy'

origin_tables="tabTest1 tabTest2 tabTest3"

alter_ttl="alter_hbase.script"

# Convert a number of years into a TTL in seconds, dropping any fractional part.
get_ttl_value(){
	years=${1}
	ttl=$(echo "scale = 0; 60 * 60 * 24 * 365 * ${years}" | bc)
	echo "${ttl%.*}"
}

# Generate the HBase shell script that sets the given TTL on column family 'f' of each table.
gen_alt_script(){
	ttl=${1}
	: > "${WB_DIR}/${alter_ttl}"
	for table in ${origin_tables}
	do
		echo "desc    '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
		echo "disable '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
		echo "alter   '${HBASE_NAMESPACE}:${table}', {NAME=>'f',TTL=>${ttl} }" >> "${WB_DIR}/${alter_ttl}"
		echo "enable  '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
		echo "desc    '${HBASE_NAMESPACE}:${table}'" >> "${WB_DIR}/${alter_ttl}"
	done
	echo "exit" >> "${WB_DIR}/${alter_ttl}"
}

# The only argument is the TTL: a number of years, or FOREVER.
if [ $# -lt 1 ]; then
	echo "Usage: $0 <years|FOREVER>
	Input value of TTL please!
	"
	exit 1
fi
if [ "${1}" = "FOREVER" ]; then
	gen_alt_script FOREVER
else
	ttl=$(get_ttl_value "${1}")
	gen_alt_script "$ttl"
fi

# Show the generated commands, then run them in the HBase shell.
cat "${WB_DIR}/${alter_ttl}"
hbase shell "${WB_DIR}/${alter_ttl}"
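
The script passes its argument straight to bc, so a typo ends up as a broken alter statement. Below is a minimal sketch of an argument check that could run before get_ttl_value. The helper names are hypothetical, only whole years are accepted (the script above also allows fractions), and the upper bound assumes the column family TTL is held as a 32-bit signed number of seconds, which is roughly 68 years.

```shell
#!/bin/bash
MAX_TTL_SECONDS=2147483647   # assumed upper bound: largest 32-bit signed integer

# Hypothetical helper: whole years to seconds, using the same 365-day year
# as get_ttl_value.
years_to_seconds() {
    echo $(( 60 * 60 * 24 * 365 * $1 ))
}

# Hypothetical helper: succeed only for FOREVER or a whole number of years
# (at least 1) whose TTL in seconds does not exceed MAX_TTL_SECONDS.
is_valid_ttl_arg() {
    local arg=$1
    [[ $arg == FOREVER ]] && return 0
    [[ $arg =~ ^[1-9][0-9]{0,3}$ ]] || return 1
    (( $(years_to_seconds "$arg") <= MAX_TTL_SECONDS ))
}

years_to_seconds 1     # prints 31536000
years_to_seconds 3     # prints 94608000
for arg in 3 FOREVER 0.5x 100; do
    if is_valid_ttl_arg "$arg"; then echo "$arg: ok"; else echo "$arg: rejected"; fi
done
```

In the script, a check like this would sit right after the usage test, exiting with a non-zero status when the argument is rejected.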

 
