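First, I have two test tables: t_lisa and t_lisa1. One region of t_lisa and the whole of t_lisa1 live on the same regionserver, so their updates are written to the same HLog file. I will update both t_lisa and t_lisa1, then manually flush t_lisa so that its memstore is written to an HFile, and finally compare the sequence numbers in t_lisa's HFiles, t_lisa1's HFiles, and the HLog.
First, check the current HFiles of both tables: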
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 964 2013-11-16 15:26 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1/b6ab19771b4f4a7597afffd37ccf907b
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 706 2013-11-14 09:44 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2/3cc23abe61f74ccfa1e2240153487101
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 881 2013-11-26 18:33 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1/ad4992b0fd5843318b00c0c60a43f786
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 712 2013-11-15 14:26 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2/5df75d8416bd4d049937901c154d3dfd
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 795 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1/e541b8a04f224e869166ee43783bd8d0
[root@a01 ~]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 736 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2/6eb83725b6b042958067d900761ef613
Then we modify all four column families across the two tables:
hbase(main):001:0> put 't_lisa','lisa10','cf_1:w1','10z2'
0 row(s) in 0.5120 seconds
hbase(main):002:0> put 't_lisa','lisa1','cf_1:w100','abcd'
0 row(s) in 0.0050 seconds
hbase(main):003:0> put 't_lisa','lisa10','cf_2:w1','10z2'
0 row(s) in 0.0050 seconds
hbase(main):004:0> put 't_lisa','lisa1','cf_2:w100','abcd'
0 row(s) in 0.0030 seconds
hbase(main):005:0> put 't_lisa1','lisa11','cf_1:23','3sdfs'
0 row(s) in 0.1430 seconds
hbase(main):006:0> put 't_lisa1','lisa12','cf_1:34','zzzz'
0 row(s) in 0.0040 seconds
hbase(main):007:0> put 't_lisa1','lisa11','cf_2:23','3sdfs'
0 row(s) in 0.0040 seconds
hbase(main):008:0> put 't_lisa1','lisa12','cf_2:34','zzzz'
0 row(s) in 0.0040 seconds
hbase(main):011:0> put 't_lisa','lisa77','cf_1:w100','abcd'
0 row(s) in 0.0370 seconds
hbase(main):012:0> put 't_lisa','lisa79','cf_2:w1','10z2'
0 row(s) in 0.0040 seconds
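Confirm that the HFiles do not change as long as the memstore has not been flushed.
MAX_SEQ_ID_KEY of t_lisa1 = 2309244
MAX_SEQ_ID_KEY of t_lisa = 2316021
Every sequence number in the HLog (2316025 to 2316030) is greater than the MAX_SEQ_ID_KEY of the HFiles, so these in-memory edits have not yet been flushed to an HFile: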
[root@a01 hbase]# bin/hbase hlog /hbase_root/.logs/*,60020,1385442023669/*%2C60020%2C1385442023669.1385528428894 -p
Sequence 2316025 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Action:
row: lisa11
column: cf_1:23
at time: Wed Nov 27 14:32:17 CST 2013
value: 3sdfs
Sequence 2316026 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Action:
row: lisa12
column: cf_1:34
at time: Wed Nov 27 14:32:17 CST 2013
value: zzzz
Sequence 2316027 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Action:
row: lisa11
column: cf_2:23
at time: Wed Nov 27 14:32:17 CST 2013
value: 3sdfs
Sequence 2316028 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Action:
row: lisa12
column: cf_2:34
at time: Wed Nov 27 14:32:17 CST 2013
value: zzzz
Sequence 2316029 from region fa8f6eb2a0bcb54e443f9bfc2693768d in table t_lisa
Action:
row: lisa77
column: cf_1:w100
at time: Wed Nov 27 14:45:08 CST 2013
value: abcd
Sequence 2316030 from region fa8f6eb2a0bcb54e443f9bfc2693768d in table t_lisa
Action:
row: lisa79
column: cf_2:w1
at time: Wed Nov 27 14:45:08 CST 2013
value: 10z2
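The comparison above can also be done mechanically. The following is a minimal Python sketch, not an HBase tool: it assumes the dump format shown above (only the `Sequence ... from region ... in table ...` header lines are used) and takes the two MAX_SEQ_ID_KEY values reported earlier as given. The function name `unflushed_edits` is made up for this illustration.

```python
import re

# Header lines copied from the HLog dump above.
DUMP = """\
Sequence 2316025 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Sequence 2316026 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Sequence 2316027 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Sequence 2316028 from region 787ce41dabb55075935e7060583ae6af in table t_lisa1
Sequence 2316029 from region fa8f6eb2a0bcb54e443f9bfc2693768d in table t_lisa
Sequence 2316030 from region fa8f6eb2a0bcb54e443f9bfc2693768d in table t_lisa
"""

# MAX_SEQ_ID_KEY values reported above for each table's HFiles.
HFILE_MAX_SEQ = {"t_lisa1": 2309244, "t_lisa": 2316021}

HEADER = re.compile(r"^Sequence (\d+) from region (\w+) in table (\S+)$")


def unflushed_edits(dump_text, hfile_max_seq):
    """Return {table: [sequence ids greater than that table's HFile maximum]}."""
    pending = {}
    for line in dump_text.splitlines():
        match = HEADER.match(line.strip())
        if not match:
            continue  # skip Action/row/column/value lines
        seq, table = int(match.group(1)), match.group(3)
        if seq > hfile_max_seq.get(table, -1):
            pending.setdefault(table, []).append(seq)
    return pending


pending = unflushed_edits(DUMP, HFILE_MAX_SEQ)
for table, seqs in sorted(pending.items()):
    print(table, "unflushed sequence ids:", seqs)
```

Any sequence id listed by this check exists only in the HLog and the memstore, which is the situation described above.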
hbase(main):010:0> help 'flush'
Flush all regions in passed table or pass a region row to
flush an individual region. For example:
hbase> flush 'TABLENAME'
hbase> flush 'REGIONNAME'
So flush can target either a whole table or a single region name; I was worrying needlessly.
We flush the region 't_lisa,lisa3,1384393470579.fa8f6eb2a0bcb54e443f9bfc2693768d.':
hbase(main):013:0> flush 't_lisa,lisa3,1384393470579.fa8f6eb2a0bcb54e443f9bfc2693768d.'
0 row(s) in 0.2180 seconds
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1/
Found 2 items
-rw-r--r-- 3 root supergroup 711 2013-11-27 14:46 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1/9731d813c4a54c018eee7c7a5ed4b11f
-rw-r--r-- 3 root supergroup 881 2013-11-26 18:33 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1/ad4992b0fd5843318b00c0c60a43f786
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2
Found 2 items
-rw-r--r-- 3 root supergroup 712 2013-11-15 14:26 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2/5df75d8416bd4d049937901c154d3dfd
-rw-r--r-- 3 root supergroup 705 2013-11-27 14:46 /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2/cbb4312c3861473faf04a17a7861d51e
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 964 2013-11-16 15:26 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1/b6ab19771b4f4a7597afffd37ccf907b
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 706 2013-11-14 09:44 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2/3cc23abe61f74ccfa1e2240153487101
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 795 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1/e541b8a04f224e869166ee43783bd8d0
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 736 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2/6eb83725b6b042958067d900761ef613
[root@a01 hbase]# bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -f /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_1/9731d813c4a54c018eee7c7a5ed4b11f -v -m -p
MAX_SEQ_ID_KEY = 2316031
[root@a01 hbase]# bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -f /hbase_root/t_lisa/fa8f6eb2a0bcb54e443f9bfc2693768d/cf_2/cbb4312c3861473faf04a17a7861d51e -v -m -p
MAX_SEQ_ID_KEY = 2316031
Why 2316031, which is one greater than the last HLog sequence number we saw earlier? What was this final operation?
Sequence 2316031 from region fa8f6eb2a0bcb54e443f9bfc2693768d in table t_lisa
Action:
row: METAROW
column: METAFAMILY:
at time: Wed Nov 27 14:46:30 CST 2013
value: HBASE::CACHEFLUSH
This final operation is the memstore flush itself.
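The numbers observed so far fit a simple model, sketched below in Python. This is a toy inferred only from the output above, not HBase's actual implementation: it assumes one shared log and one sequence counter per regionserver, and that a flush appends its own marker entry whose sequence id becomes the new HFile's MAX_SEQ_ID_KEY. The class `ToyRegionServer` and its methods are hypothetical names.

```python
class ToyRegionServer:
    """Toy model: one shared log and one sequence counter per regionserver."""

    def __init__(self, first_seq):
        self.next_seq = first_seq
        self.log = []            # (seq, region, payload) in write order
        self.memstore = {}       # region -> sequence ids not yet flushed
        self.hfile_max_seq = {}  # region -> MAX_SEQ_ID_KEY of the newest HFile

    def _append(self, region, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.log.append((seq, region, payload))
        return seq

    def put(self, region, payload):
        seq = self._append(region, payload)
        self.memstore.setdefault(region, []).append(seq)
        return seq

    def flush(self, region):
        # Assumption: the flush marker consumes a sequence id of its own.
        seq = self._append(region, "HBASE::CACHEFLUSH")
        self.memstore[region] = []
        self.hfile_max_seq[region] = seq
        return seq


T_LISA1 = "787ce41dabb55075935e7060583ae6af"
T_LISA = "fa8f6eb2a0bcb54e443f9bfc2693768d"

rs = ToyRegionServer(first_seq=2316025)
for column in ["cf_1:23", "cf_1:34", "cf_2:23", "cf_2:34"]:
    rs.put(T_LISA1, column)
rs.put(T_LISA, "cf_1:w100")
rs.put(T_LISA, "cf_2:w1")

flush_seq = rs.flush(T_LISA)
print(flush_seq)  # prints 2316031
```

In this model the t_lisa1 region still has four unflushed edits after the flush, because flushing one region does not touch the memstore of another region that shares the same log.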
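So when will the edits that were not flushed be written to an HFile? In theory the HLog rolls every hour; when it rolls and finds that these edits have not been persisted, does it force a flush?
We have to wait for the next hour and observe what changes: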
[root@a01 hbase]# hadoop fs -ls /hbase_root/.logs
Found 5 items
drwxr-xr-x - root supergroup 0 2013-11-27 13:00 /hbase_root/.logs/*,60020,1385442023669
drwxr-xr-x - root supergroup 0 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442025408
drwxr-xr-x - root supergroup 0 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442024055
drwxr-xr-x - root supergroup 0 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442028712
drwxr-xr-x - root supergroup 0 2013-11-26 15:00 /hbase_root/.logs/*,60020,1385442028696
[root@a01 hbase]# hadoop fs -ls /hbase_root/.oldlogs
[root@a01 hbase]# date
Wed Nov 27 15:00:10 CST 2013
[root@a01 hbase]# hadoop fs -ls /hbase_root/.logs
Found 5 items
drwxr-xr-x - root supergroup 0 2013-11-27 15:00 /hbase_root/.logs/*,60020,1385442023669
drwxr-xr-x - root supergroup 0 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442025408
drwxr-xr-x - root supergroup 0 2013-11-27 15:00 /hbase_root/.logs/*,60020,1385442024055
drwxr-xr-x - root supergroup 0 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442028712
drwxr-xr-x - root supergroup 0 2013-11-26 15:00 /hbase_root/.logs/*,60020,1385442028696
We can see that the log did roll: a second HLog file has appeared under the regionserver directories whose timestamps changed:
[root@a01 hbase]# hadoop fs -ls /hbase_root/.logs/*,60020,1385442023669
Found 2 items
-rw-r--r-- 3 root supergroup 986 2013-11-27 13:00 /hbase_root/.logs/*,60020,1385442023669/*%2C60020%2C1385442023669.1385528428894
-rw-r--r-- 3 root supergroup 0 2013-11-27 15:00 /hbase_root/.logs/*,60020,1385442023669/*%2C60020%2C1385442023669.1385535629603
[root@a01 hbase]# hadoop fs -ls /hbase_root/.logs/*,60020,1385442024055
Found 2 items
-rw-r--r-- 3 root supergroup 608 2013-11-26 13:00 /hbase_root/.logs/*,60020,1385442024055/*%2C60020%2C1385442024055.1385442024670
-rw-r--r-- 3 root supergroup 0 2013-11-27 15:00 /hbase_root/.logs/*,60020,1385442024055/*%2C60020%2C1385442024055.1385535626291
But the corresponding HFiles have not changed, which means the memstore was not flushed to an HFile:
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 964 2013-11-16 15:26 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_1/b6ab19771b4f4a7597afffd37ccf907b
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 706 2013-11-14 09:44 /hbase_root/t_lisa/e56ca60a2a54ae55b8631bbd21672e35/cf_2/3cc23abe61f74ccfa1e2240153487101
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1
Found 1 items
-rw-r--r-- 3 root supergroup 795 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_1/e541b8a04f224e869166ee43783bd8d0
[root@a01 hbase]# hadoop fs -ls /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2
Found 1 items
-rw-r--r-- 3 root supergroup 736 2013-11-14 09:39 /hbase_root/t_lisa1/787ce41dabb55075935e7060583ae6af/cf_2/6eb83725b6b042958067d900761ef613
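In other words, the HLog stays in the .logs directory; a file is moved to the .oldlogs directory only after every edit in it has been flushed to an HFile. Rolling the log does not trigger a flush. When the number of HLog files under a regionserver exceeds a certain limit, a mechanism forces the memstore to be flushed to HFiles.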
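The two rules in this conclusion can be written down as a small Python sketch. It is an illustration of the behaviour described here, not HBase code: the functions `can_archive` and `regions_to_force_flush` are hypothetical, and the `max_logs` value of 1 is chosen only to make the example trigger (in HBase of this era the limit is, as far as I know, configured through `hbase.regionserver.maxlogs`, with a much larger default).

```python
def can_archive(log_max_seq_by_region, flushed_seq_by_region):
    """A log may move to .oldlogs once every region's edits in it are flushed."""
    return all(
        flushed_seq_by_region.get(region, -1) >= max_seq
        for region, max_seq in log_max_seq_by_region.items()
    )


def regions_to_force_flush(logs, flushed_seq_by_region, max_logs):
    """logs is an oldest-first list of {region: highest sequence id in that log}.

    Returns the regions that keep the oldest log alive once there are too
    many logs; flushing them lets the oldest log be archived.
    """
    if len(logs) <= max_logs:
        return []
    oldest = logs[0]
    return sorted(
        region
        for region, max_seq in oldest.items()
        if flushed_seq_by_region.get(region, -1) < max_seq
    )


T_LISA1 = "787ce41dabb55075935e7060583ae6af"
T_LISA = "fa8f6eb2a0bcb54e443f9bfc2693768d"

# Highest sequence ids per region in the rolled log, taken from the dump above.
old_log = {T_LISA1: 2316028, T_LISA: 2316031}
new_log = {}  # the freshly rolled log is still empty (size 0 above)

# t_lisa's region was flushed manually; t_lisa1 still has its older HFile.
flushed = {T_LISA: 2316031, T_LISA1: 2309244}

print(can_archive(old_log, flushed))
print(regions_to_force_flush([old_log, new_log], flushed, max_logs=1))
```

With these values the old log cannot be archived, and the only region holding it back is the t_lisa1 region, which matches the listing: the file is still under .logs and .oldlogs is empty.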
Source: ITPUB blog, http://blog.itpub.net/51862/viewspace-1061291/. Please credit the source when reposting.