Hash-Partitioned Tables and Data Distribution

Latest recommended article published on 2024-08-01 00:57:52

cuixie2370

Create a hash-partitioned table with 5 partitions:

create table test01
partition by hash(object_id)
(partition p1,
partition p2,
partition p3,
partition p4,
partition p5)
as select * from sys.dba_objects;

Check the row count in each partition:

select count(*) from test01 partition (p1);
6746
select count(*) from test01 partition (p2);
13550
select count(*) from test01 partition (p3);
13764
select count(*) from test01 partition (p4);
13445
select count(*) from test01 partition (p5);
6777

Create a hash-partitioned table with 8 partitions (2^3):

create table test02
partition by hash(object_id)
(partition p1,
partition p2,
partition p3,
partition p4,
partition p5,
partition p6,
partition p7,
partition p8)
as select * from sys.dba_objects;

Check the row count in each partition (evenly distributed):

select count(*) from test02 partition (p1);
6750
select count(*) from test02 partition (p2);
6861
select count(*) from test02 partition (p3);
6891
select count(*) from test02 partition (p4);
6682
select count(*) from test02 partition (p5);
6778
select count(*) from test02 partition (p6);
6689
select count(*) from test02 partition (p7);
6874
select count(*) from test02 partition (p8);
6766

Add hash partition p6 to test01:
alter table test01 add partition p6;

Now look at the data distribution of test01 again:

select count(*) from test01 partition (p1); -- unchanged
6746
select count(*) from test01 partition (p2); -- 6689 fewer
6861
select count(*) from test01 partition (p3); -- unchanged
13764
select count(*) from test01 partition (p4); -- unchanged
13445
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- exactly 6689
6689

Add hash partition p7 to test01:
alter table test01 add partition p7;

Now look at the data distribution of test01 again (the comparison below is relative to the state after adding p6):

select count(*) from test01 partition (p1); -- unchanged
6746
select count(*) from test01 partition (p2); -- unchanged
6861
select count(*) from test01 partition (p3); -- 6874 fewer
6890
select count(*) from test01 partition (p4); -- unchanged
13445
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- unchanged
6689
select count(*) from test01 partition (p7); -- exactly 6874
6874

Add hash partition p8 to test01:
alter table test01 add partition p8;

Now look at the data distribution of test01 again (the comparison below is relative to the state after adding p7):

select count(*) from test01 partition (p1); -- unchanged
6746
select count(*) from test01 partition (p2); -- unchanged
6861
select count(*) from test01 partition (p3); -- unchanged
6890
select count(*) from test01 partition (p4); -- 6765 fewer
6680
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- unchanged
6689
select count(*) from test01 partition (p7); -- unchanged
6874
select count(*) from test01 partition (p8); -- exactly 6765
6765

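From the way the data was split above, you can get a rough idea of how Oracle
spreads rows across hash partitions, and why Oracle recommends that the number
of hash partitions be a power of 2: with 5, 6, or 7 partitions some partitions
hold about twice as many rows as others, and the counts even out only at 8.

You can also see that once test01 reached 8 partitions (2^3) the rows were
evenly distributed, and the counts are close to (but not identical to) those of
test02, which was created with 8 partitions in one step. The small differences
are consistent with sys.dba_objects having changed between the two create table
statements: test01 holds 54,282 rows in total, while test02 holds 54,291.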
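
The counts so far are consistent with a linear-hashing style of bucket assignment, which is the kind of algorithm Oracle's documentation describes for hash partitioning. The sketch below is only a model under stated assumptions: `partition_for` and `distribution` are hypothetical helper names, and the integers 0 to 7999 stand in for hash values, because the post does not show Oracle's internal hash function. It illustrates why 5 partitions give a 1:2:2:2:1 pattern while 8 partitions come out even.

```python
from collections import Counter


def partition_for(hash_value: int, num_partitions: int) -> int:
    """Map a hash value to a 1-based partition number (linear-hashing model).

    Assumption: an illustrative model, not Oracle's actual implementation.
    """
    # Smallest power of two that is >= num_partitions.
    size = 1
    while size < num_partitions:
        size *= 2
    bucket = hash_value % size
    # Buckets that have no partition yet fold back onto the lower half.
    if bucket >= num_partitions:
        bucket -= size // 2
    return bucket + 1


def distribution(num_partitions: int, hash_values) -> list:
    """Row count per partition, in order p1..pN."""
    counts = Counter(partition_for(h, num_partitions) for h in hash_values)
    return [counts[p] for p in range(1, num_partitions + 1)]


# Stand-in hash values: evenly spread integers (an assumption).
stand_in_hashes = range(8000)

five = distribution(5, stand_in_hashes)
eight = distribution(8, stand_in_hashes)
print(five)   # [1000, 2000, 2000, 2000, 1000]
print(eight)  # [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000]
```

In this model p2, p3 and p4 each absorb two buckets when there are 5 partitions, which matches the roughly doubled counts (13550, 13764, 13445) measured for test01.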

Next, a quick test of what happens when we keep adding partitions beyond 8, to
9, 10, 11, and 16. We continue to use the test01 table.

alter table test01 add partition p9 ;

Now look at the data distribution of test01 again (the comparison below is relative to the state after adding p8):

select count(*) from test01 partition (p1); -- 3390 fewer
3356
select count(*) from test01 partition (p2); -- unchanged
6861
select count(*) from test01 partition (p3); -- unchanged
6890
select count(*) from test01 partition (p4); -- unchanged
6680
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- unchanged
6689
select count(*) from test01 partition (p7); -- unchanged
6874
select count(*) from test01 partition (p8); -- unchanged
6765
select count(*) from test01 partition (p9); -- exactly 3390
3390

alter table test01 add partition p10 ;

Now look at the data distribution of test01 again (the comparison below is relative to the state after adding p9):

select count(*) from test01 partition (p1); -- unchanged
3356
select count(*) from test01 partition (p2); -- 3443 fewer
3418
select count(*) from test01 partition (p3); -- unchanged
6890
select count(*) from test01 partition (p4); -- unchanged
6680
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- unchanged
6689
select count(*) from test01 partition (p7); -- unchanged
6874
select count(*) from test01 partition (p8); -- unchanged
6765
select count(*) from test01 partition (p9); -- unchanged
3390
select count(*) from test01 partition (p10); -- exactly 3443
3443

alter table test01 add partition p11 ;

Now look at the data distribution of test01 again (the comparison below is relative to the state after adding p10):

select count(*) from test01 partition (p1); -- unchanged
3356
select count(*) from test01 partition (p2); -- unchanged
3418
select count(*) from test01 partition (p3); -- 3444 fewer
3446
select count(*) from test01 partition (p4); -- unchanged
6680
select count(*) from test01 partition (p5); -- unchanged
6777
select count(*) from test01 partition (p6); -- unchanged
6689
select count(*) from test01 partition (p7); -- unchanged
6874
select count(*) from test01 partition (p8); -- unchanged
6765
select count(*) from test01 partition (p9); -- unchanged
3390
select count(*) from test01 partition (p10); -- unchanged
3443
select count(*) from test01 partition (p11); -- exactly 3444
3444

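OK, you do not really need this many tests to see the pattern. The reason for
running them anyway was to look, statistically, at how many rows move on each
split: is it a half-and-half split, or some other algorithm? Every split above
came out close to half (for example, p1 went from 6746 rows to 3356 + 3390), but
with my limited knowledge I could not work out the exact rule from these numbers
alone; that still needs more in-depth reference material.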
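
The add partition results above do suggest a simple rule for which partition gives up rows: the new partition number minus the largest power of two below it. The sketch below encodes that guess and checks it against the six splits measured in this post. `split_source` is a hypothetical helper name, and the rule is inferred from these measurements only, not taken from Oracle documentation.

```python
def split_source(new_partition: int) -> int:
    """Guess the 1-based partition that is split when `new_partition` is added.

    Assumption: rule inferred from the counts in this post, not an Oracle API.
    """
    if new_partition < 2:
        raise ValueError("there must be at least one partition to split")
    # Largest power of two strictly below new_partition.
    half = 1
    while half * 2 < new_partition:
        half *= 2
    return new_partition - half


# Splits observed in this post: new partition -> partition that lost rows.
observed = {6: 2, 7: 3, 8: 4, 9: 1, 10: 2, 11: 3}
matches = all(split_source(new) == src for new, src in observed.items())
print(matches)  # True

# Prediction for the partitions added when going from 11 to 16.
predicted = [split_source(n) for n in range(12, 17)]
print(predicted)  # [4, 5, 6, 7, 8]
```

The prediction for p12 to p16 agrees with the 16-partition counts listed further down, where p4 to p8 are exactly the partitions whose counts roughly halve.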

Now we add the remaining partitions in one go, up to 16, and look at the distribution. The rows are clearly evenly distributed again.

select count(*) from test01 partition (p1);
3356
select count(*) from test01 partition (p2);
3418
select count(*) from test01 partition (p3);
3446
select count(*) from test01 partition (p4);
3322
select count(*) from test01 partition (p5);
3427
select count(*) from test01 partition (p6);
3367
select count(*) from test01 partition (p7);
3392
select count(*) from test01 partition (p8);
3421
select count(*) from test01 partition (p9);
3390
select count(*) from test01 partition (p10);
3443
select count(*) from test01 partition (p11);
3444
select count(*) from test01 partition (p12);
3358
select count(*) from test01 partition (p13);
3350
select count(*) from test01 partition (p14);
3322
select count(*) from test01 partition (p15);
3482
select count(*) from test01 partition (p16);
3344
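
The 16-partition counts can be cross-checked against the 8-partition counts with a little arithmetic. The snippet below uses only numbers copied from this post; the variable names are made up for the illustration. It checks that p1 and p9, p2 and p10, and so on together hold exactly what p1 to p8 held before, and that no rows were lost.

```python
# Counts copied from this post: test01 with 8 partitions, then with 16.
counts_8 = [6746, 6861, 6890, 6680, 6777, 6689, 6874, 6765]
counts_16 = [3356, 3418, 3446, 3322, 3427, 3367, 3392, 3421,
             3390, 3443, 3444, 3358, 3350, 3322, 3482, 3344]

# Partition i and partition i + 8 together should equal the old partition i.
pair_sums = [counts_16[i] + counts_16[i + 8] for i in range(8)]
print(pair_sums == counts_8)  # True

# The total row count is unchanged by the splits.
print(sum(counts_16))  # 54282

# Gap between the largest and smallest partition, relative to the mean.
mean_16 = sum(counts_16) / len(counts_16)
spread = (max(counts_16) - min(counts_16)) / mean_16
print(round(spread, 3))
```

The gap between the fullest and emptiest partition is a few percent of the mean, which is what "evenly distributed" means here in practice.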

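1. Hash partitioning has no split partition operation; you can only add
partition. You cannot control how rows are distributed across the partitions,
and you cannot explicitly choose which partition gets split. Still, the examples
above show that the splits follow a regular pattern when partitions are added.

2. Once a reasonable number of partitions has been chosen, the distribution
depends entirely on the data in the table. For some data sets, hash partitioning
may not work well. The more random the partitioning key values and the larger
the sample, the better the result, because the rows are more likely to spread
evenly across the buckets.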

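3. For a hash-partitioned table with m partitions, the final distribution is
similar (not exactly the same) regardless of how the table got there (for
example, first creating a hash-partitioned table with n partitions (n < m) and
then adding partitions until there are m).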

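4. drop partition is not allowed on a hash-partitioned table (the number of hash
partitions is reduced with coalesce partition instead):
alter table test01 drop partition p1;
ORA-14255: table is not partitioned by Range, Composite Range or List method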
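
Points 2 and 3 above can be illustrated with a small model. This is a sketch under assumptions, not Oracle's implementation: `partition_for`, `grow` and `stand_in_hash` are hypothetical helpers, the bucket mapping is a linear-hashing style model, and `zlib.crc32` merely stands in for Oracle's undisclosed hash function. The first part grows a table from 5 to 8 partitions one split at a time and compares the result with creating 8 partitions directly; the second part shows what heavily repeated key values do to the distribution.

```python
import zlib
from collections import Counter


def partition_for(hash_value, num_partitions):
    """1-based partition for a hash value (linear-hashing model, an assumption)."""
    size = 1
    while size < num_partitions:
        size *= 2
    bucket = hash_value % size
    if bucket >= num_partitions:
        bucket -= size // 2
    return bucket + 1


def grow(assignment, old_count, new_count):
    """Add partitions one by one, rehashing only the rows of the split partition."""
    result = dict(assignment)
    for n in range(old_count + 1, new_count + 1):
        half = 1
        while half * 2 < n:
            half *= 2
        source = n - half  # the partition that gives up rows
        for h, p in list(result.items()):
            if p == source:
                result[h] = partition_for(h, n)
    return result


# Point 3: growing 5 -> 8 ends in the same place as creating 8 directly.
hashes = range(8000)  # stand-in hash values
grown = grow({h: partition_for(h, 5) for h in hashes}, 5, 8)
direct = {h: partition_for(h, 8) for h in hashes}
print(grown == direct)  # True


# Point 2: repeated key values always land in the same partition.
def stand_in_hash(key):
    return zlib.crc32(str(key).encode())


skewed_keys = [42] * 9000 + list(range(1000))
skewed_counts = Counter(partition_for(stand_in_hash(k), 8) for k in skewed_keys)
largest_share = max(skewed_counts.values()) / len(skewed_keys)
print(largest_share >= 0.9)  # True
```

In this model the final placement is identical for identical input rows, so differences like those between test01 and test02 would have to come from the source data; that is an inference from the model, not something the post verifies. The second part holds for any hash function: rows with the same key value cannot be spread over several partitions.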

From the ITPUB blog, link: http://blog.itpub.net/35489/viewspace-684342/. If you repost this article, please credit the source; otherwise legal action may be taken.