Introduction to Mycat Sharding Rules

养歌

Last modified 2023-06-09 10:25:16

First published 2022-04-01 15:38:28

Article link: https://blog.csdn.net/wuhuayangs/article/details/123895682

Configuration overview

In rule.xml, the tableRule tag defines a table's sharding rule:

<tableRule name="rule">
	<rule>
		<columns>id</columns>
		<algorithm>rule1</algorithm>
	</rule>
</tableRule>

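name: the name of the sharding rule; it must be unique.
columns: the column of the physical MySQL table used as the sharding key.
algorithm: the sharding algorithm to use. It refers to a function defined by a function tag, and the function itself is implemented in Java code.

In rule.xml, the function tag defines the sharding algorithm: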

<function name="rule1" class="io.mycat.route.function.PartitionByHashMod" >
	<property name="count">5</property>
</function>

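name: the name of the algorithm, referenced by the algorithm tag.
class: the fully qualified name of the class that implements the routing algorithm.
property: the properties required by the specific algorithm.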
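
The link between the two tags can be checked mechanically. The following is a minimal sketch, not part of Mycat: it parses a simplified stand-in for rule.xml (the plain `<rules>` root and the helper name `resolve_rules` are assumptions made for illustration) and resolves each tableRule to the function it references.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for rule.xml; the plain <rules> root is an assumption.
RULE_XML = """
<rules>
  <tableRule name="rule">
    <rule>
      <columns>id</columns>
      <algorithm>rule1</algorithm>
    </rule>
  </tableRule>
  <function name="rule1" class="io.mycat.route.function.PartitionByHashMod">
    <property name="count">5</property>
  </function>
</rules>
"""


def resolve_rules(xml_text):
    """Map each tableRule name to its sharding column, function class, and properties."""
    root = ET.fromstring(xml_text)
    # Index every <function> by its name attribute.
    functions = {
        f.get("name"): {
            "class": f.get("class"),
            "properties": {p.get("name"): p.text for p in f.findall("property")},
        }
        for f in root.findall("function")
    }
    resolved = {}
    for table_rule in root.findall("tableRule"):
        rule = table_rule.find("rule")
        algorithm = rule.findtext("algorithm").strip()
        if algorithm not in functions:
            raise KeyError(f"algorithm {algorithm!r} has no matching <function>")
        resolved[table_rule.get("name")] = {
            "columns": rule.findtext("columns").strip(),
            **functions[algorithm],
        }
    return resolved


RESOLVED = resolve_rules(RULE_XML)
print(RESOLVED["rule"]["class"])
```

A tableRule whose algorithm value does not match any function name fails fast here, which is the kind of mistake that is easy to make when editing rule.xml by hand.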

Common sharding algorithms

Enumeration sharding - PartitionByFileMap


<tableRule name="sharding-by-intfile">
    <rule>
        <columns>name</columns>
        <algorithm>hash-int</algorithm>
    </rule>
</tableRule>
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap">
    <property name="mapFile">partition-hash-int.txt</property>
    <property name="type">1</property>
    <property name="defaultNode">0</property>
</function>

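The mapFile (partition-hash-int.txt in this example) is placed in the /srv/mycat/conf directory, alongside schema.xml and rule.xml.
type defaults to 0. A value of 0 means the enumeration keys in mapFile are Integers; any non-zero value means they are Strings.
defaultNode: a value below 0 means no default node is set; a value of 0 or greater sets the default node to that index (node indexes start at 0, so 0 is the first node). The default node handles unrecognized enumeration values: with a default node configured, such values are routed there; without one, an unrecognized value raises an error.

partition-hash-int.txt configuration: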

10000=0
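
The lookup described above can be sketched in a few lines. This is an illustration under stated assumptions, not Mycat's implementation: the function names are made up, and the second map entry (10010=1) is a hypothetical addition so that more than one node appears.

```python
def load_enum_map(text, key_type=0):
    """Parse 'value=nodeIndex' lines; key_type 0 means Integer keys, non-zero means String keys."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, node = line.split("=", 1)
        key = key.strip()
        mapping[int(key) if key_type == 0 else key] = int(node)
    return mapping


def route_enum(value, mapping, key_type=0, default_node=-1):
    """Return the node index for an enumeration value, falling back to default_node."""
    key = int(value) if key_type == 0 else str(value)
    if key in mapping:
        return mapping[key]
    if default_node >= 0:
        return default_node
    raise ValueError(f"unrecognized enumeration value {value!r} and no default node")


# 10000=0 comes from the article; 10010=1 is a hypothetical extra entry.
ENUM_MAP = load_enum_map("10000=0\n10010=1")
print(route_enum(10010, ENUM_MAP))
print(route_enum(99999, ENUM_MAP, default_node=0))
```

With `default_node` left below 0, the second call would raise instead of routing to node 0, which mirrors the error case described above.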

Range convention - AutoPartitionByLong

Suitable when you have planned in advance which shard each range of the sharding column belongs to.

<tableRule name="auto-sharding-long">
    <rule>
        <columns>user_id</columns>
        <algorithm>rang-long</algorithm>
    </rule>
</tableRule>
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong">
    <property name="mapFile">auto-sharding-long.txt</property>
    <property name="defaultNode">0</property>
</function>

auto-sharding-long.txt configuration:

2000001-4000000=1
0-2000000=0
4000001-8000000=2

defaultNode: the node used for values that fall outside every configured range.
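
A minimal sketch of the range lookup, using the three ranges listed above. The helper names are assumptions for illustration, and the parser only handles plain integers.

```python
def load_ranges(text):
    """Parse 'start-end=nodeIndex' lines into (start, end, node) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        span, node = line.split("=", 1)
        start, end = span.split("-", 1)
        ranges.append((int(start), int(end), int(node)))
    return ranges


def route_range(value, ranges, default_node=-1):
    """Return the node whose inclusive range contains value, else default_node."""
    value = int(value)
    for start, end, node in ranges:
        if start <= value <= end:
            return node
    if default_node >= 0:
        return default_node
    raise ValueError(f"{value} is outside every configured range")


RANGES = load_ranges("2000001-4000000=1\n0-2000000=0\n4000001-8000000=2")
print(route_range(2000001, RANGES))
print(route_range(9000000, RANGES, default_node=0))
```

The order of the lines in the file does not matter for this lookup, because each range is checked independently.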

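Hash modulo - PartitionByHashMod

This algorithm mainly targets non-integer columns: a hash function turns the value into an integer, which is then taken modulo the shard count (count) to get the shard position.

The distribution is less even than with simple modulo, because identical values always produce the same hash and therefore the same shard. For example, take 100 order records, 90 of which belong to user A and the rest to users B and C. If you shard by hash modulo on the user name column, all 90 of A's records land on one shard and only the other 10 can go elsewhere, so storage is skewed.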

<tableRule name="users">
	<rule>
		<columns>users_Fname</columns>
		<algorithm>hash-mod</algorithm>
	</rule>
</tableRule>
<function name="hash-mod" class="io.mycat.route.function.PartitionByHashMod">
<!-- Number of database nodes to shard across; required, otherwise sharding cannot work -->
	<property name="count">3</property>
</function>

count is the number of dataNodes. It must be configured; otherwise computing the dataNode on insert throws an exception: ArithmeticException: BigInteger: modulus not positive.
Formula: hash(sharding column) % shard count
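
The skew in the 100-order example can be reproduced with a short sketch. CRC32 is used here only as a stable stand-in hash; it is an assumption for illustration and not the hash that PartitionByHashMod applies.

```python
from collections import Counter
import zlib


def hash_mod(value, count):
    """Hash a value to an integer, then take it modulo the shard count."""
    if count <= 0:
        raise ValueError("count must be a positive shard count")
    # CRC32 is a stand-in hash chosen because it is stable across runs.
    return zlib.crc32(str(value).encode("utf-8")) % count


# 90 orders for user A, 5 each for B and C, as in the example above.
orders = ["A"] * 90 + ["B"] * 5 + ["C"] * 5
distribution = Counter(hash_mod(name, 3) for name in orders)
print(distribution)
```

Whatever shard A hashes to, that shard holds at least 90 of the 100 rows, because equal values always produce equal hashes.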

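Simple modulo - PartitionByMod

Suitable for sharding on an integer column. The remainder is the index of the shard. For example, a record with id=100 gives 100 mod 3 = 1, so it is stored on shard db2 (shard indexes start at 0).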

<tableRule name="mod-long-2_id">
    <rule>
        <columns>id</columns>
        <algorithm>mod-long</algorithm>
    </rule>
</tableRule>
<function name="mod-long" 
          class="io.mycat.route.function.PartitionByMod">
    <property name="count">2</property>
</function>

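count: the modulus, that is, the number of shards.
Data is distributed evenly. The algorithm applies to integer columns and cannot be used on non-integer columns.
Formula: sharding column % shard count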
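
As a quick check of the formula, the sketch below routes the id=100 example and counts how ten consecutive ids spread over count=2. The function name is an assumption for illustration.

```python
from collections import Counter


def route_mod(value, count):
    """Simple modulo sharding: the remainder is the zero-based shard index."""
    return int(value) % count


print(route_mod(100, 3))
# Consecutive ids alternate between the shards, so the split is even.
spread = Counter(route_mod(i, 2) for i in range(10))
print(spread)
```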

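String-range modulo sharding - PartitionByPrefixPattern

Sharding is normally computed from a column value, and it can also be computed from part of a string. For example, to shard on the first three characters of the string ABCDEF, the calculation proceeds as follows:

Steps:

First, sum the ASCII codes of ABC: 65 + 66 + 67 = 198.
Next, take the sum modulo a configured base: 198 % 128 = 70.
Finally, match the remainder against the ranges in mapFile (0-63=0, 64-127=1), which gives node index 1, that is, shard db2.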

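The algorithm sums the ASCII codes of the specified prefix, takes the sum modulo the configured base, and then uses the range-to-node-index mapping in mapFile to find the data node. mapFile must therefore cover every possible remainder; otherwise no data node is found and an error is raised.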
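
The three steps can be sketched as follows. The function name and the hard-coded ranges are assumptions for illustration, and `prefix_length` defaults to 3 to match the ABCDEF walkthrough.

```python
def route_prefix_pattern(value, ranges, pattern_value=128, prefix_length=3):
    """Sum the character codes of the prefix, take the modulus, then look up the range."""
    prefix = str(value)[:prefix_length]
    total = sum(ord(ch) for ch in prefix)
    remainder = total % pattern_value
    for start, end, node in ranges:
        if start <= remainder <= end:
            return node
    raise ValueError(f"remainder {remainder} is not covered by mapFile")


# Ranges from the walkthrough: 0-63=0 and 64-127=1.
PREFIX_RANGES = [(0, 63, 0), (64, 127, 1)]
print(route_prefix_pattern("ABCDEF", PREFIX_RANGES))
print(route_prefix_pattern("ABCDEF", PREFIX_RANGES, prefix_length=2))
```

With a prefix length of 2 the sum is 65 + 66 = 131 and 131 % 128 = 3, so the same string moves to node 0. The prefix length is part of the routing contract and cannot be changed without relocating data.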

<tableRule name="sharding-by-prefix-pattern_login_name">
    <rule>
        <columns>login_name</columns>
        <algorithm>sharding-by-prefix-pattern</algorithm>
    </rule>
</tableRule>
<function name="sharding-by-prefix-pattern"
          class="io.mycat.route.function.PartitionByPrefixPattern">
    <property name="mapFile">prefix-partition-pattern.txt</property>
    <property name="patternValue">128</property>
    <property name="prefixLength">2</property>
</function>

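patternValue: the modulus.
prefixLength: how many leading characters of the sharding column take part in the calculation (2 in this configuration; the ABCDEF walkthrough above uses 3).

Wildcard modulo - PartitionByPattern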

<tableRule name="sharding-by-pattern">
    <rule>
        <columns>user_id</columns>
        <algorithm>sharding-by-pattern</algorithm>
    </rule>
</tableRule>
<function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern">
    <property name="patternValue">256</property>
    <property name="defaultNode">2</property>
    <property name="mapFile">partition-pattern.txt</property>
</function>

patternValue: the modulus.
defaultNode: the default node. If it is not configured, it defaults to 0, the first node.

partition-pattern.txt configuration:

# id partition range start-end ,data node index
###### first host configuration
1-32=0
33-64=1
65-96=2

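1-32 is a range of values of id % 256. A remainder in 1-32 maps to node index 0, the first partition, and so on for the other ranges. If id is not an integer, the record goes to the partition configured as defaultNode.

Programmatic assignment - PartitionDirectBySubString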
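
A minimal sketch of this routing, using the three ranges listed above. The function name is an assumption for illustration, and sending a remainder that no range covers to the default node is also an assumption; the text above only states that behavior for non-integer ids.

```python
def route_pattern(value, ranges, pattern_value=256, default_node=2):
    """Route by value % pattern_value; non-integer values go to default_node."""
    text = str(value)
    if not (text.isascii() and text.isdigit()):
        return default_node
    remainder = int(text) % pattern_value
    for start, end, node in ranges:
        if start <= remainder <= end:
            return node
    # Assumption: remainders outside every range also use the default node.
    return default_node


PATTERN_RANGES = [(1, 32, 0), (33, 64, 1), (65, 96, 2)]
print(route_pattern(1, PATTERN_RANGES))
print(route_pattern(300, PATTERN_RANGES))
print(route_pattern("abc", PATTERN_RANGES))
```

An id of 300 gives 300 % 256 = 44, which falls in 33-64 and therefore maps to node index 1.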

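The extracted part of the column must be numeric. startIndex is where the substring begins, size is the number of characters to extract, partitionCount is the number of partitions, and defaultPartition is the default node. For example, with id = 05-100000002 and this configuration, the algorithm starts at startIndex = 0 and extracts size = 2 characters, giving 05, so the record goes to partition 5. If no partition can be derived, the record goes to defaultPartition.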

<tableRule name="sharding-by-substring">
    <rule>
        <columns>user_id</columns>
        <algorithm>sharding-by-substring</algorithm>
    </rule>
</tableRule>
<function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString">
    <property name="startIndex">0</property> <!-- zero-based -->
    <property name="size">2</property>
    <property name="partitionCount">8</property>
    <property name="defaultPartition">0</property>
</function>
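
The substring rule can be sketched as below. The function name is an assumption for illustration, and wrapping the extracted number with a modulo on partitionCount is also an assumption, made so that the result always stays within the configured number of partitions.

```python
def route_substring(value, start_index=0, size=2, partition_count=8, default_partition=0):
    """Extract a fixed-width numeric substring and use it as the partition number."""
    piece = str(value)[start_index:start_index + size]
    try:
        number = int(piece, 10)
    except ValueError:
        # The substring is empty or not numeric: fall back to the default partition.
        return default_partition
    # Assumption: wrap around so the result stays below partition_count.
    return number % partition_count if partition_count > 0 else number


print(route_substring("05-100000002"))
print(route_substring("ab-100000002"))
```

Because the partition is encoded in the key itself, the application decides the placement when it generates the id, which is why this rule is described as programmatic.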

Consistent hashing - PartitionByMurmurHash

Consistent hashing effectively addresses the problem of scaling out distributed data.

<tableRule name="sharding-by-murmur">  
    <rule>
      <columns>user_id</columns>        
      <algorithm>murmur</algorithm>
    </rule>
</tableRule>

<function name="murmur" class="io.mycat.route.function.PartitionByMurmurHash">
    <property name="seed">0</property>
    <property name="count">2</property>
    <property name="virtualBucketTimes">160</property>
    <property name="weightMapFile">weightMapFile</property>
    <property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
</function>

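seed: defaults to 0.
count: the number of database nodes to shard across. It must be specified; otherwise sharding cannot work.
virtualBucketTimes: the number of virtual nodes each physical database node is mapped to. The default is 160, so there are 160 times as many virtual nodes as physical nodes.
weightMapFile: node weights; a node without an explicit weight defaults to 1. The file uses the properties format, with the node index (an integer from 0 to count-1) as the key and the node weight as the value. Every weight must be a positive integer; any other value is replaced by 1.
bucketMapPath: used in testing to observe how virtual nodes are distributed over physical nodes. If this property is set, the mapping from each virtual node's murmur hash value to its physical node is written to this file, one entry per line. There is no default; if it is not set, nothing is written.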
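
To show why consistent hashing helps with scaling out, here is a minimal ring sketch with virtual nodes. It is an illustration under stated assumptions rather than Mycat's implementation: MD5 truncated to 64 bits stands in for MurmurHash (which the Python standard library does not provide), and the class name, virtual node labels, and sample keys are hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    # Stand-in for MurmurHash: the first 8 bytes of MD5 as an unsigned integer.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, count, virtual_bucket_times=160, weights=None):
        weights = weights or {}
        points = {}
        for node in range(count):
            # A node with weight w gets w times as many virtual nodes.
            for i in range(virtual_bucket_times * weights.get(node, 1)):
                points[_hash(f"SHARD-{node}-NODE-{i}")] = node
        self._hashes = sorted(points)
        self._nodes = [points[h] for h in self._hashes]

    def route(self, key):
        """Return the node that owns the first virtual node at or after the key's hash."""
        idx = bisect.bisect_left(self._hashes, _hash(str(key)))
        if idx == len(self._hashes):
            idx = 0  # wrap around the ring
        return self._nodes[idx]


keys = [f"user-{i}" for i in range(2000)]
before, after = HashRing(2), HashRing(3)
moved = [k for k in keys if before.route(k) != after.route(k)]
print(len(moved), "of", len(keys), "keys moved")
```

Growing from 2 to 3 nodes relocates only the keys that the new node takes over; every other key stays where it was. With plain modulo sharding, changing count would remap most keys.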

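Range-modulo sharding - PartitionByRangeMod

Range-modulo sharding combines the strengths of range sharding and modulo sharding. Like range sharding, data in an existing range never has to be migrated; like modulo sharding, hot data is spread evenly. Shard groups are divided by range, which keeps range queries practical, while modulo is used inside each group so that the data within a group stays balanced and hot spots are avoided. It is best to plan the number of shards in advance. When scaling out, add whole shard groups so that the data in existing groups does not need to move.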

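<!-- The tableRule name was lost from the original listing; auto-sharding-rang-mod is an assumed name. -->
<tableRule name="auto-sharding-rang-mod">
    <rule>
        <columns>id</columns>
        <algorithm>rang-mod</algorithm>
    </rule>
</tableRule>
<function name="rang-mod" class="io.mycat.route.function.PartitionByRangeMod">
    <property name="mapFile">partition-range-mod.txt</property>
    <property name="defaultNode">21</property>
</function>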

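defaultNode: the default node. A value below 0 means no default node is set; a value of 0 or greater sets it. Values outside the configured ranges are routed to the default node.

partition-range-mod.txt configuration: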

# Each range below is one shard group; the number after = is how many shards that group owns.
# range start-end ,data node group size
0-200M=5 // this group has 5 shard nodes
200M1-400M=1
400M1-600M=4
600M1-800M=4
800M1-1000M=6

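Note: in the example above, the range 0-200M is spread over 5 shards. Each line has the form start-end=number of shards in that shard group. If values go beyond the configured ranges, add another shard group.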
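
The two-level routing can be sketched as follows: find the shard group by range, then pick a shard inside the group by modulo. This is an illustration, not Mycat's code. The helper names are made up, and reading M as 10000 and K as 1000 (so that 200M1 means 2000001) is an assumption about the file format.

```python
def parse_bound(text):
    """Parse a bound such as '200M' or '200M1'."""
    # Assumption: K = 1000 and M = 10000, so '200M1' is 200 * 10000 + 1.
    text = text.strip().upper()
    for suffix, factor in (("M", 10_000), ("K", 1_000)):
        if suffix in text:
            head, tail = text.split(suffix, 1)
            return int(head) * factor + (int(tail) if tail else 0)
    return int(text)


def load_groups(text):
    """Parse 'start-end=groupSize' lines into (start, end, size) tuples."""
    groups = []
    for line in text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop trailing // comments
        if not line or line.startswith("#"):
            continue
        span, size = line.split("=", 1)
        start, end = span.split("-", 1)
        groups.append((parse_bound(start), parse_bound(end), int(size)))
    return groups


def route_range_mod(value, groups, default_node=-1):
    """Pick the group by range, then the shard inside the group by modulo."""
    value = int(value)
    first_node = 0  # index of the first shard of the current group
    for start, end, size in groups:
        if start <= value <= end:
            return first_node + value % size
        first_node += size
    if default_node >= 0:
        return default_node
    raise ValueError(f"{value} is outside every configured range")


GROUPS = load_groups(
    "0-200M=5 // 5 shard nodes\n200M1-400M=1\n400M1-600M=4\n600M1-800M=4\n800M1-1000M=6"
)
print(route_range_mod(7, GROUPS))
print(route_range_mod(4_000_001, GROUPS))
```

Under these assumptions the five groups own node indexes 0-4, 5, 6-9, 10-13, and 14-19. Appending a new group only adds node indexes after 19, so rows in the existing groups keep their placement.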
