环境准备
我们可以用真实的物理机器或云上的虚拟主机来做实验,充分体现自主的分库分表的乐趣,在这里我找了三台机器,具体信息如下所示:
机器名 | IP地址 | 配置 |
Proxy1 | 192.168.1.6 | 8 Core,GB Memory,美团云虚拟机 |
group1 | 192.168.1.6 | 8 Core,GB Memory,美团云虚拟机 |
group2 | 192.168.1.52 | 8 Core,GB Memory,美团云虚拟机 |
1 建表
-
创建表时,要根据分区中指定的表名和位置去创建相应的物进表,创建的表中应当包括分区字段,字段名称和类型都要匹配上。
在第一台MySQL服务器(group1)上创建如下表:Create table my_range_0 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_range_1 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_list_0 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_list_1 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_hash_0 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_hash_1 (id int not null primary key, col2 int, col3 varchar(32));
- group2上面建立如下的表
-
Create table my_range_2 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_range_3 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_list_2 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_list_3 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_list_4 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_hash_2 (id int not null primary key, col2 int, col3 varchar(32));
Create table my_hash_3 (id int not null primary key, col2 int, col3 varchar(32));
2 可以看到,两台MySQL服务器上都没有创建OneProxy里的三张虚拟表:“my_range”、“my_list”和“my_hash”。接下来我们将三张分区表的信息用JSON格式保存到文本文件(“part.txt”)中,如下所示:
- 如:
- [root@hostnfsd :/usr/local/oneproxy]$ more part.txt
[
{
"table" : "my_range",
"pkey" : "id",
"type" : "int",
"method" : "range",
"partitions":
[
{ "name" : "my_range_0", "group": "group1", "value" : "100000" },
{ "name" : "my_range_1", "group": "group1", "value" : "200000" },
{ "name" : "my_range_2", "group": "group2", "value" : "300000" },
{ "name" : "my_range_3", "group": "group2", "value" : null }
]
},
{
"table" : "my_hash",
"pkey" : "id",
"type" : "int",
"method" : "hash",
"partitions":
[
{ "name" : "my_hash_0", "group": "group1" },
{ "name" : "my_hash_1", "group": "group1" },
{ "name" : "my_hash_2", "group": "group2" },
{ "name" : "my_hash_3", "group": "group2" }
]
},
{
"table" : "my_list",
"pkey" : "id",
"type" : "int",
"method" : "list",
"partitions":
[
{ "name" : "my_list_0", "group": "group1", "value" : ["1","2","3"] },
{ "name" : "my_list_1", "group": "group1", "value" : ["4","5","6"] },
{ "name" : "my_list_2", "group": "group2", "value" : ["7","8","9"] },
{ "name" : "my_list_3", "group": "group2", "value" : ["10","11","12"] },
{ "name" : "my_list_4", "group": "group2", "value" : [] }
]
}
]
3配置文件
- [root@hostnfsd :/usr/local/oneproxy]$ vi demo.sh
#/bin/bash
#
export ONEPROXY_HOME=/usr/local/oneproxy
# valgrind --leak-check=full \
${ONEPROXY_HOME}/bin/oneproxy --keepalive \
--proxy-address=192.168.1.6:3308 \
--admin-address=192.168.1.6:4041 \
--proxy-master-addresses=192.168.1.6:3307@group1 \
--proxy-master-addresses=192.168.1.52:3306@group2 \
--proxy-user-list=testuser/950434F7872CB57A600E1B2B7237766FA7E95460@testbak \
--proxy-charset=utf8_chinese_ci \
--proxy-part-tables=${ONEPROXY_HOME}/part.txt \
--log-file=${ONEPROXY_HOME}/oneproxy.log \
--pid-file=${ONEPROXY_HOME}/oneproxy.pid
4 重启oneproxy
[root@hostnfsd :/usr/local/oneproxy]$ ps -ef | grep -i one|grep -v grep |awk '{print $2}'|xargs kill -9
[root@hostnfsd :/usr/local/oneproxy]$ sh demo.sh
更新操作
现在三张表里没有任何数据,第一步就是用标准的MySQL客户端连接到OneProxy,须注意连接时,端口的写OneProxy的端口,这里是“3308”,如下所示:
# /usr/local/mysql5.6/bin/mysql -u testuser -h192.168.1.6 -P3308 -ptestuser
我们来对三张表做一个查询试试:
mysql> select * from my_range;
Empty set (0.00 sec)
mysql> select * from my_list;
Empty set (0.00 sec)
mysql> select * from my_hash;
Empty set (0.00 sec)
使用SQL语句来插入几条记录看看,针对插入语句,要求语句中显式地指定列名,如下所示:
mysql> insert into my_range (id, col2, col3) values (100, 1, 'This is row 1');
Query OK, 1 row affected (0.01 sec)
mysql> insert into my_range (id, col2, col3) values (100100, 2, 'This is row 2');
Query OK, 1 row affected (0.00 sec)
mysql> insert into my_range (id, col2, col3) values (200100, 3, 'This is row 3');
Query OK, 1 row affected (0.01 sec)
mysql> insert into my_range (id, col2, col3) values (300100, 4, 'This is row 4');
Query OK, 1 row affected (0.01 sec)
然后重新来查询一次“my_range”表,如下所示:
mysql> select * from my_range;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100 | 1 | This is row 1 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 4 | This is row 4 |
+--------+------+---------------+
4 rows in set (0.01 sec)
在OneProxy中,可以直接使用分区的名字来进行查询,来验证一下是否每个分区各包含一条记录,也可以直接登录MySQL进行验证,如下所示:
mysql> select * from my_range_0;
+-----+------+---------------+
| id | col2 | col3 |
+-----+------+---------------+
| 100 | 1 | This is row 1 |
+-----+------+---------------+
1 row in set (0.00 sec)
mysql> select * from my_range_1;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100100 | 2 | This is row 2 |
+--------+------+---------------+
1 row in set (0.00 sec)
mysql> select * from my_range_2;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 200100 | 3 | This is row 3 |
+--------+------+---------------+
1 row in set (0.00 sec)
mysql> select * from my_range_3;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 300100 | 4 | This is row 4 |
+--------+------+---------------+
1 row in set (0.00 sec)
对于更新操作来讲,要求更新的语句只能操作一个分区的数据,否则报错并不能执行。比如我们没有显式指定列名,会怎么样呢?
mysql> insert into my_range values (400100, 4, 'This is row 4');
ERROR 1044 (42000): Partitioned tables should choose only one partition for DML queries!
当OneProxy无法从SQL取到分区列上的值时,会选择所有的分区,对于DML更新类操作,OneProxy因为不支持分布式事务,所在对DML语句能操作的分区数做了严格的限定,即只能操作一个分区。事实上对于在显式事务中的查询语句,也有同样的限制,比如我们显式开始一个事务,然后对虚拟表做全表查询,看看OneProxy会如何反馈?
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from my_range;
ERROR 1044 (42000): Partitioned tables should choose only one partition when in transaction!
mysql> rollback;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from my_range;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100 | 1 | This is row 1 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 4 | This is row 4 |
+--------+------+---------------+
4 rows in set (0.00 sec)
当在一个插入语句中插入多行时,要求所有的行必须位于同一个分区,否则并不能进行插入操作。如下所示:
mysql> insert into my_range (id, col2, col3) values (400100, 5, 'This is row 5'), (101, 6, 'This is row 6');
ERROR 1044 (42000): Partitioned tables should choose only one partition for DML queries!
如果我们调整一下SQL语句,使多行位于同一个分区,则可以成功操作。如下所示:
mysql> insert into my_range (id, col2, col3) values (101, 5, 'This is row 5'), (102, 6, 'This is row 6');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
现在再来查询一下第一个分区的数据,如下所示:
mysql> select * from my_range_0;
+-----+------+---------------+
| id | col2 | col3 |
+-----+------+---------------+
| 100 | 1 | This is row 1 |
| 101 | 5 | This is row 5 |
| 102 | 6 | This is row 6 |
+-----+------+---------------+
3 rows in set (0.00 sec)
对于另外的两张表“my_list”和“my_hash”,我在这里就不重复举例子了,请大家自行测试验证。
对于OneProxy里的分区表来讲,更新操作(DML语句)只能涉及一个分区;对事务中的任何查询语句,也有此要求。
查询操作
对于查询语句,如果OneProxy发现只涉及一个分区操作,则直接交给后端的数据库进行处理,包括排序、分页、分组统计等功能,都不需要经过OneProxy干预,只有当查询语句涉及多个分区时,才会参与处理。
前面已经有一些查询的例子了,对于查询操作,如果不在显式的事务中,则可以做多个分区的结果集合并,如果在事务中,则必须要求查询只涉及一个分区。在查询条件中,可以包含分区列,也可以不包含分区列,OneProxy都能自动处理。如下所示:
mysql> select * from my_range where id = 100;
+-----+------+---------------+
| id | col2 | col3 |
+-----+------+---------------+
| 100 | 1 | This is row 1 |
+-----+------+---------------+
1 row in set (0.00 sec)
mysql> select * from my_range where col2=2;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100100 | 2 | This is row 2 |
+--------+------+---------------+
1 row in set (0.00 sec)
在OneProxy中,当查询涉及多个分区时,依然可以对结果集进行排序,如下所示:
mysql> select * from my_range order by col2;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100 | 1 | This is row 1 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 4 | This is row 4 |
| 101 | 5 | This is row 5 |
| 102 | 6 | This is row 6 |
+--------+------+---------------+
6 rows in set (0.00 sec)
mysql> select * from my_range order by col2 desc;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 102 | 6 | This is row 6 |
| 101 | 5 | This is row 5 |
| 300100 | 4 | This is row 4 |
| 200100 | 3 | This is row 3 |
| 100100 | 2 | This is row 2 |
| 100 | 1 | This is row 1 |
+--------+------+---------------+
6 rows in set (0.00 sec)
mysql> select * from my_range order by col3;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100 | 1 | This is row 1 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 4 | This is row 4 |
| 101 | 5 | This is row 5 |
| 102 | 6 | This is row 6 |
+--------+------+---------------+
6 rows in set (0.00 sec)
mysql> select * from my_range order by col3 desc;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 102 | 6 | This is row 6 |
| 101 | 5 | This is row 5 |
| 300100 | 4 | This is row 4 |
| 200100 | 3 | This is row 3 |
| 100100 | 2 | This is row 2 |
| 100 | 1 | This is row 1 |
+--------+------+---------------+
6 rows in set (0.00 sec)
当查询语句有多层时,要求排序条件必须写在最外层的查询语句中,否则OneProxy不进行排序,比如我们将上面的某个语句,放到子查询里,会怎么样呢?如下所示:
mysql> select * from my_range order by col3 desc;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 102 | 6 | This is row 6 |
| 101 | 5 | This is row 5 |
| 300100 | 4 | This is row 4 |
| 200100 | 3 | This is row 3 |
| 100100 | 2 | This is row 2 |
| 100 | 1 | This is row 1 |
+--------+------+---------------+
6 rows in set (0.00 sec)
mysql> select * from (select * from my_range order by col3 desc) as a;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 102 | 6 | This is row 6 |
| 101 | 5 | This is row 5 |
| 100 | 1 | This is row 1 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 4 | This is row 4 |
+--------+------+---------------+
6 rows in set (0.00 sec)
可以看到第二个SQL语句里,第一个分区内的记录排序了,然后OneProxy根据处理分区的顺序进行结果集的合并,这是需要注意的地方,排序子句中也支持按多个列排序。
也可以对查询的结果集进行分页操作,比如前面的6条记录,我们可以按3条记录一页,进行两次查询,同样分区的条件也要写在最外层的查询语句中。如下所示:
mysql> select * from my_range order by col2 desc limit 3;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 102 | 6 | This is row 6 |
| 101 | 5 | This is row 5 |
| 300100 | 4 | This is row 4 |
+--------+------+---------------+
3 rows in set (0.00 sec)
mysql> select * from my_range order by col2 desc limit 3, 3;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 200100 | 3 | This is row 3 |
| 100100 | 2 | This is row 2 |
| 100 | 1 | This is row 1 |
+--------+------+---------------+
3 rows in set (0.01 sec)
对分页语句有一个要求,要求分页的起点值(“offset”)不要超过一万,超过一万的话,结果集会不准确。因为分页语句,需要OneProxy缓存所有的记录,在内存中进行排序,如果起点值(“offset”)太大,OneProxy被迫缓存大量的记录,可以导致OneProxy内存耗尽,变得不够稳定。
对于分组统计语句,OneProxy精确支持“”、“”、“”和“”四种操作符,并且可以按字段分组进行操作,当查询语句有多层时,要求排序条件必须写在最外层的查询语句中,否则不进行处理。来看一个最简单的统计语句吧,如下所示:
mysql> select max(id), min(id), max(col2), min(col2) from my_range;
+---------+---------+-----------+-----------+
| max(id) | min(id) | max(col2) | min(col2) |
+---------+---------+-----------+-----------+
| 300100 | 100 | 6 | 1 |
+---------+---------+-----------+-----------+
1 row in set (0.00 sec)
接下来我们将“my_range”表的“col2”字段的值,做一些更新,以方便进行分组汇总的演示,如下所示:
mysql> update my_range set col2=1 where id=102;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update my_range set col2=3 where id=300100;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update my_range set col2=1 where id=101;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
此时整个表的数据如下所示:
mysql> select * from my_range;
+--------+------+---------------+
| id | col2 | col3 |
+--------+------+---------------+
| 100 | 1 | This is row 1 |
| 101 | 1 | This is row 5 |
| 102 | 1 | This is row 6 |
| 100100 | 2 | This is row 2 |
| 200100 | 3 | This is row 3 |
| 300100 | 3 | This is row 4 |
+--------+------+---------------+
6 rows in set (0.00 sec)
下面来根据“col2”列做分组汇总的查询,如下所示:
mysql> select col2, max(id), min(id), sum(id), count(*) from my_range group by col2;
+------+---------+---------+---------+----------+
| col2 | max(id) | min(id) | sum(id) | count(*) |
+------+---------+---------+---------+----------+
| 1 | 102 | 100 | 303 | 3 |
| 2 | 100100 | 100100 | 100100 | 1 |
| 3 | 300100 | 200100 | 500200 | 2 |
+------+---------+---------+---------+----------+
3 rows in set (0.01 sec)
对于另外的两张表“my_list”和“my_hash”,我在这里就不重复举例子了,请大家自行测试验证。
对于OneProxy里的分区表来讲,查询操作(SELECT语句)在事务中只能涉及一个分区;当查询操作只涉及一个分区时,OneProxy不做任何处理,直接返回数据库的处理结果;当涉及多个分区时,OneProxy里可以做结果集的合并、排序、分页、汇总统计操作。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29096438/viewspace-2071766/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29096438/viewspace-2071766/