需求场景:
一个记录日志的数据库,日增量巨大,需要将表进行分表,但查询的时候怎么办呢,那就联合查询,输入一个时间段,在这个时间段内找到相关联的表,用union all的方式将这些表的数据合在一起后,再进行查询。
还有这样的需求:需要对日志进行统计,我怎么知道你想怎么统计呢,你有N种统计方法,还要导出到文件,看来不能写死啊,万一哪你想改,不累死我啊,那就将统计方法分离出来吧,把它写成sql文件,这里你想怎么统计就怎么统计,我只管执行sql,并进行分表查询,唉,你哪会写sql啊,还不是让哥来写。
一。首先,建个存储过程,它的作用是,将单表查询的sql转化为分表查询sql,并执行
CREATE DEFINER=`root`@`localhost` PROCEDURE `proc_tablepart_query`(IN `fromTime` varchar(20),IN `toTime` varchar(20),IN `prefix` varchar(50),IN `sqlModule` LONGTEXT,IN `databaseName` varchar(50))
BEGIN
DECLARE tableName varchar(50);
DECLARE querySQL LONGTEXT;
DECLARE T int;
DECLARE lastTableTime VARCHAR(20);
DECLARE ttt VARCHAR(1000);
DECLARE _Cur CURSOR FOR
SELECT case TABLE_NAME
when prefix then CONCAT(prefix, now())
else TABLE_NAME
end as table_time, TABLE_NAME
FROM information_schema.TABLES where TABLE_NAME like concat(prefix,'%') AND TABLE_SCHEMA=databaseName order by table_time asc;
set querySQL = '';
set lastTableTime = '';
SELECT count(1) into T FROM information_schema.TABLES where TABLE_NAME like concat(prefix,'%') AND TABLE_SCHEMA=databaseName order by TABLE_NAME ASC;
set @FirstAdd = TRUE;
open _Cur;
REPEAT
FETCH _Cur INTO ttt,tableName;
set @tmp = REPLACE(tableName, prefix, '');
if LENGTH(@tmp) < 1 THEN
set @tmp = now();
end if;
set @isAdd = 0;
if toTime<= lastTableTime THEN
#包括这张表,加入sql语句查询
#set @sql = concat('a---', tableName);
#select @sql;
set @isAdd = 0;
elseif fromTime > @tmp THEN
set @isAdd = 0;
else
set @isAdd = 1;
end if;
set lastTableTime = @tmp;
if @isAdd = 1 THEN
if @FirstAdd then
set querySQL = CONCAT(querySQL,'(select * from `', tableName ,'`)');
ELSE
set querySQL = CONCAT(querySQL,' union all (select * from `', tableName ,'`)');
end if;
set @FirstAdd = FALSE;
end if;
set T = T -1;
UNTIL T=0 end REPEAT;
CLOSE _Cur;
#select querySQL;
#set @alias = 'alias_name';
#set sqlModule = replace(sqlModule,concat(prefix,'.'),concat(@alias,'.') );
#set querySQL = replace(sqlModule, prefix, concat('(',querySQL,') as ',@alias));
set querySQL = replace(sqlModule, prefix, concat('(',querySQL,')'));
#select querySQL;
#set querySQL = concat('select * from (',querySQL,') t');
set @quer = querySQL;
prepare stmt from @quer;
EXECUTE stmt; #执行SQL语句
deallocate prepare stmt; #释放掉预处理段
END
参数解释:
fromTime与toTime: 查询时间段,存储过程会根据这个时间段来查找哪些分表符合查询条件。
prefix: 表名的前缀,分表的表名格式为prefix2000-01-01 00:00:00,意思是这个表里的数据是2000-01-01 00:00:00以前的;如果还有更前以前的分表,则就在最近的上一个分表时间到这个时间之间,唉,好复杂,我都晕了。
sqlModule:单表查询的sql语句
databaseName:数据库名称
存储过程分析:
首先,查询分表数量
SELECT count(1) into T FROM information_schema.TABLES where TABLE_NAME like concat(prefix,'%') AND TABLE_SCHEMA=databaseName;
其次,声明游标,查询分表名称并根据时间排序,排升序,将当前表排到最后
DECLARE _Cur CURSOR FOR
SELECT case TABLE_NAME
when prefix then CONCAT(prefix, now())
else TABLE_NAME
end as table_time, TABLE_NAME
FROM information_schema.TABLES where TABLE_NAME like concat(prefix,'%') AND TABLE_SCHEMA=databaseName order by table_time asc;
然后,遍历游标,组装动态sql
open _Cur;
REPEAT
FETCH _Cur INTO ttt,tableName;
set @tmp = REPLACE(tableName, prefix, '');
if LENGTH(@tmp) < 1 THEN
set @tmp = now();
end if;
set @isAdd = 0;
if toTime<= lastTableTime THEN
#包括这张表,加入sql语句查询
#set @sql = concat('a---', tableName);
#select @sql;
set @isAdd = 0;
elseif fromTime > @tmp THEN
set @isAdd = 0;
else
set @isAdd = 1;
end if;
set lastTableTime = @tmp;
if @isAdd = 1 THEN
if @FirstAdd then
set querySQL = CONCAT(querySQL,'(select * from `', tableName ,'`)');
ELSE
set querySQL = CONCAT(querySQL,' union all (select * from `', tableName ,'`)');
end if;
set @FirstAdd = FALSE;
end if;
set T = T -1;
UNTIL T=0 end REPEAT;
CLOSE _Cur;
若传入的sql 是select * from A,则组装后的sql为
select * from ((select * from A1) union all (select * from A2)) as ddd
最后,执行动态sql
set querySQL = replace(sqlModule, prefix, concat('(',querySQL,')'));
#select querySQL;
#set querySQL = concat('select * from (',querySQL,') t');
set @quer = querySQL;
prepare stmt from @quer;
EXECUTE stmt; #执行SQL语句
deallocate prepare stmt; #释放掉预处理段
END
二。建立java项目,读取sql文件和主要参数,并调用存储过程
代码就没必要贴了,用的mybatis做持久层,将查询的数据写入到文件,两种格式写入,txt,与excel(poi插件)
数据库分表应该有很多应用了,技术也应该很成熟了,但却找不到这方面的框架,我的这种实现方式感觉不好,但作为内部人员使用已经足够了。