Handling a Fast-Growing Log Table
Background: our e-commerce crawler at 蜂巢 (Fengchao) writes its logs into a single MySQL table, which makes it easy to find which step of a flow failed and to diagnose breakage caused by redesigns of the target site. A single user session, from login to crawl completion, can produce several hundred log rows. As the daily crawl volume grew, so did the log volume: we used to clean the table every few months, but now it needs cleaning every few days, otherwise queries become very slow. The cleanup was done by hand, and on the production database we cannot simply TRUNCATE or DELETE, to avoid locking the table. One helpful property of this table: we only ever need about one week of data.
We considered a few options:

1. Scheduled TRUNCATE or DELETE
Besides the locking problem mentioned above, a DELETE does not return tablespace to the operating system, so the table file can keep growing.
2. Use MongoDB
MongoDB would actually be a good fit: it handles this data volume far better than MySQL, and the query patterns are simple. We dropped the idea because our ops team had no MongoDB experience, and MongoDB has the same problem as MySQL of not releasing disk space after deletes.
In the end we chose a relatively simple, low-effort solution: scheduled adding and dropping of MySQL partitions.

3. Scheduled MySQL partition rotation
Dropping a partition removes its underlying data file directly, so it is nearly instantaneous and avoids the long locks and unreclaimed space of a large DELETE.
(1) Create the table:
CREATE TABLE log (
    id int(11) NOT NULL AUTO_INCREMENT,
    type char(1) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    operator varchar(64) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    operName varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    operResult varchar(11) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    resultMsg varchar(500) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    loginTimes bigint(20) NULL DEFAULT NULL,
    pageType varchar(64) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    pageNum varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    state varchar(20) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    operRemark varchar(20) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    createTime timestamp NULL DEFAULT NULL,
    operParams varchar(500) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    crawlHistoryId int(11) NULL DEFAULT NULL,
    ip varchar(20) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    port varchar(20) CHARACTER SET utf8 COLLATE utf8_bin NULL DEFAULT NULL,
    PRIMARY KEY (id, createTime),
    INDEX index_operResult (operResult) USING BTREE,
    INDEX index_operator (operator) USING BTREE,
    INDEX index_createTime (createTime) USING BTREE
)
PARTITION BY RANGE (UNIX_TIMESTAMP(createTime))
(
    PARTITION p20170401 VALUES LESS THAN (UNIX_TIMESTAMP('2017-04-01 00:00:00')),
    PARTITION p20170410 VALUES LESS THAN (UNIX_TIMESTAMP('2017-04-10 00:00:00')),
    PARTITION p20170420 VALUES LESS THAN (UNIX_TIMESTAMP('2017-04-20 00:00:00')),
    PARTITION p20170430 VALUES LESS THAN (UNIX_TIMESTAMP('2017-04-30 00:00:00')),
    PARTITION p20170510 VALUES LESS THAN (UNIX_TIMESTAMP('2017-05-10 00:00:00')),
    PARTITION p20170520 VALUES LESS THAN (UNIX_TIMESTAMP('2017-05-20 00:00:00'))
);
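The initial partition list must be written out by hand, so a small script can help generate the boundary clauses. Below is a minimal Python sketch, not from the original setup: it emits a boundary every fixed number of days, whereas the list above was hand-picked to snap to month ends; the function name `partition_clauses` is ours.

```python
from datetime import date, timedelta

def partition_clauses(start: date, end: date, step_days: int = 10):
    """Generate MySQL RANGE-partition clauses, one boundary every step_days."""
    clauses = []
    d = start
    while d <= end:
        clauses.append(
            "PARTITION p%s VALUES LESS THAN (UNIX_TIMESTAMP('%s 00:00:00'))"
            % (d.strftime("%Y%m%d"), d.isoformat())
        )
        d += timedelta(days=step_days)
    return clauses

# Print the clauses for April 2017, ready to paste into CREATE TABLE.
for clause in partition_clauses(date(2017, 4, 1), date(2017, 4, 30)):
    print(clause)
```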
(2) Add the execution logic (a stored procedure that rotates the partitions):
DELIMITER $$
USE test$$
DROP PROCEDURE IF EXISTS create_Partition_log$$
CREATE DEFINER=`root`@`localhost` PROCEDURE create_Partition_log()
BEGIN
    DECLARE EXIT HANDLER FOR SQLEXCEPTION ROLLBACK;
    START TRANSACTION;

    /* Find the newest partition (the largest date embedded in a partition name). */
    SELECT REPLACE(partition_name, 'p', '') INTO @P12_Name
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE table_name = 'log'
    ORDER BY partition_ordinal_position DESC
    LIMIT 1;
    SET @Max_date = DATE(DATE_ADD(@P12_Name + 0, INTERVAL 1 DAY)) + 0;

    /* Build an ALTER that appends a partition after the current maximum,
       extending the range by one day. */
    SET @s1 = CONCAT('ALTER TABLE log ADD PARTITION (PARTITION p', @Max_date,
                     ' VALUES LESS THAN (UNIX_TIMESTAMP(''', DATE(@Max_date), ''')))');

    /* Find the oldest partition and build the ALTER that drops it. */
    SELECT REPLACE(partition_name, 'p', '') INTO @P13_Name
    FROM INFORMATION_SCHEMA.PARTITIONS
    WHERE table_name = 'log' AND partition_name IS NOT NULL
    ORDER BY partition_ordinal_position
    LIMIT 1;
    SET @MIN_date = DATE(@P13_Name) + 0;
    SET @t1 = CONCAT('ALTER TABLE log DROP PARTITION p', @MIN_date);

    /* Execute the ADD PARTITION statement. */
    SELECT @s1;
    PREPARE stmt2 FROM @s1;
    EXECUTE stmt2;
    DEALLOCATE PREPARE stmt2;

    /* Execute the DROP PARTITION statement. */
    SELECT @t1;
    PREPARE stmt3 FROM @t1;
    EXECUTE stmt3;
    DEALLOCATE PREPARE stmt3;

    COMMIT;
END$$
DELIMITER ;
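Stripped of the MySQL plumbing, the procedure implements a rolling window: each run appends a one-day partition past the newest boundary and drops the oldest partition. A small Python sketch of that arithmetic, assuming partition names of the form `pYYYYMMDD` as above (the `roll_window` helper is ours, for illustration only):

```python
from datetime import datetime, timedelta

def roll_window(partition_names):
    """Compute the two ALTER statements the procedure builds:
    add a one-day partition past the newest boundary, drop the oldest."""
    dates = sorted(datetime.strptime(name[1:], "%Y%m%d") for name in partition_names)
    oldest, newest = dates[0], dates[-1]
    new_boundary = newest + timedelta(days=1)
    add_sql = (
        "ALTER TABLE log ADD PARTITION (PARTITION p{0} "
        "VALUES LESS THAN (UNIX_TIMESTAMP('{1} 00:00:00')))"
    ).format(new_boundary.strftime("%Y%m%d"), new_boundary.strftime("%Y-%m-%d"))
    drop_sql = "ALTER TABLE log DROP PARTITION p" + oldest.strftime("%Y%m%d")
    return add_sql, drop_sql

add_sql, drop_sql = roll_window(["p20170401", "p20170410", "p20170420"])
print(add_sql)
print(drop_sql)
```

Note one consequence of this design: the initial partitions span roughly ten days each, but the procedure drops exactly one partition per daily run, so the first runs discard up to ten days of data at a time until the table settles into a steady state of one-day partitions.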
(3) Add the scheduled event:
CREATE EVENT Partition_log_event
ON SCHEDULE EVERY 1 DAY STARTS '2016-05-27 23:59:59'
DO CALL test.create_Partition_log();
================================
Notes:
(1) The MySQL event scheduler must be enabled. Check with:
SHOW VARIABLES LIKE 'event_scheduler';
If it is OFF, enable it:
SET GLOBAL event_scheduler = 1;
(2) To verify that partitions are being created and dropped:
SELECT * FROM information_schema.PARTITIONS WHERE TABLE_NAME = 'log';
If something looks wrong, run CALL test.create_Partition_log(); by itself and see whether the procedure reports an error.
(3) To make testing easier, the event can be scheduled to fire every second:
CREATE EVENT Partition_log_event
ON SCHEDULE EVERY 1 SECOND STARTS '2016-05-27 23:59:59'
DO CALL test.create_Partition_log();