0.什么是物化视图
物化视图是查询结果的预运算。不同于简单的视图,物化视图的结果一般存储于表中。物化视图用于需要对查询立即做出响应,而又需要耗费很长时间获得结果。物化视图必须能快速更新。它取决于对更新频率和内容的准确性的要求。一般说来物化视图能够在一定时间内及时更新。
Mysql本来是不支持视图的,但是在5.0以上的版本,支持了视图功能,但是可惜的是不提供物化视图,但是这也难不住咱们,自己动手丰衣足食。
1. 实现自己的物化视图
看一个它是如何实现的简单的查询实例:
SELECT COUNT(*) FROM MyISAM_table;
由于计数值存储在表的头部 立即返回结果。接下来的例子会耗费几秒到数分钟。
SELECT COUNT(*) FROM innodb_huge;
对此的可能解决方案是创建一个存储所有 InnoDB 行的表。
CREATE TABLE innodb_row_count ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , schema_name VARCHAR(64) NOT NULL , table_name VARCHAR(64) NOT NULL , row_count INT UNSIGNED NOT NULL );
取决于对该信息结果正确性的需要,该表可以每天更新一次(花费系统资源最少,结果错误最大),一小时一次甚至是极端情况下每次改变都更新(最慢)。
另一种可能就是从信息架构中读取数据。但是信息会有高达20%的错误。
SELECT table_schema, table_name, table_rows
FROM information_schema.tables WHERE table_type = ‘BASE TABLE’;
2.更新物化视图
物化视图的更新方式有很多种。比如:
l 从不更新(只在开始更新,只用于静态数据)
l 根据需要(比如每天,比如每夜)
l 及时(每次数据修改之后)
一半使用的更新方法:
l 全部更新(速度慢,完全从无到有)
l 延时的(速度快,使用log表)
通过在日志表中存储变更信息,通常会产生简单的“快照”或者延时状况:
l 及时更新
l 完全更新
3.测试
为了理解这个方法,我们举个例子,详细讲解一下。
CREATE TABLE sales ( sales_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , product_name VARCHAR(128) NOT NULL , product_price DECIMAL(8,2) NOT NULL , product_amount SMALLINT NOT NULL ); INSERT INTO sales VALUES (NULL, 'Apple', 1.25, 1), (NULL, 'Apple', 2.40, 2), (NULL, 'Apple', 4.05, 3), (NULL, 'Pear', 6.30, 2), (NULL, 'Pear', 12.20, 4), (NULL, 'Plum', 4.85, 3); SELECT * FROM sales;
我们要知道售价和每种产品获得的利润,就要使用到两次的分组查询,我们晓得在mysql中连接查询和分组排序是会用到临时表和filesort的,这个如果数据量大的话,是十分耗时,如题查询如下:
EXPLAIN SELECT product_name , SUM(product_price) AS price_sum, SUM(product_amount) AS amount_sum , AVG(product_price) AS price_avg, AVG(product_amount) amount_agg , COUNT(*) FROM sales GROUP BY product_name ORDER BY price_sum \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: sales type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 6 Extra: Using temporary; Using filesort 1 row in set (0.00 sec)
因为表中记录较少,因此速度很快,但是如果记录量很大这种,查询将会花费很多时间。
3.1创建物化视图
DROP TABLE sales_mv; CREATE TABLE sales_mv ( product_name VARCHAR(128) NOT NULL , price_sum DECIMAL(10,2) NOT NULL, amount_sum INT NOT NULL, price_avg FLOAT NOT NULL, amount_avg FLOAT NOT NULL, sales_cnt INT NOT NULL, UNIQUE INDEX product (product_name) ); INSERT INTO sales_mv SELECT product_name, SUM(product_price), SUM(product_amount), AVG(product_price), AVG(product_amount), COUNT(*) FROM sales GROUP BY product_name;
最简单的方法,我们得到了预期的正确结果:
mysql> SELECT * FROM sales_mv \G *************************** 1. row *************************** product_name: Apple price_sum: 7.70 amount_sum: 6 price_avg: 2.56667 amount_avg: 2 sales_cnt: 3 *************************** 2. row *************************** product_name: Pear price_sum: 18.50 amount_sum: 6 price_avg: 9.25 amount_avg: 3 sales_cnt: 2 *************************** 3. row *************************** product_name: Plum price_sum: 4.85 amount_sum: 3 price_avg: 4.85 amount_avg: 3 sales_cnt: 1 3 rows in set (0.01 sec)
这会导致我们刚才提到的“从不更新”模式失败。但是这不是我们想要的。
3.2按需更新物化视图
根据需要更新物化视图,我们可以用存储过程来实现
DROP PROCEDURE refresh_mv_now;
DELIMITER $$
CREATE PROCEDURE refresh_mv_now (
OUT rc INT
)
BEGIN
TRUNCATE TABLE sales_mv;
INSERT INTO sales_mv
SELECT product_name
, SUM(product_price), SUM(product_amount), AVG(product_price), AVG(product_amount)
, COUNT(*)
FROM sales
GROUP BY product_name;
SET rc = 0;
END;
$$
DELIMITER ;
好!我们看下他的运行结果:
CALL refresh_mv_now(@rc);
SELECT * FROM sales_mv \G
*************************** 1. row ***************************
product_name: Apple
price_sum: 7.70
amount_sum: 6
price_avg: 2.56667
amount_avg: 2
sales_cnt: 3
*************************** 2. row ***************************
product_name: Pear
price_sum: 18.50
amount_sum: 6
price_avg: 9.25
amount_avg: 3
sales_cnt: 2
*************************** 3. row ***************************
product_name: Plum
price_sum: 4.85
amount_sum: 3
price_avg: 4.85
amount_avg: 3
sales_cnt: 1
3 rows in set (0.00 sec)
INSERT INTO sales VALUES
(NULL, ‘Apple’, 2.25, 3), (NULL, ‘Plum’, 3.35, 1)
, (NULL, ‘Pear’, 1.80, 2);
CALL refresh_mv_now(@rc);
SELECT * FROM sales_mv \G
*************************** 1. row ***************************
product_name: Apple
price_sum: 7.70
amount_sum: 6
price_avg: 2.56667
amount_avg: 2
sales_cnt: 3
*************************** 2. row ***************************
product_name: Pear
price_sum: 18.50
amount_sum: 6
price_avg: 9.25
amount_avg: 3
sales_cnt: 2
*************************** 3. row ***************************
product_name: Plum
price_sum: 4.85
amount_sum: 3
price_avg: 4.85
amount_avg: 3
sales_cnt: 1
3 rows in set (0.00 sec)
3.3即时更新物化视图
每条语句之后做全部更新没有任何意义。但是我们想要合适的结果。做到这一点还是有点复杂的。
在每次insert sales 表我们都要更新物化视图。我们可以在sales表中透明的使用insert/update/delete触发器实现。
现在来创建需要的触发器:
DELIMITER $$ CREATE TRIGGER sales_ins AFTER INSERT ON sales FOR EACH ROW BEGIN SET @old_price_sum = 0; SET @old_amount_sum = 0; SET @old_price_avg = 0; SET @old_amount_avg = 0; SET @old_sales_cnt = 0; SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0), IFNULL(amount_avg, 0) , IFNULL(sales_cnt, 0) FROM sales_mv WHERE product_name = NEW.product_name INTO @old_price_sum, @old_amount_sum, @old_price_avg, @old_amount_avg, @old_sales_cnt; SET @new_price_sum = @old_price_sum + NEW.product_price; SET @new_amount_sum = @old_amount_sum + NEW.product_amount; SET @new_sales_cnt = @old_sales_cnt + 1; SET @new_price_avg = @new_price_sum / @new_sales_cnt; SET @new_amount_avg = @new_amount_sum / @new_sales_cnt; REPLACE INTO sales_mv VALUES(NEW.product_name, @new_price_sum, @new_amount_sum, @new_price_avg, @new_amount_avg, @new_sales_cnt) ; END; $$ DELIMITER ; DELIMITER $$ CREATE TRIGGER sales_del AFTER DELETE ON sales FOR EACH ROW BEGIN SET @old_price_sum = 0; SET @old_amount_sum = 0; SET @old_price_avg = 0; SET @old_amount_avg = 0; SET @old_sales_cnt = 0; SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0), IFNULL(amount_avg, 0) , IFNULL(sales_cnt, 0) FROM sales_mv WHERE product_name = OLD.product_name INTO @old_price_sum, @old_amount_sum, @old_price_avg, @old_amount_avg, @old_sales_cnt ; SET @new_price_sum = @old_price_sum - OLD.product_price; SET @new_amount_sum = @old_amount_sum - OLD.product_amount; SET @new_price_avg = @new_price_sum / @new_amount_sum; SET @new_sales_cnt = @old_sales_cnt - 1; SET @new_amount_avg = @new_amount_sum / @new_sales_cnt; REPLACE INTO sales_mv VALUES(OLD.product_name, @new_price_sum, @new_amount_sum, IFNULL(@new_price_avg, 0) , IFNULL(@new_amount_avg, 0), @new_sales_cnt) ; END; $$ DELIMITER ; DELIMITER $$ CREATE TRIGGER sales_upd AFTER UPDATE ON sales FOR EACH ROW BEGIN SET @old_price_sum = 0; SET @old_amount_sum = 0; SET @old_price_avg = 0; SET @old_amount_avg = 0; SET @old_sales_cnt = 0; SELECT IFNULL(price_sum, 0), IFNULL(amount_sum, 0), IFNULL(price_avg, 0), IFNULL(amount_avg, 0) , IFNULL(sales_cnt, 0) FROM sales_mv WHERE product_name = OLD.product_name INTO @old_price_sum, @old_amount_sum, @old_price_avg, @old_amount_avg, @old_sales_cnt ; SET @new_price_sum = @old_price_sum + (NEW.product_price - OLD.product_price); SET @new_amount_sum = @old_amount_sum + (NEW.product_amount - OLD.product_amount); SET @new_sales_cnt = @old_sales_cnt; SET @new_price_avg = @new_price_sum / @new_sales_count; SET @new_amount_avg = @new_amount_sum / @new_sales_cnt; REPLACE INTO sales_mv VALUES(OLD.product_name, @new_price_sum, @new_amount_sum, IFNULL(@new_price_avg, 0) , IFNULL(@new_amount_avg, 0), @new_sales_cnt) ; END; $$ DELIMITER ;
现在来看结果:
INSERT INTO sales VALUES (NULL, 'Apple', 1.25, 1); INSERT INTO sales VALUES (NULL, 'Apple', 2.40, 2); INSERT INTO sales VALUES (NULL, 'Apple', 4.05, 3); INSERT INTO sales VALUES (NULL, 'Pear', 6.30, 2); INSERT INTO sales VALUES (NULL, 'Pear', 12.20, 4); INSERT INTO sales VALUES (NULL, 'Plum', 4.85, 3); DELETE FROM sales WHERE sales_id = 5; DELETE FROM sales WHERE sales_id = 4; UPDATE sales SET product_amount = 3 where sales_id = 2; SELECT * from sales_v;
来看一下结果:
INSERT INTO sales VALUES (NULL, 'Apple', 1.25, 1); INSERT INTO sales VALUES (NULL, 'Apple', 2.40, 2); INSERT INTO sales VALUES (NULL, 'Apple', 4.05, 3); INSERT INTO sales VALUES (NULL, 'Pear', 6.30, 2); INSERT INTO sales VALUES (NULL, 'Pear', 12.20, 4); INSERT INTO sales VALUES (NULL, 'Plum', 4.85, 3); DELETE FROM sales WHERE sales_id = 5; DELETE FROM sales WHERE sales_id = 4; UPDATE sales SET product_amount = 3 where sales_id = 2; SELECT * from sales_v;
3.4具有快速拍摄功能的物化视图
与上面例子的不同之处在于改变并不立即使用,改变存放在日志表中在一定时间之后才更新物化视图。
除了以上的的例子我们需要另外的日志表
CREATE TABLE sales_mvl ( product_name VARCHAR(128) NOT NULL , product_price DECIMAL(8,2) NOT NULL , product_amount SMALLINT NOT NULL , sales_id INT UNSIGNED NOT NULL , product_ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP() );
不是更新mv日志是填充:
DROP TRIGGER sales_ins; DELIMITER $$ CREATE TRIGGER sales_ins AFTER INSERT ON sales FOR EACH ROW BEGIN INSERT INTO sales_mvl VALUES (NEW.product_name, NEW.product_price, NEW.product_amount, NEW.sales_id, NULL); END; $$ DELIMITER ; DROP TRIGGER sales_del; DELIMITER $$ CREATE TRIGGER sales_del AFTER DELETE ON sales FOR EACH ROW BEGIN DELETE FROM sales_mvl WHERE sales_id = OLD.sales_id; END; $$ DELIMITER ; DROP TRIGGER sales_upd; DELIMITER $$ CREATE TRIGGER sales_upd AFTER UPDATE ON sales FOR EACH ROW BEGIN UPDATE sales_mvl SET product_name = NEW.product_name , product_price = NEW.product_price , product_amount = NEW.product_amount , sales_id = NEW.sales_id , product_ts = CURRENT_TIMESTAMP() WHERE sales_id = OLD.sales_id; END; $$ DELIMITER ;
建立一个更新物化视图的存储过程。可能模式:
l 完全更新(更新至当前)
l 更新(更新到特定时间戳)
l 重建(全部重建清除MV日志表)
DELIMITER $$ CREATE PROCEDURE refresh_mv ( IN method VARCHAR(16) , IN ts TIMESTAMP , OUT rc INT ) BEGIN IF UPPER(method) = 'REBUILD' THEN TRUNCATE TABLE sales_mvl; TRUNCATE TABLE sales_mv; INSERT INTO sales_mv SELECT product_name , SUM(product_price), SUM(product_amount), AVG(product_price), AVG(product_amount) , COUNT(*) FROM sales GROUP BY product_name ; ELSEIF UPPER(method) = 'REFRESH FULL' THEN REPLACE INTO sales_mv SELECT product_name, SUM(price_sum), SUM(amount_sum) , SUM(price_sum)/SUM(sales_cnt), SUM(amount_sum)/SUM(sales_cnt) , SUM(sales_cnt) FROM ( SELECT product_name, price_sum, amount_sum, sales_cnt FROM sales_mv UNION ALL SELECT product_name , SUM(product_price), SUM(product_amount), COUNT(*) FROM sales_mvl GROUP BY product_name ) x GROUP BY product_name ; TRUNCATE TABLE sales_mvl; SET rc = 0; ELSEIF UPPER(method) = 'REFRESH' THEN REPLACE INTO sales_mv SELECT product_name, SUM(price_sum), SUM(amount_sum) , SUM(price_sum)/SUM(sales_cnt), SUM(amount_sum)/SUM(sales_cnt) , SUM(sales_cnt) FROM ( SELECT product_name, price_sum, amount_sum, sales_cnt FROM sales_mv UNION ALL SELECT product_name , SUM(product_price), SUM(product_amount), COUNT(*) FROM sales_mvl WHERE product_ts < ts GROUP BY product_name ) x GROUP BY product_name ; DELETE FROM sales_mvl WHERE product_ts < ts ; SET rc = 0; ELSE SET rc = 1; END IF; END; $$ DELIMITER ;
检验是否正常工作:
INSERT INTO sales VALUES (NULL, 'Apple', 1.25, 1); wait some time INSERT INTO sales VALUES (NULL, 'Apple', 2.40, 2); wait some time INSERT INTO sales VALUES (NULL, 'Apple', 4.05, 3); wait some time INSERT INTO sales VALUES (NULL, 'Pear', 6.30, 2); wait some time INSERT INTO sales VALUES (NULL, 'Pear', 12.20, 4); wait some time INSERT INTO sales VALUES (NULL, 'Plum', 4.85, 3); SELECT * from sales_mvl; SELECT * from sales_v; CALL refresh_mv('REFRESH', '2006-11-06 10:57:55', @rc); SELECT * from sales_v; CALL refresh_mv('REFRESH FULL', NULL, @rc); SELECT * from sales_v; CALL refresh_mv('REBUILD', NULL, @rc); SELECT * from sales_v;
4结论
MySQL中的触发器(5.0.27)是不是非常快。
大量数据时,物化视图可以提高及时查询速度。
如果插入数据速度上不受限制,这个方法可以提高查询速度。
在大量数据,或者是非内存表、或者内存不够大时,可以提高性能(即,在基础表中查询耗时较大时)
注:感谢小康同学,对本文档的部分翻译
参考文献:
http://www.fromdual.com/mysql-materialized-views
转帖:http://www.yqshare.com/materialized-views-with-mysql.html