在线应用里面经常要统计某时间段里面的订单量,查询量之类。有的时候为了在同一页面生产不同类别的报表,可能会从不同角度去统计数据库或者在不同的表里面去做统计。这个时候有必要选择是一次查询出所有结果还是多次查表呢?本文我们结合例子来看一下那种方法的效率会高一些。
我们现在有如下的表 transactions:
mysql> desc transactions;
+---------------+--------------+------+-----+-------------------+-----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+-------------------+-----------------+
| trsid | int(11) | NO | PRI | NULL | auto_increment |
| usrid | int(11) | YES | MUL | NULL | |
| trsproperty | varchar(80) | YES | | NULL | |
| trsvalue | int(8) | YES | | NULL | |
| trstype | varchar(100) | YES | | NULL | |
| trsmodifiedts | timestamp | YES | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| trscreatedts | timestamp | YES | MUL | NULL | |
+---------------+--------------+------+-----+-------------------+----------------+
分次查询会比一次大量查询更高效,因为每次的数据量较小,效率反而高
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE YEAR( trscreatedts) = '2017' AND usrid = '48' AND trsproperty = 'booking';
+--------+
| total |
+--------+
| 406557 |
+--------+
1 row in set (2.15 sec)
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE YEAR( trscreatedts) = '2017' AND usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 3196754 |
+---------+
1 row in set (9.33 sec)
分开查的时间总和为 11.48 sec,现在我们来看一下如果一次查询:
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE YEAR( trscreatedts) = '2017' AND usrid in ('48', '66') AND trsproperty = 'booking' group by usrid;
+---------+
| total |
+---------+
| 3196731 |
| 406557 |
+---------+
2 rows in set (15.76 sec)
在不加WHERE限制条件的情况下,COUNT( * )与COUNT(COL)基本可以认为是等价的;但是在有WHERE限制条件的情况下,COUNT( * )会比COUNT(COL)快, 测试结果如下:
mysql> SELECT COUNT( * ) as total FROM transactions WHERE UNIX_TIMESTAMP( trscreatedts ) BETWEEN 1498867200 AND 1562716800 AND usrid = '66' AND trsproperty = 'booking' ;
+---------+
| total |
+---------+
| 1038593 |
+---------+
1 row in set (9.35 sec)
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE UNIX_TIMESTAMP( trscreatedts ) BETWEEN 1498867200 AND 1562716800 AND usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 1038599 |
+---------+
1 row in set (9.77 sec)
通过建索引可以提高查询效率。
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 8793563 |
+---------+
1 row in set (8.23 sec)
mysql> ALTER TABLE `timaticweb2`.`transactions`
ADD INDEX `usrid_trsproperty` (`usrid` ASC, `trsproperty` ASC);
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 8793569 |
+---------+
1 row in set (3.21 sec)
另外,很多人说的用 like ‘ 2017%’代替函数运算YEAR(timestamp) 可以使用索引从而提高查询速度。会用到索引确实,但是查询的时间在这里体现不大。说明了时间运算和普通的查询有所区别。
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE YEAR( trscreatedts) = '2017' AND usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 3196217 |
+---------+
1 row in set (8.87 sec)
mysql> SELECT COUNT( usrid ) as total FROM transactions WHERE trscreatedts like '2017%' AND usrid = '66' AND trsproperty = 'booking';
+---------+
| total |
+---------+
| 3196213 |
+---------+
1 row in set, 1 warning (9.53 sec)