背景
在平时的工作中,有些同学对count的用法还是有疑惑的,为此我做个简单的总结和测试,希望对大家有帮助。
count(*)和count(X)是不等价的
表达式
|
含义
| |
count(*)
|
返回总行数,包括空和非空值
| |
count(expression)
|
返回expression中的非空值,例如count(1)或count(0)和count(*)等价
| |
count(column)
|
只返回column的非空值
|
不同类型的count的速度是不一样的
因为MyISAM已经在一个表里缓存了表数据量,MyISAM可以很快的返回count(*)或者count(not null)的值,如果想count(column can be null)的话就会比较慢,因为count可以为null的列要遍历列的数据的。
举个例子:
show create table sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`gmt_create` datetime NOT NULL,
`gmt_modified` datetime NOT NULL,
`k` int(11) DEFAULT NULL,
`c` varchar(500) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
`is_used` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=69313841 DEFAULT CHARSET=latin1
测试环境
Server version: 5.7.20 MySQL Community Server (GPL)
32核 128G
执行计划
mysql> explain extended select count(*) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 5 | NULL | 25926320 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 2 warnings (0.00 sec)
mysql> explain extended select count(0) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 5 | NULL | 25926320 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 2 warnings (0.00 sec)
mysql> explain extended select count(1) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 5 | NULL | 25926320 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 2 warnings (0.00 sec)
mysql> explain extended select count(id) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 5 | NULL | 25926320 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 2 warnings (0.00 sec)
mysql> explain extended select count(k) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 5 | NULL | 25926320 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 2 warnings (0.00 sec)
测试的SQL
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(*) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.295 seconds
Minimum number of seconds to run all queries: 4.197 seconds
Maximum number of seconds to run all queries: 4.463 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(0) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.277 seconds
Minimum number of seconds to run all queries: 4.192 seconds
Maximum number of seconds to run all queries: 4.452 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(1) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.281 seconds
Minimum number of seconds to run all queries: 4.188 seconds
Maximum number of seconds to run all queries: 4.849 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(id) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.716 seconds
Minimum number of seconds to run all queries: 4.631 seconds
Maximum number of seconds to run all queries: 4.778 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(k) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.832 seconds
Minimum number of seconds to run all queries: 4.739 seconds
Maximum number of seconds to run all queries: 5.054 seconds
Number of clients running queries: 1
Average number of queries per client: 1
从上面的测试可以看出查询速度:count(expression) > count(*) > count(column not null) > count( column can be null )
值得说明的是:
count(column not null) > count( column can be null ) 对于不同的列并不是绝对的,对于同一列的count(column not null) > count( column can be null )这个仍然是成立的。
比如:
mysql> alter table sbtest1 add index idx_is_used (is_used);
Query OK, 0 rows affected (1 min 43.12 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table sbtest1 add index idx_gmt_create (gmt_create);
Query OK, 0 rows affected (1 min 49.33 sec)
Records: 0 Duplicates: 0 Warnings: 0
跑出来的结果是:
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(is_used) from sbtest1'
Benchmark
Average number of seconds to run all queries: 5.391 seconds
Minimum number of seconds to run all queries: 5.222 seconds
Maximum number of seconds to run all queries: 5.494 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(gmt_create) from sbtest1'
Benchmark
Average number of seconds to run all queries: 5.187 seconds
Minimum number of seconds to run all queries: 5.104 seconds
Maximum number of seconds to run all queries: 5.321 seconds
Number of clients running queries: 1
Average number of queries per client: 1
这两列is_used和gmt_create都是非空,而k是是可空,但是查询速度却没有count(k)快,原因是什么呢?我们来看下执行计划:
mysql> explain select count(gmt_create) from sbtest1;
+----+-------------+---------+------------+-------+---------------+----------------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+----------------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | idx_gmt_create | 5 | NULL | 25931936 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+----------------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(is_used) from sbtest1;
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | idx_is_used | 4 | NULL | 25931936 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+-------------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(k) from sbtest1;
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| 1 | SIMPLE | sbtest1 | NULL | index | NULL | k_1 | 4 | NULL | 25931936 | 100.00 | Using index |
+----+-------------+---------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
我们再改变下列的形式:
mysql> alter table sbtest1 change column k k int(11) NOT NULL DEFAULT '0';
Query OK, 0 rows affected (17 min 44.57 sec)
Records: 0 Duplicates: 0 Warnings: 0
再跑下:
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'015891' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(k) from sbtest1'
Benchmark
Average number of seconds to run all queries: 4.684 seconds
Minimum number of seconds to run all queries: 4.528 seconds
Maximum number of seconds to run all queries: 4.812 seconds
Number of clients running queries: 1
Average number of queries per client: 1
对比之前的压测结果:对于k列非空比空快(4.684 < 4.832)。
count带where场景
在实际的应用场景中,已经有其他方法代替不带where条件的count(比如预估值,比如计数器等),很多带where条件的是需要count的,那这类SQL的速度是什么样的呢?
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(*) from sbtest1 where id>1000000'
Benchmark
Average number of seconds to run all queries: 6.696 seconds
Minimum number of seconds to run all queries: 6.508 seconds
Maximum number of seconds to run all queries: 6.817 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(0) from sbtest1 where id>1000000'
Benchmark
Average number of seconds to run all queries: 6.717 seconds
Minimum number of seconds to run all queries: 6.490 seconds
Maximum number of seconds to run all queries: 6.865 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(1) from sbtest1 where id>1000000'
Benchmark
Average number of seconds to run all queries: 6.656 seconds
Minimum number of seconds to run all queries: 6.519 seconds
Maximum number of seconds to run all queries: 6.859 seconds
Number of clients running queries: 1
Average number of queries per client: 1
[root@localhost ~]# mysqlslap -h 127.0.0.1 -P 3306 -u root -p'123456' --iterations=30 --concurrency=1 --number-of-queries=1 --create-schema='test' --query='select count(id) from sbtest1 where id>1000000'
Benchmark
Average number of seconds to run all queries: 6.691 seconds
Minimum number of seconds to run all queries: 6.514 seconds
Maximum number of seconds to run all queries: 6.865 seconds
Number of clients running queries: 1
Average number of queries per client: 1
执行计划
mysql> explain select count(id) from sbtest1 where id>1000000;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| 1 | SIMPLE | sbtest1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 12963160 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(*) from sbtest1 where id>1000000;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| 1 | SIMPLE | sbtest1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 12963160 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(0) from sbtest1 where id>1000000;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| 1 | SIMPLE | sbtest1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 12963160 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain select count(1) from sbtest1 where id>1000000;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
| 1 | SIMPLE | sbtest1 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 12963160 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
在这类场景中count(expression)>count(id)>count(*), count(id)和count(*) 微差。
另外在5.7中也做了一些改动,“[会根据flag判断是否可以把count(*)下推到引擎层,由于只有一次引擎层的调用,减少了Server层和InnoDB的交互,避免了无谓的内存操作或格式转换](http://mysql.taobao.org/monthly/2016/06/10/)”,从这个角度讲好像是优化了count(*),但是“[由于总是强制使用聚集索引,缺点很明显:当二级索引的大小远小于聚集索引,且数据不在内存中时,使用二级索引显然要快些,因此文件IO更少。](http://mysql.taobao.org/monthly/2016/06/10/)”
结论
- count(expression)比count(*)和count(column),如果没有特殊业务含义的话,可以优先使用
- 到底是使用count(*)快还是count(column)快,这个也不是一定的
参考
https://dev.mysql.com/doc/refman/5.6/en/group-by-functions.html#function_count
http://www.rndblog.com/mysql-select-count/
http://www.mysqldiary.com/limited-select-count/
http://mysqlha.blogspot.com/2009/08/fast-count-for-innodb.html