用过sql的同学基本都会过like,但是大家对like了解多少,很多同学可能认知在like条件,如果第一个字符为通配符,sql语句就不会走索引,如果不为通配符,sql语句就会走索引,真相真的是这样的吗,我用实际测试案例来说明。
在mysql5.7数据库上我用sysbench做了一张5000w记录的表来模拟测试,表结构如下
Create Table: CREATE TABLE `sbtest1` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `k` int(10) unsigned NOT NULL DEFAULT '0', `c` char(120) COLLATE utf8_bin NOT NULL DEFAULT '', `pad` char(60) COLLATE utf8_bin NOT NULL DEFAULT '', PRIMARY KEY (`id`), KEY `k_1` (`k`), KEY `idx_pad` (`pad`)) ENGINE=InnoDB AUTO_INCREMENT=50000001 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
like语句基本有如下三种写法
select count(*) from sbtest1 where pad like '00000005198-22795060602-81061358729%';select count(*) from sbtest1 where pad like '%00000005198-22795060602-81061358729%';select count(*) from sbtest1 where pad like '%00000005198-%22795060602-81061358729%';
那我们来看看在数据中这三条语句的执行计划是怎样的
mysql> explain select count(*) from sbtest1 where pad like '00000005198-22795060602-81061358729%';+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+| 1 | SIMPLE | sbtest1 | NULL | range | idx_pad | idx_pad | 180 | NULL | 1 | 100.00 | Using where; Using index |+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+1 row in set, 1 warning (0.00 sec)mysql> explain select count(*) from sbtest1 where pad like '%00000005198-22795060602-81061358729%';+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+| 1 | SIMPLE | sbtest1 | NULL | index | NULL | idx_pad | 180 | NULL | 49315602 | 11.11 | Using where; Using index |+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+1 row in set, 1 warning (0.00 sec)mysql> explain select count(*) from sbtest1 where pad like '%00000005198-%22795060602-81061358729%';+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+| 1 | SIMPLE | sbtest1 | NULL | index | NULL | idx_pad | 180 | NULL | 49315602 | 11.11 | Using where; Using index |+----+-------------+---------+------------+-------+---------------+---------+---------+------+----------+----------+--------------------------+1 row in set, 1 warning (0.00 sec)
到这里,是不是颠覆了你的认知,三个写法竟然都是走索引的。这个时候,一定要保持冷静,多问自己几个为什么,为什么会走索引,三个语句的执行计划一样,执行时间是一样的吗。
三种写法,执行计划都是一样,这是因为mysql优化引擎做了很多事情,那三个sql执行时间一样吗,我们来实际测试一下
mysql> select count(*) from sbtest1 where pad like '00000005198-22795060602-81061358729%';+----------+| count(*) |+----------+| 1 |+----------+1 row in set (0.00 sec)mysql> select count(*) from sbtest1 where pad like '%00000005198-22795060602-81061358729%';+----------+| count(*) |+----------+| 1 |+----------+1 row in set (46.04 sec)mysql> select count(*) from sbtest1 where pad like '%00000005198-22795060602%-81061358729%';+----------+| count(*) |+----------+| 1 |+----------+1 row in set (45.94 sec)
从上面结果,可以很明显的看出第一个字符不是通配符的like查询语句效率最高。这是因为第一种写法,mysql引擎只需要对b+tree进行range的范围扫描即可,而第二种和第三种写法,需要将b+tree的节点整个扫描完,才能得到需要的数据,所以时间相差才会这么大。
大家在学习技术或者看技术文章的时候,一定要自己动手去验证,如果发现测试结果不一样,一定要去思考,这样才会走的更远。