left join 索引失效无条件_从零开始学数据分析-mysql索引优化方案

最新推荐文章于 2023-03-30 14:20:50 发布

weixin_39614109

最新推荐文章于 2023-03-30 14:20:50 发布

阅读量388

点赞数

文章标签： left join 索引失效无条件

接上节，执行计划还有一个重要的字段是extra，以下是出现的选项。

1、using filesort：需要一次额外的查询或者排序，性能开销大。常见于order by语句中。

create

对于单索引来说，如果排序和查找是同一个字段，则不会出现using filesort，反之会使用using filesort。

复合索引：不能跨列（最佳左前缀），如果跨列必然出现using filesort。

alter table test_02 add index idx_a1_a2_a3(a1, a2, a3);
# using filesort  a2 被跨了
explain select * from test_02 where a1 = '' order by a3;
# using filesort  a1被垮了
explain select * from test_02 where a2 = '' order by a3;
# 不会出现using filesort
explain select * from test_02 where a1 = '' order by a2;

只要复合索引的顺序是保证从左到右，无论where和order by是什么组合的，都不会出现filesort， a1 -> (a2) -> (a3)。

2、using temporary，用到了临时表，性能开销较大，尽量避免。一般出现在group by中。

原则：查询哪些列就按照哪些列进行group by分组。

# 不出现 using temporary
explain select a1 from test_02 where a1 in ('1', '2', '3') group by a1;
# 出席那 using temporary，原因是因为查询一次之后又使用了a2分组。
explain select a1 from test_02 where a1 in ('1', '2', '3') group by a2;

如果发现语句出现问题，下面这篇文章解决，原因是mysql默认不允许对空行分组

https://blog.csdn.net/yalishadaa/article/details/72861737blog.csdn.net

再来熟悉一下sql解析过程：

from ... on ... join ... where ... group by ... having ... select distinct ... order by ... limit

3、using index，不读取源文件，即从索引中获取数据，不需要回表查询，性能提升。即索引覆盖。

// 使用到的数据均在索引里，使用索引即可获得所有数据
select a1, a2 from test_02 where a1 = '' and a2 = '';
// 无发使用using index
drop index idx_a1_a2_a3 on test_02;
alter table test_02 add index idx_a1_a2(a1, a2);
explain select a1, a3 from test_02 where a1 = '' and a2 = '';

如果使用到了using index，会对possiblekeys和key造成影响，如果没有where则索引只出现在key中，如果有where，则索引只出现在key和possible_keys中。

4、using where，需要回表查询，开销大，在where中使用。

假设一个字段age是索引，但是其他字段不是，那么需要回表查询。

# 继上表，where查询a3，那么该查询需要回表。
explain select a1, a3 from test_02 where a1 = '' and a3 = '';

5、impossible where，永远不可为true的查询where。

优化实例

create table test_03 (
    a1 int(3) not null,
    a2 int(3) not null,
    a3 int(3) not null,
    a4 int(3) not null
);
alter table test_03 add index idx_a1_a2_a3_a4(a1, a2, a3, a4);
# 这里虽然不是按照顺序进行查询，但是优化器做了优化，依旧可以利用索引
explain select a1, a2, a3, a4 from test_03 where a1 = 1 and a3 = 2 and a2 = 3 and a4 = 1;

# a1->a2-> a4失效（回表）-> a3  using where
explain select a1,a2,a3,a4 from test_03 where a1 = 1 and a2 = 2 and a4 = 4 order by a3;
# a1-> a4 失效-> a3跨列    using where ,using filesort
explain select a1,a2,a3,a4 from test_03 where a1 = 1 and a4 = 4 order by a3;

# 推荐写法
explain select a1, a2, a3, a4 from test_03 where a1 = 1 and a2 = 2 and a3 = 3 and a4 = 4;

# 编写时不按顺序，但是优化器做了优化
explain select a1, a2, a3, a4 from test_03 where a1 = 1 and a2 = 2 and a4 = 3 and a3 = 4;

# 不会using filesort 但是会触发using index using where
explain select a1, a2, a3, a4 from test_03 where a1 = 1 and a4 = 2 ordet by a2, a3;
# a4失效，其他不夸列，using where， a4从using where中查找

优化案例

单表优化、双表优化、多表优化。

单表优化

create

查询authorid为1并且typeid为2或3的bid。

explain select bid from book where typeid in (2, 3) and authorid = 1 order by typeid;

执行发现性能很差。

优化：加索引。

alter table book add index idx_bid_typeid_authorid(bid, typeid, authorid);
explain select bid from book where typeid in (2, 3) and authorid = 1 order by typeid;

type有了优化，但是不够。

根据sql语句的实际解析过程，添加索引。先解析where 后解析select

drop

性能提升。但是这里in可能会失效，导致整个索引失效，所以将in放到后面

drop

总结：a、保持索引的定义和使用顺序是一致的。b、将含有in的范围查询放到where最后，防止干扰。尽量少用in。

双表优化

create table teacher2(
    tid int(4) primary key,
    cid int(4) not null
);
insert into teacher2 values(1, 2);
insert into teacher2 values(2, 1);
insert into teacher2 values(3, 3);

create table course2(
    cid int(4),
    cname varchar(20)
);
insert into course2 value(1, 'java');
insert into course2 value(2, 'python');
insert into course2 value(3, 'js');

左连接

explain select * from teacher2 t left join course2 c 
on t.cid = c.cid where c.cname='java';

如果发现有using join buffer说明mysql内部做了优化。

索引加在哪里？a、小表驱动大表。b、索引建立在经常使用的字段上。（上述t.cid字段使用频繁，因此左外连接加到t.cid上，右外连接的话加到右表上）

小表 10，大表 500

select ... where 小表.x * 10 = 大表.y * 500 。两次循环次数结果一样，小表循环较少。

alter table teacher2 add index index_teacher2_cid(cid);
alter table course2 add index index_course2_cname(cname);

避免索引优化失效

a.复合索引，不要跨列或者无序使用。

b.尽量使用全索引匹配。尽量索引全用上。

不要在索引上进行任何操作。（计算，函数，类型转换）索引会失效。

select ... where a.x * 2 = xxx

对于复合索引，左边的一个索引失效，那么右边索引都失效。

索引不能使用（!= <>）或者is null (is not null)，否则索引失效，右侧跟着失效。> 概率失效

独立的索引没有影响。

有的时候底层优化不生效，是概率的。需要explain 推测。尽量使用覆盖索引。

like尽量以常量开头，不要以%开头，否则索引失效。

尽量不要使用类型转换，显式转换和隐式转换都不要，否则索引失效。

使用or，索引失效，可以将左侧的索引都失效。

其他优化方法：

exist 和 in，如果主查询的数据集大，则用in，如果子查询的数据集大，则用exist。

select tname from teacher exist(select * from teacher);

weixin_39614109

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
left join 索引失效无条件_从零开始学数据分析-mysql索引优化方案

接上节，执行计划还有一个重要的字段是extra，以下是出现的选项。1、using filesort：需要一次额外的查询或者排序，性能开销大。常见于order by语句中。create 对于单索引来说，如果排序和查找是同一个字段，则不会出现using filesort，反之会使用using filesort。复合索引：不能跨列（最佳左前缀），如果跨列必然出现using filesort。alter...
复制链接

扫一扫