mysql force index 语法,MySQL详细基础语法2

GROUP BY

GROUP BY即分组 主要作用是把一个集合中根据group by后面的关键字进行分组,里面包含2个计算

去掉重复值(参考丁奇文章)

当没有索引时,根据group by后面的列,在内存中创建一个对应列为pk的临时表。对数据集进行遍历,没有则插入临时表,有则对对应的聚合函数列进行比较和更新,使最终结果中没有重复值

有索引时,直接使用索引,只需要一个计数器,对索引列进行更新,不需要维护临时表。

所以对group by进行优化的方式之一就是对group by后面的列添加索引,减少临时表(use tempfile)的产生和维护

减少use tempfile的两种方式

Loose Index Scan索引跳跃式扫描

要实现跳跃式扫描要满足如下条件

联合索引的前列选择率要低且必须在索引中,group by依次按照索引顺序,且select列中只有索引的左边的一部分且不能回表

例如如下表结构和执行计划

CREATE TABLE `t_group6` (

`emp_no` int(11) NOT NULL,

`dept_no` char(4) NOT NULL,

`from_date` date NOT NULL,

`to_date` date NOT NULL,

KEY `ix_empno_to_date` (`dept_no`,`to_date`,`emp_no`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

按照索引顺序的语句执行计划如下

bdd47c084552

group1.png

但不按照索引顺序的时候则没有优化效果

bdd47c084552

group2.png

2. 前导列是关键

Tight Index Scan

range scan或者index scan

创建索引的时候必须按照where条件中的等号字段再加上group后面的列的顺序来创建

bdd47c084552

group3.png

进行排序(5.5~5.7),8.0开始不在对group by后面的关键字进行排序

group by的排序方式,字符类型是按照ascii码从左到右进行排序的。当含有空和null时 NULL在前空在后

相关参数

5.7 tmp_table_size,max_heap_table_size

group by分为完全模式和不完全模式

开启完全group by形式之后 select后面的字段中不能出现非聚合函数包裹的字段否则会报错

通过set session sql_mode = ' '; 这个配置来控制

并且不完全group模式会随着执行计划不同导致结果不同

例如有如下的表,索引情况和数据如下

bdd47c084552

group4.png

当使用ix_torder_2索引时,执行结果如下

bdd47c084552

group5.png

当时用ix_torder_3索引时,执行结果就改变了

bdd47c084552

group6.png

DISTINCT

和group by一样,可以去掉重复值。区别在于distinct不排序。

在内存中创建一个对应列为pk的临时表。对数据集进行遍历,没有则插入临时表,有则违反uk约束

从执行计划上也能看出distinct和group by的区别

bdd47c084552

group.png

order by

order by后面可以写数字表示列名,如下例子中1表示emp_no

bdd47c084552

order.png

order by有两种排序方式

single pass

原理:一次性把sql中涉及的字段全部多出来,然后依据排序字段排序,最后直接返回结果。

优点:只需要一次顺序I/O,无需任何随机I/O,大大降低I/O开销

缺点:内存容不下大量数据时,可能会先放到磁盘上对数据切分,对小数据块逐个排序,然后将结果集merge,在返回结果集

例子如下,下图为数据库排序相关数据记录,sort_merge_passes为0,一般表示执行了一次排序

bdd47c084552

order1.png

执行语句如下

SELECT emp_no,first_name,last_name FROM employees ORDER BY first_name LIMIT 100;

执行后的数据纪录如下

bdd47c084552

order2.png

执行计划如下

bdd47c084552

order3.png

简单的说就是把emp_no,first_name,last_name全部装到sort_buffer中排序

SQL单次排序效率与order by后面的字段以及select后面的字段都有关系,主要优化点在于减少数据集面积。所以查询的时候一般不要用*

two pass

原理:先读取行指针和排序字段进行排序,然后依据排序结果在读取所需数据

优点:排序数据量小,完全可以在内存中完成

缺点:第二次读取时,会发生大量随机I/O,太昂贵

two pass有两个要求

排序的列的值要大于max_length_for_sort_data

select列当中包含blob或者text的值

实例如下,下图为排序语句执行前的数据库记录

bdd47c084552

order1.png

执行语句如下

SELECT SQL_NO_CACHE COUNT(1) FROM (

SELECT * FROM t11 IGNORE INDEX (PRIMARY) ORDER BY c10 DESC LIMIT 10000

) a;

执行语句后,服务器记录如下

bdd47c084552

order4.png

sort_merge_pass超过sort buffer大小时 排序数据就会写入temp file中 temp file合并的次数 数据的大小都和sort_buffer_size有直接关系

order by limit有重复列的情况

如下表,其中ID不是primary key

bdd47c084552

order5.png

三个问题

当只有8条数据记录的时候,用create_time排序分页会有数据重复和丢失的问题,而用age则没有出现,age也有重复值

运行SQL如下

SELECT * FROM test ORDER BY create_time LIMIT 0,4;

SELECT * FROM test ORDER BY create_time LIMIT 4,4;

运行结果如下

bdd47c084552

order6.png

如图所示,id=8的数据重复,$\color{red}{原因是create_time重复,MySQL再回表查询剩余数据的时候,只要找到符合当前create_time的数据即可,所以造成了数据重复}$

当增加了4条记录后,用create_time排序分页没有问题,而使用age排序会出现数据重复或者丢失,age的重复值没有create_time多

给排序列加索引对这个问题没有帮助

math?formula=%5Ccolor%7Bred%7D%7B(%E5%85%B7%E4%BD%93%E8%AF%B4%E6%98%8E%E8%A7%81http%3A%2F%2Fmysql.taobao.org%2Fmonthly%2F2015%2F06%2F04%2F)%7D

解决方法

加入id列作为二级排序列

切换innodb引擎,对create_time列添加索引

创建内部视图,使用distinct关键字阻止视图合并

语句如下

SELECT * FROM (SELECT DISTINCT t.* FROM test t ORDER BY create_time) a LIMIT 0,4;

COUNT

count不计算NULL,在不影响业务的情况下可以使用count(1)或者count(*)。但是由于count是不计算NULL的,所以最好使用非空并且not null的列进行统计,如自增的primary key

count优化的例子

有如下的两条SQL

1. SELECT COUNT(fid=6 AND displayorder >= 0) AS posts FROM cdb_threads;

2. SELECT COUNT(fid) AS threads FROM cdb_threads WHERE fid=6 AND displayorder >= 0;

上面两条SQL对比,2要比1效率高,主要原因在于count的调用次数。假设当前表没有任何索引,查询fid=6并且displayorder >= 0的数据都需要全表扫描,效果相同。如果结果集数据为100条,2只需要对结果集进行一次count,但是1在整个过程中要进行100次count,显然1的效率更低

原语句如下

SELECT COUNT(emp_no IS NULL) AS c1 FROM t_order;

优化语句如下

1. SELECT COUNT(*) c1 FROM t_order WHERE emp_no IS NULL;

2. SELECT COUNT(case when emp_no is null then 1 end) c1 FROM t_order;

如上两个SQL 2相对1来说,效率会低。原因是行数不同。1会根据条件进行过滤,然后count统计实际的数据量。2则是全表扫描,会产生大量I/O

MIN/MAX

min和max可以求出数据列的最大最小值,但是不包含null

min/max使用例子

查看employess表中是否存在emp_no=1000111,存在返回1不存在返回0。直接使用case加where不能满足,如图

bdd47c084552

max.png

根据图片我们可以发现,直接使用case加where有值返回1,但是没值不会返回0。我们可以使用max或者min解决上面的问题,如图

bdd47c084552

max1.png

max/min在没有值返回的时候,会返回NULL,可以使用这个特性解决问题

实际优化案例

有如下语句,运行时间超过5s

SELECT t.systen_phone AS plattorm, t.platform_name AS channel,

COALESCE(

sum(case

when t.is_success='T' and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as applytotal ,

COALESCE(

sum(case

when is_påss='T' and tcl.is_refuse = 'F' and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as tpasscount,

COALESCE(

sum(case

when (status >= 40 or status = -40) and yrd_time IS NOT NULL and t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as transportnum,

COALESCE(

sum(case

when (status >= 40 or status = -40)and yrd_time IS NOT NULL and t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59'

then t.apply_amount/10000 else 0 end

)

,0) as transportamount,

COALESCE(

sum(

case

when audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as cponum,

COALESCE(

sum(

case

when (status !=50 ) and audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as cpopassnum,

COALESCE(

sum(

case when (status !=50 ) and audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then t.apply_amount/10000 else 0 end

)

,0) as cpopassamount,

COALESCE(

sum(

case

when (status = 90 or status = 80 or status = 60) and tc.contractamount is not NULL and t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then tc.contractamount/10000

when (status = 90 or status = 80 or status = 60 ) and tc.contractamount is NULL and t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then t.applt_amount/10000 else 0 end

)

,0) as contractamount,

COALESCE(

sum(

case when (status = 90 or status = 80 or status = 60) and t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as closenum

FROM tb_cbm_transport_yrd AS t

LEFT JOIN tb_cbm_transport_result tc ON t.id = tc.transport_id

LEFT JOIN tb_cbm_1imit tcl ON t.limit_id = tcl.limit_id

WHERE t.tools_platorm_code = 'uxWap'

GROUP BY left(t.system_phone,1);

执行计划如下图

bdd47c084552

max2.png

查询效率低的主要是原因是where条件提前,造成全表扫描,I/O较大,优化思路就是where条件优化,提取公共部分,并且创建索引

优化后语句如下,优化后时间为0.07s。可以看出优化的主要位置是将where提取出来

SELECT t.systen_phone AS plattorm, t.platform_name AS channel,

COALESCE(

sum(case

when t.is_success='T' and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as applytotal ,

COALESCE(

sum(case

when is_påss='T' and tcl.is_refuse = 'F' and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as tpasscount,

COALESCE(

sum(case

when (status >= 40 or status = -40) and yrd_time IS NOT NULL and t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as transportnum,

COALESCE(

sum(case

when (status >= 40 or status = -40)and yrd_time IS NOT NULL and t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59'

then t.apply_amount/10000 else 0 end

)

,0) as transportamount,

COALESCE(

sum(case

when audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as cponum,

COALESCE(

sum(case

when (status !=50 ) and audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as cpopassnum,

COALESCE(

sum(case

when (status !=50 ) and audit_time IS NOT NULL and t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59'

then t.apply_amount/10000 else 0 end

)

,0) as cpopassamount,

COALESCE(

sum(case

when (status = 90 or status = 80 or status = 60) and tc.contractamount is not NULL and t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then tc.contractamount/10000

when (status = 90 or status = 80 or status = 60 ) and tc.contractamount is NULL and t.loan_date >= '2016-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then t.applt_amount/10000 else 0 end

)

,0) as contractamount,

COALESCE(

sum(case

when (status = 90 or status = 80 or status = 60) and t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59'

then 1 else 0 end

)

,0) as closenum

FROM(

SELECT ifnull(max(t.system_phone) , 'androld') system_phone,

ifnull(max(t.platform_name),'YRD_APP') platform_name,

t.is_success, t.create_time ,t.is_påss,t.status,t.yrd_time,

t.apply_amount, t.audit_time, t.loan_date,t.id ,t.limit_id

FROM

tb_cbm_transport_yrd AS t

where t.tools_platorm_code = 'uxWap'

and e.system_phone = 'android'

and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

and (

(t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59')

or (t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59')

or (t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59')

or (t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59')

)

union all

SELECT ifnull(max(t.system_phone) , 'androld') system_phone,

ifnull(max(t.platform_name),'YRD_APP') platform_name,

t.is_success, t.create_time ,t.is_påss,t.status,t.yrd_time,

t.apply_amount, t.audit_time, t.loan_date,t.id ,t.limit_id

FROM

tb_cbm_transport_yrd AS t

where t.tools_platorm_code = 'uxWap'

and e.system_phone = 'ios'

and t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59'

and (

(t.create_time >= '2018-04-24' and t.create_time <= '2018-04-24 23:59:59')

or (t.yrd_time >= '2018-04-24' and t.yrd_time <= '2018-04-24 23:59:59')

or (t.audit_time >= '2018-04-24' and t.audit_time <= '2018-04-24 23:59:59')

or (t.loan_date >= '2018-04-24' and t.loan_date <= '2018-04-24 23:59:59')

)

) t

LEFT JOIN tb_cbm_transport_result tc ON t.id = tc.transport_id

LEFT JOIN tb_cbm_1imit tcl ON t.limit_id = tcl.limit_id

WHERE t.tools_platorm_code = 'uxWap' GROUP BY left(t.system_phone,1);

sum/avg

只能对数字或者能转化成数字的字符串类型计算

在MySQL中1/0返回NULL,如下图

bdd47c084552

sum.png

avg是不会计算NULL那一行的,有如下例子

bdd47c084552

sum1.png

math?formula=%5Ccolor%7Bred%7D%7B%E7%94%A8avg%E8%AE%A1%E7%AE%97%E5%B9%B3%E5%9D%87%E6%95%B0%E5%92%8C%E4%BD%BF%E7%94%A8sum%E5%8A%A0count%E8%AE%A1%E7%AE%97%E5%B9%B3%E5%9D%87%E6%95%B0%E7%9A%84%E5%80%BC%E5%AD%98%E5%9C%A8%E5%B7%AE%E5%BC%82%EF%BC%8C%E5%8E%9F%E5%9B%A0%E6%98%AFavg%E4%BC%9A%E6%8E%92%E9%99%A4NULL%E9%82%A3%E4%B8%80%E8%A1%8C%EF%BC%8C%E4%B8%8D%E8%BF%9B%E8%A1%8C%E8%AE%A1%E7%AE%97%E3%80%82%E4%BB%8Ecount%E5%8F%AF%E4%BB%A5%E7%9C%8B%E5%87%BA%EF%BC%8Ccount%E8%BF%9B%E8%A1%8C%E7%BB%9F%E8%AE%A1%E6%97%B6%EF%BC%8C%E4%BD%BF%E7%94%A8*%E4%BC%9A%E7%BB%9F%E8%AE%A1%E5%85%A8%E9%83%A8%E7%9A%84%E8%A1%8C%E6%95%B0%EF%BC%8C%E4%BD%86%E6%98%AF%E4%BD%BF%E7%94%A8%E5%88%97%E5%90%8D%E5%88%99%E4%BC%9A%E6%8E%92%E9%99%A4%E7%A9%BA%E8%A1%8C%EF%BC%8C%E6%89%80%E4%BB%A5avg%E6%98%AF%E6%80%BB%E6%95%B0%E9%99%A4%E4%BB%A59%E8%A1%8C%E7%9A%84%E7%BB%93%E6%9E%9C%E3%80%82%7D

HAVING

having主要用于group by后的二次过滤

having可以减少子查询嵌套,如下例子

SELECT * FROM (SELECT 'wassup' AS hi FROM (select 1) x) a WHERE hi = 'wassup';

可以使用having改写,简化嵌套

SELECT 'wassup' AS hi FROM (select 1) x HAVING hi = 'wassup';

行转列,列转行

行转列

bdd47c084552

line.png

列转行

bdd47c084552

line1.png

思路:

先把一行边两行

math?formula=%5Ccolor%7Bred%7D%7B(%E8%BF%99%E7%A7%8D%E5%A4%8D%E5%88%B6%E6%93%8D%E4%BD%9C%E4%BA%A7%E7%94%9F%E7%AC%9B%E5%8D%A1%E5%B0%94%E7%A7%AF%E4%B8%80%E5%AE%9A%E8%A6%81%E5%8F%91%E7%94%9F%E5%9C%A8%E5%85%A8%E9%83%A8%E6%9F%A5%E8%AF%A2%E4%B9%8B%E5%90%8E%EF%BC%8C%E5%90%A6%E5%88%99%E4%B8%A5%E9%87%8D%E5%BD%B1%E5%93%8D%E6%95%88%E7%8E%87)%7D

bdd47c084552

line2.png

使用case when分类

bdd47c084552

line3.png

GROUP BY和JOIN优化案例

有如下语句,运行时间为166s

SELECT product_sku.branch_id AS branchId,product_sku.product_id AS productId,

sum((product_sku.num - ifnull(branch_return_goods_info.num,0))) AS totalNum,

sum((product_sku.unit_price * product_sku.num) - ifnull(branch_return_goods_info.money,0)) AS totalMoney,

sum(ifnull(branch_return_goods_info.num,0)) returnNum,

sum(ifnull(branch_return_goods_info.money,0)) returnMoney,

product_info.branch_name AS branchName,

product_info.brand_name AS brandName,

product_info.product_name AS productName,

product_info.cate_one_name AS cateOneName,

product_info.cate_two_name AS cateTwoName,

product_info.cate_three_name AS cateThreeName,

sk.miniNum AS miniNum,

product_info.sell_unit_convert AS sellUnitConvert,

sk.stockMoney As stockMoney

FROM product_sku_201704 poduct_sku FORCE INDEX (IDX_time_order)

INNER JOIN product_info on product_info.branch_id= product_sku.branch_id AND product_info.pcoduct_id= poduct_sku.product_id

AND product_sku.store_id NOT in (108,109)

LEFT JOIN branch_return_goods_info on branch_return_goods_info.order_id = product_sku.orderNo AND branch_return_goods_info.hmp_id = product_sku.product_id AND branch_return_goods_info.type = 2

LEFT JOIN stock_turnover as sk on sk.store_id = product_sku.store_id AND sk.product_id = product_sku.product_id and sk.datetime ='2017-04-25'

WHERE product_sku.order_type >= 0

AND product_sku.orderNo not like 'YL%'

AND product_sku.ordetrNo not like 'PT%'

AND product_sku.order_time >= '2017-04-01 00:00:00'

AND product_sku.order_time <= '2017-04-25 23:59:59'

AND product_sku.orderState = 100

GROUP BY product_sku.product_id,product_sku.branch_id;

执行计划如下

bdd47c084552

group7.png

从执行计划我们可以看出,

math?formula=%5Ccolor%7Bred%7D%7Busing%20where%EF%BC%8Cusing%20temporary%EF%BC%8Cusing%20filesort%E5%90%8C%E6%97%B6%E4%BA%A7%E7%94%9F%EF%BC%8C%E8%AF%B4%E6%98%8Egroup%20by%E5%8F%91%E7%94%9F%E5%9C%A8join%E4%B9%8B%E5%90%8E%EF%BC%8C%E6%98%AF%E5%AF%B9%E6%95%B4%E4%B8%AA%E7%BB%93%E6%9E%9C%E9%9B%86%E8%BF%9B%E8%A1%8C%E6%93%8D%E4%BD%9C%EF%BC%8C%E6%80%A7%E8%83%BD%E4%BC%9A%E6%AF%94%E8%BE%83%E5%B7%AE%E3%80%82%7D阅读上述SQL可以发现,where过滤条件只和product_sku表有关,和join表没有关系,所以优化思路是先过滤,缩小结果集,然后join

优化后的SQL如下,优化后执行时间为25s

SELECT

product_sku.btanch_id as btanchId,

product_sku.ptoduct_id As ptoductId,

sum((product_sku.num - ifnull(branch_return_goods_info.num,0))) AS totalNum,

sum((product_sku.unit_price * product_sku.num) - ifnull(branch_return_goods_info.money,0)) AS totalMoney,

sum(ifnull(branch_return_goods_info.num,0)) returnNum,

sum(ifnull(branch_return_goods_info.money,0)) returnMoney,

product_info.branch_name AS branchName,

product_info.brand_name AS brandName,

product_info.product_name AS productName,

product_info.cate_one_name AS cateOneName,

product_info.cate_two_name AS cateTwoName,

product_info.cate_three_name AS cateThreeName,

sk.miniNum AS miniNum,

product_info.sell_unit_convert AS sellUnitConvert,

sk.stockMoney As stockMoney

FROM (

SELECT

product_sku.prcdct_id,

pcoduct_sku.branch_id,

sum(product_sku.num) num,

SUM(product_sku.unit_price) unit_price,

product_sku.orderNo,

product_sku.store_id

FROM proauct_sku_201704 produet_sku

WHERE product_sku.order_type >= 0

AND product_sku.orderNo not like 'YL%'

AND product_sku.ordetrNo not like 'PT%'

AND product_sku.order_time >= '2017-04-01 00:00:00'

AND product_sku.order_time <= '2017-04-25 23:59:59'

AND product_sku.orderState = 100

AND product_sku.store_id NOT in (108,109)

GROUP BY product_sku.product_id,product_sku.branch_id

) product_sku

INNER JOIN product_info on product_info.branch_id= product_sku.branch_id AND product_info.pcoduct_id= poduct_sku.product_id

LEFT JOIN branch_return_goods_info on branch_return_goods_info.order_id = product_sku.orderNo AND branch_return_goods_info.hmp_id = product_sku.product_id AND branch_return_goods_info.type = 2

LEFT JOIN stock_turnover as sk on sk.store_id = product_sku.store_id AND sk.product_id = product_sku.product_id and sk.datetime ='2017-04-25'

GROUP BY product_sku.product_id,product_sku.branch_id;

优化后执行计划如下

bdd47c084552

group8.png

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值