题目:
假如两个库表分别有如下字段:
表名:customers(客户基本信息,约1亿条数据) | ||
字段名 | 数据类型 | 是否非空 |
id | bigint(10) | 是 |
name | varchar(20) | 是 |
city | varchar(20) | 是 |
gender | tinyint(1) | 是 |
birthday | date | 是 |
mobile | varchar(20) | |
photo | varchar(255) |
表名:salary(客户收入情况,约600万条记录) | ||
字段名 | 数据类型 | 是否非空 |
id | bigint(11) | 是 |
monthsalary | decimal(10,2) | 是 |
yearbonus | decimal(10,2) |
请写出sql语句分别满足如下查询需求:(请注意查询性能,优化语句为方便测试目前数据库数据为2198031条数据,仅供参考)
(1)女性客户数量和平均月薪;
(2)不同城市的客户数量和平均月薪;
(3)列出没有手机号,或者没有照片,或者没有年奖金的客户姓名
(4)不同年龄段(0到100岁之间,每10岁为一个年龄段)的客户年收入;
(5)列出每个城市的年收入最高和最低的男性和女性的姓名和全年收入
答:
1)sql语句
select count(*)as c,avg(monthsalary) as a_monthsalary from customers where gender=0
未做优化前 :时间耗时为2.079s
优化思路一: 时间耗时为9.491s
ALTER TABLE customers ADD INDEX gender(gender);
优化思路二:时间耗时为2.134s
ALTER TABLE customers ADD INDEX monthsalary(monthsalary);
优化思路三:时间耗时为1.306s
ALTER TABLE customers ADD INDEX monthsalary_gender(monthsalary,gender);
优化思路四:时间耗时为1.176s(最佳)
ALTER TABLE customers ADD INDEX gender_monthsalary(gender,monthsalary);
2) sql语句
select count(*) c,avg(monthsalary) from customers1 GROUP BY city;
优化一: 耗时时间为:108.47s
ALTER TABLE customers ADD INDEX city(city);
优化思路二:耗时时间为: 3.858s
ALTER TABLE customers ADD INDEX monthsalary(monthsalary);
优化思路三:耗时时间为: 3.183s
ALTER TABLE customers ADD INDEX monthsalary_city(monthsalary,city);
优化思路四:耗时时间为:3.37S
ALTER TABLE customers ADD INDEX gender_monthsalary_city(gender,monthsalary,city);
优化思路五:耗时时间为: 1.720s
ALTER TABLE customers ADD INDEX city_monthsalary_gender(city,monthsalary,gender);
优化思路六:耗时时间为: 1.698S (最佳)
ALTER TABLE customers ADD INDEX city_monthsalary(city,monthsalary);
3) 第一种方案 sql语句
select name from customers where mobile="0" or photo='' or yearbonus="0"
未加索引前:2.459s
优化思路一:覆盖索引 (先计算占比)
select
count(distinct mobile)/ count(*) as mobile_select,
count(distinct photo)/ count(*) as photo_select,
count(distinct yearbonus) / count(*) as yearbonus_select,
count(*) as counts
from customers;
创建索引: 耗时时间为:2.178S
ALTER TABLE customers ADD INDEX mobile_yearbonus_photo(mobile,yearbonus,photo);
优化思路二:耗时时间为:2.343s
ALTER TABLE customers add INDEX mobile_yearbonus_photo(mobile,yearbonus,photo);
ALTER TABLE customers ADD INDEX idx_mobile_name(mobile,`name`);
ALTER TABLE customers ADD INDEX idx_photo_name(photo,`name`);
ALTER TABLE customers ADD INDEX idx_yearbonus_name(yearbonus,`name`);
由此可见索引对使用or连接的sql语句不起作用。
方案二:SQL语句
SELECT name from customers where mobile="0"
UNION ALL
SELECT name from customers where photo="0"
UNION ALL
SELECT name from customers where yearbonus=0
未加任何索引前:耗时时间为:5.74s
优化思路一: 耗时时间为:2066.578s
ALTER TABLE customers add INDEX mobile_yearbonus_photo(mobile,yearbonus,photo);
优化思路三:2.885s(最佳)
ALTER TABLE customers ADD INDEX idx_mobile_name(mobile,`name`);
ALTER TABLE customers ADD INDEX idx_photo_name(photo,`name`);
ALTER TABLE customers ADD INDEX idx_yearbonus_name(yearbonus,`name`);
4)sql语句
select elt(
interval(
TIMESTAMPDIFF(YEAR, birthdate, CURDATE()),0,10,20,30,40,50,60,70,80,90,100),
"0-10","10-20","200-30","30-40","40-50","50-60","60-70","70-80","80-90","90-100",">100"
) as age ,monthsalary*12+yearbonus as income
from customers GROUP BY age order by null
未优化前:耗时时间为:4.614s
优化思路一:耗时时间为:5.296s
计算占比:
select
count(distinct yearbonus)/ count(*) as year_select,
count(distinct monthsalary)/ count(*) as mon_select,
count(distinct TIMESTAMPDIFF(YEAR, birthdate, CURDATE())) / count(*) as bir_select,
count(*) as counts
from customers;
优化思路二:耗时时间为:4.124s
ALTER TABLE customers ADD INDEX birthdate_yearbonus_monthsalary(birthdate,yearbonus,monthsalary)
5)sql语句
#最小
select a.name,b.income FROM
customers a,
(
select
city,gender,min(monthsalary * 12 + yearbonus) as income
from
customers
group by city,gender
) b
where
a.city = b.city and
a.gender = b.gender and
(a.monthsalary * 12 + a.yearbonus) = b.income
#最大
select a.name,b.income FROM customers a,
(
select
city,gender,max(monthsalary * 12 + yearbonus) as income
from
customers ignore index(idx_gender_city_monthsalary)
group by city,gender
) b
where
a.city = b.city and
a.gender = b.gender and
(a.monthsalary * 12 + a.yearbonus) = b.income
优化思路一:耗时时间:7.263s
alter table customers1 add index idx_gender_city_name_monthsalary_yearbonus(gender, city, name, monthsalary, yearbonus);
优化思路二: 耗时时间:6.848s
alter table customers add index idx_gender_city_monthsalary_yearbonus(gender, city, monthsalary, yearbonus);
如果使用sql语句优化后仍然不是很理想的情况下,就要考虑分库分表,或者是程序来替代mysql的一些函数运算。(因为mysql函数是无法进行优化的,比如:函数的分组可以通过程序的数组来进行分组等等),以上只是针对这道题所做的一些优化,仅作为参考。