内容来源为六星教育,这里仅作为学习笔记
批量新增
Java环境安装参考地址:https://www.cnblogs.com/duanzq/p/12640335.html
索引优化与建立索引思想
分页优化复习
问题1:查询商品数据-分页查询
select * from products limit 10000000,10;–用时7s
优化思想: 可以通过id进行查询
select * from products where id >= (select id from products limit 10000000,1) LIMIT 10;–用时3s
优化
测试表如下
Customers1s 员工信息薪资表
Product 商品表
-
查询通过审核与销量排名前十的商品(当前店铺id为1的店铺)
select * from products where status =1 and shop_id =1 order by sold_count desc limit 0,10;–用时22s
添加索引
alter table products add index idx_shopid_status_sold_count(shop_id,status,sold_count);
再次查询
select * from products where status =1 and shop_id =1 order by sold_count desc limit 0,10;–用时0.022s思路:
- 先看索引信息
- 在使用explain进行sql分析
- 调整索引
-
查询女性的平均月薪与客户数量
select count(*),avg(monthsalary) from customers1s where gender = 0;
用时12s
建立索引:idx_gender_monthsalary(gender,monthsalary);alter table customers1s add index idx_gender_monthsalary(gender,monthsalary);
再次查询会快很多 用时1.8s -
查询不同城市的客户数量与平均月薪
select count(*),avg(monthsalary) from customers1s group by city;
用时21s
建立索引:idx_city_gender_monthsalary(city,gender,monthsalary);alter table customers1s add index idx_city_gender_monthsalary(city,gender,monthsalary);
再次查询会快很多 用时2s
函数索引
不同年龄段的客户年收入。
这个地方需要使用elt,interval,在查询中有些时候我们是需要进行区间分组
elt,interval主要是去解决区间分组的查询问题
select
elt(interval(TIMESTAMPDIFF(YEAR, birthdate, CURDATE()), 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
'0-10', '10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90', '90-100', '100>'
) as age_level, avg((monthsalary * 12 + yearbonus)) as income
from customers1s
group by
elt(interval(TIMESTAMPDIFF(YEAR, birthdate, CURDATE()), 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
'0-10', '10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90', '90-100', '100>'
);
Sql解释:
TIMESTAMPDIFF(interval,colum1,colum2)
字段类型:date或者datetime
计算过程:colum2减去colum1,即后面的减去前面的
计算结果:整数
CURDATE() 函数返回当前的日期。
思路:
创建索引从上可以得知我们可以通过给birthdate, monthsalary, yearbonus 这三个字段定义索引就可以起到很好地效果,但是如何定义,索引字段 顺序如何选择?
如上的SQL可通过按经验法则来做,因为经验法则考虑的是全局基数和选择性,而不是某个具体查询:查找方式主要是统计数据对于总数据量的分布范围,选择 与常作为条件并且分布相对多的字段
select
count(distinct yearbonus)/ count() as year_select,
count(distinct monthsalary)/ count() as mon_select,
count(distinct TIMESTAMPDIFF(YEAR, birthdate, CURDATE())) / count() as bir_select,
count() as counts
from customers1s;
解决方案:
函数索引:MySQL所谓的函数索引其实在本质上解释还是btree索引,实现的方式主要是通过在数据表中新增一个字段用来存放函数字段列,然后对这个字段定 义一个索引。
创建函数索引
1、添加一个虚拟的字段 alter table customers1s add column birthdate_generated int generated always as (elt(interval(TIMESTAMPDIFF(YEAR, birthdate, CURDATE()), 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
‘0-10’, ‘10-20’, ‘20-30’, ‘30-40’, ‘40-50’, ‘50-60’, ‘60-70’, ‘70-80’, ‘80-90’, ‘90-100’, ‘100>’
));
alter table customers1s add key idx_birthdate_interval(elt(interval(TIMESTAMPDIFF(YEAR, birthdate, CURDATE()), 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
‘0-10’, ‘10-20’, ‘20-30’, ‘30-40’, ‘40-50’, ‘50-60’, ‘60-70’, ‘70-80’, ‘80-90’, ‘90-100’, ‘100>’
));
2、添加索引 alter table customers1s add key idx_birthdate_monthsalary_yearbonus(birthdate,monthsalary,yearbonus);
Max函数与分组问题
当我们使用max与group by一起是用时会出现数据不一致的问题。
select max(sold_count),id from products where status = 1 group by shop_id;
select * from products where id = 213690;
两次查出来的销量sole_count不一样
原因:
主要是由group by 分组后显示的是第一条记录,而max()取的是相同sid中的最大score值造成的
解决办法:子查询
SELECT
a.id,
a.sold_count,
a.shop_id
FROM
products AS a,
(
SELECT
max(sold_count) as sold_count,
shop_id
FROM
products
WHERE
STATUS = 1
GROUP BY
shop_id
) AS b
WHERE
a.shop_id = b.shop_id and a.sold_count = b.sold_count;