数据表
表名:taobao_data
item_id:商品类型
shop_id:店铺ID
insert_time:更新时间
volume:月销量
cprice:实际价格
问题:
- 找出一天内有多次更新的商品ID
- 对于一天内有多次更新的商品,只保留当天最后一条记录,删除其他记录。
问题1解析:
我们可以看到时间列中是包含时、分、秒的,故我们需要创建一个只包含年、月、日的列,然后进行group by操作即可。
实现代码如下:
select item_id, date, count(item_id) as con_item
from(
select
item_id, date(insert_time) as date
from taobao_date) a
group by item_id, date
having con_item > 1
我们也可以尝试在原表中添加date列的方法进行解题,代码如下:
later table taobao_data add date VARCHAR(20)
UPDATE taobao_data set date = date(insert_time)
select item_id, date, count(item_id)
from taobao_data
group by item_id, date
having count(item_id) > 1
问题2解析
我们可以分步骤处理:
1)获取一天内多次更新的商品ID;
2)根据更新时间对商品ID倒序排列,并添加序号;
3)删除排序后序号大于1的记录即可。
通过问题1我们可以很轻松的得到步骤一的代码:
select item_id
from(
select item_id, date(insert_time) as date
from taobao_data
) a
group by item_id, date
having count(item_id) > 1
因为MySQL5.7中没有row_number() over()函数,所以步骤2中我们要通过set设置变量的方式添加序号。
set @i :=0, @type1 :='', @type2 :='';
select
@i := case when @type1 = item_id and @type2 = date(insert_time) then @i+1 else 1 end as num
@type1 := item_id as type1
@type2 := date(insert_time) as type2
insert_time
from taobao_data
where item_id in(
select item_id
from(
select item_id, date(insert_time) as date
from taobao_data
) a
group by item_id, date
having count(item_id) > 1)
order by item_id, insert_time desc
步骤三代码如下:
set @i :=0, @type1 :='', @type2 :='';
delete from taobao_data
where insert_time in(
select insert_time
from(
select
@i := case when @type1 = item_id and @type2 = date(insert_time) then @i+1 else 1 end as num
@type1 := item_id as type1
@type2 := date(insert_time) as type2
insert_time
from taobao_data
where item_id in(
select item_id
from(
select item_id, date(insert_time) as date
from taobao_data
) a
group by item_id, date
having count(item_id) > 1)
order by item_id, insert_time desc)b
where num > 1)
可以看到原表中,只剩下某一天中的最后一条更新记录。
对于问题2,如果使用新建date列的方式,代码如下:
SET @i :=0, @type1 :='', @type2 :='';
delete from taobao_data_copy
where insert_time in(
select insert_time
from
(select
@i := case when @type1 = item_id and @type2 = date then @i+1 else 1 end as num,
@type1 := item_id as type1,
@type2 := date as type2,
insert_time
from taobao_data_copy
where item_id in(
select item_id
from taobao_data_copy
group by item_id, date
having count(item_id)>1)
order by item_id, insert_time desc) a
WHERE num >1)
PS:对于MySQL8.0来说解决第二问的方法会方便很多,具体可以参考如下代码:
delete from taobao_data
where insert_time in (
select insert_time
from(
select item_id, insert_time,
row_number() over(partition by item_id, date order by insert_time desc) as num
from taobao_data
where item_id in(
select item_id, date, count(item_id)
from taobao_data
group by item_id, date
having count(item_id) > 1
)
) as a
where num >1)