Methods for processing duplicate records

Two or more records are considered duplicates when their values for the columns (prod_id, cust_id, time_id) are the same.
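
To see which keys are actually duplicated, this definition can be turned directly into a query. A minimal sketch, assuming the sales_copy1 table used in the statements of this post; the alias cnt is only illustrative:

```sql
-- List each (prod_id, cust_id, time_id) key that occurs more than once,
-- together with the number of rows sharing it.
select prod_id, cust_id, time_id, count(*) cnt
from sales_copy1
group by prod_id, cust_id, time_id
having count(*) > 1;
```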

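The following SQL statement keeps one row from each group of duplicates, the one with the smallest rowid, using rank():

select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from (
   select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold,
          rank() over (partition by prod_id, cust_id, time_id order by rowid) p1
   from sales_copy1
) v
where v.p1 = 1;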

The same result can be obtained with row_number():

select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from (
   select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold,
          row_number() over (partition by prod_id, cust_id, time_id order by rowid) p1
   from sales_copy1
) v
where v.p1 = 1;
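
The two statements above return the same rows because rowid is unique within a table, so rank() never sees a tie. The difference only appears when the ordering column can tie. A minimal sketch, assuming the same sales_copy1 table and ordering by amount_sold instead of rowid:

```sql
-- Sketch only: rank() ordered by a column that can tie (amount_sold).
-- Rows sharing the highest amount_sold within a group all get p1 = 1,
-- so this query may still return several rows per (prod_id, cust_id, time_id).
select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from (
   select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold,
          rank() over (partition by prod_id, cust_id, time_id
                       order by amount_sold desc) p1
   from sales_copy1
) v
where v.p1 = 1;
```

For that reason row_number(), which always numbers rows 1, 2, 3, ... within a partition, is usually the safer choice when exactly one row per group is wanted.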

The following SQL statement keeps, from each group of duplicates, the record with the largest amount_sold:

select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from (
   select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold,
          row_number() over (partition by prod_id, cust_id, time_id order by amount_sold desc) p1
   from sales_copy1
) v
where v.p1 = 1;
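
When two duplicates share the same largest amount_sold, row_number() still returns only one of them, but which one is not guaranteed. A minimal sketch, assuming that breaking ties by the smallest rowid is acceptable, adds rowid as a secondary sort key to make the choice repeatable:

```sql
-- Sketch only: keep the row with the largest amount_sold per group;
-- among rows tied on amount_sold, keep the one with the smallest rowid.
select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from (
   select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold,
          row_number() over (partition by prod_id, cust_id, time_id
                             order by amount_sold desc, rowid) p1
   from sales_copy1
) v
where v.p1 = 1;
```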


The following SQL statement uses the GROUP BY method, keeping the row with the smallest rowid in each group:

select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
from sales_copy1
where rowid in (
   select min(rowid)
   from sales_copy1
   group by prod_id, cust_id, time_id
);
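
The same subquery can also be used to remove the extra rows from the table rather than just filter them out of a result set. A minimal sketch, assuming it is acceptable to keep the row with the smallest rowid in each group and that the statement is run in a session where it can still be rolled back:

```sql
-- Delete every row except the one with the smallest rowid
-- in each (prod_id, cust_id, time_id) group.
delete from sales_copy1
where rowid not in (
   select min(rowid)
   from sales_copy1
   group by prod_id, cust_id, time_id
);

-- Check the result before making it permanent: this counts the keys that
-- still occur more than once and is expected to return 0.
-- Then issue commit, or rollback to undo the delete.
select count(*)
from (
   select prod_id, cust_id, time_id
   from sales_copy1
   group by prod_id, cust_id, time_id
   having count(*) > 1
);
```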

Source: ITPUB Blog, link: http://blog.itpub.net/24057587/viewspace-735392/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
