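Recently I helped a friend process a batch of data: the task was to pull out the rows that have duplicate values in certain columns. The data was in Excel, about 100,000 rows. Since that is awkward to do in Excel, I decided to import the data into a database and work on it with SQL.
The import went fine, but my SQL is not very strong, so I ran into some trouble with the queries.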
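I used group by to collapse the duplicates, but that left one problem. As everyone knows, once you use group by, the columns listed after select have to line up with the ones after group by. My group by covered only a few columns, so how do I get at the other, ungrouped columns? I searched online,
and here is a summary:
Once group by is used, every column after select must either appear in the group by clause or be wrapped in an aggregate function, so any other column cannot be selected directly.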
Wrap the column you need in max or min; both work on character types as well as numeric types:
select max(id) as id, username, password from users
group by username, password
order by id desc
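Here is a minimal sketch of the max() approach, run from Python against an in-memory SQLite database so it is easy to try out. The users table and its id, username, password columns come from the query above; the sample rows and the helper name latest_id_per_user are made up for illustration, and other databases may differ in details such as how order by resolves the id alias.

```python
import sqlite3

def latest_id_per_user(rows):
    """Return one (id, username, password) row per distinct
    (username, password) pair, keeping the largest id of each group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table users (id integer primary key, username text, password text)"
    )
    conn.executemany(
        "insert into users (id, username, password) values (?, ?, ?)", rows
    )
    # id is not in the group by list, so it has to go through an aggregate.
    result = conn.execute(
        "select max(id) as id, username, password "
        "from users "
        "group by username, password "
        "order by id desc"
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: rows 1 and 3 share the same username/password.
sample_rows = [
    (1, "alice", "pw1"),
    (2, "bob", "pw2"),
    (3, "alice", "pw1"),
    (4, "bob", "pw9"),
]
deduped = latest_id_per_user(sample_rows)
print(deduped)
```

Rows 1 and 3 fall into the same group, so only the one with the larger id survives; the two bob rows have different passwords and therefore stay separate.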
Or use:
select * from
(select part from employee group by part) as t1
inner join
-- t2 has to select part as well, otherwise the on clause cannot reference t2.part
(select distinct part, englishname from employee where part in (select part from employee group by part)) as t2
on t1.part = t2.part
For reference:
select v.p, v.a, v.b, v.c, v.d, v.e, v.f, v.g, v.h, v.i, v.j, v.k, v.l, v.m, v.n, v.o
from vegaga v
right join (
  select min(id) as id, a, b, c, d, e, f, g, h, n from vegaga where a is not null group by a, b, c, d, e, f, g, h, n
) as v1 on v1.id = v.id
order by v.id
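The pattern in that last query (pick min(id) per group, then join back to the table to recover every column) can be sketched on a much smaller table. Everything here is hypothetical: the items table, its note column, the sample rows, and the helper name first_row_per_group only stand in for vegaga. The sketch also uses inner join rather than right join, on the assumption that older SQLite builds lack right join; since every id produced by the subquery exists in the table, the result should be the same.

```python
import sqlite3

def first_row_per_group(rows):
    """Keep the full row with the smallest id for each distinct (a, b)
    pair, skipping rows where a is null."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table items (id integer primary key, a text, b text, note text)"
    )
    conn.executemany(
        "insert into items (id, a, b, note) values (?, ?, ?, ?)", rows
    )
    # The subquery returns one id per group; the join fetches the
    # ungrouped column (note) that group by alone could not select.
    result = conn.execute(
        "select v.id, v.a, v.b, v.note "
        "from items v "
        "inner join ("
        "  select min(id) as id from items "
        "  where a is not null group by a, b"
        ") as v1 on v1.id = v.id "
        "order by v.id"
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data.
sample_items = [
    (1, "x", "y", "first x/y"),
    (2, "x", "y", "duplicate of row 1"),
    (3, "x", "z", "first x/z"),
    (4, None, "y", "skipped: a is null"),
    (5, "x", "z", "duplicate of row 3"),
]
kept = first_row_per_group(sample_items)
print(kept)
```

Compared with wrapping each extra column in max or min, joining back on the id guarantees that all returned columns come from the same original row.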