GROUP BY和DISTINCT有什么区别

本文翻译自:Is there any difference between GROUP BY and DISTINCT

I learned something simple about SQL the other day: 前几天,我学到了一些关于SQL的简单知识:

SELECT c FROM myTbl GROUP BY C

Has the same result as: 与以下结果相同:

SELECT DISTINCT C FROM myTbl

What I am curious of, is there anything different in the way an SQL engine processes the command, or are they truly the same thing? 我很好奇,SQL引擎处理命令的方式是否有所不同,或者它们确实是同一回事?

I personally prefer the distinct syntax, but I am sure it's more out of habit than anything else. 我个人更喜欢独特的语法,但是我敢肯定,它比其他任何东西都更加习惯。

EDIT: This is not a question about aggregates. 编辑:这不是有关聚合的问题。 The use of GROUP BY with aggregate functions is understood. 了解将GROUP BY与聚合函数一起使用。


#1楼

参考:https://stackoom.com/question/GKj/GROUP-BY和DISTINCT有什么区别


#2楼

I read all the above comments but didn't see anyone pointed to the main difference between Group By and Distinct apart from the aggregation bit. 我阅读了以上所有注释,但没有人指出除聚合位以外,Group By和Distinct之间的主要区别。

Distinct returns all the rows then de-duplicates them whereas Group By de-deduplicate the rows as they're read by the algorithm one by one. Distinct返回所有行,然后对它们进行重复数据删除,而Group By对算法逐行读取的行进行重复数据删除。

This means they can produce different results! 这意味着他们可以产生不同的结果!

For example, the below codes generate different results: 例如,以下代码产生不同的结果:

SELECT distinct ROW_NUMBER() OVER (ORDER BY Name), Name FROM NamesTable

 SELECT ROW_NUMBER() OVER (ORDER BY Name), Name FROM NamesTable
GROUP BY Name

If there are 10 names in the table where 1 of which is a duplicate of another then the first query returns 10 rows whereas the second query returns 9 rows. 如果表中有10个名称,其中一个与另一个名称重复,则第一个查询返回10行,而第二个查询返回9行。

The reason is what I said above so they can behave differently! 原因是我上面所说的,所以他们的行为可能会有所不同!


#3楼

它们具有不同的语义,即使它们在您的特定数据上碰巧得到相同的结果。


#4楼

GROUP BY lets you use aggregate functions, like AVG , MAX , MIN , SUM , and COUNT . GROUP BY使您可以使用聚合函数,例如AVGMAXMINSUMCOUNT On the other hand DISTINCT just removes duplicates. 另一方面, DISTINCT只会删除重复项。

For example, if you have a bunch of purchase records, and you want to know how much was spent by each department, you might do something like: 例如,如果您有一堆采购记录,并且想知道每个部门花费了多少,则可以执行以下操作:

SELECT department, SUM(amount) FROM purchases GROUP BY department

This will give you one row per department, containing the department name and the sum of all of the amount values in all rows for that department. 这将为您提供每个部门一行,其中包含部门名称以及该部门所有行中所有amount值的总和。


#5楼

Use DISTINCT if you just want to remove duplicates. 如果您只想删除重复项,请使用DISTINCT Use GROUPY BY if you want to apply aggregate operators ( MAX , SUM , GROUP_CONCAT , ..., or a HAVING clause). 如果要应用聚合运算符( MAXSUMGROUP_CONCAT ,...或HAVING子句),请使用GROUPY BY


#6楼

For the query you posted, they are identical. 对于您发布的查询,它们是相同的。 But for other queries that may not be true. 但是对于其他查询可能并非如此。

For example, it's not the same as: 例如,它与以下内容不同:

SELECT C FROM myTbl GROUP BY C, D
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值