I would like to select records from a table, or insert them into a new blank table, where several of the columns have the same values as another record in the database. The problem is similar to this question:
Find duplicate records in MySQL
However, that only compares one column. Also, one of my columns, let's say column C in the example below, is an integer. As in the question linked above, I want each of the matching rows to be returned. Unfortunately, I am just not familiar enough with how joins work to figure this out on my own yet. I know that the code below doesn't resemble the actual SQL code needed at all; it is just the clearest way I can think of to describe the comparisons I am trying to get.
SELECT ColumnE, ColumnA, ColumnB, ColumnC from table where (
Row1.ColumnA = Row2.ColumnA &&
Row1.ColumnB = Row2.ColumnB &&
Row1.ColumnC = Row2.ColumnC
)
Any help would be appreciated. All of the "select duplicates from MySQL" questions I have seen use only one column as a comparison.
Solution
If you want to count duplicates among multiple columns, use group by:
select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from table
group by ColumnA, ColumnB, ColumnC
If you only want the values that are duplicated, then the count is bigger than 1. You get this using the having clause:
select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from table
group by ColumnA, ColumnB, ColumnC
having NumDuplicates > 1
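To see the grouping step in action, here is a minimal sketch that runs the same query from Python. It makes several assumptions that are not in the original: SQLite (via the standard sqlite3 module) stands in for MySQL, the table is called records (a hypothetical name, since table itself is a reserved word), and the sample rows are invented. The HAVING clause repeats COUNT(*) rather than the alias, which is the more portable spelling.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (ColumnE TEXT, ColumnA TEXT, ColumnB TEXT, ColumnC INTEGER)"
)
# Hypothetical sample rows, invented for illustration.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("e1", "x", "y", 1),
        ("e2", "x", "y", 1),  # same (A, B, C) as e1
        ("e3", "x", "y", 2),  # differs from e1 only in ColumnC
        ("e4", "p", "q", 7),
    ],
)

# One output row per (A, B, C) combination that occurs more than once.
duplicate_groups = conn.execute(
    """
    SELECT ColumnA, ColumnB, ColumnC, COUNT(*) AS NumDuplicates
    FROM records
    GROUP BY ColumnA, ColumnB, ColumnC
    HAVING COUNT(*) > 1
    """
).fetchall()

for group in duplicate_groups:
    print(group)
```

Note that e3 does not count as a duplicate: all three columns have to match, including the integer ColumnC.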
If you actually want all the duplicate rows returned, then join the last query back to the original data:
select t.*
from table t join
(select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from table
group by ColumnA, ColumnB, ColumnC
having NumDuplicates > 1
) tsum
on t.ColumnA = tsum.ColumnA and t.ColumnB = tsum.ColumnB and t.ColumnC = tsum.ColumnC
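The join-back can be tried end to end with a small sketch. As before, the assumptions are mine: SQLite stands in for MySQL, and the table names records and duplicate_records and the sample rows are hypothetical. Since the question also asks about copying the duplicates into a new blank table, the sketch wraps the join in an INSERT ... SELECT.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
schema = "(ColumnE TEXT, ColumnA TEXT, ColumnB TEXT, ColumnC INTEGER)"
conn.execute(f"CREATE TABLE records {schema}")
conn.execute(f"CREATE TABLE duplicate_records {schema}")  # the new blank table
# Hypothetical sample rows, invented for illustration.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("e1", "x", "y", 1),
        ("e2", "x", "y", 1),  # same (A, B, C) as e1
        ("e3", "x", "y", 2),
        ("e4", "p", "q", 7),
    ],
)

# Copy every row whose (A, B, C) combination occurs more than once.
conn.execute(
    """
    INSERT INTO duplicate_records
    SELECT t.*
    FROM records t
    JOIN (
        SELECT ColumnA, ColumnB, ColumnC
        FROM records
        GROUP BY ColumnA, ColumnB, ColumnC
        HAVING COUNT(*) > 1
    ) tsum
      ON t.ColumnA = tsum.ColumnA
     AND t.ColumnB = tsum.ColumnB
     AND t.ColumnC = tsum.ColumnC
    """
)

duplicate_rows = conn.execute(
    "SELECT ColumnE, ColumnA, ColumnB, ColumnC FROM duplicate_records ORDER BY ColumnE"
).fetchall()
for row in duplicate_rows:
    print(row)
```

Dropping the INSERT INTO line leaves a plain SELECT that returns the same rows without storing them.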
This will work, assuming none of the column values are NULL. If some of them can be NULL, use this join condition instead:
on (t.ColumnA = tsum.ColumnA or t.ColumnA is null and tsum.ColumnA is null) and
(t.ColumnB = tsum.ColumnB or t.ColumnB is null and tsum.ColumnB is null) and
(t.ColumnC = tsum.ColumnC or t.ColumnC is null and tsum.ColumnC is null)
EDIT:
If you have NULL values, you can also use the NULL-safe operator:
on t.ColumnA <=> tsum.ColumnA and
t.ColumnB <=> tsum.ColumnB and
t.ColumnC <=> tsum.ColumnC
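The effect of NULLs on the join can be checked with a short sketch. It assumes SQLite in place of MySQL, and SQLite has no <=> operator, so the sketch uses SQLite's IS operator as the NULL-safe comparison instead. The table name records, the helper find_duplicates, and the sample rows are all hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (ColumnE TEXT, ColumnA TEXT, ColumnB TEXT, ColumnC INTEGER)"
)
# Hypothetical sample rows; e1 and e2 match, but ColumnB is NULL in both.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("e1", "x", None, 1),
        ("e2", "x", None, 1),
        ("e3", "x", "y", 2),
    ],
)


def find_duplicates(comparison):
    """Run the join-back query using the given comparison operator."""
    join_condition = " AND ".join(
        f"t.{col} {comparison} tsum.{col}"
        for col in ("ColumnA", "ColumnB", "ColumnC")
    )
    query = f"""
        SELECT t.ColumnE, t.ColumnA, t.ColumnB, t.ColumnC
        FROM records t
        JOIN (
            SELECT ColumnA, ColumnB, ColumnC
            FROM records
            GROUP BY ColumnA, ColumnB, ColumnC
            HAVING COUNT(*) > 1
        ) tsum ON {join_condition}
        ORDER BY t.ColumnE
    """
    return conn.execute(query).fetchall()


# GROUP BY puts the two NULL rows in one group, but "=" never matches NULL.
with_equals = find_duplicates("=")
# IS treats two NULLs as equal, like <=> in MySQL.
null_safe = find_duplicates("IS")

print(with_equals)
print(null_safe)
```

With the plain equality join the duplicated rows are silently dropped, while the NULL-safe comparison returns both e1 and e2.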