Finding Duplicates with SQL
By Pete Freitag
Let's suppose you coded a email newsletter signup form, but you forgot to double check that the email address was not a duplicate, or already in the database. We can write a query to find all the emails in our table that are duplicates, or occurs in more than one row.
The following SQL query works great for finding duplicate values in a table.
SQL To Find Duplicates
SELECT email, COUNT(email) AS NumOccurrences FROM users GROUP BY email HAVING ( COUNT(email) > 1 )
group by and then having a
count greater than one, we find rows with with duplicate email addresses using the above SQL.
Find Rows that Occur Once
The opposite of our duplicates SQL query, we can use a variant of that SQL statement to find rows that occur exactly once:
SELECT email FROM users GROUP BY email HAVING ( COUNT(email) = 1 )
Original Sources: Huajun Zhai's Blog, How to remove duplicate rows from a table - Microsoft Knowledge base article (139444).
Finding Duplicates with SQL was first published on October 06, 2004.
If you like reading about sql, databases, or duplicates then you might also like:
- Cheat Sheet for SQL Server
- Use Char instead of Varchar to Store UUID's
- Sphinx - Open Source SQL Full Text Search Engine
- Updated SQL Reserved Words Checker