Finding Duplicates with SQL

Emanuel November 19, 2004

In T-SQL you can use INSERT to remove duplicates. Create a work table with a unique index that has the IGNORE_DUP_KEY option set, then copy the data from the original table into it. Drop the original table, rename the work table to the original table name, and that's it: you have a plain, clean table with no duplicated records. Something like:

    CREATE TABLE tableCleanDup (idfield int, field1 varchar(30), field2 varchar(30))
    CREATE UNIQUE INDEX removeduplicates ON tableCleanDup (field1, field2) WITH IGNORE_DUP_KEY
    INSERT tableCleanDup SELECT * FROM tableOriginal

It will send the message "duplicate key was ignored", but that is fine.

murali August 23, 2005

I want to get all the duplicate records in the table, but I don't want the original row. For example, murali is a table that has a name field; in it the name 'mani' is repeated ten times, and I want to get just nine of the 'mani' rows from the table.

Mooky Desai October 24, 2005

I have a complete table 'A' and a subset of that table called table 'B'. How do i remove the entries listed in table 'B' from table 'A'??? TIA

Frustrated October 27, 2005

I have a similar problem and none of the above answers appear to work. I have 2 databases. One is the master with all email records. The second has only "Anytown" email addresses. I need to remove all the "Anytown" email addresses that appear in the master. What is the proper SQL query to not only find the dupes but actually delete them from the Master table?
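
A minimal sketch of one way to approach the two questions above (removing from one table every row listed in another). It uses Python's built-in sqlite3 module only so the SQL can be run as-is; the table names `master` and `anytown`, the `email` column, and the sample addresses are assumptions for illustration, and it assumes both tables are reachable from one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (email TEXT)")    # hypothetical full list
conn.execute("CREATE TABLE anytown (email TEXT)")   # hypothetical subset
conn.executemany(
    "INSERT INTO master VALUES (?)",
    [("a@anytown.example",), ("b@elsewhere.example",), ("c@anytown.example",)],
)
conn.executemany(
    "INSERT INTO anytown VALUES (?)",
    [("a@anytown.example",), ("c@anytown.example",)],
)

# Delete every master row whose email also appears in the subset table.
conn.execute("DELETE FROM master WHERE email IN (SELECT email FROM anytown)")

remaining = conn.execute("SELECT email FROM master ORDER BY email").fetchall()
print(remaining)
```

Running the inner SELECT on its own first shows which rows would be removed before anything is deleted.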

Sangeetha K November 24, 2005

I want the query as soon as possible. I have a table with duplicate records, with company, first name and last name repeating. I need to put all duplicates in one table and all unique records in another.

sam November 25, 2005

I am working on a table that contains client details consisting of Name, Number and Response fields. Now I want to include a 4th field that would show the number of occurrences based on Name, Number and Response. Can someone help me with that?

PH December 13, 2005

I've got a table of purchases by clients (CustomerID, OrderID, OrderDate, Amount). The same customer might have placed more than 1 order, so might appear more than once in the table. How can I select the latest order for each customer (so, all customers would appear only once, but include the latest order date and amount)?
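
One possible answer to PH's question, sketched with Python's sqlite3 module so it can be run directly. The column names come from the post; the table name `purchases` and the sample rows are assumptions. The correlated subquery keeps, for each customer, the row whose OrderDate equals that customer's latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (CustomerID INTEGER, OrderID INTEGER, OrderDate TEXT, Amount REAL)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 100, "2005-01-10", 25.0),
        (1, 101, "2005-03-02", 40.0),
        (2, 102, "2005-02-15", 15.0),
    ],
)

# For each row, keep it only if its date is the customer's maximum date.
latest_orders = conn.execute(
    """
    SELECT p.CustomerID, p.OrderID, p.OrderDate, p.Amount
    FROM purchases p
    WHERE p.OrderDate = (SELECT MAX(p2.OrderDate)
                         FROM purchases p2
                         WHERE p2.CustomerID = p.CustomerID)
    ORDER BY p.CustomerID
    """
).fetchall()
print(latest_orders)
```

If a customer has two orders on the same latest date, both rows come back; breaking the tie on OrderID would need an extra condition.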

AJ December 16, 2005

I have a similar problem to PH (last post). I need to obtain the last of a series of entries in a table for a client. In Access I could use a GROUP BY option called LAST; there is no such feature in SQL Server. Any help?

Mark December 17, 2005

Hey there, I don't know a lot about MySQL but I tried this using your query. Regarding query:

    SELECT email, COUNT(email) AS NumOccurrences
    FROM users
    GROUP BY email
    HAVING ( COUNT(email) > 1 )

I did:

    SELECT email, COUNT(email) AS NumOccurrences
    FROM users
    GROUP BY email
    HAVING ( NumOccurrences > 1 )

I tried it and it seemed to work. Would it reduce the COUNT time because I don't need to use the COUNT() function again, or is it the same thing? Cheers, Mark adcoil.com

Nailesh December 27, 2005

To get duplicate values from a table, do this: add one more identity column to the table. For example, we have a customer table like this: customer_id [identity], name, email. Query:

    SELECT a.customer_id, a.name, a.email
    FROM customer a
    INNER JOIN customer b
        ON a.name = b.name AND a.email = b.email AND a.customer_id <> b.customer_id

I think this will work.

amir January 6, 2006

What if i wanted to delete the rows. How would I do that with PHP and MYSQL. I have been looking for an answer to this for days.

Arvind Singh January 14, 2006

I want to select distinct record from any table . Which is the best method. (According to Performance)

Brian Moore January 16, 2006

I have a table with 10 columns, and one of the columns (column 2) contains some duplicates. I want to delete the whole row if it has a duplicate value in column 2. So if I encounter a value of "3" in the second column in 3 rows in the table, my result should be just the entire row of one of those records. Is there any simple way of doing this?

Anonymous February 2, 2006

How would I select all of the information in the duplicate rows, kinda like (select * from table GROUP BY id HAVING ( COUNT(id) > 1 )). "Group BY" only allows one row to be selected instead of the whole row. Any suggestions? -thanks in advance!

david February 2, 2006

Hi I found my dups but how do I display the different columns (fields) of the duplicated records (ie:name, address, phone, etc). I tried to use "where exists" along with the count stmt but that displays all recs dave

Anonymous February 3, 2006

I have found a solution to my problem of displaying a record that has a duplicate and a count of how many exist. Here is the script:

    --select all fields literally
    select empno, ename, job, mgr, hiredate, sal, comm, deptno, count(*)
    from tempemp
    --group by all fields literally
    group by empno, ename, job, mgr, hiredate, sal, comm, deptno
    having count(*) > 1
    order by count(*) desc, sal asc;

I hope this helps someone else!

lalith February 9, 2006

I have two tables, say A (m, p, n as elements) and B (q, r, m as elements). I want to print the elements which are common to both tables in the following format:

    element name   table name
    m              A
    m              B

Thanks for your help.

chat February 17, 2006

How to find out the SQL query that was used to create a view in DB2 Tx for the help in advance.

Brandon March 21, 2006

I wanted to find all fields which were duplicated X number of times in my database. Thanks to this post, I figured it out. Thanks! In fact, I went further and found not only the dupes, but just *how* duplicated they are. This's my query:

    SELECT field3, COUNT (field3) AS count
    FROM table2
    GROUP BY field3
    HAVING ( COUNT (field3) > 1 )
    ORDER BY count DESC, field3

The results start with the entries duplicated the *most*, and continue on to the ones duplicated only twice. Anyway, yer post helped me out a lot, and next time I'm in New York, I ought to buy you a drink.

Milelr April 7, 2006

wiley, Obviously EVERYONE should RTFM. Here is what you want:

    SELECT field1, COUNT(*) FROM table GROUP BY field1;

Remember to include any fields which you are SELECTing other than functions (in this case COUNT(*)) in the GROUP BY clause. That will take care of your "not a single group function" error.

Pvedi April 12, 2006

Select * from table1 Where KEY_ID IN (Select MIN(KEY_ID) FROM table1 Group by REPEATED FIELDS Having count(REPEATED FIELDS) > 1)

prashant April 17, 2006

How to find out column name of the table by passing only field number to the procedure.

Miller April 17, 2006

Yo Man: Here is a query that displays all records in a table where two selected field values are the same in Oracle syntax. SELECT * FROM table WHERE same_value_field1 = same_value_field2; I hope that is what you are looking for.

pdelaurentis April 18, 2006

Here's a case for a wiki I'm building... There are topics, and each topic has multiple revisions for different languages. I want to first show the version in the target language... and if there isn't one, fall back on English (this way, the wiki starts out filled in for the user of any language). It's like taking the following two queries and merging them together so that I have distinct topic_id's and don't miss any topics... and always making sure the current language wins.

    SELECT topic_id, title FROM revisions WHERE topic_id=:parent_id AND language_id=:current_language

    SELECT topic_id, title FROM revisions WHERE topic_id=:parent_id AND language_id=1   # in this example, language 1 is English

Any ideas? Performance is important since there could be a large # of topics. Thanks, Pete
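
A hedged sketch of one way to merge Pete's two queries, run through Python's sqlite3 module. The `revisions` columns and the `:current_language` parameter come from the post; the sample rows are made up, and the `topic_id=:parent_id` filter is left out here so that more than one topic shows in the result. A row is kept if it is in the current language, or if it is English and the same topic has no revision in the current language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (topic_id INTEGER, language_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?, ?)",
    [
        (1, 1, "Home"),      # English (language 1, as in the post)
        (1, 2, "Accueil"),   # hypothetical second language
        (2, 1, "Help"),      # topic 2 exists only in English
    ],
)

fallback_sql = """
    SELECT r.topic_id, r.title
    FROM revisions r
    WHERE r.language_id = :current_language
       OR (r.language_id = 1
           AND NOT EXISTS (SELECT 1
                           FROM revisions r2
                           WHERE r2.topic_id = r.topic_id
                             AND r2.language_id = :current_language))
    ORDER BY r.topic_id
"""

titles = conn.execute(fallback_sql, {"current_language": 2}).fetchall()
print(titles)
```

With an index on (topic_id, language_id) the NOT EXISTS probe should stay cheap, though that depends on the engine and is worth measuring.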

Walt May 11, 2006

Worked for me, except I needed to check a uniqueidentifier field and SQL will not allow a count on that data type.

Vasily May 15, 2006

Hello, could I do this?

    SELECT email, COUNT(email) AS NumOccurrences
    FROM users
    GROUP BY email
    ORDER BY created_date desc
    HAVING ( COUNT(email) > 1 )

Will it work with ORDER BY? I need to start from the latest record.

jj May 23, 2006

Great Solution. I was looking for this, just typed it in google and saw your page. Thanks.

Sujit June 8, 2006

How would I select all the information in the duplicate rows, like (select * from table GROUP BY code HAVING ( COUNT(code) > 1 ))? "Group BY" only allows one row to be selected instead of the whole row. Any suggestions? Sorry, this is similar to the question asked earlier. Thanks in advance!

Ranjeet Kumar Bhatia June 23, 2006

count how many records in two tables and return sum of both tables

bogdan July 10, 2006

how can this work if i want to group the emails after the subject and to return their email id?

Vijay August 18, 2006

I have a table with dups and I wish to select the distinct records and insert them into another table based on a 3-column condition. I need to select a few columns based on the 3-column condition. This may be a repeat question but I am not sure of the answer. Can anyone help? Cheers, Vijay

Jasmine October 12, 2006

I need to add something to my query that allows me to check for duplicates that are similar, not identical, so that I know how many records are similar (within the other parameters of the query). For example, the first 3-5 characters are the same: Smith, J.; Smith, John; and Smith, John B. I also need to still pull any number of characters after the specified ones. I am new to SQL so I may not be seeing something obvious. Any help?

Stan Daymond October 25, 2006

With all the talk going on here I don't see anybody providing a good generic answer to the question. If you have a key based on multiple columns, the correct statement is:

    SELECT ColKey1, ColKey2, ... , COUNT(*)
    FROM TableName
    GROUP BY ColKey1, ColKey2, ...
    HAVING COUNT(*) > 1

This will return rows having duplicates. Regards, Stan Daymond, London, UK

arun January 9, 2007

How to find a duplicate records of a table where recordno and course_code in a table

sikruti February 26, 2007

Hi, how can I filter out the duplicate values present in my table depending not just on 1 column name but on multiple column names? My table has some 7 column names and I want to search and find the duplicate rows depending on a combination of all the column names.

Avijt Pramanik March 19, 2007

The Newsletter table contains two fields: 1. Name 2. Email. We have to find the common emails along with all their field values. So this should be a solution:

    select n.* from newsletter n
    where (select count(email) from newsletter where email = n.email) > 1

Adam April 10, 2007

You can also use 'select distinct' as shown below: SELECT distinct email FROM users

Rajini May 6, 2007

I was told to work on a piece of logic which is used to find dup values from 2 tables. Actually, this is happening through a package. But my TL asked me to test whether the particular logic is working or not. Once the testing is done, the package would go for deployment. I can find the dup values in a table using simple SQL. But I don't know how to use the same concept here to test the logic. The logic is simply 3 select statements using the word EXISTS. Can anyone help me? Let me know if you need the logic to be posted.

Srikanth May 14, 2007

Hi, please help me with this. I have a table with 4 columns: A, B, C, D. The table has maybe around 5 rows, of which two rows have exactly the same values in all 4 columns. What is the SQL with which I can pick up the duplicate row? Thx, Srikanth

Eric August 9, 2007

I have a table with the columns: IFILN, IBOOKN, IDTBOK, inmate_id, visit_no. inmate_id is a unique identifier attached to each person, and I need to be able to count the number of times that person appears and put that count into visit_no. Example:

    inmate_id  visit_no
    A          1
    B          1
    A          2
    C          1
    A          3
    B          2

so that it shows this was the 1st visit, this was the 2nd, etc. Can I use a form of this? If so/not, any ideas on how to implement it?

Daniel August 22, 2007

I have the following columns: A, B, C. The data for columns A and B is duplicated, and I need to remove the duplicated records, but before I remove a record I have to check column 'B', which has a condition: if the column 'B' data is 0, I have to remove all the other data which is duplicated; otherwise I have to keep the records. That means the first priority is to get the 0. Example:

    A  B  C
    1  3  0
    1  3  2

I need to have the result:

    A  B  C
    1  3  0

Any ideas, please? Thanks, Daniel

Shrey August 23, 2007

Hi All: I have a similar issue -- let's say I have two tables, A and B, with two columns in each. Table A is a table where users upload data, and B is a final table. A has duplicate data, as well as updated records. I've been trying to figure out a SQL query which would go through table A, find which entries are not in table B and then copy them over. Further, it should check to see if any records have been updated in table A and replace the data in table B with the new data. Any ideas? Example:

    A (temp table)
    1  1
    2  22
    3  3

    B (final table, before addition of A)
    1  1
    2  2
    4  4

    B (final table, after addition of A)
    1  1
    2  22
    3  3
    4  4

Ana August 23, 2007

delete from table That should do the trick...

Sinoy Xavier October 11, 2007

If two rows are duplicates, there is an easy way to find them. Suppose trading_id and trading_name are the columns of the marketer table which contain duplicates; then:

    SELECT trading_id, trading_name
    FROM schema.marketer_tbl a
    WHERE ROWID > (SELECT MIN (ROWID)
                   FROM schema.marketer_tbl b
                   WHERE b.trading_id = a.trading_id
                     AND b.trading_name = a.trading_name);

And if we put DISTINCT after the first SELECT, we'll get exactly the column values that are repeating. Thanks, Sinoy Xavier Infosys, Bangalore

Bobby November 8, 2007

I have a table with a column of team names that are duplicated. I need to fill a combo box with these names but not the duplicates. How can I do this?

Lakshmi November 14, 2007

I have two tables, A and B. Fields in A include userid, name, and fields in B include userid, groupname. I want to select the users who belong to more than one group.

Mike November 26, 2007

Hi, I am looking to display all duplicate records in my table, but on two fields. I tried this but it's not working. Could anyone tell me what's wrong with it?

    SELECT NAME, Address1
    COUNT(NAME) AS NumOccurName,
    COUNT(Address1) AS NumOccurAddress1,
    FROM general_table
    GROUP BY Address1
    HAVING ( COUNT(NAME) > 1 ) AND ( COUNT(Address1) > 1)

suraj December 4, 2007

Thanks guys, Great solution

Akila January 8, 2008

Hi, I need to filter the records based on the unique combination of 3 fields. E.g., in the source:

    fld1  fld2  fld3
    a     1     1
    a     1     1     (dup record)
    a     2     1
    x     x     1
    x     x     1     (dup record)

I need to filter out the duplicates, so my output should be:

    fld1  fld2  fld3
    a     1     1
    a     2     1
    x     x     1

So I need a query to get this output; I need to get the first occurrence of each unique record. Please help me. Thanks in advance.

ron January 17, 2008

Hi, I have a problem similar to this. I have duplicated rows in my db and I need to select the non-duplicates (*) and only 1 from the duplicates. I have been using this:

    SELECT transfer_id, date, COUNT(transfer_id) AS NumOccurrences
    FROM my_table
    GROUP BY date, transfer_id
    HAVING ( COUNT(transfer_id) > 1 )
    ORDER BY date ASC, transfer_id ASC

but it returns only the duplicates. Does anybody have a solution to get both singles and duplicates ONLY ONCE?

serj January 30, 2008

The statement above works for me, but how do I merge the rows that it found and use the SUM of the values in the other two cells? Thank you. Serj

Kannan P February 5, 2008

thank you so much...

Gajendra October 30, 2008

I have a column called status in table A with values like a, b, c, d. I need to take the status according to priority (a-2, b-3, c-1, d-4). E.g.:

    Emp  status
    1    a
    1    b
    1    c
    2    b
    2    c
    3    d
    3    b

For the above table the output should be:

    Emp  status
    1    c
    2    c
    3    b

Please help me get the above output. Thanks in advance.

KN January 28, 2009

I need some help. I am developing a training calendar. In the table I have the training title and the date of the training (from Jan to Dec). I have a problem when I want to view all the trainings. For example, in the table I have 3 rows: 1. training abc with date January, 2. training def with date March, and 3. training abc with date April. So here I have the same training (training abc) but with different dates. I want to grab all the data but view it like this:

    training abc   jan, april
    training def   march

Any idea how to do that? I don't want to repeat the same training; I just want to view it as one training that has different dates. TQ so much

Terry Pearson March 6, 2009

@Ed, thanks for your comment. That was exactly what I needed...

"Hi Padam. Retrieve all columns from duplicate records like this:

    SELECT * from tbl
    where tbl.col in ( SELECT tbl.col FROM tbl GROUP BY tbl.col HAVING ( COUNT(*) > 1 ))
    order by tbl.col

Ed"
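
Several posts above ask for the whole row of each duplicate rather than one row per group, which is what the query quoted from Ed does. Here is a small runnable sketch of that same pattern using Python's sqlite3 module; the `users` table, its columns, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "Ann", "a@x.example"),
        (2, "Bob", "b@x.example"),
        (3, "Ann B", "a@x.example"),  # same email as row 1
    ],
)

# The inner query finds the duplicated emails; the outer query
# returns every full row that carries one of them.
duplicate_rows = conn.execute(
    """
    SELECT * FROM users
    WHERE email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1)
    ORDER BY email, id
    """
).fetchall()
print(duplicate_rows)
```

The name column differs between the two rows, which is why a plain GROUP BY on every column would not have reported them.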

Suriya April 12, 2009

Hi, what will happen if the data is like below? In the structure below, INDEX 1 and 2 are duplicates. In this case I need to identify one of the indexes as a duplicate.

    NAME  NAME_1  VALUE  DATE        INDEX
    SURI  SE      275    13/12/2005  1
    SURI  SE      375                1
    SURI  SE      475                1
    SURI  SE      275    13/12/2005  2
    SURI  SE      375                2
    SURI  SE      475                2

Vignesh September 16, 2009

I have two tables, A and B. Table A has Last Name, First Name, Country, New Table. Table B has Last Name, First Name, Country, Old Table. I want to remove the duplicates of these 2 tables and have my result as a new table C which has no duplicates, i.e. C = A - B. Please help me.

Anon October 1, 2009

Vignesh, try this:

    create table table c as
    select last name, first name, country, new table from table a
    union
    select last name, first name, country, old table from table b;

This will give you all the unique records in a and b.

Anon October 14, 2009

Frustrated, you could do the opposite of what I wrote for Vignesh, which is:

    create table table c as
    select last name, first name, country, new table from table a
    intersect
    select last name, first name, country, old table from table b;

This should give you only the commonalities of a and b.

Lui November 17, 2009

OK, there are a lot of excellent solutions here for finding duplicates, and they are all great. I am new to SQL. Could someone please suggest one of the most commonly used solutions (I know there are many) for finding duplicate records across two tables? The results should show the duplicates. Thx

je December 4, 2009

Is there a way to list just the set of duplicates from a table? The solutions above list the records only once. I have a table where the duplicate is based on 5 columns but the remaining column may be different. So I want to query and bring back only the dupes. I don't want a count.

Richard January 25, 2010

Here's a good question: I have a table with duplicate records, but the duplicate records are based on all fields with the exception of the key field. All records have an ID which is the pk, so technically the records are unique, but I need to delete duplicate records based on the other fields. Example:

    ID Name Number City
    1 John Doe Nashville
    2 John Doe Nashville

I want to keep one of them and remove the other. Each has a unique pk, so selecting which records to delete is difficult. It doesn't matter which one is deleted, as long as only one remains. Any help would be greatly appreciated!! Thank you so much!
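
A sketch of one common approach to Richard's question: keep the lowest ID in each group of otherwise identical rows and delete the rest. It runs on SQLite through Python's sqlite3 module. The table name `people` and the sample rows are assumptions, and only Name and City are used for grouping because the example row in the post does not make the Number value clear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "John Doe", "Nashville"),
        (2, "John Doe", "Nashville"),  # duplicate of ID 1 apart from the key
        (3, "Jane Roe", "Memphis"),
    ],
)

# MIN(ID) per group is the survivor; every other ID in the group is deleted.
conn.execute(
    """
    DELETE FROM people
    WHERE ID NOT IN (SELECT MIN(ID) FROM people GROUP BY Name, City)
    """
)

survivors = conn.execute("SELECT ID, Name, City FROM people ORDER BY ID").fetchall()
print(survivors)
```

Some engines (MySQL, for one) reject a subquery that reads the table being deleted from, so there the subquery usually has to be wrapped in a derived table or staged in a temporary table first.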

hanuman February 10, 2010

I need to get the distinct name of each person, but in my data table the same person is stored like this:

    persionID  FullName
    01245      Donkey kong
    01245      Donkey kongKing

Both are the same person's name. I cannot get the distinct name out of it because the name is stored differently, even though I use DISTINCT on the person ID. And joining with other data tables gives me even worse results. I also don't see any condition which I can apply for this selection. Help

Ansari Mohammad Qayamuddin. H February 22, 2010

You can use. SELECT * FROM [your_table_name] ORDER BY [your_table_name].[date] ASC

kiran February 27, 2010

I have two fields in a table which I want to combine when retrieving and display as one output. For example, empno and empname: when retrieved, I want them to come back as a single field. Can anyone provide me with a SQL query for this?

Francois April 14, 2010

Hey, @Simon No, a simple select statement will not do. The example that I have given only has three IDs, but might have millions in real life. I need a generic solution that will ONLY list IDs which have records that ONLY contain a status of 'n'. Any ID that has even one record with a different status MUST be excluded. The way that I have done this is to use a cursor, first summing the number of entries for each ID, then secondly summing the number of entries for each ID that has a status of 'n', and then comparing the two results. If an ID has a "total record" count that equals the "'n' status record" count of x, it is included in the end result set. If the "'n' status" count is less than the "total" count, it is obvious that there are other statuses involved, and the ID is excluded from the result set. The above works fine, but I am sure there must be a more elegant way of doing this. I just don't know how! Kind regards Francois
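
The two counts Francois compares with a cursor can be computed in a single grouped query, which may be the more elegant form he is after. A minimal sketch using Python's sqlite3 module; the table name `records`, the column names `id` and `status`, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "n"), (1, "n"), (2, "n"), (2, "y"), (3, "y")],
)

# COUNT(*) is the total per id; the SUM counts only the 'n' rows.
# An id qualifies when the two numbers are equal.
only_n_ids = conn.execute(
    """
    SELECT id
    FROM records
    GROUP BY id
    HAVING COUNT(*) = SUM(CASE WHEN status = 'n' THEN 1 ELSE 0 END)
    ORDER BY id
    """
).fetchall()
print(only_n_ids)
```

This is the same comparison as the cursor version, just done per group by the database in one pass.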

Trev May 4, 2010

Here's one that I'm scratching my head on. I have a table with only two columns, [sales contact] and [sent to customer date]. The sent to customer date field is not a true date field; it's an nchar(30) in the format 01-jan-10. I'm trying to distinctly count the number of entries by date by sales contact. This query works but only returns 1 column with the month I specify; I would like to iterate through all months:

    select distinct [sales contact], COUNT([sent to customer date]) as 'jan'
    from quotes
    where [sent to customer date] like '%feb-10%'
    group by [Sales Contact]

and of course changing it to:

    select distinct [sales contact], COUNT([sent to customer date]) as 'jan', COUNT([sent to customer date]) as 'feb'
    from quotes
    where [sent to customer date] like '%feb-10%'
    group by [Sales Contact]

adds the second column as feb, but fills it with the results from jan.

Trev May 4, 2010

Correction on my 'like' statement in the first query that I said worked above: it should have been '%jan-10%'. But again, it still only returns the count for the specified month.
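
One way to get a column per month for Trev's case is conditional aggregation: drop the WHERE filter and let each column count only the rows that match its own month. The sketch below runs on SQLite through Python's sqlite3 module; the table and column names come from the post, the sample rows are assumptions, and only two months are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes ([sales contact] TEXT, [sent to customer date] TEXT)")
conn.executemany(
    "INSERT INTO quotes VALUES (?, ?)",
    [
        ("Ann", "05-jan-10"),
        ("Ann", "17-jan-10"),
        ("Ann", "02-feb-10"),
        ("Bob", "09-feb-10"),
    ],
)

# Each SUM adds 1 only for rows whose text date matches that month.
per_month = conn.execute(
    """
    SELECT [sales contact],
           SUM(CASE WHEN [sent to customer date] LIKE '%jan-10%' THEN 1 ELSE 0 END) AS jan,
           SUM(CASE WHEN [sent to customer date] LIKE '%feb-10%' THEN 1 ELSE 0 END) AS feb
    FROM quotes
    GROUP BY [sales contact]
    ORDER BY [sales contact]
    """
).fetchall()
print(per_month)
```

The remaining months follow the same pattern, one SUM(CASE ...) column each.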

jyothi June 6, 2010

I have a table like the one below: Formid, FormDate, company, profile. FormDate contains the dates on which the records were created. It contains dates from 2008 to 2010, and each formid is unique. I have created a web page with a From Date and To Date selection list. If I select a particular From Date and To Date and click the submit button, it needs to show the form ids of the records within those dates. And when I click on a particular form id, it needs to show the record for that form id.

markw June 24, 2010

Hi, just what I was looking for thanks. Should I join the crowd with a comment on the lines of "I want an SQL statement that cleans my teeth, puts my babies to bed, and goes to the supermarket for me"? I think not. ;)

domenic July 7, 2010

I am a bit confused. I have a table in a database with phone numbers, each phone number assigned to a specific account. Sometimes a phone number gets assigned to more than one account which is a problem. I need to list all DUPLICATE entries in the phone number table. How do I do this?

aa33030 August 21, 2010

SELECT email, COUNT(email) AS NumOccurrences FROM users GROUP BY email HAVING ( COUNT(email) > 1 ) =========== Where do I go to do all of this?

Daruru October 6, 2010

Hi, I would like to learn DATABASE concepts. If anyone could provide good URLs for DATABASE concepts, I would be grateful.

Santu November 7, 2010

Hi all, I have been trying to write a Sybase query to check for duplicate records and to execute conditions when the number of duplicate records is more than 5, and also when the number of duplicate records is less than 5. So I have the query below; please provide a review and feedback on it:

    IF (select A,B,C from TEMP_TABLE group by A,B,C having count(*) > 1 ) > 5
        Print "Error"
    else ((select A,B,C from TEMP_TABLE group by A,B,C having count(*) > 1) <5
        INSERT INTO TABLE_1 (A,B,C) SELECT A,B,C FROM TEMP_TABLE )

ooliki January 10, 2011

Hi, I am having a problem with duplicated records when I query two tables with an INNER JOIN clause (SELECT * FROM registration c INNER JOIN products p where c.id = p.validcode && p.productcategory = 'Others'). Can anybody please assist?

ela January 19, 2011

Assume that one column has 10 values, and some are duplicates. How do I find which value has the maximum number of duplicates in SQL?

hazari February 3, 2011

Thank you so much, but I have another question. I have a table with the columns (name, type (dedicated, shared), startdate, enddate). I want to detect duplication when a new entry is inserted: if a name, type (dedicated) and from-date are entered and that name is already booked for a range that includes the date, then it should show a message that the name can't be appointed as it is already appointed on that day.

Marcus February 3, 2011

I have a similar problem to Trev's but more complex. I have one table with sales data spanning multiple years. This table includes all buying customers (CustNo) with sales month (Date) and shop area (ShopNo). What I need is a result displaying a unique count of customers, starting in January and for each month adding the customers that haven't bought yet in this year (firstbuyers). This will be calculated for each area too. Example: In January 500 customers bought articles; in February it was 530, of which 40 were there for the first time this year; in March 490 customers bought articles, of which 25 were there for the first time this year. Smaller figures will occur for individual areas. Example: In one sub area 350 customers bought something in January; 360 in February (15 firstbuyers) and 340 in March (20 firstbuyers). My increasing counting table would then look like this:

    Year:Month:Area:Count:
    2010:January:AllAreas:500
    2010:February:AllAreas:540
    2010:March:AllAreas:565
    2010:January:SubArea:350
    2010:February:SubArea:365
    2010:March:SubArea:385

How do I achieve that without creating multiple temp tables for each month and merging them manually, grouping and counting customers after each merger? Thanks for your help!

Mohsin March 1, 2011

I need some help. I have two tables: table A (one column) with 200k records and table B (one column) with 600k records. I am trying to find the duplicate records (i.e. records that are in both tables), and I also need the records that exist in table A and are not in table B.
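
Both halves of Mohsin's question map onto set operators, sketched here with Python's sqlite3 module. The table names `a` and `b`, the column name `val`, and the sample values are assumptions. INTERSECT returns the values present in both tables; EXCEPT returns the values in `a` that are missing from `b` (Oracle spells the latter MINUS).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (val INTEGER)")
conn.execute("CREATE TABLE b (val INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])

# Values that appear in both tables.
in_both = conn.execute(
    "SELECT val FROM a INTERSECT SELECT val FROM b ORDER BY val"
).fetchall()

# Values that appear in a but not in b.
only_in_a = conn.execute(
    "SELECT val FROM a EXCEPT SELECT val FROM b ORDER BY val"
).fetchall()

print(in_both, only_in_a)
```

Both operators also collapse repeated values within a table, so each value is reported once.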

vinod May 5, 2011

How can I retrieve data from the database? I have a table in my database, but it has no primary key, and I want to retrieve 10 rows from the table after row 30.

HMIles May 20, 2011

I want to insert rows that exist in table A but do not exist in table B, using multiple fields from the table. Example:

    insert into table b (a,b,c,d,e,f) values (1,2,3,4,5,6)
    Where a,b,c,d,e,f Not in (select a,b,c,d,e,f from table b)

I can't seem to make it work. Is this possible?
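
A sketch of one way to do what HMIles describes: use INSERT ... SELECT with NOT EXISTS instead of VALUES with NOT IN, since a VALUES clause cannot take a WHERE condition. It runs on SQLite through Python's sqlite3 module; the table names `table_a` and `table_b` and the sample rows are assumptions, and only three of the six columns are shown to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("CREATE TABLE table_b (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?)", [(1, 2, 3)])

# Copy a source row only when no row in table_b matches it on every column.
conn.execute(
    """
    INSERT INTO table_b (a, b, c)
    SELECT s.a, s.b, s.c
    FROM table_a s
    WHERE NOT EXISTS (SELECT 1
                      FROM table_b t
                      WHERE t.a = s.a AND t.b = s.b AND t.c = s.c)
    """
)

rows_in_b = conn.execute("SELECT a, b, c FROM table_b ORDER BY a").fetchall()
print(rows_in_b)
```

Note that equality comparisons never match NULL, so rows containing NULLs in the compared columns would be inserted again; that case needs explicit IS NULL handling.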
