MySQL- Quickly find duplicate records based on any field

Recently I was working on a project where the store gets data files from a vendor. XLS files! Just horrid. After going through the hoops for XLS-to-MySQL it was time to churn the records into the store’s actual database tables for xCart. It turns out there were duplicate SKUs in the vendor’s data files. So that had to be remedied, by giving my client a XLS of the 700+ duplicated records for evaluation.

The following will find all skus that are used more than once. You only get a single record for each sku.

SELECT sku
FROM 2009_product_catalog
GROUP BY sku
HAVING count(sku) > 1

Example output:

10150	PSE Hunter Hip Quiver
16294	CAP No Snow Water Repellent
16701	Vista Rustler Quiver
16714	Allen Broadhead Hip Quiver

You can then use those results in a WHERE IN clause to get all the records based on duplicated skus:

SELECT *
FROM 2009_product_catalog
WHERE sku IN (
	SELECT sku
	FROM 2009_product_catalog
	GROUP BY sku
	HAVING count(sku) > 1
	)
ORDER BY sku

Example output:

10150	PSE Hunter Hip Quiver
10150	PSE Hunter Hip Quiver
16294	CAP No Snow Water Repellent
16294	CAP No Snow Water Repellent
16701	Vista Rustler Quiver
16701	Vista Rustler Quiver
16714	Allen Broadhead Hip Quiver
16714	Allen Broadhead Hip Quiver

2 Replies to “MySQL- Quickly find duplicate records based on any field”

  1. This is EXACTLY what I was looking for, and it was not easy to find. There are lots of resources everywhere on “how to remove duplicate entries”, but not many on this subject. This has helped me A LOT today. I was wondering how to do this, and now I find I start my working day in a great mood. Thanks A LOT.

Comments are closed.