Tuesday, March 27, 2012

Duplicate rows

I am trying to clean up a set of a table where there are some duplicate data
with different ids. What's the easiest way to eliminate the duplicate rows
,
for there are several hundred rows and only about 400 rows are unique. Than
ks
--
bicOne way would be to create a new table from the original one, minus the
duplicate rows:
SELECT MIN(id) AS id, Column1, Column2, Column3
INTO [New Table]
FROM [Old Table]
GROUP BY Column1, Column2, Column3
Then delete the duplicate rows in the original table:
/*Untested, make a backup */
DELETE FROM [Old Table]
WHERE EXISTS
(SELECT 1 FROM [New Table]
WHERE [Old Table].Column1 = [New Table].Column1
AND [Old Table].Column2 = [New Table].Column2
AND [Old Table].Column2 = [New Table].Column2
AND [Old Table].id > [New Table].id )
"bic" wrote:

> I am trying to clean up a set of a table where there are some duplicate da
ta
> with different ids. What's the easiest way to eliminate the duplicate ro
ws,
> for there are several hundred rows and only about 400 rows are unique. Th
anks
> --
> bic|||Something like the following should work, but check the code before you
run it:
-- Create a temporary table
CREATE TABLE #TMPTABLE (col1 datatype(size), col1 datatype(size), etc.)
-- Insert rows in to a temporary table by grouping duplicate records
together or use DISTINCT, thus eliminating the duplicate records
INSERT INTO #TMPTABLE
--SELECT DISTINCT * FROM [source-table]
--SELECT * FROM [source-table] GROUP BY col1, col2, col3, col4, etc.
-- Delete all the records from the source table
DELETE FROM [source-table]
-- Insert the records from the temporary table in to the source table
INSERT INTO [source-table]
SELECT * FROM #TMPTABLE
-- Drop the temporary table
DROP TABLE #TMPTABLE
Something like that should be fine. Sorry if my code isn't up to spec.
- Curt Morrison|||It worked. Thanks for the help, Mark.
--
bic
"Mark Williams" wrote:
> One way would be to create a new table from the original one, minus the
> duplicate rows:
> SELECT MIN(id) AS id, Column1, Column2, Column3
> INTO [New Table]
> FROM [Old Table]
> GROUP BY Column1, Column2, Column3
> Then delete the duplicate rows in the original table:
> /*Untested, make a backup */
> DELETE FROM [Old Table]
> WHERE EXISTS
> (SELECT 1 FROM [New Table]
> WHERE [Old Table].Column1 = [New Table].Column1
> AND [Old Table].Column2 = [New Table].Column2
> AND [Old Table].Column2 = [New Table].Column2
> AND [Old Table].id > [New Table].id )
> --
> "bic" wrote:
>

No comments:

Post a Comment