I am having a headache working on a huge database which contains 120000 records. the big problem is that the database contains duplicates of the "serial no" of products. Another important field is the "date_purchased". How do I remove the duplicates and retain the serial no with the latest date purchased? I have managed to remove duplicates but the program does not remove on the basis of the date_purchased.Hi,
i found some info's about duplicates on planet-source-code.com as i had same problems:
see attached file (too big to quote here)|||How about doing it in 2 stages ie:-
SELECT DISTINCT SerialNo,Max(DatePurchased) From tbl GROUP BY SerialNo
Then inner Join it to
SELECT * FROM tbl
U could create 1 Statement as a Permanent View or Do it all in a stored procedure & #tmp table
Should work -
U'l prob find it can all be done in one Statement - Some1 Post & I'll C which way is faster
GW|||Or a refinement on GWilley's code
SELECT DISTINCT SerialNo, Max (DISTINCT DatePurchased)
From tbl??
Group By SerialNo
I ran this on something similar on my db (table with over 700,000 records) and it seemed to work correctly.|||you do not need DISTINCT when you use GROUP BY -- groups are distinct by definition
save the serial numbers and max dates in a working table, then delete all the rows that don't have a match in the working table
select serial, max(date_purchased) as maxdate
into keepthese
from yourtable
group by serial
delete from yourtable
where not exists
( select 1
from keepthese
where serialno = yourtable.serial
and maxdate = yourtable.date_purchased )
rudy
http://r937.com/
No comments:
Post a Comment