Tuesday, March 27, 2012

Duplicate results in 2 columns but reversed

I have to write a query which extracts everyone from a table who has the same surname and forenames as someone else but different id's.

The query should have a surname column, a forenames column, and two id columns (from the person column of the table).

I need to avoid duplicates i.e. the first table id should only be returned in the first id column and not in the second - which is what i am getting at the mo.

This is what i have done

select first.surname, first.forenames, first.person, second.person
from shared.people first, shared.people second
where first.surname= second.surname
and first.forenames = second.forenames
and not first.person = second.person
order by first.surname, first.forenames

and i get results like this

Porter Sarah Victoria 9518823 9869770
Porter Sarah Victoria 9869770 9518823 - i.e. duplicates

cheerswhat about:

select
first.surname,
first.forenames,
min(first.person),
min(second.person)
from
shared.people first, shared.people second
where
first.surname= second.surname
and first.forenames = second.forenames
and not first.person = second.person
group by
first.surname,
first.forenames
order by
first.surname,
first.forenames

I'm not positive that this would work if there were more than one duplicate entry.

regards,

hmscott|||Thanks for the quick reply,

Yes, it seems to work if the first column is called min(first.person) and the second is called max(second.person) but this would fail if there were more than one duplicate.

Any ideas?

Cheers :)|||This will be a pig, so I hope you only need to run this once...

select
first.surname,
first.forenames,
min(first.person),
min(second.person)
from
shared.people first, shared.people second
where
first.surname= second.surname
and first.forenames = second.forenames
and first.person < second.person
group by
first.surname,
first.forenames
order by
first.surname,
first.forenames|||I hope you are talking about something like this:

drop table #tmp
create table #tmp(id int,fname varchar(10),lname varchar(10))
insert #tmp values(1,'a','b')
insert #tmp values(2,'b','b')
insert #tmp values(3,'a','b')
insert #tmp values(4,'f','b')
insert #tmp values(5,'b','b')
insert #tmp values(6,'b','b')
insert #tmp values(7,'a','b')

select distinct t.fname,t.lname,t.id,t2.id
from #tmp t
join #tmp t2 on t2.fname=t.fname and t2.lname=t.lname and t2.id<>t.id
where t.fname+t.lname in(
select fname+lname
from #tmp
group by fname+lname
having count(*)>1)
order by 1,2,3,4

--- OR

select fname,lname,id
from #tmp
where fname+lname in(
select fname+lname
from #tmp
group by fname+lname
having count(*)>1 )
order by 1,2,3|||cheers all - thanks for the quick responses!

:D

No comments:

Post a Comment