Thursday, March 22, 2012

Duplicate Dimension Members

Hi everyone,

In AS 2000, some of the dimensions I was using had both the key & the name of the dimension as the same text field (dimTable.name). This grouped duplicate dimension members together. Browsing the dimension only showed unique members.

In SSAS, my understanding is you need a key attribute that links to the fact table, or another reference dimension. Since the dimension I was using had an intermediate key on another dimension table, I created a key in the dimension, and used a reference dimension to link indirectly to the fact table to another dimension.

Now I'm getting duplicate dimension members, since the keys are unique in the dimension table. Is there anyway I can consolidate these?

I'll try and diagram for a better explanation.

status dimension table --> dimension table --> fact table

status dimension table has key, name, foreignkey fields that links to a dimension. There can be many statuses for the dimension, based on the foreign key. There are duplicate statuses.

Any help is much appreciated!

...perhaps it's easier to understand if you post some sample data for your scenario... I get a little confused what's duplicate and what's not... Maybe you're talking about a n:m dimension...|||

It seems to be happening not just with the dimension with 2 reference tables, but a standard 1-M dimension-fact relationship.

1 example.

DIM table.

STAT_ID - STAT_NAME

1 Andrew
2 Thomas
3 Thomas
4 Andrew
5 Andrew

Fact Table

STAT_ID
1
1
3
4

AS 2000 Dimension

Key - STAT_NAME
Name - STAT_NAME

Relationship in Cube - STAT_ID - STAT_ID

The result is 1 name in dimensino but keys are related.

AS 2005 Dimension
Key - Stat_ID
Name - Stat Name

A new attribute is created for Stat_ID
This attribute is used in cube Dimensions tab.
The tables are linked in DSV to STAT_ID

The result is duplicate names, due to different keys.

One suggestion is to create a Named query & join on the names in the DSV. I don't want to do that for every dimension we have KEY & Name with the same field in AS 2000.

If I change the Key in the dimension back to STAT_NAME, I can no longer link the granularity settings in the cube.

|||

Andrew,

what's the reason for the key in the dimension table not being unique? Perhaps you should setup some ETL to make this "clean"...

|||

I was thinking about doing that, however I didn't want to change the 2000 data model. Can I do that in the DSV, by just creating a derived query on the name column?

The reason they're not unique is because the data isn't totally denormalized. The actual table looks more like...

key -- status name -- child name

where status name is not unique, but the child is.

So do you think we should split group name off into it's own table, or is there a way to set a property on the dimension so that it will ignore duplicates? It seemed to work in AS 2000.

thanks for all your help,

Andrew

No comments:

Post a Comment