
Dimensional Modeling - Slowly Changing Dimensions
About
In data warehousing, tracking changes in dimensions is referred to as slowly changing dimensions.
In the source system, many changes are made every day:

new customers are added,
addresses are modified,
new regional hierarchies are implemented,
or simply the product descriptions and packaging change.

These sorts of changes need to be reflected in the dimension tables and, in several cases, the
history of the changes also needs to be tracked.
By remembering history, we are then able to look at historical data and compare it to the current
situation.


Techniques
Type 1 - Overwrite Original Value
The new value overwrites the original value. The change does not require tracking, so no history
is kept.
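
As a minimal sketch of the overwrite, assuming the dimension is held as a Python dictionary keyed by a
natural key (the function name apply_type1 and the customer columns are hypothetical, chosen only for
illustration):

```python
def apply_type1(dimension, natural_key, changes):
    """Overwrite the changed attributes in place; the old values are lost."""
    row = dimension[natural_key]
    row.update(changes)
    return row


# Hypothetical customer dimension with a single row.
customers = {"C001": {"name": "Ann Lee", "city": "Boston"}}

# The customer moves: the old city is simply replaced.
apply_type1(customers, "C001", {"city": "Denver"})
```

After the call the dimension still holds one row per customer, and nothing records that the city
was ever different.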

Type 2 - Add a new record


With a Type 2 SCD, a new version of the dimension record is created, and the existing version is
marked as history.
Each row therefore corresponds not to a different instance of an entity but to a different state:
a snapshot of the instance at a point in time.

To accommodate this, extra metadata is required for the dimension table, including an effective date
column and an expiration date column. These columns are used to differentiate a current version
from a historical version as follows:

Effective date column stores the effective date of the version, also known as start date.
Expiration date column stores the expiration date of the version, also known as end date.
Expiration date value of the current version is always set to NULL or a default date value.

The user must identify the columns whose history will be tracked (by creating a new version)
whenever their values are changed. These columns are known as trigger columns.
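
The versioning described above can be sketched as follows. This is an illustration under stated
assumptions, not a reference implementation: the dimension is a list of dictionaries, the current
version carries an expiration date of None (standing in for NULL), and the names apply_type2,
TRIGGER_COLUMNS and surrogate_key are hypothetical. The sketch also assumes that a change limited
to non-trigger columns is overwritten in place, which the text above does not specify.

```python
from datetime import date

# Assumption: only a change to "city" creates a new version.
TRIGGER_COLUMNS = {"city"}


def apply_type2(rows, customer_id, new_values, change_date):
    """Expire the current version and append a new one when a trigger column changes."""
    # The current version is the row whose expiration date is still None (NULL).
    current = next(
        r for r in rows
        if r["customer_id"] == customer_id and r["expiration_date"] is None
    )
    changed = any(
        col in new_values and new_values[col] != current[col]
        for col in TRIGGER_COLUMNS
    )
    if not changed:
        # Assumption: non-trigger columns are overwritten without versioning.
        current.update(new_values)
        return current

    # Mark the existing version as history.
    current["expiration_date"] = change_date
    # Create the new current version with a fresh surrogate key.
    new_row = {
        **current,
        **new_values,
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        "effective_date": change_date,
        "expiration_date": None,
    }
    rows.append(new_row)
    return new_row


rows = [{
    "surrogate_key": 1,
    "customer_id": "C001",
    "name": "Ann Lee",
    "city": "Boston",
    "effective_date": date(2020, 1, 1),
    "expiration_date": None,
}]

apply_type2(rows, "C001", {"city": "Denver"}, date(2024, 3, 1))
```

The dimension now holds two rows for the same customer: the Boston version, expired on the change
date, and the Denver version, which is current.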

Type 3 - Add a new column to store the previous value


With a Type 3 SCD, the current value of a dimension attribute is stored separately from its
previous value.
To accomplish this, two columns are created for each tracked data field:

one storing the current value,
and one storing the previous value.
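
A minimal sketch of the two-column approach, assuming a row is a dictionary and that the column
pair follows a current_/previous_ naming pattern (the pattern and the function name apply_type3
are hypothetical):

```python
def apply_type3(row, column, new_value):
    """Shift the current value into the previous-value column, then store the new value."""
    row[f"previous_{column}"] = row[f"current_{column}"]
    row[f"current_{column}"] = new_value
    return row


customer = {"customer_id": "C001", "current_city": "Boston", "previous_city": None}

apply_type3(customer, "city", "Denver")
apply_type3(customer, "city", "Austin")
```

Only one previous value is kept per field: after the second change the row holds Austin and
Denver, and Boston is no longer available.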

One of our previous clients had a similar business problem: the business wanted to change an SCD
type 1 dimension to SCD type 2.
If the business wants a comparative analysis of current items and historical items side by side,
the model may still need an SCD type 1 view to support that analysis.
It is always advisable to keep a validity flag on an SCD. For an Item dimension, for example, I
would also include an Item_Valid field to support what-if analysis. If the business wants to
compare the current year's valid items against the previous year's items, which may include
discontinued items, then we may need both a type 1 SCD and a type 2 SCD based on the effective and
expiration dates.
When creating a business model in a tool, to see an item according to its validity we can use the
Oracle Last function to aggregate on a dimension. Given an Item Id and an Item Effective date, we
would model the business logic so that the Item Effective date is aggregated with the Last
function.
If possible, model a flag field for historical validity too; this gives the reports more
flexibility.
Also, it is fairly easy to model an SCD type 2 and change it to SCD type 1, but the reverse is not
simple, because a type 1 dimension has already discarded the history that type 2 needs.
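
The idea of reading a type 1 view out of type 2 rows can be sketched in plain Python. This is not
the Oracle Last function itself, only an imitation of the same idea: for each Item Id, keep the row
with the latest effective date, then filter on the validity flag. The names latest_versions,
item_id, effective_date and item_valid are hypothetical.

```python
from datetime import date


def latest_versions(rows):
    """Return, per item_id, the row with the most recent effective date."""
    latest = {}
    for row in rows:
        best = latest.get(row["item_id"])
        if best is None or row["effective_date"] > best["effective_date"]:
            latest[row["item_id"]] = row
    return latest


# Hypothetical type 2 Item dimension: item 10 has two versions, item 20 is discontinued.
items = [
    {"item_id": 10, "description": "Tea 100g", "effective_date": date(2022, 1, 1), "item_valid": True},
    {"item_id": 10, "description": "Tea 125g", "effective_date": date(2023, 6, 1), "item_valid": True},
    {"item_id": 20, "description": "Cocoa 250g", "effective_date": date(2022, 1, 1), "item_valid": False},
]

current_view = latest_versions(items)
valid_ids = sorted(k for k, row in current_view.items() if row["item_valid"])
```

Here current_view behaves like a type 1 dimension derived from the history, and valid_ids excludes
the discontinued item. Going the other way is not possible, since a type 1 table never stored the
older versions.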
