Introduction

The concept of a Slowly Changing Dimension is the foundation of dimensional modelling.

Other than the dimension key and the business key, each column in a dimension must carry one of the following roles:

  • SCD0
  • SCD1
  • SCD2
  • SCD3

While much of the literature around dimensional modelling considers that it is the dimension (table) that carries the role, this is not complete or correct. While all columns in a dimension may have the same role, they may also carry mixed roles. In other words, some columns may be SCD0, while others may be (say) SCD1 or SCD3. Ajilius correctly implements this concept.

SCD0

SCD0 columns are written at the time the row is created, but never updated.

An example of an SCD0 dimension is a calendar. Unless we get a new Pope or a new Sun, it is highly unlikely that the calendar will change. Thus, it may be appropriate to consider that a date dimension is SCD0.

For a more realistic example, consider the following extract from a Bank Account dimension:

  • account_key (DK)
  • account_number (BK)
  • opening_branch (SCD0)
  • current_branch (SCD1)

Our business might consider the branch at which an account was opened to be an important attribute, but the account-holder might move that account from branch to branch to meet their personal requirements, and in response the bank may update the current_branch.

SCD1

SCD1 columns are written at the time the row is created, and updated whenever a new row is received for the business key. This is the method of inserting and updating data with which most users are familiar from other computer systems.

In the example we used for SCD0, the current_branch column is of type SCD1. If the user moves their account from one branch to another, this value will be over-written the next time the dimension is updated.

No history of changes is kept for an SCD1 value.

SCD2

If the history of changes to a dimension is required, the most common means of tracking change is through one or more SCD2 columns.

Ajilius automatically adds three special columns to a dimension which contains one or more SCD2 columns:

  • scd_from (valid from date-time)
  • scd_to (valid to date-time)
  • scd_current (current row flag Y/N)

When a row is added to the dimension, its values are created. The SCD_FROM value is set to the earliest possible Ajilius date (1900-01-01), the SCD_TO value is set to the latest Ajilius date (2999-12-31), and SCD_CURRENT is set to ‘Y’.

When a row already exists in the dimension, its values are compared. If the value of one or more SCD2 columns has changed, the following operations take place:

  • The current row is updated
    • The value of SCD_TO is set to the current date
    • The value of SCD_CURRENT is set to ‘N’
  • A new copy of the row is created
    • A new dimension key is assigned
    • The value of SCD_FROM is set to the current date
    • The value of SCD_TO is set to the latest Ajilius date
    • The value of SCD_CURRENT is set to ‘Y’

Thus, we can have multiple rows for a given business key, representing the number of times that one or more SCD2 columns have changed during the life of that business key.

Considering our earlier example, we might want to track the history of the account-holder’s name over time:

  • account_key (DK)
  • account_number (BK)
  • opening_branch (SCD0)
  • current_branch (SCD1)
  • account_holder_name (SCD2)

A new row will be created for this account_number, whenever the user changes their name.

The CREATE statement for this table might look like:

create table dim_account (
    account_key integer,
    account_number integer,
    opening_branch text,
    current_branch text,
    account_holder_name text,
    scd_from datetime,
    scd_to datetime,
    scd_current char(1)
    );

SCD3

We use an SCD3 column when we want to track a “previous” value of a column, but we’re not concerned with every change that has ever been made.

When a column has the role SCD3, Ajilius will automatically create an “_old” version of the column.

Let’s assume that we want to add the account manager to our account dimension:

  • account_key (DK)
  • account_number (BK)
  • opening_branch (SCD0)
  • current_branch (SCD1)
  • account_holder_name (SCD2)
  • account_manager (SCD3)

When the table is created, Ajilius will automatically add a column named account_manager_old to the table, immediately following the account_manager column.

When the row is updated the value of the account_manager column will be moved to the account_manager_old column, and the new value written to the account_manager column.

The CREATE statement for the table might now look like this:

create table dim_account (
    account_key integer,
    account_number integer,
    opening_branch text,
    current_branch text,
    account_holder_name text,
    account_manager text,
    account_manager_old text,
    scd_from datetime,
    scd_to datetime,
    scd_current char(1)
    );