Slowly Changing Dimensions and Date Chaining
Posted On: Jul 09, 2014
Attributes that experience changes once in a while are called slowly changing dimensions. Date type attributes that need to be updated while saving previous records, are managed using the concept of date chaining.
In relational databases every schema has a unique identifier attribute, example a person schema can be identified using social security number. Other data that can be stored about a person include name, phone, email, gender, address, date of birth, salary and so on. Details like gender and date of birth never needs to be changed. But, it is possible that a person changes address and may experience raise in salary, or a change in phone number, all these changes need to be updated time to time. However, such data is not changed on a daily basis. Data attributes that changes slowly are called slowly changing dimensions and there are a few approaches to manage slowly changing dimensions, some of which are listed below.
One way to deal with changing dimensions is to override the data that needs to be changed, for an example in case of change in a person name, if we do not care about what the previous data in the person’s name was, we can override the data with new name. In that case if we need information about past data we cannot have access to it. There may be cases when we might want to keep track of changes in data, in that case other approaches may be used.
Specially with dates, example a person worked as a programmer for period X to Y and as a manager for period Y to Z and we are interested in keeping historical information. Then we can use another approach where we add a new column to person table for storing information about which is the most current record and keep adding new rows for every time there is a change in designation of person(s). This concept is also known as date chaining, the start date of one designation is the end date of previous designation for that person.
Sometimes we may not be interested in storing all previous records but just the 2 recent ones, in that case instead of adding new rows, we just add additional column for storing past data. For an instance if the salary of a person changes, we might want to store the new salary and just one salary before the new one for keeping track of percentage increase. In this scenario we do not want all other old salaries, just the recent salary and the salary before it, this can be done by having 2 salary columns in person table, one that stores new salary and the other that store the salary just before new salary.