blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

SSIS: SCD Wizard demo

SQL Server Integration Services (SSIS) contains a wizard that enables you to easily build packages to manage slowly changing dimensions (SCDs).

Nick Barclay has been reading my blog for a while now and decided that due to the lack of demo material here about the SCD Wizard he would go ahead and build his own.  Nick has asked me if I could distribute it here which I am delighted to be able to do.

Its an excellent little demo that's very easy to run so I recommend that if you are unfamiliar with the SCD wizard or even slowly changing dimensions in general you should download this demo and have a play. The only change you will have to make to the package is the name of the server in the connection manager. The only pre-requisite is that you have the AdventureWorksDW database installed. Nick has provided 3 scripts that you need to run this (actually you only really need 2 of them) as well as instructions on which order to execute them and the package.

Here's a screenshot showing the pckage after the second execution. You can see that data is flowing down all 3 of the various outputs from the SCD component.

 

Download the zip file from here: http://blogs.conchango.com/jamiethomson/files/20050606MyTestSCD.zip

Thanks for this Nick!

-Jamie

 

Published 06 June 2005 18:35 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Thomas Pagel BI Blog said:

One of the very nice enhancements of SQL Server 2005 Integration Services (SSIS) compared to DTS is the...
July 24, 2005 17:04
 

Thomas Pagel BI Blog said:

One of the very nice enhancements of SQL Server 2005 Integration Services (SSIS) compared to DTS is the...
July 24, 2005 18:14
 

John P said:

Where can I get the download installation for AdventureWorksDW?


August 2, 2005 20:12
 

Jamie Thomson - Life, the universe and SSIS! said:

Nick Barclay has begun writing his own blog at http://nickbarclay.blogspot.com/.
Why am I telling you...
September 28, 2005 16:17
 

Jamie Thomson - Life, the universe and SSIS! said:

Nick Barclay has begun writing his own blog at http://nickbarclay.blogspot.com/.
Why am I telling you...
October 1, 2005 20:32
 

Dave said:

Where is the zip file?
November 17, 2005 00:11
 

jamie.thomson said:

Dave,
The word "here" is a hyperlink to it.

-Jamie
November 17, 2005 06:50
 

jamie.thomson said:

Dave,
In fact I've edited the post so that the hyperlink is now explicit.

-Jamie
November 17, 2005 07:51
 

SSIS Junkie said:

Nick Barclay has begun writing his own blog at http://nickbarclay.blogspot.com/ . Why am I telling you

January 16, 2007 18:15
 

GT said:

Try as I will I can not find any info on the use of SCD Original ID. This field is created for you, along with End Date and Start Date when you use the AS Dimension Wizard. Generate scheme will then update or create your Dim Table accordingly.

The IS SCD transform seems to use the fields created by this, all but the SCD Original ID - I guess I was expecting tighter integration between AS and IS regarding these two. But that aside, I can not find any example in BOL or the WWW for that matter on how SCD Original ID can be used within the context of an IS SCD Data Flow with an SCD transform.

I've got an flat file input connected to SCD Transform and using the SCD Transform edit (wizard) the Data Transform is very nicely created and works. The only thing missing is how to use the SCD Original ID.

January 18, 2008 14:24
 

Ramesh said:

Hi ,

This is nice but everyone needs one good example in Slowly changing Dimensions that will cover all the examples either Lookup or SCD Wizard it helps us lot

February 4, 2008 14:15
 

Mitch said:

Hi Jamie,

   Any thoughts on why the Slowly Changing Dimension does not have a 'deleted items' output?  I have a transactional tabel of employees, and I would like to track who they report to, historically, as well as when they leave the company (indicated by being removed from the transactional table).

  Another useful feature might be a Column_Changed Column added to the Historical Dimesion output, allowing the developer to see which columns were changed.  This is helpful for alerting supervisors when their employee's information has changed.  Any thoughts on how to do these things with SSIS 2005 Scripts?

Mitch

July 1, 2008 00:45
 

jamie.thomson said:

Mitch,

I would imagine that the reason for there being no 'deleted items' output is because it is not expected that you would ever delete anything from a data warehouse. If you remove people from a table then you can no longer track them historically.

-Jamie

July 1, 2008 10:07
 

Mitch said:

Jamie,

Thanks for the reply!

That's the beauty of the SCD.  While the row is necessarily deleted from the Transactional table (which only houses information immediately relevant, i.e. current employees), the Deleted Items output of an SCD could identify this, find the current row for that business key in the historical table and mark it as expired.  Here's an example:

Let's say John is an employee in my company.  He started on January 1 (sorry, using US dates.  can't handle the British format) and changed supervisors on May 1, and leaves the company on July 1.  After July 1, his record would be gone from the transactional table, as it only contains active employees.  The historical table, however, should look like this:

Name:       Supv:       StartDate:       EndDate:

John          Cyndi        1/1/08            5/1/08

John          Sally         5/1/08            7/1/08

Because the end date field is populated for all of John's rows, we can know that he is no longer an employee, but still maintain his records for historical purposes.  

Does this make sense, or am I totally missing the point?

Mitch

July 1, 2008 19:26
 

jamie.thomson said:

Mitch,

I think I see what you're saying. You basically want to know all of the rows that ARE in the destinaion but NOT in the source. Am I right?

If so, that makes sense and hence I'm no nearer to answering your original question "Any thoughts on why the Slowly Changing Dimension does not have a 'deleted items' output?" :)  Perhaps this isn't seen as a scenario that raises its head too often. The other possible reason could be that working out what is in the destination that is not in the source is computationally intensive. You are effectively doing the reverse of what the SCD does today (i.e. find everything in the source that isn't in the destination). With that in mind you can still use the SCD to do what you want - simply make the destination table an input and the source table the cache.

Hope that helps.

-Jamie

July 2, 2008 09:53
 

Mitch said:

Hi Jamie,

That's not a bad idea.  I'll give it a shot.  Thanks!

Mitch

July 3, 2008 00:05
 

Chat said:

Has anyone had trouble creating surrogate keys for type 2 dimensions. I had to use a trigger to set my surrogate keys, although I wished I could of done it in SSIS. The primary key for my dimension is the surrogate key + sequence number

Just wandering if anyone had a similar issue and resolved it. I have looked at quite a few articles but nothing related to my specific issue

August 8, 2008 20:20
 

jamie.thomson said:

Chat,

Did you look at this article: http://www.sqlis.com/37.aspx

-Jamie

August 11, 2008 08:54

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems