blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

Kristian Wedberg - Business Intelligence, Data Warehousing, SSIS

A Swede on a walkabout

Database Snapshots and SSIS

Did you think SQL Server 2005 database snapshots were really cool, but wondered how useful they were in practice? Building a Data Warehouse, here's an ideal use for them:

In my current project we're extracting data for the data warehouse from several SQL Server 2005 source system databases. Since we're extracting from multiple tables from each database, and since those tables continue to be populated during our extract, there is a definite risk that the extracted data is out of sync due to the extracts from different tables happens at slightly different times.

Database snapshots to the rescue

By first creating a snapshot of the source system database, and extracting from that static snapshot (instead of the original, changing database), these synchronization issues are avoided!

Even better, while the underlying database might be modified during the extract, the overhead for creating and using the snapshot is very small - it's proportional to the number of source system table pages changed during the extract.

Each source system database to be extracted has the following tasks configured:
 

Control Flow Image

1. The first Execute SQL Task creates the snapshot. Since no data is copied, this is a virtually instantaneous operation. At this stage the source system Connection Manager still points to the original database.

Creating the snapshot

2. The first Script Task switches the database ConnectionString from the original database to the newly created snapshot database. This avoids having to create and keep a second connection manager in sync with the original database.

Switch ConnectionString

3. The Sequence Container contains all the Data Flow tasks etc. needed to move the source system data (now guaranteed to be static during the extract) into the data warehouse. NB: the following precedence constraints are "On Completion", so even if the extract fails, the snapshot will be deleted.

4. The second Script Task switches the ConnectionString back to the original database, i.e. almost same code as 2 above, but change "Source_System_Snapshot_Extract" into "Source_System", and vice versa.

5. The final Execute SQL Task simply drops the snapshot, again an instantaneous operation.

Dropping the snapshot

Wrapping up

One thing to watch out for is when additional database files are created, either manually by the DBA or automatically due to the source system having a maximum file size set. The snapshot creation command must specify ALL database files, so either update the snapshot command manually when this happens, or extend the Script Tasks to automatically figure out the number and filenames required.

In summary, database snapshots and SSIS have proved to be a very useful combination for extracting consistent data into the data warehouse.

Published 06 March 2006 18:45 by Kristian.Wedberg
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Kirk said:

I need to extract some parts of large databases and copy to another server programatically. Worked like a charm in DTS2000 but I am at a loss with this new SSIS. I am now stuck with an error VS_NEEDSNEWMETADATA (Google search turns up one non-helpful hit).

Any suggestions where to look for the simplest "Copy a table from serverA to serverB" sample code (VB.Net or C#)?

Thanks
kirk.dybvik
at
us.benfieldgroup.c
om
March 10, 2006 22:16
 

Kristian.Wedberg said:

Hey there Kirk,

My approach would be to first create a template package using the VS SSIS GUI, to ensure my package(s) are working correctly. Then read the package programmatically and list all the components of it (tasks, connection managers...) so you understand how it's constructed.

For production, you can of course create it from scratch, but it's likely easier to read and modify a template package programmatically.

BOL (in all its gory detail) is of course the prime place to figure out how to do the programming, I'm sure you've dug though this:

http://msdn2.microsoft.com/en-us/library/ms135946(SQL.90).aspx

Let me know how it goes!Kristian

March 12, 2006 01:43

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Personal Edition), by Telligent Systems