blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

Conchango are busy and need talented consultants in and around London. Interested? Email me or send me a message

SSIS Nugget: Bitwise Checkpointing

Checkpoint usage in SSIS is a topic that is not well understood at all. The mechanism for defining checkpoint files and configuring containers to use them is "fiddly" at best and isn't helped by a bug in the Sequence Container (of which you can read more about here). I hope to find the time to write a definitive article in the near future that explain all the nuances of checkpoint files.

One thing is true however and I can state this categorically. In order for a checkpoint file to exist after the package has finished executing, the package MUST fail execution. Note that by default, if any task in your package fails then the package will still succeed. If you want to prove this, follow these steps:

  1. Start a new package
  2. Drag on a data-flow task
  3. Set the data flow task property ForceExecutionResult='Failure'
  4. Put a breakpoint on the data flow task.
  5. Set package properties CheckpointFileName='c:\temp\chkpnt.xml', CheckpointUsage='IfExists' & SaveCheckpoints='True'
  6. Execute the package

The package will hit the checkpoint. At this point check 'c:\temp' and you will see that the checkpoint file 'chkpnt.xml' has been created. Run the package through to completion and you will see that the data flow task fails and the checkpoint file disappears. This is because even though a task in the package failed the package did not, hence no checkpoint file is created. You will also see the following messages in the output window:

Note the warning near the end of the output. Even though the task failed the package will continue running. And it ends, exiting with code 0 - meaning success.

This can sometimes be a problem. What if I want my package to always succeed but on the next execution I only want the previously failed containers to execute? Unfortunately, as just proven, SSIS checkpoint files are not an option to us so we need another solution. And here it is - bitwise checkpointing.

The basic premise is simple. Each container in the package is preceded by a conditional precedence constraint that will cause the container to be skipped (or not). Question is, what do we base the expression on?

Well, if we conceptually assign each container a value which is some power of 2 then we can maintain an integer variable (scoped to the package) that the conditional precedence constraint can check, using the bitwise '&' operator, to see if the power of 2 value of the container that follows it has been added to the variable value. A script task following each container adds the container's power of 2 value to the integer. At the end of the package we store the variable value somewhere so that we can use it on the next execution.

That was a heck of a mouthful so I've built a package to demo what I am talking about. Here is the package:

Some things to note from this picture:

  1. Variable called CheckpointVariable that is initialised to zero
  2. Script task Store checkpoint value external to package that stores the value of CheckpointVariable in an operating system environment variable called BitwiseCheckpoint. Note that this container executes last.
  3. Script task Retrieve checkpoint value from environment variable that retrieves the value of BitwiseCheckpoint and stores it in CheckpointVariable. Note that on the first execution BitwiseCheckpoint will be empty so CheckpointVariable will remain initialised at zero.
  4. Seven dataflow tasks. For convenience I have named them according to the power of 2 that I have conceptually assigned to them. These names are only a visual aid and are not used in any way.
  5. Each dataflow tasks is preceded by a conditional precedence constraint. The expression on each precedence constraint is
    (@[User::CheckpointVariable] & X) != X
    where X is the name of the subsequent dataflow task.
  6. Each dataflow task is followed by a script task that adds its power of 2 value to CheckpointVariable.
  7. Script task Store checkpoint value external to package is outside of the sequence container hence it will execute regardless of whether all the dataflow tasks succeed or not.

What you cannot see from the picture above is that the dataflows don't actually do anything (they don't contain any components) and dataflow task "4" has got property ForceExecutionResult='Failure' so as to simulate a failure.

When we execute the package for the first time, here is what happens:

There's nothing out of the ordinary there. Task "4" has failed and the package has run through to completion. I now set property ForceExecutionResult='None' (i.e. the default) on task "4" and execute again. Here is what happens.

Task "4" has executed (successfully this time) but none of the other dataflow tasks have. Remember, there is no checkpoint file at work here - we've done it all through our makeshift bitwise checkpointing mechanism.

That's all there is to it. As ever I've made the demo package available here. The Live SkyDrive team have done an upgrade and annoyingly none of the old links work anymore. The new link is here:

If there are any more problems with this link, please reply with a comment and let me know.

-Jamie

Published 19 July 2007 00:11 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Professional Association for SQL Server (PASS) SIG said:

July 27, 2007 15:54
 

Professional Association for SQL Server (PASS) SIG said:

July 27, 2007 15:54
 

SSIS Junkie said:

I've spoken about Live Folders quite alot on here lately and some big news from that product team

August 9, 2007 19:00

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems