blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

SSIS: Parent package configurations. Yay or nay?

A few of my fellow SSIS MVPs and I were asked if we wanted to submit a whitepaper to coincide with the new release of SQL Server; today those papers were (bar one) released:

Quite a common theme there, I guess Matthew, John, Rafael and I should have conferred before we started writing :)

My paper discusses a technique that could be used to eliminate the use of external configurations by instead using the /SET and /CONFIGFILE options of dtexec.exe in conjunction with Parent Package configurations. I have implemented this approach on a recent project and hence have come to experience one of its limitations that I alluded to in the paper:

"There are also disadvantages to using the /SET and /CONFIGFILE options. For example, the Execute Package Task does not have a similar feature that reproduces their behavior. Therefore, if you have a solution that uses parent and child packages, you will probably find that you need to make use of parent package configurations."

In said project I wanted to use one log file per package and pass the folder location of those log files from the master package to child packages using a parent package configuration. When I tried this I was quickly reminded of Davide Mauri's blog entry Beware of Parent Package Variable configuration! where he explains the order in which certain events occur at the beginning of a package's execution:

  1. All configurations other than parent package configurations are applied to the package
  2. Logging starts
  3. Package is validated
  4. Parent Package Variable Configurations are loaded
  5. Expression are applied
  6. Package is run

As Davide says:

"if you need to make the logging connection manager configurable at run time you MUST avoid using Parent Package Variables"

At best the implication of not doing this will be that your package logs to the wrong file. At worst it will attempt to log to a non-existent location and the package will fail before it has even started. I chose not to use one log file per package and instead have a single log file in the master package.

This limitation would not be a problem if the Execute Package Task supported the same options as dtexec.exe so I urge you to vote for the same at Execute Package Task should support parameters. The afore-mentioned Matthew Roche states at that link

"It would be difficult for me to agree any more with this suggestion. The lack of reuse in SSIS is a major barrier to productivity and usability, and is likely a major barrier to product adoption as well."

Phew. Don't mince your words Matthew!!!

On a very similar vein I was recently emailed by Sam Loud (Sam is one of this blog's perennial commenters for which I am very grateful) and he asked me to share the following:

When using child packages, is it better to use Parent Package configurations, or go straight to the configuration itself?


I'm using a classic Parent / Child package solution. The parent package is configured from SQL Server, via indirect configurations using environment variables. This all works just fine (although I hate the Windows environment variable editor).

I'm using the configuration to populate about 15 variables in the parent package - exactly what you'd expect; variables for connection strings, log files, SSIS_Root, all that sort of thing. Obviously, I need the child packages to inherit some of these configured items, particularly connection strings.

Currently I'm setting up parent package variables on the child packages, and pulling the variables I need down from the parent. This works just fine. However, when I want to debug a child package, it can't inherit the configured items from its parent, because it hasn't been called from a parent. This means blank variables, which means blank connection strings, which means it can't validate.

To get around this, I hardcode defaults into the variables in the child package. This is an awful workaround though, because I'm totally bypassing the configuration, and there may be many child packages that need changing. It also causes problems when moving packages, and violates all of the 'keep packages in the dark' best practices.


So, should I go straight to the configuration to populate the variables? This has the advantage that I can debug the child package and it will configure according to the configuration, rather than the hardcoded variable defaults. One problem here is that the SQL Configuration has 15 configuration items, but the child package only requires say..2 variables to be configured. This means 13 warnings will be thrown, because 13 of the configured items will fail to find their destination variables in the child package. I don't want my packages to be throwing warnings.


Surely some of you must have come to an optimum solution here. Please share. I'd be interested to hear your approaches and rationale. Hopefully this will lead to a new best practice; one that has been validated by the community.

One workaround to Sam's problem is to set the package property SuppressConfigurationWarnings=TRUE (its set to FALSE by default) but to be honest that's just a problem avoidance, not a problem solution. My personal opinion is that there is no optimum solution because there are so many variable scenarios - not many will have 15 configurations like Sam. What say you?

-Jamie

Published 29 August 2008 22:49 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

MatthewRoche said:

Thanks for bringing this up again, Jamie. The fact that we can't "pass in parameters" to an SSIS package (such as by using /SET) and have those parameter values automatically cascade to all child packages (such as by using /SETCASCADE or /SET and /CASCADE, if they existed) is such a huge and obvious gap I can't believe that people aren't falling over backward trying to plug it.

Sorry - I've had a long week. ;-)

In any event, this is a problem that affects many SSIS users, but I do not believe that there is any one "correct" solution. Just 10,000 different workarounds...

August 29, 2008 23:39
 

SSIS: Parent package configurations. Yay or nay? said:

August 29, 2008 23:56
 

Fridtjof said:

Our approach to configuring SSIS packages seems quite different to those mentioned above.

We use a combination of using an environment variable (STDCONN) and SQL configurations (SSIS_Configurations table).

The SQL config is being read by an OLEDB connection manager whose connection parameters are being set

by the configuration which is referenced by the environment variable.

So if you transport a package to another environment (e.g. from DEV to QA or PR) all your relevant configurations

reside in the SQL-Server which belongs to that environment.

To separate and distinguish different Package configurations, we use hierarchially named categories in the ConfigurationFilter attribute of

the SSIS_Configurations table. Hence we are able to use specific portions of the whole bunch of configs and can avoid unwanted warnings.

That means every package has at least two configurations:

* The config STDCONN for the location of the SSIS_Configurations table and

* One ore more further configurations from SSIS_Configurations (one config for each desired category)

Of course this method comes with some caveats. One of them is that you cannot use one SSIS-Environment for different

production scenarios (altough we do practice that in some cases) because the name of the environment variable is unique.

I consider this not to be a workaround but a working solution. Developers are even able to open packages from different

projects because of our naming conventions, so I think it's worth the effort to maintain the configuration table.

September 1, 2008 09:03
 

Phil said:

As it happens I work with Sam - and I've posted about parent package configurations versus direct configuration for child packages in some detail on my blog: http://phil-austin.blogspot.com/2008/08/child-vs-parent-package-configurations.html.

In terms of solutions, although I have yet to look at it in detail, a couple of ideas occur to me:

1. Use a child package template - for commonly used configuration you could hard-code values into the package variables.

2. Use different configurations for development-time and deployment-time. You could use direct configurations when developing packages and once frinished, switch to using parent configuration when finished. A bit dangerous this since you could simply forget to do it - although this should be spotted if you have a test environment.

Anyway as I say in my blog post this is something I'll return to. SSIS configuration is something I've spent a bit of time on over the last few weeks...

September 1, 2008 18:29
 

Jon said:

Another useful post, Jamie. Thanks to all for the insightful whitepapers...

September 2, 2008 19:37
 

Log Buffer #113: A Carnival of the Vanities for DBAs said:

September 5, 2008 17:59
 

Alex said:

We have taken an approach very much like the one Fridtjof outlined - I commend it, it works well, and makes deployment very straighforward and managable.

Use of SQL configurations and sensible naming conventions for the configurations filters allow configuation to be selectively applied as needed to your packages. For example we have a 'Common' filter which applies common environment-specific settings used by most packages, with other filters denoting more specific usage.

In my view this is much more manageable, less onerous and error prone than setting up potentially numerous Parent Package Configurations across all your child packages.

Also packages relying on Parent Package configurations cannot readily be run in isolation - for example for debugging during development.

We do use some Parent Package configurations - generally limited to information which cannot be predetermined in static configuration - for example is a package being invoked as child or not.

One difference in our set up compared to Fridtjof's overcomes the limitation of being unable to run separate environments on the same SSIS server due to the uniqueness of the environment variable. Less of consideration with the growth of virtualised servers I suppose, but one which we still need to put up with in our infrastructure.

Use an xml configuration file specific to each logical environment (Test,  Support,  UAT etc.) to hold the connection properties to the relevant SSIS SQL configuration database. Within the package, define  the xml configuration without any absolute path - just the name of the configuration file.

Have a specific single folder for each environment holding the relevant configuration file. The packages can be located wherever you like, and can be shared between environments if that is appropriate.

Now just ensure that when a package is invoked via DTExec, the working folder for DTExec is set to the folder holding the configuration file for the required environment. Everything then falls into place. The package picks up the correct configuration file and connects to the required database to pick up the rest of its configured settings. And it works fine for any child packages that are called in-process.

We have a formal mechanism to construct the required DTExec command line which amongst several other things passes the path to the package folder as a /SET argument, allowing simple run-time configuration of file connection managers to co-located child packages. Child packages located elsewhere can be defined in a SQL configuration

Packages can be invoked in this way by any suitable programmatic menas - we call packages from SQL procedures, SQL Agent jobs, Batch Files and Scheduled Tasks. With little effort such SQL procedures can and should be made location-independent too - easing the deployment task.

In Visual Studio the packages work in debug without any changes at all. The requirement here is that the solution, project, xml configuration, and packages are in the same folder.

Deployment involves setting up the SQL Configuration entries, then copy and paste the packages.

September 9, 2008 18:34
 

Recent Faves Tagged With "logfile" : MyNetFaves said:

September 14, 2008 19:50
 

Clark said:

I found that when debugging, I was passing a variable from one package to another.  The Package being called defined the Package Variable as well as a few others that weren't passed.

Seems that SSIS is polite enough to warn us that a defined Package variable didn't exist and wasn't set, however, it also didn't bother setting any more Package Variables after the warning.

Had to create variables in the calling package even though they weren't used to get around this.

September 16, 2008 22:55
 

Jeff said:

Alex,

Your suggestion to "define  the xml configuration without any absolute path" worked like a charm. I just put all of our configuration in the XML config file. Doing so made this whole ugly configuration mess disapper for us: there is no need to use SQL Server configuration; no need to use the /CONFIG option or environment variable to pass the name of the config file; no need to pass the package folder as a /SET argument (it's in the XML config file); and there is now only one package configuration per package (vs requiring SQL Server configuration *and* XML configuration for SQL Server configuration connection string). Our configuration needs are perhaps more simple than yours, so we don't require anything fancier.

Too bad this feature doesn't get more explanation... in the Microsoft documentation or on the Net.

Thank you!

October 17, 2008 23:43
 

Colleen said:

This was extremely helpful.  Thanks to all of you for you advice!!

November 2, 2008 13:25
 

Josey said:

I have several child packages which all have their own configuration files. They work fine as expected. They all point to a global configuration file with the "LoggingMode" option set to "0". When i run each child package on its own, it runs perfectly and does NOT log any events to the sql server table. This is expected and am happy with this result.

HOWEVER....when i use a parent package (driver package) to run all child packages in sequence, for some odd reason, the child package seem to start logging. This is against their configuration. I have even gone so far as to explicitly turn off logging in the parent package and still no success.

Is this a SSIS bug ?

How is it that once all child packages are exectuted via the "Execute Paclage Task" in the parent package, that they begin logging? yet when run individually, they dont log ?

November 10, 2008 03:35
 

jamie.thomson said:

Hi Josey,

This isn't a bug as such its just the wy it is but it is a HUGE annoyance. it basically comes down to the way that parent package configs are applied - they get applied after everything else and the most visible consequence of this is that the package begins logging BEFORE you get a chance to change the logging config.

Davide Maur's blog entry (that I pointed to above) explains this in more detail ad is also why I vehemently believe that the Execute Package Task should support parameters (again, link above).

-Jamie

November 10, 2008 09:15
 

Josey said:

Thanks for your reply.

I dont really need the parent package to pass in parameters to the child packages. All the parent package does is fire off each child package one after the other. So my question is....why is it that when the child packages are run individually, they are able to read their config (from config file), and do what theyre supposed to do (not log)? yet when they are executed from some parent package, they ignore their own individual configurations and just start logging ?

I did read Davide Maur's blog entry and it seems that when a child package is run from a parent package, it will begin logging before it reads in its own configuration ? - very odd.

The only difference i see, is that in one scenario I am clicking a button to execute the child package individually (which works), and in the other scenario, another package is taking care of that and automating that process for a series of child packages (which doesnt work as expected).

Also, the same sets of packages run about 4X faster in SSIS 2008 (they are SSIS 2005 packages) as opposed to running much slower in SSIS2005. And whats even more startling is that im using a foreach container. So its only iterating one item at a time, so how on earth can this be sped up without a newer more effective archirtecture i dont know. what ever it is, SSIS2008 is way faster.

November 10, 2008 21:48

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems