blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

Conchango are busy and need talented consultants in and around London. Interested? Email me or send me a message

SSIS: Should I store connection strings in variables?

I've had an idea about how I'd like to use connection managers, variables, and consequently, configurations and its quite a diversion from how I manage these object types in SSIS currently.

Quite simply I'm toying with the idea of storing all my connection strings in variables and then using expressions to apply those connection strings to my connection managers. Why? Well there are a few reasons but it all comes down to one thing - reusability. If I store my connection strings in a variable then a single connection string can be used (and modified if needs be) by multiple connection managers - and I am currently working through a scenario where this would be useful.

This approach also has the added advantage that your connection strings would be available for whatever else you want to do with them such as entering them into a custom logging table.

That's quite a diversion from how I work currently so I'd be interested to see what the SSIS developer community thinks about that so if you have an opinion please let me know by posting a comment below.

 

Thanks

-Jamie

P.S. I'm aware of occasional problems when trying to post comments up here. I think you need to sign-up in order to do this.

Published 01 December 2006 18:01 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

furmangg said:

I did one project where all the connection strings were variables. This was nice because the master package could control the connection strings then pass the master package variable to the child package variable.

But the downside was that when you were trying to test a package that another developer wrote, your connection string might not be the same as his. So every time you'd open up someone else's package, it would take about 5 minutes to finish validating all the pieces of the package because the connection strings pointed to a non-existent SQL server.

So on my most current SSIS project, we opted to use configuration files using an env variable to point to the path of the configuration file. That way, I can open someone else's package and it will automatically pick up my connection strings.

Also, I can keep multiple dtsConfig files and swap them in and out when I need to test against the dev database and the QA database. When I was using package variables, I had to constantly retype those connection strings over and over again since often you just test one package manually without running it from the master package.

December 1, 2006 19:59
 

jamie.thomson said:

furmangg,

Interesting. Well what I am contemplating doing is a mixture of the two. I store the conn strs in variables and, rather than pass the values from a parent package, I set those variables using configurations. Would that solve the problems that you encountered?

-Jamie

December 1, 2006 20:10
 

furmangg said:

I think setting the variables based upon configurations would have solved the problems I had with using variables for connection strings.

December 1, 2006 20:38
 

paschott said:

I've been using expressions for a while.  This has been useful for me because I have worked with a lot of different sources and destinations, but the same basic structure.  I found that using the "Delay Validation" option in the code helps quite a bit to avoid validating that everything exists when I open a package.  It also stops the package from trying to write to the incorrect database or read from the wrong source.

My basic method has been package variables to control individual elements of the connection string where possible.  I have to manually change them when I open the package, but that's just a whole lot easier and faster than using the UI to search for source files or change database names.

-Pete Schott

December 1, 2006 21:14
 

Darren said:

Why have you got multiple connection managers with the same connection string? I would expect to have one connection per connection string, and use the multiple-instance behaviour of the connection manager to provide individual connections to each task or component as required. Saying that I have found having two connections is useful sometimes when using explicit T-SQL transactions.

One issue that people should be aware of when using variables for sensitive information, is that the value will be clearly visible in any checkpoint file, if used. You can take a view that these files will be secured correctly using NT permissions, but something to bear in mind. Some organisations have strong security rules or guidelines that may prohibit this.

December 5, 2006 09:40
 

jamie.thomson said:

Pete,

Thanks for the comments. Storing individual parts of th conn str in a dedicated variable is something else again that I hadn't considered.

Darren,

In answer to your first question....its because I need one of them to have RetainSameConnection=TRUE. But then I have other executables that do NOT require this.

-Jamie

December 5, 2006 14:40
 

mf_fits said:

Hi Jamie,

We are using connection string parts in variables and have an expression on the connection manager for just two reasons:

- first

we store a kind of status [dev or productive] in a variable to trigger the right connection strings and source / target database servers from the configuration

- second - and more important

regarduing our ETL-scenario we have to loop through several servers, databases and tables which all have the same metadata and will be activated via a kind of loadtable-configuration.

Using the connection string information in expression enables to modify the source / target and also enables a loop while loading data from different sources to different targets.

We also use expressions for the sql statements though that also will be modifiable during runtime of the package.

That in my opinion simplifies development because you are just talking about source and target and the dedicated servers are triggered via variables and expressions.

cheers,

Markus

December 8, 2006 19:30
 

jamie.thomson said:

Hi Markus,

Regarding your second point ... you are still using an expression on the connection manager to set the connection string. Am i right?

Certainly that is one way where you HAVE to use variables to achieve this. I guess I was driving more at storing your connection strings in variables when you have the option of NOT doing so.

-Jamie

December 8, 2006 20:32
 

lauriebgori said:

Hello,

If I'm following the discussion here correctly, you are saying to store 'Data Source=' in one variable then append the server name then ';Initial Catalog=' in another varible and so on?

My other question is naming conventions for database connections and how to use those in a configuration file and how to easily change those when moving packages to different servers without making changes to the package. Any ideas?  I've done a ton of searching and reading for specifics and there really isn't anything spelled out with specifics.  Or would you even put database connections in the configuration file? Or would each server you could connect to have a seperate configuration file?

Laurie

December 8, 2006 21:23
 

jamie.thomson said:

Laurie,

No, I'm not suggesting that. In fact, in my suggested best practices (http://blogs.conchango.com/jamiethomson/archive/2006/01/05/SSIS_3A00_-Suggested-Best-Practices-and-naming-conventions.aspx), I advocate storing the whole of the connection string rather than its constituent parts. Other people in this discussion maybeg to differ (which is fine by the way)

Regarding configuration files...I would ALWAYS put all of my database connection strings into a configuration file.I would also put each connection string into a seperate configuration file (that's something else I recommend in the suggested best practices).

I hope that answers your questions. if not, reply here and let me know.

-Jamie

December 8, 2006 21:44
 

lauriebgori said:

Hi Jamie,

Thanks so much for the reply.  I've actually read that best practices doc, a few times now (even before you sent this).  And I'm still a little confused...sorry, I have no idea why this is the hardest part for me to grasp.  I understand everything else pretty easily.  So lets say I have a package, package1.  Package1 has 2 database connections, one is an ADO.NET connection and one is an OLE DB connection.  I have named them DEVSERVERNAME.DatabaseName ADO.NET and DEVSERVERNAME.DatabaseName OLE DB in the package.  Each has a different connection string referencing the same server and database in our development environment.  So you said store each of these in their own config files.  What would I call the config files?  My guess would be something like DEVSERVERADONET and DEVSERVEROLEDB?  This is where I get confused.  All these names have to be generic enough to reference in the package so when I change servers, it works.  But what if I have more than one ADONET connection to another database?  I can't name the config with the server name in it because if I reference that server name config in the package it will change when I move to production to PRODSERVERADONET.  And if I want to use the ADONET connection to connect to another database, I can't reuse that connection string.  So that's when I started thinking about storing the different parts of the string like I mention in the first message. And what would you name the variables in the package to keep them generic enough?  It is the idea of keeping things generic that is messing with me I think.  I hope I haven't confused you with all of this.

Thanks again so much for the help!

December 8, 2006 22:36
 

jamie.thomson said:

Errr you've confused me a bit yeah, but don't worry about that :)

I think your fundamental misunderstanding is that you're supposed to create a connection manager for your dev environment, then when you deploy to another environment you would have a different connection manager. That's not the case.

In your scenario you should have a connection manager called DatabaseNameADONet and another called DatabaseNameOLEDB. You store the connection string for those two connection managers in a configuration file.

The mistake you are making is that you think there is a set of connection managers and associated config files for your dev environment, and then another set for each other environment. That is not the case. You only ever have one set of connection managers and associated config files. When you move through the environments the only thing you change is the connection string in the config files.

Hope that helps.

-Jamie

December 8, 2006 22:55
 

lauriebgori said:

AHhhhh Haaaa!  So then as the package moves from dev to prod, all I have to do is make sure that in each database connection in the config file has the right server name in the connection string and it will reference the right server without having to make changes in the package.  Perfect!  

The article below was what got me started thinking this way.  Check out the section called Guidelines on using XML configuration files in SSIS. If you leave the server name out, then that makes sense.

http://vyaskn.tripod.com/sql_server_2005_making_ssis_packages_portable.htm

Thanks again so much! I've learned so much from the stuff you have posted.

Laurie

December 8, 2006 23:26
 

Subhash Subramanyam said:

Jamie and others,

Will be of great help by you, if anyone can solve my issue with Package Configurations:

I have initialized few global variables and I want to assign them to the properties of few tasks, log provider and connection managers.

Would you please give me steps how I can achieve this ...

--------------------------------

"furmangg,

Interesting. Well what I am contemplating doing is a mixture of the two. I store the conn strs in variables and, rather than pass the values from a parent package, I set those variables using configurations. Would that solve the problems that you encountered?

-Jamie

December 1, 2006 8:10 PM "

March 15, 2007 15:35
 

jamie.thomson said:

March 15, 2007 18:16
 

Subhash Subramanyam said:

Hi, Jamie and furmangg,

  Thanks for your support. It was a great help.

Subhash Subramanyam

June 13, 2007 06:54
 

Tim McCurdy said:

You don't need to do this at all, you're making it too complicated as was I in the begining.  Do NOT add Connection Managers within each Package!  All you need to do is add Connections under the "Data Sources" folder of the SSIS Solution.  I usually add two DataSources for each Database I use, one OLEDB and one .NET Connection.  

Once you have all your Data Sources added, NOW you can add your Connection Managers, however, you need to select the option "New Connection from Data Source..." from the Connection Manager menu.

This will basically allow you to have the same exact connection in every Package and be able to modify the connection string in one location; the Data Source!  Now, when you move your Solution to production, you don't need to change a bunch of Variables, you just modify the Data Source connection properties!  I'm positive this can be done with configuration files, but I haven't tried it yet.

June 21, 2007 16:04
 

furmangg said:

Tim-

That's really an interesting idea. I've never uses the Data Sources folder in an SSIS project so I tried it out today. I created a data source and created a connection manager in a package based upon it. So far so good. Now I change my data source. If the package is open, it automatically updates the connection string on the connection manager. If the package is closed, it doesn't automatically update the connection string on the connection manager. When you open up that package, though, you get a Synchronize Connection Strings dialog box. So this works OK for development.

However, for deploying packages to production, I can't see this would work well. So if you were deploying some package to production, wouldn't you have to open dev studio and open each package and then change the data source? Doing so actually changes the dtsx file. It's never a good idea to change your source code (i.e. dtsx file) when deploying to production. As a workaround, I suppose you could control the connection manager's connection string with a config file in production only, but then you're not testing those config files during development. Any other thoughts?

June 21, 2007 21:00
 

jamie.thomson said:

furmangg,

That was my understanding as well. To be honest I can't speak with any authority about data sources because it is so long since I even considered them. They were built for Analysis Services not SSIS and that raised my doubts straight away. When I heard many MANY people complaining about them for the same reasons you have outlined here then I decided to steer well clear of them.

Tim, if you have had different experiences then I would love to hear about them.

-Jamie

June 22, 2007 02:40
 

Greg Galloway said:

If you’ve regularly found yourself changing the targetserver property and the data source connection string before you deploy yourcube to another environment, it’s simply because you’re not aware of a ...

March 19, 2008 23:57

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems