blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

Conchango are busy and need talented consultants in and around London. Interested? Email me or send me a message

SSIS: Package parts

As SQL Server 2008 is practically now a done deal my thoughts have turned to what will come in future versions. One feature I would like is something I'm notionally calling "package parts". Let me explain.

Today in SSIS we build, distribute and execute these things called packages. Packages are fairly large, verbose, non-descript BLOBs and that has some rather negative implications such as:

  • Changing anything in the package, even something as small as moving a container one pixel to the left, causes the package to get checked out (assuming you have your packages under source code control)
  • Support for multiple-developer teams is limited. If you have two people building dataflows then they have to be working on separate packages.
  • The only method of reusing tasks is copy-and-paste
  • Comparing two packages to check for differences and hence merging them together is not possible

I'd really like these problems to be alleviated in a future version of SSIS and this could be done by breaking the package into separate parts. In other words, containers (and remember, tasks are a type of container) exist as separate files in solution explorer. This would allow:

  • Check in/out of containers rather than a whole package
  • A developer could be building a dataflow without preventing another developer from working on another dataflow
  • Containers could be instantiated in different places. Define once, use many times - isn't that the premise of code reuse?
  • Compare tools could be built that understand how a container gets defined once it is serialised as XML without having to worry about extraneous matter such as workflow and designer formatting


Does that sound like a good idea? Let me know in the comments and if there is enough interest then I'll submit this to Connect.

-Jamie 

UPDATE: OK, 2 days on and there have been 18 comments saying "yes" and none saying "no". Hence, I've submitted this here.

 

Published 15 April 2008 22:39 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

John Welch said:

Absolutely! Been wanting this since the first time I used SSIS.

April 15, 2008 23:34
 

Mike Sexton said:

sounds good.  Make it so.

April 15, 2008 23:39
 

Christian Nordbakk said:

How could that not sound like a great idea ;) I'll vote!

April 16, 2008 00:09
 

Peter said:

Definitely.  I'd love to not have to check out an entire package when I debug it or be able to work on just a small section, knowing that other pieces wouldn't be affected.  Right now, I'm one of two or three SSIS users and we're all separate, but this would be useful as we're about to all be working on pieces of a data warehouse load.

April 16, 2008 00:25
 

Dave said:

I couldn't agree more! And I just wanted to add, that I would like to be able to change Data Flow Tasks and not have to delete and recreate Union All Nodes EVERY time something changes.

April 16, 2008 01:18
 

MatthewRoche said:

I could not agree more. Defining the container as the unit of reuse and versioning makes a lot of sense to me, for all of the reasons described above. The two biggest value propositions for me are the opportunities for reuse and reducing "developer check-in contention" and these two alone make this a "killer feature" in my book.

If you take the time to write up a Connect suggestion, I'll vote early and often.

April 16, 2008 01:20
 

Sam Webb said:

I'm all for anything that makes source-controlled development of SSIS packages less stressful!

April 16, 2008 02:31
 

Ivan Peev said:

Jamie,

You have to agree CozyRoc already provides code reuse with Script Task Plus and Data Flow Task Plus. You may not like it is a third-party solution, but it exists ready to be harnessed.

April 16, 2008 04:39
 

Joost said:

That would be great!

April 16, 2008 07:47
 

Dave said:

Count me in! I'd really really really like this to be include in a new release of SSIS (2010 ? ;-) ).

Let me know when you submit it to Connect.

April 16, 2008 09:22
 

Joe Harris said:

Jamie,

You're clearly the king of the SSIS user community. If anyone can get them to see sense it's got to be you. Good luck.

My wishlist:

> Sort of the f**king data types. Choose one set. Any set. Just one.

> Separate the functionality from the layout. (Hello Microsoft. It's 2008.)

> Make it easier to hand code VBA style. Why do I have to learn about dependencies just to script in SSIS?

> No XML: XML is evil.. how about some YAML? (I'm dreaming here.)

> Generic packages: I want to design a package that I can use as a task in another package. Doesn't seem like it would be that hard.

> Visual explain: I want to see which packages are blocking/non-blocking. Sort of like fan in/out in Ab Initio.

> Persistent connections: Why do I have to go through all this connection crap for every new package. What's wrong with ODBC? Use it as a paradigm if nothing else.

April 16, 2008 10:36
 

P R W said:

Definitely some good ideas there Jamie.

I particularly like point 3. We have worked on large packages where it would have been really useful to split that up and share the building between several developers (I suppose you could argue about the use of sub-packages but that isin't always ideal).

Sharing development also brings about issues with Package Configs which needs to be addressed.

Oh yes, and the point the last commentor made about datatypes, I agree....considering data flow is the fundamental point in SSIS.........why oh why there are so many differing datatypes between tasks, variables, expressions is just ridiculous.

April 16, 2008 11:20
 

Andy said:

I agree, having containers as separate items under source control would be good, separating code and layout could also be benificial. Visual comparison would be nice but happy with XML comparison, not forgetting that XML comparison is not the same as text comparison. Sharing and reuse of blocks would be wonderful.

April 16, 2008 11:25
 

Anthony Martin said:

Great idea.  I'll definitely vote for it.

April 16, 2008 14:39
 

Brian Schkerke said:

Another vote for yes.

April 16, 2008 15:37
 

Dave Fackler said:

Jamie,

Great idea.  Don't forget to mention that a container used within a package should allow the package to override properties of the container and tasks/objects within the container.  Thus, I could define a reusable container that did some work, but override properties at runtime within the package calling the container.  Somewhat like passing variable values from one package to another via package configs, but it needs to be easier than that is today...

Dave F.

PS.  It would also be nice (not sure how easy it would be though) to be able to instantiate a specific version of a container within a package.  Thus, if a container was updated, it should be possible for packages that currently call the container to use an older version.  Perhaps by importing the content of the container into the package itself?  Obviously, this would be something you'd want the package developer to have control over (freeze on a version of a resuable container or always use the most recent version).

April 16, 2008 19:32
 

Graeme Oliver said:

Absolutely yes - great idea.

April 16, 2008 23:18
 

LK said:

How about solution level variables, for truly solution scope items? (We have packages that are run on a monthly basis, and you currently have to change the month variable in every package...)

Otherwise, sounds great.

April 17, 2008 05:13
 

jamie.thomson said:

OK all, I think this is pretty conclusive. I've submitted this to Connect here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=338609

-Jamie

April 17, 2008 07:13
 

Mike C said:

Hey Jamie, while you're at it how about asking them to use standard SQL Server data type names instead of all those God-awful inconsistent C++ data type names (B_STR, I4??  How hard could it be for them to change the labels to VARCHAR and INT for God's sake?)

April 20, 2008 01:09
 

jamie.thomson said:

Mike C,

I won't be requesting that because personally I don't want them to make that change. If you want it then feel free to make a Connect submission yourself.

-Jamie

April 20, 2008 13:10
 

Dan English said:

This is a fabulous idea and I could really see this becoming useful especially in multi-developer environments and for reusability (is this a new concept?).  I get tired of the copy-paste and trying to hunt down where the object got pasted to in the designer and then if I can't find it right away I'll paste it again.

I could really see this being useful with script tasks and according to another comment posted above it sounds like there is already a 3rd party that has implemented this logic - so people must be requesting this type of functionality.  I am trying to recall when I used DataStage, but I am almost positive that there were containers that you could reuse just like there were connection managers that you could reuse.  In doing this it is really going to be key to be able to have the ability to have a dependency analyzer built right into the IDE instead of having to use the Metadata Toolkit (seems to currently be an issue downloading this on Microsoft - but you can get it here http://www.ssistalk.com/2008/03/28/microsoft-bi-metadata-toolkit).  We are going to need to find out what the impact of modifying these containers is going to have on the overall project and be able to easily track down the packages that are referencing this container.  I am assuming that if a change was made to the container object that it would be replicated throughout the project, but maybe not.  Thoughts?  I really think that it is time to start working on the metadata and having improved metadata reporting within the tool for documentation purposes (like IBM's WebSphere Business Glossary tool).

I know that BIDS Helper includes the ability to perform a DIFF on two SSIS packages now within the IDE which is helpful and it is also nice that it includes some highlighting on your tasks and connection managers to see if configurations or expressions are being used (http://www.codeplex.com/bidshelper).  It also provides an expression list window like variables which is really helpful too.

April 20, 2008 13:44
 

Carlos Vacca said:

Jamie,

Great idea. If I understand correctly your suggestion is aimed mostly at grouping TASKS into containers, right?

How about being able to group data flow transformations into separate containers as well?

It would be really powerful if you could group a sequence of data transformations into a custom black-box transformation, and then be able to reference it from various packages. This would allow breaking up the packages even further. Some advantages:

- more granular source control / parallel development

- encapsulation of frequently used custom data transformations.  more 'code' reuse

- blackbox/contract based approach when building data flows

May 1, 2008 04:50
 

ashish sinha said:

No second thoughts. Absolute yes.

May 6, 2008 15:49
 

Craig Mischenko said:

Yes, yes, yes.  SSIS packages are the very worst object type that we have to work with in our source control projects for our BI solutions.  Voted on Connect already!

May 30, 2008 08:46
 

Snork said:

I would like to comment on this, but as I have been subjected to developing SSIS functionality today, my tourette's syndrome is coming out of remission.  Am I the only person in the world who thinks SSIS is probably, all things being equal, the worst program ever created?  I'd love to find an alternative (besides just coding what I want to do in c#?)

June 25, 2008 20:28
 

jamie.thomson said:

Snork,

If you have some constructive criticism to make then I'm sure it will be appreciated at http://connect.microsoft.com/sqlserver

-Jamie

June 26, 2008 12:30

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems