blogs.conchango.com

welcome to the conchango blogging site
Welcome to blogs.conchango.com Sign in | Join | Help
in Search

SSIS Junkie

SSIS: Random information for beginners

Anyone who has used a software tool will tell you that its impossible to know that tool inside out after just a few dys of using it and of course SSIS is no different.

As you get to use SSIS then your knowledge improves and achieving your SSIS goals becomes alot easier. However, attaining that level of knowledge is not easy. That are alot of foibles in SSIS that you need to know, things that won't be documented, and that sort of knowledge can only be acquired by getting down and dirty with the product and actually using it.

As such, I had the idea that posting a list of tips, tricks, guidelines and similar for new users would be useful. The list is small to begin with but I expect it to grow over time and I'm looking for alot of input into this. In other words, I need your help!

If you have any suggestions then add them as a comment (N.B. Anonymous comments are disabled so you will have to register and login, otherwise email me) and if I think its a good suggestion I will add it to the list below. And remember that this is intended as an aid, so try and be polite! This isn't an invitation to whinge at me about things that don't work or that you think could/should work differently.

Without further ado, here is the burdgeoning list. I hope it proves to be useful.

  • In Derived Column component it is not possible to replace a value in a column with a value of a different type
  • Synchronous components output the same number of rows as in the input. Asynchronous components may not.
  • Synchronous components output data in the same buffer as is input. Asynchronous components use a different buffer.
  • If you get a warning on an OLE DB Source component when connecting to Oracle you can probably fix it by setting AlwaysUseDefaultCodePage='True'
  • You can change the default type mappings that SSIS (and the Import Export wizard) uses for connecting to external sources by editing the XML files at C:\Program Files\Microsoft SQL Server\90\DTS\MappingFiles (credit for this one to Pete Schott)
  • If you are dynamicly setting connection strings of connection managers at runtime then you may need to set DelayValidation=TRUE on the tasks that use those connection managers (Pete Schott)
  • When storing connection manager properties in a configuration file you only need to store the ConnectionString property as this is an amalgamation of all the other properties.
  • SSIS will not store passwords for you in a configuration file as this is considered a security risk. You need to manually edit the configuration file yourself to enter the password.
  • Setting IsSorted=TRUE on a component output does not sort the data for you, it only tells the engine that the data is sorted. Only set this property to be TRUE if you know for sure that the data is sorted.
  • When using OpenRowset Using FastLoad in the  OLE DB Destination, the component will attempt to commit records in batches, the size of which is determined by the FastLoadMaxInsertCommitSize property. If you have set the component to divert failed rows then in the case of an insertion failure all rows in the batch will be diverted to the error output. Hence, you are likely to see perfectly valid rows in the error output..

 

 

-Jamie

UPDATE 2006-10-23: Philip Coupar, pointed out that you need to login in order to post a comment so I've updated this post accordingly.

 

 

Published 18 October 2006 17:50 by jamie.thomson

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

skreekc said:

Loops in SSIS do not release memory until the package is stopped. While this is a known bug on MSDN, it will not be fixed until SP3 (link below).  

This is good to know because we built 5 SSIS packages that use a infinite loop and a File Watcher Task to monitor a directory for new files being dropped into it. When new files are dropped, the watcher tasks makes the loop startover, and process the files. This in essence makes the package always be in a constant run mode.

This seemed fine while we where testing the packages with a limited timeframe, and a limited number of files, becuase the package was being started/stopped. After testing had passed we deployed to production, where a point-of-sale server was constantly pushing files (about 6500 every 30 minutes). After 3 hours of processing files, the package would start failing on files becuase of a memory leak (out of memory exception).

We then learned the the loops don't release memory, so SSIS packages will consume memory everytime a loop inside one is ran, and it will not be released until the package is stopped and restarted.

To resolve this we removed the infinite loop and file watcher, and now just use a scheduled job to start/stop the package every 10 minutes and process the files located in the directory.

In short, SSIS Loops consume memory everytime they are ran, and do not release it until the package is stopped.

I don't know if this is where it should belong but i've been wanting to vent on this fore awhile!

http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=219782

October 23, 2006 19:55
 

jamie.thomson said:

skreekc,

This is really valuable info and I appreciate you providing it. However, its in depth stuff and kinda out of scope for what i want this post to be about.

That's not to say this info doesn't need to be distributed. perhps you could contact me offline jamieDOTthomson[AT]conchangoDOTcom

-Jamie

October 23, 2006 20:56

Leave a Comment

(required) 
(optional)
(required) 
Submit

This Blog

Syndication

News

Powered by Community Server (Personal Edition), by Telligent Systems