UPDATE: I have posted a response from Denise Draper, head of the SSIS team, here: http://blogs.conchango.com/jamiethomson/archive/2007/07/30/SSIS_3A00_-A-response-from-Microsoft-to-the-growing-criticism.aspx
A colleague recently pointed out this thread to me from Oren Eini's blog. The post is entitled SSIS' 15 Faults and boy-oh-boy does this guy have it in for SSIS. So do alot of the people leaving comments.
I have to admit that Oren and his commenters raise some great points that I wholeheartedly agree with. A far-from-perfect design UI, overly verbose errors, painful deployment strategies, VERY poor expression builder, less-than-perfect support for other platforms, too many mouse clicks required to do many things, maintaining different packages that do basically the same thing save for differering metadata, no 'UNDO' functionality, packages are effectively binary objects in regard to source control, dead hyperlinks to help pages, awfully crappy SQL editor...the list is pretty long and doesn't make great reading for the SSIS team. I would suggest that fixing these pain points should be a priority for katmai and that the powers that be give the SSIS team the manpower they need to do that - unfortunately I have it on good authority that that is not happening.
There are other points on there though that I don't feel are warranted and I think it only fair to reply to them which I'm going to do below. Not all of them are Oren's points either - many of them are from his blog commenters.
- I wish I had a dollar for every time that SSIS kept track of something it shouldn't. Be it the old configuration, hard coding the connection string inside the package and completely ignoring the configuration JT: I have never seen this happen in three years of using the product. If it seems as though configurations are not being used then they have been setup wrongly. That is not to say that the process of setting them up couldn't be improved.
- Security? Who needs that JT: Is this a serious comment? the Trustworthy Computing Initiative was instigated in Microsoft to ensure products were MORE secure. As a result SQL Server is now considered to be the most secure database platform on the planet (http://news.com.com/2008-1002_3-6164785.html?part=rss%26tag=2547-1_3-0-20%26subj=news).
- I should also mention that SSIS packages requires sysadmin rights to run when scheduled as a job JT: This is completely untrue. It is possible to setup proxy accounts that are not sysadmin in order to run packages.
- Bad configuration scheme: This bring me to the configuration scheme itself, yes, you can put the configuration anywhere you want, from incomprehensible XML files with hard coded paths to SQL tables to which you will have to have an hard coded connection string, from environment variables (WTF?!) to stored inside the package. The problem with all those options is that SSIS seems to choose between those at random. At times it would choose the connection that it had assimilated inside the package, at times it would go to the wrong file, get the old configuration, or just complains that it is not having fun and that I should baby sit it again. JT: Back to my point above, if this is happening then the configurations have been setup wrongly. It NEVER chooses configurations at random. It would be good if the person making the point could make some suggestion as to how it could be improved because if people are experiencing this then there needs to be improvement somewhere. And what's the issue with environment variables?
- Today I had a dynamically generated SQL inside an SSIS package that was causing an error. I had no way of finding out how to get this dynamic SQL (which was stored in a variable) JT: Use a watch window (http://blogs.conchango.com/jamiethomson/archive/2005/12/05/SSIS_3A00_-Using-variables-to-store-expressions.aspx).
- I lost count of the many things that I consider a given that SSIS doesn't have [such as] the basic UPSERT support JT: Hmm...not sure about this one. UPSERT is an operation that would have to be supported by the database platform being inserted/updated wouldn't it? Not sure why this is SSIS's fault. Perhaps I'm misunderstanding in which case I'm happy to be put straight.
- connection "managers" cannot cope with simulataneous multiple data flows JT: Shame this wasn't elaborated on so I'll answer the best I can. Connection Managers only maintain connections to external data sources so if the package is performing badly it could be because of those external data source - that's not the fault of SSIS. The other most likely cause is bad package design.
- My biggest gripe though is how SLOOWWWW it is. If you have a solution with say eight complicated scripts in it it can take 5mins to load on my dual-core Pentium-D 930 with 3Gb RAM JT: There is room for improvement here in the bloated VS shell but mainly its important to understand WHY this is happening. When a package opens up it tries to validate all external connections. If this is taking a long time then the blame is on the external connections and the network inbetween, not on SSIS. It is possible to turn off this validation by selecting 'Work Offline' from the SSIS menu.
- Setting 'sort order' on a column, does not actually sort the data JT: Its not supposed to. RTFM
- No throttle control. It does allow you to set some thread settings, and write out files to a temp path as it does what it wants, but I had cases where it would absolutely swamp a well-built dev machine so much that the machine became absolutely unusable as it ran. JT: This can be solved by better package design and performance tuning. Its a fact of life that the more work a CPU has to do then the slower it will do stuff so design your package accordingly. I think the kind of throttling that the commenter was after can be achieved using raw files.
- stupid proprietry datatypes JT: The guy that complained about this complained about it on the MSDN forum as well. He wanted all the SSIS data types to be the same as SQL Server. I disagreed with him then and I disagree with him now. SSIS is a tool for integrating data from heterogeneous places - something that, by and large, it does pretty well. Hence, the data types are generic enough to allow this, they are not tied to any one particular platform. If that means they are 'proprietary' then so be it.
- There really should be a configurable store option for NO SECURITY, the protect sensitive and whatever else are fine but really don't lend themselves to working in a dev team where you need to share these files. JT: See earlier comment about security. I feel the guy's pain on this but trust me, SSIS can easily be used in a multi-developer environemnt. I know this because I'm currently working in one.
That's all. I'm going to post a comment on Oren's blog with a link back to here so I'm fully expecting some angry comments. Watch this space.
-Jamie
UPDATE: Oren has posted a response here: http://ayende.com/Blog/archive/2007/07/27/SSIS-The-backlash.aspx
I tried to reply but for some reason his blog barfed when I hit submit (I guess cos its too big) so here it is below instead:
Hi Oren,
I'll take those in turn (bit difficult in unformatted text but here we go)
Oren: "Here is a simple story, I had canceled the package configuration, reconfigured a data source to point to the test database, and run the package. It executed itself against the production database!! Once I found that out, I managed to see it do it twice (while the data source point to the test database!), but I haven't been able to consistently reproduce it since"
JT: I really wish I could explain this but I'm non-plussed. As I alluded to in my post the whole interface to configurations could be improved but I still maintain that the behaviour is predictable. I wish I could come to your desk and see it happen for myself. I refer you to Sam Loud's comment on my blog entry where he states "Point I'm making is that it's easy for beginners to make mistakes with configurations. Once you've know what you're doing, they work just fine." I'm not trying to tag you as a beginner Oren, just emphasize the point that the behaviour of configurations IS predictable, whilst accepting that grasping it isn't simple. If I had witnessed the same behaviour as you then I wouldn't be writing this now but I can honestly say that in 3 years of using this thing I've never seen it. Sorry. I guess we're not going to see eye-to-eye on this.
Oren: "the very next statement acknowledged the existence of proxy solutions, I still want to understand why this exists."
JT: But this is a SQL Server Agent thing, not SSIS. I feel your pain but if you're commenting on SSIS then its only fair to comment on things that are in the scope of SSIS right? If you're complaining about SQL Server in general and you have grievances then I suggest you address it to the correct people. I've no doubt there is an explanation for SQL Server Agent having to run as sysadmin - I just don't know what it is. SQL Server Agent aint my bag.
Oren: "The issue with environment variables is this is actually something that I would never consider for configuration. Putting a connection string in an environment variables is strange. "
JT: I would agree with that. I use env vars alot but definitely not to store connection strings. Is the fact that SSIS offers you an alternative that you are happy to disregard really grounds for complaint though? If you don't want to use it, don't use it. There are plenty of other options available (which I concede you are unhappy about also).
Oren: "I want to point to a configuration file that is in the same directory as the package, it doesn't let me handle it."
JT: SSIS doesn't allow relative paths that is true and there are good reasons for that (can be expanded on elsewhere if you wish). There are teqhniques to get around that and I'd be happy to share them with you. In fact, I've already blogged about them: http://blogs.conchango.com/jamiethomson/archive/2006/01/05/SSIS_3A00_-Common-folder-structure.aspx
I guess you will construe this as a bloated method and I realise trying to change your mind on that is futile. Again, we're just going to have to agree to disagree.
Oren: "It quite trivial to allow update / insert based on a given set of key fields, and it is certainly something that I would expect to see in an ETL product. Given the common need for this."
JT: Oh ok, I gotcha. This is eminently possible through use of the OLE DB Command component. No doubt you will complain about the crappy interface of the OLE DB Command and I am 100% with you on that one. As I said on my blog post and as you have observed (in slightly stronger tones
) the design UI needs LOTS of work.
Oren: "Um, there is something that is called a background thread, and it is used to do work without freezing the UI"
JT: Fair enough - I don't have enough knowledge about the VS shell to comment here.
Oren: "Working offline is not a valid option, because then you get a whole lot of validation errors"
JT: So what? As long as you know that they're due to you working offline what's the problem? Just ignore them. I know I know...we won't agree on this.
Oren: "how do you handle two developers working on the same package? How do you handle branching and merging?"
JT: I hope to expand on this in a dedicated blog post soon (someone replied to my post asking me to do so as well) so I'll talk more about it there. Headline though is that we have a framework that doesn't require multiple developers to work on the same package because we split "work" over seperate packages and that approach works really well for us. If your complaint is around the lack of ability to work on a task in isolation then again I am 100% with you and I have been complaining about that since I first got my hands on the product.
Branching and merging? Never had a requirement to do that yet but if/when we do and if/when I run into problems I'll (try to) remember to let you know.
A key point here is that multi-user development is possible by seperating work over separate packages and I would view that approach as best practice anyway.
My colleague Howard Van Rooijen is keeping me abreast of your updates but if you respond and we miss it, let me know at jamieDOTthomsonATremovethisConchangoDOTcom or by hitting the 'contact me' link on this blog.
I would really like someone from the SSIS team to respond to these complaints because I feel like I'm doing their job for them. I know from past experience though that they respond alot better to constructive criticisms than flaming.
Although Oren, his co-complainants and myself are failing to agree on a lot of things I think this is a really good discussion and is bringing some important issues out into the public domain. I hope it proves beneficial.
-Jamie
UPDATE: David Reed from Microsoft (but not from the SSIS team) has posted his own reply here: http://blogs.msdn.com/reedme/archive/2007/07/29/re-ssis-15-faults.aspx