Last week was an interesting week in the SSIS community instigated fairly and squarely by Oren Eini's blog entry entitled SSIS' 15 faults. The discussion really reached critical mass when Steve Jones highlighted it in his weekly newsletter Database Daily.
I posted a response here which prompted alot of comments. In fact, in terms of concentrated activity this blog post has become my most popular ever (thanks for that Oren) and its clear that many people agree with Oren whilst others are more tolerant of SSIS's perceived shortcomings.
Deep in the depths of those comments was one from Denise Draper. Denise is the SSIS Group Program Manager (I think that's her title -whatever, she is head of the SSIS team). With Denise's permission I have re-posted her comments here below because a Microsoft response to this is important. Make of this what you will.
Wow --- there's a lot of good material in here, and it will be required reading for the SSIS team, along with Oren's original post and Phil Brammer's response as well. Everyone has done a great job of covering a lot of the specific details already (thank you Jamie, Phil and others), so I won't try to cover everything. I'll just add some thoughts that come to mind:
DTS was pretty popular --- in fact it was hands down the most popular ETL utility out there. But we knew the design of DTS wouldn't scale to upper-0end ETL requirements, and we heard from customers that they wanted and expected us to do that. So we bit the bullet and built a completely new product. There are lots of aspects of the product that I'm very proud of (I say "I" but it wasn't me who shipped it, it was my predecessor) --- the clean separation of control from data flow, the way error flows are handled within data flows, and the open extensibility through scripting and coding are some of my favorites. There are some places where we didn't get mature enough in the design and we tried in those cases to have a very open solution, so that our users could adapt the product to their own needs --- package management and deployment, configuration and logging all fall into those categories. In all of these cases, there is confusion (justified) as to how to use the product, but it is usually possible to adapt to the needs of any particular installation.
So one question I ask is: did we achieve a platform on which we can continue to build functionality that will scale up to data integration generally? Is this the right architecture? I think the answer is yes. I know customers who have built astoundingly large solutions --- large in terms both of the amount of data handled and of the number and complexity of the packages that they manage. And SSIS works for them reliably. I know of lots of things that I want to change in the product, but I think SSIS was the right step in the right direction.
Some architecture choices are not so obvious. For example, we have chosen to use generic open interfaces to access data. On one hand, this lowers our cost (and hence your cost) greatly --- the portfolio of special-purpose connectors is a big part of the (huge) price of upper-end ETL. But the downside is that we inherit whatever issues come with the data source drivers, which are not under our control. We get a lot of issues about Oracle connectivity, for example, that have nothing to do with SSIS. But if you are an SSIS user, that distinction doesn't really matter, does it? This is an area we are still debating --- it really is expensive to own and ship our own connectors.
Another example of a choice that isn't obvious is the whole question of integration with Visual Studio and the development environment generally: I get feedback from some customers that SSIS is too "developer oriented" but from some developers that it isn't developer oreiented enough. It looks as though folks on this thread are more from the developer camp, but let me ask: in an ideal world, do you switch back and forth between graphic ETL design and code, or do you never see code, or something else? Do you prefer very simple tools that you scale up to large solutions (by creating subroutines, macros, etc.), or do you prefer tools that have enough smarts in them to know most of the problems you want to solve?
There is still lots of things we want to improve --- we wouldn't be software engineers if we didn't feel that way, would we :-) The next release of SQL Server is a small one, but it will fix some big annoyances: better scripting support, better debugability, better wizard. But some areas, like deployment, were just too big to fit into this release --- we'll try to provide better guidelines in those cases.
Better organization to the documentation and better help getting started generally --- these are things that we know are needed and do intend to keep working on. I look at something like Oren's original criticism of SSIS configuration and that is what I see --- it may be that there is something funky in Oren's setup, or it may be that he's come across a bug --- either way, what is clear is that the expected behavior of SSIS isn't clear enough to him for him to tell what is going on.
We have a lot of good plans on the drawing board. Feedback like this helps us --- we hear about issues that we maybe didn't know about, and we use it to prioritize all the stuff we'd like to work on. I know it takes longer to get new stuff out than any of us want, and that it's frustrating to deal with the flaws in the meantime. But I still think there's already a lot of great value in SSIS once you get to learn it, enough to make it worth learning it.
-Jamie