I will be using this medium for occasional musings on SQL Server. Especially when I resolve a problem which I believe may help others…
My first posting concerns an MSDTC Error 8522 that a client was experiencing. This is a common problem with many causes (just try googling it), I believe I have identified one problem and resolution and I am posting it here to help others. The problem could be summarised as due to Transactional Scope.
A client was encountering a periodic problem on a number of child nodes in the upstream data transfer load process, whereby data is transferred from a child node to a parent node under the control of a Linked Server, MSDTC and SQL Server Agent. They were running SQL Server 2000 Personal Edition running on XP SP1.
The client has a number of these child nodes and was getting, the following error message in SQL Server. Once a child node started to generate the error message, then it would repeatedly fail. Other child nodes with exactly the same code base would continue to work OK (i.e. the worse problems to solve because they are intermittent!)
Executed as user: <Domain_Name>\<SQL_Server_Service_Account>. Distributed transaction aborted by MSDTC. [SQLSTATE 42000] (Error 8522). The step failed.
MSDTC is an acronym for Microsoft Distributed Transaction Coordinator which is a Windows service providing transaction infrastructure for distributed systems.
The client had undertaken a number of attempts to resolve the problem, including using tools such as DTCPing and DTCTester. Please see Florin Lazar's Blog for these:
http://blogs.msdn.com/florinlazar/archive/2005/09/16/469064.aspx
I was asked to investigate the issue and try and identify a solution. Normally, I would have recommended using the tools listed above, and looked at Security settings (notably TurnOffRpcSecurity) to resolve this problem, as this had worked successfully at other clients but this only applies to XP SP2 (not SP1). Again see Florin Lazar's Blog for this
http://blogs.msdn.com/florinlazar/archive/2004/06/18/159127.aspx
It was soon evident that this was different problem entirely and required more radical thought to resolve.
The problem was isolated to a stored procedure I will call TransferDataToParent. I reviewed this stored procedure and discovered a couple of areas which gave me concern:
- It was over engineered for its purpose, as it was invoking superfluous local transactions
- It was undertaking combined local and distributed transactions in a single unit of work.
I wondered if this was the second point was cause of the problem, and that MSDTC was getting ‘confused’ over what was the scope of a local transaction and the scope of a distributed transaction.
Basically (at a high level) the stored procedure performed 3 functions:
1) Obtained data to transfer up the hierarchy in a series of SQL Server table variables (@table) under the control of a local transaction
2) Obtained secondary data into a local table variable (@table) and attempted to transfer this data to the parent node all under the control of a Distributed Transaction
3) Transferred the data obtained in Function 1) to the parent node in a Distributed Transaction
It was determined that the process was failing at Function 2). However, I was concerned with Functions 1) and 2) for the following reasons
- Function 1) - This is in a superfluous Transaction. The data was being stored into a series of table variables (@tables). @tables and # temporary tables are both non-persistent data stores. In the case of @tables, these are located in SQL Server memory and not on disk. Therefore a transaction is not needed, because if the code fails, then the data is automatically lost anyway. This does not need transactional control.
- Function 2) – This process met two different requirements:
a) Obtaining data in local memory tables (@tables) but within a Distributed Transaction
b) Transferring the data to the parent node within the same Distributed Transaction
I realised that a) should NOT be in a Distributed Transaction as it is an extension of the process in Function 1 above. Furthermore, MSTDC was probably also getting confused with attempting to perform a Distributed Transaction on local data (and with a @table variable as well!) – all within the same transactional scope.
Therefore, I rewrote the stored procedure to remove these areas of concern, namely:
- Obtain all data in local table variables outside of any transaction (local or distributed)
- Ensured all data transfers up the hierarchy were purely under the control of a Distributed Transaction (i.e. reduced transaction scope)
This worked successfully on two child nodes, that were previously having this problem and I now believe this problem to be resolved.
In summary, this may not be the cause of your particular MSDTC Error 8522 problem, but I would recommend that you look at the scope of the distributed transaction, are you combining it with a local transaction and/or non-persistent data?