Archive for April 23, 2009

When and How to Use Xml Data Types in SQL Server

During a design session with a client, the question came up of “When is it a good idea to consider Xml storage in SQL Server?” I have been a big fan of Xml technologies for some time now and was resisting the urge to say “All the time!” However, in this particular case, we were designing a fairly generic Auditing sub-system for use in a specific CRM application. As we discussed the problem, it became clear that we really wanted to track operations on a broad set of business objects. So we began to think about XML in SQL Server 2005 more seriously.

XML documents can go a long way towards reducing the overall complexity of a relational model which is trying to be “all things to all people”. For our audit system, we wanted to start out with a small set of “messages”, but we wanted to support a broader set in future releases. Whenever I hear that, I get a bit scared, because it runs the risk of either creating wild complexity in the data model or over-simplifications. On the other hand, a single column of type XML can support a number of schemas at once.

What is the best approach for using XML in a relational model? Here are my thoughts:

Strongly Type the XML – it is extremely easy to use Visual Studio tools to generate an XSD schema from a candidate XML message. So, take the time to define what your message ought to look like and then refine your schema to enforce it. That way you can have some comfort with the data you are storing.

Qualify your Elements – people seem a bit skittish about Xml Namespaces. Think about this in terms of .NET code – we always want to define a Namespace structure for code which reflects the owner of the code, its source service or application, and its intended purpose. Otherwise we would have type collisions all over the place! Well, same applies to Xml – Xml messages should be as specific as possible. I like to use the following standard when assigning a namespace:

Promote Important Properties – I’m a SharePoint guy, but I saw this behavior first in BizTalk Server, and I think it is widely applicable. In BizTalk, developers have the ability to “promote” values out of an Xml message and into the “header” of the message. Imagine designing a class which included an Xml document as a public property. Now add a set of public properties which act as XPath queries into the Strongly Typed Xml document within. This approach is possible because you have already designed the schema for the message, so you know what to expect – it’s not just ad-hoc XML. Now, the values within the message are readily available to even casual interrogators of the class – no need for XPath or schema knowledge on the client’s end. In SharePoint, this concept is manifested in Forms Libraries: when an InfoPath form is published to a Forms Library, the publisher can quickly promote values from the internal InfoPath payload into columns on the SPListItem which will be created by each InfoPath form entry. Perfect! Let’s do the same in SQL Server! Your SQL colleagues who can now perform some SQL queries on our tables without any knowledge of XPath will thank you.

Any success/horror stories you’d like to share around using XML data in a relational data store?

Brian Berry is a Director of Technology Consulting with BlumShapiro, focusing on Microsoft Business Intelligence solutions, with a strong focus on Systems Integration, Master Data Management and PerformancePoint Services. He has been helping companies optimize their investments in Microsoft technology for over 12 years.