Archive for May 28, 2010

Master Data Management Sample Hub Architectures

I thought it would be a good idea to write up our thinking on MDM solution architectures. There are many ways to do MDM. Some of them work well with SQL Server 2008 R2 Master Data Services, some not so much, and some really require collaboration with another system to do message brokering or human workflow. I like to think of the four sample architectures listed here as starting points: in reality, most enterprises have a good example of each of them somewhere in their systems. We believe that a successful MDM solution starts by understanding the models, assessing what is working and what is not, and then applying the tools to take advantage of aspects of these models in a rational manner. In other words, before we can really understand what an MDM solution might look like for your firm, we need to evaluate the architectural models to see which will be a good fit for your enterprise architecture.

So before jumping into any discussion of key concepts in Master Data Services with an audience that is not familiar with MDM work, I think it is helpful to describe four architectural models. In discussing them, it also helps to recall the fundamental data issues we are trying to solve:

1. Who owns the Master Data?

2. Where will the Master Data reside?

3. How should Master Data subscribers be notified of updates?

If these are simple questions to answer, then you likely do not need to go down the MDM path. MDM architectures are designed to provide alternatives when these questions are hard to answer. However, there is rarely a straightforward or easy answer to these questions.


On one end of the spectrum is the Repository architecture. This architecture describes a single data source, a services layer or gateway, and multiple subscribing applications which consume Master Data from the services layer.


In the Repository architecture, we have a single data location and all of the master data is stored and managed there. Subscribing applications do not have a local copy of the master data. This can present some issues with data connectivity and response time for these applications; in response to this issue, solution architects often employ a data caching approach either at the Service Gateway layer, at the subscribing application layer, or both. I’m amused by the fact that when I look at this picture, it’s immediately clear to me that this represents an ideal state – the design for a system which is services oriented and can provide data pervasively throughout the enterprise. When I design an enterprise application, I tend to think of this architecture first.
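To make the caching idea concrete, here is a minimal sketch of a Service Gateway that caches reads from a single repository. Everything in it is an assumption for illustration: the class names `MasterDataRepository` and `ServiceGateway`, the time-based expiry, and the `reads` counter are hypothetical and are not part of Master Data Services or any other product.

```python
import time
from typing import Callable, Dict, Tuple


class MasterDataRepository:
    """Hypothetical single store that holds every master data record."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}
        self.reads = 0  # counts round trips to the repository, for illustration

    def put(self, key: str, record: dict) -> None:
        self._records[key] = dict(record)

    def get(self, key: str) -> dict:
        self.reads += 1
        return dict(self._records[key])


class ServiceGateway:
    """Hypothetical services layer with a simple time-based cache."""

    def __init__(
        self,
        repository: MasterDataRepository,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._repository = repository
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return dict(cached[1])  # served from cache, no repository round trip
        record = self._repository.get(key)
        self._cache[key] = (now, record)
        return dict(record)
```

Subscribing applications would call `ServiceGateway.get` rather than the repository directly. The trade-off is the usual one: a longer `ttl_seconds` means fewer round trips but a longer window in which a subscriber can see stale master data.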

Of course the problem with Repository is that it is best implemented when you have a blank canvas to start with.


On the other end of the spectrum is the Registry architecture. In this architecture, each application owns its own data – it has a local copy. The MDM solution itself has no local copy of the data; instead, it has pointers to where the data can be retrieved from. The column on the left refers to a single Master Data Record or Entity. Each application owns a piece of that record. I’ve color-coded the applications to illustrate which attributes of the entity are owned by which application. Applications which need data they do not own may look to the Registry to find it; however, they are not permitted to write to those attributes. If an end-user wants to update a master record attribute owned by App 1, they must log in to App 1 to perform that operation.
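The ownership rule can be sketched in a few lines. This is only an illustration under stated assumptions: the `Registry` class, its method names, and the use of in-memory dictionaries as stand-ins for each application's data store are all hypothetical.

```python
from typing import Any, Dict, Iterable


class Registry:
    """Hypothetical registry: it stores pointers to owners, never the data itself."""

    def __init__(self) -> None:
        self._owner_of: Dict[str, str] = {}            # attribute -> owning app
        self._stores: Dict[str, Dict[str, dict]] = {}  # app -> that app's own store

    def register(
        self, app: str, store: Dict[str, dict], attributes: Iterable[str]
    ) -> None:
        self._stores[app] = store  # a reference to the app's data, not a copy
        for attribute in attributes:
            self._owner_of[attribute] = app

    def read(self, record_id: str) -> Dict[str, Any]:
        # Assemble the full record with one lookup per owned attribute.
        return {
            attribute: self._stores[app][record_id][attribute]
            for attribute, app in self._owner_of.items()
        }

    def write(self, caller: str, record_id: str, attribute: str, value: Any) -> None:
        owner = self._owner_of[attribute]
        if caller != owner:
            raise PermissionError(f"{attribute!r} is owned by {owner}, not {caller}")
        self._stores[owner][record_id][attribute] = value
```

Note that `read` has to visit every owning application to build one record, and that no single place sees the whole record when a write happens.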


Based upon the potential for “spaghetti” integration (all of those lines scare me!), this may seem like an irrational approach, but in fact, this architecture describes a great many ad-hoc MDM solutions. If you do MDM integration at the data tier, you may be using Views or Linked Servers to accomplish this. If you are farther along towards SOA, you may be employing an Enterprise Service Bus to pick up all of the data from source systems and respond with a complete data set for service clients looking for a complete Master Data Record. The point to this approach, I think, is that each application is permitted to maintain stewardship and governance over its piece of the Master Data. Further, the registry helps keep the clients insulated from changes to other subscribing applications.

I think there are two big problems with the Registry approach. First, I expect that system responsiveness would be intolerable if each application must fetch data from multiple places in order to compile a complete picture of the Master Data record. Second, it seems to me that it is very difficult to enforce coherent business rules for a Master Data Record when the complete data record lives in multiple systems. Changes in App 1 may invalidate data in App 4 (or simply make that data functionally unusable). With these issues in mind, I’m not wholly certain if this architecture represents a solution, or the problem itself.

Somewhere between the repository and registry approaches lie two architectures which I think appear more realistic.


The Federation architecture attempts to reconcile these two approaches by allowing each subscribing application to keep a local copy of its data while maintaining the “Master” copy of the Master Data in a central repository. The result is an architecture which resembles a Database Replication strategy, with a database acting as the Publisher to a number of subscription databases.


Federation ensures that you indeed have a single version of the truth while also helping each application remain responsive to end users. The down side to this is that, unlike the Registry architecture wherein each application owns a segment of the master data, the applications in the Federated approach do not have ownership of any attributes which comprise the master data record. Data Stewards who are tasked with maintaining the master data are therefore required to move outside of the subscribing applications and interact with a new Stewardship application. Depending upon the current state of your systems, this approach may be extremely difficult to implement; depending on the disposition of your users, it may make your MDM solution appear worse than the problem.
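A rough sketch of the publisher/subscriber shape, assuming hypothetical `FederationHub` and `SubscriberCopy` classes and an in-memory store in place of real replication:

```python
from typing import Dict, List


class SubscriberCopy:
    """Hypothetical local copy held by a subscribing application."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._records: Dict[str, dict] = {}

    def apply(self, key: str, record: dict) -> None:
        # Intended to be called by the hub only; the application itself
        # treats its copy as read-only.
        self._records[key] = dict(record)

    def get(self, key: str) -> dict:
        return dict(self._records[key])


class FederationHub:
    """Hypothetical central store acting as the publisher."""

    def __init__(self) -> None:
        self._master: Dict[str, dict] = {}
        self._subscribers: List[SubscriberCopy] = []

    def subscribe(self, subscriber: SubscriberCopy) -> None:
        self._subscribers.append(subscriber)
        for key, record in self._master.items():
            subscriber.apply(key, record)  # initial snapshot for a new subscriber

    def steward_update(self, key: str, record: dict) -> None:
        # Changes enter here, through the stewardship application,
        # and are then pushed out to every local copy.
        self._master[key] = dict(record)
        for subscriber in self._subscribers:
            subscriber.apply(key, record)
```

Reads stay local and fast, but the only write path is `steward_update` on the hub, which is the stewardship trade-off described above.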


The Hybrid architecture offers a great deal of benefit in the way of addressing issues uncovered in the first three architectures. First, each subscribing application has ownership of its own data; it can read and write to its local data. Second, each subscribing application has a complete picture of the master data records it requires. Third, management and stewardship of the master data occurs outside of each application, within the Master Data Hub, which acts as a Broker orchestrating the notification and validation of changes across each subscribing application in the enterprise. The Master Data Repository keeps a copy of the Master Data for the purposes of defining business rules and notification requirements for the broker.


This approach is quite complex and includes heavy reliance upon using the right tool to act as the “Man in the Middle” – the Master Data Broker. It should be clear from this picture that a lot of coordination and orchestration is occurring within that broker system: this system is responsible for notifying each of the subscribing systems of changes to its master data. The opportunity for applications to get out of sync is enormous. However, for those enterprises which have heavy investments in ERP packages which are extremely difficult to customize, a hybrid approach to MDM may be the correct solution.
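Here is a deliberately simplified sketch of what the broker does. The `MasterDataBroker` and `Application` classes are hypothetical, and so is the flow: in this sketch an application submits a change to the broker, which validates it against business rules, records it in the hub's repository copy, and notifies every subscriber, including the one that submitted it. A real broker would also have to deal with retries, ordering, and applications that have already written locally.

```python
from typing import Callable, Dict, Iterable, List

Rule = Callable[[dict], bool]


class Application:
    """Hypothetical subscribing application with its own local data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local: Dict[str, dict] = {}

    def receive(self, key: str, record: dict) -> None:
        self.local[key] = dict(record)


class MasterDataBroker:
    """Hypothetical broker: validates changes, then notifies subscribers."""

    def __init__(self, rules: Iterable[Rule]) -> None:
        self._rules: List[Rule] = list(rules)
        self._repository: Dict[str, dict] = {}  # the hub's copy of the master data
        self._apps: List[Application] = []

    def subscribe(self, app: Application) -> None:
        self._apps.append(app)

    def submit(self, key: str, record: dict) -> bool:
        # Reject the change if any business rule fails; nothing is propagated.
        if not all(rule(record) for rule in self._rules):
            return False
        self._repository[key] = dict(record)
        for app in self._apps:
            app.receive(key, record)
        return True
```

Even in this toy version the risk is visible: if a `receive` call fails partway through the loop, some applications have the new record and others do not.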

In my next post, I’ll talk about Microsoft technologies such as SQL Server 2008 R2 Master Data Services, Microsoft Office InfoPath 2010, Microsoft SharePoint Server 2010 and Microsoft BizTalk Server 2009. I’ll identify which products are well suited to each of these architectures and decompose the pieces to show the aspects fulfilled by each product. Stay tuned.

I want to offer acknowledgements to Roger Wolter at Microsoft and David Loshin at Knowledge Integrity, Inc. for setting us along a reasonable path. Please feel free to offer your thoughts on these approaches – which have worked well for you?


Team Foundation Server 2010 – Observations on Migrating

As I mentioned in a previous post, we’re moving wholesale to the Wave 14 product set here at Blum Shapiro. I was back in the office today, moving our TFS installation to 2010 after a few days of planning last week and some client work yesterday. I wanted to share some observations.

First, many of the posts out there indicate that running the upgrade wizard is preferable to starting with a clean install. The rationale is that it is difficult to get the content moved from SharePoint Project Team Sites if you start from scratch. The same issue exists for Reporting Services integration.

I agree with this approach because not everyone is fluent in SharePoint, Reporting Services and TFS/ALM concepts; having to find someone fluent in all three just to get you to TFS 2010 seems a terribly onerous requirement. However, internally we have not taken advantage of SharePoint work item tracking in TFS 2008, nor did we leverage the analysis capabilities; we are a Solution Architecture, SharePoint and BI consulting firm, not an ISV. Therefore, my objectives were a bit simpler – move the source code with as little disruption as possible.

To move the source code with as little disruption as possible, I suggest creating a fresh TFS 2010 install and then importing the TFS 2008 projects into their own project collection. I found this blog post helpful in importing the old TFS 2008 projects:

Two things it does not mention are:

1. The TFS 2008 databases need to be SQL Server 2008 databases

2. Keep the database names intact (i.e. TfsBuild, TfsIntegration, TfsVersionControl, TfsWorkItemTrackingAttachments, TfsWorkItemTracking, TfsWarehouse)
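If you want a quick sanity check before importing, something like the following sketch compares the database names on your server against the names listed above. The function name is hypothetical, the list contains only the names mentioned in this post (your installation may have others), and the case-insensitive comparison is an assumption about your server's collation. How you obtain the list of existing names is up to you.

```python
from typing import Iterable, List

# The database names listed in this post; not necessarily a complete inventory.
EXPECTED_TFS_2008_DATABASES = [
    "TfsBuild",
    "TfsIntegration",
    "TfsVersionControl",
    "TfsWorkItemTrackingAttachments",
    "TfsWorkItemTracking",
    "TfsWarehouse",
]


def missing_tfs_databases(existing_names: Iterable[str]) -> List[str]:
    """Return the expected names that are absent from existing_names.

    Comparison ignores case (an assumption; adjust for a
    case-sensitive collation).
    """
    present = {name.lower() for name in existing_names}
    return [
        name for name in EXPECTED_TFS_2008_DATABASES if name.lower() not in present
    ]
```

A renamed database such as `TFS2008_TfsBuild` would show up as a missing `TfsBuild`.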

Upon realizing that I needed to “upgrade” my databases from 2005 to 2008 before I could upgrade, I decided to get cute and rename the databases with TFS2008 in the name. Getting cute never pays off.

Also, in order to eliminate any risk of mixing and matching process templates from TFS 2008 into TFS 2010, I created a dedicated project collection for new TFS 2010 projects.

Finally, any of your colleagues who are still in transition from Visual Studio 2008 will need to follow these instructions:

I’m looking forward to playing with the new Build Server in TFS 2010! Enjoy!

Moving to Team Foundation Server 2010

I love it when a new wave of releases comes out from Microsoft. Sure, there’s more to learn and absorb, but more importantly it’s time to apply planning and installation exercises with new technologies. When it comes to planning solution architecture, you have to keep testing your methodologies – “Does our approach to planning for MOSS 2007 work as well for SharePoint 2010?” for example. I enjoy this exercise.

The technology team here recognized on Monday that now was the time to move our client projects to the new platforms – run our tests, address compatibility issues, etc. My tasks were focused on planning and performing the migration of Dynamics CRM and Team Foundation Server solutions to a new lab environment. The CRM was not really a platform migration; we simply decided that it needed a new home. However, we are all anxious to start leveraging the latest and greatest with TFS.

For those who are still using Visual SourceSafe, I’m truly sorry. The good news is that now is your chance to make the move to TFS (i.e. convince your program manager that it’s time to make the move). I saw this announcement at a Microsoft partner event last fall and still cannot believe the price point on TFS 2010. Essentially, if you have an MSDN license, you’ve got TFS:

Next Question: does TFS 2010 support the new technologies we want to use: Windows Server 2008 R2, SQL Server 2008 R2 and SharePoint Server 2010? Yes, Yes, and Yes. As you can see from this excellent blog, the setup and administration capabilities have been vastly improved. I remember conversations with TFS 2005/2008 architects wherein they outlined the pain involved in getting all of the components to play nicely together. In 2010, the setup operations have been separated from the install – very similar to the approach taken with SharePoint (and SSRS before it) – the result is a set of optional features which can be turned on after you have confirmed that more important features are working (like, say, project collections and builds).

Here are the critical downloads for getting your team up and running with TFS 2010:

Team Foundation Installation Guide for Visual Studio 2010

Administration Guide for Microsoft Visual Studio 2010 Team Foundation Server

I found this Upgrade Guide also very, very useful:


Profisee Releases Thick Client for Master Data Services

I was very impressed with the demo I saw of this product from Profisee last week and thought you may want to know about it. Like SQL Server Reporting Services’ Report Manager Interface before it, the Master Data Manager is the Web UI which comes out of the box with SQL Server 2008 R2 Master Data Services. I remember when SSRS was first released (with SQL Server 2000), we were all a bit underwhelmed with the UI. I predict that some may find the MDS UI similarly disappointing. But remember – Microsoft is a platform company – and partners step up to the plate to extend and enhance the core services.

The product name is Master Data Maestro. The UI looks very similar to Office 2010 clients. I can see this being extremely useful for real “Power Stewards” – the power users who will work with Master Data Services every day.

We are looking forward to working with Master Data Maestro to get some MDM models up and running quickly with our clients in the northeast.