
Avoid Creating Another Information Silo with MDM

In my last article, I discussed the five most critical success factors when implementing a Master Data Management solution. At the top of the list was a warning: “Don’t Create Another Information Silo”. What does that mean? Why is that important? What is different about this new system we are calling “MDM”?

I define “Information Silo” as an application which has the following characteristics:

1. The application has its own database.

2. The database is highly normalized, because the designer of the data model has made an effort to reduce duplication of data.

3. Identity is invented in the database, because things like Customer, Product, User all need primary keys to identify the record within the system.

4. Domain Logic controls most, if not all, of the data quality requirements, because encapsulating business logic outside a database promotes re-use.

Each of these principles makes perfect sense for a custom application. But if you use them in your Master Data Management solution, you will make another Information Silo, and you’ll be right back where you started.

Certainly, our Master Data Management solution will include a database, the “MDM Hub”, which will be the central place for data in our system to reside. But we need to design a different kind of database model, unlike many other systems we have designed for the enterprise. What should this data model look like? There are 3 modeling approaches to consider. I’ll explain each, and then tell you which is the best and why.

A Registry approach is the lightweight model whereby only the pointers to source data are stored in the Master Data Hub. We are capturing only minimal attributes in the hub with this pattern, and consumers of data will be required to seek out an authoritative data source elsewhere. Essentially, this MDM pattern is designed to store nothing, only to tell consumers where to get the data. This does not work well when it comes time to implement Data Governance, because the MDM system has not defined a Jurisdiction for the Data Governance team to work in.

A Transaction approach represents the opposite end of the spectrum from Registry, and it helps to address the common fear of “Garbage In, Garbage Out” which so many first-timers experience. The idea with Transaction is that Master Data should be created in a tightly controlled environment which imposes rigor on the master data creation process and ensures that ALL data is collected up front. This approach sounds worthwhile, until you consider what it will take to build such a system: you may as well build a new ERP system. This is the classic trap leading directly to another silo of information!

A Federated approach represents a middle ground between Registry and Transaction. Think of this as the “come as you are” alternative, because we are going to pull the master data from the sources with as little translation as possible, and we are going to leverage the source identity in MDM, combining the name of the source system with the source system’s identifier. The Federated approach recognizes that in order for the data governance team to govern effectively, it needs enough master data attributes in the MDM hub to discern critical differences between records, but not all of them.

Here is an example of how the Federated model works. Let’s say that we have 5 sources of Customer Master data: 3 ERP (JD Edwards Enterprise One, SAP and Dynamics AX), 1 CRM (SalesForce) and 1 Custom SQL solution used by a Web site. A Federated design would dictate:

  • An Entity named “Source System” whose members would define the sources of data (i.e. JDE, SAP, AX, SFDC, SQL)
  • An Entity named “Customer” whose members would have an identity formed from the actual Id value in the source system, prefixed by the Source System code (e.g. “JDE-100054”, “SAP-000005478”, “SQL-1”). As long as each source system keeps its own Ids unique, using the actual source Id means these MDM identifiers will be unique across the Federated model.
  • Some number of additional attributes which help the stewards of the master data understand if these records represent a common customer. This needs to be defined by the business stakeholders, not the database administrators.
  • A solution for creating a Golden Record in the MDM solution. This solution should match members based upon matching rules and group them together as proposed units. The common grouping for these records is a reference to the Golden Record. For example, the solution should present a hierarchy with the Golden Records as parents of the source records:
    • MDM-100 : “BlumShapiro”
      • JDE-100054 : “Blum Shapiro & Co”
      • SAP-000005478 : “BlumShapiro”
      • SQL-1 : “Bloom Shapiro”
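
The Federated design above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular MDM product works: the class names, the noise-word list, the 0.8 similarity threshold and the “MDM-” numbering scheme are all hypothetical, and the fuzzy name comparison stands in for whatever matching rules the business stakeholders actually define.

```python
import re
from dataclasses import dataclass, field
from difflib import SequenceMatcher

# Hypothetical noise words to ignore when comparing company names.
NOISE_WORDS = {"co", "inc", "llc", "ltd"}


@dataclass(frozen=True)
class SourceRecord:
    source_system: str  # member of the "Source System" entity, e.g. "JDE"
    source_id: str      # identifier exactly as the source system stores it
    name: str

    @property
    def mdm_id(self) -> str:
        # Federated identity: source system code + untouched source Id.
        return f"{self.source_system}-{self.source_id}"


@dataclass
class GoldenRecord:
    mdm_id: str
    name: str
    members: list = field(default_factory=list)


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and noise words, join what is left."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return "".join(t for t in tokens if t not in NOISE_WORDS)


def is_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Illustrative matching rule: fuzzy similarity of normalized names."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def propose_golden_records(records, first_id: int = 100):
    """Group source records into proposed Golden Records for steward review."""
    golden = []
    for rec in records:
        for g in golden:
            if is_match(g.name, rec.name):
                g.members.append(rec)
                break
        else:
            # No existing group matched, so propose a new Golden Record.
            new_id = f"MDM-{first_id + len(golden)}"
            golden.append(GoldenRecord(new_id, rec.name, [rec]))
    return golden


records = [
    SourceRecord("JDE", "100054", "Blum Shapiro & Co"),
    SourceRecord("SAP", "000005478", "BlumShapiro"),
    SourceRecord("SQL", "1", "Bloom Shapiro"),
]

for g in propose_golden_records(records):
    print(f"{g.mdm_id} : {g.name}")
    for m in g.members:
        print(f"    {m.mdm_id} : {m.name}")
```

In this sketch the proposed Golden Record simply takes the name of the first member it sees. In practice the surviving name would be a data stewardship decision, and the groupings are proposals for the stewards to confirm rather than final answers.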

Advantages to this way of working include:

A. Because we are bringing source data in “as is”, data can be loaded quickly into the MDM hub.

B. A Federated MDM solution can produce reports which tie out to legacy reports, because it uses the same Master Data.

C. Your Data Stewardship team works with the data as it is, without translation or normalization, and has enough data to begin defining data quality rules.

D. The solution is positioned nicely for source system synchronization, because a one-to-one relationship exists between the Authoritative record in MDM and the target.

A Federated MDM Data Model for your Master Data Management program is the best approach for getting started. It is simple to design, easy for business users to grasp, and avoids creating another data silo. In fact, it is only aggregating from silos and grouping for matching, harmonization and coherence. Most importantly, this approach gets you something fast. It gets your governance team something they can touch and feel. An MDM initiative must deliver value to the business quickly, and to do this it must be relevant as soon as possible.

Next time, I’ll talk about how an MDM differs from CRM, and why you should treat your CRM(s) as “just another source of master data”.

Working with Multiple Virtual Machines in Hyper-V

As a consultant who works on many different projects, I find it is always a good idea to keep them separate. The best way to do that is to create multiple virtual machines. This poses a few problems: the time it takes to build and fully configure each virtual machine, and all the hard drive space consumed by multiple machines with the same software installed. In the long run, the benefits of multiple virtual machines outweigh these costs. You not only get to archive outdated projects, freeing up precious hard drive space, but you also avoid cross-contaminating your projects. We’ve all said it: “It works on my machine.” Well, there is a reason for that. Maybe you changed a setting in the registry, or modified some permissions while working on another project on the same virtual machine. “I know I did something with the registry, but I can’t remember what.”

Using multiple virtual machines solves that issue, but what about your time and space?  If you haven’t solved the Einstein-Rosen bridge, you should try using differencing drives to alleviate some of the wasted time and space. 

A differencing disk is a virtual disk which points to a “parent” virtual disk. When you create a virtual machine with a differencing disk, you start from a baseline image (the parent), and any changes made in the virtual machine are saved to a separate virtual disk (the child). Every subsequent virtual machine you create with a differencing disk can point to the same parent, since the parent remains unchanged. This saves space, since anything stored on the parent virtual disk is shared between the children. Using differencing disks also saves time, since you don’t need to install a new operating system, any security updates, or any commonly used software. You just point to the baseline image and go.
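
To make the parent/child relationship concrete, here is a toy Python model of the idea. It is a conceptual sketch only, not the actual VHD/VHDX format or any Hyper-V API: the `VirtualDisk` class, its block map and the disk names are all invented for illustration. Writes land in the child, and reads fall through to the parent when the child has no copy of a block.

```python
class VirtualDisk:
    """Toy model of a virtual disk: a sparse map of block number -> data."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # None for a baseline (parent) image
        self.blocks = {}          # only blocks written to *this* disk
        self.read_only = False

    def write(self, block, data):
        if self.read_only:
            raise PermissionError(f"{self.name} is read-only")
        # Changes are always stored on this disk, never on the parent.
        self.blocks[block] = data

    def read(self, block):
        # Walk up the chain until some disk holds the requested block.
        disk = self
        while disk is not None:
            if block in disk.blocks:
                return disk.blocks[block]
            disk = disk.parent
        return None


# Build the baseline image once, then lock it.
base = VirtualDisk("base")
base.write(0, "operating system")
base.write(1, "shared tools")
base.read_only = True

# Two project machines share the same parent.
project_a = VirtualDisk("project-a", parent=base)
project_b = VirtualDisk("project-b", parent=base)
project_a.write(1, "shared tools + registry tweak")

print(project_a.read(0))  # comes from the parent
print(project_a.read(1))  # comes from project-a's own changes
print(project_b.read(1))  # project-b still sees the untouched parent
```

The tweak made in one project never reaches the other, and the baseline content is stored only once, which is the point of the technique.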

Create Parent Image

  1. Create a new virtual machine using Hyper-V manager. 
    • Install your operating system, the software you will share from VM to VM, and all the security updates they entail.
  2. Run the SysPrep.exe utility.   
    • SysPrep /oobe /generalize /shutdown
  3. Remove the virtual machine from Hyper-V Manager.  (You don’t need to run this directly anymore).
  4. Make the .VHD/.VHDX file read-only.  This is important!  Any change to the base image will render virtual machines created off of it invalid.

Create Virtual Machines

  1. Create a new virtual machine using Hyper-V Manager but choose the “attach a virtual hard disk later” option.
  2. Right-click the newly created virtual machine and open “Settings…”
  3. Click on the IDE Controller 0 and click to add a new hard drive. 
  4. Click the “New” button to create a new drive and select the “Differencing” disk type
    • Choose the parent disk that you created before.
  5. Start the machine, enter the product key and start developing.

With that you have just saved your future self some time, since creating new virtual machines will take no time at all. No need to copy an already existing SysPrep-ed image. No need to copy anything! All you need to do is point to a disk that already exists.

In addition, you can create a differencing disk which points to another differencing disk. So if you have projects which use different technologies, like SharePoint and BizTalk, you can start by creating a base image with just the operating system on it. Then, you can create two disks, one for SharePoint and one for BizTalk, which both point to the baseline operating system image. Then, when creating a new virtual machine, you can just point to either the SharePoint disk or the BizTalk disk, both of which share the same operating system disk underneath.
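
A rough way to see what the chained layout buys you is to add up the disk space with and without sharing. The disk names and sizes below are made up purely for illustration; only the shape of the tree (one operating system base, a SharePoint and a BizTalk layer, project disks on top) follows the description above.

```python
# Hypothetical disk tree; names and sizes (in GB) are invented for illustration.
DISKS = {
    "os-base":    {"parent": None,         "size_gb": 20},
    "sharepoint": {"parent": "os-base",    "size_gb": 15},
    "biztalk":    {"parent": "os-base",    "size_gb": 10},
    "client-a":   {"parent": "sharepoint", "size_gb": 3},
    "client-b":   {"parent": "sharepoint", "size_gb": 2},
    "client-c":   {"parent": "biztalk",    "size_gb": 4},
}


def chain(name, disks=DISKS):
    """Return the disk and all of its ancestors, child first."""
    path = []
    while name is not None:
        path.append(name)
        name = disks[name]["parent"]
    return path


def leaves(disks=DISKS):
    """Disks that no other disk points to, i.e. the ones VMs boot from."""
    parents = {d["parent"] for d in disks.values()}
    return [n for n in disks if n not in parents]


def shared_total(disks=DISKS):
    """Space used when every layer is stored once and shared."""
    return sum(d["size_gb"] for d in disks.values())


def copied_total(disks=DISKS):
    """Space used if every VM carried a full copy of its whole chain."""
    return sum(
        disks[n]["size_gb"] for leaf in leaves(disks) for n in chain(leaf, disks)
    )


print(chain("client-a"))
print("shared:", shared_total(), "GB")
print("full copies:", copied_total(), "GB")
```

With these invented numbers, the three project machines need 54 GB when the layers are shared, against 109 GB if each machine carried its own full copy.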

Moving to Team Foundation Server 2010

I love it when a new wave of releases comes out from Microsoft. Sure, there’s more to learn and absorb, but more importantly it’s time to apply planning and installation exercises with new technologies. When it comes to planning solution architecture, you have to keep testing your methodologies – “Does our approach to planning for MOSS 2007 work as well for SharePoint 2010?” for example. I enjoy this exercise.

The technology team here recognized on Monday that now was the time to move our client projects to the new platforms – run our tests, address compatibility issues, etc. My tasks were focused on planning and performing the migration of Dynamics CRM and Team Foundation Server solutions to a new lab environment. The CRM was not really a platform migration; we simply decided that it needed a new home. However, we are all anxious to start leveraging the latest and greatest with TFS.

For those who are still using Visual SourceSafe, I’m truly sorry. The good news is that now is your chance to make the move to TFS (i.e. convince your program manager that it’s time to make the move). I saw this announcement at a Microsoft partner event last fall and still cannot believe the price point on TFS 2010. Essentially, if you have an MSDN license, you’ve got TFS: http://blogs.msdn.com/buckh/archive/2009/10/20/tfs-2010-server-licensing-it-s-included-in-msdn-subscriptions.aspx

Next question: does TFS 2010 support the new technologies we want to use: Windows Server 2008 R2, SQL Server 2008 R2 and SharePoint Server 2010? Yes, yes, and yes. As you can see from this excellent blog post, the setup and administration capabilities have been vastly improved. I remember conversations with TFS 2005/2008 architects wherein they outlined the pain involved in getting all of the components to play nicely together. In 2010, the setup operations have been separated from the install, very similar to the approach taken with SharePoint (and SSRS before it). The result is a set of optional features which can be turned on after you have confirmed that the more important features (like, say, project collections and builds) are working: http://blogs.msdn.com/bharry/archive/2009/04/30/tfs-2010-admin-operations-setup-improvements.aspx

Here are the critical downloads for getting your team up and running with TFS 2010:

Team Foundation Installation Guide for Visual Studio 2010

Administration Guide for Microsoft Visual Studio 2010 Team Foundation Server

I found this Upgrade Guide also very, very useful: