Tag Archive for SQL Master Data Services

How Much is Your Data Worth?

Data is the new currency in today’s modern businesses. From the largest international conglomerate down to the smallest neighborhood Mom-and-Pop shop, data is EVERYTHING! Without data, you don’t know who to bill for services, or for how much. You don’t know how much inventory you need on hand, or who to buy it from if you run out. Seriously, if you lost all of your data, or even a small but vitally important piece of it, could your company recover? I’m guessing not.

“But,” you say, “We have a disaster recovery site we can switch to!”

That’s fine if your racks melt down into a pool of heavy metals on the server room floor, then yes, by all means switch over to your disaster recovery site because molten discs certainly qualify as a “disaster!” Databases hosted on private or public cloud virtual machines are less susceptible, but not immune, to hardware failures.  But what about a failure of a lesser nature? What if one of your production databases gets corrupted because of a SQL Injection hack, cleaned out by a disgruntled employee, or is accidentally purged because a developer thought he was working against the DEV environment? Inadvertent changes to data are no respecter of where such data is stored, or how it is stored! And, sorry to say, clustering or other HADR solutions (High Availability/Disaster Recovery, such as SQL Server Always On technology) may not be able to save you in some cases. Suppose some data gets deleted or is modified in error. These ‘changes’, be they accidental or on purpose, may get replicated to the inactive node of the cluster before the issue is discovered. After all, the database system doesn’t know if it should stop such changes from happening when the command to modify data is issued. How can it tell an ‘accidental purge’ from regular record maintenance? So the system replicates those changes to the failover node. You end up with TWO copies of an incorrect database instead of one good one and one bad! And worse yet, depending on your data replication latency from your primary site to the disaster recovery site, and how quickly you stop the DR site from replicating, THAT may get hosed too if you don’t catch it in time!

Enter the DATABASE BACKUP AND RESTORE, the subject of this article. Database backups have been around as long as Relational Database Management Systems (RDBMS). In my humble opinion, a product cannot be considered a full-featured RDBMS unless it has the capability of performing routine backups and allows for granular restore to a point in time. (Sorry, but Microsoft Excel and Access simply do not qualify.) Being a Microsoft guy, I’m going to zero in on their flagship product: SQL Server, but Oracle, SAP, IBM and many others will have similar functionality. (See the Gartner Magic Quadrant for database systems for a quick look at various vendors, including Microsoft a clear leader in this Magic Quadrant.)

So what is a BACKUP? “Is it not simply a copy of the database?” you say, “I can make file copies of my Excel spreadsheet. Isn’t that the same as a backup?” Let me explain how database backups work and then you can decide the answer to that question.

First of all, you’ll need the system to create a FULL database backup. This is a file generated by the database server system, stored on the file system, the format of which is proprietary to the system. Typically, full backups are taken once per night for a moderately sized database, for example under 100 GB, and should be handled via an automated scheduling service such as SQL Agent.

iStock_000006412772XSmallNext, you’ll need TRANSACTION LOG backups. Log backups, as they are known, record every single change in the database that has occurred since the last full or log backup. A good starting point is scheduling log backups at least every hour, with possible tightening down to every few minutes if the database is extremely active.

Now, to restore a database in the event of a failure, you need to do one very important step: backup the transaction log one last time if you want to have any hope of restoring to a recent point. To perform the actual restore, you’ll need what is known as the ‘chain of backups’ which includes the most recent full backup and every subsequent log backup. During the restore, you will be able to specify a point in time anywhere from the time of the full backup to the time of the latest log backup, right down to the second or millisecond.

So we’re all set right? Almost. The mantra of Database Administrators the world over regarding backups is this: “The backups are only as good and sure as the last time we tested the RESTORE capability.” In other words, if you haven’t tested your ability to restore your database to a particular point in time, you can’t be sure you’re doing it right. Case in point: I saw a backup strategy once where the FULL backups were written directly to a tape drive every night, then first thing in the morning, the IT guys would dutifully eject the tapes and immediately ship them out to an off-site storage location. How can you restore a database if your backups are not available? Case two: The IT guys, not understanding SQL backup functionality and benefits, used a third party tool to take database backups, but didn’t bother with the logs. After four years of this, they had a log that was 15 times the size of the database! So big, in fact, that there was no space available to hold its backup. About a year after I got the situation straightened out with regular full AND transaction log backups going, the physical server (virtualization was not common practice then) experienced a debilitating hardware failure and the whole system was down for three days. Once running again, the system (a financials software package with over 20,000 tables!) was restored to a point in time right before the failure. Having the daily FULL backups saved the financials system (and the company). But also having the log backups saved many people a day’s work if we had had to go back to the latest FULL backup.

So, what’s your data worth? If your data is critical to your business, it is critical that you properly back up the data. Talk to us to learn how we can help with this.

About Todd: Todd Chittenden started his programming and reporting career with industrial maintenance applications in the late 1990’s. When SQL Server 2005 was introduced, he quickly became certified in Microsoft’s latest RDBMS technology and has added certifications over the years. He currently holds an MCSE in Business Intelligence. He has applied his knowledge of relational databases, data warehouses, business intelligence and analytics to a variety of projects for BlumShapiro since 2011. 

Technology Talks Newsletter CTA

Three Major Changes to Master Data Services in SQL 2012

Microsoft introduced Master Data Services as part of their SQL Server product starting with version 2008 R2. In SQL 2012, there were several enhancements that made the product a bit more usable. We’ll be highlighting three of the major changes in this article. I will be presenting a more in-depth look at Master Data Services at a SQL Saturday even in Providence, Rhode Island on September 21st, 2013 (#sqlsat; #sqlpass).

Master Data Services web application is now Silverlight enabled

The first version of MDS in SQL 2008 R2 had a basic web front-end that allowed the user to create models, entities, attributes, and hierarchies. But it was slow. It did the job OK, but could be very frustrating when working with more than a handful of records. With any decent amount of data, the Explorer page tended to be unmanageable. MDS web site in SQL 2012 is now Silverlight enabled and responds much faster. Fast enough, in fact, to NOT require any third-party tools to simply browse the data.


The Master Data Services Add-in for Microsoft Excel

I heard mentioned, tongue-in-cheek, at a Microsoft conference once that eventually all any developer would ever need would be Internet Explorer and Visual Studio. I submit that the third essential component in that set would be the right Add-in for Excel. Let’s face it, EVERYBODY uses Excel these days. Realizing this, Microsoft has done us a great service by including the install link right on the MDS web site. Once installed, connecting to the data is very straight-forward. Simply supply the URL of the site to connect, then choose the Model and Version to work with. Adding and editing data is intuitive as well. The Add-in keeps track of changed data and highlights it as you work. In the end, the Publish button on the tool bar pushes the data changes back into the database.


New Staging table structure for Entity imports

In 2008 R2, to load a single member row into an entity with, say, 5 attributes required a total of 6 rows of data to be sent to two staging tables. The staging table in MDS 2008 R2 handled bulk imports of ALL entities, ALL attributes, and ALL relationships with only 4 tables, regardless of the number of Models or Entities. The two main tables were for Members and Attributes, with the latter being in an ‘Attribute Name/Attribute Value pair’ format. So your one row of data with 5 attributes would turn into one row of data in the Members staging table, plus 5 rows of un-pivoted data in the Attributes table. Entities with few attributes did not cause major issues, but complex attributes could tend to bloat the staging tables. In addition, if two different processes were trying to load two different entities at the same time, there was the possibility that both sets of data could be co-mingled to the one set of staging tables. This would not be an issue if all data would be staged within one batch, but trying to then process a sub-set of staging data could be difficult.

In SQL 2012 MDS, each entity now gets its own staging table for imports, with meta-data to match, plus a dedicated stored procedure to process data in that table. Not only does this eliminate the need to un-pivot the attributes to attribute/value pairs, it eliminates any possibility that entities could be mixed in the staging import tables.