Archive for SQL 2012

How Much is Your Data Worth?

Data is the new currency in today’s modern businesses. From the largest international conglomerate down to the smallest neighborhood Mom-and-Pop shop, data is EVERYTHING! Without data, you don’t know who to bill for services, or for how much. You don’t know how much inventory you need on hand, or who to buy it from if you run out. Seriously, if you lost all of your data, or even a small but vitally important piece of it, could your company recover? I’m guessing not.

“But,” you say, “We have a disaster recovery site we can switch to!”

That's fine. If your racks melt down into a pool of heavy metals on the server room floor, then yes, by all means switch over to your disaster recovery site, because molten disks certainly qualify as a "disaster!" Databases hosted on private or public cloud virtual machines are less susceptible, but not immune, to hardware failures. But what about a failure of a lesser nature? What if one of your production databases gets corrupted by a SQL injection attack, cleaned out by a disgruntled employee, or accidentally purged because a developer thought he was working against the DEV environment? Inadvertent changes to data are no respecter of where, or how, that data is stored! And, sorry to say, clustering or other HADR solutions (High Availability/Disaster Recovery, such as SQL Server Always On technology) may not be able to save you in some cases.

Suppose some data gets deleted or modified in error. These 'changes', be they accidental or on purpose, may get replicated to the inactive node of the cluster before the issue is discovered. After all, the database system doesn't know whether it should stop such changes when the command to modify data is issued. How can it tell an 'accidental purge' from regular record maintenance? So the system replicates those changes to the failover node, and you end up with TWO copies of an incorrect database instead of one good one and one bad! Worse yet, depending on your replication latency from the primary site to the disaster recovery site, and how quickly you stop the DR site from replicating, THAT copy may get hosed too if you don't catch it in time!

Enter the DATABASE BACKUP AND RESTORE, the subject of this article. Database backups have been around as long as Relational Database Management Systems (RDBMS). In my humble opinion, a product cannot be considered a full-featured RDBMS unless it can perform routine backups and allows for granular restores to a point in time. (Sorry, but Microsoft Excel and Access simply do not qualify.) Being a Microsoft guy, I'm going to zero in on their flagship product, SQL Server, but Oracle, SAP, IBM and many others offer similar functionality. (See the Gartner Magic Quadrant for database systems for a quick look at the various vendors, including Microsoft, a clear leader in that quadrant.)

So what is a BACKUP? “Is it not simply a copy of the database?” you say, “I can make file copies of my Excel spreadsheet. Isn’t that the same as a backup?” Let me explain how database backups work and then you can decide the answer to that question.

First of all, you'll need the system to create a FULL database backup. This is a file generated by the database engine and stored on the file system, in a format proprietary to that engine. Typically, full backups are taken once per night for a moderately sized database (say, under 100 GB) and should be handled by an automated scheduling service such as SQL Server Agent.
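
As a rough illustration, a full backup boils down to a single T-SQL statement; the database name and file path below are placeholders only:

    -- Full database backup (name and path are placeholders)
    BACKUP DATABASE [SalesDB]
    TO DISK = N'D:\Backups\SalesDB_Full.bak'
    WITH INIT, CHECKSUM,
         NAME = N'SalesDB Full Backup';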

Next, you'll need TRANSACTION LOG backups. Log backups, as they are known, record every single change in the database that has occurred since the last full or log backup. A good starting point is scheduling log backups at least every hour, tightening down to every few minutes if the database is extremely active.
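
Again as a sketch (this assumes the database is in the FULL recovery model, and the names and path are placeholders; schedule it via SQL Server Agent rather than running it by hand):

    -- Transaction log backup; run on a frequent schedule
    BACKUP LOG [SalesDB]
    TO DISK = N'D:\Backups\SalesDB_Log_202401011200.trn'
    WITH INIT, CHECKSUM;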

Now, to restore a database in the event of a failure, there is one very important first step: back up the transaction log one last time if you want any hope of restoring to a recent point. To perform the actual restore, you'll need what is known as the 'chain of backups', which includes the most recent full backup and every subsequent log backup. During the restore, you can specify a point in time anywhere from the time of the full backup to the time of the latest log backup, right down to the second or millisecond.
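
Put together, a point-in-time restore might look something like the sketch below; all names, paths and the STOPAT time are placeholders, and a real restore would include every log backup in the chain:

    -- 1. Tail-log backup; leaves the damaged database in RESTORING state
    BACKUP LOG [SalesDB]
    TO DISK = N'D:\Backups\SalesDB_TailLog.trn'
    WITH NORECOVERY;

    -- 2. Restore the most recent FULL backup, but do not recover yet
    RESTORE DATABASE [SalesDB]
    FROM DISK = N'D:\Backups\SalesDB_Full.bak'
    WITH NORECOVERY;

    -- 3. Restore each subsequent log backup, in order
    RESTORE LOG [SalesDB]
    FROM DISK = N'D:\Backups\SalesDB_Log_202401011200.trn'
    WITH NORECOVERY;

    -- 4. Restore the tail-log backup, stopping at the desired point in time
    RESTORE LOG [SalesDB]
    FROM DISK = N'D:\Backups\SalesDB_TailLog.trn'
    WITH STOPAT = '2024-01-01T12:34:56', RECOVERY;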

So we're all set, right? Almost. The mantra of Database Administrators the world over regarding backups is this: "Your backups are only as good as the last time you tested the RESTORE." In other words, if you haven't tested your ability to restore your database to a particular point in time, you can't be sure you're doing it right. Case in point: I once saw a backup strategy where the FULL backups were written directly to a tape drive every night; first thing in the morning, the IT guys would dutifully eject the tapes and immediately ship them out to an off-site storage location. How can you restore a database if your backups are not available?

Case two: the IT guys, not understanding SQL backup functionality and benefits, used a third-party tool to take database backups but didn't bother with the logs. After four years of this, they had a log file 15 times the size of the database; so big, in fact, that there was no space available to hold its backup. About a year after I got the situation straightened out, with regular FULL and transaction log backups running, the physical server (virtualization was not common practice then) experienced a debilitating hardware failure and the whole system was down for three days. Once running again, the system (a financials software package with over 20,000 tables!) was restored to a point in time right before the failure. The daily FULL backups saved the financials system (and the company), and the log backups saved many people a full day's work that would otherwise have been lost by going back to the latest FULL backup.

So, what’s your data worth? If your data is critical to your business, it is critical that you properly back up the data. Talk to us to learn how we can help with this.

About Todd: Todd Chittenden started his programming and reporting career with industrial maintenance applications in the late 1990s. When SQL Server 2005 was introduced, he quickly became certified in Microsoft's latest RDBMS technology and has added certifications over the years. He currently holds an MCSE in Business Intelligence. He has applied his knowledge of relational databases, data warehouses, business intelligence and analytics to a variety of projects for BlumShapiro since 2011.


Adding User Configurations to an Analysis Server Cube

Part 3: One Row for Each Dimension Member

In Part 2, we created a view that pivoted the Configurations table around a single [Dummy Key] value of -1 and joined that to a dimension. And the dimension we picked had to be one that would never be used with the Configuration Measure Group because the sum of the configuration measure was only available at the UNKNOWN dimension member. Now, perhaps we come to realize that there is no good candidate dimension to which we can join our configuration measure group because it needs to be available to ALL dimensions. What now?

To make this work, we will need to edit our view from Part 2. First, select a dimension with only a few members. I chose my Date dimension for this solution. We’ll need to create the dimension table and populate it with some sample data.

Create the table …
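
Something like this minimal sketch will do for the walkthrough (in a real warehouse the Date dimension would already exist, with far more columns):

    -- Minimal sample Date dimension for demonstration purposes
    CREATE TABLE dbo.DimDate (
        [Date Key]  int  NOT NULL PRIMARY KEY,
        [Full Date] date NOT NULL
    );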

Insert some data …
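
Seven sample rows are enough to illustrate the arithmetic below; the dates themselves are arbitrary:

    INSERT INTO dbo.DimDate ([Date Key], [Full Date])
    VALUES (20240101, '2024-01-01'),
           (20240102, '2024-01-02'),
           (20240103, '2024-01-03'),
           (20240104, '2024-01-04'),
           (20240105, '2024-01-05'),
           (20240106, '2024-01-06'),
           (20240107, '2024-01-07');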

Now alter the view by adding a CROSS JOIN to the dimension table, and substituting the dimension’s [Date Key] for the static [Dummy Key] as follows:
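
A sketch of the revised view, reusing the hypothetical table and configuration names from Part 2:

    -- The CROSS JOIN repeats every configuration for every date,
    -- and [Date Key] replaces the static [Dummy Key]
    ALTER VIEW dbo.vwConfigurations AS
    SELECT *
    FROM (
        SELECT  d.[Date Key],
                c.ConfigurationName,
                c.ConfigurationValue
        FROM dbo.Configurations AS c
        CROSS JOIN dbo.DimDate  AS d
    ) AS src
    PIVOT (
        SUM(ConfigurationValue)
        FOR ConfigurationName IN ([Standard Labor Rate], [Overhead Percentage])
    ) AS pvt;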

The CROSS JOIN in the sub-select will return the Cartesian product of all rows in the Date dimension table (7) and all rows in the Configuration table (2), or 14 rows. Each configuration will have exactly the same value for every date. The PIVOT will then collapse those 14 rows back to 7 rows with three columns: [Date Key] and the two configuration columns.

Refreshing the Data Source View will remove the [Dummy Key] field and add the new [Date Key] column in its place. Next, edit the Dimension Usage tab so that the Configuration Measure Group is joined to the Date dimension on the [Date Key] field. But browsing the cube now along any other dimension will yield surprising results: the Configuration value is SUMMED for the number of days in the slice! Well, of course it is: that is what we told it to do when we created the Measures back in Part 2! Go back to the Cube Structure tab and change the aggregation of each measure in the Configurations Measure Group to "FirstNonEmpty" (if you have Enterprise Edition) or Min or Max (if you have only Standard Edition). Now, regardless of the number of rows in the cube slice, the aggregation will always yield the desired value.

As with Part 2, we still have a single point outside the cube where configurations are stored, but the configurations are available to all dimensions.

Adding additional Configuration entries would involve the following steps:

  • Add a row to the Configuration table
  • Edit the view to include the field based on its [Configuration Name]
  • Refresh the cube project’s Data Source view to get the new field included
  • Add a new Measure to the Configuration measure group

So far, our solutions have all resulted in global configuration values that if changed, will change for all dimension slices. In our case this is for all dates in the Date dimension. In the next part, we’ll explore a few different methods that will allow for entering configuration values that are date dependent. Since I started using the Date dimension for this series I’m going to stick with it for clarity, and also because that is usually how configurations are needed for the business.

Adding User Configurations to an Analysis Server Cube Part 2

Part 2: Dynamic User Configurations

In Part 1 of this series, we hard-coded some MDX values into the cube. That approach works in that it produces the desired end result, but if the values need to change, a developer is needed to make it happen. What is needed is a way to persist the configuration values outside of the cube itself. In Part 2, we will create a configuration table to store the values. The structure is borrowed from that used by earlier versions of Integration Services:

Really, the only two fields absolutely required here are the Name and Value fields; the other two are added for administration and clarification. Next we’ll insert some fictional values into this configuration table:
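
For example (the configuration names and values here are entirely made up):

    INSERT INTO dbo.Configurations (ConfigurationName, ConfigurationValue, Description)
    VALUES ('Standard Labor Rate', 42.5000, 'Hourly labor rate used in cost calculations'),
           ('Overhead Percentage',  0.1500, 'Overhead applied to direct costs');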

Next, create a view that pivots the Configuration Name and Configuration Value fields with T-SQL code like this:
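
A sketch of the view, using the hypothetical configuration names from above:

    CREATE VIEW dbo.vwConfigurations AS
    SELECT *
    FROM (
        SELECT  -1 AS [DummyKey],
                ConfigurationName,
                ConfigurationValue
        FROM dbo.Configurations
    ) AS src
    PIVOT (
        SUM(ConfigurationValue)
        FOR ConfigurationName IN ([Standard Labor Rate], [Overhead Percentage])
    ) AS pvt;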

The dataset returned by this view will be a single row of data with one column for each Configuration named in the PIVOT section, and a static [DummyKey] value of -1.

Again, some purists may dislike my use of "SELECT * FROM …" in my view definition, but since I am limiting the columns returned via the "… FOR ConfigurationName IN (…) …" portion of the PIVOT clause, there is not much chance of getting unneeded columns.

Next, add this view to the cube project's Data Source View, then add it as a Measure Group to your cube. Delete the COUNT and SUM(Dummy Key) measures that the Measure Group wizard added. Since there is only one row in the measure group's base table, a SUM() aggregation for each configuration is fine. Lastly, since a Measure Group MUST be joined to at least one Dimension, on the Dimension Usage tab join the Configuration Measure Group to a dimension in your cube that meets the following criteria:

  1. The dimension has a member row with a key value of -1. (Data Warehouse designers typically add a -1 key as the “Unknown” member of the dimension table.)
  2. You will NOT be using the dimension in conjunction with the Configuration Values. This sounds counterintuitive from a cube design standpoint, but it is explained below.

Browsing the cube by any dimension OTHER than the one used to join the Configuration Measure Group will return the configuration measure values at every cube intersection. This is because you are actually selecting the [All] member of that one dimension, which includes the SUM of each Configuration Value; and since the [Unknown] member (Key = -1) holds the only row, the SUM at the [All] level is simply that row's value. Browsing the cube INCLUDING that one dimension will show that the configuration values are ONLY available for the "Unknown" member, and not for any others. If your configuration values, whatever they represent, will NEVER be used with the dimension they are joined to, then this is just fine. But if there is any possibility that the Configuration Measures will be needed for any and every dimension in the cube, then you need to do a little editing of the view. We'll cover that in Part 3.

The advantage of this method over what was covered in Part 1 is that if the Configuration Values ever need to be changed, it is now simply a matter of changing a single value in a table and reprocessing the Measure Group instead of editing the cube design and redeploying the entire cube. To add additional configurations would involve the following:

  1. Add the entry in the table
  2. Edit the view to include the appropriate [ConfigurationName] in the PIVOT clause
  3. Refresh the Data Source View for the cube project
  4. Add a new measure to the Configuration Measure Group for the newly added column
  5. Deploy and process the cube

In Part 3, we will overcome the limitation of NOT being able to use the Configuration Measures for EVERY dimension.

Adding User Configurations to an Analysis Server Cube

Many times, clients ask me to hard-code various values into their cube logic: things like conversion rates or other common factors. This article series will cover several different methods for adding user-specific configuration values to an Analysis Server (Dimensional Model, as opposed to Tabular Model) cube. As with any technical solution, there may be other methods available. As the series progresses, the techniques will get more involved, but they will also become more flexible. As you may already be aware, there are essentially three ways to get 'stuff' into a cube: as a Calculated Member, as part of a Dimension Member, or as a Measure in a Measure Group. In Part 1 of this series, we will hard-code some values as MDX calculations. Most of the methods in this series will work with any version of SQL Server and Visual Studio; I will try to point out version limitations as they arise.

Part 1: Static Conversion Factors as an MDX Calculated Member

If a conversion factor isn't ever going to change (like converting miles to feet, or vice versa), we can simply add an MDX statement on the Calculations tab as a static value. Open the cube project in Visual Studio and double-click the cube object in Solution Explorer to open it in design mode. Click the Calculations tab at the top and enter the following MDX statements in Script View:
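
A sketch of what such a pair of hidden conversion-factor members might look like (the member names are my own invention):

    CREATE MEMBER CURRENTCUBE.[Measures].[Miles To Feet]
     AS 5280,
    VISIBLE = 0;

    CREATE MEMBER CURRENTCUBE.[Measures].[Feet To Miles]
     AS 1 / 5280,
    VISIBLE = 0;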

If you prefer Form View over Script View, your screen for the first MDX Member would look like this:

The purist reader may point out that the latter measure is simply the inverse of the former, which is absolutely true. However, I would reply that a) it doesn’t take up any more cube space or processing resources, and b) I prefer to always be multiplying my conversion factors instead of worrying about multiplication or division depending on the direction of the conversion.

For example, I would prefer the following MDX calculation:
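
Both sketches below assume a hypothetical [Distance In Miles] base measure; only the conversion-factor member used differs:

    -- Multiply by the miles-to-feet factor
    CREATE MEMBER CURRENTCUBE.[Measures].[Distance In Feet]
     AS [Measures].[Distance In Miles] * [Measures].[Miles To Feet],
    VISIBLE = 1;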

to this one:
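
    -- Divide by the inverse (feet-to-miles) factor; same result, less intuitive
    CREATE MEMBER CURRENTCUBE.[Measures].[Distance In Feet]
     AS [Measures].[Distance In Miles] / [Measures].[Feet To Miles],
    VISIBLE = 1;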

The end result of the two statements is exactly the same, but logically to me, the first one makes more sense.

Note here that the VISIBLE property of the two base conversion factors is set to 0, or False (hidden), because in this instance I don't need to let the users see them.

This method is about as simple as it gets, but that simplicity comes with a price. The conversion rate between miles and feet is never going to change, but something like a business-specific factor can, and most likely will, change. Using the above method would require a Business Intelligence developer to edit the MDX code within the cube and redeploy it. Starting in Part 2 of this series, the methods will involve saving the configuration values outside the cube and independent of the code.