Tag Archive for SQL Server

How Much is Your Data Worth?

Data is the new currency in today’s modern businesses. From the largest international conglomerate down to the smallest neighborhood Mom-and-Pop shop, data is EVERYTHING! Without data, you don’t know who to bill for services, or for how much. You don’t know how much inventory you need on hand, or who to buy it from if you run out. Seriously, if you lost all of your data, or even a small but vitally important piece of it, could your company recover? I’m guessing not.

“But,” you say, “We have a disaster recovery site we can switch to!”

That’s fine if your racks melt down into a pool of heavy metals on the server room floor, then yes, by all means switch over to your disaster recovery site because molten discs certainly qualify as a “disaster!” Databases hosted on private or public cloud virtual machines are less susceptible, but not immune, to hardware failures.  But what about a failure of a lesser nature? What if one of your production databases gets corrupted because of a SQL Injection hack, cleaned out by a disgruntled employee, or is accidentally purged because a developer thought he was working against the DEV environment? Inadvertent changes to data are no respecter of where such data is stored, or how it is stored! And, sorry to say, clustering or other HADR solutions (High Availability/Disaster Recovery, such as SQL Server Always On technology) may not be able to save you in some cases. Suppose some data gets deleted or is modified in error. These ‘changes’, be they accidental or on purpose, may get replicated to the inactive node of the cluster before the issue is discovered. After all, the database system doesn’t know if it should stop such changes from happening when the command to modify data is issued. How can it tell an ‘accidental purge’ from regular record maintenance? So the system replicates those changes to the failover node. You end up with TWO copies of an incorrect database instead of one good one and one bad! And worse yet, depending on your data replication latency from your primary site to the disaster recovery site, and how quickly you stop the DR site from replicating, THAT may get hosed too if you don’t catch it in time!

Enter the DATABASE BACKUP AND RESTORE, the subject of this article. Database backups have been around as long as Relational Database Management Systems (RDBMS). In my humble opinion, a product cannot be considered a full-featured RDBMS unless it has the capability of performing routine backups and allows for granular restore to a point in time. (Sorry, but Microsoft Excel and Access simply do not qualify.) Being a Microsoft guy, I’m going to zero in on their flagship product: SQL Server, but Oracle, SAP, IBM and many others will have similar functionality. (See the Gartner Magic Quadrant for database systems for a quick look at various vendors, including Microsoft, a clear leader in this Magic Quadrant.)

So what is a BACKUP? “Is it not simply a copy of the database?” you say, “I can make file copies of my Excel spreadsheet. Isn’t that the same as a backup?” Let me explain how database backups work and then you can decide the answer to that question.

First of all, you’ll need the system to create a FULL database backup. This is a file generated by the database server, stored on the file system, in a format proprietary to the database engine. Typically, full backups are taken once per night for a moderately sized database (say, under 100 GB) and should be handled via an automated scheduling service such as SQL Agent.
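
As a rough illustration, a nightly FULL backup boils down to a single T-SQL statement that SQL Agent can run on a schedule. The database name and file path below are made up for the example:

BACKUP DATABASE SalesDB
TO DISK = N'D:\Backups\SalesDB_Full.bak'
WITH INIT, CHECKSUM, STATS = 10;   -- overwrite last night's file, verify page checksums, report progress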

Next, you’ll need TRANSACTION LOG backups. Log backups, as they are known, record every single change in the database that has occurred since the last full or log backup. A good starting point is scheduling log backups at least every hour, with possible tightening down to every few minutes if the database is extremely active.
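
The corresponding log backup is just as simple. Here is a hedged sketch against the same hypothetical database (in practice the file name usually carries a timestamp so each log backup lands in its own file):

BACKUP LOG SalesDB
TO DISK = N'D:\Backups\SalesDB_Log_0900.trn'
WITH CHECKSUM, STATS = 10;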

Now, to restore a database in the event of a failure, you need to do one very important step: back up the transaction log one last time if you want to have any hope of restoring to a recent point. To perform the actual restore, you’ll need what is known as the ‘chain of backups,’ which includes the most recent full backup and every subsequent log backup. During the restore, you will be able to specify a point in time anywhere from the time of the full backup to the time of the latest log backup, right down to the second or millisecond.
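
Sketched out in T-SQL, and still using the hypothetical names and paths from above, the sequence looks like this (your logical file names, backup files and STOPAT time will of course differ):

-- 1. Back up the tail of the log; NORECOVERY leaves the damaged database in a restoring state
BACKUP LOG SalesDB TO DISK = N'D:\Backups\SalesDB_Tail.trn' WITH NO_TRUNCATE, NORECOVERY;

-- 2. Restore the most recent FULL backup, but do not recover the database yet
RESTORE DATABASE SalesDB FROM DISK = N'D:\Backups\SalesDB_Full.bak' WITH REPLACE, NORECOVERY;

-- 3. Restore every subsequent log backup in order, stopping at a point in time on the last one
RESTORE LOG SalesDB FROM DISK = N'D:\Backups\SalesDB_Log_0900.trn' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = N'D:\Backups\SalesDB_Tail.trn'
WITH STOPAT = '2024-01-15T09:42:00', RECOVERY;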

So we’re all set, right? Almost. The mantra of Database Administrators the world over regarding backups is this: “The backups are only as good and sure as the last time we tested the RESTORE capability.” In other words, if you haven’t tested your ability to restore your database to a particular point in time, you can’t be sure you’re doing it right. Case in point: I saw a backup strategy once where the FULL backups were written directly to a tape drive every night, then first thing in the morning, the IT guys would dutifully eject the tapes and immediately ship them out to an off-site storage location. How can you restore a database if your backups are not available? Case two: The IT guys, not understanding SQL backup functionality and benefits, used a third-party tool to take database backups, but didn’t bother with the logs. After four years of this, they had a log that was 15 times the size of the database! So big, in fact, that there was no space available to hold its backup. About a year after I got the situation straightened out with regular full AND transaction log backups going, the physical server (virtualization was not common practice then) experienced a debilitating hardware failure and the whole system was down for three days. Once running again, the system (a financials software package with over 20,000 tables!) was restored to a point in time right before the failure. Having the daily FULL backups saved the financials system (and the company). And having the log backups as well spared many people the day’s work they would have lost had we been forced to go back to the latest FULL backup.
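
Testing the restore side does not have to be elaborate. A quick verification of the backup file, and better yet an occasional full restore under a different name on a test server, go a long way. Both statements below are sketches against the same hypothetical files; the logical file names in the MOVE clauses are assumptions you would replace with your own:

-- Sanity check only: reads the backup and verifies its checksums (not a substitute for a real test restore)
RESTORE VERIFYONLY FROM DISK = N'D:\Backups\SalesDB_Full.bak' WITH CHECKSUM;

-- The real test: restore the backup as a separate database on a test server
RESTORE DATABASE SalesDB_RestoreTest
FROM DISK = N'D:\Backups\SalesDB_Full.bak'
WITH MOVE 'SalesDB' TO N'D:\Data\SalesDB_RestoreTest.mdf',
     MOVE 'SalesDB_log' TO N'D:\Data\SalesDB_RestoreTest.ldf',
     RECOVERY;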

So, what’s your data worth? If your data is critical to your business, it is critical that you properly back up the data. Talk to us to learn how we can help with this.

About Todd: Todd Chittenden started his programming and reporting career with industrial maintenance applications in the late 1990’s. When SQL Server 2005 was introduced, he quickly became certified in Microsoft’s latest RDBMS technology and has added certifications over the years. He currently holds an MCSE in Business Intelligence. He has applied his knowledge of relational databases, data warehouses, business intelligence and analytics to a variety of projects for BlumShapiro since 2011. 

Electrify Your Business with Data

Data is a lot like electricity – a little bit at a time, not too much direct contact, and you’re fine. For example, a single nine-volt battery doesn’t provide enough power to light a single residential bulb. In fact, it would take about a dozen nine-volt batteries to light that single bulb, and it would only last about an hour. It’s only when you get massive amounts of electricity flowing in a controlled way that you can see real results, like running electric motors, lighting up highway billboards, heating an oven or powering commuter trains.

It’s the same with data. The sale of a single blue shirt at a single outlet store is not much data. And it’s still not much even when you combine it with all the sales for that store in a single day. But what about a year’s worth of data, from multiple locations? Massive amounts of data can do amazing things for us as well. We have all seen in today’s data-centric business environment what controlled usage of data can do.

Some examples include:

  • The National Oceanic and Atmospheric Administration (NOAA) can now more accurately predict a hurricane’s path thanks to data that has been collected over time.
  • Marketing firms can save money with culled-down distribution lists based on customer demographics, shopping habits and preferences.
  • Medical experts can identify and treat conditions and diseases much better based on a patient’s history, life risks and other factors.
  • Big ‘multi-plex’ movie houses can more accurately predict the number of theatres they will need to provision for the latest summer blockbuster by analyzing Twitter and other social media feeds related to the movie.

All of this can be done thanks to controlled data analytics.

The key word here is “controlled.” With a background in marine engineering and shore-side power generation, I have seen my share of what can happen when electricity and other sources of energy are not kept ‘controlled.’ Ever see what happens when a handful of welding rods go through a steam turbine spinning at 36,000 RPM and designed for tolerances of thousandths of an inch? It’s not pretty. After as many years in database technologies, data analysis and visualizations, I have also seen the damage resulting from large quantities of uncontrolled data. In his book Signal: Discerning What Matters Most in a World of Noise, author Stephen Few shows a somewhat tongue-in-cheek graph that ‘proves’ that the sale of ice cream is a direct cause of violent crime. Or was it the other way around? It’s an obvious comic hyperbole that serves to illustrate his point that we need to be careful of how we analyze and correlate data.

With the ‘big data’ explosion, proponents will tell you that ‘if a little is good, then more is better.’ It’s an obvious extension, but is it accurate? Is there such a thing as ‘too much data’?

Let’s say you are a clothing retail store in the mall. Having data for all of your sales over the past ten years, broken down by item, store, date, salesperson and any number of other dimensions, may be essential. What if we were to also include ALL the sales of ALL competitors’ products, seasonal weather history, demographic changes, foot traffic patterns in the mall, and just about anything else that could influence a customer’s decision to buy your product, even down to what they had for lunch just before they made the purchase? The result would most likely be UN-controlled data analysis. This tends to lead to erroneous correlations and bad decisions. For instance, you just might discover that customers are four times as likely to make a purchase if they had pizza for lunch, never realizing that there are more pizza restaurants near your stores than any other type of food service!

When it comes to data, stick with what you know is good data. Make sure it’s clean, reliable and, most of all, relevant. Above all, make sure you CONTROL your data. Without control, there may be no stopping the damage that results.

Dynamic Cell Formatting in SSRS

Early in my Business Intelligence Designer days using other report designer tools, I was all about the data. Get the data onto the report, black and white, never mind fancy fonts or colors. After all, the numbers won’t be any different if you try to make them prettier. A bit like putting lipstick on a pig! Then I came to realize that judicious deviation from the default fonts was not only acceptable, but desirable, as evidenced in SQL Server Reporting Services report templates. Sub-total numbers started showing up in a heavier and more prominent font and weight. Grand total numbers even more so. Such emphasis on the more important numbers on a report helps the eyes quickly identify them as such, leaving the brain to concentrate on other things, such as the numbers themselves. You end up spending more time thinking about the data, and less time finding what you want.

But report cell formatting can go so much further than simply putting more emphasis on numbers at the bottom of the page. With SQL Server Reporting Services, you have the ability to do ‘conditional formatting’, much like what can be done in Excel or Access. (If one number is greater than another, format the cell one way; if not, do something else.) To take advantage of the rich formatting options available in SSRS, you need to understand a few basic concepts, and some nuances of Reporting Services.

Know Your Properties

Click on a tablix cell in a report, then click the Properties button.


Note the myriad of properties that can be set just for the cell formatting. Each property, when you go to access its options via the drop-down menu, will have a choice labeled “Expression…”. This is where we will be putting in the ‘conditions’ of the conditional formatting.


Know your Expression Editor, Program Flow expressions, and Property expressions

The Expression Editor in Reporting Services is a powerful tool. For the purposes of our conditional formatting, we’ll be using it to a) examine some value, b) set a Property, and c) supply a default. For the Font Color property of a cell, click the Expression… option below the color palette (see the screen shot above). At this point, the Expression will most likely contain the single word “Black”, assuming you are going with a standard font color. Also note that the word Black is NOT enclosed in double quotes.

In the Category list in the lower left, expand the Common Functions node, then click on Program Flow. Note that there are three functions in the Item list:

Choose
IIf
Switch

Each of these can be used effectively for this task, but we’ll stick with the IIf ( ) function as it is probably the easiest to understand. Basically it works like this: IF a certain condition evaluates to TRUE, THEN do what it says in the second section, OTHERWISE, do what it says in the third section (the default). To see it in action, enter the following formula:

=IIf(1 = 1, "Black", "Red")

Click OK, then Preview the report. Because 1 always equals 1, the first section evaluates to TRUE, so the second section is used, in this case “Black”. Note that we need to enclose our color names in double quotes in the Expression Editor for this to work. Now edit the formula and replace the 1 = 1 with 1 = 0. The preview will show the cell contents in a red font. Armed with this knowledge, we can replace the 1 = 0 with a more appropriate business condition. Delete the 1 = 0 text, but leave the cursor in that spot. Click on the Fields (DatasetName) node to see a list of fields available for the Dataset. Double click one to have it inserted in the formula. Add some examination criteria for it that makes sense to your business users. Remember that this part of the expression must evaluate to either True or False. Your final Expression might look like this:

= IIf(Fields!YTD_Sales.Value >= Fields!PriorYear_YTD_Sales.Value, "Black", "Red")

The above conditional expression template can also be used to set any other property available. The only trick is to know what to put in place of the “Black” and “Red” components. The easiest way to determine that is to open the list of choices for the property and use any of the values you see displayed, as long as you enclose them in double quotes in your expression. For example, valid values for the Font Weight property include “Normal”, “SemiBold”, “Bold”, “ExtraBold” and “Heavy”.

Using the same True/False comparison from the previous example, the Font Weight property could contain the expression:

= IIf(Fields!YTD_Sales.Value >= Fields!PriorYear_YTD_Sales.Value, "Bold", "Normal")

Font Size is a little trickier as it does not list any valid values. Just know that “8pt” is an 8 point font, “10pt” is 10 point, and so on.

Other Conditional functions

We skipped over the Choose () and Switch () functions, but they operate in a similar manner to IIf(). They return a value based on some comparison. Microsoft’s example of the Choose() function leaves out one simple element: the return value. Sure, it is nice to show the function in use:

=Choose(2, "13", "15", "21")

But it would be even better to explain that the function would be ‘choosing’ the second option, which is “15”. How could we use this in a report? Suppose your Salespeople are ranked on performance, and that ranking is available as a report field as an integer. You could select a Font Weight based on the ranking as follows:

= Choose(Fields!EmployeeRank.Value, "Heavy", "ExtraBold", "Bold", "SemiBold", "Medium", "Normal")

The Switch() function lets you list ‘comparison/return’ pairs, all separated by commas, and the function will take the return value that matches the first comparison that evaluates to True. The basic format for this function is:

= Switch ( Comparison 1, Return Value 1, Comparison 2, Return Value 2, … Comparison N, Return Value N )

A Font Size expression could be written as:

= Switch(Fields!EmployeeLevel.Value = "Manager", "16pt",
Fields!AnnualSales.Value <= Fields!SalesGoal.Value, "14pt",
Fields!AnnualSales.Value > Fields!SalesGoal.Value, "10pt")

Note here that we have used a different field in the second and third comparisons than is used in the first. There is no rule that says that the comparison elements need to look at the same fields.

Summary

It’s easy to see that the Expression editor in SQL Server Reporting Services can be used to build powerful and dynamic expressions that drive various formatting properties. By manipulating multiple Property Expressions for a single cell, the report designer can draw attention to various numbers and report elements if and when the numbers being reported warrant it. It is a much better viewing experience than a simple black on white report where every number is given the same emphasis.

When and How to Use Xml Data Types in SQL Server

During a design session with a client, the question came up of “When is it a good idea to consider Xml storage in SQL Server?” I have been a big fan of Xml technologies for some time now and was resisting the urge to say “All the time!” However, in this particular case, we were designing a fairly generic Auditing sub-system for use in a specific CRM application. As we discussed the problem, it became clear that we really wanted to track operations on a broad set of business objects. So we began to think about XML in SQL Server 2005 more seriously.

XML documents can go a long way towards reducing the overall complexity of a relational model that is trying to be “all things to all people”. For our audit system, we wanted to start out with a small set of “messages”, but we wanted to support a broader set in future releases. Whenever I hear that, I get a bit scared, because it runs the risk of either creating wild complexity in the data model or over-simplifying it. On the other hand, a single column of type XML can support a number of schemas at once.

What is the best approach for using XML in a relational model? Here are my thoughts:

Strongly Type the XML – it is extremely easy to use Visual Studio tools to generate an XSD schema from a candidate XML message. So, take the time to define what your message ought to look like and then refine your schema to enforce it. That way you can have some comfort with the data you are storing.

Qualify your Elements – people seem a bit skittish about Xml Namespaces. Think about this in terms of .NET code – we always want to define a Namespace structure for code which reflects the owner of the code, its source service or application, and its intended purpose. Otherwise we would have type collisions all over the place! Well, the same applies to Xml – Xml messages should be as specific as possible. I like to use the following standard when assigning a namespace: http://schemas.yourcompany.com/system/subsystem/component
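
To make the first two points concrete, here is a minimal T-SQL sketch. The schema collection, table, element names and namespace are illustrative inventions (the namespace simply follows the convention above), not taken from any real project:

CREATE XML SCHEMA COLLECTION AuditMessageSchemas AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://schemas.yourcompany.com/crm/audit/message"
             elementFormDefault="qualified">
    <xs:element name="AuditMessage">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Action"      type="xs:string" />
          <xs:element name="PerformedBy" type="xs:string" />
          <xs:element name="PerformedOn" type="xs:dateTime" />
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';

CREATE TABLE dbo.AuditLog
(
    AuditLogId int IDENTITY(1,1) PRIMARY KEY,
    Message    xml (AuditMessageSchemas) NOT NULL   -- strongly typed: only documents valid against the collection are accepted
);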

Promote Important Properties – I’m a SharePoint guy, but I saw this behavior first in BizTalk Server, and I think it is widely applicable. In BizTalk, developers have the ability to “promote” values out of an Xml message and into the “header” of the message. Imagine designing a class which included an Xml document as a public property. Now add a set of public properties which act as XPath queries into the Strongly Typed Xml document within. This approach is possible because you have already designed the schema for the message, so you know what to expect – it’s not just ad-hoc XML. Now, the values within the message are readily available to even casual interrogators of the class – no need for XPath or schema knowledge on the client’s end. In SharePoint, this concept is manifested in Forms Libraries: when an InfoPath form is published to a Forms Library, the publisher can quickly promote values from the internal InfoPath payload into columns on the SPListItem which will be created by each InfoPath form entry. Perfect! Let’s do the same in SQL Server! Your SQL colleagues, who can now run plain SQL queries against these tables without any knowledge of XPath, will thank you.
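
In SQL Server, a lightweight way to approximate property promotion (continuing the hypothetical audit table and namespace from the sketch above) is to surface the key values through the xml data type’s value() method in a query or view, so downstream consumers never touch XPath:

WITH XMLNAMESPACES ('http://schemas.yourcompany.com/crm/audit/message' AS msg)
SELECT
    AuditLogId,
    Message.value('(/msg:AuditMessage/msg:Action)[1]',      'nvarchar(100)') AS Action,
    Message.value('(/msg:AuditMessage/msg:PerformedBy)[1]', 'nvarchar(100)') AS PerformedBy,
    Message.value('(/msg:AuditMessage/msg:PerformedOn)[1]', 'datetime2')     AS PerformedOn
FROM dbo.AuditLog;

Wrap that in a view (or, if query volume warrants it, persist the promoted values in ordinary indexed columns) and you get much the same “promoted property” experience that BizTalk and SharePoint offer.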

Any success/horror stories you’d like to share around using XML data in a relational data store?

Brian Berry is a Director of Technology Consulting with BlumShapiro, focusing on Microsoft Business Intelligence solutions, with a strong focus on Systems Integration, Master Data Management and PerformancePoint Services. He has been helping companies optimize their investments in Microsoft technology for over 12 years.