5 Tips to Make Power BI Easy on the Brain

How you present your data driven insight is important! But unfortunately, analysts can sometimes forget to tell their story effectively, leaping from data exploration to a dashboard without giving much thought to how audiences will receive the information. Your insights can get lost in a messy report. If the information is critical, shouldn’t the communication medium be crisp, clean and understood at a glance?

Advanced Analytics tools, such as Qlik and Power BI, are fantastic for creating interactive dashboards and reports you can use to explore large datasets, understand trends, track key ratios and indicators, and then share insights with your colleagues, or your boss. What makes these tools useful? They take data and refine it into information by placing a visualization over it, thereby helping our visually oriented brains make sense of the numbers. When it comes to understanding the meaning behind the numbers, a data table or Excel report can leave a brain very, very tired.

Here are some quick tips for making your next analytics report “Easy on the brain”!

Our 5 Power BI Tips:

Respect the Rim

Before my career in technology began, I worked as a waiter, and I worked at some pretty classy spots. If you have never had the pleasure, let me share with you that before each plate makes it to its appointed destination, it is briefly inspected. If any sauces, herbs, or actual food has errantly landed on the rim of the plate, it is removed. “Respect the Rim,” my mentor once told me. The same is true for your data and information. Enforce a thin empty “margin” around each of your Power BI reports. By using the “Snap to Grid” feature, make sure that each visual on the report is aligned to your self-imposed margin. The analysis will look sharper and more credible.

What’s the Headline?

Most reports have data points which are essential, such as Key Performance Indicators (KPI) defined by management, or other indicators such as “Bottom Line” financials, Net Income or EBITDA. These essential measurements are the Headline for the report. Don’t bury the Headline – always place key information in the upper left-hand corner of the report, either in a Power BI Card or KPI visual.

Once you have done that, you can further segment visual information into key categories, and keep them segmented into groups. For example, beneath your KPI you may want to provide leading indicators or contributing factors. Another option may be to provide a group of categorical breakdowns together, or key ratios that contribute to the success or challenges of your headline. You may want to provide a group of visuals providing detailed exposition – a table or other visual with detailed categories. The information should flow from the Headline to the Exposition, just as a newspaper story would.

              

Have a Perspective

Reports that provide multiple ways to filter and slice the data are very helpful to data analysts, data scientists and casual explorers. These are tools to help you with data exploration. Once you’ve found insights worth sharing, focus the audience’s attention on that information by removing the extraneous bits. Don’t worry – the underlying Power BI data model remains for exploration of new insights later.

Make It Mobile

I always make a point of creating a phone layout for my reports, because it is very easy in Power BI. Due to the smaller form factor, phone layouts require you to further choose the essential information. However, if you’ve followed my advice, then you know what the headline is already. Simply drag the headlines onto the phone layout for your report before publishing. Among many outstanding features, the Power BI mobile app allows users to get notifications and data-driven alerts. End users can even mark up phone reports, distribute with their annotations and launch a conversation.

Intentional Style

Use good judgement and don’t get carried away with excessive colors, logos, or background images. I find that a company logo can be helpful for some audiences. However, when it comes to colors, be mindful of some universal rules.

  1. Some colors convey information, intended or otherwise (Red, Yellow and Green, for example)
  2. Company branding adds a professional touch – use the color scheme
  3. Use no more than 10 data colors, no more than 3 backgrounds and no more than 4 fonts

I recommend saving a Custom Theme for Power BI that reflects this guideline. Save off your company theme and save for later use. Any changes can be applied globally.

Your Most Important Information, Quickly Understood

Critical information is more valuable when it is quickly understood. Indeed, each visualization available conveys information in its own manner; therefore, it is important for professionals who prepare analysis with advanced analytics tools to be mindful of the strengths and weaknesses of each. Regardless of audience, these five guidelines apply.

Advanced analytics tools are a critical component to digital transformation, because they enable data-driven decision making. Among other things, data-driven decision making requires the creation of information from data. Often, that data is massive, or moving quite rapidly. To extract insights and reduce business uncertainty, talk to BlumShapiro about our analytics service offerings. We’ll provide a road map to data-driven decision making, enabling digital information at your fingertips. And you can focus on your business.

Berry_Brian-240About Brian: Brian Berry leads the Microsoft Business Intelligence and Data Analytics practice at BlumShapiro. He has over 15 years of experience with information technology (IT), software design and consulting. Brian specializes in identifying business intelligence (BI) and data management solutions for upper mid-market manufacturing, distribution and retail firms in New England. He focuses on technologies which drive value in analytics: data integration, self-service BI, cloud computing and predictive analytics

Technology Talks Podcast: Introduction to Blockchain

Listen to our podcast, Technology Talks, hosted by Hector Luciano, Consulting Manager at BlumShapiro Consulting. Each month, Hector will talk with about the latest news and trends in technology with different leaders in the field.

In this episode, Hector provides an introduction to blockchain technology, the connections to bitcoin and how this technology will have an impact on business going forward. Listen to previous episodes of Technology Talks on our SoundCloud page.

Create a Pareto Chart in Power BI

“Baseball is 90% mental, and the other half is physical.” – Yogi Berra

You just have to love Yogi Berra quotes like this. We all pretty much know what he’s talking about, even if his math is not spot on. It’s a restatement of the Pareto Principle, the 80/20 rule! It applies to just about anything in life or business. If I had to write a definition of it for technical documentation, it would look something like this:

“A situation where eighty percent of events attributed to a group are caused by twenty percent of the members of the group.”

Re-stated as examples:

  • Eighty percent of your human resource issues are caused by twenty percent of your employees.
  • Eighty percent of your maintenance issues are caused by twenty percent of your equipment.
  • Eighty percent of your sales are attributed to twenty percent of your products.
  • Eighty percent of the wealth is controlled by twenty percent of the population.

 

(And I’m certainly not in that last twenty percent. If I was, I wouldn’t have to write articles like this one!)

Now that we’ve got an understanding of the principle, let’s look at how it can be visualized. Excel has a very simple wizard for creating a Pareto Chart that can be found on the Insert menu:

But we want one of these in Power BI. And Power BI doesn’t have one (yet, maybe later). We’ll need to ‘roll our own’.  Let’s discuss the various parts of the chart itself, so we know what we’re shooting for.

  • The categories, or series, at the bottom (One, Two, Three, etc.) represent the different members of the ‘group’ we are trying to analyze. They may be employees, machines on the manufacturing line, or products in our catalog.
  • The blue bar above each ‘member’ is its respective measurement (count of HR issues, money spent on maintenance, annual sales, etc.). The scale of this measurement is on the left side, in our case going from 0 to 120.
  • The final element is the curved line and the right-side scale measuring from zero to one hundred percent. This represents, at each category member, the percentage of the cumulative total of all members to the left of the member in question, inclusive. To put it another way, as we add each category’s number to the running total of those on its left, the line represents that running total divided by the entire total for all members of the category.

The arrow points to the spot on the line where it crosses 80%, in our case, after about the first four members, as can be seen by following the green dashed line from right to left, then down. The first four members would be the ‘twenty percent’ of the Pareto Principle, and their cumulative measure would be the eighty.

Note: Math wizards may point out that four members divided by a total member count of fifteen is closer to thirty percent than twenty, but remember that this is a rule of thumb, and we all know that some thumbs are bigger or smaller than others.

To plot some data in a Pareto Chart, we’ll need a couple of pieces of information from it:

  • Each member’s respective total
  • The grand total
  • The running total at each member, sorted from largest to smallest
  • The percentage that running total represents compared to the grand total

Now that we understand what we’re shooting for, let’s get started.

If your data includes a running sum of the measurement for each member, sorted by the respective member’s measurement, then you’re golden and can skip to the section titled Add the Grand Total and Running Percent. Your data may include a Ranking column so you may be able to skip the respective steps in each of the following two sections. For the rest of you, keep reading. We’ll look at two approaches to getting the intermediate bits of data: Power Query (M), and DAX.

Create the Rank and Running Sum in Power Query

Let’s start with some simple data in Excel, in fact the same data used to generate the Excel Pareto chart we used to explain the concepts:

We’ll load this data (it’s in an Excel table called “Table1”) and edit it in the Power BI Query Editor. First, we need to sort the data by the [Measure] column, sorted descending. Click the down-arrow next to the Measure column title and select Sort Descending.

Next, on the Add Column menu, select Index Column. Keep the defaults of Starting Index of 1 and Increment of 1.

I renamed my column to [Power Query Rank] to differentiate it from ranking step we’ll introduce in the model later via DAX.

Next, we’ll add the running total as a Custom Column with a formula as shown below:

Hint: If you can’t read the formula from the screen shot, it is:

= Table.Range ( #”Renamed Columns”, 0, [Power Query Rank] )

Attribution should go to Sam Vanga and SQL Server Central for this bit of M code: http://www.sqlservercentral.com/blogs/samvangassql/

The Power Query function Table.Range can be explained like this: Given a table of data, in our case the last of our query steps, a.k.a. #”Renamed Columns”, start at the 0 row (top), and go down the number of rows represented by the value in column [Power Query Rank]. The result is a table associated with each row in the query. The first row of the query has a table with one row of data in it. The second row has a table with two rows, and so forth. This table is represented by the word “Table” on each row of the column we just added.

From here, click on the ‘expand’ arrow in the column header and select the Aggregate radio button, check off the “Sum of Measure” column, and un-check “Use original column name as prefix”:

I renamed the resulting column [Power Query Running Total] (not shown).

Click Close and Apply on the Home menu.

Create the Rank and Running Sum in DAX

As with all things Microsoft, there is more than one way to accomplish a goal. In our case, the goal is to get the running total, and just like before, we’ll need the ranking first. For this exercise, we’ll be using DAX instead of Power Query, but should get the same results.

Create a column with the formula as follows:

DAX Rank = RANKX (All ( Table1 ), [Measure] )

Next, create the DAX Running Total measure as:

DAX Running Total =

CALCULATE (

SUM ( Table1[Measure] ),

FILTER (

ALLSELECTED ( Table1 ),

Table1[DAX Rank] <= MAX ( Table1[DAX Rank] )

)

)

This DAX formula does pretty much the same thing as the Power Query Range.Table function above, the only difference is that it includes the aggregate within, eliminating the need for an extra column.

Note: Know the difference between Columns and Measures in DAX. Mistaking the two will cause error, frustration, and hair loss.

Plotting all these columns and measures on a simple table visual shows that Power Query and DAX come up with the same answers for Rank and Running Total, a good sanity check. Also, the ranks are easy to verify as to accuracy, and with a little mental math, running totals are as well. I had to re-format some of the numbers to make them show without decimals.

Add the Grand Total and Running Percentage

There’s two more pieces we need: [Grand Total] which is self-explanatory, and [Running Percent], which is the ‘percentage of the [Running Total] compared to the [Grand Total]’. These can only be done in DAX. Add a measure as follows:

Grand Total = CALCULATE ( SUM ( Table1[Measure] ) , ALL ( Table1 ) )

This calculates the Grand Total and makes it available at every slice (row of each Member).

Now add the last item, a column with the expression:

Running Percent = [Power Query Running Total] / [Grand Total]

Or:

Running Percent = DIVIDE ( [Power Query Running Total] , [Grand Total] )

Note: The column [DAX Running Total] would work just as well as its Power Query equivalent since we know it has the same number.

Format this last one as a percent.

Create the Chart

Now the fun part. For this we’ll need either a “Line and Stacked Column Chart” or a “Line and Clustered Column Chart”. This is the easiest part of the whole exercise:

  • The Shared Axis is the [Member] column (“One”, “Two”, “Three”, etc.)
  • The Column values is the [Measure] column
  • The Line values in the [Running Percent] column

Like I said, simple if you have all of the data pieces in front of you.

Need help getting the right data pieces? Not sure what charts you can generate from the data pieces you have? There’s probably a way to get to where you want to be. Reach out to our team of data scientists at BlumShapiro Consulting to learn more about how data can help guide your organization into the future.

T-SQL Inner Join Need Not Always Be an Equal Join

Have you ever had to write a T-SQL query that joined a table to itself? The classic example is an employee table that lists the ID of the person’s supervisor or boss, which refers to another row in the same table. This example is nothing like that (Hint: If you do have to do that, do a search for “recursive hierarchy”). Our example today is something different.

In our example we’re going off of the following scenario: given a list of cities on a map, generate a table that can record the distance between any two. In other words, create a list where every value in the table is paired with every other value. For this exercise, we’re not going to be calculating the actual distance, simply creating the records and leaving the space for someone else to fill in. Sounds simple enough. Let’s start with the simplest example: TWO points: New York and San Francisco:

We could easily create a simple three column table with the following entries:

But the distance in one direction is the same as the return trip so really, we only need one row. After deleting the redundant record and adding Chicago to our list of cities, we get this:

 

As you can see, with the addition of just one value to our list of cities, we need two additional rows in the Distances table. As you might expect, the fourth city in the first table would require three more rows in the second. The fifth city would require four more and so on. The growth is almost exponential. At 100 cities, we would need 4950 pairs of cities in the distance table, and at 1,000 we would need close to half a million. The actual number is

{ N * ( N – 1 ) / 2 }.

Time to automate, which is the focus of this article.

Let’s settle on a total of five cities, with the last two being New Orleans and Las Vegas. We could easily build the grid below and fill in any of the appropriate blocks. The black squares are same-city pairs (New York to New York, or Chicago to Chicago, etc.) so we don’t need to worry about them. And if we can get the distance from New York to San Francisco, for example, we don’t need the distance from San Francisco back to New York, so we can also eliminate all the squares marked in gray:

Let’s replace our city names with numbers. After all, any decent database design will have a unique key assigned to each row of the City table.

And our grid now looks like this:

 

So how do we create this mega list of relationships from ONE table? An INNER JOIN of the City table to itself on the City ID column will generate the squares marked in black. But that is exactly what we DON’T want. What about a CROSS JOIN?

That will do it, but with over twice the results we need. Remember that once we know the distance from, say, Chicago to Las Vegas, we don’t also need to record the distance from Las Vegas to Chicago. And we’re still left with some extra rows where the cities are the same, like the distance from New Orleans to New Orleans.

Comparing the City ID columns in the result set above with the City ID axis of the grid, we can see that we’re only looking for pairs (cells) where the ID of City 2 is greater than the ID of City 1. From here, we have two choices: Stick with the CROSS JOIN and add a WHERE clause, or go back to the INNER JOIN and change the ON clause:

The results are the same. In fact, the estimated execution plan from the query optimizer is the same as well.

The point of this whole exercise is this: An INNER JOIN need not always be an EQUAL JOIN, though that is by far the most common occurrence. When faced with unique query needs, think outside the box. In this case, simple substituting a GREATER THAN for the EQUAL in the ON clause did the trick.

Learn more about how BlumShapiro Consulting can help with a project like this and additional application development.

About Todd: Todd Chittenden started his programming and reporting career with industrial maintenance applications in the late 1990’s. When SQL Server 2005 was introduced, he quickly became certified in Microsoft’s latest RDBMS technology and has added certifications over the years. He currently holds an MCSE in Business Intelligence . He has applied his knowledge of relational databases, data warehouses, business intelligence and analytics to a variety of projects for BlumShapiro since 2011.