Archive for Predictive Analytics

3 Tips to Jump Start your Data Science Plan

Are you looking to form a Data Science capability at your company?

If you answered yes, then you probably already get the Machine Learning concept (see “The 4 Machine Learning Problems, Explained”).  Maybe you are coming from a Statistics or Computer Science background.  Either way, you see the potential of Data Science and Predictive Analytics and you’re ready to demonstrate some tangible benefit to management.

How are you getting started?  I’m hearing about two core hurdles:

  1. We’re looking for a great business problem to solve, one which could reasonably be solved with data the business already collects
  2. Our internal resources have very little practical experience working on a formal data science team, and don’t understand how such a team aligns with more traditional project teams

Time to Value is critical, but you need to achieve it through a formal process for managing risk, one which can be communicated inside and outside the team.  Here are the things you want to have in place in order to launch your first project.

Establish Your Data Science Methodology – every project has a project plan, and data science projects are no different.  What should the Data Science one look like?  Several teams of very smart people have already asked this question and independently arrived at the same conclusion.  My favorite is the “Cross Industry Standard Process for Data Mining” (CRISP-DM) because it calls out the need for basic Business Understanding of the problem first.  Basically, there are 6 phases in the process:

  1. Set the Business Objectives
  2. Find the Data
  3. Prep and Cleanse the Data
  4. Do the Machine Learning Work
  5. Evaluate the Model You Created – does it meet the Business Objectives?
  6. Deploy the Model

Need a picture?  In the CRISP-DM process diagram, note the backwards arrows: Data Science is an iterative process.

Assess your Data Capabilities – Data Science needs Data.  Teams that try to predict outcomes without relevant data are set up for failure.  An example: let’s say that you would like to forecast demand for your products in order to reduce your inventory.  You might start with basic sales data and find that you are not getting the level of prediction accuracy you expected.  What other factors might be driving demand?  Customer Satisfaction might be one you decide to include.  But what if your company is not measuring customer satisfaction in any quantifiable way?  Data Science leaders need to understand the capabilities of their company (in effect, the Data Science customer) with respect to data assets, in order to effectively determine which business problems are ripe for prediction.

Outsource the Team – Data Science requires a very specialized set of skills.  You probably have some of those skills yourself: Computer Science, Statistics and an understanding of the principles behind Machine Learning.  These three are important, but equally important is Business and Domain Knowledge.  Do you have a team whose members possess all four?  If you are working with a technology provider who already understands your business and who also has demonstrated capability in delivering data science value, then outsourcing the work to that team becomes very attractive.  If you don’t have such a resource, consider a business and technology consulting partner such as Blum Shapiro Consulting.  Provided you already understand the CRISP-DM process, you’ll be able to effectively manage a seasoned team of business and data science pros.

Can Data Science increase your bottom line?  Improve Customer Loyalty?  Drive down costs?  Yes, it can, provided you have a methodology to manage the work as a project, data to support it and a capable team.  If you’re convinced the opportunity is there, follow these tips and Data Science will have a strategic role within your company after your first big win!



The 4 Machine Learning Problems, Explained

Machine Learning and Predictive Analytics have been receiving a lot of attention lately!  Without question, this is an exciting technology with extremely broad applicability.  After all, who wouldn’t want to be able to predict the future?  Still, with hype comes confusion, and there is a lot of confusion today about what exactly Machine Learning is and how to use it.

I have good news!  There are really only 4 (yes, four) Machine Learning problems.  For anyone who wants to explore the value of Machine Learning, it’s important to understand them, because the first step in any Machine Learning process is to figure out which of these problems you are trying to solve.  Data Science teams address this question before they begin designing a Machine Learning model.  If your problem does not fit into one of these buckets, forget the hype! You’re better off taking a simpler approach.

Classification – in this machine learning problem, we’re trying to figure out if some bit of data (an observation) represents something simple which we already understand (a Label).  This label can be either a Yes or No decision (Two Class) or one of a set of possible answers (Multi Class).  In order for this to work well, you need to provide the Machine Learning model with labeled examples first.  Applications include:

  1. Facial Recognition – is this picture an image of my customer?
  2. Voice Recognition – what word is represented by this sound?
  3. Handwriting Recognition – which letter in the alphabet does this image represent?
  4. Fraud Detection – is this transaction fraudulent?
  5. Medical Outcomes – will this person have a stroke in the next year?
  6. Proactive Maintenance – will this piece of machinery fail in the next 72 hours?
  7. Credit Default Risk – will this borrower default on his/her loan?
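
To make the idea of learning from labeled examples concrete, here is a minimal sketch of a two-class classifier for the Fraud Detection case.  It uses a simple nearest-neighbor vote, which is only one of many possible techniques.  The function name `knn_predict`, the two features (amount and hour of day) and all of the data are made up for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(examples, observation, k=3):
    """Return the majority label among the k labeled examples closest to the observation."""
    nearest = sorted(examples, key=lambda ex: dist(ex[0], observation))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled examples: (transaction amount in dollars, hour of day) -> label.
# A real project would scale the features so that one does not dominate the distance.
labeled_examples = [
    ((12.0, 14), "legitimate"),
    ((25.0, 10), "legitimate"),
    ((18.0, 16), "legitimate"),
    ((950.0, 3), "fraudulent"),
    ((1200.0, 2), "fraudulent"),
    ((875.0, 4), "fraudulent"),
]

# Classify a new, unlabeled observation.
print(knn_predict(labeled_examples, (1000.0, 3)))
```

The new observation receives the label shared by most of its nearest labeled neighbors, which is why the examples have to be provided first.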

Regression – in this machine learning problem, a Yes or No answer is not going to be enough.  In order to solve this problem, the machine needs to predict a value (e.g. a price, a temperature, a measurement) by understanding the numeric relationship of that value to other values (or Factors).  If you took Calculus, this might sound like a simple “Rate of Change” function: you’re on the right track.  Just as with Classification, Regression problems need some examples in order to work well.  Applications include:

  1. Cost Analysis – when will be the best time to buy something?
  2. Demand Prediction – how many widgets will we sell next year?
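
As a rough illustration of the “Rate of Change” idea, the sketch below fits a straight line to a handful of examples using ordinary least squares and then predicts a value for a new input.  The single factor (advertising spend), the function name `fit_line` and the numbers are assumptions chosen for the example; real demand models usually involve many factors.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # the "rate of change" of y with respect to x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical examples: advertising spend (in $ thousands) and widgets sold.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
widgets_sold = [120.0, 140.0, 160.0, 180.0, 200.0]

slope, intercept = fit_line(ad_spend, widgets_sold)

# Predict demand for a spend level we have not seen yet.
forecast = slope * 6.0 + intercept
print(slope, intercept, forecast)
```

The fitted slope is the numeric relationship between the factor and the value being predicted; the forecast simply applies that relationship to a new input.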

Clustering – this is where things get complicated (!!).  With the first two problems, we have examples we can use to “train” our machines to predict a label or a value, AND we can test them with labeled observations (known to Data Scientists as “Ground Truth”).  But what if we don’t have a ground truth?  The best we can do is identify clusters of similar observations.  Fair warning: without ground truth, evaluating the results will be a challenge.  Still, some applications include:

  1. Grouping of Content – Grouping Today’s News into Categories, or Documents into Topics
  2. Materials Classification – take a Raw Materials Master File and organize it into a taxonomy
  3. Customer Segmentation – identify similar customers based upon purchase behavior
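
For the Customer Segmentation case, a minimal sketch of clustering without any labels might look like the following.  It uses k-means, one common clustering technique, with a deliberately naive starting point.  The function name `kmeans`, the two features and the customer data are all hypothetical.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (cluster index per point, centroids)."""
    # Naive deterministic start: use the first k points as initial centroids.
    centroids = [points[i] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Step 1: assign every point to its closest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Step 2: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids


# Hypothetical customers: (orders per year, average order value in dollars).
customers = [(2, 20), (3, 25), (2, 22), (40, 200), (42, 210), (38, 190)]

assignments, centroids = kmeans(customers, k=2)
print(assignments)
```

Notice that the output is only a cluster number per customer.  Nothing in the data says what each cluster means or whether two clusters was the right choice, which is exactly the evaluation challenge described above.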

Recommender – have you ever been on a website which presented a recommendation of something you might “Like”?  Movie recommendations on Netflix, product recommendations on Amazon, or advertisements on your apps – if you are familiar with the internet, you probably understand the premise here.
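
One simple way to picture the premise is “people who liked what you liked also liked this.”  The sketch below scores unseen items by how much the people who like them overlap with the target user.  It is a toy illustration under made-up data, not a description of how Netflix or Amazon actually build their recommenders; the function name `recommend` and the users are hypothetical.

```python
from collections import Counter


def recommend(likes_by_user, user, top_n=2):
    """Suggest items liked by people whose likes overlap with the given user's likes."""
    own = likes_by_user[user]
    scores = Counter()
    for other, items in likes_by_user.items():
        if other == user:
            continue
        overlap = len(own & items)  # number of shared likes, used as a crude similarity
        for item in items - own:    # only consider items the user has not liked yet
            scores[item] += overlap
    return [item for item, score in scores.most_common(top_n) if score > 0]


# Hypothetical "likes" data.
likes_by_user = {
    "ana": {"Alien", "Blade Runner", "Arrival"},
    "ben": {"Alien", "Blade Runner", "Dune"},
    "cho": {"Alien", "Dune", "Interstellar"},
    "dee": {"Notting Hill", "Amelie"},
}

suggestions = recommend(likes_by_user, "ana")
print(suggestions)
```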

That’s it.  Now you know how to recognize a problem which Machine Learning can help you with.   If your business problem does not fall into one of these four, you don’t need a machine learning model to solve it.  More importantly, if you know the factors which drive a business outcome, just build a model in Excel – you don’t need a Data Science team for that.

Good luck!

It’s Easy to Assess and Share Your Project Portfolio Health


Successful companies continually improve the way in which they envision, manage and execute their internal projects.  Companies which execute their projects effectively (on-time and on-budget) have a tremendous advantage in the marketplace, provided they are doing the RIGHT projects.

But we hear a common pain from many of our clients around reporting and insights: how can we monitor the overall health of our Project Portfolio without getting mired in Resource constraints and “Eye-Chart” style reporting?  Executives need continuous insight into how their portfolio is performing, and to do this they need to calculate the ROI and Health of each project individually and then aggregate.  They also frequently need to consider Resource Availability, Schedule Constraints and Costs when assessing the health of any given project.
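
To show what “calculate per project, then aggregate” can mean in practice, here is a small sketch.  The `Project` fields, the red/yellow/green thresholds and the figures are assumptions made for illustration; they are not taken from Microsoft Project Online or from any particular client's rules.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    cost: float                # total expected cost
    expected_benefit: float    # total expected benefit
    days_behind_schedule: int


def roi(project):
    """Return on investment: net benefit divided by cost."""
    return (project.expected_benefit - project.cost) / project.cost


def health(project):
    """Hypothetical traffic-light rule combining schedule and ROI."""
    if project.days_behind_schedule > 30 or roi(project) < 0:
        return "red"
    if project.days_behind_schedule > 0:
        return "yellow"
    return "green"


def portfolio_summary(projects):
    """Aggregate the individual projects into one portfolio-level view."""
    total_cost = sum(p.cost for p in projects)
    total_benefit = sum(p.expected_benefit for p in projects)
    return {
        "roi": (total_benefit - total_cost) / total_cost,
        "health": dict(Counter(health(p) for p in projects)),
    }


# Made-up portfolio.
portfolio = [
    Project("CRM upgrade", 100_000, 150_000, 0),
    Project("Warehouse move", 200_000, 220_000, 12),
    Project("Legacy rewrite", 100_000, 80_000, 45),
]

summary = portfolio_summary(portfolio)
print(summary)
```

The portfolio ROI here is computed from total benefit and total cost rather than by averaging the individual ROIs, so that large projects weigh more heavily than small ones.  That is a design choice, and a different weighting may suit a given PMO better.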

We have been working with our Project and Portfolio Management clients to help them build dashboards and visualizations of project data managed in Microsoft Project Online.  Microsoft Project Online is a Cloud Hosted Software-as-a-Service (SaaS) solution for Project Management Offices (PMOs), Project Managers and Project Teams to formally manage ALL of the projects in their portfolio in a disciplined manner.  Blum Shapiro’s Project and Portfolio Management practice helps companies improve their Project Management capabilities, often with technology tools such as Microsoft Project Online.  You can learn more about our Project and Portfolio Management practice here.

Microsoft Power BI is uniquely suited to the building of these dashboards and visualizations, because it is also a SaaS type of product – which means you don’t need to purchase BI servers or make room in your data center.  That would make this into another project!  Further, it is designed to connect to, shape and model ALL data you may need, whether this data is On-Premises, in the Microsoft Cloud, or in another Cloud.  Finally, it’s really, really EASY to get started.

Power BI comes in two licensing offers: Free and Pro.  The Free Edition works well for individuals or departments who want to build personal dashboards for themselves, simply to keep track of or analyze important data for which they are responsible.  We’re running workshops throughout the fall called Dashboard in a Day where we help clients connect to cloud datasets (such as Project Online) and build themselves a simple set of reports and dashboards. Truth in advertising: it takes less than a day.

However, collaboration is valuable.  Nobody wants to hold status meetings with Senior Management at their desk or even be tied to a projector.  Therefore, we recommend that companies with a clear Project Management directive upgrade to the Pro Level ($10/user/month), and the biggest reason is to take advantage of Content Packs.  A Content Pack is a collection of pre-built Datasets, Reports and Dashboards which can be published and shared across an entire enterprise.  As easy as it is to connect and consume data in Power BI, some knowledge of the source systems is extremely helpful.  Since not everyone needs to understand how data is stored in Microsoft Project Online, let’s arrange and model the data for our colleagues and direct reports, then share our insights.

Blum Shapiro Consulting offers a Power BI Content Pack which contains pre-built reports and dashboards for Microsoft Project Online.  All that is required is for us to change the web address of our data sources from ours to yours, publish the content pack to your Power BI tenant, and your organization will have instant visibility into the project portfolio.

In order to get this instant visibility, users would follow these simple steps:

  1. Sign up for Power BI Pro.  Once signed up with an Organizational Account, users will have their own dashboard viewer.
  2. In the lower right hand corner of the application, click Get Data.  [Screenshot: Power BI Get Data]
  3. Under My Organization, click Get.  [Screenshot: Power BI Content Pack Library]
  4. Select one of the Content Packs (in this case, Project Online Customer Immersion Experience) and click Connect.  [Screenshot: POL Content Pack]

After about 10 seconds, users will have a prebuilt set of Reports and Dashboards to view.  [Screenshot: POL Dashboard]

Before we leave, we’ll help you set up an Hourly Refresh Schedule on the data from Project Online.  That way, the dashboards you share will always be up to date.

Contact us to learn more about Power BI Content Packs, Microsoft Project Online or our Dashboard in a Day workshops.