Have no fear: Microsoft’s transformation from “Windows and Office” company to “Cloud and Services” company continues to accelerate. Nowhere is this trend more evident than in the range of services supporting Internet of Things scenarios.
So – What are the Microsoft technologies that would comprise an Internet of Things solution architecture?
And – How do Cloud Computing and Microsoft Azure enable Internet of Things scenarios?
Here are the key Microsoft technologies which architects and developers need to understand.
Software for Intelligent Devices
First, let’s understand the Things. The community of device makers and entrepreneurs continues to flourish, enabled by the emergence of simple intelligent devices. These devices have a simplified, lightweight computing model capable of connecting machine-to-machine or machine-to-cloud. Windows 10 for IoT, released in July 2015, enables secure connectivity for a broad range of devices on the Windows Platform.
Scalable Event Ingestion
The Velocity of Big Data demands a solution capable of receiving telemetry data at cloud scale with low latency and high availability. This component of the architecture is the “front-end” of an event pipeline that sits between the Things sending data and the consumers of that data. Microsoft’s Azure platform delivers this capability with Azure Event Hubs, which is easy to set up and connect to over HTTPS.
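To make the “connect over HTTPS” point concrete, here is a minimal sketch of how a device might prepare a single telemetry event for Event Hubs using only the Python standard library. It assumes the commonly documented Shared Access Signature token format and the `/messages` REST endpoint; the namespace, hub, policy name and key are placeholders invented for illustration, and the content type should be checked against the current REST documentation before you rely on it.

```python
import base64
import hashlib
import hmac
import json
import time
import urllib.parse
import urllib.request


def make_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Build a Shared Access Signature token for the given resource URI."""
    issued_at = time.time() if now is None else now
    expiry = str(int(issued_at + ttl_seconds))
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    # The signed string is the URL-encoded URI, a newline, then the expiry.
    to_sign = (encoded_uri + "\n" + expiry).encode("utf-8")
    digest = hmac.new(key.encode("utf-8"), to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return (
        f"SharedAccessSignature sr={encoded_uri}"
        f"&sig={signature}&se={expiry}&skn={key_name}"
    )


def build_send_request(namespace, hub, key_name, key, event, now=None):
    """Prepare (but do not send) an HTTPS POST carrying one JSON event."""
    uri = f"https://{namespace}.servicebus.windows.net/{hub}"
    return urllib.request.Request(
        url=uri + "/messages",
        data=json.dumps(event).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": make_sas_token(uri, key_name, key, now=now),
            # Assumption: content type as shown in the REST "send event" docs.
            "Content-Type": "application/atom+xml;type=entry;charset=utf-8",
        },
    )


# Hypothetical namespace, hub, policy name and key, for illustration only.
request = build_send_request(
    "example-ns", "telemetry", "send-policy", "not-a-real-key",
    {"device": "dev-1", "temp_c": 21.5},
)
# urllib.request.urlopen(request) would transmit the event to a real hub.
```

Nothing is sent until the request is passed to `urlopen`, so the sketch can be inspected safely without an Azure subscription.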
Still – Volume + Velocity lead to major complexity when Big Data is consumed; the data may not be ready for human consumption. Microsoft provides options to analyze this massive stream of “Fast data”. Option 1 is to process the events “in-flight” with Azure Stream Analytics. ASA allows developers to combine streaming data with Reference Data (e.g. Master Data) to analyze events, defects, “likes” and summarize the data for human consumption. Option 2 is to stream the data to a massive storage repository for analysis later (see The Data Lake and Hadoop). Regardless of whether you analyze in flight or at rest, a third option can help you learn about what is happening behind the data (see Machine Learning).
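The “in-flight” idea of combining events with Reference Data and summarizing them can be sketched in a few lines of plain Python. This is not Stream Analytics or its query language, just an illustration of the concept over an in-memory list: each event is joined to a hypothetical device-to-production-line lookup, then defects are summed per fixed 60-second window. All field names and values are made up for the example.

```python
from collections import defaultdict

# Hypothetical reference data: device id -> production line.
REFERENCE = {"dev-1": "line-A", "dev-2": "line-A", "dev-3": "line-B"}


def summarize(events, reference, window_seconds=60):
    """Sum defects per (window start, line) over fixed, non-overlapping windows."""
    totals = defaultdict(int)
    for event in events:
        line = reference.get(event["device"])
        if line is None:
            continue  # events from unknown devices are dropped, like an inner join
        window_start = event["ts"] - event["ts"] % window_seconds
        totals[(window_start, line)] += event["defects"]
    return dict(totals)


events = [
    {"device": "dev-1", "ts": 5, "defects": 1},
    {"device": "dev-2", "ts": 42, "defects": 2},
    {"device": "dev-3", "ts": 59, "defects": 0},
    {"device": "dev-1", "ts": 61, "defects": 4},
    {"device": "dev-9", "ts": 70, "defects": 7},  # not in the reference data
]

summary = summarize(events, REFERENCE)
for (window_start, line), defects in sorted(summary.items()):
    print(window_start, line, defects)
```

Five raw events collapse into three summary rows, which is the kind of reduction that makes a fast stream readable by a person.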
Machine Learning
We’ve learned a lot about “Artificial Intelligence” over the past 10 years. Indeed, we’ve learned that machines “think” very differently from humans. Machines use principles of statistics to assess which features (“columns”) of a dataset provide the most “information” about a given observation (“row”). For example, which variable(s) are most predictive of (or most closely correlated with) the final feature of the dataset? Having learned how the features relate to one another, a machine can be “trained” to predict the outcome of the next record in the dataset. Given an algorithm and enough data, a machine can learn about the real world.
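As a rough illustration of asking which columns are most closely correlated with the final feature, the sketch below ranks features by the absolute value of their Pearson correlation with a target column. Correlation is only one simple stand-in for “information”, real tools use richer measures, and the tiny dataset and column names here are invented for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target):
    """Order feature names by how strongly they correlate with the target."""
    ys = [row[target] for row in rows]
    scores = {
        name: pearson([row[name] for row in rows], ys)
        for name in rows[0]
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical observations ("rows") with two features and a target column.
rows = [
    {"temperature": 60, "vibration": 3, "failures": 1},
    {"temperature": 70, "vibration": 1, "failures": 2},
    {"temperature": 80, "vibration": 4, "failures": 3},
    {"temperature": 90, "vibration": 2, "failures": 4},
]

ranking = rank_features(rows, target="failures")
print(ranking)
```

In this made-up data, temperature moves in lockstep with failures while vibration does not, so temperature ranks first.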
If the IoT solution you envision includes predictions or “intelligence”, you’ll want to look at Azure Machine Learning. Azure ML provides a development studio for data science professionals to design, test and deploy Machine Learning services to the Microsoft Azure Cloud.
Finally, you’ll also want to understand how to organize a data science project within the structure of your company’s overall project management processes. The term “Data Science” is telling – it indicates an experimental aspect to the process. Data scientists prepare datasets, conduct experiments, and test their algorithms (written in statistical processing languages like “R” and “Python”) until the algorithm accurately predicts correct answers to questions posed by the business, using data. Data Science requires a balance between experimentation and business value.
The Data Lake and Hadoop
“Data Lake” is a term for a single place where the huge variety of data produced by your big data initiatives is stored for future analysis. A Data Lake is not a Data Warehouse. A Data Warehouse has one single structure; data arriving in a variety of formats must be transformed into that structure before it is stored. A Data Lake has no predefined structure. Instead, the structure is determined when the data is analyzed. New structures can be created over and over again on the same data.
Businesses have the choice of simply storing Big Data in Azure Storage. If the data velocity and volume exceed certain limits of Azure Storage, Azure Data Lake is a specialized storage service optimized for Hadoop, with no fixed limits on file size. Azure Data Lake is a service announced in May 2015, and you can sign up for the Public Preview.
The ability to define a structure as the data is read is the magic of Hadoop. The premise is simple – Big Data is too massive to move from one structure to another, as you would in a Data Warehouse/ETL solution. Instead, keep all the data in its native format, wait to apply structure until analysis time, and perform as many reads over the same data as needed. There is no need to buy tons of hardware for Hadoop: Azure HDInsight provides Hadoop-as-a-Service, which can be enabled/disabled as needed to keep your costs low.
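The “apply structure at analysis time” idea can be shown without Hadoop at all. The sketch below keeps raw records as JSON lines in their native form and projects them onto two different structures at read time. It is a toy illustration of schema-on-read under assumed field names, not how HDInsight itself is programmed.

```python
import json

# Raw events kept exactly as they arrived; note that fields vary per record.
RAW_LINES = [
    '{"device": "dev-1", "ts": 5, "temp_c": 20.0, "fw": "1.0"}',
    '{"device": "dev-2", "ts": 9, "temp_c": 15.0}',
    '{"device": "dev-1", "ts": 65, "temp_c": 25.0, "fw": "1.1"}',
]


def read_with_schema(raw_lines, schema):
    """Project each raw record onto a schema chosen at read time.

    schema maps an output column to (source field, converter, default).
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            column: convert(record[field]) if field in record else default
            for column, (field, convert, default) in schema.items()
        }


# Two hypothetical structures applied to the same untouched data.
temperature_schema = {
    "device": ("device", str, None),
    "temp_f": ("temp_c", lambda c: c * 9 / 5 + 32, None),
}
firmware_schema = {
    "device": ("device", str, None),
    "firmware": ("fw", str, "unknown"),
}

temperatures = list(read_with_schema(RAW_LINES, temperature_schema))
firmware = list(read_with_schema(RAW_LINES, firmware_schema))
print(temperatures)
print(firmware)
```

The raw lines are never rewritten; each analysis simply reads them again with whatever structure it needs.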
Real Time Analytics
The human consumption part of this equation is represented by Power BI. Power BI is the “single pane of glass” for all of your Data Analysis needs, including Big Data. Power BI is a dashboard tool capable of transforming company data into rich visuals. It can connect to data sources on premises, consume data from HDInsight or Storage, and receive real-time updates from data “in-flight”. If you are located in New England, attend one of our Dashboard in a Day workshops happening throughout the Northeast in 2015.
IoT solutions are feasible because of the robust cloud offerings currently available. The cloud is an integral part of your solution, and you need resources capable of managing your cloud assets as though they were on premises. Your operations team should be as comfortable turning services on and off in your cloud as they are enabling services and capabilities on a server. Azure PowerShell provides the operations environment for managing Azure cloud services and automating their maintenance.
Enterprises ready to meet their customers in the digital world will be rewarded. First, they must grasp Big Data technologies. Microsoft customers can take advantage of the Azure cloud to create Microsoft Big Data solutions. These solutions are built by first connecting Things to the cloud, then creating and connecting Azure services to receive, analyze, learn from, and visualize the data. Finally, be ready to treat those cloud assets as part of your production infrastructure by training your operations team in cloud management tools from Microsoft.