Discover the Question with Azure Machine Learning

June 30, 2015   //   Business Intelligence


I’ll be the first to say that traditional/old school business intelligence does a great job of providing answers to the questions you know you want to ask. What if you don’t know the questions? What if you are more interested in the questions than you are in the answers? Enter Azure Machine Learning!

Azure Machine Learning was launched last summer to much fanfare and marketing hype. Since then, its feature set and user adoption have continued to grow, and we are finding that almost every organization has hidden opportunities that Azure Machine Learning can help uncover.

Radiological Society of North America (RSNA) Evaluates Azure Machine Learning

Recently we had a discussion with Spencer Moore, Director of Information Management at RSNA, about how well Azure Machine Learning fits their machine learning needs. After an insightful discussion of requirements and pain points, we elected to take a sample dataset and put it through its paces in a Machine Learning Proof of Concept (POC). In the POC we decided to focus on the following areas:

  • The outputs/outcomes should be easily consumed by visualization tools such as Tableau
  • Demonstrate ease of data migration from disparate sources to the Azure ecosystem and consumption by Machine Learning experiments/models
  • Evaluate text mining capabilities
  • Evaluate membership churn capabilities

Named Entity Recognition

One of the datasets we got from RSNA was a large export of emails, customer service comments, and phone conversation debriefs. Most of the text conversations were more than two sentences long, which made this a perfect dataset for evaluating Named Entity Recognition (NER). NER is a text mining task in Azure Machine Learning Studio that can interpret a block of text and automatically parse out names, organizations, and places. What’s even more useful is that the task keeps parsing through the whole block and can find many entities in a single block of text. Imagine: all of these nouns, or entities, have been parsed from your text data, and you now have the ability to discover some of those unknown questions. For example, “Did you know that Canada is the number one location that customer service representatives (CSRs) reference in conversation debriefs?”
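To make that kind of discovery concrete, here is a minimal Python sketch of the counting step. It assumes the parsed entities are already available as rows of (record id, mention, entity type); the sample rows and the PER/LOC/ORG labels are invented for illustration and are not RSNA data.

```python
from collections import Counter

# Hypothetical parsed-entity rows: (source record id, mention, entity type).
# The values are invented for illustration and are not RSNA data.
entity_rows = [
    (1, "Canada", "LOC"),
    (1, "Jane Smith", "PER"),
    (2, "Canada", "LOC"),
    (2, "Chicago", "LOC"),
    (3, "Canada", "LOC"),
    (3, "Acme Imaging", "ORG"),
]


def top_mentions(rows, entity_type, n=3):
    """Return the n most frequently mentioned entities of one type."""
    counts = Counter(
        mention for _, mention, etype in rows if etype == entity_type
    )
    return counts.most_common(n)


# Which locations come up most often in the parsed text?
print(top_mentions(entity_rows, "LOC"))
```

A simple ranking like this is often enough to surface a question nobody thought to ask, such as why one location dominates the conversation debriefs.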

Setting Up the Experiment

NER experiments are fairly easy to build. Our experiment has the following steps:

  1. We created a data reader that loads the text data from an Azure SQL database. We could also have used a text file, blob storage, or an HDInsight Hadoop table as a source.
  2. We only want to extract named entities from text so in the second step we removed the non-text columns from the incoming dataset.
  3. The NER task only needs an incoming dataset and an output dataset. The relationship between the two is one-to-many: one incoming record produces zero or more output records, depending on how many persons, places, or organizations are recognized in its text.
  4. The parsed entities are written to a table in Azure SQL, where we can analyze the parsed data with ad hoc reporting tools (Excel, Power BI, Tableau, etc.)
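
The shape of steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration of the one-to-many output, not the Azure Machine Learning task itself: the lookup table stands in for a real recognizer, and the record fields and function names are hypothetical.

```python
# Toy lookup table standing in for a real NER model. This is an assumption
# for illustration; the Azure ML task uses its own recognizer.
KNOWN_ENTITIES = {
    "Canada": "LOC",
    "Chicago": "LOC",
    "Acme Imaging": "ORG",
}


def select_text(records, text_column):
    """Step 2: keep only the record id and the text column."""
    return [(rec["id"], rec[text_column]) for rec in records]


def extract_entities(text_rows):
    """Step 3: emit one output row per entity found, so each input row
    yields zero or more output rows (a one-to-many result).

    Only the first occurrence of each known entity is reported.
    """
    output = []
    for rec_id, text in text_rows:
        for mention, etype in KNOWN_ENTITIES.items():
            offset = text.find(mention)
            if offset >= 0:
                output.append({
                    "record_id": rec_id,
                    "mention": mention,
                    "offset": offset,
                    "type": etype,
                })
    return output


# Hypothetical input records with a non-text column ("agent") to drop.
records = [
    {"id": 1, "agent": "A17",
     "comment": "Member from Canada called about Acme Imaging."},
    {"id": 2, "agent": "B02",
     "comment": "Caller asked to reset a password."},
]

rows = extract_entities(select_text(records, "comment"))
for row in rows:
    print(row)
```

Here the first record produces two output rows and the second produces none, which is why the output table in step 4 is keyed back to the source record rather than matching it row for row.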


Next Steps

Our NER experiment was built to do named entity analysis in a batch process. In the future, we could expose the experiment as a web service and let other applications pipe their datasets into the experiment and get extracted entities as an output. The process of converting an experiment to a web service just requires a few clicks!
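
As a rough idea of what a calling application might do, the Python sketch below builds (but does not send) an HTTP request for such a web service. The URL, API key, column name, and the "input1" input name are placeholders, and the JSON layout is an assumption; the sample code shown on the web service's own page is the authoritative reference for the real payload.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_name, texts):
    """Build (but do not send) a POST request carrying rows of text.

    The payload layout below is an assumption for illustration.
    """
    payload = {
        "Inputs": {
            "input1": {
                "ColumnNames": [column_name],
                "Values": [[t] for t in texts],
            }
        },
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(
        url, data=body, headers=headers, method="POST"
    )


req = build_scoring_request(
    "https://example.invalid/score",  # placeholder, not a real endpoint
    "YOUR_API_KEY",                   # placeholder credential
    "comment",                        # hypothetical text column name
    ["Member from Canada called about renewal."],
)
# Sending it would look like: urllib.request.urlopen(req).read()
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any real endpoint or key is involved.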

Want to Learn More?

Azure Machine Learning can augment almost any BI system, and if you haven’t considered machine learning in your BI architecture, you should! If you would like more information about Azure Machine Learning or business intelligence in general, contact SWC to speak with one of our Microsoft Azure experts.

Download our guide, The Essential Guide to Data-Driven Decision Making, to explore real-world examples of how modern organizations are overcoming the barriers to becoming data-driven businesses and how you can leverage data analytics to grow.
