Discover the Question with Azure Machine Learning
I’ll be the first to say that traditional/old school business intelligence does a great job of providing answers to the questions you know you want to ask. What if you don’t know the questions? What if you are more interested in the questions than you are in the answers? Enter Azure Machine Learning!
Azure Machine Learning was launched last summer to much fanfare and marketing hype. Since then, its feature set and user adoption have continued to grow, and we are finding that almost every organization has hidden opportunities that Azure Machine Learning can help uncover.
Radiological Society of North America (RSNA) Evaluates Azure Machine Learning
Recently we had a discussion with Spencer Moore, Director of Information Management at RSNA, about whether Azure Machine Learning was a fit for their machine learning needs. After an insightful discussion on requirements and pain points, we elected to take a sample dataset and put it through its paces in a Machine Learning Proof of Concept (POC). In the POC we decided to focus on the following areas:
- The outputs/outcomes should be easily consumed by visualization tools such as Tableau
- Demonstrate ease of data migration from disparate sources to the Azure ecosystem and consumption by Machine Learning experiments/models
- Evaluate text mining capabilities
- Evaluate membership churn capabilities
Named Entity Recognition
One of the datasets we got from RSNA was a large export of emails, customer service comments, and phone conversation debriefs. The dataset was fairly large and most of the text conversations were more than two sentences long. This turned out to be a perfect dataset for evaluating NER, or Named Entity Recognition. NER is an actual text mining task in Azure Machine Learning Studio that can interpret a block of text and automatically parse out names, organizations, and places. What’s even more useful is that the task does this parsing recursively and can find many entities in a single block of text. Imagine: all of these nouns or entities have been parsed from your text data, and you now have the ability to discover some of those unknown questions. For example, “Did you know that Canada is the number one location that CSRs reference in conversation debriefs?”
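To make the idea concrete, here is a minimal Python sketch of what "parse the entities, then look for the question" can look like. It is not the Azure Machine Learning NER task: the `GAZETTEER` lookup, the helper names, and the sample debriefs are all hypothetical stand-ins, and a real recognizer uses a trained model rather than a fixed word list.

```python
import re
from collections import Counter

# Hypothetical gazetteer: a tiny hand-made lookup standing in for a trained NER model.
GAZETTEER = {
    "Canada": "LOC",
    "Chicago": "LOC",
    "RSNA": "ORG",
    "Jane Doe": "PER",
}


def extract_entities(text):
    """Return (mention, type, offset) for every gazetteer hit, in reading order."""
    hits = []
    for mention, entity_type in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(mention) + r"\b", text):
            hits.append((mention, entity_type, match.start()))
    return sorted(hits, key=lambda hit: hit[2])


def top_locations(texts, n=1):
    """Count LOC mentions across all texts and return the n most common."""
    counts = Counter(
        mention
        for text in texts
        for mention, entity_type, _ in extract_entities(text)
        if entity_type == "LOC"
    )
    return counts.most_common(n)


# Made-up debriefs for illustration only; not RSNA data.
debriefs = [
    "Jane Doe called from Canada about her RSNA membership renewal.",
    "Member is moving from Chicago to Canada and asked to update the address.",
    "No follow-up needed.",
]

print(top_locations(debriefs))
```

One block of text can yield several entities and another can yield none. Once the mentions are in a countable form, a question like "which location comes up most often?" falls out of a simple tally.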
Setting Up the Experiment
NER experiments are fairly easy to build. Our experiment has the following steps:
- We created a data reader that loads the text data from an Azure SQL database. We could also have used a text file, blob storage, or an HDInsight Hadoop table as a source.
- We only want to extract named entities from text so in the second step we removed the non-text columns from the incoming dataset.
- The NER task only needs an incoming dataset and an output dataset. The relationship between them is one-to-many: one incoming record produces zero or more output records, because a paragraph of text can contain many recognized persons, places, or organizations.
- The parsed entities are written to a table in Azure SQL where we can analyze the parsed data with ad hoc reporting tools (Excel, Power BI, Tableau, etc.)
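The four steps above can be sketched in plain Python to show the shape of the data as it moves through the experiment. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for Azure SQL, the `KNOWN_ENTITIES` lookup stands in for the NER task, and the source rows, column layout, and function names are hypothetical.

```python
import re
import sqlite3

# Step 1 (stand-in): hypothetical source rows as (id, created_on, channel, comment).
SOURCE_ROWS = [
    (1, "2015-03-02", "email", "Jane Doe wrote from Canada about RSNA dues."),
    (2, "2015-03-03", "phone", "Caller asked for a receipt."),
    (3, "2015-03-04", "email", "Forwarded to the Chicago office."),
]

# Hypothetical lookup standing in for the trained NER model.
KNOWN_ENTITIES = {"Jane Doe": "PER", "Canada": "LOC", "Chicago": "LOC", "RSNA": "ORG"}


def select_text(rows):
    """Step 2: keep only the record id and the free-text column."""
    return [(row[0], row[3]) for row in rows]


def recognize(article_id, text):
    """Step 3: emit zero or more (article, mention, type) rows per input record."""
    found = []
    for mention, entity_type in KNOWN_ENTITIES.items():
        for match in re.finditer(r"\b" + re.escape(mention) + r"\b", text):
            found.append((match.start(), article_id, mention, entity_type))
    # Sort by position in the text, then drop the offset.
    return [row[1:] for row in sorted(found)]


def run_batch(rows, connection):
    """Step 4: write the parsed entities to a table for ad hoc reporting."""
    connection.execute(
        "CREATE TABLE entities (article INTEGER, mention TEXT, type TEXT)"
    )
    for article_id, text in select_text(rows):
        connection.executemany(
            "INSERT INTO entities VALUES (?, ?, ?)", recognize(article_id, text)
        )
    connection.commit()


connection = sqlite3.connect(":memory:")
run_batch(SOURCE_ROWS, connection)

# Record 2 has no recognized entities, so it contributes no rows to the output.
per_article = connection.execute(
    "SELECT article, COUNT(*) FROM entities GROUP BY article ORDER BY article"
).fetchall()
print(per_article)
```

Carrying the source record id into every output row is what keeps the one-to-many output usable: a reporting tool can join the entity table back to the original comments.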
Our NER experiment was built to do named entity analysis in a batch process. In the future, we could expose the experiment as a web service and let other applications pipe their datasets into the experiment and get extracted entities as an output. The process of converting an experiment to a web service just requires a few clicks!
Want to Learn More?
Azure Machine Learning can augment almost any BI system, and if you haven’t given consideration to machine learning in your BI architecture, you should! If you would like more information about Azure Machine Learning or business intelligence in general, contact SWC or call 630.572.0240 to speak with one of our Microsoft Azure experts.
If you enjoyed this post from Chad, please read a few of his past posts on business intelligence:
Predictive Analytics Made Easy with Microsoft Azure Machine Learning
Microsoft Azure Machine Learning – A Data Tool for the Masses
R You Looking to Integrate with Tableau?
Is My BI Fat? Database Partitioning Can Help!
It’s The Data Quality
Have We Forgotten How to Fix Problems?
Do Tableau And MDS Make Strange Bedfellows?