Machine learning is behind more and more of the technology we use every day – and it's not just voice recognition in Kinect and Cortana or Microsoft's futuristic language translation in Skype.
Every time you get directions from your GPS or make a credit card transaction or search for a product online, machine learning is predicting the best route, working out whether you're likely to be using a stolen credit card and suggesting what else you might like to buy.
So far you've had to be a company with the resources of Amazon or Yahoo to take advantage of machine learning. With its new Machine Learning (ML) Studio service running on Azure, Microsoft is hoping to open it up to anyone who understands statistics – and make it easy to use the predictions from the machine learning models you come up with in the apps where they will be most useful.
"I'm building this to be easy enough for a high school student to use," Microsoft corporate vice president Joseph Sirosh told TechRadar – and he knows how hard machine learning can be, having built Amazon's recommendation engine.
With ML Studio, Microsoft is giving businesses access to the tools it uses internally. Microsoft has been working on machine learning for two decades, Sirosh points out. "It's integrated into Bing, into Xbox, into the fabric of most key products we have – including Cortana. We have a tremendous amount of experience with machine learning and how to do it at internet scale and we're bringing a lot of that experience into the product."
You will still need to be a data scientist or at least have mathematics and statistics experience to get the most from the service, which you'll be able to try out in preview next month. But that's not what made McKinsey say businesses can't find the hundreds of thousands of data scientists they want to employ.
"It's not that people don't exist with the math knowhow; every graduate in engineering or mathematics or statistics will have some of the background to be productive data scientists and machine learning people," says Sirosh. It's that they haven't had good, fast, cheap, simple tools to work with.
"Today data scientists have to know so many complex tools; they have to be both a data engineer and a mathematician to get things done," he explains, adding that they might need 10 different packages to try enough machine learning models to solve one problem.
"And those tools are extremely expensive and have a big learning curve; you have to be spending a huge amount of time to get productive with them. It's a huge stumbling block.
"What we're doing here is to make this so much simpler; you just have to know your data, know how to set up and frame your problem and then build the machine learning model. And for deployment, previously you had to hand it over to IT or to an engineer with a lot of sophisticated programming experience to hook up. Now the data scientist can do it.
"The way we are changing the game is we allow you to build these scalable systems in the cloud that can handle any transaction load, that allow you to do sophisticated deployments with very little effort and that is incredibly empowering. What we're changing with this tool is extending the reach to a very broad class of developers. You can hook it up to a web site and it will just work!"
Putting big data to work
That's a far cry from the complexity of big data systems, which is part of the reason Gartner's hype cycle report has just said big data is not delivering the benefits to most companies that have been claimed for it.
"We are really hoping to pull big data out of its trough of disillusionment," says Sirosh, "and the reason for that disillusionment is that today big data allows you to store big data but analysing it and making use of it is incredibly hard, and it is very hard to hook it up in operational systems. That's key.
"At the end of the day if you want to get real benefit out of it you have to hook it up to systems that actually affect customers or help you anticipate things and create benefits in automated ways. That kind of automation is what our tool really excels at."
What you can do with machine learning is limited almost only by your imagination. There are some obvious areas, and the ML Studio service includes a lot of samples for building tools like recommendation engines that you can pull your own sales data into and connect to an ecommerce site.
One of the many teams inside Microsoft using ML Studio applied it to fraud detection around Windows licence keys. "They got an extra 20-30% savings on fraud," claims Sirosh.
He expects a lot of businesses to use it for sales forecasts, which he sees as a natural progression from storing your data in a database and analysing it with business intelligence tools.
"The great thing about machine learning is it allows you to learn from data and adapt to changing circumstances, and it's also predictive. Traditional analysis and analytics lets you look into the rear view mirror, lets you look at the past, analyse data by slicing and dicing it.
"Machine learning lets you interpret the future, looking forward at what is going to happen. You can forecast what is going to happen, you can forecast demand, you can forecast fraud. Once you have those early warnings and forecasts you can act on them."
Out of reach
That kind of prediction is normally out of reach for smaller businesses, he points out. "If you're trying to forecast demand for the product you're selling on your website for the next week, the traditional way smaller companies do this is in Excel; you work with your historical data and maybe you try to include seasonality in your spreadsheet." But Excel spreadsheets are fragile; formulae get out of date as things change, it's far too easy to accidentally mess up your data and it's difficult to manage a lot of data.
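To make the spreadsheet approach concrete, here is a minimal Python sketch of the kind of seasonal forecast a small business might otherwise build by hand. It is not ML Studio code and not a method described by Microsoft; the function name `seasonal_forecast`, the weekly cycle and the sales figures are all assumptions made up for illustration. It simply averages past demand for each day of the cycle and repeats that pattern forward.

```python
from statistics import mean


def seasonal_forecast(history, season_length=7, horizon=7):
    """Forecast future demand by averaging each position in a seasonal cycle.

    Hypothetical helper for illustration only: `history` is a list of past
    demand figures, oldest first, and `season_length` is the assumed cycle
    (7 for a weekly pattern in daily data).
    """
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")

    # Keep only whole seasons, counted back from the most recent observation,
    # so every position in the cycle has the same number of data points.
    usable = len(history) - len(history) % season_length
    recent = history[-usable:]

    # Average demand observed at each position in the cycle.
    profile = [mean(recent[i::season_length]) for i in range(season_length)]

    # The first forecast step follows the last observation, which is the
    # start of a new cycle, so the profile is repeated from position 0.
    return [profile[i % season_length] for i in range(horizon)]


# Two weeks of made-up daily sales, oldest first.
daily_sales = [10, 12, 11, 13, 20, 30, 25,
               12, 14, 13, 15, 22, 34, 27]
next_week = seasonal_forecast(daily_sales)
```

A sketch like this shares the weaknesses Sirosh describes: it only looks at one product, it has to be rerun by hand, and nothing connects its output to the systems that act on it.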
"With this tool you can do it at scale, with more historical data in the cloud, very simply. But then you put it into production with an API," he says. "It's an API that's in the cloud that can be called from applications and you get instant results that can feed into your inventory planning systems, into your ordering systems that get you more inventory. It's the automation that makes a huge difference. You want the automation to be able to serve this up in websites and in apps and that now becomes possible, in a very simple way."
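What calling such a cloud API from an application might look like can be sketched in a few lines of Python. Everything specific here is an assumption: the article does not describe the service's actual endpoints, authentication or payload format, so the URL, the bearer-token header and the JSON field names below are hypothetical placeholders. The sketch only builds the HTTP request using the standard library; it does not send it.

```python
import json
import urllib.request


def build_forecast_request(endpoint_url, api_key, product_id, history):
    """Build a POST request asking a prediction service for a demand forecast.

    Hypothetical example: the payload fields and the Authorization scheme
    are assumptions, not the documented format of any real service.
    """
    payload = json.dumps({"product_id": product_id, "history": history})
    return urllib.request.Request(
        endpoint_url,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder endpoint and key; a real deployment would supply its own.
request = build_forecast_request(
    "https://example.invalid/forecast",
    "YOUR-API-KEY",
    "sku-1",
    [10, 12, 11, 13, 20, 30, 25],
)
```

An inventory or ordering system would pass a request like this to `urllib.request.urlopen` and parse the response, which is the kind of automated hook-up Sirosh is describing.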
Sirosh hopes that will enable lots of new applications for machine learning. One interesting area is predictive maintenance for machinery, which he says is not widely used today. "In our labs we're working with data from escalator failure, and we found we could predict failure a week in advance of some major shutdowns."
If you knew when a part was likely to fail, you could order a spare and have it ready to fit, saving time, inconvenience and probably some significant repair bills. The same principle could save lives in healthcare.
Microsoft's research team in New England looked at hospital readmissions and discovered that a high percentage were people who hadn't understood how to take their drugs and a follow-up phone call was often enough to keep them out of hospital.
The potential is huge, Sirosh believes. "There are a large number of applications where forecasting and predictive anticipation is going to be an incredibly empowering thing. Applications we haven't seen to date will rapidly emerge."
He's particularly excited about what we could do with devices and sensors. "Every mobile app can now be intelligent. Every Internet of Things sensor can now send data to the cloud and call into APIs that provide it with intelligence." But in every area you can think of, he's expecting an explosion of machine learning-powered improvements now that so many more developers can experiment with it.