How Machine Learning In The Database Can Change Industries And Save Lives
It seems like every year Microsoft CEO Satya Nadella tweaks the company’s technological direction. Microsoft was the productivity company. Then the mobile-first, cloud-first company.
After the Microsoft Build 2017 developer conference in Seattle last week, Microsoft is now the intelligent cloud, intelligent edge company.
A casual observer might think that Nadella can’t get his story straight. What is Microsoft, really? That’s the wrong frame of mind to take. The “new” Microsoft knows what it is better than at any time in the last 15 years. Nadella’s yearly declarations are really about the evolution of computing.
So, what does “intelligent cloud, intelligent edge” actually mean?
It signals that the period of growth defined by the rapid rise of smartphones and the cloud is reaching its end. The coming era will be defined by machine learning, deep learning and artificial intelligence, built on top of the mobile/cloud model.
And Microsoft is showing how machine learning can and will invade every aspect of human industry, behavior and business.
Intelligence On The Edge
In 2016, we wrote about how the client-server model of computing has evolved in the mobile era. Intelligence evolves towards the edge. Computing has moved from massive mainframes accessed by terminals, to databases and personal computers, to the cloud and mobile devices. Intelligence continues to move down the scale as client (edge) devices adopt new hardware, become more power efficient and run more sophisticated software. From the smartphone, intelligence has begun to move into the Internet of Things and beyond.
Machine learning models mark a fork in this evolution: a massive jump in intelligence on the edge that cannot be accounted for by Moore’s Law alone.
As Microsoft has shown, machine learning models can be moved to the edge by bringing artificial intelligence capabilities that used to run only in the cloud down to the device. This is done by building compute into edge devices (CPUs, GPUs and similar hardware, as we have seen with IoT maturation) and by bringing cloud computing capabilities to the edge through virtual machines and Docker-style containerization.
Edge devices will soon be able to run their own machine learning processes and be able to triage and understand data without pumping all of that information back to the cloud. This is exactly what Microsoft’s new Azure IoT Edge product is designed to do.
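The triage idea described above can be sketched in a few lines: the device scores readings locally and forwards only the interesting ones upstream. This is a minimal illustration, not Azure IoT Edge code; the threshold model and the sensor values are invented for the example.

```python
# Sketch: an edge device scores readings locally and only forwards
# anomalies to the cloud, instead of streaming every data point.
# The "model" here is a stand-in threshold; a real deployment would
# run a trained model shipped down in a container.

def score(reading):
    """Toy anomaly score: relative distance from an expected value."""
    expected = 50.0
    return abs(reading - expected) / expected

def triage(readings, threshold=0.2):
    """Return only the readings worth sending upstream."""
    return [r for r in readings if score(r) > threshold]

readings = [49.5, 50.2, 71.0, 50.1, 12.3]
to_cloud = triage(readings)
# Only the out-of-range readings leave the device.
```

Everything else stays on the device, which is the whole point: the cloud sees a trickle of exceptions rather than a flood of raw telemetry.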
But we can take this concept a step further.
Machine Learning In The Database
The migration of intelligence from the cloud to consumer edge devices is a simple enough notion. The potential for an explosion in comprehensive intelligence is immense.
But the consumer edge is not the only edge that matters. Nor is it likely the most important.
Think of a database. What is it, exactly? What does it do?
Database management is a concept that has been around basically as long as there have been computers. How you store, transfer and handle data has been a constant problem to solve for just about forever. Each new era of computing brings with it new data challenges.
The issues surrounding capital-B Big Data that emerged from the data explosion of the mobile era are beginning to be solved by the sheer power of the cloud and artificial intelligence. Database management has become a commoditized industry.
“If you look back historically at what customers’ database platforms were for, a lot of it was just about data management. High availability, data recovery, virtualization rights, things like that,” said Rohan Kumar, general manager of database systems at Microsoft.
If database management is easy and boring in 2017, what is the next step?
One of the biggest problems in the era of Big Data is data portability. You may want to use machine learning in the cloud to run predictive analytics or computational understanding, but moving that data from one point to another is problematic. The Internet may be faster than ever, but the amount of data coming out of hundreds of intelligent devices in a factory or a hospital can easily overwhelm the capacity to transfer that data.
“The challenge there is that the data volumes are very high,” Kumar said. “You cannot move data around, that’s just too expensive. Once the data gets stored, it is sticky. Expecting terabytes of data to move is just not possible. And you want real time intelligence as data is coming in.”
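A back-of-envelope calculation shows why Kumar calls stored data “sticky.” The figures below are illustrative, not from the article: 10 TB of telemetry over a dedicated 1 Gbps link, ignoring protocol overhead and contention.

```python
# How long does it take to move 10 TB over a 1 Gbps link?
# (Illustrative numbers; real links see overhead and contention.)

data_bytes = 10 * 10**12      # 10 TB of telemetry
link_bps = 10**9              # 1 gigabit per second

seconds = data_bytes * 8 / link_bps
hours = seconds / 3600
print(f"{hours:.1f} hours")   # nearly a full day for one transfer
```

Roughly 22 hours for a single bulk move, and the data keeps arriving in the meantime, which is why scoring it where it lands looks so attractive.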
If we consider that an on-premises database is essentially an edge device (just a bit bigger and more sophisticated), then the answer is obvious: add machine learning and artificial intelligence capabilities to it.
On-premises machine learning in databases will be critical to the next evolution of artificial intelligence. Databases take artificial intelligence to the edge and act as the middleman between the edge and the cloud.
For Microsoft, the task was to make database functions work in a world defined by machine learning. The best way to do that is to bring the right tools to the database instead of building new ones. That meant bringing in the R programming language to work with the data in place and Python to run the machine learning algorithms.
“GPUs are hardware units that are very optimized for certain kinds of processing. Number crunching, image processing, graphics and media creation. Then we basically said that the logical next step for us to take after R is to get Python in there, because Python is the language that has the bindings to all of these deep learning libraries,” Kumar said.
“Naturally, based on that we are working on graph capabilities because that is where relationship management comes from. Again, going back to the intelligence value chain: Graph + R + Python + deep learning together, we believe, creates a very powerful system.”
Practical Application Of AI In The Database
Placing machine learning and artificial intelligence in the database is all well and good from a theoretical standpoint. But what can it actually do?
How about save lives?
Kumar explains the case of lung cancer diagnosis:
If you take a look at the use cases of this, one of them is lung cancer. Typically what happens is you do the CT scan image of your lungs and the doctors effectively look at the picture and based on certain things, they determine the high probability or low probability of lung cancer.
Typically what happens is that this is hard for the person or patient to hear. So they go for second opinions, third opinions. Because they can never be sure. They are using their information to make a judgment call.
Now imagine if we can access the collective intelligence of all the lung cancer specialists. Across all of their patients’ data. This is what you looked at and decided that there is a high probability of a chance of lung cancer or not. And use deep learning to use that data and capture the intelligence that has been done by all of these doctors.
So now, when you look at a picture and score against that model, the machine says this is the probability of lung cancer. And the doctor can now augment that decision and say, yeah, that makes sense. Because at that time you are not just using your decision making powers, or how you think of it, but using the powers of all the other specialists and their patients.
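The pooled-diagnosis idea Kumar describes can be reduced to a toy model: past cases labeled by many specialists train a classifier, and a new case is then scored for probability. The features, numbers and simple logistic model below are invented for illustration; a real system would apply deep learning to the CT images themselves.

```python
import math

# Toy version of pooled specialist knowledge: historical cases with
# specialist labels (1 = high risk) train a tiny logistic model.
# Features (nodule_size_mm, opacity) and all values are hypothetical.

cases = [
    ((4.0, 0.2), 0), ((5.5, 0.3), 0), ((6.0, 0.4), 0),
    ((18.0, 0.9), 1), ((22.0, 0.8), 1), ((15.0, 0.7), 1),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit by stochastic gradient descent on the logistic loss.
w = [0.0, 0.0]
b = 0.0
lr = 0.02
for _ in range(2000):
    for (x1, x2), y in cases:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
        b -= lr * err

def risk(x1, x2):
    """Score a new case against the pooled model."""
    return sigmoid(w[0] * x1 + w[1] * x2 + b)

print(f"{risk(20.0, 0.85):.2f}")  # large nodule: high probability
print(f"{risk(5.0, 0.25):.2f}")   # small nodule: low probability
```

The doctor still makes the call; the model simply surfaces how similar past cases were judged, which is the augmentation Kumar describes.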
Think of the possibilities there. Every doctor, instead of being an individual island of knowledge, becomes the culmination of all of history’s accumulated knowledge, instantly.
“To me, deep learning will revolutionize healthcare,” Kumar said. “The amount of time it took for all the opinions and the guesswork in terms of whether treatment should be started or not, doctors are going to get a lot more concrete data, which is going to augment their decision making.”