In case you haven’t noticed, the age of artificial intelligence is upon us. What was once reserved for science fiction plots is now playing out in real life to save real lives. Chances are you’ve already heard of deep learning’s potential; remember IBM’s Watson and its impressive Jeopardy win?
Well, soon computers will be using deep learning to determine whether human health is in jeopardy, thanks to highly sophisticated algorithms that better detect diseases and cancers by scanning and comparing medical images. At the forefront of this next frontier is San Francisco-based startup Enlitic. At the helm is its founder and CEO, Jeremy Howard, who kindly answered a few of our questions so we could better understand how computers are outpacing human capability and why that’s a positive for healthcare:
Q
My basic understanding of deep learning is that it is a subset of machine learning that allows you to make swift use of a computer’s data sources through algorithmic abstractions. What am I missing from this definition, if anything? This is a pretty complex concept to wrap our heads around: computers “learning.” Why are we only just starting to scratch beneath the surface of this new level of sophistication? What’s taken so long to get to this point of innovation?
We have to start by understanding what machine learning is, since deep learning is a way of doing machine learning. Machine learning refers to any approach that allows computers to learn to solve a problem on their own, by seeing examples of solutions. It differs from standard computer programming in that normally we have to tell a computer the exact steps to solve a problem. With machine learning, they learn the steps for themselves.
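To make that contrast concrete, here is a minimal, purely illustrative sketch (not from the interview, and with made-up data): a rule written by hand versus a rule, in this case a simple threshold, learned from labeled examples.

```python
# Illustrative sketch: explicit programming vs. learning from examples.
# All data and names here are hypothetical.

# Traditional programming: the programmer spells out the exact rule.
def is_tall_rule(height_cm):
    return height_cm > 180  # the programmer chose 180 by hand

# Machine learning in miniature: the "rule" (a threshold) is learned
# from labeled examples instead of being written by the programmer.
examples = [(150, False), (165, False), (178, False), (185, True), (192, True)]

def learn_threshold(examples):
    # Pick the candidate threshold that classifies the labeled examples best.
    candidates = [h for h, _ in examples]
    def accuracy(t):
        return sum((h > t) == label for h, label in examples)
    return max(candidates, key=accuracy)

threshold = learn_threshold(examples)
print(threshold, [h > threshold for h, _ in examples])
```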
Machine learning is a technique that goes back to 1954, when it was used to create a checkers-playing program that beat its own programmer. But until recently, machine learning could only tackle a small range of tasks, particularly those that use what we call structured data – that is, data in rows and columns, such as what might come from a database table. Most of the data that humans work with every day is not structured; it’s in the form of images, sounds, natural language, and so forth.
Deep learning is a technique that brings machine learning to these types of unstructured data. It allows us to compute with images, sounds, natural language, and the like. It is inspired by how the brain works with these kinds of data, but in a very simplified form. The idea is decades old, but only in recent years have computers become powerful enough, and deep-learning algorithms effective enough, for this approach to become the state of the art in many areas. As computers continue to get faster, data continues to grow, and more and more people work on deep learning, the algorithms keep getting better. Deep learning is now becoming the state of the art in more and more areas, and therefore is improving more and more rapidly.
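As a rough picture of what “computing with images” looks like in practice, here is a minimal deep-learning sketch in PyTorch. The architecture, the two-class setup, and the random tensors standing in for images are assumptions made purely for illustration; this is not Enlitic’s model.

```python
# Minimal deep-learning sketch (PyTorch): a tiny convolutional network
# classifying image-like inputs. Random tensors stand in for real images.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # learn local image features
    nn.ReLU(),
    nn.MaxPool2d(2),                            # downsample 64x64 -> 32x32
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 2),                  # two classes, e.g. finding / no finding
)

images = torch.randn(16, 1, 64, 64)             # 16 fake grayscale "scans"
labels = torch.randint(0, 2, (16,))             # fake ground-truth labels

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):                          # "learning" = adjusting weights from examples
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```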
Q
I know in your TED talk you talked about computers seeing, hearing, talking, and writing. But let’s be honest, they aren’t actually doing these things autonomously. Someone still has to feed the computers data and then create and train them to understand directives through algorithms. So isn’t saying that computers are “learning” or mimicking the capabilities of humans a bit of a stretch, even if they are surpassing our capabilities?
They are certainly not mimicking the capabilities of humans – but they most certainly are learning! Computers are being presented with pictures and being told what those pictures are of, and they are learning to recognize them on their own. This is what humans do too. We show a baby a cup and we say, “This is a cup.” And we do it many times. In traditional computer programming, on the other hand, we would have to tell the computer exactly how to recognize a cup, step by step.
I’m not saying that the way humans learn and the way deep learning learns are identical. They’re definitely not. But that doesn’t mean we can’t call it “learning”.
Q
With Enlitic, you are using deep learning to scan images for diseases, brain tumors, and cancers in the hope of detecting these ailments more quickly and accurately. To do this, you have to train the computer first, something that takes around 15 minutes? How can you be sure the data will be solid and clean enough to carry out these functions?
We can’t until we try. So we try, and we end up with an algorithm trained on that data which then has to be validated. The validation process consists of a number of parts. One is that we try to get hold of a data set that contains a ground truth. For lung cancer screening, for example, a ground truth would be the actual pathology results from follow-up biopsies for those patients. We compare the predictions from the algorithm to the ground truth to see how accurate they are. We also compare the accuracy of radiologists to that ground truth, and as a result we can compare the accuracy of the algorithm to the accuracy of radiologists and get an objective answer.
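As a simple sketch of that comparison (the case values below are invented for illustration, not Enlitic’s data): given a biopsy-confirmed ground truth, the algorithm’s calls and the radiologists’ calls can be scored against it in exactly the same way.

```python
# Illustrative validation sketch: compare algorithm and radiologist reads
# against a biopsy-confirmed ground truth. All values are made up.
ground_truth = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = malignant on pathology, 0 = benign
algorithm    = [1, 0, 1, 1, 0, 1, 1, 0]
radiologist  = [1, 0, 0, 1, 0, 1, 1, 0]

def accuracy(predictions, truth):
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

print("algorithm accuracy:  ", accuracy(algorithm, ground_truth))
print("radiologist accuracy:", accuracy(radiologist, ground_truth))
```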
Sometimes, in fact quite often, we can’t get a ground truth. For bone fractures, for example, there is no biopsy or other gold-standard answer. In these cases, we audit the algorithm by finding the cases where its interpretation differs from the original radiologist’s, and we then have a panel of radiologists look closely at those cases and try to determine which one is correct. Sometimes they won’t know, in which case they say so. But quite often, when they scrutinize a case closely as a group, they can determine with a high level of certainty which reading was correct.
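A sketch of that audit workflow (the case IDs and readings are hypothetical): collect the cases where the algorithm and the original radiologist disagree, and queue only those for panel review.

```python
# Illustrative audit sketch: find cases where the algorithm's read differs
# from the original radiologist's, to send to a review panel. Hypothetical data.
cases = [
    {"id": "case-001", "radiologist": "fracture",    "algorithm": "fracture"},
    {"id": "case-002", "radiologist": "no fracture", "algorithm": "fracture"},
    {"id": "case-003", "radiologist": "fracture",    "algorithm": "no fracture"},
]

panel_queue = [c for c in cases if c["radiologist"] != c["algorithm"]]
for case in panel_queue:
    print(f"{case['id']}: radiologist={case['radiologist']}, algorithm={case['algorithm']}")
```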
Those are the things that we do to validate the algorithm to make sure that it’s effective.