World chess champion Garry Kasparov made headlines all over the world in 1997 for something that he probably wasn’t very excited about. He lost a chess match. To a computer.
It was an IBM supercomputer called Deep Blue, and its victory in that New York City match marked the first time a reigning world chess champion had lost a match to a computer under tournament conditions. It was also a symbolically significant event, the first sign that artificial intelligence could become equal to or even superior to human intelligence.
It was the dawn of the Big Data era.
Deep Blue wasn’t learning the way modern systems do. It relied on enormous computing power to search through millions of possible positions per second, guided by evaluation rules its engineers had tuned with the help of grandmasters. And that was more than 20 years ago. In the years since that match, artificial intelligence has progressed dramatically, expanding into applications that researchers could never have seen coming in 1997, from supply chain logistics to agriculture to manufacturing.
As Kasparov himself recently wrote: “Today you can buy a chess engine for your laptop that will beat Deep Blue quite easily.”
At the core of all of this is machine learning, a type of data analysis that automates the process of analytical model construction. It uses artificial intelligence to identify patterns in data so that software can make decisions and improvements on its own.
Machine learning is making Google’s self-driving car project possible, allowing the vehicle’s software to pick up and remember context clues from the real world to guide its driving.
It’s making the entertainment and purchase recommendations we get from sites such as Netflix and Amazon more accurate than ever before, aggregating the insights and preferences of millions of similar users all over the world.
And it is opening up new frontiers in medical science, playing an increasingly important role in understanding, predicting, and treating disease.
Big Data for Diagnostics
In medicine, the diagnostic process involves three primary stages: detection, the treatment decision, and the monitoring of that treatment. Ideally, we would be able to detect the first signals of disease before symptoms appear. However, the more likely case today is that a patient experiences symptoms and goes to their doctor for help. Using the information gathered in that visit, the physician then determines what course of treatment is best suited to that patient. Finally, the treatment is monitored over time to make sure the patient’s health is restored and they can go on with their life.
Using machine learning, researchers are bringing new tools and efficiencies to each step of this process.
For instance, hospitals across China have been using artificial intelligence in recent weeks to help doctors quickly diagnose patients with COVID-19, retraining software that’s normally used to read CT scans for lung cancer to look instead for signs of the coronavirus in suspected cases.
There are a number of companies working on similar solutions across the diagnostic landscape as well. With early cancer detection in mind, Freenome is using simple blood tests and AI-powered genomics to shorten diagnosis and treatment time, as are GRAIL and Thrive. On the monitoring side, Guardant Health is developing similar tests for recurrence monitoring in cancer survivors, while Lexent Bio is building novel liquid biopsy technology to change the way cancer is managed.
And this sector is growing rapidly. McKinsey expects Big Data and machine learning in pharmaceuticals and medicine to generate up to $100 billion in new value each year, thanks to the better decision-making they will enable, the innovations they will uncover, and the efficiencies they will bring to the entire health delivery system.
But there’s a disconnect in disease diagnostics that only machine learning can help solve.
The Goldilocks of Data
Despite major breakthroughs in cancer treatments, fewer than 20% of patients today respond to even the most innovative immunotherapies. The reason is the industry’s reliance on imperfect, single-analyte biomarkers and antiquated methods that match patients to therapies poorly.
At the same time, the growth of Big Data has left medical science facing a tidal wave of information of every shape, size and origin. It’s coming directly from physicians and clinics, from patients themselves, their caregivers, corporate R&D departments and many more sources. The problem, then, isn’t access to data, but rather synchronizing all of this information and making sense of it in ways that can improve overall care. There’s simply too much to take into consideration when making a treatment decision.
So, in order to extract the most value from the data we’re collecting, we need to find new ways to better analyze all of it and use those insights to help diagnose, treat and prevent disease.
Fortunately, there is a solution: doing more with less.
This isn’t unheard of in medical research. Consider modern immune profiling techniques. Rather than relying on single analytes, which fail to capture the complexity of disease, or on the totality of all available data, researchers have generated RNA models called Health Expression Models to represent cell types such as a CD4-positive T cell or an M2 macrophage. This multidimensional approach affords improved sensitivity and specificity in detecting these cells. Taking this one step further, machine learning can now combine these immune signals into a multidimensional biomarker that hits the sweet spot of information, improving predictive accuracy for identifying responders to therapy. With this approach, the model can weigh more than a single factor without being overwhelmed by a huge dataset.
It’s not too little data, and it’s not too much. It’s just right.
So, machine learning can be most valuable in diagnostics when it takes all of the data collected from myriad sources and hones it down into the most meaningful signals for further analysis. That way, researchers can tap the power and reach of Big Data to create diagnostic tools that are more accurate and more effective, all while delivering results that are faster than ever before.
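To make the idea of honing many signals down to a few concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method any company mentioned here actually uses: the signal names, the measurement values, and the scoring rule (difference in group means divided by the average spread) are all made up for the example.

```python
from statistics import mean, stdev

def separation_score(responders, non_responders):
    """Difference in group means divided by the average of the two standard deviations."""
    spread = (stdev(responders) + stdev(non_responders)) / 2
    return abs(mean(responders) - mean(non_responders)) / spread if spread else 0.0

def select_top_signals(samples, labels, k):
    """Score every signal by how well it separates the two groups; keep the k best."""
    scores = {}
    for name in samples[0]:
        resp = [s[name] for s, y in zip(samples, labels) if y == 1]
        non = [s[name] for s, y in zip(samples, labels) if y == 0]
        scores[name] = separation_score(resp, non)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical measurements: label 1 = responder, 0 = non-responder.
samples = [
    {"cd4_t_cell": 0.30, "m2_macrophage": 0.10, "noise_a": 0.50, "cd8_t_cell": 0.25},
    {"cd4_t_cell": 0.28, "m2_macrophage": 0.12, "noise_a": 0.20, "cd8_t_cell": 0.27},
    {"cd4_t_cell": 0.32, "m2_macrophage": 0.08, "noise_a": 0.40, "cd8_t_cell": 0.23},
    {"cd4_t_cell": 0.10, "m2_macrophage": 0.25, "noise_a": 0.45, "cd8_t_cell": 0.22},
    {"cd4_t_cell": 0.12, "m2_macrophage": 0.23, "noise_a": 0.25, "cd8_t_cell": 0.26},
    {"cd4_t_cell": 0.08, "m2_macrophage": 0.27, "noise_a": 0.35, "cd8_t_cell": 0.24},
]
labels = [1, 1, 1, 0, 0, 0]

top_signals = select_top_signals(samples, labels, k=2)
print(top_signals)
```

In this toy data, the two signals that differ most between the groups are kept and the uninformative ones are dropped. A real pipeline would need far more patients, proper statistical validation, and safeguards against overfitting.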
We’re already seeing this in action with new developments in personalized medicine, such as Predictive Immune Modeling.
Under this approach, a patient’s RNA is collected from their tumor, sequenced, and analyzed to model their unique biological composition. This model is then compared to a model generated through the retrospective analysis of other patients who are responders and non-responders to therapy.
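The comparison step described above can be sketched in a few lines of Python. This is a deliberately simplified stand-in (a nearest-centroid rule), not the actual Predictive Immune Modeling algorithm, which the article does not specify. The two-signal profiles and every number below are hypothetical.

```python
from math import dist

def centroid(profiles):
    """Average each signal across a group of equal-length profiles."""
    return [sum(values) / len(values) for values in zip(*profiles)]

def classify(patient, responders, non_responders):
    """Label the patient by whichever group average is closer (Euclidean distance)."""
    d_resp = dist(patient, centroid(responders))
    d_non = dist(patient, centroid(non_responders))
    return "likely responder" if d_resp < d_non else "likely non-responder"

# Hypothetical retrospective profiles: [signal_1, signal_2] per patient.
responders = [[0.30, 0.10], [0.28, 0.12], [0.32, 0.08]]
non_responders = [[0.10, 0.25], [0.12, 0.23], [0.08, 0.27]]

new_patient = [0.27, 0.13]
result = classify(new_patient, responders, non_responders)
print(result)
```

The design choice here is simplicity: each group is summarized by its average profile, and a new patient is matched to the nearer one. Production models would use many more dimensions and a validated classifier, with the final decision left to the physician.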
The datasets involved are huge, and the target outcome is focused on just a single patient. By using machine learning to trim down everything that’s been collected and focus on just that sweet spot in the data represented by a multidimensional biomarker, physicians and researchers will be able to develop personalized treatment plans quickly, while simultaneously improving outcomes.
This Predictive Immune Modeling approach also helps to address concerns that machine learning and artificial intelligence are somehow going to replace human physicians. The fact is, these technologies are at their most effective when working together with humans, taking on massive analysis tasks to help doctors do their work better. It isn’t about replacing what human researchers or physicians can do, but rather finding the most relevant signals for diagnostic development and treatment.
Whatever the case, machine learning and artificial intelligence aren’t going anywhere. These technologies have already revolutionized industries of all types, and we’re just now beginning to understand the potential they hold to contribute to and dramatically expand the capabilities of modern medicine. As an industry, it’s on us to ensure that these tools are thoughtfully and carefully applied to maximize their positive impact on diagnostics, research, and the patients we treat.
About Natalie LaFranzo, PhD
Natalie LaFranzo, PhD, is Vice President of Market Development at Cofactor Genomics. Using Predictive Immune Modeling and RNA, Cofactor Genomics is building multidimensional models of disease to deliver true precision medicine for improving patient outcomes.