According to IDC, the data being created and captured in healthcare is projected to grow at a 36 percent compound annual growth rate (CAGR), faster than any other industry.
While managing growing volumes of data is a common challenge for organizations across all sectors, healthcare is uniquely set apart by the sheer number of new data sources that are becoming available, with more being added all the time. This is driven in large part by advancements in telemedicine, personalized or precision medicine, and IoT-based medical and personal health devices. While this flood of real-time data and analytics creates opportunities for all kinds of data-driven benefits, such as more advanced and customized care and faster drug development, it also means healthcare organizations will have to manage increasingly large and varied data assets. That creates some big challenges these organizations will need to resolve, including how to ingest and organize the information, ensure it complies with HIPAA and other regulations, and make it valuable for all stakeholders.
The problem is compounded as healthcare focuses on population health management and becomes more preventative rather than merely reactive. This, of course, requires capturing and analyzing even more data that can be used to detect early indications of health risks. Meanwhile, there has been a shift toward more remote health monitoring and response. You may go to Kaiser and assume your medical professionals are all on site, but it's becoming more common for hospitals to tap the expertise of specialists who may be located anywhere on the globe. They connect via teleconference systems and trade data between different systems located in different countries, all with different regulatory requirements. Using data-driven collaboration to provide the best possible care for individuals, or to enable the most comprehensive research for global responses to disease outbreaks, while meeting various compliance needs is no easy task.
To support these needs, IDC for its part recommends big investments in health IT, blockchain and analytics tools, along with effective strategies for digital transformation. A big enabling part of this transformation is AI and machine learning technology that is taught to recognize patterns in unstructured data and automatically convert it into structured data that can be retrieved and analyzed. This is how you automate many of the time-, cost- and resource-intensive manual processes that often sink an organization's big data ambitions. These steps include:
1. System and Silo Consolidation:
The healthcare industry is constantly consolidating. This creates a challenge in integrating all the disparate systems and data silos that need to come together to form a big data ecosystem that can draw from every incoming stream and source of data, everything from hospital monitoring machines to personal IoT-enabled medical devices. Together, they can paint a holistic picture of a patient's health and medical needs, accelerate pharmaceutical drug development and much more. Using AI and machine learning-driven technology to automate data classification and consolidation across systems, departments and organizations around the world can dramatically cut the time, cost and expertise required to migrate disparate data into centralized data lakes.
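To make the classification-and-consolidation idea concrete, here is a deliberately tiny sketch. A real system would use trained models rather than the simple field-name matching below; the schema, field aliases and function name are all hypothetical.

```python
# Toy stand-in for ML-driven classification: map records from disparate
# source systems onto one consolidated schema by matching field aliases.
# FIELD_ALIASES and normalize_record are illustrative names only.

FIELD_ALIASES = {
    "patient_id": {"patient_id", "pat_id", "mrn", "member_number"},
    "heart_rate": {"heart_rate", "hr", "pulse_bpm"},
    "recorded_at": {"recorded_at", "timestamp", "reading_time"},
}

def normalize_record(raw: dict) -> dict:
    """Map a record from any source system onto the consolidated schema."""
    normalized = {}
    for canonical, aliases in FIELD_ALIASES.items():
        for key, value in raw.items():
            if key.lower() in aliases:
                normalized[canonical] = value
                break
    return normalized

# Records from two different source systems land in one common shape:
monitor = {"PAT_ID": "A-102", "HR": 72, "reading_time": "2019-05-01T08:30"}
wearable = {"mrn": "A-102", "pulse_bpm": 75, "timestamp": "2019-05-01T09:00"}
print(normalize_record(monitor))
print(normalize_record(wearable))
```

Once every source lands in the same shape, downstream consolidation into a single lake becomes a mechanical step instead of a per-system integration project.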
Furthermore, to avoid complicating efforts (it's complex enough already), don't try to boil the ocean all at once. Start with a few critical projects and the data they require, focus on a few key systems and get them cleaned up, and settle on a few essential use cases to launch with. Apply your automation, curation and data-quality assessment there, then feed the results into your AI and ML initiatives. Once you're able to demonstrate success, steadily build on it.
2. Data Lake Management:
After suffering some setbacks due to improper management, data lakes are regaining the luster they first had in 2010, when organizations began using them to cost-effectively store their raw data. The problem? The data lake is great for storing data, but not so great at generating value. Organizations would often dump their data there with no proper management, leaving it to rot ungoverned and unused. But the emergence of the cloud, combined with new AI-driven cataloging techniques, now helps automate and simplify many of the management functions that keep data lakes healthy. Organizations can use them to combine data from different systems in one place where the stored data can be made governable, searchable and accessible.
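The cataloging idea can be sketched in a few lines: every dataset dropped into the lake gets a metadata entry, so data stays findable instead of rotting unused. The `Catalog` class, its fields and the example paths below are illustrative assumptions, not any vendor's API.

```python
# Minimal sketch of a data lake catalog: register each dataset with
# metadata so the lake stays searchable. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    path: str                                # location in the lake
    source: str                              # originating system
    tags: set = field(default_factory=set)   # e.g. {"phi", "billing"}

class Catalog:
    def __init__(self):
        self._entries = []

    def register(self, entry: CatalogEntry):
        self._entries.append(entry)

    def search(self, tag: str):
        """Return the paths of all datasets carrying a given tag."""
        return [e.path for e in self._entries if tag in e.tags]

catalog = Catalog()
catalog.register(CatalogEntry("s3://lake/ehr/2019/", "ehr", {"phi", "clinical"}))
catalog.register(CatalogEntry("s3://lake/claims/", "claims", {"phi", "billing"}))
print(catalog.search("billing"))
```

In practice the AI-driven part is populating those tags automatically by profiling the data; the catalog itself is what turns a dumping ground into a governed, searchable asset.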
3. Packaging Data:
Storing all your data in one data lake doesn’t automatically make it usable. All that data is still streaming in from all kinds of different sources, including medical records, patient surveys, cancer or cardiac registries, claims records, and so on. You need to be able to recognize and find data regardless of its source and then format and provision it for use according to what the use case requires. This is what will enable the self-service retrieval and analytics that today’s medical practitioners want in order to provide better care. Sure, they’re more data-savvy now than ever, but you still need to package data in a way that makes sense to them.
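One way to picture that packaging step: each use case declares which fields it needs and how they should be labeled, and a provisioning step shapes records accordingly. The profiles and field names here are invented for illustration.

```python
# Hypothetical sketch of use-case-driven data packaging: each consumer
# profile declares the fields it needs and the labels it expects.
PROFILES = {
    # A clinician-facing view uses familiar labels and clinical fields only.
    "clinician": {"patient_id": "Patient", "heart_rate": "Heart Rate (bpm)"},
    # A research view keeps raw field names but drops direct identifiers.
    "research": {"heart_rate": "heart_rate", "recorded_at": "recorded_at"},
}

def package(records, profile: str):
    """Shape raw lake records for one consumer profile."""
    mapping = PROFILES[profile]
    return [{label: r[src] for src, label in mapping.items() if src in r}
            for r in records]

rows = [{"patient_id": "A-102", "heart_rate": 72, "recorded_at": "2019-05-01"}]
print(package(rows, "clinician"))
print(package(rows, "research"))
```

The same underlying record serves both audiences; only the packaging differs, which is what makes self-service retrieval practical.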
Healthcare generates oceans of data, and all of it needs to be governed. This is an area that requires thorough automation to ensure every piece of data adheres to the rules that govern it. Some data can be viewed but not copied. Some data can be shared with another party, but only if anonymized. There are many regulations and restrictions, and only granular governance will ensure you're deriving the most value from both restricted and unrestricted data without breaking industry or governmental rules, or disobeying the patient's stated data privacy and security preferences. For governance to work, all your data must be properly identified so that the automated enforcement rules it's bound by can be applied.
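A toy policy check shows how tag-driven enforcement might look. The rules, tags and anonymization step below are hypothetical placeholders for real HIPAA-driven controls, not a compliant implementation.

```python
# Illustrative sketch of granular, tag-driven governance: the tags attached
# to a record decide what happens when someone tries to share it.
import hashlib

def anonymize(record: dict) -> dict:
    """Replace the direct identifier with a one-way hash."""
    out = dict(record)
    if "patient_id" in out:
        digest = hashlib.sha256(out["patient_id"].encode()).hexdigest()
        out["patient_id"] = digest[:12]
    return out

def share(record: dict, tags: set) -> dict:
    """Apply sharing rules based on how the data is tagged."""
    if "no_share" in tags:
        raise PermissionError("this data may not be shared")
    if "share_anonymized_only" in tags:
        return anonymize(record)
    return record

rec = {"patient_id": "A-102", "heart_rate": 72}
shared = share(rec, {"share_anonymized_only"})
print(shared["patient_id"] != "A-102")   # identifier is hashed before sharing
```

The point is the shape of the mechanism: once data is properly identified and tagged, enforcement can be automated at the level of each individual record.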
As advancements continue to be made in AI and machine learning to further enable data automation, the healthcare industry is poised for a dramatic transformation that will greatly improve the quality of care for humankind. But data automation can't be applied in one fell swoop; it requires deliberate implementation across many iterative stages. Making those modest moves to automate now will put you on track for the giant leaps data will undoubtedly bring to the quality and effectiveness of healthcare.
About Alex Gorelik
Alex Gorelik, author of the newly published book The Enterprise Data Lake (published by O'Reilly Media), is CTO and founder of Waterline Data, one of three startups he has founded. He also served as GM of Informatica's Data Quality Business Unit, was an IBM Distinguished Engineer, and was co-founder and CTO of both Exeros and Acta Technology.