
At the center of some of healthcare’s most important conversations about patient access, provider burnout, and interoperability is a single pain point: unstructured data.
When the HITECH Act pushed healthcare into the digital age, it prompted a shift to structured data formats that our industry has yet to master. As a result, interoperability suffers, patient data is incomplete or missed, and providers and staff spend more time than ever wading through data that may not net relevant insights.
Recently, my team met with a focus group of several members of the College of Healthcare Information Management Executives (CHIME) to discuss this critical issue. Though this challenge has plagued healthcare for many years, these executives uncovered new pain points, nuances, and, believe it or not, hope when it comes to handling unstructured data.
Key Takeaways:
- The Core Issue: Unstructured data is the primary bottleneck for healthcare interoperability, provider burnout, and patient access.
- Hidden Sources: Unstructured data doesn’t just come from scanned documents; it is increasingly generated by poor third-party digital integrations.
- The True Cost: Unstructured data negatively impacts organizational credibility, security, staff time, and ultimately, patient outcomes.
- The Solution: Healthcare leaders are turning to AI and Intelligent Document Processing (IDP) solutions to automatically structure data and integrate it directly into the EMR.
Where does unstructured healthcare data come from?
When asked about their organizations’ biggest sources of unstructured data, most respondents could easily list off the usual suspects. Imaging files don’t fit within the neat boxes of an EMR. Data from pharmacies, other hospitals, consult notes and provider referrals don’t always come in a standardized format.
Scanned documents were also listed as a source of unstructured data. Organizations routinely dedicate staff time to manually reviewing physical or scanned documents and entering data into the EMR to structure it appropriately.
But the most surprising aspect of this discussion was that organizations are increasingly struggling with unstructured data from within modern digital tools themselves.
One executive in the focus group noted that much of their organization’s unstructured data comes from third-party integrations. While the data starts off structured in one system, integration challenges can mean the information has been converted into unstructured data by the time it reaches the EMR.
Multiple other panelists confirmed this example. It’s a stark reality: even when providers implement the latest and greatest “best of breed” digital solutions, those tools can still create a backlog of new unstructured data. To combat this, executives noted that they de-prioritize point solutions when shopping for new vendors. Gone are the days of implementing multiple one-off digital tools or apps. Today’s healthcare leaders are looking for large, integrated solutions that minimize the digital and data-sharing friction they’re collectively struggling with today.
What are the costs of unstructured data in healthcare?
The cost of unstructured data is hard to quantify, especially because it is so ingrained in today’s healthcare workflows. Many healthcare organizations have found ways to work around unstructured data, mostly by throwing more staff hours at the problem. But when we asked these executives what unstructured data was costing their organization, several answers bubbled to the top:
- Credibility. A senior analytics manager noted that a lack of clean, structured data leads to incomplete or inaccurate reports and deliverables downstream. Repeated errors and the need to rescind information have damaged the organization’s credibility with both internal stakeholders and the public.
- Security & compliance risks. One health system executive observed that many recent — and highly notable — data breaches have originated from unstructured data. Outdated or moved files, spreadsheets, and information on shared drives are prime areas for hackers to explore, and unstructured data sources can contain sensitive information they can exploit.
- Time. Regardless of the size of the organizations represented, most executives agreed that unstructured data costs them time and resources. One practice leader told us they had two people dedicated just to referral management and inbound messages. An informatics executive noted they had three or four staffers dedicated to the same tasks. Even with staff assigned to requests and referrals that include unstructured data, several of the organizations reported a backlog of this work, which they said can result in patient leakage if staff don’t reach out quickly enough.
- Patient outcomes. With more than 80% of healthcare data estimated to be unstructured, one executive noted there is untapped value in these disparate records. They could include insights that change a treatment or diagnosis, or they might not. Without manually reviewing these sources, there is no way for providers to know. Nearly all the executives, most of whom had clinical backgrounds, agreed that this unmined information represented the greatest cost of unstructured data.
What can we do about healthcare’s unstructured data problem?
Unstructured data has been an enormous challenge in healthcare for many years, and one executive joked that parts of our conversation felt like déjà vu. The industry knows this is a problem, so what has materially changed that can help us tackle it in a new, more effective way?
Almost all the executives agreed: AI.
Many of them discussed pilots their organizations had implemented or were planning to implement, including solutions using Optical Character Recognition (OCR) and Intelligent Document Processing (IDP), which can scan a document in any format, structured or unstructured, and place relevant information directly within the EMR. It’s an approach that many leading healthcare organizations are already adding to documentation and digital fax workflows to reduce manual staff review and ensure data capture without adding new point solutions. Likewise, many of the organizations were looking at applications for LLMs to augment existing tools, or finding ways to use AI-based solutions to parse records and convert them into structured documents.
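For readers who want a concrete picture of what "structuring" a document means, here is a minimal Python sketch. It assumes the OCR step has already turned a faxed referral into plain text, and it uses simple pattern matching as a stand-in for the trained models a real IDP product would rely on. The field labels, the ReferralRecord type, and the function names are all hypothetical and chosen only for illustration; they do not reflect any particular vendor's product or any EMR's interface.

```python
import re
from dataclasses import dataclass, asdict
from typing import List, Optional

# Hypothetical label patterns for a referral form. A production IDP system
# would use trained extraction models rather than hand-written regexes.
PATTERNS = {
    "patient_name": re.compile(r"Patient(?: Name)?:[ \t]*(.+)", re.IGNORECASE),
    "date_of_birth": re.compile(r"DOB:[ \t]*(\d{2}/\d{2}/\d{4})", re.IGNORECASE),
    "referring_provider": re.compile(
        r"Referring (?:Provider|Physician):[ \t]*(.+)", re.IGNORECASE
    ),
    "reason": re.compile(r"Reason(?: for Referral)?:[ \t]*(.+)", re.IGNORECASE),
}


@dataclass
class ReferralRecord:
    """Structured fields that staff would otherwise key in by hand."""
    patient_name: Optional[str] = None
    date_of_birth: Optional[str] = None
    referring_provider: Optional[str] = None
    reason: Optional[str] = None


def structure_referral(ocr_text: str) -> ReferralRecord:
    """Pull known fields out of OCR'd text; unmatched fields stay None."""
    record = ReferralRecord()
    for field_name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            setattr(record, field_name, match.group(1).strip())
    return record


def missing_fields(record: ReferralRecord) -> List[str]:
    """List the fields that still need manual review."""
    return [name for name, value in asdict(record).items() if value is None]


if __name__ == "__main__":
    # Fictional sample text standing in for OCR output.
    sample = (
        "REFERRAL\n"
        "Patient Name: Jane Doe\n"
        "DOB: 01/02/1980\n"
        "Referring Provider: Dr. A. Smith\n"
        "Reason for Referral: Persistent knee pain\n"
    )
    record = structure_referral(sample)
    print(asdict(record))
    print("Needs review:", missing_fields(record))
```

The design choice worth noting is the missing_fields step: anything the extraction cannot find is flagged for a person to review rather than guessed, which is one plausible way to reduce manual work without removing staff oversight.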
As one attendee said at the end of our discussion, healthcare moves slowly. But the executives were optimistic about the rapid adoption of AI to deal with unstructured data. Though it’s not a new challenge, healthcare organizations are increasingly finding ways to apply new technologies across the tech stack, not just within a single tool, to drive real change in interoperability, patient care, and staff balance by tackling the unstructured data problem head-on.
About Stacy Pur
Stacy Pur, MBA, BSN, RN, is Senior VP of Product for eFax® by Consensus Cloud Solutions. A global leader in secure data exchange, eFax® is evolving the future of digital fax to serve healthcare and other highly regulated sectors requiring mission-critical reliability.
