Preparing Healthcare Data for AI: Why Health Systems Must Fix Legacy Systems

**Zach Evans, Chief Technology Officer at Xsolis**

Artificial intelligence (AI) promises to transform healthcare operations and decision-making. Healthcare alone now captures nearly half of all vertical AI spend – approximately $1.5 billion in 2025, more than tripling from $450 million the year prior and exceeding the next four verticals combined. Yet many organizations discover that their AI initiatives stall before they deliver value. The problem often isn’t the AI models themselves, but rather the data behind them.

For healthcare leaders preparing their organizations for AI, ensuring that data is truly AI-ready is one of the most important steps they can take.

Warning signs

One clear warning sign of unprepared data appears during early AI pilots. If teams spend more time explaining why a script produced incorrect results than they would have spent completing the task manually, the underlying data likely isn’t ready. In many cases, subject matter experts find themselves repeatedly correcting AI outputs instead of refining them.

For example, clinical decision support tools might label nearly every patient as “high risk” because the system can’t distinguish between resolved and active medical conditions. In these situations, the AI model isn’t malfunctioning; it’s producing results based on incorrect, incomplete, or poorly labeled data.
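To make the failure mode concrete, here is a minimal sketch of how ignoring condition status inflates risk flags. Everything in it is an illustrative assumption rather than a description of any real clinical tool: the `Condition` record, the `status` values, and the threshold of three conditions are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Condition:
    name: str
    # Hypothetical labels: "active", "resolved", or None when the source
    # system never recorded a status.
    status: Optional[str]


def count_conditions(conditions: List[Condition], active_only: bool) -> int:
    """Count every listed condition, or only those explicitly marked active."""
    if active_only:
        return sum(1 for c in conditions if c.status == "active")
    return len(conditions)


def is_high_risk(conditions: List[Condition], threshold: int = 3,
                 active_only: bool = True) -> bool:
    """Toy rule: flag a patient once the condition count reaches the threshold."""
    return count_conditions(conditions, active_only) >= threshold


history = [
    Condition("pneumonia", "resolved"),
    Condition("fractured wrist", "resolved"),
    Condition("hypertension", "active"),
    Condition("gestational diabetes", None),  # unlabeled in the source record
]

naive = is_high_risk(history, active_only=False)        # counts all four entries
status_aware = is_high_risk(history, active_only=True)  # counts only hypertension
```

With the same history, the naive rule flags the patient and the status-aware rule does not. The unlabeled entry is excluded here for simplicity; in practice it would more likely be routed for review than silently dropped.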

Much of this incompatibility stems from the way healthcare data systems were originally designed. Many were built to support operational tasks such as billing, transaction processing, compliance reporting, or sales management — not advanced analytics or predictive modeling. Electronic health records were designed primarily for billing and documentation, while customer relationship management systems tracked sales or outreach. These systems function well for their intended purposes, but the data they produce might lack the consistency and context that AI systems require.

As a result, data fields can contain inconsistent or ambiguous information. A temperature field might include both Fahrenheit and Celsius values without any indication of which unit was used. Diagnosis codes might appear as free-text descriptions instead of standardized medical terminology. Medication names might show up as brand names, generic names, or internal codes — or some combination of the three — depending on which system generated the record.
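A rough sketch of what normalizing two of these fields might look like follows. The plausibility ranges used to guess a missing temperature unit are an assumed heuristic, not clinical guidance, and the alias table (including the internal code `MED-0042`) is hypothetical; a real pipeline would draw on a maintained drug vocabulary instead.

```python
from typing import Optional


def to_celsius(value: float, unit: Optional[str] = None) -> Optional[float]:
    """Return the temperature in Celsius, or None when the unit cannot be inferred."""
    if unit == "C":
        return round(value, 1)
    if unit == "F":
        return round((value - 32) * 5 / 9, 1)
    # Unit missing: fall back to assumed, non-overlapping plausibility ranges.
    if 30.0 <= value <= 45.0:       # plausible body temperature in Celsius
        return round(value, 1)
    if 86.0 <= value <= 113.0:      # the same range expressed in Fahrenheit
        return round((value - 32) * 5 / 9, 1)
    return None                     # ambiguous or implausible: leave for review


# Hypothetical alias table mapping brand names, generic names, and an
# invented internal code onto one generic name.
MEDICATION_ALIASES = {
    "tylenol": "acetaminophen",
    "acetaminophen": "acetaminophen",
    "med-0042": "acetaminophen",    # hypothetical internal code
    "lipitor": "atorvastatin",
}


def canonical_medication(raw: str) -> Optional[str]:
    """Look up a canonical generic name; return None rather than guess."""
    return MEDICATION_ALIASES.get(raw.strip().lower())
```

Both helpers return `None` instead of guessing when the input is unrecognized, so unresolved values surface for human review rather than flowing into a model as if they were clean.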

None of these practices were necessarily mistakes when the systems were created, but they become major obstacles when organizations try to use the data for pattern recognition and predictive modeling.

Cleaning up the mess

The scale of this challenge is significant. In my experience working across health systems, less than 20% of enterprise data is ready for AI without substantial preparation. Research supports this gap: a 2025 Wolters Kluwer survey found that most healthcare organizations are not yet ready to harness the full value of AI, despite widespread enthusiasm for it.

Healthcare organizations process enormous volumes of clinical messages and records every day, and the variation in how different systems represent the same information can be staggering. One hospital may record an “admission date,” while another records a “bed assignment time.” Although both refer to similar events, the lack of standardization complicates analysis.
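One way to reconcile such fields is to map each site-specific name onto a shared one while keeping the original name as provenance. The sketch below assumes hypothetical source names (`hospital_a`, `hospital_b`) and a made-up canonical field, `encounter_start`. Whether two fields are close enough to share a canonical name is a judgment call for data stewards, not something the code decides.

```python
from typing import Any, Dict, Tuple

# Hypothetical mapping from (source system, local field) to a shared field name.
FIELD_MAP: Dict[Tuple[str, str], str] = {
    ("hospital_a", "admission_date"): "encounter_start",
    ("hospital_b", "bed_assignment_time"): "encounter_start",
}


def harmonize(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename mapped fields and keep the original field name alongside the value."""
    out: Dict[str, Any] = {}
    for field, value in record.items():
        canonical = FIELD_MAP.get((source, field))
        if canonical is None:
            out[field] = value  # unmapped fields pass through unchanged
        else:
            out[canonical] = {"value": value, "source_field": field}
    return out


row_a = harmonize("hospital_a", {"patient_id": "p1", "admission_date": "2025-03-01"})
row_b = harmonize("hospital_b", {"patient_id": "p2", "bed_assignment_time": "2025-03-01T14:05"})
```

Retaining `source_field` matters because the two events are similar rather than identical; an analyst can still tell which definition produced a given timestamp.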

Preparing data for AI therefore requires both technical work and organizational change.

Technically, teams might need to standardize codes, fill in missing attributes, reconcile duplicate records, and normalize inconsistent fields. However, these steps alone are not enough. The larger challenge is changing how data enters systems in the first place. Organizations need strong validation rules at the point where data is captured, clear data stewardship roles, and feedback loops that show employees how their data quality affects AI performance.
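As an illustration of validation at the point of capture, the following sketch rejects a temperature entry that lacks an explicit unit or an author before it is ever stored. The entry shape and the field names (`value`, `unit`, `recorded_by`) are assumptions made for the example, not a reference to any particular EHR interface.

```python
from typing import Any, Dict, List

ALLOWED_UNITS = {"C", "F"}


def validate_temperature_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the entry can be saved."""
    errors: List[str] = []
    value = entry.get("value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        errors.append("value must be a number")
    if entry.get("unit") not in ALLOWED_UNITS:
        errors.append("unit must be 'C' or 'F'")
    if not entry.get("recorded_by"):
        errors.append("recorded_by is required")
    return errors


ok = validate_temperature_entry({"value": 37.0, "unit": "C", "recorded_by": "rn-17"})
rejected = validate_temperature_entry({"value": 98.6})
```

Returning every problem at once, rather than stopping at the first, gives the person entering the data a complete picture of what to fix, which is the kind of feedback loop described above.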

Poor data quality also has broader consequences beyond failed AI projects. Analysts spend valuable time cleaning data instead of analyzing it. Decision-makers delay important choices while teams verify conflicting reports. Departments develop workarounds and manual processes that eventually become permanent parts of the workflow. Over time, organizations risk running critical operations on spreadsheets and informal knowledge rather than trusted enterprise systems.

A practical path forward

Ultimately, organizations that succeed with AI tend to be those that invested in strong data practices long before AI became a strategic priority. Improving data quality strengthens reporting, compliance, and operational decision-making even before AI enters the picture.

For many CIOs and technology leaders, the most effective entry point is to examine a single high-stakes workflow (prior authorization, denial management, or clinical documentation, for example), fix the data issues within it, and establish better standards going forward. In an industry where data quality directly affects patient outcomes and reimbursement, the organizations that invest in getting their data right won’t just be better prepared for AI — they’ll be better operators today.

About Zach Evans

Zach Evans is the Chief Technology Officer at Xsolis, an AI-driven health technology company headquartered in Franklin, Tennessee, where he is responsible for using Xsolis’ proprietary real-time predictive analytics and technology to support client objectives and internal business operations.
