How Healthcare CIOs Can Solve the Unstructured Data Crisis and Reduce Storage Costs

by Krishna Subramanian, COO & Co-Founder at Komprise 08/27/2025

Healthcare organizations are experiencing a seismic shift in how they handle unstructured data. From digital pathology to high-resolution imaging, genomic sequencing, and machine-generated sensor data, the volume of data that doesn’t fit neatly into rows and columns of a database has exploded across hospitals, research labs, and academic institutions.  

While this unstructured data is intrinsic to patient care and scientific discovery, it’s also creating a mounting crisis for enterprise IT organizations: ballooning storage costs, constrained infrastructure, security and privacy risks, and a mess to untangle for AI.  One-third of the world’s data comes from the healthcare industry – and it’s growing faster than in most other sectors, according to RBC Capital. “Hospitals produce an average of 50 petabytes of data each year, with as much as 97% of that data going unused,” according to the World Economic Forum.   

Behind the scenes lies a hard truth: this “cold data,” which is data that hasn’t been accessed in a year or more, continues to occupy premium on-premises and cloud file storage and costs an outsize amount to store. A lack of visibility across disparate data silos makes the problem hard to fix: IT managers and storage administrators don’t have enough insight into whether the data can be moved, archived, or deleted, and they need buy-in from departments before making changes.

A more nuanced, collaborative approach to unstructured data management for healthcare organizations is now possible. It can reduce the cost of storing and managing data and make that data more useful and accessible to researchers, data scientists, and analytics teams, as well as to the departments and teams now driving AI initiatives.

The Growing Cost of Inaction 

The sheer size of healthcare files is a major factor in escalating costs: a single X-ray or CT image can consume as much as 30 megabytes. If a facility takes just a few dozen images each day, they quickly add up to many gigabytes each month. Storing the sequence of just one person’s genome can take as much as 200 gigabytes.
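
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The per-image and per-genome sizes are the upper bounds quoted above; the daily image count, the 30-day month, and the function names are illustrative assumptions, not measurements from any facility.

```python
# Back-of-the-envelope sizing using the upper-bound figures quoted above.
MB_PER_IMAGE = 30    # up to 30 MB per X-ray or CT image
GB_PER_GENOME = 200  # up to 200 GB per sequenced genome


def monthly_imaging_gb(images_per_day: int, days_per_month: int = 30) -> float:
    """Approximate new imaging data per month, in decimal gigabytes."""
    return images_per_day * days_per_month * MB_PER_IMAGE / 1000


def genomes_tb(num_genomes: int) -> float:
    """Approximate storage for a number of sequenced genomes, in decimal terabytes."""
    return num_genomes * GB_PER_GENOME / 1000


if __name__ == "__main__":
    # Assumed workload: three dozen images per day.
    print(monthly_imaging_gb(36))  # 32.4 GB per month
    # Assumed cohort: 1,000 sequenced genomes.
    print(genomes_tb(1000))        # 200.0 TB
```

Even at these modest assumed volumes, imaging alone adds tens of gigabytes a month, and a research cohort of a thousand genomes reaches hundreds of terabytes.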

Contributing to the problem is data hoarding, which occurs when researchers or clinicians hang on to files indefinitely “just in case.” Without tools to understand or classify their data, teams often keep all of it. IT infrastructure and operations teams, in turn, end up supporting an ever-expanding storage environment that must meet security, compliance, backup and performance standards.

Addressing the Data Hoarding Dilemma 

To break this cycle, healthcare IT organizations should consider collaborative unstructured data management strategies that involve both automation and user participation.  

The goal is threefold: 

  1. Accelerate data tiering (online archiving) of cold data to lower-cost storage without changing how users access the data. 
  2. Give departmental users the tools and visibility they need to make informed decisions about their own data. 
  3. Prepare data for safe AI ingestion. 

This shift begins with data discovery and classification. A global index of unstructured data across on-prem and cloud storage can show which files are used frequently, which are inactive, where data resides, and how fast it’s growing. 
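
As a minimal sketch of what such an index captures, the Python below walks a single directory tree and splits files into hot and cold by last-access time. It is an illustration only: a real global index spans many storage systems, the names `FileRecord`, `scan`, and `summarize_by_age` are hypothetical, and last-access timestamps can be unreliable on file systems mounted with access-time updates reduced or disabled.

```python
import os
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def scan(root: str) -> list[FileRecord]:
    """Collect size and last-access time for every file under root."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                st = os.stat(full_path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            records.append(FileRecord(full_path, st.st_size, st.st_atime))
    return records


def summarize_by_age(records, now=None, cold_after_days=365):
    """Count files and bytes that are hot vs. cold (not accessed in cold_after_days)."""
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * SECONDS_PER_DAY
    summary = {
        "hot": {"files": 0, "bytes": 0},
        "cold": {"files": 0, "bytes": 0},
    }
    for record in records:
        bucket = "cold" if record.last_access < cutoff else "hot"
        summary[bucket]["files"] += 1
        summary[bucket]["bytes"] += record.size_bytes
    return summary
```

Calling `summarize_by_age(scan("/some/share"))` on a mounted share (the path is a placeholder) would give a first, coarse view of how much capacity cold data occupies.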

With a more detailed picture of the data profile, IT infrastructure managers can define automated policies to tier cold data after a certain age (e.g., 12 months) to more affordable cloud-based object storage. These rules can be enforced centrally with transparent data movement, so users continue to see their files from the original location. This way, researchers don’t waste their valuable time identifying data for tiering and archiving. 
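
An age-based policy of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than any product’s policy engine: `TieringPolicy`, `plan_tiering`, and the `"object-archive"` tier label are hypothetical names, and the function only produces a plan; it moves nothing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    last_access: datetime


@dataclass(frozen=True)
class TieringPolicy:
    min_age_days: int = 365          # roughly the 12-month example above
    target: str = "object-archive"   # hypothetical label for the lower-cost tier


def plan_tiering(files, policy, now):
    """Return (path, target) pairs for files not accessed within the policy age."""
    cutoff = now - timedelta(days=policy.min_age_days)
    return [(f.path, policy.target) for f in files if f.last_access <= cutoff]


if __name__ == "__main__":
    now = datetime(2025, 8, 27)
    files = [
        FileInfo("/imaging/2022/scan-001.dcm", 30_000_000, datetime(2023, 1, 5)),
        FileInfo("/imaging/2025/scan-900.dcm", 30_000_000, datetime(2025, 8, 1)),
    ]
    for path, target in plan_tiering(files, TieringPolicy(), now):
        print(f"tier {path} -> {target}")
```

Separating the plan from the data movement keeps the rule easy to review centrally before anything is relocated.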

Key departmental users can also get read-only access to dashboards and reports where they can see their departmental data footprint and do their own analysis.  These reports allow department heads, researchers and clinicians to tag additional files or folders for archiving, such as data from completed studies.   
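
One simple way to picture user tagging is a folder-level tag index, where a tag on a folder applies to everything beneath it. The Python below is a hypothetical sketch under that assumption; `TagIndex`, the tag names, and the paths are invented for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Folder-level tags that apply to every path under the tagged folder."""

    def __init__(self):
        self._tags = defaultdict(set)  # folder path -> set of tags

    def tag(self, folder, *tags):
        """Attach one or more tags to a folder."""
        self._tags[folder.rstrip("/")].update(tags)

    def tags_for(self, path):
        """Return all tags a path inherits from itself or any tagged ancestor folder."""
        found = set()
        for folder, tags in self._tags.items():
            if path == folder or path.startswith(folder + "/"):
                found |= tags
        return found

    def folders_with(self, tag):
        """Return the tagged folders carrying a given tag, sorted by path."""
        return sorted(f for f, t in self._tags.items() if tag in t)


if __name__ == "__main__":
    index = TagIndex()
    # A department head marks a completed study as ready for archiving.
    index.tag("/research/study-a/", "archive-ready", "oncology")
    print(index.tags_for("/research/study-a/scan1.dcm"))
    print(index.folders_with("archive-ready"))
```

IT could then treat every folder returned by `folders_with("archive-ready")` as an additional candidate for tiering, on top of what the age-based rule selects.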

This two-pronged strategy of automating age-based transparent tiering and involving users in additional data tiering decisions can double or even triple the amount of data that is moved off premium file storage, saving a large institution six or seven figures annually on data storage and backup costs. Data is never deleted: it is simply relocated to less expensive, durable storage. Even better, this allows the data owners, who know the data best, to be involved in its management, engendering better relationships with IT.
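
The combined effect can be estimated with a short Python sketch. Everything numeric here is a placeholder: the per-terabyte costs, file sizes, and function names are assumptions chosen for illustration, not figures from the article or from any vendor’s pricing.

```python
def combined_tiering_set(age_selected, user_tagged):
    """Union of both selections, so a file picked by both is counted once."""
    return set(age_selected) | set(user_tagged)


def total_tb(paths, size_tb_by_path):
    """Total terabytes across the given paths."""
    return sum(size_tb_by_path[p] for p in paths)


def annual_savings(tb_moved, primary_cost_per_tb_year, archive_cost_per_tb_year):
    """Yearly savings from holding tb_moved on the archive tier instead of primary."""
    return tb_moved * (primary_cost_per_tb_year - archive_cost_per_tb_year)


if __name__ == "__main__":
    sizes = {"/a": 200, "/b": 150, "/c": 300}  # placeholder sizes in TB
    by_age = {"/a", "/b"}                      # chosen by the age-based policy
    by_users = {"/b", "/c"}                    # tagged by departmental users
    moved = total_tb(combined_tiering_set(by_age, by_users), sizes)
    # Placeholder costs per TB per year for primary and archive storage.
    print(moved, annual_savings(moved, 300, 50))
```

In this made-up case, user tagging raises the tiered total from 350 TB to 650 TB, which shows how the second prong can substantially increase what the age rule moves on its own.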

An additional use case here is ransomware protection. With data tiered to immutable object storage, such as Amazon S3 Object Lock or immutable storage for Azure Blob Storage, data cannot be modified or deleted while the retention policy is in effect. Many large healthcare organizations may not be protecting all their data equally from ransomware actors; now they can do so because the economics are right.

The Benefits of Department Collaboration on Data Management 

For collaborative data management to succeed, users must trust that moving or tiering data won’t interrupt their work while IT still achieves goals for lower costs and lower complexity.  Benefits can include: 

  • Non-disruptive access: Tiered data should remain visible in the same file paths. If a user clicks on a file that has been archived, it should open just like any other file without the need to call IT. 
  • Accessible dashboards: Users should be able to see basic file metadata (owner, size, age, last accessed) on dashboards. This can reveal cold vs. hot data, growth trends, and cost implications. 
  • Easy metadata enrichment: Users can tag files as ready for archiving by IT. Additionally, they should be able to tag directories or folders of files by project name, clinical area, or research keywords. When needed, IT can apply AI tools to help in this effort by scanning files across large data sets, inspecting file content and delivering a subset of data that can be tagged with keywords. Unstructured data management software can apply tags automatically by policy. By enriching file metadata this way, it’s easier and faster for users to curate the data they need for projects.
  • Support for Chargeback:  Many organizations are deploying showback or chargeback models for IT services like storage. For example, a department might only be charged for the data it keeps on expensive primary storage; anything archived to the cloud is free. Detailed reporting on data usage and costs helps departments plan ahead. 
  • Storage Cost Savings: For a large healthcare system, savings in the millions of dollars annually are possible, since the organization can rely less on expensive primary storage and therefore defer hardware purchases.
  • Better Classification and Search for AI: High-quality unstructured data is integral to AI. Metadata enrichment through data tagging gives more structure and context to file data so that it can be categorized and leveraged in AI data workflows for clinical research. Authorized users can search for specific files and folders of interest across their entire data estate without IT assistance.
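
The chargeback model described in the list above, where a department pays only for data on primary storage and archived data is free, can be sketched as follows. The rate, tier labels, department names, and the `chargeback` function are hypothetical; real models vary by organization.

```python
from collections import defaultdict


def chargeback(records, primary_rate_per_tb):
    """Charge each department only for data held on primary storage.

    records: iterable of (department, tier, size_tb) tuples.
    Data on any tier other than "primary" is free in this example model.
    """
    totals = defaultdict(float)
    for department, tier, size_tb in records:
        charge = size_tb * primary_rate_per_tb if tier == "primary" else 0.0
        totals[department] += charge
    return dict(totals)


if __name__ == "__main__":
    usage = [
        ("radiology", "primary", 10),
        ("radiology", "archive", 40),
        ("genomics", "archive", 5),
    ]
    # Placeholder rate: 100 currency units per TB on primary storage.
    print(chargeback(usage, primary_rate_per_tb=100))
```

Departments with no primary footprint still appear in the report with a zero charge, which makes the incentive to archive visible to everyone.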

Petabytes of unstructured data are a blessing and a curse for healthcare CIOs. This data is an asset for future research to improve facility operations, diagnostics, treatments and outcomes for patients. It’s what healthcare CEOs and boards are clamoring for in the race to be profitable and attract patients and clinicians. Yet this data comes with high costs and compliance risks if not managed systematically. Instead of looking for more budget to expand storage capacity in the data center, IT leaders should start by understanding and classifying their data. By doing so, they can take advantage of lower-priced storage while supporting departments with new data services. 


About Krishna Subramanian 

Krishna Subramanian is the COO, president, and co-founder of Komprise. In her career, Subramanian has built three successful venture-backed IT businesses and was named a “2021 Top 100 Women of Influence” by Silicon Valley Business Journal.

Copyright © 2025. HIT Consultant Media. All Rights Reserved.