Avoid COVID-19 Modeling Pitfalls by Eliminating Bias, Using Good Data

By Kim Babberl, Product Consulting Group Director at MedeAnalytics | 07/28/2020


COVID-19 models are being used every day to predict the course and short- and long-term impacts of the pandemic. And we’ll be using these COVID-19 models for months to come. While many of us in healthcare are not epidemiologists or data scientists, we’re all sifting through the data to get a handle on how many people are going to get sick, how many will end up in the hospital or on a ventilator, and ultimately, how many people will die. 

Government agencies are using models to set public policy, such as social distancing or shelter-in-place mandates, but confusion sets in because the various models often disagree. To understand the inherent disagreement in models, you must look at what goes into their development. Having this information will help you determine the best way to use and interpret predictive COVID-19 models.

Building a Model

For most of us, the process behind developing a model seems a little bit like the Wizard of Oz. It’s hard to pull back the curtain on the underlying details to understand how they work together to generate the ultimate output: predicting the future. 

To appreciate the power of a model’s predictions, it’s important to start with the inputs. Model-building is an iterative (and non-linear) process with five basic steps:

1. Problem identification: A critical part of the process is pinning down what you are trying to solve, which in turn helps you identify the data needed for the model and narrow the types of models to test.

2. Data acquisition and cleaning: This step consumes a significant portion of the time it takes to build a model. It is incredibly important to ensure the data is clean and accurate. Bad data produces bad models and ultimately leads to poor decision-making: garbage in, garbage out. (You also may need to return to data acquisition and cleaning multiple times throughout the model-building lifecycle.)

3. Model selection: When choosing candidate models to test, you'll need to select the features (input variables) that will help the model predict the target variable (model output). In the COVID-19 models, examples of features and target variables include:

Features: Age, Gender, Ethnicity, Underlying Chronic Conditions
Target: Classification of High Risk or Low Risk

Features:
– Percent of social distancing observed by the population
– Population density of location
– Date of first case of COVID-19 detected
– Percent of patients classified as high risk
Targets:
– Number of positive COVID-19 cases
– Number of patients who end up in the ICU
– Number of patients who need a ventilator
– Number of deaths

4. Model fitting: Using different features, different parameters for the same model, or even different data sources for the same target variables will significantly change the prediction.

5. Model validation: In building the model, you must thoroughly investigate the data sources, methodologies used, and assumptions made before accepting any predictions as reality. (A minimal sketch tying steps 2 through 5 together follows below.)
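To make those steps concrete, here is a minimal sketch of steps 2 through 5 in Python. It is not the author's model or any published COVID-19 model: the data is synthetic and the column names (age, gender, chronic_conditions, high_risk) are hypothetical, chosen only to mirror the features and target in the first row of the table above.

```python
# A minimal sketch, not the author's method: synthetic data and hypothetical
# column names are used purely to illustrate steps 2-5 end to end.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2_000

# Step 2: data acquisition and cleaning (synthetic stand-in for a real extract).
df = pd.DataFrame({
    "age": rng.integers(18, 95, n).astype(float),
    "gender": rng.choice(["F", "M"], n),
    "chronic_conditions": rng.poisson(1.2, n).astype(float),
})
df.loc[rng.random(n) < 0.05, "age"] = np.nan        # simulate missing values
df["age"] = df["age"].fillna(df["age"].median())    # garbage in, garbage out
# Synthetic target: risk rises with age and comorbidity burden.
logit = 0.06 * (df["age"] - 60) + 0.8 * df["chronic_conditions"] - 1.0
df["high_risk"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Step 3: model selection -- choose features, a target, and a candidate model.
X = pd.get_dummies(df[["age", "gender", "chronic_conditions"]], drop_first=True)
y = df["high_risk"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 4: model fitting.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Step 5: model validation on held-out data before trusting any prediction.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Held-out AUC: {auc:.2f}")
```

In practice you would swap the synthetic frame for your own cleaned extract and iterate on the features, the candidate model, and its parameters, which is exactly where the process becomes non-linear.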

How Certain is Certain?

It’s difficult to be certain about outcomes derived from a predictive model because of the model’s inherent uncertainty.

Outputs are sometimes deemed to be facts, rather than the probability-driven predictions they really are. As statistician George Box famously said, “All models are wrong, but some are useful.” When a COVID-19 model predicts 120,000 US deaths, many people: 

  1. Accept that with a degree of certainty, such as “This is what will happen”; and
  2. Apply a level of exactness to it, such as “Exactly 120,000 people will die,” despite the range of probabilities. 

Both are dangerous approaches to decision-making. 

For example, using data as of May 10, 2020, the IHME (Institute for Health Metrics and Evaluation) estimates COVID-19 related deaths in the US will reach 137,184 by August 4. Other models, however, put the range of probable deaths at 100,000 to 220,000, a wide span of variability when you’re talking about human lives. Many people overlook the importance of considering this range when using models for decision-making. The range also only accounts for errors inherent in the model itself; it does not account for errors created by using bad data, or for mistakes made by the person training the model when selecting parameters.
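To see why the interval matters as much as the point estimate, here is a toy illustration. It is not IHME's methodology; the starting death count, daily-death rate, and decay assumptions below are made up purely to show how one "most likely" number emerges from a wide band of plausible futures.

```python
# A toy simulation, not IHME's model: perturb uncertain inputs of a simple
# declining-daily-deaths assumption and summarize the resulting forecasts.
import numpy as np

rng = np.random.default_rng(0)
deaths_today = 80_000      # hypothetical cumulative deaths at the forecast date
days_ahead = 86            # e.g., May 10 to August 4

# Uncertain inputs: current daily deaths and how quickly they decline.
daily_deaths = rng.normal(900, 200, size=10_000).clip(min=200)
decay = rng.uniform(0.97, 1.00, size=10_000)

# Each simulation is one plausible future given one draw of the uncertain inputs.
t = np.arange(1, days_ahead + 1)
added = (daily_deaths[:, None] * decay[:, None] ** t).sum(axis=1)
totals = deaths_today + added

point = np.median(totals)
low, high = np.percentile(totals, [2.5, 97.5])
print(f"Point forecast: {point:,.0f} (95% interval: {low:,.0f} to {high:,.0f})")
```

Every input the model is uncertain about widens that band, and, as noted above, the band still says nothing about errors introduced by bad data or poorly chosen parameters.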

Ultimately, when using COVID-19 models to drive policy and to inform operational, financial or clinical decisions, proceed with caution and be sure to look beyond the graphs to the underlying assumptions and supporting data, including the potential for bias. You may find that the best option for you and your organization is to use your own data and train the models yourself to ensure you understand their mechanics. Make yourself the Wizard of Oz.

You can find examples of COVID-19 models and predictions in the links below. As noted at the Centers for Disease Control and Prevention website, “It is important to bring these forecasts together to help understand how they compare with each other and how much uncertainty there is about what may happen in the upcoming four weeks.” 
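As a rough illustration of that "bring these forecasts together" idea, the sketch below compares several forecasts side by side. The model names and numbers are hypothetical placeholders, not real forecasts, but the pattern of comparing point estimates and pooling the outer bounds is how ensemble summaries are typically read.

```python
# Hypothetical placeholder forecasts, used only to show how multiple models'
# point estimates and intervals can be compared and summarized together.
import statistics

four_week_death_forecasts = {
    "model_a": (118_000, 105_000, 132_000),   # (point, lower, upper)
    "model_b": (137_000, 120_000, 160_000),
    "model_c": (125_000, 100_000, 155_000),
    "model_d": (148_000, 128_000, 175_000),
}

points = [p for p, _, _ in four_week_death_forecasts.values()]
lowers = [lo for _, lo, _ in four_week_death_forecasts.values()]
uppers = [hi for _, _, hi in four_week_death_forecasts.values()]

print(f"Across models: median point {statistics.median(points):,.0f}, "
      f"full range {min(lowers):,} to {max(uppers):,}")
for name, (point, lo, hi) in four_week_death_forecasts.items():
    print(f"  {name}: {point:,} ({lo:,}-{hi:,})")
```

The spread between models is itself useful information: when forecasts disagree widely, decisions should lean less heavily on any single point estimate.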


About Kim Babberl 

Kim Babberl is the Product Consulting Group Director at MedeAnalytics. Before joining MedeAnalytics, she spent 11 years as a business analyst lead with a Blues system and 10 years in public accounting and in various healthcare consulting, auditing, and analysis roles supporting providers and payers.


Additional information about COVID-19 models can be found at:

  • https://covid19.healthdata.org/united-states-of-america
  • https://projects.fivethirtyeight.com/covid-forecasts/
  • https://www.cdc.gov/coronavirus/2019-ncov/covid-data/forecasting-us.html
