
Can NLP Text-Mining Transform Drug Discovery & Development for Biopharma?

by Jane Reed, Head of Life Science Strategy at Linguamatics 01/31/2018


Healthcare is in the midst of a data explosion, thanks to the widespread adoption of electronic health records (EHRs), the increased sophistication of insurance claims databases, and the growth of social media. These real-world data provide critical insights into the health of patients and can be leveraged to advance pharma and biotech initiatives that improve the delivery of care.

This wealth of clinical data is particularly valuable for organizations seeking to understand the real-world impact of therapies on patients. Biopharmaceutical companies that have ready access to real-world evidence (RWE) based on real-world data (RWD), that is, patient outcome data collected outside clinical trials, have the potential to speed the development and commercialization of their offerings.

RWE can shed light on real-world clinical effectiveness and on the safety profiles of products across a broad patient community. It can also help companies assess patient-reported outcomes, manage product reputation, and engage opinion leaders.

Expanded revelations with RWE

Although data from a controlled clinical trial provide good information on safety and efficacy, RWE gives pharma and biotech companies expanded insights because it is collected from a broader segment of the population. RWE captures factors that may have been controlled for in a clinical trial setting but that can affect outcomes, such as comorbidities, co-medications, patient age, or social factors.

While identifying negative outcomes is a priority, RWE is also valuable for uncovering unexpected positive outcomes, such as a therapy proving especially effective in certain populations or beneficial for particular off-label conditions.

Life science companies need RWE to understand the impact of their new products in order to update labelling if necessary, to understand long-term effectiveness, or to disclose previously unknown side-effects.

RWD and the free-text challenge

Even though we have more healthcare data than ever before, much of the information is hidden in unstructured text within EHRs, adverse event reports, social media, and customer call transcripts. Unstructured text often provides a level of detail and granularity not available from the structured fields that are more commonly available to life science companies.

However, when data are stored as free text instead of within structured fields and mapped to standards, the extraction of data becomes a challenge, as does the integration of data into advanced analytics programs.

Unfortunately, if pharma and biotech companies can’t readily access the key information from RWD, the value of the data is limited. To effectively leverage the rich information stored as free-text RWD, biopharmaceutical companies must take advantage of advanced technologies to transform unstructured data into actionable intelligence for decision-making.

For example, numerous biopharma firms have deployed artificial intelligence technologies such as natural language processing (NLP)-based text mining to unlock the value of free-text RWD.

NLP-based text mining

NLP-based mining allows users to extract key details from unstructured documents using relevant ontologies and focused queries. For example, queries can be written to extract information on treatment patterns to identify drug switching or discontinuation. Numeric-based queries can search for lab values and dosage information, as well as patient-specific details such as history of disease, problem lists, demographics, social factors, and lifestyle.
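As a minimal sketch of the kind of focused query described above, the Python below uses regular expressions as a stand-in for a real NLP engine and ontology. The drug list, the patterns, the sample note, and the function names are all assumptions made for illustration; production text mining would rely on curated vocabularies and linguistic analysis rather than hand-written patterns.

```python
import re

# Hypothetical mini "ontology": a real system would use a curated drug vocabulary.
DRUGS = ["lamotrigine", "levetiracetam", "valproate"]
DRUG_ALT = "|".join(DRUGS)

# Numeric query: a drug name followed by a dose and unit, e.g. "lamotrigine 100 mg".
DOSE_RE = re.compile(
    rf"\b({DRUG_ALT})\s+(\d+(?:\.\d+)?)\s*(mg|g|mcg)\b", re.IGNORECASE
)

# Treatment-pattern query: a discontinuation or switching phrase
# immediately followed by a drug name.
CHANGE_RE = re.compile(
    rf"\b(discontinued|stopped|switched from)\s+({DRUG_ALT})\b", re.IGNORECASE
)


def extract_doses(note):
    """Return (drug, amount, unit) tuples found in a free-text note."""
    return [(d.lower(), float(a), u.lower()) for d, a, u in DOSE_RE.findall(note)]


def extract_changes(note):
    """Return (event, drug) tuples for discontinuation or switching mentions."""
    return [(e.lower(), d.lower()) for e, d in CHANGE_RE.findall(note)]


# Invented example note, not real patient data.
note = (
    "Patient stopped valproate due to weight gain. "
    "Started Lamotrigine 100 mg twice daily."
)
print(extract_doses(note))
print(extract_changes(note))
```

Even this toy version shows why structured output matters: once doses and switching events are tuples rather than sentences, they can be counted, filtered, and joined to other data.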

Ideally, the technology is flexible enough to apply different business rules based on particular data sets, such as sentiments from tweets or outcomes and treatment pattern choices from EHRs. Thus, NLP-based text mining can transform real-world data into real-world evidence.

Mining EHR data with NLP

Access to EHR data has long been limited due to understandable restrictions to protect patient privacy. Pharma and biotech companies have had to develop relationships with academic medical centers and health systems, which mine EHR data on their behalf. With the growth in electronic records across North America, RWE companies such as RealHealthData and Pentavere are now providing access to de-identified, unstructured patient information.

Using NLP technology, drug companies can mine these medical transcripts and other patient records, which supports the kind of high-value drug switching and discontinuation studies mentioned previously.

For example, one epilepsy study identified that patients were twice as likely to switch medications because of adverse events as for efficacy reasons. The same study found that the most frequent comorbidity was depression, and that drug switching was often correlated with body weight and with alcohol and substance abuse.

Benefits to biotech and pharma

For biotech and pharma companies, effective mining of RWD provides value from bench to bedside and enhances the discovery, development, and post-market delivery of drugs and therapies. Other benefits include:

– RWE supports product development and commercial decision-making because it provides a better understanding of disease states and treatment patterns across broad populations.

– RWE aids health economics and outcomes research, comparative effectiveness research, and post-market product life cycle management, including disease forum engagement, reputation management, KOL engagement, safety profiles, and treatment regimen effectiveness.

– RWE helps companies understand treatment effectiveness and provides insights into patterns of care, long-term safety, healthcare resource utilization, and disease epidemiology.

NLP text-mining in practice

A number of biopharma companies are already leveraging NLP text-mining to advance the development and commercialization of their drugs and therapies. Consider the following use cases:

RWD from voice of the customer calls – Patient and customer call transcripts are rich with details on patient-reported outcomes, side effects, drug interactions, and other insights that greatly impact commercial business decisions and affect post-launch product marketing and planning.

To gain insights into the real-world use of its drugs, one large biopharma company currently uses NLP-based text mining technology to annotate and categorize “voice of the customer” (VoC) call feeds. Researchers in the company’s predictive analytics group have built an end-to-end workflow for processing call transcripts and making sense of the unstructured feeds.

Using agile text-mining technology, researchers categorize and tag calls for key metadata, such as caller demographics and call reasons. By using NLP text-mining technology, the company has doubled the efficiency of its analysis and enabled longitudinal exploration of real-world patient concerns and outcomes.
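To make the tagging step concrete, here is a deliberately simple Python sketch that assigns call-reason tags by phrase matching and then counts them across calls. It is not the company's workflow, which the article does not describe in detail; the categories, trigger phrases, sample transcripts, and function names are hypothetical, and plain substring matching is a crude substitute for NLP (it would, for instance, match "rash" inside "crash").

```python
from collections import Counter

# Hypothetical call-reason categories and trigger phrases.
CALL_REASONS = {
    "side_effect": ["nausea", "rash", "dizzy", "headache"],
    "dosing_question": ["dose", "how often", "missed"],
    "supply_issue": ["refill", "out of stock", "pharmacy"],
}


def tag_call(transcript):
    """Return the sorted call-reason tags whose phrases appear in the transcript."""
    text = transcript.lower()
    return sorted(
        reason
        for reason, phrases in CALL_REASONS.items()
        if any(phrase in text for phrase in phrases)
    )


def summarize(transcripts):
    """Count how many calls carry each tag."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(tag_call(transcript))
    return counts


# Invented transcripts, for illustration only.
calls = [
    "I missed a dose yesterday, what should I do?",
    "Since starting the tablets I feel dizzy and have a headache.",
    "My pharmacy says the refill is out of stock.",
]
for call in calls:
    print(tag_call(call))
print(summarize(calls))
```

Tagging each call once and storing the tags alongside a call date is what would make the longitudinal view possible: the same counts can then be grouped by week or month.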

Comparing adverse event profiles from clinical trials to patient forum data – A different pharma company is using NLP text-mining to examine the differences in nausea adverse reaction (AR) frequencies in clinical trials versus AR frequencies in real-world occurrences, as documented in patient-reported outcomes.

The company works with a third-party patient forum to access the online self-reported data. Using NLP-based text mining, they also extract nausea AR frequencies reported in clinical trials from FDA Drug Product Labels. The company is then able to demonstrate the differences in reporting rates, including those due to differences in dosage and usage.
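Once the two sets of frequencies have been extracted, the comparison itself is simple arithmetic. The Python sketch below shows one way it could be expressed; the function names are assumptions and the counts are placeholders chosen for easy arithmetic, not figures from any study or label. It also ignores the fact that forum posters and trial participants are different populations, which any real analysis would need to address.

```python
def reporting_rate(events, total):
    """Fraction of patients (or forum posters) reporting the adverse reaction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def compare_rates(trial_events, trial_total, forum_events, forum_total):
    """Return both reporting rates plus their absolute difference and ratio."""
    trial = reporting_rate(trial_events, trial_total)
    forum = reporting_rate(forum_events, forum_total)
    return {
        "trial_rate": trial,
        "forum_rate": forum,
        "difference": forum - trial,
        # The ratio is undefined when no events were reported in the trial.
        "ratio": forum / trial if trial > 0 else None,
    }


# Placeholder counts for illustration only; not taken from any study.
result = compare_rates(
    trial_events=50, trial_total=400, forum_events=75, forum_total=300
)
for name, value in result.items():
    print(f"{name}: {value}")
```

Repeating the calculation per dosage band or per usage pattern would give the kind of breakdown the article mentions.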

Leveraging NLP text-mining to improve patient health

The explosion of clinical health data has created a trove of information that can be leveraged to accelerate the discovery and delivery of new drugs and therapies that ultimately improve the health of patients. Thanks to new technologies such as NLP-based text mining, biopharma companies can take advantage of RWD to advance the innovation and delivery of their products.

 Jane Reed is the Head of Life Science Strategy at Linguamatics, where she is responsible for developing the strategic vision for Linguamatics’ growing product portfolio and business development in the life science market.


