Categories of Real-World Evidence Data

Table with four categories of real-world data.

Table with four categories of real-world data is available online.

Four main sources of big data in the healthcare industry (from a report compiled by McKinsey & Company for the Center for US Health System Reform)

Hospital activity (claims) and cost data

Goal: To provide the most cost-effective care.

Hospital data collected by organizations covers everything from the number of discharges and medical procedures to the amount of care supplied by providers in the system and the cost of paying for that care.

Analysis of this data reveals how diseases spread and which health threats should be given priority. It can also identify the most cost-effective treatments for specific ailments and significantly reduce the number of duplicate or unnecessary treatments.
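
As a rough illustration of the kind of analysis described here, the sketch below flags possible duplicate treatments and computes an average cost per procedure. The record layout, field order, and sample values are hypothetical and chosen only for the example; real claims data would need far more careful matching rules.

```python
from collections import defaultdict

# Hypothetical claim records: (patient_id, procedure_code, service_date, cost)
claims = [
    ("p1", "MRI", "2023-01-05", 900.0),
    ("p1", "MRI", "2023-01-05", 900.0),   # same patient, procedure, and day
    ("p2", "MRI", "2023-01-06", 950.0),
    ("p2", "XRAY", "2023-01-06", 120.0),
    ("p3", "XRAY", "2023-01-07", 100.0),
]

def find_duplicates(records):
    """Return (patient, procedure, date) keys that appear more than once."""
    counts = defaultdict(int)
    for patient, procedure, date, _cost in records:
        counts[(patient, procedure, date)] += 1
    return sorted(key for key, n in counts.items() if n > 1)

def average_cost_by_procedure(records):
    """Return the mean billed cost for each procedure code."""
    totals = defaultdict(lambda: [0.0, 0])  # procedure -> [sum, count]
    for _patient, procedure, _date, cost in records:
        totals[procedure][0] += cost
        totals[procedure][1] += 1
    return {proc: total / n for proc, (total, n) in totals.items()}

duplicates = find_duplicates(claims)
average_costs = average_cost_by_procedure(claims)
print(duplicates)
print(average_costs)
```

A repeated key is only a candidate for review, since some procedures are legitimately performed twice on the same day.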

Electronic Health Record (EHR)

Interoperable electronic health records (EHRs) for patient care hold tremendous potential to reduce the growth in costs. EHRs can help healthcare organizations improve chronic disease management, increase operating efficiencies, transform their finances, and improve patient outcomes.

However, EHR implementations are in various stages of maturity across the country, and their benefits have not been fully realized. One of the primary challenges healthcare decision-makers face is how to make meaningful use of the data collected, available, and accessible in EHRs.

Clinical data

Goal: To help identify at-risk patients.

Clinical data includes patient medical records, images gathered during examinations or procedures, lab results, and doctors’ notes.

These patient records (or clinical data) are analyzed using natural language processing algorithms to discover which patients may be at risk for certain health problems.
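
To give a sense of the idea, the sketch below uses plain keyword matching as a very simple stand-in for a language processing pipeline. The risk categories, term lists, and sample notes are hypothetical. The approach ignores negation (for example "no chest pain"), which a production system would have to handle.

```python
import re

# Hypothetical risk terms; a real system would use a clinical vocabulary
RISK_TERMS = {
    "diabetes": ["elevated glucose", "hba1c", "polyuria"],
    "cardiac": ["chest pain", "shortness of breath", "palpitations"],
}

def flag_risks(note):
    """Return the sorted risk categories whose terms appear in a note."""
    text = note.lower()
    found = set()
    for category, terms in RISK_TERMS.items():
        for term in terms:
            # Word boundaries avoid matching inside longer words
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.add(category)
                break
    return sorted(found)

# Hypothetical free-text notes keyed by patient id
notes = {
    "p1": "Patient reports chest pain on exertion. HbA1c pending.",
    "p2": "Routine follow-up, no complaints.",
}
flags = {pid: flag_risks(note) for pid, note in notes.items()}
print(flags)
```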

Pharmaceutical R&D data

Goal: Comparison of data from multiple clinical trials can lead to new solutions for patients.

Over the last few years, a large number of partnerships have sprung up between pharmaceutical companies. In the US, major firms such as Pfizer and Novartis pool their trial data on the ClinicalTrials.gov website. In the UK, GlaxoSmithKline recently unveiled a partnership with the SAS Institute that aims to increase collaboration based on data from clinical trials. Lifestyle information can help find suitable trial candidates more effectively, and comparing data from multiple trials can yield surprising results that lead to new breakthroughs.

Consider Project Datasphere, an initiative to share, integrate, and analyze historical cancer trial data sets for the purpose of accumulating research findings and accelerating cures. The power of this rich dataset is in the analysis and the global focus on finding solutions for cancer patients.

Patient behavior and sentiment data

Goal: To advance preventive medicine, find cures, and improve recovery times.

This category combines data from several sources: over-the-counter drug sales; the latest wearable health devices, which monitor activity, heart rate, sleep patterns, number of steps taken, and more; patient experience and customer satisfaction surveys; and the vast amount of unstructured information about our lifestyles broadcast every day over social media.

By collecting the data from these devices, health care professionals can learn about behavioral patterns, biometrics, and geolocation, both for the general population and for a specific patient. All of this data can help in the pursuit of disease prevention, more effective cures, and faster recovery from illness.

Data Science Applications in Healthcare

Healthcare and Payers

  • Analyzing patient characteristics and the cost and outcomes of care to identify the most clinically effective and cost-efficient diagnoses and treatments.
  • Identifying, predicting, and minimizing fraud by implementing advanced analytic systems for fraud detection and checking the accuracy and consistency of claims.
  • Analyzing large numbers of claim requests rapidly in the pre-adjudication phase to reduce fraud, waste and abuse.
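
A minimal sketch of what checking claims for accuracy and consistency before adjudication might look like, using a few rule-based checks. The field names (`claim_id`, `amount`, `max_allowed`, `service_date`, `submitted_date`) and the rules themselves are assumptions made for illustration, not a description of any payer's system.

```python
from datetime import date

def check_claim(claim, seen_ids):
    """Return a list of rule violations for one claim (empty if clean)."""
    problems = []
    if claim["claim_id"] in seen_ids:
        problems.append("duplicate claim id")
    if claim["amount"] <= 0:
        problems.append("non-positive amount")
    if claim["service_date"] > claim["submitted_date"]:
        problems.append("service date after submission")
    if claim["amount"] > claim["max_allowed"]:
        problems.append("amount exceeds allowed maximum")
    return problems

def screen(claims):
    """Return (claim_id, problems) pairs for every claim that fails a rule."""
    seen = set()
    flagged = []
    for claim in claims:
        problems = check_claim(claim, seen)
        seen.add(claim["claim_id"])
        if problems:
            flagged.append((claim["claim_id"], problems))
    return flagged

# Hypothetical incoming claims
incoming = [
    {"claim_id": "c1", "amount": 200.0, "max_allowed": 500.0,
     "service_date": date(2023, 3, 1), "submitted_date": date(2023, 3, 5)},
    {"claim_id": "c2", "amount": 800.0, "max_allowed": 500.0,
     "service_date": date(2023, 3, 9), "submitted_date": date(2023, 3, 5)},
    {"claim_id": "c1", "amount": 200.0, "max_allowed": 500.0,
     "service_date": date(2023, 3, 1), "submitted_date": date(2023, 3, 5)},
]
flagged = screen(incoming)
print(flagged)
```

Rules like these are cheap enough to run on every incoming claim, leaving statistical fraud models for the claims that pass them.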

Evidence-Based Medicine

  • Combining and analyzing a variety of structured and unstructured data - EMRs, financial and operational data, clinical data, and genomic data - to match treatments with outcomes, predict patients at risk for disease or readmission, and provide more efficient care at reduced cost.
  • Applying advanced analytics to patient profiles (e.g., segmentation and predictive modeling) to identify individuals who would benefit from proactive care or lifestyle changes (for example, patients at risk of developing a specific disease who would benefit from preventive care).
  • Using historical data to personalize medical care by predicting and/or estimating developments or outcomes, such as which patients will choose elective surgery, will not benefit from surgery, are at risk for medical complications or hospital-acquired illness, or will have possible co-morbid conditions.
  • Executing gene sequencing more efficiently and cost effectively to make genomic analysis a part of the regular medical care decision process and the growing patient medical record.
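
The predictive modeling and segmentation mentioned in this list can be sketched with a logistic risk score. The feature names, weights, intercept, and threshold below are purely illustrative assumptions. They are not fitted to any data set and have no clinical validity; in practice the weights would be estimated from historical records.

```python
import math

# Illustrative weights only; not derived from any real data set
WEIGHTS = {"age_over_65": 0.8, "prior_admissions": 0.5, "chronic_conditions": 0.4}
INTERCEPT = -3.0

def readmission_risk(patient):
    """Map patient features to a score between 0 and 1 with a logistic function."""
    z = INTERCEPT + sum(WEIGHTS[k] * patient.get(k, 0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def segment(patients, threshold=0.5):
    """Return the ids of patients whose score meets the threshold."""
    return sorted(pid for pid, p in patients.items()
                  if readmission_risk(p) >= threshold)

# Hypothetical patient profiles
patients = {
    "p1": {"age_over_65": 1, "prior_admissions": 3, "chronic_conditions": 3},
    "p2": {"age_over_65": 0, "prior_admissions": 0, "chronic_conditions": 1},
}
high_risk = segment(patients)
print(high_risk)
```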

Real-Time Healthcare and Clinical Analytics

  • Collecting and publishing data on innovative medical treatment procedures, assisting patients in determining the care protocols or regimens that offer the best value.
  • Aggregating and synthesizing patient clinical records and claims datasets in real time to provide data and services to third parties (e.g., licensing data to assist pharmaceutical companies in identifying patients for inclusion in clinical trials).
  • Detecting individual and population trends more rapidly and accurately by developing and deploying mobile applications that help patients manage their care, locate providers, and improve their health.
  • Monitoring medical devices, including wearables, to capture and analyze large volumes of fast-moving data in real time for safety monitoring and adverse event prediction. This enables payers to monitor adherence to drug and treatment regimens and to detect trends that lead to individual and population wellness benefits.
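
One simple way to picture device monitoring is a sustained-threshold alert over a stream of readings. The sketch below assumes a list of heart-rate values and a hypothetical limit of 120 beats per minute. Both the limit and the required run length are placeholders, not clinical guidance.

```python
def sustained_alerts(readings, limit=120, min_consecutive=3):
    """Return the start index of each run of readings above the limit
    that lasts at least min_consecutive samples."""
    alerts = []
    run_start = None
    run_length = 0
    for i, value in enumerate(readings):
        if value > limit:
            if run_length == 0:
                run_start = i
            run_length += 1
            # Alert once, at the moment the run becomes long enough
            if run_length == min_consecutive:
                alerts.append(run_start)
        else:
            run_length = 0
    return alerts

# Hypothetical heart-rate samples from a wearable
heart_rate = [80, 125, 90, 130, 135, 140, 138, 85]
alerts = sustained_alerts(heart_rate)
print(alerts)
```

Requiring several consecutive readings keeps a single noisy sample from triggering an alert.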

Research & Development

  • Improving predictive modeling to lower attrition and produce a leaner, faster, more targeted R&D pipeline in drugs and devices.
  • Leveraging statistical tools and algorithms to improve clinical trial design and patient recruitment to better tailor treatments to individual patients, thus reducing trial failures and speeding new treatments to market.
  • Analyzing clinical trials and patient records to identify follow-on indications and discover adverse effects before products reach the market.

Public Health

  • Analyzing disease patterns and tracking disease outbreaks and transmission to improve public health surveillance and speed response.
  • Improving data models to better predict virus evolution, leading to more accurately targeted seasonal vaccines (e.g., choosing the annual influenza strains).
  • Turning large amounts of data into actionable information that can be used to identify needs, provide services, and predict and prevent crises, especially for the benefit of populations.
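
As a hedged illustration of outbreak surveillance, the sketch below flags weeks whose case count exceeds a recent baseline by more than a set number of standard deviations. The weekly counts, the four-week baseline, and the multiplier are assumptions for the example. Real surveillance systems use more sophisticated models that account for seasonality and reporting delays.

```python
from statistics import mean, stdev

def outbreak_weeks(weekly_cases, baseline_weeks=4, k=2.0):
    """Return indexes of weeks whose count exceeds baseline mean + k * SD."""
    alerts = []
    for i in range(baseline_weeks, len(weekly_cases)):
        baseline = weekly_cases[i - baseline_weeks:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if weekly_cases[i] > threshold:
            alerts.append(i)
    return alerts

# Hypothetical weekly case counts for one region
cases = [10, 12, 11, 9, 10, 30, 12]
alert_weeks = outbreak_weeks(cases)
print(alert_weeks)
```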

Challenges for Data Science in Healthcare

Big Data Analytics Strategy

Problem: Processing the data raises concerns about availability, ease of use, scalability, the ability to manipulate data at various levels of granularity, the ability to analyze data without IT intervention and with the users’ preferred tools, privacy and security enablement, quality assurance, and transparency.

Organizations need an actionable roadmap comprising people, process, and technology improvements that result from a comprehensive assessment of their existing data management capabilities, prioritized data-related goals, and business value drivers. The data management strategy should identify an organization’s pain points and address them through disciplined yet agile phased execution. The result should be a timely and cost-effective strategic approach that provides incremental business benefits at the conclusion of each phase.

Big Data Governance

Problem: Data stewardship and data quality have to be considered throughout an organization’s continuous data acquisition and data cleansing. Life sciences and healthcare data is rarely standardized and is often fragmented or generated in legacy IT systems with incompatible formats.

Organizations need good data governance. Governance, the human aspect of managing data, encompasses the people, processes, and technology required to ensure the accuracy, timeliness, and effective use of data across the enterprise. Strong operations, the processes required to effectively manage information environments and platforms, support good data governance. Without a well-developed governance program and robust operations, organizations struggle with inaccurate and poor-quality data, leading to untrustworthy results and decisions. Organizations need to develop the tools necessary to effectively and confidently manage their data assets in specific information environments.
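
One small piece of the data cleansing that governance programs cover can be sketched as follows: normalizing dates that arrive in incompatible formats and counting records that fail basic checks. The field names and the list of accepted formats are assumptions. Note that a value such as 03/05/2023 is read here in US month/day order, which is itself a governance decision that would need to be documented.

```python
from datetime import datetime

# Assumed set of formats seen in legacy feeds; extend as needed
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%Y%m%d"]

def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def quality_report(records):
    """Count records with a missing patient id or an unparseable visit date."""
    report = {"missing_patient_id": 0, "unparseable_date": 0}
    for record in records:
        if not record.get("patient_id"):
            report["missing_patient_id"] += 1
        if normalize_date(record.get("visit_date", "")) is None:
            report["unparseable_date"] += 1
    return report

# Hypothetical records from three source systems
records = [
    {"patient_id": "p1", "visit_date": "2023-03-05"},
    {"patient_id": "", "visit_date": "03/05/2023"},
    {"patient_id": "p3", "visit_date": "5 March 2023"},
]
report = quality_report(records)
print(report)
```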

Real-Time Analytics

Problem: The lag between data collection and processing has to be addressed.

Organizations need to implement delivery tools and technologies that not only seamlessly interface with big data platforms, but also drive real-time data analytics. Also, the dynamic availability of numerous analytics algorithms, models, and methods is necessary for large-scale adoption.

Data Sets Available

References