Analysis Report Two - Data, Data Everywhere

Author

Mackenzi Horabik

Executive Summary

This report examines how healthcare organizations can use big data and clinical decision support systems to improve decision-making and patient care. Using the MIMIC-III database, data from admissions, microbiology, ICU, prescription, and patient demographic systems were combined through SQL queries and visualized using ggplot2. The analysis identified the organisms most commonly associated with sepsis and demographic differences within the patient population. The findings demonstrate the value of integrating data from multiple hospital systems to support clinical decision-making, operational efficiency, and quality improvement initiatives. Based on the readings and data analysis, healthcare organizations should continue investing in integrated data platforms, effective clinical decision support tools, and predictive analytics capabilities.

Introduction

The readings for this week focused on the challenges healthcare organizations face as they generate and store increasingly large amounts of data. Hospitals collect information from electronic medical records, laboratory systems, medication records, monitoring devices, and administrative systems, creating opportunities to improve patient care while also introducing challenges related to data management and information overload. The concept of clinical decision support systems (CDSS) was highlighted as a method of transforming large amounts of clinical information into actionable recommendations for healthcare providers. Research has shown that CDSS can improve patient safety, support clinical decision-making, and enhance healthcare quality when implemented effectively (Sutton et al. 2020). The readings also emphasized the growing influence of artificial intelligence, wearable technologies, and predictive analytics, which continue to increase both the volume and value of healthcare data. As healthcare organizations become increasingly data-driven, leaders must develop strategies that allow them to integrate information across departments while ensuring that clinicians receive relevant and useful information rather than overwhelming amounts of raw data.

The Healthcare Context

Modern healthcare organizations depend on information generated across multiple departments, including admissions, laboratory services, pharmacy systems, intensive care units, and patient monitoring technologies. The growing availability of healthcare data creates opportunities for organizations to improve decision-making, operational efficiency, and patient outcomes through data analytics (Raghupathi and Raghupathi 2014). Data integration allows information from these separate systems to be combined, providing clinicians and administrators with a more complete view of patient care. For example, a healthcare organization can combine laboratory results, medication records, and patient demographics to identify trends that may improve treatment decisions or operational performance.

While data integration provides significant benefits, it also creates challenges. Large volumes of information can contribute to clinician burnout and information overload if decision support systems are poorly designed. In addition, healthcare organizations must address concerns related to data quality, privacy, cybersecurity, and system interoperability. Effective clinical decision support systems help manage these challenges by organizing data into meaningful insights that support decision-making rather than simply presenting more information. As healthcare organizations continue adopting artificial intelligence, predictive analytics, and wearable technologies, the ability to effectively integrate and analyze data across multiple systems will become increasingly important for maintaining high-quality patient care.

Data Visualizations

Visualization One - Two Table Join

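-- Organisms cultured during admissions whose diagnosis mentions sepsis
SELECT microbiologyevents.org_name
FROM microbiologyevents
INNER JOIN admissions
ON microbiologyevents.hadm_id = admissions.hadm_id
WHERE admissions.diagnosis LIKE '%SEPSIS%'
AND microbiologyevents.org_name IS NOT NULL

library(dplyr)
library(ggplot2)

# myquery1 holds the result of the SQL query above
# Keep the three organisms that appear in the most rows
myquery1_top <- myquery1 %>%
  count(org_name, sort = TRUE) %>%
  slice_head(n = 3)

# Horizontal bar chart of row counts per organism
ggplot(data = myquery1_top,
       aes(y = org_name, x = n)) +
  geom_col()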

This visualization combines admissions and microbiology data to identify the three organisms most frequently recorded in microbiology results for admissions with a sepsis diagnosis. By integrating information from multiple hospital systems, healthcare organizations can better understand infection trends and support clinical decision-making. This example demonstrates how data from different departments can be combined to generate actionable insights for patient care. Understanding which organisms appear most frequently can help support infection monitoring and treatment planning within healthcare organizations.
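One caveat when reading these counts: the chart counts rows, and a microbiology table may hold several rows for the same organism within one admission (for example, repeated cultures). The sketch below shows one way to compare row-level counts with admission-level counts. It uses a small made-up data frame, `toy_cultures`, and assumes `hadm_id` has been added to the SELECT list; the organism names and counts are illustrative only and are not results from MIMIC-III.

```r
library(dplyr)

# Hypothetical toy rows standing in for a query result that also returns hadm_id
toy_cultures <- data.frame(
  hadm_id  = c(1, 1, 1, 2, 2, 3, 4),
  org_name = c("ESCHERICHIA COLI", "ESCHERICHIA COLI", "ESCHERICHIA COLI",
               "STAPH AUREUS COAG +", "STAPH AUREUS COAG +",
               "ESCHERICHIA COLI", "KLEBSIELLA PNEUMONIAE")
)

# Row-level count: every microbiology row contributes to the total
row_counts <- toy_cultures %>%
  count(org_name, sort = TRUE)

# Admission-level count: each organism is counted once per admission
admission_counts <- toy_cultures %>%
  distinct(hadm_id, org_name) %>%
  count(org_name, sort = TRUE, name = "n_admissions")

print(row_counts)
print(admission_counts)
```

If the two tables rank organisms differently, the admission-level version is usually the closer answer to the question of how many sepsis admissions involved each organism.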

Visualization Two - Three Table Join

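-- Organism and patient gender for admissions whose diagnosis mentions sepsis
SELECT microbiologyevents.org_name,
       patients.gender
FROM microbiologyevents
INNER JOIN admissions
ON microbiologyevents.hadm_id = admissions.hadm_id
INNER JOIN patients
ON admissions.subject_id = patients.subject_id
WHERE admissions.diagnosis LIKE '%SEPSIS%'
AND microbiologyevents.org_name IS NOT NULL

library(ggplot2)

# myquery2 holds the result of the SQL query above
# Stacked bars: rows per organism, split by patient gender
ggplot(data = myquery2,
       aes(y = org_name, fill = gender)) +
  geom_bar()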

This visualization expands the sepsis analysis by combining microbiology, admissions, and patient demographic data, in this case patient gender. The three-table join demonstrates how healthcare organizations can integrate information from multiple systems to identify patterns within specific patient populations. By connecting infection data with demographic characteristics, healthcare leaders can gain a more detailed understanding of clinical trends and support data-driven decision-making.
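Because the chart plots every organism returned by the query, it can become crowded, and raw counts make it hard to compare the gender split between common and rare organisms. The sketch below shows one possible refinement: keep only the most frequent organisms and scale each bar to proportions. The data frame `toy_query2`, its values, and the cutoff of two organisms are hypothetical choices for illustration, not results from MIMIC-III.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy rows with the same two columns as the three-table join
toy_query2 <- data.frame(
  org_name = c("ESCHERICHIA COLI", "ESCHERICHIA COLI", "ESCHERICHIA COLI",
               "STAPH AUREUS COAG +", "STAPH AUREUS COAG +",
               "KLEBSIELLA PNEUMONIAE"),
  gender   = c("F", "F", "M", "M", "M", "F")
)

# Names of the two most frequent organisms (2 is an arbitrary cutoff)
top_orgs <- toy_query2 %>%
  count(org_name, sort = TRUE) %>%
  slice_head(n = 2) %>%
  pull(org_name)

# Counts and within-organism shares by gender for those organisms only
gender_summary <- toy_query2 %>%
  filter(org_name %in% top_orgs) %>%
  count(org_name, gender) %>%
  group_by(org_name) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

# Bars scaled to proportions so organisms with different totals are comparable
p <- ggplot(gender_summary, aes(y = org_name, x = n, fill = gender)) +
  geom_col(position = "fill") +
  labs(x = "Share of rows", y = "Organism", fill = "Gender")
```

Calling `print(p)` draws the chart. With real data, `toy_query2` would be replaced by `myquery2` and the cutoff adjusted to the number of organisms worth showing.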

Recommendations for Industry

Healthcare organizations should invest in integrated data systems that allow information from admissions, microbiology, pharmacy, ICU, and patient demographic systems to be combined into a single platform. The SQL analyses performed in this report required data from multiple hospital systems to identify infection patterns and medication utilization trends. Integrated data environments can help clinicians and administrators access information more efficiently and support faster, evidence-based decision-making.

Healthcare organizations should continue expanding the use of clinical decision support systems (CDSS) to help providers manage growing volumes of information. The readings emphasized that hospitals are generating enormous amounts of data, and poorly organized information can contribute to information overload. Effective CDSS tools can filter, prioritize, and present relevant information at the point of care, helping clinicians make informed decisions while reducing the burden of reviewing large quantities of data (Sutton et al. 2020).

Healthcare organizations should leverage predictive analytics and population-level monitoring to identify trends that may improve patient outcomes and operational performance. The sepsis and medication analyses demonstrated how combining data from multiple departments can reveal patterns that may not be visible when examining a single dataset. By using integrated data to identify high-risk patients, monitor treatment variation, and support quality improvement initiatives, healthcare organizations can improve both clinical outcomes and organizational efficiency.

References

Raghupathi, Wullianallur, and Viju Raghupathi. 2014. “Big Data Analytics in Healthcare: Promise and Potential.” Health Information Science and Systems 2 (1): 1–10.
Sutton, Reed T, David Pincock, Daniel C Baumgart, Daniel C Sadowski, Richard N Fedorak, and Karen I Kroeker. 2020. “An Overview of Clinical Decision Support Systems: Benefits, Risks, and Strategies for Success.” NPJ Digital Medicine 3 (1): 17.