Analysis Report Four - Health Privacy and Data Profiling
Author
Tristan Worlock
Executive Summary
This week’s assigned readings examined how healthcare organizations continue to adopt digital technologies such as electronic health records (EHRs), telehealth platforms, artificial intelligence (AI), and cloud-based health information systems to improve patient care and operational efficiency (Davidson & Winter, 2025). The readings show that while these technologies have improved communication, access to patient information, and clinical decision-making, they have also introduced significant challenges related to cybersecurity, patient privacy, clinical workload, and system interoperability (Davidson & Winter, 2025).
The most important theme I found across the readings is that technology alone will not improve healthcare outcomes. Organizations must invest in cybersecurity, staff training, privacy protections, and change management (Kruse et al., 2017). Recent events, including the Change Healthcare ransomware attack and increasing federal oversight of digital health companies, demonstrate that healthcare organizations must treat data security and governance as strategic priorities rather than purely technical issues (Kruse et al., 2017).
Introduction
This week’s assigned readings focused on how healthcare organizations use data as a strategic asset while addressing the challenges that accompany digital transformation. A major theme of the readings is the widespread adoption of EHRs, which the Health Information Technology for Economic and Clinical Health (HITECH) Act accelerated through financial incentives for hospitals (Adler-Milstein & Jha, 2017). Although EHRs improve information sharing and care coordination, healthcare professionals continue to report concerns about usability, interoperability, documentation burden, and workflow disruptions (Adler-Milstein & Jha, 2017).
The readings also introduce broader data strategy concepts involving artificial intelligence, telehealth, privacy, and cybersecurity. AI can improve clinical decision-making, but it also creates a gray area, because it should support, not replace, professional judgment (Davidson & Winter, 2025). Telehealth expands access to care but raises concerns about how sensitive patient information is collected and shared (Davidson & Winter, 2025). Across all of the articles, effective data governance depends on transparency, strong cybersecurity practices, regulatory compliance, and organizational leadership that encourages responsible technology adoption (Davidson & Winter, 2025).
The Healthcare Context
Today’s healthcare organizations face the challenge of adopting digital technologies while protecting sensitive patient information. Although EHRs, telehealth, and AI have improved patient care, communication, and operational efficiency, they have also introduced concerns related to cybersecurity, data privacy, interoperability, and clinician workload. Recent events, such as the 2024 Change Healthcare cyberattack, demonstrated how a single cybersecurity breach can disrupt healthcare services nationwide and expose the personal information of millions of patients (Kruse et al., 2017).
In addition, increased FTC enforcement against companies like BetterHelp highlights the growing need for stronger privacy protections and transparent data practices (Kruse et al., 2017). Overall, as healthcare continues its digital transformation, organizations must balance technological innovation with effective cybersecurity, regulatory compliance, and strong data governance to protect patients while delivering high-quality care.
Pre-Visualization Table:
# Select the ten most recently discharged admissions among deceased patients
# (expire_flag = 1) that have a recorded in-hospital death time.
candidates <- dbGetQuery(mydb, "
  SELECT p.subject_id, p.gender, p.dob, p.dod,
         a.hadm_id, a.admittime, a.dischtime, a.deathtime,
         a.admission_type, a.marital_status, a.ethnicity
  FROM patients p
  JOIN admissions a ON p.subject_id = a.subject_id
  WHERE p.expire_flag = 1
    AND a.deathtime IS NOT NULL
  ORDER BY a.dischtime DESC
  LIMIT 10
")
candidates
subject_id gender dob dod hadm_id
1 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 153826
2 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 149469
3 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 145024
4 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 151798
5 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 179418
6 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 155297
7 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 125013
8 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 174863
9 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 180546
10 41976 M 2136-07-28 00:00:00 2202-12-05 00:00:00 130681
admittime dischtime deathtime admission_type
1 2202-10-03 01:45:00 2202-10-11 16:30:00 EMERGENCY
2 2202-09-16 21:56:00 2202-09-23 16:20:00 EMERGENCY
3 2202-05-01 22:00:00 2202-05-04 18:42:00 EMERGENCY
4 2202-02-15 19:01:00 2202-02-19 16:42:00 EMERGENCY
5 2201-12-31 19:19:00 2202-01-03 17:55:00 EMERGENCY
6 2201-11-16 23:00:00 2201-11-19 16:30:00 EMERGENCY
7 2201-09-28 16:47:00 2201-10-01 15:53:00 EMERGENCY
8 2201-08-10 23:00:00 2201-08-13 16:55:00 EMERGENCY
9 2201-05-12 10:49:00 2201-05-19 14:04:00 EMERGENCY
10 2200-10-29 20:46:00 2200-11-03 18:45:00 EMERGENCY
marital_status ethnicity
1 MARRIED HISPANIC/LATINO - PUERTO RICAN
2 MARRIED HISPANIC/LATINO - PUERTO RICAN
3 MARRIED HISPANIC/LATINO - PUERTO RICAN
4 MARRIED HISPANIC/LATINO - PUERTO RICAN
5 MARRIED HISPANIC/LATINO - PUERTO RICAN
6 MARRIED HISPANIC/LATINO - PUERTO RICAN
7 MARRIED HISPANIC/LATINO - PUERTO RICAN
8 MARRIED HISPANIC/LATINO - PUERTO RICAN
9 MARRIED HISPANIC/LATINO - PUERTO RICAN
10 MARRIED HISPANIC/LATINO - PUERTO RICAN
VISUALIZATION #1:
# Draw each ICU stay as a horizontal bar from intime to outtime, one row per care unit.
ggplot(data = icu_timeline,
       aes(x = intime, xend = outtime,
           y = first_careunit, yend = first_careunit)) +
  # linewidth replaces the size aesthetic for lines, which was deprecated in ggplot2 3.4.0
  geom_segment(linewidth = 6, color = "steelblue") +
  labs(title = "ICU Unit Transfers During Terminal Stay",
       x = "Date/Time", y = "Care Unit")
EXPLANATION for #1: I created Visualization #1 to show a direct use of the INTIME/OUTTIME timestamps: reconstructing where this patient physically was during their final hospitalization. For admission 153826 (hadm_id), it revealed something meaningful: the patient never left the MICU. The record shows a single MICU stay of roughly 39 hours, ending in the patient’s death, and that is a data point in itself. It shows that this was not a case of escalating care across multiple units. This was a patient who arrived already critical enough to go straight to intensive care and did not survive the stay.
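The stay length and unit count described above can be derived from the same two timestamp columns. The following is a minimal sketch in base R, assuming a data frame shaped like `icu_timeline` with text timestamps. The single row below is a made-up stand-in (not the patient's actual record), constructed to span 39 hours, and `stay_hours` and `n_units` are illustrative names.

```r
# Toy stand-in for icu_timeline; the timestamps are illustrative only
# and are NOT taken from the database.
icu_timeline <- data.frame(
  first_careunit = "MICU",
  intime  = "2150-01-01 08:00:00",
  outtime = "2150-01-02 23:00:00",
  stringsAsFactors = FALSE
)

# Parse the text timestamps so date arithmetic works.
icu_timeline$intime  <- as.POSIXct(icu_timeline$intime,  tz = "UTC")
icu_timeline$outtime <- as.POSIXct(icu_timeline$outtime, tz = "UTC")

# Length of each ICU stay in hours.
icu_timeline$stay_hours <- as.numeric(
  difftime(icu_timeline$outtime, icu_timeline$intime, units = "hours")
)

# Number of distinct care units the patient passed through.
n_units <- length(unique(icu_timeline$first_careunit))

print(icu_timeline[, c("first_careunit", "stay_hours")])
print(n_units)
```

A single row with one care unit is what "never left the MICU" looks like in the data; more rows or more distinct units would indicate transfers.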
VISUALIZATION #2:
dbGetQuery(mydb, "
  SELECT DISTINCT d.label
  FROM labevents le
  JOIN d_labitems d ON le.itemid = d.itemid
  WHERE le.hadm_id = 153826
")
label
1 Alanine Aminotransferase (ALT)
2 Albumin
3 Alkaline Phosphatase
4 Anion Gap
5 Asparate Aminotransferase (AST)
6 Bicarbonate
7 Bilirubin, Total
8 Chloride
9 Creatinine
10 Glucose
11 Potassium
12 Sodium
13 Urea Nitrogen
14 Basophils
15 Eosinophils
16 Hematocrit
17 Hemoglobin
18 INR(PT)
19 Lymphocytes
20 MCH
21 MCHC
22 MCV
23 Monocytes
24 Neutrophils
25 Platelet Count
26 PT
27 PTT
28 RDW
29 Red Blood Cells
30 White Blood Cells
31 Base Excess
32 Calculated Total CO2
33 Lactate
34 pCO2
35 pH
36 pO2
37 Epithelial Cells
38 RBC
39 Specific Gravity
40 WBC
41 Yeast
42 Calcium, Total
43 Magnesium
44 Phosphate
45 25-OH Vitamin D
46 Ferritin
47 Iron
48 Iron Binding Capacity, Total
49 Parathyroid Hormone
50 Transferrin
51 Creatine Kinase (CK)
52 Creatine Kinase, MB Isoenzyme
53 Troponin T
lab_trend <- dbGetQuery(mydb, "
  SELECT le.charttime, le.valuenum, le.valueuom, d.label
  FROM labevents le
  JOIN d_labitems d ON le.itemid = d.itemid
  WHERE le.hadm_id = 153826
    AND d.label = 'Creatinine'
  ORDER BY le.charttime
")
# Assumption: charttime comes back from the database as text, which makes
# geom_line() treat every timestamp as its own group. Parsing it as a
# date-time lets the line connect the points.
lab_trend$charttime <- as.POSIXct(lab_trend$charttime, tz = "UTC")
ggplot(data = lab_trend, aes(x = charttime, y = valuenum)) +
  geom_line(color = "firebrick") +
  geom_point() +
  labs(title = "Creatinine Trend Over Terminal Stay",
       x = "Date/Time", y = "Creatinine (mg/dL)")
EXPLANATION #2: I created this visualization to track the patient’s creatinine levels over the course of their terminal stay, using timestamped lab values pulled directly from the LABEVENTS table. Creatinine is a standard clinical marker of kidney function, and plotting it against the exact CHARTTIME values shows whether renal function was declining, stable, or fluctuating in the lead-up to death. A rising trend would most likely be consistent with acute kidney injury or some form of multi-organ decline, which can be common in critically ill patients nearing the end of life. Frequent measurements also reflect how closely this patient was being monitored during their stay. Overall, this shows how granular, timestamped clinical data can be used to reconstruct a patient’s physiological trajectory. It also reinforces how much sensitive detail hospitals collect and retain on individual patients.
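One way to put a number on "rising, stable, or fluctuating" is to fit a straight line to the creatinine values over time and read off the slope. This is a rough sketch in base R, not a clinical method. The four measurements are invented for illustration, and `hours`, `slope_per_day`, and `trend` are names introduced here.

```r
# Illustrative creatinine series shaped like lab_trend; the values are
# made up for this sketch and are NOT the patient's results.
lab_trend <- data.frame(
  charttime = c("2150-01-01 08:00:00", "2150-01-01 20:00:00",
                "2150-01-02 08:00:00", "2150-01-02 20:00:00"),
  valuenum  = c(1.0, 1.3, 1.9, 2.2),
  stringsAsFactors = FALSE
)

# Convert timestamps to hours elapsed since the first measurement.
lab_trend$charttime <- as.POSIXct(lab_trend$charttime, tz = "UTC")
lab_trend$hours <- as.numeric(
  difftime(lab_trend$charttime, min(lab_trend$charttime), units = "hours")
)

# Least-squares slope of creatinine against time, rescaled to mg/dL per day.
fit <- lm(valuenum ~ hours, data = lab_trend)
slope_per_day <- unname(coef(fit)["hours"]) * 24

# Crude label based only on the sign of the slope.
trend <- if (slope_per_day > 0) {
  "rising"
} else if (slope_per_day < 0) {
  "falling"
} else {
  "flat"
}

print(slope_per_day)
print(trend)
```

A single slope hides fluctuation, so it works best alongside the plot rather than as a replacement for it.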
Recommendations for Industry
Based on the visualizations I built, I think hospitals need to take a harder look at two things: how they use timestamp data operationally, and how carefully they protect it. On the operational side, I was surprised by how much can be reconstructed just from admission, ICU, and lab timestamps. With a few queries, I was able to map out exactly where this patient was and how their condition changed, almost hour by hour. That same capability could be useful for hospitals if they apply it proactively. If administrators tracked ICU stay patterns and lab trends like creatinine in real time across all patients, they could catch signs of decline earlier and make faster staffing or care decisions, rather than only piecing the story together after the fact as I did.
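To illustrate what tracking creatinine across all patients could look like, here is a hedged sketch in base R that flags any admission whose creatinine rose by at least 0.3 mg/dL within 48 hours. That threshold mirrors a commonly cited acute kidney injury criterion, but it is used here only as an illustrative assumption, not as clinical guidance. The `labs` data frame, the `flag_rise` function, and its parameters are all hypothetical.

```r
# Hypothetical creatinine results for two admissions (made-up values).
labs <- data.frame(
  hadm_id = c(1, 1, 1, 2, 2, 2),
  charttime = as.POSIXct(
    c("2150-01-01 06:00:00", "2150-01-01 18:00:00", "2150-01-02 06:00:00",
      "2150-01-01 07:00:00", "2150-01-01 19:00:00", "2150-01-02 07:00:00"),
    tz = "UTC"
  ),
  creatinine = c(0.9, 1.1, 1.5, 1.0, 1.0, 1.1)
)

# Returns TRUE if any later value within the window exceeds an earlier
# value by at least min_rise (both thresholds are illustrative assumptions).
flag_rise <- function(df, min_rise = 0.3, window_hours = 48) {
  df <- df[order(df$charttime), ]
  n <- nrow(df)
  if (n < 2) return(FALSE)
  for (i in seq_len(n - 1)) {
    later <- df[(i + 1):n, ]
    hrs <- as.numeric(difftime(later$charttime, df$charttime[i], units = "hours"))
    rise <- later$creatinine[hrs <= window_hours] - df$creatinine[i]
    if (any(rise >= min_rise)) return(TRUE)
  }
  FALSE
}

# Apply the check to every admission separately.
flags <- sapply(split(labs, labs$hadm_id), flag_rise)
print(flags)
```

In this toy data only the first admission is flagged. A real monitoring tool would need clinical validation and would run against live data rather than a static table.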
At the same time, that same power is exactly why I think data governance needs to be taken more seriously. My recommendation is for healthcare organizations to invest in stronger role-based access controls and regular audits of who is running these kinds of granular queries, especially on identifiable patient data. The same data that makes hospitals better at predicting and responding to patient decline also needs to be locked down much more tightly than it probably is now.
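As a rough sketch of what such an audit might check, the base R example below compares a query log against a simple role policy and lists users who ran patient-level queries without a permitted role. Everything in it is hypothetical: the `audit_log` columns, the user and role names, and the `patient_level_roles` policy are assumptions made for illustration and do not describe any real hospital system.

```r
# Hypothetical query audit log; all users, roles, and rows are invented.
audit_log <- data.frame(
  user = c("analyst_a", "analyst_a", "nurse_b",
           "analyst_c", "analyst_c", "analyst_c"),
  role = c("researcher", "researcher", "clinician",
           "researcher", "researcher", "researcher"),
  table_name = c("labevents", "admissions", "labevents",
                 "labevents", "icustays", "patients"),
  # TRUE when the query filtered down to one identifiable patient.
  single_patient = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical policy: roles allowed to run patient-level queries.
patient_level_roles <- c("clinician")

# A query violates the policy if it targets a single patient and the
# user's role is not on the allowed list.
audit_log$violation <- audit_log$single_patient &
  !(audit_log$role %in% patient_level_roles)

# Count violations per user and list anyone with at least one.
violations_by_user <- tapply(audit_log$violation, audit_log$user, sum)
flagged <- names(violations_by_user)[violations_by_user > 0]

print(violations_by_user)
print(flagged)
```

Running a summary like this on a schedule is one way to make "regular audits" concrete, with flagged users reviewed by a person rather than blocked automatically.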
References
Adler-Milstein, J., & Jha, A. K. (2017). HITECH Act drove large gains in hospital electronic health record adoption. Health Affairs, 36(8), 1416–1422.
Davidson, E., & Winter, J. (2025). Consumer health data: Regulation, governance, and innovation.
Kruse, C. S., Frederick, B., Jacobson, T., & Monticone, D. K. (2017). Cybersecurity in healthcare: A systematic review of modern threats and trends. Technology and Health Care, 25(1), 1–10.
Neprash, H. T., McGlave, C. C., Cross, D. A., Virnig, B. A., Puskarich, M. A., Huling, J. D., Rozenshtein, A. Z., & Nikpay, S. S. (2022). Trends in ransomware attacks on US hospitals, clinics, and other health care delivery organizations, 2016-2021. JAMA Health Forum, 3(12), e224873.