ehr

title: “EHR Reporting Warehouse & Clinical KPI Analytics Engine”

author: “Owar Omot”

date: “2026-06-11”

format:

html:

toc: true

toc-depth: 3

theme: cosmo

highlight-style: github


## Executive Project Overview

This analytics engine simulates an enterprise-level Epic Clarity relational database reporting warehouse. The architecture spans 6 normalized transactional and dimensional clinical entities mapped to evaluate backend key performance indicators (KPIs). Using an integrated pipeline, this Quarto document establishes a database interface, compiles transactional datasets, executes high-ROI analytical procedures, and provisions dynamic visualizations for healthcare operational optimization.

Phase 1: Database Initialization & R-Environment Setup.

```{r setup, message=FALSE, warning=FALSE}

# Load essential data engineering and informatics libraries

library(DBI)

library(RSQLite)

library(ggplot2)

library(dplyr)

# Automatically creates and connects to a self-contained local database file

con <- dbConnect(RSQLite::SQLite(), “ehr_reporting_warehouse.sqlite”)

Phase 2: Environment Purge & Relational Data Definition (DDL)

This block prepares the data warehouse environment by executing target drops to prevent overlapping key constraints, then defines the database schemas, primary identifiers, and multi-level foreign key constraints.– DROP EXISTING TABLES IN THE DEPENDENCY TREE DROP TABLE IF EXISTS Diagnoses; DROP TABLE IF EXISTS Encounters; DROP TABLE IF EXISTS Appointments; DROP TABLE IF EXISTS Patients; DROP TABLE IF EXISTS Providers; DROP TABLE IF EXISTS Departments;

– 1. DEPARTMENTS DIMENSION (Clarity equivalent: Clinic/Unit mapping table) CREATE TABLE Departments ( DepartmentID INT PRIMARY KEY, DepartmentName VARCHAR(100) NOT NULL, Specialty VARCHAR(100) NOT NULL, FacilityLocation VARCHAR(100) NOT NULL );

– 2. PROVIDERS DIMENSION (Clarity equivalent: SER Master Registry) CREATE TABLE Providers ( ProviderID INT PRIMARY KEY, ProviderName VARCHAR(100) NOT NULL, ProviderType VARCHAR(50) NOT NULL, TargetDailyCapacity INT DEFAULT 16 );

– 3. PATIENTS DIMENSION (Clarity equivalent: EPT Master Patient Index) CREATE TABLE Patients ( PatientID INT PRIMARY KEY, PatientName VARCHAR(100) NOT NULL, DateOfBirth DATE NOT NULL, Gender VARCHAR(20) NOT NULL, ZipCode VARCHAR(10) NOT NULL );

– 4. APPOINTMENTS TRANSACTIONAL LOG (Scheduling Access & Status Metric Table) CREATE TABLE Appointments ( AppointmentID INT PRIMARY KEY, PatientID INT REFERENCES Patients(PatientID), ProviderID INT REFERENCES Providers(ProviderID), DepartmentID INT REFERENCES Departments(DepartmentID), AppointmentDate DATE NOT NULL, ScheduledDate DATE NOT NULL, ApptStatus VARCHAR(50) NOT NULL CHECK (ApptStatus IN (‘Completed’, ‘No-Show’, ‘Canceled’)), ReferralSourceProviderID INT REFERENCES Providers(ProviderID) );

– 5. ENCOUNTERS TRANSACTIONAL RECORD (Clarity equivalent: EPT_CS_ENC Patient Visits) CREATE TABLE Encounters ( EncounterID INT PRIMARY KEY, AppointmentID INT UNIQUE REFERENCES Appointments(AppointmentID), PatientID INT REFERENCES Patients(PatientID), ProviderID INT REFERENCES Providers(ProviderID), DepartmentID INT REFERENCES Departments(DepartmentID), CheckInTime TIMESTAMP NOT NULL, CheckOutTime TIMESTAMP NOT NULL, EncounterType VARCHAR(50) NOT NULL, CONSTRAINT chk_times CHECK (CheckOutTime > CheckInTime) );

– 6. DIAGNOSES FACT TABLE (ICD-10 Encounter Assessment Mapping) CREATE TABLE Diagnoses ( DiagnosisID INT PRIMARY KEY, EncounterID INT REFERENCES Encounters(EncounterID), ICD10Code VARCHAR(10) NOT NULL, DiagnosisDescription VARCHAR(255) NOT NULL, DiagnosisType VARCHAR(50) NOT NULL CHECK (DiagnosisType IN (‘Primary’, ‘Secondary’)) );

Phase 3: Transactional Mock Seeding (DML)

We seed the relational schema elements with clinical mock metadata matching regional scheduling metrics.

INSERT INTO Departments VALUES (101, ‘Mankato Specialty Clinic’, ‘Cardiology’, ‘Building A’), (102, ‘River Valley Urgent Care’, ‘Urgent Care’, ‘Building B’), (103, ‘North Mankato Pediatrics’, ‘Pediatrics’, ‘Building C’);

{sql connection=INSERT INTO Providers VALUES 
(501, 'Dr. Sarah Jenkins', 'MD', 12),
(502, 'Mark Olson, NP', 'NP', 20),
(503, 'Dr. Allen Vance', 'DO', 15);
}

```{sql connection=INSERT INTO Patients VALUES (1001, ‘John Doe’, ‘1978-05-12’, ‘Male’, ‘56001’), (1002, ‘Jane Smith’, ‘1992-10-24’, ‘Female’, ‘56003’), (1003, ‘Michael Brown’, ‘2015-03-14’, ‘Male’, ‘56001’), (1004, ‘Alice Green’, ‘1965-08-30’, ‘Female’, ‘56002’);}


```{sql connection=INSERT INTO Appointments VALUES 
(9001, 1001, 501, 101, '2026-06-01', '2026-05-15', 'Completed', NULL),
(9002, 1002, 501, 101, '2026-06-01', '2026-05-28', 'No-Show', 502),
(9003, 1003, 503, 103, '2026-06-02', '2026-05-20', 'Completed', NULL),
(9004, 1004, 502, 102, '2026-06-02', '2026-06-02', 'Completed', 503),
(9005, 1001, 503, 103, '2026-06-03', '2026-05-01', 'Canceled', NULL),
(9006, 1002, 502, 102, '2026-06-04', '2026-06-03', 'Completed', NULL);}

INSERT INTO Encounters VALUES (2001, 9001, 1001, 501, 101, ‘2026-06-01 09:00:00’, ‘2026-06-01 09:45:00’, ‘Office Visit’), (2003, 9003, 1003, 503, 103, ‘2026-06-02 10:15:00’, ‘2026-06-02 11:00:00’, ‘Office Visit’), (2004, 9004, 1004, 502, 102, ‘2026-06-02 14:00:00’, ‘2026-06-02 14:30:00’, ‘Telehealth’), (2006, 9006, 1002, 502, 102, ‘2026-06-04 08:30:00’, ‘2026-06-04 09:15:00’, ‘Office Visit’);

Phase 4: High-ROI Operational Analytics & Reporting

Below are the executed analytical query engines capturing key administrative metrics. Quarto displays the structured output fields directly below each script.

Report 1: Clinic Department Utilization Rate

WITH OperationalHours AS ( SELECT DepartmentID, (strftime(‘%s’, CheckOutTime) - strftime(‘%s’, CheckInTime)) / 3600.0 AS DirectCareHours FROM Encounters ) SELECT d.DepartmentName, d.Specialty, ROUND(SUM(o.DirectCareHours), 2) AS TotalEncounterHours, ROUND( (SUM(o.DirectCareHours) / (COUNT(DISTINCT d.DepartmentID) * 8.0) * 100), 2 ) AS UtilizationRatePercentage FROM Departments d LEFT JOIN OperationalHours o ON d.DepartmentID = o.DepartmentID GROUP BY d.DepartmentName, d.Specialty;

Report 2: No-Show Analysis by Specialty

SELECT d.Specialty, COUNT(CASE WHEN a.ApptStatus = ‘No-Show’ THEN 1 END) AS TotalNoShows, COUNT(a.AppointmentID) AS TotalScheduled, ROUND( (COUNT(CASE WHEN a.ApptStatus = ‘No-Show’ THEN 1 END) * 100.0 / NULLIF(COUNT(a.AppointmentID), 0)), 2 ) AS NoShowRatePercentage FROM Appointments a JOIN Departments d ON a.DepartmentID = d.DepartmentID GROUP BY d.Specialty;

Report 3: Provider Productivity Index

SELECT p.ProviderName, p.ProviderType, COUNT(CASE WHEN a.ApptStatus = ‘Completed’ THEN 1 END) AS CompletedVisits, MAX(p.TargetDailyCapacity) AS BaselineCapacity, ROUND( (COUNT(CASE WHEN a.ApptStatus = ‘Completed’ THEN 1 END) * 100.0 / NULLIF(MAX(p.TargetDailyCapacity), 0)), 2 ) AS CapacityThroughputPercentage FROM Providers p LEFT JOIN Appointments a ON p.ProviderID = a.ProviderID GROUP BY p.ProviderName, p.ProviderType;

Report 4: Cross-Provider Referral Tracking Engine

SELECT ref.ProviderName AS ReferringProvider, rec.ProviderName AS ReceivingProvider, d.Specialty AS DestinationSpecialty, COUNT(a.AppointmentID) AS ReferralsIssued, COUNT(e.EncounterID) AS ReferralsCompleted FROM Appointments a JOIN Providers ref ON a.ReferralSourceProviderID = ref.ProviderID JOIN Providers rec ON a.ProviderID = rec.ProviderID JOIN Departments d ON a.DepartmentID = d.DepartmentID LEFT JOIN Encounters e ON a.AppointmentID = e.AppointmentID GROUP BY ref.ProviderName, rec.ProviderName, d.Specialty;

Report 5: Appointment Lead Time Metrics & R Visualization Layer

Execute target extraction directly from R environment layer using SQLite dialect syntax

lead_time_query <- ” SELECT d.DepartmentName, COUNT(a.AppointmentID) AS TotalBooked, ROUND(AVG(julianday(a.AppointmentDate) - julianday(a.ScheduledDate)), 1) AS AvgLeadTimeDays, MAX(julianday(a.AppointmentDate) - julianday(a.ScheduledDate)) AS MaxLeadTimeDays FROM Appointments a JOIN Departments d ON a.DepartmentID = d.DepartmentID WHERE a.ApptStatus != ‘Canceled’ GROUP BY d.DepartmentName; ”

Fetch results from data warehouse

appointment_lead_time <- dbGetQuery(con, lead_time_query)

Render publication-grade reporting asset

ggplot(appointment_lead_time, aes(x = reorder(DepartmentName, -AvgLeadTimeDays), y = AvgLeadTimeDays, fill = DepartmentName)) + geom_bar(stat = “identity”, width = 0.4, show.legend = FALSE) + geom_text(aes(label = paste0(AvgLeadTimeDays, ” Days”)), vjust = -0.5, fontface = “bold”) + labs( title = “Patient Care-Access Latency Index”, subtitle = “Mean lead-time lag from initial scheduling point to encounter completion date”, x = “Clinical Department Unit”, y = “Average Lead Time Delay (Days)” ) + theme_minimal() + theme(plot.title = element_text(face = “bold”, size = 13))

Cleanly terminate the connection loop behind the scenes upon compilation

dbDisconnect(con)