0.1 Descriptive Statistics

The dataset RadDat_IMSE.csv contains data from a major US hospital related to the time required to fulfill orders for X-Rays. The datafile may be found here: https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/RadDat_IMSE.csv

Some columns of specific interest include:

PatientAge = Age of the patient
Radiology.Technician = A unique identifier assigned to each radiology technician
Priority = Priority of the order, STAT or Routine
Loc.At.Exam.Complete = The floor of the hospital (e.g. 3W, 4W, 5E, etc.)
Ordered.to.Complete...Mins = The time required to complete the order, in minutes

There are some data quality issues, and for some patients the reported time to complete an order is negative. These observations should be filtered out of the dataset before performing your analysis (see the video on filtering rows/selecting columns in the first module).
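
As a minimal sketch of that filtering step, the snippet below keeps only rows whose completion time is non-negative. The helper name `drop_negative_times` and the toy data frame are hypothetical and used only for illustration. Dropping rows with a missing time is also an assumption, since the assignment only mentions negative values.

```r
# Hypothetical helper (not part of the assignment): keep rows whose
# completion time is present and non-negative.
drop_negative_times <- function(df, col = "Ordered.to.Complete...Mins") {
  keep <- !is.na(df[[col]]) & df[[col]] >= 0
  df[keep, , drop = FALSE]
}

# Toy data, made up for illustration (not taken from RadDat_IMSE.csv)
toy <- data.frame(
  PatientAge = c(70, 45, 82, 66),
  Ordered.to.Complete...Mins = c(35, -12, 90, NA)
)

clean <- drop_negative_times(toy)
nrow(clean)  # number of rows that survive the filter
```

The same call could be applied to the full dataset right after `read.csv()`, before any of the summaries are computed.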

0.1.1 Q1

Patients aged 65 or older qualify for Medicare. Generate a histogram of the time required to fulfill X-ray orders for Medicare patients, restricting the allowable range of times to lie between the first and third quartiles of the observations, i.e. the IQR. What do you notice about the shape of this histogram?

Within the IQR, from Q1 = 21 to Q3 = 90 minutes, the histogram appears as a straight line.

# Load the radiology order data
dat <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/RadDat_IMSE.csv")
# Medicare patients: age 65 or older
Age_65 <- dat[dat$PatientAge >= 65, ]
# Zoom the x-axis to the IQR (Q1 = 21, Q3 = 90)
hist(Age_65$Ordered.to.Complete...Mins, xlim = c(21, 90),
     main = "Histogram on the time required to fulfill X-ray orders for\nMedicare patients",
     xlab = "IQR of Ordered to complete (Mins)")
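
One alternative worth noting: `xlim` only zooms the x-axis, so the bins are still computed from all observations. A possible sketch that instead computes the quartiles with `quantile()` and subsets the data before plotting is shown below. The `times` vector is made-up toy data, not values from RadDat_IMSE.csv, and the variable names are illustrative.

```r
# Toy completion times in minutes (made up for illustration)
times <- c(5, 12, 21, 30, 44, 58, 73, 90, 150, 2400)

# First and third quartiles, using R's default quantile definition
q <- quantile(times, probs = c(0.25, 0.75), na.rm = TRUE)

# Keep only the observations inside [Q1, Q3] before plotting,
# so the histogram bins are built from the restricted data
iqr_times <- times[times >= q[1] & times <= q[2]]

hist(iqr_times,
     main = "Times within the IQR (toy data)",
     xlab = "Ordered to complete (mins)")
```

With this approach the quartiles do not need to be typed in by hand, and the shape of the histogram reflects only the middle half of the data.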

0.1.2 Q2

We would like to compare the performance of Radiology Technician 62 to 65 based on their median time to complete an order. How would you interpret your findings?

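Radiology Technician 62 has a median completion time of 80 minutes, so half of this technician's orders take longer than 80 minutes. Radiology Technician 65 has a median of 27 minutes, so half of this technician's orders take longer than 27 minutes. By this measure, Radiology Technician 65 is faster at completing orders.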

Radio_62<-dat[dat$Radiology.Technician == 62,]
Radio_65<-dat[dat$Radiology.Technician == 65,]
median(Radio_62$Ordered.to.Complete...Mins)
## [1] 80
median(Radio_65$Ordered.to.Complete...Mins)
## [1] 27
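
The same comparison can be sketched in a single call by grouping on the technician identifier with `tapply()`. The toy data frame below is made up for illustration and does not reproduce the values in RadDat_IMSE.csv.

```r
# Toy data, made up for illustration
toy <- data.frame(
  Radiology.Technician = c(62, 62, 62, 65, 65, 65, 70),
  Ordered.to.Complete...Mins = c(50, 70, 300, 15, 25, 40, 33)
)

# Restrict to the two technicians being compared
sub <- toy[toy$Radiology.Technician %in% c(62, 65), ]

# Median completion time per technician
med <- tapply(sub$Ordered.to.Complete...Mins,
              sub$Radiology.Technician,
              median)
med
```

The median is a reasonable choice here because a single very long order (such as the 300-minute toy value) barely moves it, whereas it would pull the mean upward.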

0.1.3 Q3

Generate a side by side boxplot comparing the age of those patients receiving STAT versus Routine orders for an X-Ray. What do these boxplots tell you?

These boxplots suggest that Routine orders skew toward older patients, while STAT orders span a broader range of ages and extend further toward younger patients.

Stat<-dat[dat$Priority == "STAT",]
Routine<-dat[dat$Priority == "Routine",]
# The y-axis shows patient age, not a count
boxplot(Stat$PatientAge, Routine$PatientAge, main = "STAT vs Routine (Ages)",
        names = c("STAT", "Routine"), xlab = "X-ray order priority", ylab = "Patient age")
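
A side-by-side boxplot can also be drawn without splitting the data first, using the formula interface of `boxplot()`. The sketch below uses a made-up toy data frame. The column names match the dataset, but the values are illustrative only.

```r
# Toy data, made up for illustration
toy <- data.frame(
  PatientAge = c(25, 40, 55, 70, 85, 60, 68, 72, 80, 90),
  Priority = factor(c("STAT", "STAT", "STAT", "STAT", "STAT",
                      "Routine", "Routine", "Routine", "Routine", "Routine"))
)

# One box per level of Priority; groups are ordered by factor level
b <- boxplot(PatientAge ~ Priority, data = toy,
             main = "STAT vs Routine (Ages, toy data)",
             xlab = "X-ray order priority", ylab = "Patient age")

# Five-number summaries behind each box (one column per group)
b$stats
```

The formula `PatientAge ~ Priority` reads as "age grouped by priority", which keeps the group labels attached to the data instead of relying on the order of the arguments.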

0.1.4 Q4

Calculate the mean and standard deviation of the time required to complete an X-Ray order on the floor 3W compared to 4W. What do you conclude about differences between X-Ray completion times on these floors?

On average, X-ray orders take longer to complete on floor 4W (mean of about 1675 minutes) than on floor 3W (mean of about 1463 minutes). Floor 4W also has the larger standard deviation (about 4388 versus 3895 minutes), so its completion times are more variable than those on floor 3W.

Floor3w<-dat[dat$Loc.At.Exam.Complete == "3W",]
Floor_3w<-Floor3w$Ordered.to.Complete...Mins
mean(Floor_3w)
## [1] 1463.051
sd(Floor_3w)
## [1] 3894.639
Floor4w<-dat[dat$Loc.At.Exam.Complete == "4W",]
Floor_4w<-Floor4w$Ordered.to.Complete...Mins
mean(Floor_4w)
## [1] 1675.451
sd(Floor_4w)
## [1] 4387.644
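
As a compact alternative, both statistics can be computed for both floors at once by splitting the completion times by floor. This is a sketch on made-up toy data; the numbers are not from RadDat_IMSE.csv.

```r
# Toy data, made up for illustration
toy <- data.frame(
  Loc.At.Exam.Complete = c("3W", "3W", "3W", "4W", "4W", "4W", "5E"),
  Ordered.to.Complete...Mins = c(10, 20, 30, 20, 40, 60, 15)
)

# Restrict to the two floors being compared
sub <- toy[toy$Loc.At.Exam.Complete %in% c("3W", "4W"), ]

# Split the times by floor, then compute mean and sd for each group;
# the result is a matrix with one column per floor
summ <- sapply(
  split(sub$Ordered.to.Complete...Mins, sub$Loc.At.Exam.Complete),
  function(x) c(mean = mean(x), sd = sd(x))
)
summ
```

Reading the two columns of `summ` side by side makes the comparison of center (mean) and spread (standard deviation) between floors direct.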

0.1.5 Complete R Code

dat<-read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/RadDat_IMSE.csv")
# 1)

# Patients aged 65 or older qualify for Medicare
Age_65<-dat[dat$PatientAge >= 65,]
hist(Age_65$Ordered.to.Complete...Mins, xlim = c(21, 90),
     main = "Histogram on the time required to fulfill X-ray orders for\nMedicare patients",
     xlab = "IQR of Ordered to complete (Mins)")

# 2)

# We would like to compare the performance of Radiology Technician 62 to 65 based on their
# median time to complete an order.  How would you interpret your findings? 
Radio_62<-dat[dat$Radiology.Technician == 62,]
Radio_65<-dat[dat$Radiology.Technician == 65,]
median(Radio_62$Ordered.to.Complete...Mins)
median(Radio_65$Ordered.to.Complete...Mins)

# 3) 

#Generate a side by side boxplot comparing the age of those patients receiving STAT 
#versus Routine orders for an X-Ray.  What do these boxplots tell you?
Stat<-dat[dat$Priority == "STAT",]
Routine<-dat[dat$Priority == "Routine",]
# The y-axis shows patient age, not a count
boxplot(Stat$PatientAge, Routine$PatientAge, main = "STAT vs Routine (Ages)",
        names = c("STAT", "Routine"), xlab = "X-ray order priority", ylab = "Patient age")

# 4)

# Calculate the mean and standard deviation of the time required to complete an X-Ray 
# order on the floor 3W compared to 4W.  What do you conclude about differences between 
# X-Ray completion times on these floors?
Floor3w<-dat[dat$Loc.At.Exam.Complete == "3W",]
Floor_3w<-Floor3w$Ordered.to.Complete...Mins
mean(Floor_3w)
sd(Floor_3w)

Floor4w<-dat[dat$Loc.At.Exam.Complete == "4W",]
Floor_4w<-Floor4w$Ordered.to.Complete...Mins
mean(Floor_4w)
sd(Floor_4w)