{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

Read in the data and filter out completion times below zero

data <- read.csv("https://raw.githubusercontent.com/tmatis12/datafiles/refs/heads/main/RadDat_IMSE.csv")

#filter out negatives
clean_data <- data[data$Ordered.to.Complete...Mins >= 0,]
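
One thing worth checking with the filter above: logical subsetting with `[` keeps an all-`NA` row wherever the comparison itself evaluates to `NA`. The sketch below shows the difference on a small made-up data frame. The values are hypothetical and are not taken from RadDat_IMSE.csv; only the column name is borrowed from the analysis. Wrapping the condition in `which()` is one common way to drop missing values together with the negatives.

```r
# Toy data (hypothetical values) with one negative and one missing time
toy <- data.frame(Ordered.to.Complete...Mins = c(12, -5, NA, 40))

# Plain logical indexing: the NA comparison produces an extra all-NA row
with_na_row <- toy[toy$Ordered.to.Complete...Mins >= 0, , drop = FALSE]

# which() returns only the positions where the condition is TRUE,
# so both the negative value and the missing value are dropped
keep <- which(toy$Ordered.to.Complete...Mins >= 0)
toy_clean <- toy[keep, , drop = FALSE]

cat("Rows with plain indexing:", nrow(with_na_row), "\n")
cat("Rows with which():", nrow(toy_clean), "\n")
```

If the real column has no missing values, both approaches return the same rows, so this only matters as a safeguard.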

Question 1:

#question 1

medicare_patients <- clean_data[clean_data$PatientAge >= 65,]
iqr_values <- quantile(medicare_patients$Ordered.to.Complete...Mins, probs = c(0.25, 0.75))
iqr_filtered <- medicare_patients[
  medicare_patients$Ordered.to.Complete...Mins >= iqr_values[1] &
    medicare_patients$Ordered.to.Complete...Mins <= iqr_values[2],
]

hist(iqr_filtered$Ordered.to.Complete...Mins, breaks = 10, col = "blue", main = "X-ray completion times, patients aged 65+ (within IQR)", xlab = "Minutes to complete")

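The histogram is right-skewed: most of the observations sit at the low end of the range, and the frequency falls off as the time to complete increases. This implies that most of these X-ray orders do not take more than an hour, and that longer times become progressively less common. Because the data were restricted to the interquartile range, this describes only the middle 50% of orders for patients aged 65 and older.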
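
To make the effect of the interquartile-range filter concrete, here is a minimal sketch on a made-up vector of completion times. The numbers are hypothetical and chosen only for illustration. It shows that the filter removes both tails, so the histogram describes only the central half of the orders.

```r
# Hypothetical completion times in minutes (not the real data)
times <- c(5, 20, 22, 25, 27, 30, 33, 38, 45, 60, 85, 400)

# First and third quartiles, as in the analysis above
q <- quantile(times, probs = c(0.25, 0.75))

# Keep only the values between the two quartiles
middle <- times[times >= q[1] & times <= q[2]]

# Share of the observations that survive the filter
share_kept <- length(middle) / length(times)

cat("Range before filtering:", range(times), "\n")
cat("Range after filtering:", range(middle), "\n")
cat("Share kept:", share_kept, "\n")
```

In this toy case the extreme value of 400 minutes disappears after filtering, which is why a histogram of the filtered data cannot show the long tail.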

Question 2:

#question 2

technician_62 <- clean_data[clean_data$Radiology.Technician == 62, ]
technician_65 <- clean_data[clean_data$Radiology.Technician == 65, ]

median_62 <- median(technician_62$Ordered.to.Complete...Mins)
median_65 <- median(technician_65$Ordered.to.Complete...Mins)

cat("Median time for Technician 62:", median_62, "\n")
## Median time for Technician 62: 80
cat("Median time for Technician 65:", median_65, "\n")
## Median time for Technician 65: 27

The median for technician 62 (80 minutes) is nearly three times that of technician 65 (27 minutes). This could mean that technician 62 simply takes longer to complete X-ray orders, or that they handle more complicated X-rays, which would be consistent with technician 62 completing far fewer X-rays. The combination of fewer completed X-rays and a higher median time could also point to inexperience. Other factors, such as the mix of STAT and routine orders, could also play a role.
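
The points about workload and priority mix can be checked directly. The sketch below counts orders per technician and computes the median time within each priority level. It runs on a toy data frame with hypothetical values; the column names follow the analysis, and the labels "STAT" and "Routine" are assumptions about how the Priority column is coded.

```r
# Toy data (hypothetical values); priority labels are assumed
toy <- data.frame(
  Radiology.Technician = c(62, 62, 62, 65, 65, 65, 65, 65),
  Priority = c("Routine", "Routine", "STAT", "STAT",
               "STAT", "Routine", "STAT", "Routine"),
  Ordered.to.Complete...Mins = c(90, 80, 40, 20, 25, 60, 30, 50)
)

# Number of orders handled by each technician
orders_per_tech <- table(toy$Radiology.Technician)

# Median completion time for each technician within each priority level
median_by_group <- aggregate(
  Ordered.to.Complete...Mins ~ Radiology.Technician + Priority,
  data = toy,
  FUN = median
)

print(orders_per_tech)
print(median_by_group)
```

If the gap between the two technicians shrinks once priority is held fixed, the case mix explains part of the difference; if it stays the same, it does not.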

Question 3:

#question 3

boxplot(PatientAge ~ Priority,
        data = clean_data,
        col = c("blue", "green"),
        main = "Boxplot of Patient Age STAT vs Routine",
        xlab = "Priority",
        ylab = "Age")

The box plot of patient age by priority shows that routine X-rays are concentrated more tightly in an older age group, while STAT X-rays are spread over a wider age range that also skews younger. One possible explanation is that younger patients have fewer reasons to get an X-ray unless it is for something requiring immediate attention, such as an injury.
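
A numeric summary can back up what the box plot shows. The following sketch uses `tapply()` to compute the median and interquartile range of age for each priority level. The data frame is made up for illustration, and the labels "STAT" and "Routine" are again assumptions about the coding of the Priority column.

```r
# Toy data (hypothetical ages); priority labels are assumed
toy <- data.frame(
  PatientAge = c(70, 72, 75, 68, 80, 25, 40, 66, 90, 55),
  Priority = c(rep("Routine", 5), rep("STAT", 5))
)

# Median age per priority level (centre of each box)
age_median <- tapply(toy$PatientAge, toy$Priority, median)

# Interquartile range per priority level (height of each box)
age_iqr <- tapply(toy$PatientAge, toy$Priority, IQR)

print(age_median)
print(age_iqr)
```

A larger IQR for one group corresponds to a taller box, and a lower median corresponds to a box centred on younger patients.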

Question 4:

#question 4

floor_3W <- clean_data[clean_data$Loc.At.Exam.Complete == "3W", ]
floor_4W <- clean_data[clean_data$Loc.At.Exam.Complete == "4W", ]

mean_3W <- mean(floor_3W$Ordered.to.Complete...Mins)
sd_3W <- sd(floor_3W$Ordered.to.Complete...Mins)

mean_4W <- mean(floor_4W$Ordered.to.Complete...Mins)
sd_4W <- sd(floor_4W$Ordered.to.Complete...Mins)

cat("Mean time on floor 3W:", mean_3W, "Standard Deviation:", sd_3W, "\n")
## Mean time on floor 3W: 1463.051 Standard Deviation: 3894.639
cat("Mean time on floor 4W:", mean_4W, "Standard Deviation:", sd_4W, "\n")
## Mean time on floor 4W: 1675.451 Standard Deviation: 4387.644

Both floors have a relatively similar mean and standard deviation, with 4W roughly 13 to 15 percent higher in both statistics. The means (about 24 hours for 3W and 28 hours for 4W) are very high compared to the medians, and each standard deviation is more than 2.5 times its mean. Since completion times cannot be negative, this tells us that some X-rays take a very long time and drive up the average, with 4W probably having more of these and reaching higher values.
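
The effect of a few very long orders on the mean can be illustrated with a small example. The sketch below computes the mean, median, and standard deviation per floor on a toy data frame in which each floor has one extreme value. All numbers are hypothetical; only the column names come from the analysis.

```r
# Toy data (hypothetical values): each floor has one very long order
toy <- data.frame(
  Loc.At.Exam.Complete = c("3W", "3W", "3W", "3W", "4W", "4W", "4W", "4W"),
  Ordered.to.Complete...Mins = c(30, 45, 60, 5000, 35, 50, 70, 6000)
)

# Split the times by floor, then summarise each group
by_floor <- split(toy$Ordered.to.Complete...Mins, toy$Loc.At.Exam.Complete)
floor_stats <- sapply(by_floor, function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x))
})

# Rows are statistics, columns are floors
print(round(floor_stats, 1))
```

In this toy case a single outlier per floor pushes the mean far above the median and makes the standard deviation larger than the mean, which is the same pattern seen in the real output.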