Homework 5

Homework Assignment – Descriptive Statistics

Some columns of specific interest include:

PatientAge = Age of Patient

Radiology.Technician = A unique identifier assigned to each Radiology Technician

Priority = Priority of the order, STAT or Routine

Loc.At.Exam.Complete = The floors of the hospital (e.g. 3W, 4W, 5E, etc.)

Ordered.to.Complete...Mins = The time required to complete the order in minutes.

There are some data quality issues, and for some patients the reported time to complete an order is negative. These observations should be filtered out of the dataset before performing your analysis (see the video on filtering rows/selecting columns in the first module).

Please answer the following questions:

Patients who are age 65 or older qualify for Medicare. Generate a histogram of the time required to fulfill X-ray orders for Medicare patients, restricting the allowable range of times to be between the first and third quartile of observations, i.e. the IQR. What do you notice about the shape of this histogram?

We would like to compare the performance of Radiology Technician 62 to 65 based on their median time to complete an order. How would you interpret your findings?

Generate a side by side boxplot comparing the age of those patients receiving STAT versus Routine orders for an X-Ray. What do these boxplots tell you?

Calculate the mean and standard deviation of the time required to complete an X-Ray order on the floor 3W compared to 4W. What do you conclude about differences between X-Ray completion times on these floors?

For all plots/graphics, be sure to add a title and label all axes.

The histogram is skewed to the left. Normality can also be checked with a Q-Q plot of the completion times in the minutes column.

The Q-Q plot is a curved S, indicating that the tails of the distribution are heavier than those of a normal distribution. In this case, the heavy tail is on the left.

qqnorm(medicare_data_IQR$Ordered.to.Complete...Mins)
qqline(medicare_data_IQR$Ordered.to.Complete...Mins, col = "blue")
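The steps leading up to the Q-Q plot (dropping negative times, keeping patients aged 65 or older, and restricting to the IQR) can be sketched as follows. This is a minimal sketch on made-up data: the data frame `xray` and its values are hypothetical stand-ins, and only the column names come from the assignment.

```r
# Toy stand-in for the X-ray data (hypothetical values, for illustration only).
set.seed(1)
xray <- data.frame(
  PatientAge = sample(20:95, 500, replace = TRUE),
  Ordered.to.Complete...Mins = round(rnorm(500, mean = 60, sd = 40))
)

# Drop the negative completion times noted as a data quality issue.
filtered_data <- xray[xray$Ordered.to.Complete...Mins >= 0, ]

# Medicare patients: age 65 or older.
medicare_data <- filtered_data[filtered_data$PatientAge >= 65, ]

# Keep only times between the first and third quartiles (inclusive).
q <- quantile(medicare_data$Ordered.to.Complete...Mins, probs = c(0.25, 0.75))
medicare_data_IQR <- medicare_data[
  medicare_data$Ordered.to.Complete...Mins >= q[1] &
    medicare_data$Ordered.to.Complete...Mins <= q[2], ]

# Histogram with a title and labeled axes.
hist(medicare_data_IQR$Ordered.to.Complete...Mins,
     main = "X-Ray Completion Time, Medicare Patients (within IQR)",
     xlab = "Completion Time (Minutes)",
     ylab = "Number of Orders")
```

Computing the quartiles on the Medicare subset rather than on all patients is a choice made here; the prompt could also be read as using quartiles of the full filtered data.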

#2.

library(dplyr) # provides %>% and filter()

technicians_62_65 <- filtered_data %>% filter(Radiology.Technician >= 62 & Radiology.Technician <= 65)

boxplot(Ordered.to.Complete...Mins ~ Radiology.Technician, data = technicians_62_65,
main = "Median Completion Time by Radiology Technician",
xlab = "Radiology Technician", ylab = "Completion Time (Minutes)",
col = "red", ylim = c(0, 300)) # Adjust this limit based on your data

The boxplots suggest that the radiology technicians near the upper end of the 62-65 ID range perform better, i.e. they complete orders in less time.
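If the question is read as comparing technician 62 with technician 65 specifically, the two medians can also be computed directly rather than read off a boxplot. The following is a minimal sketch on made-up numbers: the data frame `toy` and its values are hypothetical, and only the column names come from the assignment.

```r
# Toy data (hypothetical technician IDs and times, for illustration only).
toy <- data.frame(
  Radiology.Technician = c(62, 62, 62, 63, 65, 65, 65),
  Ordered.to.Complete...Mins = c(30, 45, 60, 50, 20, 25, 90)
)

# Keep only the two technicians being compared.
two_techs <- toy[toy$Radiology.Technician %in% c(62, 65), ]

# Median completion time per technician.
median_by_tech <- aggregate(Ordered.to.Complete...Mins ~ Radiology.Technician,
                            data = two_techs, FUN = median)
median_by_tech
```

The median is less sensitive than the mean to a few very long orders, such as the 90-minute order in the toy data.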

#3. Generate a side by side box plot comparing the age of those patients receiving STAT versus Routine orders for an X-Ray. What do these box plots tell you?

patientsSTATRoutine <- filtered_data[filtered_data$Priority %in% c("STAT", "Routine"), ]

boxplot(PatientAge ~ Priority, col = c("red", "blue"),
data = patientsSTATRoutine, main = "Patient Age by Order Type (STAT vs Routine)", xlab = "Order Type", ylab = "Age (years)", names = c("Rout", "STA")) # I shortened it so the group names show up.

#4. Calculate the mean and standard deviation of the time required to complete an X-Ray order on the floor 3W compared to 4W. What do you conclude about differences between X-Ray completion times on these floors?

#filtering the already filtered df
floors_data <- filtered_data[filtered_data$Loc.At.Exam.Complete %in% c("3W", "4W"), ]

#mean and sd
summary_stats <- aggregate(Ordered.to.Complete...Mins ~ Loc.At.Exam.Complete, data = floors_data,
FUN = function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE)))
summary_stats
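When `FUN` returns a named vector, `aggregate()` stores the mean and sd together in a single matrix column, which can be awkward to index or export. One possible way to flatten the result into ordinary columns is sketched below. The data frame `toy_floors` and its values are made up for illustration, and only the column names come from the assignment.

```r
# Toy data (hypothetical times, for illustration only).
toy_floors <- data.frame(
  Loc.At.Exam.Complete = c("3W", "3W", "3W", "4W", "4W", "4W"),
  Ordered.to.Complete...Mins = c(10, 20, 30, 40, 60, 80)
)

# Same aggregate() pattern as above: mean and sd per floor.
stats <- aggregate(Ordered.to.Complete...Mins ~ Loc.At.Exam.Complete,
                   data = toy_floors,
                   FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Expand the matrix column into separate ".mean" and ".sd" columns.
flat_stats <- do.call(data.frame, stats)
flat_stats
```

After flattening, each statistic is a regular column, so the two floors can be compared with plain column indexing.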