Introduction

According to the Australian Institute of Health and Welfare (AIHW) report for 2017–18, Australia has 693 public hospitals and 657 private hospitals.
We took the Average length of stay data set from the AIHW website.
The aim is to assess whether patients' length of stay differs with the size of the hospital.
The analysis was carried out in RStudio; the data have 14 variables and 30021 observations.
The data were explored using summary statistics, a data table, box plots, histograms, and Q-Q plots.
Based on these findings, several tests were conducted to check the homogeneity of variance and to test the difference in means at the 5% significance level (95% confidence).
Average length of stay was identified as the variable of interest, and all tests were conducted on it alone.

Introduction (Cont.)

The question for this test is whether a patient's average length of stay differs with the size of the hospital.
Answering it can help us understand how patients behave towards hospitals of different sizes and whether they tend to stay longer in one kind than the other.
If the average length of stay does differ by hospital size, this information can be used to provide better services to patients.
Based on the results, hospitals of different sizes can plan differently to help those in need and improve the overall health of citizens.
In the long run, this may also help to increase the average life expectancy of Australian citizens.

Problem Statement

The problem statement is:
- To investigate whether there is any statistically significant difference in the average length of stay (ALOS) between large and medium hospitals.
- To consider whether such a difference might lead patients to choose one over the other.
Answering this question can help the Australian government to lay down better plans to manage medical centers.
It would be interesting to see how patient behaviour shifts with the hospital peer group.
Before any statistics are computed, missing data, 'NP' values, and other data collection problems must be handled.

Data

States and territories provide data on hospitals to the AIHW under the National Health Information Agreement.
These data are compiled in the National Hospitals Data Collection (NHDC).
The data set used here, taken from the NHDC database, is the "average length of stay in hospital" data set.
The data list 612 unique hospitals (Reporting unit) and 6 peer-group categories, of which only two are of interest: “Large hospitals” and “Medium hospitals”.
Average length of stay is calculated as the number of bed days for overnight stays divided by the number of overnight stays, and is reported for selected conditions and procedures.
The data file is named average-length-of-stay-multilevel-data.xlsx and is in Excel format.
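
As a quick illustration of the ALOS definition above, the following sketch divides bed days by overnight stays for a single hospital and condition. The two input figures are hypothetical and are not taken from the AIHW file.

```r
# Hypothetical figures for one hospital and one condition (not from the AIHW data)
bed_days        <- 1260  # total overnight patient bed days
overnight_stays <- 360   # number of overnight stays

# ALOS = bed days for overnight stays / number of overnight stays
alos <- bed_days / overnight_stays
alos
```

Same-day stays do not enter this calculation, since the definition uses overnight stays only.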

Data (Cont.)

We read the data using the read_excel() function from the readxl library.
Every column is read as text, to avoid any unintended type conversions.
The top 12 rows are skipped, as they contain information about the data and explanatory notes.
The first 3 rows of the data are displayed using head().

library(readxl)
# A single col_types value is recycled across all 19 columns, so every column is read as text
AvgLenStDF <- read_excel("average-length-of-stay-multilevel-data.xlsx", sheet = "Average length of stay", col_types = "text", skip = 12)
head(AvgLenStDF, 3)

## # A tibble: 3 x 19
##   `Reporting unit` `Reporting unit… State `Local Hospital… `Peer group`
##   <chr>            <chr>            <chr> <chr>            <chr>       
## 1 Albury Wodonga … Hospital         NSW   Albury Wodonga … Large hospi…
## 2 Albury Wodonga … Hospital         NSW   Albury Wodonga … Large hospi…
## 3 Albury Wodonga … Hospital         NSW   Albury Wodonga … Large hospi…
## # … with 14 more variables: `Time period` <chr>, Category <chr>, `Total
## #   number of stays` <chr>, ...9 <chr>, `Number of overnight stays` <chr>,
## #   ...11 <chr>, `Percentage of overnight stays` <chr>, ...13 <chr>,
## #   `Average length of stay (days)` <chr>, ...15 <chr>, `Peer group
## #   average (days)` <chr>, ...17 <chr>, `Total overnight patient bed
## #   days` <chr>, ...19 <chr>

Data (Cont.)

Column 15, which holds the (‡) symbol used to denote people who have a contract with the hospital, is renamed to contracted.
The (‡) symbol is replaced with one (1) and empty cells with zero (0).
Columns 9, 11, 13, 17, and 19 are then dropped, as they are empty.
White space is removed from the column names to make column selection easier in later steps.
Brackets and the text LHN between them are removed from the column names as well.

names(AvgLenStDF)[15] <- 'contracted' # Renaming column 15 of the data
AvgLenStDF$contracted <- gsub('‡', 1, AvgLenStDF$contracted) # Substituting the special symbol with 1
AvgLenStDF$contracted[is.na(AvgLenStDF$contracted)] <- 0 # Substituting empty cells with 0
AvgLenStDF <- AvgLenStDF[-c(9,11,13,17,19)] # Dropping empty columns

names(AvgLenStDF) <- gsub(" ", "", names(AvgLenStDF)) # Removing white spaces in column names
names(AvgLenStDF) <- gsub("LHN", "", names(AvgLenStDF)) # Removing the letters inside the brackets
names(AvgLenStDF) <- gsub("days", "InDays", names(AvgLenStDF)) # Substituting 'days' with 'InDays'
names(AvgLenStDF) <- gsub("[^A-Za-z]", "", names(AvgLenStDF)) # Removing everything except letters (A-z would also keep characters such as _ and ^)
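
The renaming steps above can be checked on a handful of column names before applying them to the full data frame. The sketch below wraps the four substitutions in a helper; `clean_names` and the name "Example column (LHN)" are hypothetical and used only for illustration, while the other names are column headers visible in the tibble output.

```r
# Column headers seen in the data, plus one hypothetical name containing "(LHN)"
raw_names <- c("Peer group", "Time period", "Average length of stay (days)",
               "Peer group average (days)", "Example column (LHN)")

# Hypothetical helper applying the same four substitutions in the same order
clean_names <- function(x) {
  x <- gsub(" ", "", x, fixed = TRUE)          # remove white space
  x <- gsub("LHN", "", x, fixed = TRUE)        # remove the letters inside the brackets
  x <- gsub("days", "InDays", x, fixed = TRUE) # 'days' becomes 'InDays'
  gsub("[^A-Za-z]", "", x)                     # keep ASCII letters only
}

cleaned <- clean_names(raw_names)
cleaned
```

An explicit `A-Za-z` class is used because, in ASCII order, the range `A-z` also spans punctuation such as `_` and `^`.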

Data (Cont.)

Our variables of interest are AveragelengthofstayInDays (Average length of stay (days)) and Peergroup (Peer group), so we created a new data frame (df) containing only ALOS and the peer group, restricted to Large hospitals and Medium hospitals.
Rows with the (-) symbol were dropped, as no patients were reported for this indicator in that period.
Rows with the (NP) string were dropped, as the reported data did not meet the criteria to calculate the indicator.
Finally, the column AveragelengthofstayInDays was converted to numeric.

df <- AvgLenStDF[AvgLenStDF$Peergroup == "Medium hospitals" | AvgLenStDF$Peergroup == "Large hospitals",
                 c("AveragelengthofstayInDays", "Peergroup")]
## Removing NP and -
df <- df[df$AveragelengthofstayInDays != '-', ]
df <- df[df$AveragelengthofstayInDays != 'NP', ]
## Converting Average length of stay (In Days) to numeric
df$AveragelengthofstayInDays <- df$AveragelengthofstayInDays %>% as.numeric()
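
The filtering and conversion steps can be sketched on a small toy data frame. The rows below are invented for illustration; the check at the end confirms that no value was silently turned into NA by as.numeric().

```r
# Toy data (invented): text-valued ALOS with 'NP' and '-' placeholders
toy <- data.frame(
  AveragelengthofstayInDays = c("3.5", "NP", "-", "4.2", "2.8"),
  Peergroup = c("Large hospitals", "Large hospitals", "Medium hospitals",
                "Medium hospitals", "Small hospitals"),
  stringsAsFactors = FALSE
)

# Keep the two peer groups of interest and drop the placeholder rows
keep_groups <- c("Large hospitals", "Medium hospitals")
toy <- toy[toy$Peergroup %in% keep_groups &
             !toy$AveragelengthofstayInDays %in% c("NP", "-"), ]

# Convert to numeric and verify that the coercion produced no NA values
toy$AveragelengthofstayInDays <- as.numeric(toy$AveragelengthofstayInDays)
stopifnot(!anyNA(toy$AveragelengthofstayInDays))
toy
```

Using `%in%` instead of `!=` also avoids keeping rows where the comparison itself would return NA.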

Descriptive Statistics and Visualisation

In both peer groups the median is smaller than the mean, which suggests that both distributions are right-skewed.
The first quartile is the point below which 25% of the data lie.
The third quartile is the point below which 75% of the data lie.

knitr::kable(df %>% group_by(Peergroup) %>% summarise(Min = min(AveragelengthofstayInDays, na.rm = TRUE),
                                        Max = max(AveragelengthofstayInDays, na.rm = TRUE),
                                        n = n(),
                                        Missing = sum(is.na(AveragelengthofstayInDays)),
                                        Q1 = quantile(AveragelengthofstayInDays ,probs = .25,na.rm = TRUE),
                                        Median = median(AveragelengthofstayInDays, na.rm = TRUE),
                                        Q3 = quantile(AveragelengthofstayInDays, probs = .75,na.rm = TRUE),
                                        Mean = mean(AveragelengthofstayInDays, na.rm = TRUE),
                                        SD = sd(AveragelengthofstayInDays, na.rm = TRUE),
                                        IQR = IQR(AveragelengthofstayInDays ,na.rm = TRUE))
                                        , "html", caption = "Table 1: Descriptive Statistics", align = "llllllllll", col.names = c("Peer Groups", "Minimum", "Maximum", "Sample Size", "Missing Count","First Quartile", "Median", "Third Quartile", "Mean", "Standard Deviation", "IQR"), digits = 2) %>% kable_styling(latex_options = "HOLD_position") %>% column_spec(1, bold = TRUE) %>% column_spec(c(2,4,6,8,10), color = 'white', background = 'black')

Table 1: Descriptive Statistics
Peer Groups	Minimum	Maximum	Sample Size	Missing Count	First Quartile	Median	Third Quartile	Mean	Standard Deviation	IQR
Large hospitals	1.2	12.6	4411	0	2.5	3.5	5.0	3.99	1.98	2.5
Medium hospitals	1.0	13.2	2182	0	2.4	3.4	4.5	3.71	1.85	2.1

Descriptive Statistics and Visualisation (Cont.)

Both the Large and Medium hospitals have outliers, which cannot be removed.
The centres of the two groups lie at around the same point.
Large hospitals have a larger IQR (2.5) than Medium hospitals (2.1).
The first quartile (Q1) is almost the same for both groups (2.5 and 2.4).
The third quartiles of Large and Medium hospitals differ by only 0.5 (5.0 and 4.5).

ggplot(df, aes(x=Peergroup, y=AveragelengthofstayInDays)) + geom_boxplot(outlier.colour="black", outlier.shape=1, outlier.size=1.5 ,fill='#4271AE', color="#1F3552") + theme_economist() + theme(plot.title = element_text(family="Tahoma", hjust = 0.5), text = element_text(family="Tahoma"), axis.title = element_text(size = 12)) + scale_x_discrete(name = "\nPeer Group")+ ggtitle("Boxplot for Medium and Large Hospitals\n") + scale_y_continuous(name = 'Average Length of Stay (In Days)\n')
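
To make the outlier remark concrete, the sketch below computes the usual 1.5 × IQR fences from the quartiles in Table 1. This is the default whisker rule of geom_boxplot(); the code is an illustrative calculation on the rounded table values, not part of the original analysis.

```r
# Quartiles taken from Table 1 (rounded values)
q1 <- c(Large = 2.5, Medium = 2.4)
q3 <- c(Large = 5.0, Medium = 4.5)
iqr <- q3 - q1

# Points beyond these fences are drawn as outliers by the default boxplot rule
upper_fence <- q3 + 1.5 * iqr
lower_fence <- q1 - 1.5 * iqr

rbind(lower_fence, upper_fence)
```

Both lower fences are negative, so no small value can be flagged, while the reported maxima (12.6 and 13.2 days) lie well above the upper fences. All outliers are therefore on the long-stay side.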

Descriptive Statistics and Visualisation (Cont.)

The following histograms show that ALOS is right-skewed for both Large hospitals and Medium hospitals.
The mean is 3.99 days for Large hospitals and 3.71 days for Medium hospitals.
The two peer groups therefore have similar means.
Neither distribution is normal.

LH <- filter(df, Peergroup=="Large hospitals") ; MH <- filter(df, Peergroup=="Medium hospitals")
ggplot(df, aes(AveragelengthofstayInDays)) + geom_histogram(fill = "#4271AE", color = "#1F3552", binwidth = 0.3, position="identity") + facet_wrap(~ Peergroup) + geom_vline(data=LH, aes(xintercept=mean(LH$AveragelengthofstayInDays) ), colour="red", linetype = "dashed", size = 0.8) + geom_vline(data=LH, aes(xintercept=median(LH$AveragelengthofstayInDays) ), colour="orange", linetype = "dashed", size = 0.4)  + geom_vline(data=MH, aes(xintercept=mean(MH$AveragelengthofstayInDays)), colour="green", linetype = "dashed", size = 0.8) + geom_vline(data=MH, aes(xintercept=median(MH$AveragelengthofstayInDays)), colour="purple", linetype = "dashed", size = 0.4) + ggtitle("Frequency histogram of Medium and Large Hospitals\n") + theme_economist() + theme(plot.title = element_text(family="Tahoma", hjust = 0.5), text = element_text(family="Tahoma"), axis.title = element_text(size = 12)) + scale_x_continuous(name = "\nAverage Length of Stay (In Days)") + geom_text(aes(x=4.8, y=400, label= 'μ = 3.99', group=NULL), data=LH[1,], size = 4) + geom_text(aes(x=2.6, y=450, label= 'Median = 3.5', group=NULL), data=LH[1,], size = 3) + geom_text(aes(x=4.6, y=400, label= 'μ = 3.71', group=NULL), data=MH[1,], size = 4) + geom_text(aes(x=2.4, y=360, label= 'Median = 3.4', group=NULL), data=MH[1,], size = 3) + scale_y_continuous(name = 'Frequency\n')

Descriptive Statistics and Visualisation (Cont.)

A normal quantile-quantile (Q-Q) plot compares the distribution of the sample data with a theoretical distribution.
Here it is used to compare our sample data with the theoretical normal distribution.
The plots show that neither peer group follows the normal distribution.

p1 <- ggqqplot(LH$AveragelengthofstayInDays, size = 0.5) + ggtitle('QQ Plot for Large Hospitals ALOS') + theme(plot.title = element_text(hjust = 0.5))
p2 <- ggqqplot(MH$AveragelengthofstayInDays, size = 0.5) +  ggtitle('QQ Plot for Medium Hospitals ALOS') + theme(plot.title = element_text(hjust = 0.5))
grid.arrange(p1, p2, nrow = 1)
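
A Q-Q plot can also be summarised numerically by the correlation between theoretical and sample quantiles. The sketch below does this on simulated data, since the AIHW file is not reproduced here; the log-normal parameters are assumptions chosen only to give a right-skewed sample, and `qq_cor` is a hypothetical helper.

```r
set.seed(1)
skewed <- rlnorm(2000, meanlog = 1.2, sdlog = 0.5)  # simulated right-skewed sample
normal <- rnorm(2000, mean = 4, sd = 2)             # simulated normal sample

# Correlation between theoretical normal quantiles and sample quantiles
qq_cor <- function(x) {
  qq <- qqnorm(x, plot.it = FALSE)
  cor(qq$x, qq$y)
}

r_skewed <- qq_cor(skewed)
r_normal <- qq_cor(normal)
c(skewed = r_skewed, normal = r_normal)
```

A value close to 1 means the points fall near a straight line; the skewed sample gives a lower correlation, which is the curvature visible in the plots.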

Descriptive Statistics and Visualisation (Cont.)

Levene's test is performed to test the homogeneity of variance.
For Levene's test the hypotheses are:
- Null hypothesis (\(H_0\)): The variances are equal.
- Alternative hypothesis (\(H_1\)): The variances are not equal.
\[ H_0 : \sigma_1^2 = \sigma_2^2 \\ H_1 : \sigma_1^2 \neq \sigma_2^2\]
Here \(\sigma_1^2\) and \(\sigma_2^2\) refer to the variance of ALOS in Large and Medium hospitals respectively.
The p-value (\(p\)) from Levene's test is compared with the significance level \(\alpha = 0.05\) (95% confidence).
The resulting \(p\) is 0.00004707, which is less than 0.05, i.e. \(p < \alpha\).
We therefore reject \(H_0\) and conclude that the variances are not equal, i.e. \(H_1 : \sigma_1^2 \neq \sigma_2^2\).

leveneTest(AveragelengthofstayInDays ~ Peergroup, data = df)

## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value     Pr(>F)    
## group    1  16.585 0.00004707 ***
##       6591                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
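
With center = median, Levene's test amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The sketch below reproduces that calculation in base R on simulated data; the group sizes and log-normal parameters are assumptions for illustration only.

```r
set.seed(42)
# Simulated right-skewed ALOS values for two groups (parameters are assumptions)
sim <- data.frame(
  alos  = c(rlnorm(300, 1.27, 0.48), rlnorm(200, 1.20, 0.40)),
  group = factor(rep(c("Large", "Medium"), times = c(300, 200)))
)

# Absolute deviation of each value from its own group median
sim$abs_dev <- abs(sim$alos - ave(sim$alos, sim$group, FUN = median))

# One-way ANOVA on the deviations gives the Levene F statistic and p-value
levene_by_hand <- anova(lm(abs_dev ~ group, data = sim))
levene_by_hand
```

The degrees of freedom are (number of groups - 1) and (n - number of groups), which matches the 1 and 6591 reported for the real data.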

Descriptive Statistics and Visualisation (Cont.)

Levene's test can give misleading results if the data are not normal and contain outliers.
The box plot and Q-Q plots show that our data have outliers that cannot be removed and do not follow the normal distribution.
In this case the Fligner-Killeen test can be used to verify the homogeneity of variance.
The Fligner-Killeen test is non-parametric and is very robust when the data are not normal.
As with Levene's test, the hypotheses for the Fligner-Killeen test are:
- Null hypothesis (\(H_0\)): The variances are equal.
- Alternative hypothesis (\(H_1\)): The variances are not equal.
\[ H_0 : \sigma_1^2 = \sigma_2^2 \\ H_1 : \sigma_1^2 \neq \sigma_2^2\]
The resulting \(p\) is 0.00005255, which is less than 0.05, i.e. \(p < \alpha\).
We therefore reject \(H_0\) and conclude that the variances are not equal, i.e. \(H_1 : \sigma_1^2 \neq \sigma_2^2\).

fligner.test(AveragelengthofstayInDays ~ Peergroup, data = df)

## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  AveragelengthofstayInDays by Peergroup
## Fligner-Killeen:med chi-squared = 16.354, df = 1, p-value =
## 0.00005255

Hypothesis Testing

Both samples are large (\(n > 30\)), so by the central limit theorem (CLT) the sampling distribution of the mean is approximately normal even though the data themselves are not.
Our aim is to test the difference in means of two quantitative samples, which suggests a two-sample t-test.
Because the two peer groups are independent, the samples are unpaired, so the test used is the Welch two-sample t-test.
The two tests for homogeneity of variance, (1) Levene's test and (2) the Fligner-Killeen test, both indicate that the variances of the two groups are unequal, hence we assume unequal variance.
For this t-test the confidence level is 95%, i.e. a significance level of 0.05 (\(\alpha = 0.05\)).
For the t-test the hypotheses are:
- Null hypothesis (\(H_0\)): The difference in means is equal to 0 (zero).
- Alternative hypothesis (\(H_1\)): The difference in means is not equal to 0 (zero).
\[H_0 : \mu_1 - \mu_2 = 0 \\ H_1: \mu_1 - \mu_2 \neq 0\]
Here \(\mu_1\) denotes the population mean ALOS for Large hospitals and \(\mu_2\) the population mean ALOS for Medium hospitals; they are estimated by the sample means \(\overline{\rm X_1}\) and \(\overline{\rm X_2}\).
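
The appeal to the CLT can be illustrated by simulation: means of large samples drawn from a right-skewed distribution are far less skewed than the raw values. The distribution and sample sizes below are assumptions for illustration, not the AIHW data, and `skewness` is a simple hypothetical helper.

```r
set.seed(123)
# Simple moment-based skewness (0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Raw right-skewed values versus means of repeated samples of size 500
raw <- rlnorm(5000, meanlog = 1.2, sdlog = 0.5)
sample_means <- replicate(2000, mean(rlnorm(500, meanlog = 1.2, sdlog = 0.5)))

c(raw = skewness(raw), means = skewness(sample_means))
```

The skewness of the sample means is close to zero, which is why a t-test on the means remains reasonable for samples of this size.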

Hypothesis Testing (Cont.)

Based on our alternative hypothesis we use the two-tailed Welch two-sample t-test.
The test gives \(t = 5.6615\), degrees of freedom \(df = 4611\), and \(p = 0.00000001592\).
The resulting \(p\) is less than 0.05, i.e. \(p < \alpha\).
We therefore reject the null hypothesis (\(H_0\)) that the difference in means is equal to zero.
Based on the test result, we conclude that there is a statistically significant difference between the means.

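# Welch two-sample t-test: two-sided, unequal variances; samples are unpaired by default
t.test(AveragelengthofstayInDays ~ Peergroup,
       alternative = "two.sided",
       conf.level = 0.95,
       var.equal = FALSE,
       data = df)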

## 
##  Welch Two Sample t-test
## 
## data:  AveragelengthofstayInDays by Peergroup
## t = 5.6615, df = 4611, p-value = 0.00000001592
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1835797 0.3780687
## sample estimates:
##  mean in group Large hospitals mean in group Medium hospitals 
##                       3.986874                       3.706049
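
The Welch statistic can be recomputed from summary statistics alone. The sketch below uses the group means from the test output and the sample sizes and standard deviations from Table 1; because the standard deviations are rounded to two decimals, the results should come out close to, but not exactly equal to, the reported values.

```r
# Summary statistics: means from the t-test output, n and SD from Table 1
n1 <- 4411; m1 <- 3.986874; s1 <- 1.98   # Large hospitals
n2 <- 2182; m2 <- 3.706049; s2 <- 1.85   # Medium hospitals

v1 <- s1^2 / n1
v2 <- s2^2 / n2
se <- sqrt(v1 + v2)                      # standard error of the difference

t_stat   <- (m1 - m2) / se
# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
p_value  <- 2 * pt(-abs(t_stat), df_welch)          # two-sided p-value
ci       <- (m1 - m2) + c(-1, 1) * qt(0.975, df_welch) * se

list(t = t_stat, df = df_welch, p = p_value, ci = ci)
```

The confidence interval excludes zero, which is equivalent to rejecting the null hypothesis at the 5% level.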

Hypothesis Testing (Cont.)

The Q-Q plots show that neither sample is normally distributed.
Because of this non-normality we perform two more tests to confirm that our results are not biased by the assumption of normality.
We perform:
1. A t-test on the log-transformed data.
2. The Wilcoxon rank-sum test, a non-parametric test that does not require the data to be normally distributed.

# Welch t-test on log-transformed ALOS; samples are unpaired by default
t.test(log(AveragelengthofstayInDays) ~ Peergroup,
       alternative = "two.sided", conf.level = 0.95, var.equal = FALSE,
       data = df)

## 
##  Welch Two Sample t-test
## 
## data:  log(AveragelengthofstayInDays) by Peergroup
## t = 6.0326, df = 4289, p-value = 0.000000001749
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.05015421 0.09844745
## sample estimates:
##  mean in group Large hospitals mean in group Medium hospitals 
##                       1.271971                       1.197670
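
Because the test above was run on log(ALOS), its estimates are on the log scale. Exponentiating them gives geometric means and a ratio between the groups, which is easier to read in days. The sketch below is an illustrative back-transformation of the printed output only.

```r
# Group means and confidence interval on the log scale, copied from the output above
log_means <- c(Large = 1.271971, Medium = 1.197670)
log_ci    <- c(0.05015421, 0.09844745)

geo_means <- exp(log_means)   # geometric mean ALOS in days
ratio     <- exp(log_means[["Large"]] - log_means[["Medium"]])
ratio_ci  <- exp(log_ci)      # CI for the ratio of geometric means

list(geo_means = geo_means, ratio = ratio, ratio_ci = ratio_ci)
```

A ratio above 1 with a confidence interval that excludes 1 indicates that the geometric mean ALOS is higher in Large hospitals, in line with the test on the untransformed data.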

# Wilcoxon rank-sum test (two-sided); wilcox.test() has no var.equal argument and is unpaired by default
wilcox.test(AveragelengthofstayInDays ~ Peergroup,
            alternative = "two.sided",
            data = df)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  AveragelengthofstayInDays by Peergroup
## W = 5172686, p-value = 0.0000007227
## alternative hypothesis: true location shift is not equal to 0
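
The W statistic can be turned into a more readable quantity by dividing it by the number of possible pairs between the two groups. The sketch below assumes that W refers to the first factor level, Large hospitals, as the ordering in the t-test output suggests; the sample sizes come from Table 1.

```r
w        <- 5172686   # W from the output above
n_large  <- 4411      # sample sizes from Table 1
n_medium <- 2182

# Share of (Large, Medium) pairs in which the Large-hospital value is higher,
# counting ties as one half
prob_superiority <- w / (n_large * n_medium)
prob_superiority
```

A value of 0.5 would mean no tendency either way; a value slightly above 0.5 points to a small shift towards longer stays in Large hospitals, even though the p-value is very small.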

Discussion

Initially we had 5692 observations for the Large hospitals peer group and 3877 observations for the Medium hospitals peer group.
After removing the ‘NP’ and ‘-’ rows, we had 4411 observations for Large hospitals, i.e. 77.49% of the original data, and 2182 observations for Medium hospitals, i.e. 56.28% of the original data.
Because ‘NP’ and ‘-’ mark cases with no usable data, the larger share of such rows in Medium hospitals suggests that patients with certain conditions tend to be treated at Large hospitals rather than Medium hospitals.
The data set was sufficiently large to meet the criteria for the CLT.
Both samples contain outliers that cannot be removed.
Neither sample is normally distributed.

Discussion (Cont.)

According to the homogeneity of variance tests, the two samples do not have equal variance.
The Welch two-sample t-test, the log-transformed t-test, and the Wilcoxon rank-sum test were run under the same settings:
1. Variances were not assumed equal (this applies to the two t-tests; the Wilcoxon test makes no such assumption).
2. The data were unpaired.
3. The tests were two-tailed.
For the log-transformed t-test, \(p = 0.000000001749\) is less than the significance level, i.e. \(p < \alpha\).
Similarly, for the Wilcoxon test, \(p = 0.0000007227\) is less than the significance level, i.e. \(p < \alpha\).
Both tests therefore support our initial finding from the Welch two-sample t-test, so we conclude that there is a statistically significant difference between the means.
It is therefore reasonable to say that there is a statistically significant difference in the average length of stay (ALOS) between large and medium hospitals, which might make patients choose one over the other.

References

[1] “Admitted patients”, Australian Institute of Health and Welfare, 2020. [Online]. Available: https://www.aihw.gov.au/reports-data/myhospitals/sectors/admitted-patients. [Accessed: 10-May-2020].

[2] “Homogeneity of Variance Test in R”, Datanovia. [Online]. Available: https://www.datanovia.com/en/lessons/homogeneity-of-variance-test-in-r/. [Accessed: 10-May-2020].

[3] “t-test: Comparing Group Means”, UC Business Analytics R Programming Guide. [Online]. Available: https://uc-r.github.io/t_test. [Accessed: 10-May-2020].

MATH1324 Assignment 2

Test for Average Length of Stay (ALOS) Between Large and Medium Hospitals
