Onjira Benjarangseepornchai
Last updated: 03 June, 2017
library(readr)
library(dplyr)
Domestic_Airlines <- read_csv("~/Downloads/domestic_time_performance.csv")
Melbourne_Airport <- Domestic_Airlines %>% filter(Departing_Port == "Melbourne")
Departures Delayed
Melbourne_Airport %>% summarise(Min = min(Departures_Delayed,na.rm = TRUE),
Q1 = quantile(Departures_Delayed,probs = .25,na.rm = TRUE),
Median = median(Departures_Delayed, na.rm = TRUE),
Q3 = quantile(Departures_Delayed,probs = .75,na.rm = TRUE),
Max = max(Departures_Delayed,na.rm = TRUE),
Mean = mean(Departures_Delayed, na.rm = TRUE),
SD = sd(Departures_Delayed, na.rm = TRUE),
n = n(),
Missing = sum(is.na(Departures_Delayed))) -> table1
knitr::kable(table1)

| Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 15 | 31 | 337 | 24.81968 | 31.51441 | 6496 | 13 |
Arrivals Delayed
Melbourne_Airport %>% summarise(Min = min(Arrivals_Delayed,na.rm = TRUE),
Q1 = quantile(Arrivals_Delayed,probs = .25,na.rm = TRUE),
Median = median(Arrivals_Delayed, na.rm = TRUE),
Q3 = quantile(Arrivals_Delayed,probs = .75,na.rm = TRUE),
Max = max(Arrivals_Delayed,na.rm = TRUE),
Mean = mean(Arrivals_Delayed, na.rm = TRUE),
SD = sd(Arrivals_Delayed, na.rm = TRUE),
n = n(),
Missing = sum(is.na(Arrivals_Delayed))) -> table2
knitr::kable(table2)

| Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|
| 0 | 7 | 16 | 32 | 372 | 27.28226 | 36.50968 | 6496 | 9 |
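
The two summaries above differ only in the column being summarised, so the repeated code could be wrapped in a small helper. The following is a minimal base R sketch, not the code used to produce the tables; `summarise_delay` and `toy_delays` are hypothetical names introduced for illustration, and the toy values are made up.

```r
# Hypothetical helper: descriptive statistics for one numeric vector.
# n counts all values (including NA), matching n() in the summaries above.
summarise_delay <- function(x) {
  data.frame(
    Min     = min(x, na.rm = TRUE),
    Q1      = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    Median  = median(x, na.rm = TRUE),
    Q3      = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    Max     = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    SD      = sd(x, na.rm = TRUE),
    n       = length(x),
    Missing = sum(is.na(x))
  )
}

# Illustrative data only (not taken from the airline file)
toy_delays <- c(0, 4, 8, 12, 16, NA)
toy_summary <- summarise_delay(toy_delays)
print(toy_summary)
```

With the real data, the same helper could be called as `summarise_delay(Melbourne_Airport$Departures_Delayed)` and `summarise_delay(Melbourne_Airport$Arrivals_Delayed)`.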
plot(log(Departures_Delayed) ~ log(Arrivals_Delayed), data = Melbourne_Airport)
The plot above shows a strong positive linear relationship between the log of departures delayed and the log of arrivals delayed.
\[r = \frac{L_{xy}}{\sqrt{L_{xx}L_{yy}}}\]
cor(Melbourne_Airport$Departures_Delayed, Melbourne_Airport$Arrivals_Delayed, use = "complete.obs")
## [1] 0.9575857
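
The formula for \(r\) can be checked by hand against `cor()`. The sketch below assumes the usual reading of the notation, which the report does not define: \(L_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\), \(L_{xx} = \sum (x_i - \bar{x})^2\) and \(L_{yy} = \sum (y_i - \bar{y})^2\). The vectors `x` and `y` are made-up illustrative values, not the airline data.

```r
# Illustrative data only
x <- c(2, 5, 9, 14, 20)
y <- c(3, 6, 8, 15, 22)

# Sums of squares and cross-products (assumed meaning of L_xy, L_xx, L_yy)
L_xy <- sum((x - mean(x)) * (y - mean(y)))
L_xx <- sum((x - mean(x))^2)
L_yy <- sum((y - mean(y))^2)

# Pearson's r: cross-product over the square root of the PRODUCT of L_xx and L_yy
r_manual <- L_xy / sqrt(L_xx * L_yy)

# Built-in result for comparison
r_builtin <- cor(x, y)

print(c(manual = r_manual, builtin = r_builtin))
```

The two values should agree up to floating-point error, which confirms that the denominator is the square root of a product rather than a difference.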
library(Hmisc)
bivariate<-as.matrix(dplyr::select(Melbourne_Airport, Departures_Delayed,Arrivals_Delayed))
rcorr(bivariate, type = "pearson")
##                    Departures_Delayed Arrivals_Delayed
## Departures_Delayed               1.00             0.96
## Arrivals_Delayed                 0.96             1.00
##
## n
##                    Departures_Delayed Arrivals_Delayed
## Departures_Delayed               6483             6483
## Arrivals_Delayed                 6483             6487
##
## P
##                    Departures_Delayed Arrivals_Delayed
## Departures_Delayed                                   0
## Arrivals_Delayed                    0
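The hypotheses concern the population correlation \(\rho\), which the sample correlation \(r\) estimates:
\[H_{0}: \rho = 0 \] \[H_{A}: \rho \neq 0 \]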
The \(p\)-value for \(r\) can be calculated by converting \(r\) to a \(t\) statistic with \(n - 2\) degrees of freedom, where \(n = 6483\) is the number of complete pairs reported by rcorr(): \[t=r\sqrt{\frac{n-2}{1-r^2}}=0.96\sqrt{\frac{6483-2}{1-0.96^2}}=276.016\]
Therefore, a two-tailed \(p\)-value for \(r\) can be calculated using:
2*pt(q = 276.016, df = 6483 - 2, lower.tail = FALSE)
## [1] 0
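
The conversion from \(r\) to \(t\) can also be scripted so that no intermediate rounding is needed. This is a minimal sketch that assumes the unrounded correlation from `cor()` and the 6483 complete pairs reported by `rcorr()`; the variable names are illustrative.

```r
r <- 0.9575857   # unrounded correlation reported by cor() above
n <- 6483        # complete pairs reported by rcorr() above

# t statistic for H0: population correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt((n - 2) / (1 - r^2))

# Two-tailed p-value from the upper tail of the t distribution
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

print(t_stat)
print(p_value)
```

The tail probability is far smaller than the smallest number a double can represent, so R prints it as 0; it is conventionally reported as \(p\) < 0.001.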
A confidence interval for \(r\) is built on Fisher's \(z\) transformation of \(r\):
\[z=\frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)\]
0.5*(log((1+.96)/(1-.96)))
## [1] 1.94591
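library(psychometric)
r <- cor(Melbourne_Airport$Departures_Delayed, Melbourne_Airport$Arrivals_Delayed, use = "complete.obs")
# n = 6496 is the total row count; only 6483 complete pairs enter r,
# so an interval based on n = 6483 would be very slightly wider
CIr(r = r, n = 6496, level = .95)
## [1] 0.9555183 0.9595589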
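
The interval can be reproduced by hand from Fisher's \(z\) transformation. The sketch below assumes the usual large-sample standard error \(1/\sqrt{n-3}\) on the \(z\) scale and a normal critical value; that `CIr()` follows the same steps is an assumption, although the result should be close to its output. The values of `r` and `n` are those used in the report.

```r
r <- 0.9575857   # correlation reported by cor() above
n <- 6496        # sample size passed to CIr() in the report

# Fisher's z transformation; atanh(r) equals 0.5 * log((1 + r) / (1 - r))
z <- atanh(r)

# Assumed large-sample standard error on the z scale
se_z <- 1 / sqrt(n - 3)

# 95% interval on the z scale, then back-transformed to the r scale with tanh()
crit <- qnorm(0.975)
ci_r <- tanh(z + c(-1, 1) * crit * se_z)

print(ci_r)
```

Working on the \(z\) scale keeps the back-transformed limits inside \((-1, 1)\) and makes the interval asymmetric around \(r\), which is appropriate when \(r\) is close to 1.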
Overall, a Pearson’s correlation was calculated to measure the strength of the linear association between departures delayed and arrivals delayed for flights departing Melbourne. The correlation was strong and positive (\(r\) = 0.96, 95% CI [0.956, 0.960]) and statistically significant (\(p\) < 0.001).