Problem:

A hotel manager wants to improve the initial impression that guests have when they check in. One contributor to that impression is the time it takes to deliver a guest’s luggage to the room after check-in. A random sample of 20 deliveries on a particular day was selected in Wing A of the hotel, and a random sample of 20 deliveries was selected in Wing B. The results are stored in the file named Luggage.csv. Analyze the data and determine whether there is a difference between the mean delivery times in the two wings of the hotel (use alpha = 0.05).

Load the data

library(readr)
luggage_data<- read_csv("D:/Analytics/BACP-Dec2017/05_Hypothesis_Testing/Luggage (1).csv")
## Parsed with column specification:
## cols(
##   WingA = col_double(),
##   WingB = col_double()
## )
attach(luggage_data)

Exploratory Data Analysis

# Find out Total Number of Rows and Columns
dim(luggage_data)
## [1] 20  2
# Find out Names of the Columns (Features)
names(luggage_data)
## [1] "WingA" "WingB"
# Find out Class of each Feature, along with internal structure
str(luggage_data)
## Classes 'tbl_df', 'tbl' and 'data.frame':    20 obs. of  2 variables:
##  $ WingA: num  10.7 9.89 11.83 9.04 9.37 ...
##  $ WingB: num  7.2 6.68 9.29 8.95 6.61 8.53 8.92 7.95 7.57 6.38 ...
##  - attr(*, "spec")=List of 2
##   ..$ cols   :List of 2
##   .. ..$ WingA: list()
##   .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
##   .. ..$ WingB: list()
##   .. .. ..- attr(*, "class")= chr  "collector_double" "collector"
##   ..$ default: list()
##   .. ..- attr(*, "class")= chr  "collector_guess" "collector"
##   ..- attr(*, "class")= chr "col_spec"
# Check top 6 and bottom 6 Rows of the Dataset
head(luggage_data)
## # A tibble: 6 x 2
##   WingA WingB
##   <dbl> <dbl>
## 1 10.70  7.20
## 2  9.89  6.68
## 3 11.83  9.29
## 4  9.04  8.95
## 5  9.37  6.61
## 6 11.68  8.53
tail(luggage_data)
## # A tibble: 6 x 2
##   WingA WingB
##   <dbl> <dbl>
## 1  8.91  9.23
## 2 11.79  9.25
## 3 10.59  8.44
## 4  9.13  6.57
## 5 12.37 10.61
## 6  9.91  6.77
#Check for Missing Values
colSums(is.na(luggage_data))
## WingA WingB 
##     0     0
# Provide Summary of a Dataset.
summary(luggage_data)
##      WingA           WingB       
##  Min.   : 8.36   Min.   : 5.280  
##  1st Qu.: 9.31   1st Qu.: 6.747  
##  Median :10.24   Median : 8.485  
##  Mean   :10.40   Mean   : 8.123  
##  3rd Qu.:11.21   3rd Qu.: 9.235  
##  Max.   :13.67   Max.   :10.610

Data Visualization

install.packages("ggplot2",repos = "http://cran.us.r-project.org")
## Installing package into 'C:/Users/Ranvir Kumar/Documents/R/win-library/3.4'
## (as 'lib' is unspecified)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.4.3
par(mfrow=c(2,2))
hist(WingA, main = 'Wing A', xlab = "Time Taken", ylab = "Frequency", col="turquoise")
hist(WingB, main = 'Wing B', xlab = "Time Taken", ylab = "Frequency", col="turquoise")
# A horizontal boxplot has no frequency axis, so only the time axis is labelled
boxplot(WingA, main = 'Wing A', xlab = "Time Taken", col="turquoise", horizontal = TRUE)
boxplot(WingB, main = 'Wing B', xlab = "Time Taken", col="turquoise", horizontal = TRUE)

# Reset the earlier partition command
dev.off()

Mean, Standard Deviation and Variance

Col_Head <- c("Mean","Standard Deviation","Variance","Remark")
WingA_Stats <- c(round(mean(WingA),digits = 2),
                  round(sd(WingA),digits = 2),
                  round(var(WingA),digits = 2),
                  "Wing A ")
WingB_Stats <- c(round(mean(WingB),digits = 2),
                  round(sd(WingB),digits = 2),
                  round(var(WingB),digits = 2),
                  "Wing B ")
#
Combined_Stats <- rbind(Col_Head,WingA_Stats,WingB_Stats)
Combined_Stats
##             [,1]   [,2]                 [,3]       [,4]     
## Col_Head    "Mean" "Standard Deviation" "Variance" "Remark" 
## WingA_Stats "10.4" "1.37"               "1.88"     "Wing A "
## WingB_Stats "8.12" "1.42"               "2.01"     "Wing B "
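
Because rbind() is applied to vectors that mix numbers and text, every value in Combined_Stats is stored as a character string. One possible alternative that keeps the statistics numeric is sketched below. It uses only the 12 rows per wing printed by head() and tail() above, since the full file is not reproduced here, so its numbers will differ from the 20-row results. The names wing_a, wing_b and wing_stats are illustrative.

```r
# Subset of the data: the rows printed by head() and tail() above.
# This is NOT the full 20-row sample, so results differ from the table above.
wing_a <- c(10.70, 9.89, 11.83, 9.04, 9.37, 11.68,
            8.91, 11.79, 10.59, 9.13, 12.37, 9.91)
wing_b <- c(7.20, 6.68, 9.29, 8.95, 6.61, 8.53,
            9.23, 9.25, 8.44, 6.57, 10.61, 6.77)

# One row per wing, with a numeric column for each statistic
wing_stats <- data.frame(
  Wing     = c("Wing A", "Wing B"),
  Mean     = round(c(mean(wing_a), mean(wing_b)), 2),
  SD       = round(c(sd(wing_a), sd(wing_b)), 2),
  Variance = round(c(var(wing_a), var(wing_b)), 2)
)
print(wing_stats)
```

Keeping the columns numeric means the values can be reused directly in later calculations without converting them back from text.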

Hypothesis Testing

Null Hypothesis (H0): µ1 - µ2 = 0 (i.e. the mean delivery times are the same)
Alternative Hypothesis (Ha): µ1 - µ2 ≠ 0 (i.e. the mean delivery times are not the same)
where,
µ1: Mean delivery time at Wing A
µ2: Mean delivery time at Wing B
Under the null hypothesis, we assume that the two wings have the same mean delivery time. The test checks whether the samples provide enough evidence against that assumption.

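Calculate the p-value using the R function ‘t.test’. With its default settings, t.test runs a two-sided Welch test, which does not assume equal variances in the two wings.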

t.test(WingA,WingB)
## 
##  Welch Two Sample t-test
## 
## data:  WingA and WingB
## t = 5.1615, df = 37.957, p-value = 8.031e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.38269 3.16731
## sample estimates:
## mean of x mean of y 
##   10.3975    8.1225

The p-value reported by t.test already covers both tails of the two-sided alternative, so it must not be halved: p-value = 0.000008031. This is far below the level of significance alpha (0.05). Therefore, the Null Hypothesis (H0) is rejected: the mean delivery times in Wing A and Wing B differ.
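
To make the origin of the reported numbers explicit, the sketch below recomputes the Welch t statistic, the degrees of freedom and the two-sided p-value from the summary statistics shown earlier. It assumes the rounded standard deviations from the summary table (1.37 and 1.42), so the results agree with the t.test output only approximately. The variable names are illustrative.

```r
# Summary statistics taken from the output above
# (standard deviations are the rounded values from the summary table)
n1 <- 20; n2 <- 20
m1 <- 10.3975; m2 <- 8.1225
sd1 <- 1.37;   sd2 <- 1.42

# Standard error of the difference in means (no equal-variance assumption)
se_diff <- sqrt(sd1^2 / n1 + sd2^2 / n2)

# Welch t statistic
t_stat <- (m1 - m2) / se_diff

# Welch-Satterthwaite degrees of freedom
df_welch <- (sd1^2 / n1 + sd2^2 / n2)^2 /
  ((sd1^2 / n1)^2 / (n1 - 1) + (sd2^2 / n2)^2 / (n2 - 1))

# Two-sided p-value: the factor 2 adds the second tail, so no further
# halving or doubling is applied afterwards
p_two_sided <- 2 * pt(-abs(t_stat), df = df_welch)

c(t = t_stat, df = df_welch, p = p_two_sided)
```

The factor of 2 in the last step is what makes the p-value two-sided, which is why the value printed by t.test is compared with alpha directly.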

Confidence Interval

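One-sample t-test to get the 95% confidence interval for each wing. The test of "true mean is not equal to 0" in this output is not of interest here; only the confidence interval is used.
Analysis of WingA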

t.test(WingA)
## 
##  One Sample t-test
## 
## data:  WingA
## t = 33.94, df = 19, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   9.756305 11.038695
## sample estimates:
## mean of x 
##   10.3975

The 95% confidence interval for the population mean delivery time in Wing A is 9.76 to 11.04. In other words, we are 95% confident that the true mean delivery time in Wing A lies within this range.

Analysis of WingB

t.test(WingB)
## 
##  One Sample t-test
## 
## data:  WingB
## t = 25.632, df = 19, p-value = 3.359e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  7.459232 8.785768
## sample estimates:
## mean of x 
##    8.1225

The 95% confidence interval for the population mean delivery time in Wing B is 7.46 to 8.79. In other words, we are 95% confident that the true mean delivery time in Wing B lies within this range.
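
The two intervals can be reproduced approximately from the summary statistics as mean ± t × sd / √n, where t is the 97.5th percentile of the t distribution with n - 1 degrees of freedom. The sketch below assumes the rounded standard deviations from the summary table, so the limits match the t.test output to about two decimal places. The helper name ci_mean is hypothetical.

```r
# Illustrative helper: t-based confidence interval from summary statistics
ci_mean <- function(m, s, n, level = 0.95) {
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # critical t value
  margin <- t_crit * s / sqrt(n)                 # margin of error
  c(lower = m - margin, upper = m + margin)
}

ci_a <- ci_mean(m = 10.3975, s = 1.37, n = 20)  # Wing A
ci_b <- ci_mean(m = 8.1225,  s = 1.42, n = 20)  # Wing B

rbind(WingA = ci_a, WingB = ci_b)
```

Note that the two intervals do not overlap, which is consistent with the difference found by the two-sample test.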

Welch Two Sample t-test

95% confidence interval for the difference between the means of the two populations:

t.test(WingA,WingB)
## 
##  Welch Two Sample t-test
## 
## data:  WingA and WingB
## t = 5.1615, df = 37.957, p-value = 8.031e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.38269 3.16731
## sample estimates:
## mean of x mean of y 
##   10.3975    8.1225

The 95% confidence interval for the difference in population means (Wing A minus Wing B) runs from 1.38 to 3.17. In other words, we are 95% confident that the true difference in mean delivery time between the two wings lies within this range. The observed difference in sample means is 2.275 (10.3975 - 8.1225), which sits at the center of the interval. Because the interval does not contain 0, it agrees with the decision to reject the Null Hypothesis at alpha = 0.05.
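
The interval and the point estimate can also be read from the object returned by t.test instead of from the printed output. The sketch below assumes only the 12 rows per wing printed by head() and tail() earlier, so its numbers differ from the 20-row results. The names wing_a, wing_b, res and obs_diff are illustrative.

```r
# Subset of the data: the rows printed by head() and tail() earlier.
# This is NOT the full 20-row sample.
wing_a <- c(10.70, 9.89, 11.83, 9.04, 9.37, 11.68,
            8.91, 11.79, 10.59, 9.13, 12.37, 9.91)
wing_b <- c(7.20, 6.68, 9.29, 8.95, 6.61, 8.53,
            9.23, 9.25, 8.44, 6.57, 10.61, 6.77)

res <- t.test(wing_a, wing_b)   # Welch two-sample t-test (default)

ci <- res$conf.int              # 95% CI for mean(wing_a) - mean(wing_b)
obs_diff <- unname(res$estimate[1] - res$estimate[2])

# The observed difference (x minus y) always lies inside its own interval
inside <- ci[1] < obs_diff && obs_diff < ci[2]

# If 0 is outside the interval, the two-sided test rejects H0 at alpha = 0.05
excludes_zero <- ci[1] > 0 || ci[2] < 0

c(lower = ci[1], upper = ci[2], difference = obs_diff)
```

The sign convention matters: the interval is for the first argument minus the second, so the difference must be computed in the same order before comparing it with the limits.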

To check whether a larger sample size is needed

Get the difference between the two sample means
Calculate the pooled standard deviation using the following formula:

pooledSD=SQRT[{(n1-1)(sd1^2)+(n2-1)(sd2^2)}/(n1+n2-2)]

where,
m1 = mean of first group
m2 = mean of second group
sd1 = standard deviation of first group
sd2 = standard deviation of second group
n1 = number of samples of first group
n2 = number of samples of second group

Execute the power t-test with the current parameters and decide whether a larger sample is needed
Calculate the required sample size (in case the power of the test is inadequate)

Pooled Standard Deviation

n1=20
n2=20
m1=round(mean(WingA),digits = 2)
m2=round(mean(WingB),digits = 2)
sd1=round(sd(WingA),digits = 2)
sd2=round(sd(WingB),digits = 2)

# delta = mean of Wing A - mean of Wing B
delta=m1-m2
# pooledSD = sqrt(((n1-1)*sd1^2 + (n2-1)*sd2^2) / (n1+n2-2))
pooledSD=sqrt(((n1-1)*sd1^2+(n2-1)*sd2^2)/(n1+n2-2))
delta
## [1] 2.28
pooledSD
## [1] 1.395224
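
The pooled standard deviation formula can be wrapped in a small function so that it is not tied to one set of numbers. This is a minimal sketch; the function name pooled_sd is hypothetical, and the inputs are the rounded values used above.

```r
# Illustrative helper: pooled standard deviation of two groups
pooled_sd <- function(sd1, sd2, n1, n2) {
  sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
}

# Rounded standard deviations and sample sizes from the analysis above
pooled_wings <- pooled_sd(sd1 = 1.37, sd2 = 1.42, n1 = 20, n2 = 20)
pooled_wings
```

With equal group sizes the pooled variance is simply the average of the two variances, so the result always falls between the two standard deviations.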

Power T-Test:

power.t.test (n=20,
              delta = 2.28,
              sd=1.395224,
              sig.level=0.05, 
              type = "two.sample",
              alternative = "two.sided")
## 
##      Two-sample t test power calculation 
## 
##               n = 20
##           delta = 2.28
##              sd = 1.395224
##       sig.level = 0.05
##           power = 0.9989416
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

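The power of the test is 0.9989, or 99.89%. This means that, if the true difference in means is 2.28 with a standard deviation of 1.395, there is a 99.89% chance that the test correctly rejects the null hypothesis at a significance level of 0.05. The current sample of 20 deliveries per wing is therefore more than adequate for detecting a difference of this size at that level.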
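
Power is the probability of rejecting the null hypothesis when it is false, and this can be checked by simulation. The sketch below assumes normally distributed delivery times with the delta and pooled standard deviation used above, draws many pairs of samples of size 20, and counts how often a pooled two-sample t-test rejects at 0.05. The seed, the number of simulations and the variable names are arbitrary choices.

```r
set.seed(123)     # arbitrary seed, for reproducibility only
n_sims <- 2000    # number of simulated experiments (arbitrary)

# Each replicate: simulate both wings under a true difference of 2.28,
# then record whether the equal-variance t-test rejects H0 at alpha = 0.05
rejections <- replicate(n_sims, {
  a <- rnorm(20, mean = 2.28, sd = 1.395224)
  b <- rnorm(20, mean = 0,    sd = 1.395224)
  t.test(a, b, var.equal = TRUE)$p.value < 0.05
})

# Share of simulated experiments in which H0 was rejected
estimated_power <- mean(rejections)
estimated_power
```

The estimate should land close to the value reported by power.t.test, with small differences due to simulation noise. The equal-variance test is used here because power.t.test assumes a common standard deviation.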

Recalculate sample size using Power T-test
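Consider a target power of 95% and a significance level of 0.000008031 (the p-value from the Welch Two Sample t-test above) and run the power t-test again, this time leaving n unspecified so that it is solved for. Note that using an observed p-value as the significance level is an unusually strict choice made here for illustration; the level set in the problem is 0.05.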

# To calculate the required sample size
power.t.test (power=0.95,
              delta = 2.28,
              sd=1.395224,
              sig.level=0.000008031, 
              type = "two.sample",
              alternative = "two.sided")
## 
##      Two-sample t test power calculation 
## 
##               n = 32.92665
##           delta = 2.28
##              sd = 1.395224
##       sig.level = 8.031e-06
##           power = 0.95
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

To achieve 95% power at this significance level, we need a sample size of 33 deliveries in each wing (32.93 rounded up).
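
For comparison, the same calculation can be run at the significance level stated in the problem (alpha = 0.05). The sketch below reuses the delta and pooled standard deviation from above; the variable names are illustrative, and the result is left for the reader to run rather than quoted here.

```r
# Required sample size per wing for 95% power at the problem's alpha = 0.05
req <- power.t.test(power = 0.95,
                    delta = 2.28,
                    sd = 1.395224,
                    sig.level = 0.05,
                    type = "two.sample",
                    alternative = "two.sided")

# power.t.test returns a fractional n; round up to whole deliveries
n_per_wing <- ceiling(req$n)
n_per_wing
```

Since the power at n = 20 already exceeds 95% at this level, the required n per wing comes out below 20.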