T-Test Analysis of Cars Drivability Rating in R

##by Michael Ige

Introduction

Assume that a researcher is interested in comparing the drivability of two similar vehicles from two different manufacturers. Fifteen test drivers were hired, and their drivability ratings for the two cars are provided in the Drivability data file.

Data Source: https://www.theopeneducator.com/doe/hypothesis-Testing-Inferential-Statistics-Analysis-of-Variance-ANOVA/Paired-T-Test-Matched-Pair-Repeated-Measure

Objective

Conduct a t-test in R to determine if there is any statistically significant difference in the drivability of the two cars.

library(readr)
Drivability_data <- read_csv("C:/Users/babao/Desktop/R_wd/Practice Data/Drivability_data.csv")

## Rows: 15 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): Vehicle_1, Vehicle_2
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

View(Drivability_data)

print(Drivability_data)

## # A tibble: 15 × 2
##    Vehicle_1 Vehicle_2
##        <dbl>     <dbl>
##  1         8         8
##  2         8         7
##  3         8         9
##  4         9         9
##  5         7        10
##  6         9         9
##  7         9         8
##  8         8        10
##  9         9         6
## 10         8         7
## 11        10         7
## 12         7         7
## 13         7         6
## 14         8         8
## 15        10         9

Rating scale: 1 = very poor to drive, 10 = excellent to drive

Step 1: State the Hypotheses

Null Hypothesis: There is no difference in drivability between the two cars.

Alternative Hypothesis: There is a difference in drivability between the two cars.

Step 2: Determine if the groups are paired or unpaired

There was only one group of 15 drivers, with ID numbers 1 to 15.
Each driver drove both cars consecutively, at different times.

Conclusion: the two sets of ratings are PAIRED
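Because each driver rated both cars, the comparison can be reduced to one difference per driver. The following is a minimal sketch of that idea; it re-enters the 15 ratings from the printed tibble by hand so that it runs without the CSV file, and the names `ratings` and `rating_diff` are illustrative rather than part of the original analysis.

```r
# Ratings re-typed from the printed tibble (assumption: stands in for the CSV)
ratings <- data.frame(
  Vehicle_1 = c(8, 8, 8, 9, 7, 9, 9, 8, 9, 8, 10, 7, 7, 8, 10),
  Vehicle_2 = c(8, 7, 9, 9, 10, 9, 8, 10, 6, 7, 7, 7, 6, 8, 9)
)

# One difference per driver: positive means Vehicle_1 was rated higher
rating_diff <- ratings$Vehicle_1 - ratings$Vehicle_2

mean(rating_diff)  # average within-driver difference
sd(rating_diff)    # spread of the within-driver differences
```

A paired t-test is a one-sample t-test on these differences, which is why the pairing decision matters before any test is run.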

Step 3: Determine if the test is one-tailed or two-tailed

The t-test is to determine whether there is a difference in drivability.
The difference is assumed to be non-directional (i.e., either car may be rated higher or lower).
The t-test will be a TWO-TAILED TEST

Step 4: Determine the alpha

The significance level (alpha) is set to 0.05, which corresponds to a 95% confidence level.
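To make the role of alpha concrete, the two-tailed critical value can be looked up with `qt()`. This sketch assumes a paired test on 15 drivers, so the degrees of freedom are taken as n - 1 = 14; the variable names are illustrative.

```r
alpha <- 0.05
n <- 15          # number of drivers
df <- n - 1      # degrees of freedom, assuming a paired t-test

# Two-tailed test: alpha is split across both tails
t_crit <- qt(1 - alpha / 2, df = df)
t_crit
```

Under these assumptions, a t statistic whose absolute value exceeds `t_crit` (about 2.14) would lead to rejecting the null hypothesis.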

Step 5: Determine if there is equal or unequal variance

var(Drivability_data$Vehicle_1)

## [1] 0.952381

var(Drivability_data$Vehicle_2)

## [1] 1.714286

From the above, the two sample variances (0.95 and 1.71) are of similar magnitude, so we treat the data as having EQUAL VARIANCE.
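Comparing two variances by eye is a judgment call. One way to check it more formally is an F-test with `var.test()` from base R. The sketch below is not part of the original analysis; it re-enters the printed ratings by hand, and the F-test itself assumes roughly normal data.

```r
# Ratings re-typed from the printed tibble (assumption: stands in for the CSV)
ratings <- data.frame(
  Vehicle_1 = c(8, 8, 8, 9, 7, 9, 9, 8, 9, 8, 10, 7, 7, 8, 10),
  Vehicle_2 = c(8, 7, 9, 9, 10, 9, 8, 10, 6, 7, 7, 7, 6, 8, 9)
)

# F-test of the null hypothesis that the two variances are equal
f_result <- var.test(ratings$Vehicle_1, ratings$Vehicle_2)

f_result$statistic  # ratio of the sample variances (Vehicle_1 / Vehicle_2)
f_result$p.value    # a large p-value gives no evidence of unequal variances
```

Note that a paired test works on the per-driver differences, so the equal-variance question matters mainly for the two-sample form of the t-test.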

Step 6: Compute the t-Test

summary(Drivability_data)

##    Vehicle_1        Vehicle_2 
##  Min.   : 7.000   Min.   : 6  
##  1st Qu.: 8.000   1st Qu.: 7  
##  Median : 8.000   Median : 8  
##  Mean   : 8.333   Mean   : 8  
##  3rd Qu.: 9.000   3rd Qu.: 9  
##  Max.   :10.000   Max.   :10

dim(Drivability_data)

## [1] 15  2

Two Sample t-Test

library(psych)

## Warning: package 'psych' was built under R version 4.2.1

describe(Drivability_data)

##           vars  n mean   sd median trimmed  mad min max range skew kurtosis
## Vehicle_1    1 15 8.33 0.98      8    8.31 1.48   7  10     3 0.22    -1.11
## Vehicle_2    2 15 8.00 1.31      8    8.00 1.48   6  10     4 0.00    -1.37
##             se
## Vehicle_1 0.25
## Vehicle_2 0.34

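  # Simulate 15 ratings per vehicle from the describe() means and SDs;
  # rnorm() draws new values on every run, so results vary between runs.
  Vehicle_1 <- rnorm(15, mean = 8.33, sd = 0.98)
  Vehicle_2 <- rnorm(15, mean = 8.00, sd = 1.31)
  # `var.test` is not a t.test() argument (the switch is `var.equal`), so the default Welch test runs.
  t.test(Vehicle_1, Vehicle_2)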

## 
##  Welch Two Sample t-test
## 
## data:  Vehicle_1 and Vehicle_2
## t = 1.4615, df = 27.11, p-value = 0.1554
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.2264063  1.3480817
## sample estimates:
## mean of x mean of y 
##  8.575295  8.014457
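The test above runs on values simulated with `rnorm()`, so its numbers change from run to run. Since Step 2 identified the design as paired, an alternative is to run a paired t-test directly on the recorded ratings. The following is a sketch of that alternative, not the original author's computation; the ratings are re-entered by hand from the printed tibble, and `ratings` and `paired_result` are illustrative names.

```r
# Ratings re-typed from the printed tibble (assumption: stands in for the CSV)
ratings <- data.frame(
  Vehicle_1 = c(8, 8, 8, 9, 7, 9, 9, 8, 9, 8, 10, 7, 7, 8, 10),
  Vehicle_2 = c(8, 7, 9, 9, 10, 9, 8, 10, 6, 7, 7, 7, 6, 8, 9)
)

# Paired, two-tailed t-test at the 95% confidence level
paired_result <- t.test(ratings$Vehicle_1, ratings$Vehicle_2,
                        paired = TRUE,
                        alternative = "two.sided",
                        conf.level = 0.95)
paired_result
```

With these 15 pairs the mean difference is 0.33 on 14 degrees of freedom and the t statistic comes out near 0.81, so the decision at alpha = 0.05 is again not to reject the null hypothesis.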

Interpretation

The p-value (0.1554) is greater than alpha (0.05), therefore:

We fail to reject the null hypothesis that there is no difference in drivability rating between the two cars.

The drivers' ratings give no evidence that the drivability of the two cars differs (Vehicle 1 mean = 8.33, Vehicle 2 mean = 8.00).

The difference in mean rating is 0.33, about 4% of the mean rating, which is not statistically significant.

Finally, we can conclude that both vehicles drive well, at about the same level.
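The 4% figure can be reproduced from the two reported means. This short sketch assumes the percentage is taken relative to the Vehicle 1 mean; the source does not say which mean is the base, and the variable names are illustrative.

```r
mean_v1 <- 8.33   # Vehicle 1 mean rating, as reported by describe()
mean_v2 <- 8.00   # Vehicle 2 mean rating, as reported by describe()

abs_diff <- mean_v1 - mean_v2          # absolute difference in mean rating
rel_diff <- abs_diff / mean_v1 * 100   # difference as a percentage of the Vehicle 1 mean

round(abs_diff, 2)
round(rel_diff, 1)
```

Using the Vehicle 2 mean as the base instead gives a very similar figure, so the choice does not change the interpretation.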

Visualization

hist(Drivability_data$Vehicle_1)

hist(Drivability_data$Vehicle_2)

boxplot(Drivability_data$Vehicle_1, Drivability_data$Vehicle_2)

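# Each point is one driver: x is the Vehicle 1 rating, y is the Vehicle 2 rating
plot(Drivability_data$Vehicle_1, Drivability_data$Vehicle_2,
     pch = 15,
     xlab = "Vehicle 1 rating",
     ylab = "Vehicle 2 rating")

# Identity line: drivers on this line gave both cars the same rating
abline(0, 1, col = "blue", lwd = 2)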

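# Plot the recorded ratings; boxplot() on two vectors takes no `data` argument
boxplot(Drivability_data$Vehicle_1, Drivability_data$Vehicle_2,
  names = c("Vehicle_1", "Vehicle_2"),
  main = "Drivability Ratings by Drivers",
  xlab = "Vehicle",
  ylab = "Ratings",
  col = "steelblue",
  border = "black"
)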