First Research Question: Is there a significant difference in the average fuel efficiency (miles per gallon) between cars with automatic and manual transmissions?
1.1 DATA: I used the mtcars data set, which is built into R.
Unit of observation: a car
Sample size: 32
Independent samples
# Load the mtcars dataset
data(mtcars)
# Check the structure of the dataset (first 6 rows)
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
The “mtcars” dataset in R contains information about various car models. It contains multiple variables, but to answer the research question I will focus on the following two:
mpg: miles per gallon, which is the fuel efficiency of the car (continuous).
am: transmission type, 0 = automatic, 1 = manual (categorical, binary).
1.2 ANALYSIS
mtcars$am <- factor(mtcars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
This call converts the variable “am” to a factor with descriptive labels, since it is categorical.
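As a quick sanity check on the conversion, the factor levels and group sizes can be inspected. This is a minimal sketch that works on a fresh copy of the data; the names `cars` and `counts` are illustrative and not part of the original analysis.

```r
# Work on a fresh copy so the check does not depend on earlier steps
cars <- datasets::mtcars

# Same conversion as above: 0 = Automatic, 1 = Manual
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Inspect the level order and the number of cars per transmission type
print(levels(cars$am))
counts <- table(cars$am)
print(counts)
```

The group sizes should agree with the `n` column in the descriptive statistics.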
library(psych)
describeBy(mtcars$mpg, mtcars$am)
##
## Descriptive statistics by group
## group: Automatic
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 19 17.15 3.83 17.3 17.12 3.11 10.4 24.4 14 0.01 -0.8 0.88
## ------------------------------------------------------------
## group: Manual
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 13 24.39 6.17 22.8 24.38 6.67 15 33.9 18.9 0.05 -1.46 1.71
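The sample size is quite small (19 automatic and 13 manual cars). There appears to be a difference in means: the mean miles per gallon is lower for automatic cars (17.15) than for manual cars (24.39). I will first check whether a parametric test can be used. If its assumptions are not met, a non-parametric test will be used instead.
Assumptions of the independent samples t-test: 1. the variable is numeric 2. the variable is normally distributed in both populations 3. the variances of the two populations are equal 4. there are no outliers
If the equal variance assumption is NOT met, we use the Welch correction.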
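The outlier and variance assumptions listed above can be checked with base R. The following is a minimal sketch, assuming the 1.5 × IQR boxplot rule as the outlier criterion and an F test for comparing variances; neither choice comes from the original analysis, and `cars`, `outliers` and `variance_check` are illustrative names.

```r
# Fresh copy of the data with a labelled transmission factor
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Outliers per group according to the 1.5 * IQR boxplot rule
# (assumed criterion; an empty vector means no outliers were flagged)
outliers <- tapply(cars$mpg, cars$am, function(x) boxplot.stats(x)$out)
print(outliers)

# F test comparing the variances of mpg in the two groups
# (H0: the ratio of the two variances equals 1)
variance_check <- var.test(mpg ~ am, data = cars)
print(variance_check)
```

If the F test suggests unequal variances, the Welch correction mentioned above is the usual remedy.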
library(rstatix)
##
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
##
## filter
mtcars %>%
  group_by(am) %>%
  shapiro_test(mpg)
## # A tibble: 2 × 4
## am variable statistic p
## <fct> <chr> <dbl> <dbl>
## 1 Automatic mpg 0.977 0.899
## 2 Manual mpg 0.946 0.536
Automatic:
H0: miles per gallon for automatic cars are normally distributed
H1: miles per gallon for automatic cars are NOT normally distributed
We cannot reject H0 (p-value=0.899), so there is no evidence against normality. Assumption met.
Manual:
H0: miles per gallon for manual cars are normally distributed
H1: miles per gallon for manual cars are NOT normally distributed
We cannot reject H0 (p-value=0.536), so there is no evidence against normality. Assumption met.
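Since normality is not rejected in either group, a parametric comparison with the Welch correction is also available. The following is a minimal sketch of that test and is not part of the original analysis, which proceeds with the Wilcoxon rank sum test; `cars` and `welch` are illustrative names.

```r
# Fresh copy of the data with a labelled transmission factor
cars <- datasets::mtcars
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Independent samples t-test with the Welch correction
# (var.equal = FALSE is the default; stated explicitly for clarity)
welch <- t.test(mpg ~ am, data = cars, var.equal = FALSE,
                alternative = "two.sided")
print(welch)
```

Reporting both tests side by side shows whether the conclusion depends on the choice between the parametric and the non-parametric approach.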
wilcox.test(mtcars$mpg ~ mtcars$am,
            paired = FALSE,
            correct = FALSE,
            exact = FALSE,
            alternative = "two.sided")
##
## Wilcoxon rank sum test
##
## data: mtcars$mpg by mtcars$am
## W = 42, p-value = 0.001753
## alternative hypothesis: true location shift is not equal to 0
H0: the location of the distribution of miles per gallon is equal for automatic and manual cars.
H1: the location of the distribution of miles per gallon is NOT equal for automatic and manual cars.
We can reject the null hypothesis (p-value=0.002). Together with the descriptive statistics, this means that automatic cars have lower fuel efficiency (fewer miles per gallon), i.e. higher fuel consumption, compared to manual cars.
library(effectsize)
##
## Attaching package: 'effectsize'
## The following objects are masked from 'package:rstatix':
##
## cohens_d, eta_squared
## The following object is masked from 'package:psych':
##
## phi
effectsize(wilcox.test(mtcars$mpg ~ mtcars$am,
                       paired = FALSE,
                       correct = FALSE,
                       exact = FALSE,
                       alternative = "two.sided"))
## r (rank biserial) | 95% CI
## ----------------------------------
## -0.66 | [-0.84, -0.36]
interpret_rank_biserial(0.66)
## [1] "very large"
## (Rules: funder2019)
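The reported effect size can be reproduced by hand from the test statistic. The sketch below assumes the common formula r = 2W / (n1 · n2) − 1, where W is the statistic returned by `wilcox.test` for the first group; the variable names are illustrative. The negative sign only reflects that the first group (automatic) tends to have lower values, which is why the absolute value 0.66 is passed to `interpret_rank_biserial`.

```r
# Fresh copy of the data, split into the two transmission groups
cars <- datasets::mtcars
auto <- cars$mpg[cars$am == 0]
manual <- cars$mpg[cars$am == 1]

# W counts the (automatic, manual) pairs in which the automatic car
# has the higher mpg, with ties counted as one half
w <- unname(wilcox.test(auto, manual, exact = FALSE, correct = FALSE)$statistic)

# Rank-biserial correlation from W and the two group sizes
r_rb <- 2 * w / (length(auto) * length(manual)) - 1
print(round(r_rb, 2))  # -0.66, matching the effectsize output above
```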
1.3 CONCLUSION
Based on the sample data, I find that manual and automatic cars differ in fuel efficiency (miles per gallon) (p-value=0.002). Automatic cars achieve fewer miles per gallon, i.e. they consume more fuel, compared to manual cars. The difference in distribution is very large (r=-0.66; absolute value 0.66).