First research question: Is there a significant difference in average fuel efficiency (miles per gallon) between cars with automatic and manual transmissions?

1.1 DATA: I used the mtcars data set, which is built into R.

Unit of observation: a car

Sample size: 32

Independent samples

# Load the mtcars dataset
data(mtcars)

# Preview the dataset (first 6 rows)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

The “mtcars” dataset in R contains information about various car models. It includes multiple variables; however, I will focus on the following two in order to answer the research question:

mpg: miles per gallon, which is the fuel efficiency of the car (continuous).

am: transmission type, 0 = automatic, 1 = manual (categorical, binary).
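As a quick sanity check on the sample described above, the sketch below counts the cars overall and per transmission type using base R only. It assumes am is still coded 0/1, as in the raw data set; the names n_total and n_by_am are illustrative.

```r
# Load the built-in data set
data(mtcars)

# Total number of observations (cars)
n_total <- nrow(mtcars)

# Number of cars per transmission type (0 = automatic, 1 = manual)
n_by_am <- table(mtcars$am)

n_total
n_by_am
```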

1.2 ANALYSIS

mtcars$am <- factor(mtcars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

This call converts the variable “am” to a factor with descriptive labels, since it is categorical.

library(psych)
describeBy(mtcars$mpg, mtcars$am)
## 
##  Descriptive statistics by group 
## group: Automatic
##    vars  n  mean   sd median trimmed  mad  min  max range skew kurtosis   se
## X1    1 19 17.15 3.83   17.3   17.12 3.11 10.4 24.4    14 0.01     -0.8 0.88
## ------------------------------------------------------------ 
## group: Manual
##    vars  n  mean   sd median trimmed  mad min  max range skew kurtosis   se
## X1    1 13 24.39 6.17   22.8   24.38 6.67  15 33.9  18.9 0.05    -1.46 1.71

The sample size is quite small (19 automatic and 13 manual cars). There seems to be a difference in means: the mean miles per gallon is lower for automatic cars (17.15) than for manual cars (24.39). I will first check whether a parametric test can be used. If its assumptions are not met, I will use a non-parametric test.

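Assumptions of the independent samples t-test: 1. the variable is numeric 2. the variable is normally distributed in both populations 3. there are no outliers

If the variances in the two populations are NOT equal, we use the Welch correction.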
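The outlier and variance conditions can be checked with base R. The sketch below is one possible approach and not part of the original analysis: it flags values beyond 1.5 times the interquartile range within each group (the boxplot rule, which is an assumed definition of "outlier" here) and compares the two group variances with an F test. Note that the F test is itself sensitive to departures from normality.

```r
# Load the data and label the transmission types
data(mtcars)
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Outliers per group using the 1.5 * IQR boxplot rule (an assumed criterion)
outliers_by_group <- tapply(mtcars$mpg, mtcars$am,
                            function(x) boxplot.stats(x)$out)
outliers_by_group

# F test for equality of the two group variances
variance_test <- var.test(mpg ~ am, data = mtcars)
variance_test
```

An empty numeric vector for a group means that no value in that group falls outside the boxplot fences.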

library(rstatix)
## 
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
## 
##     filter
mtcars %>%
  group_by(am) %>%
  shapiro_test(mpg)
## # A tibble: 2 × 4
##   am        variable statistic     p
##   <fct>     <chr>        <dbl> <dbl>
## 1 Automatic mpg          0.977 0.899
## 2 Manual    mpg          0.946 0.536

Automatic:

H0: miles per gallon for automatic cars are normally distributed.

H1: miles per gallon for automatic cars are NOT normally distributed.

We cannot reject H0 (p-value=0.899), so there is no evidence against normality. Assumption met.

Manual:

H0: miles per gallon for manual cars are normally distributed.

H1: miles per gallon for manual cars are NOT normally distributed.

We cannot reject H0 (p-value=0.536), so there is no evidence against normality. Assumption met.
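Since normality is not rejected in either group, the parametric comparison mentioned earlier could be run as in the sketch below. It is not part of the reported analysis. It relies on t.test() with var.equal = FALSE (the default), which applies the Welch correction and therefore does not assume equal variances. The name welch_result is illustrative.

```r
# Load the data and label the transmission types
data(mtcars)
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Welch two-sample t-test of mpg by transmission type (two-sided)
welch_result <- t.test(mpg ~ am, data = mtcars,
                       var.equal = FALSE,
                       alternative = "two.sided")
welch_result

# Group means estimated by the test (Automatic first, then Manual)
welch_result$estimate
```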

wilcox.test(mtcars$mpg ~ mtcars$am,
            paired = FALSE,
            correct = FALSE,
            exact = FALSE,
            alternative = "two.sided")
## 
##  Wilcoxon rank sum test
## 
## data:  mtcars$mpg by mtcars$am
## W = 42, p-value = 0.001753
## alternative hypothesis: true location shift is not equal to 0

H0: the distribution locations of miles per gallon are equal for automatic and manual cars.

H1: the distribution locations of miles per gallon are NOT equal for automatic and manual cars.

We can reject the null hypothesis (p-value=0.002): automatic cars have lower fuel efficiency (fewer miles per gallon), that is, higher fuel consumption, compared to manual cars.
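A two-sided test does not by itself show the direction of the difference. The sketch below makes it explicit by comparing the group medians and converting them to litres per 100 km, where a larger number means more fuel used. The conversion factor 235.215 assumes US gallons, which is the unit documented for mtcars; the variable names are illustrative.

```r
# Load the data and label the transmission types
data(mtcars)
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Median miles per gallon in each group
median_mpg <- tapply(mtcars$mpg, mtcars$am, median)

# Convert to litres per 100 km (235.215 is the factor for US gallons)
litres_per_100km <- 235.215 / median_mpg

median_mpg
litres_per_100km
```

The medians agree with the describeBy output (17.3 for automatic, 22.8 for manual), so the group with fewer miles per gallon is the one that uses more fuel per distance.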

library(effectsize)
## 
## Attaching package: 'effectsize'
## The following objects are masked from 'package:rstatix':
## 
##     cohens_d, eta_squared
## The following object is masked from 'package:psych':
## 
##     phi
effectsize(wilcox.test(mtcars$mpg ~ mtcars$am,
                       paired = FALSE,
                       correct = FALSE,
                       exact = FALSE,
                       alternative = "two.sided"))
## r (rank biserial) |         95% CI
## ----------------------------------
## -0.66             | [-0.84, -0.36]
interpret_rank_biserial(0.66)
## [1] "very large"
## (Rules: funder2019)
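The reported effect size can be cross-checked by hand. A common formula for the rank-biserial correlation is r = 2W / (n1 * n2) - 1, where W is the Wilcoxon statistic for the first group and n1, n2 are the group sizes. With W = 42, n1 = 19 and n2 = 13 this gives 84 / 247 - 1, which is approximately -0.66. The sketch below assumes this is the formula behind the effectsize output; the variable names are illustrative.

```r
# Load the data and label the transmission types
data(mtcars)
mtcars$am <- factor(mtcars$am, levels = c(0, 1),
                    labels = c("Automatic", "Manual"))

# Same Wilcoxon rank sum test as in the analysis
wt <- wilcox.test(mpg ~ am, data = mtcars,
                  paired = FALSE, correct = FALSE, exact = FALSE)
W <- unname(wt$statistic)

# Group sizes (Automatic is the first factor level)
n_by_group <- table(mtcars$am)
n1 <- n_by_group[["Automatic"]]
n2 <- n_by_group[["Manual"]]

# Rank-biserial correlation computed from W
r_rank_biserial <- 2 * W / (n1 * n2) - 1
round(r_rank_biserial, 2)
```

The negative sign reflects the group order: automatic cars, listed first, tend to have lower miles per gallon. The interpretation rule depends only on the absolute value, which is why 0.66 is passed to interpret_rank_biserial().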

1.3 CONCLUSION

Based on the sample data, I find that manual and automatic cars differ in fuel efficiency (miles per gallon) (p-value=0.002). Automatic cars travel fewer miles per gallon, that is, they consume more fuel than manual cars. The difference in distributions is very large (r=-0.66).