mtcars -- T-tests

Sameer Mathur

mtcars dataset

# reading the data
mtcars$am <- as.factor(mtcars$am)   # convert to factor
mtcars$cyl <- as.factor(mtcars$cyl) # convert to factor
attach(mtcars)
head(mtcars)  # first few rows of the data frame
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

For a description of each column, please see the Data Description.

Average mileage and standard deviation of automatic vs manual cars

# average mileage (mpg) and sd of automatic (am = 0) vs manual (am = 1) cars
tapply(mpg, am, function(x)(c(mean=mean(x),sd=sd(x))))
$`0`
     mean        sd 
17.147368  3.833966 

$`1`
     mean        sd 
24.392308  6.166504 

Normality Check -- QQ Plot

qqnorm(mpg)
qqline(mpg)

The deviations from the straight line are minimal, so we can treat the data as approximately normally distributed.

Normality Check -- Histogram and Density Curve

hist(mpg,freq=FALSE)
lines(density(mpg), lwd=2)

The histogram shows only a mild departure from normality. The distribution is roughly bell-shaped but slightly positively skewed (i.e., most data points are in the lower half, with a longer tail toward high mpg values). Histograms of normal distributions are symmetric and show the highest frequency in the center of the distribution.
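
As a rough numerical check on the direction of the skew, the moment-based sample skewness can be computed in base R. This is a minimal sketch, not part of the original analysis; the names x, m2, m3 and skew are illustrative. A positive value means a longer right tail, a negative value a longer left tail.

```r
# Moment-based sample skewness of mpg (illustrative sketch)
x  <- mtcars$mpg
m2 <- mean((x - mean(x))^2)   # second central moment
m3 <- mean((x - mean(x))^3)   # third central moment
skew <- m3 / m2^1.5           # > 0: right (positive) skew; < 0: left (negative) skew
skew
```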

Normality Check -- Shapiro-Wilk's Test

shapiro.test(mpg)

    Shapiro-Wilk normality test

data:  mpg
W = 0.94756, p-value = 0.1229

The Shapiro-Wilk test does not reject the null hypothesis of normality (p = 0.1229 > 0.05), so the data can be treated as approximately normally distributed despite the mild skewness indicated by the plots.
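
The two-sample t-test assumes normality within each group rather than in the pooled sample, so it may be worth repeating the check per transmission type. The following is a sketch under that assumption; the names groups and sw_p are illustrative and do not appear in the original analysis.

```r
# Shapiro-Wilk p-value for each transmission group (illustrative sketch)
groups <- split(mtcars$mpg, mtcars$am)   # "0" = automatic, "1" = manual
sw_p <- sapply(groups, function(x) shapiro.test(x)$p.value)
sw_p   # values above 0.05 mean normality is not rejected for that group
```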

Boxplot of mpg by transmission

boxplot(mpg ~ am, data=mtcars, horizontal=TRUE, 
        ylab="am", xlab="Mileage (mpg)",
        main="Comparison of mileage of manual vs automatic cars")

Plot the average mileage of manual and automatic cars

library(gplots)
# plot the average mileage of automatic and manual cars
plotmeans(mpg ~ am, data = mtcars, frame = TRUE)

Compare Variances using F-test (mpg of automatic vs manual cars)

# mpg of automatic vs manual cars
transftest <- var.test(mpg ~ am, data=mtcars, alternative = "two.sided")
transftest

    F test to compare two variances

data:  mpg by am
F = 0.38656, num df = 18, denom df = 12, p-value = 0.06691
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1243721 1.0703429
sample estimates:
ratio of variances 
         0.3865615 

The p-value of the F-test is p = 0.06691, which is greater than the significance level of 0.05. We therefore conclude that there is no significant difference between the variances of the two groups.
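
To make the reported statistic less of a black box, the F ratio and its two-sided p-value can be reproduced by hand from the group variances. This is a sketch for illustration; the variable names (groups, f_stat, df1, df2, p_two_sided) are not from the original analysis.

```r
# Reproduce the F-test by hand (illustrative sketch)
groups <- split(mtcars$mpg, mtcars$am)   # "0" = automatic, "1" = manual
f_stat <- var(groups[["0"]]) / var(groups[["1"]])   # ratio of sample variances
df1 <- length(groups[["0"]]) - 1   # numerator degrees of freedom
df2 <- length(groups[["1"]]) - 1   # denominator degrees of freedom
p_lower <- pf(f_stat, df1, df2)    # lower-tail probability
p_two_sided <- 2 * min(p_lower, 1 - p_lower)   # double the smaller tail
c(F = f_stat, df1 = df1, df2 = df2, p = p_two_sided)
```

The ratio is below 1 because the automatic group (the first factor level) has the smaller variance.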

Independent two-group t-test (mpg vs transmission)

We test whether there is a significant difference between the mean mpg of cars with automatic and manual transmissions.

# independent 2-group t-test
t.test(mpg ~ am, data=mtcars)

    Welch Two Sample t-test

data:  mpg by am
t = -3.7671, df = 18.332, p-value = 0.001374
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.280194  -3.209684
sample estimates:
mean in group 0 mean in group 1 
       17.14737        24.39231 

The p-value (0.001374) is less than 0.05, so we reject the null hypothesis of equal means: the mean mpg of manual cars (24.39) is significantly higher than that of automatic cars (17.15).
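
By default t.test() runs the Welch version, which does not assume equal variances. The statistic, the Welch-Satterthwaite degrees of freedom, and the p-value in the output above can be reproduced from the group summaries. This is a sketch for illustration; the names g, n, m, v, se2, t_stat, df_welch and p_value are not from the original analysis.

```r
# Reproduce the Welch two-sample t-test by hand (illustrative sketch)
g <- split(mtcars$mpg, mtcars$am)   # "0" = automatic, "1" = manual
n <- sapply(g, length)              # group sizes
m <- sapply(g, mean)                # group means
v <- sapply(g, var)                 # group variances
se2 <- v / n                        # squared standard error of each mean
t_stat   <- (m[["0"]] - m[["1"]]) / sqrt(sum(se2))
df_welch <- sum(se2)^2 / sum(se2^2 / (n - 1))   # Welch-Satterthwaite approximation
p_value  <- 2 * pt(-abs(t_stat), df_welch)      # two-sided p-value
c(t = t_stat, df = df_welch, p = p_value)
```

Because the F-test did not reject equal variances, a pooled test via t.test(mpg ~ am, data = mtcars, var.equal = TRUE) would also be defensible. The Welch test is the more conservative choice when the group variances look as different as they do here.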