Sample Analysis

Illya Mowerman

February 20, 2018

Research Question

What makes cars more efficient? Efficiency is measured in miles per gallon (MPG).

Data

Univariate Statistics

summary(mtcars)
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000

Univariate Statistics Continued: Histograms

library(ggplot2)
# Histogram of MPG using bins 5 MPG wide
ggplot(mtcars) +
  geom_histogram(aes(mpg) , binwidth = 5)

Data Preparation

We created a new categorical variable for MPG called efficient, where 1 is efficient and 0 is not. The cutoff for efficient is 22.8 MPG, which is the third quartile of mpg; cars at or above the cutoff are coded 1.

library(dplyr)
# Flag cars at or above the third-quartile MPG (22.8) as efficient
mtcars2 <- mtcars %>% 
  mutate(efficient = ifelse(mpg < 22.8 , 0 , 1))

head(mtcars2)
##    mpg cyl disp  hp drat    wt  qsec vs am gear carb efficient
## 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4         0
## 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4         0
## 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1         1
## 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1         0
## 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2         0
## 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1         0
mean(mtcars2$efficient)
## [1] 0.28125
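
The cutoff above is hard-coded. As a minimal sketch (not part of the original analysis), the same threshold could be derived from the data with base R's `quantile()`, so that the flag always follows the third quartile. The names `cutoff` and `mtcars2_alt` are illustrative only.

```r
# Derive the cutoff from the data instead of typing 22.8 by hand.
# unname() drops the "75%" label that quantile() attaches.
cutoff <- unname(quantile(mtcars$mpg, probs = 0.75))

# Base-R equivalent of the mutate() call: 1 = efficient, 0 = not.
mtcars2_alt <- transform(mtcars, efficient = ifelse(mpg < cutoff, 0, 1))

# Share of cars flagged as efficient
mean(mtcars2_alt$efficient)
```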

Bivariate Statistics: MPG vs. Cylinders

First, a correlation test:

cor.test(mtcars$mpg , mtcars$cyl)
## 
##  Pearson's product-moment correlation
## 
## data:  mtcars$mpg and mtcars$cyl
## t = -8.9197, df = 30, p-value = 6.113e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.9257694 -0.7163171
## sample estimates:
##       cor 
## -0.852162

At alpha = 0.05, the correlation is significant (p < 0.001). The correlation is strong and negative (r = -0.85):
as the number of cylinders increases, MPG decreases.
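
To make the link between the correlation and its test statistic explicit, the sketch below recomputes the t value reported by `cor.test()` from the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2). The variable names are illustrative only.

```r
# Pearson correlation between MPG and number of cylinders
r <- cor(mtcars$mpg, mtcars$cyl)
n <- nrow(mtcars)

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

c(r = r, t = t_stat, p = p_value)
```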

Bivariate Statistics: Efficiency vs. Cylinders

Below is a one-sided Welch t-test comparing the mean number of cylinders of inefficient (0) and efficient (1) cars.

t.test(mtcars2$cyl ~ mtcars2$efficient , alternative = 'greater')
## 
##  Welch Two Sample t-test
## 
## data:  mtcars2$cyl by mtcars2$efficient
## t = 10.969, df = 22, p-value = 1.094e-10
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  2.567024      Inf
## sample estimates:
## mean in group 0 mean in group 1 
##        7.043478        4.000000

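The mean number of cylinders is significantly lower for efficient cars (4.00) than for inefficient cars (7.04) at the alpha = 0.05 level.

Every efficient car in this sample has 4 cylinders, so all vehicles with more than 4 cylinders are classified as inefficient in MPG.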
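
The group means behind the t-test can be checked directly. The sketch below rebuilds the efficiency flag in base R (so it runs without dplyr) and computes the mean number of cylinders per group; `cars_flagged` and `cyl_means` are illustrative names.

```r
# Rebuild the flag used in the report: 1 = efficient, 0 = not
cars_flagged <- transform(mtcars, efficient = ifelse(mpg < 22.8, 0, 1))

# Mean number of cylinders within each efficiency group
cyl_means <- tapply(cars_flagged$cyl, cars_flagged$efficient, mean)
cyl_means

# Difference in means (group 0 minus group 1), the quantity tested above
cyl_means[["0"]] - cyl_means[["1"]]
```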

Examination of Efficiency and Number of Cylinders

We created a table of counts of cylinders by efficiency.

library(gmodels)
CrossTable(mtcars2$cyl , mtcars2$efficient , chisq = TRUE)
## Warning in chisq.test(t, correct = FALSE, ...): Chi-squared approximation
## may be incorrect
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  32 
## 
##  
##              | mtcars2$efficient 
##  mtcars2$cyl |         0 |         1 | Row Total | 
## -------------|-----------|-----------|-----------|
##            4 |         2 |         9 |        11 | 
##              |     4.412 |    11.276 |           | 
##              |     0.182 |     0.818 |     0.344 | 
##              |     0.087 |     1.000 |           | 
##              |     0.062 |     0.281 |           | 
## -------------|-----------|-----------|-----------|
##            6 |         7 |         0 |         7 | 
##              |     0.770 |     1.969 |           | 
##              |     1.000 |     0.000 |     0.219 | 
##              |     0.304 |     0.000 |           | 
##              |     0.219 |     0.000 |           | 
## -------------|-----------|-----------|-----------|
##            8 |        14 |         0 |        14 | 
##              |     1.541 |     3.938 |           | 
##              |     1.000 |     0.000 |     0.438 | 
##              |     0.609 |     0.000 |           | 
##              |     0.438 |     0.000 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        23 |         9 |        32 | 
##              |     0.719 |     0.281 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  23.90514     d.f. =  2     p =  6.442659e-06 
## 
## 
## 
CrossTable(mtcars2$am , mtcars2$efficient , chisq = TRUE)
## Warning in chisq.test(t, correct = TRUE, ...): Chi-squared approximation
## may be incorrect

## Warning in chisq.test(t, correct = TRUE, ...): Chi-squared approximation
## may be incorrect
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  32 
## 
##  
##              | mtcars2$efficient 
##   mtcars2$am |         0 |         1 | Row Total | 
## -------------|-----------|-----------|-----------|
##            0 |        17 |         2 |        19 | 
##              |     0.819 |     2.092 |           | 
##              |     0.895 |     0.105 |     0.594 | 
##              |     0.739 |     0.222 |           | 
##              |     0.531 |     0.062 |           | 
## -------------|-----------|-----------|-----------|
##            1 |         6 |         7 |        13 | 
##              |     1.197 |     3.058 |           | 
##              |     0.462 |     0.538 |     0.406 | 
##              |     0.261 |     0.778 |           | 
##              |     0.188 |     0.219 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        23 |         9 |        32 | 
##              |     0.719 |     0.281 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  7.165562     d.f. =  1     p =  0.007431642 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  5.182812     d.f. =  1     p =  0.02281138 
## 
## 
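
The warnings above indicate that some expected cell counts are small, so the chi-squared approximation may be unreliable. One common alternative for a small 2x2 table is Fisher's exact test. The sketch below is not part of the original analysis; it uses base R's `fisher.test()`, and the names `cars_flagged`, `am_tab`, and `fisher_res` are illustrative.

```r
# Rebuild the flag used in the report: 1 = efficient, 0 = not
cars_flagged <- transform(mtcars, efficient = ifelse(mpg < 22.8, 0, 1))

# 2x2 table of transmission type (am) by efficiency
am_tab <- table(am = cars_flagged$am, efficient = cars_flagged$efficient)

# Exact test of independence; does not rely on large expected counts
fisher_res <- fisher.test(am_tab)
fisher_res$p.value
```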

Efficiency depends on transmission type at the alpha = 0.05 level (Pearson p = 0.007; p = 0.023 with Yates' continuity correction).
In mtcars, am = 0 is automatic and am = 1 is manual. Among efficient vehicles, 77.8% are manual and 22.2% are automatic, a ratio of 3.5 (0.778/0.222).
If a vehicle is manual, the likelihood of it being efficient is 53.8%, as opposed to only 10.5% for automatic vehicles.
The relative risk of being efficient for manual versus automatic vehicles is therefore about 5.1 (0.538/0.105).
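
The proportions discussed above can be recomputed from the contingency table. The sketch below uses base R only and relies on the mtcars coding of am (0 = automatic, 1 = manual); `tab`, `row_prop`, `col_prop`, and `rr` are illustrative names. Relative risk is taken here as the ratio of row proportions, that is, P(efficient | am = 1) / P(efficient | am = 0).

```r
# Rebuild the flag used in the report: 1 = efficient, 0 = not
cars_flagged <- transform(mtcars, efficient = ifelse(mpg < 22.8, 0, 1))
tab <- table(am = cars_flagged$am, efficient = cars_flagged$efficient)

# Row proportions: share of efficient cars within each transmission type
row_prop <- prop.table(tab, margin = 1)

# Column proportions: share of each transmission type among efficient cars
col_prop <- prop.table(tab, margin = 2)

# Relative risk of being efficient, am = 1 versus am = 0
rr <- row_prop["1", "1"] / row_prop["0", "1"]
round(rr, 2)
```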

Conclusion

Through the exploratory data analysis presented in this study, it is observed that manual-transmission vehicles with fewer cylinders are more efficient in terms of MPG.
The mtcars data contain no variable on radios, so their impact on MPG was not examined.