To analyze car performance, we used the psych, tidyverse, and sm packages.
library(psych)
## Warning: package 'psych' was built under R version 4.1.2
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.1.2
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.8 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'readr' was built under R version 4.1.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x ggplot2::%+%() masks psych::%+%()
## x ggplot2::alpha() masks psych::alpha()
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(sm)
## Warning: package 'sm' was built under R version 4.1.2
## Package 'sm', version 2.2-5.7: type help(sm) for summary information
We now read the data from the file. As previously indicated, the data I used is “mtcars”, which contains information about car specifications and performance. The “mtcars” dataset includes 32 cars, each described by 11 variables related to performance and design. The main focus of this research is the quarter-mile time of the cars and the factors that affect it.
data <- read.csv('../data/data.csv')
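If the CSV file is not at hand, a similar data frame can probably be rebuilt from R's built-in `mtcars` dataset. The sketch below assumes that `../data/data.csv` is a plain export of `mtcars` with the car names stored in a column called `X` (as the `summary` output suggests); that assumption is not confirmed by the source.

```r
# Assumption: ../data/data.csv is an export of the built-in mtcars dataset,
# with the row names (car models) written to a character column named "X".
data <- data.frame(X = rownames(mtcars), mtcars, row.names = NULL)

# Quick structural check: 32 cars, 1 name column + 11 numeric indicators
dim(data)
str(data[, 1:3])
```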
As a starting point, I used descriptive statistics to summarize and describe the overall situation. This gives us a better understanding of the general performance of the cars.
summary(data)
## X mpg cyl disp
## Length:32 Min. :10.40 Min. :4.000 Min. : 71.1
## Class :character 1st Qu.:15.43 1st Qu.:4.000 1st Qu.:120.8
## Mode :character Median :19.20 Median :6.000 Median :196.3
## Mean :20.09 Mean :6.188 Mean :230.7
## 3rd Qu.:22.80 3rd Qu.:8.000 3rd Qu.:326.0
## Max. :33.90 Max. :8.000 Max. :472.0
## hp drat wt qsec
## Min. : 52.0 Min. :2.760 Min. :1.513 Min. :14.50
## 1st Qu.: 96.5 1st Qu.:3.080 1st Qu.:2.581 1st Qu.:16.89
## Median :123.0 Median :3.695 Median :3.325 Median :17.71
## Mean :146.7 Mean :3.597 Mean :3.217 Mean :17.85
## 3rd Qu.:180.0 3rd Qu.:3.920 3rd Qu.:3.610 3rd Qu.:18.90
## Max. :335.0 Max. :4.930 Max. :5.424 Max. :22.90
## vs am gear carb
## Min. :0.0000 Min. :0.0000 Min. :3.000 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:3.000 1st Qu.:2.000
## Median :0.0000 Median :0.0000 Median :4.000 Median :2.000
## Mean :0.4375 Mean :0.4062 Mean :3.688 Mean :2.812
## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:4.000 3rd Qu.:4.000
## Max. :1.0000 Max. :1.0000 Max. :5.000 Max. :8.000
describe(data)
## vars n mean sd median trimmed mad min max range skew
## X* 1 32 16.50 9.38 16.50 16.50 11.86 1.00 32.00 31.00 0.00
## mpg 2 32 20.09 6.03 19.20 19.70 5.41 10.40 33.90 23.50 0.61
## cyl 3 32 6.19 1.79 6.00 6.23 2.97 4.00 8.00 4.00 -0.17
## disp 4 32 230.72 123.94 196.30 222.52 140.48 71.10 472.00 400.90 0.38
## hp 5 32 146.69 68.56 123.00 141.19 77.10 52.00 335.00 283.00 0.73
## drat 6 32 3.60 0.53 3.70 3.58 0.70 2.76 4.93 2.17 0.27
## wt 7 32 3.22 0.98 3.33 3.15 0.77 1.51 5.42 3.91 0.42
## qsec 8 32 17.85 1.79 17.71 17.83 1.42 14.50 22.90 8.40 0.37
## vs 9 32 0.44 0.50 0.00 0.42 0.00 0.00 1.00 1.00 0.24
## am 10 32 0.41 0.50 0.00 0.38 0.00 0.00 1.00 1.00 0.36
## gear 11 32 3.69 0.74 4.00 3.62 1.48 3.00 5.00 2.00 0.53
## carb 12 32 2.81 1.62 2.00 2.65 1.48 1.00 8.00 7.00 1.05
## kurtosis se
## X* -1.31 1.66
## mpg -0.37 1.07
## cyl -1.76 0.32
## disp -1.21 21.91
## hp -0.14 12.12
## drat -0.71 0.09
## wt -0.02 0.17
## qsec 0.34 0.32
## vs -2.00 0.09
## am -1.92 0.09
## gear -1.07 0.13
## carb 1.26 0.29
To summarize this part: in order to present a comprehensive view of car performance, we first summarized the 11 automotive indicators. The summary output shows, for the 32 cars, the minimum, 1st quartile, median, mean, 3rd quartile, and maximum of each indicator. The indicators are miles per gallon (mpg), number of cylinders (cyl), displacement (disp), gross horsepower (hp), rear axle ratio (drat), weight (wt), quarter-mile time (qsec), engine (vs), transmission (am), number of forward gears (gear), and number of carburetors (carb). Next, we used describe to report further statistics for the 11 indicators, including standard deviation, range, etc.
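As a small illustration of how some of the `describe` columns relate to each other, the sketch below recomputes the range and standard error for `qsec` with base R. It assumes the built-in `mtcars` dataset matches the CSV used above, and it uses the usual definition of the standard error of the mean, sd / sqrt(n).

```r
# Assumption: the built-in mtcars dataset stands in for ../data/data.csv.
qsec <- mtcars$qsec

n          <- sum(!is.na(qsec))        # number of non-missing observations
range_qsec <- max(qsec) - min(qsec)    # the "range" column
se_qsec    <- sd(qsec) / sqrt(n)       # the "se" column: sd / sqrt(n)

round(c(n = n, range = range_qsec, se = se_qsec), 2)
```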
While reviewing the data, I wondered about the relationship between the speed and the transmission type of a car. NOTE: data_a contains the automatic transmission cars (am = 0); data_b contains the manual transmission cars (am = 1).
table(data$am)
##
## 0 1
## 19 13
data_a <- filter(data,am == 0)
data_b <- filter(data, am != 0)
mean(data_a$qsec)
## [1] 18.18316
mean(data_b$qsec)
## [1] 17.36
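The two group means can also be obtained in one step, without creating separate data frames. This is a minimal sketch using base R's `aggregate` on the built-in `mtcars` dataset, which is assumed to match the CSV; the `transmission` label column is added here purely for readability.

```r
# Assumption: the built-in mtcars dataset stands in for ../data/data.csv.
# Mean quarter-mile time per transmission type (am: 0 = automatic, 1 = manual)
means <- aggregate(qsec ~ am, data = mtcars, FUN = mean)

# Hypothetical helper column to make the output easier to read
means$transmission <- ifelse(means$am == 0, "automatic", "manual")
print(means)
```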
The average quarter-mile time of automatic transmission cars is 18.18316 seconds. The average quarter-mile time of manual transmission cars is 17.36 seconds. This means the latter are faster on average.
## Visualization
The following visualizations (histogram, kernel density, boxplot) of the data give us a better understanding.
hist(data$qsec)
plot(density(data$qsec))
sm.density.compare(data$qsec, data$am, model='equal')
## Test of equal densities: p-value = 0.55
boxplot(qsec ~ am, data=data)
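The boxplot above labels the groups as 0 and 1. One possible way to make the plot easier to read is to convert `am` to a labelled factor first. The sketch below assumes the built-in `mtcars` dataset matches the CSV; the variable name `trans` and the axis labels are illustrative choices.

```r
# Assumption: the built-in mtcars dataset stands in for ../data/data.csv.
# Recode am (0/1) as a labelled factor so the plot axis is self-explanatory.
trans <- factor(mtcars$am, levels = c(0, 1),
                labels = c("automatic", "manual"))

boxplot(mtcars$qsec ~ trans,
        xlab = "Transmission",
        ylab = "Quarter-mile time (s)")
```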
A t-test is used to check whether the difference in car speed (1/4 mile time) between these two types of cars in the sample is statistically significant.
m1 <- t.test(qsec ~ am, data = data)
According to the results, the p-value is 0.23093, which is greater than 0.05, meaning we cannot reject the null hypothesis. Thus, the difference is not significant.
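To make the test more transparent, the sketch below recomputes the statistic by hand. By default `t.test` runs Welch's test, which does not assume equal variances, so the standard error and degrees of freedom follow the Welch–Satterthwaite formulas. The built-in `mtcars` dataset is assumed to match the CSV, and the helper names (`auto`, `manual`, `se_diff`, and so on) are illustrative.

```r
# Assumption: the built-in mtcars dataset stands in for ../data/data.csv.
auto   <- mtcars$qsec[mtcars$am == 0]
manual <- mtcars$qsec[mtcars$am == 1]

# Per-group variance of the mean
v_auto   <- var(auto)   / length(auto)
v_manual <- var(manual) / length(manual)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
se_diff  <- sqrt(v_auto + v_manual)
t_manual <- (mean(auto) - mean(manual)) / se_diff
df_welch <- (v_auto + v_manual)^2 /
  (v_auto^2 / (length(auto) - 1) + v_manual^2 / (length(manual) - 1))
p_manual <- 2 * pt(-abs(t_manual), df = df_welch)   # two-sided p-value

# Compare with the built-in test
m1 <- t.test(qsec ~ am, data = mtcars)
all.equal(unname(m1$statistic), t_manual)
all.equal(m1$p.value, p_manual)
```

The individual components of the fitted test are available as `m1$statistic`, `m1$parameter` (degrees of freedom), and `m1$p.value`.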
m2 <- lm(qsec ~ am, data = data)
summary(m2)
##
## Call:
## lm(formula = qsec ~ am, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8600 -0.9583 -0.3516 1.2517 4.7168
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.1832 0.4056 44.833 <2e-16 ***
## am -0.8232 0.6363 -1.294 0.206
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.768 on 30 degrees of freedom
## Multiple R-squared: 0.05284, Adjusted R-squared: 0.02126
## F-statistic: 1.674 on 1 and 30 DF, p-value: 0.2057
As we can see, the coefficient estimate of am is -0.8232, which means that moving from an automatic (am = 0) to a manual (am = 1) transmission is associated with a 0.8232-second decrease in 1/4 mile time on average. However, the p-value is 0.2057, so the effect of transmission type is not statistically significant in this regression model.
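With a single 0/1 predictor, the regression slope is simply the difference between the two group means, and its p-value coincides with that of a pooled-variance (equal variances) two-sample t-test. The sketch below checks both facts on the built-in `mtcars` dataset, which is assumed to match the CSV; `diff_means` and `m_pooled` are illustrative names.

```r
# Assumption: the built-in mtcars dataset stands in for ../data/data.csv.
m2 <- lm(qsec ~ am, data = mtcars)

# Slope = mean(manual) - mean(automatic)
diff_means <- mean(mtcars$qsec[mtcars$am == 1]) -
              mean(mtcars$qsec[mtcars$am == 0])
all.equal(unname(coef(m2)["am"]), diff_means)

# The slope's p-value equals that of a t-test with pooled variance
m_pooled <- t.test(qsec ~ am, data = mtcars, var.equal = TRUE)
all.equal(m_pooled$p.value, summary(m2)$coefficients["am", "Pr(>|t|)"])
```

This is also why the regression p-value can differ slightly from the default `t.test` result: the default is Welch's test, which does not pool the variances.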
In conclusion, automatic transmission cars take longer than manual transmission cars to cover the 1/4 mile. Although manual cars are faster on average, the difference is not statistically significant.