This report analyzes 2020 car models using the “Cars2020” dataset, obtained from https://www.lock5stat.com/datasets3e/Cars2020.csv. The dataset comprises 21 variables and 110 car models, covering the make, price, specifications, and performance of each car. The primary objective of this report is to answer the following research questions by analyzing this dataset.
Is weight more strongly correlated with LowPrice or with HighPrice?
Do Chevrolet cars have a higher average HighPrice than BMW cars?
Does Acura or Audi differ more significantly from other makes in 0 to 60 mph time?
Do cars with 7 seats cost more overall?
Does fuel capacity have a significant impact on a car’s city miles per gallon?
Which size of vehicle has the most significant impact on braking?
Does car wheelbase have a significant impact on 0 to 30 mph time?
Does car length have a significant impact on 0 to 60 mph time?
Does City MPG have a significant correlation with Highway MPG?
Do sporty cars have a significantly larger U-turn than sedans?
Cars = read.csv("https://www.lock5stat.com/datasets3e/Cars2020.csv")
head(Cars)
##    Make Model   Type LowPrice HighPrice CityMPG HwyMPG Seating Drive Acc030
## 1 Acura   MDX    SUV     44.4     60.15      14     31       7   AWD    2.8
## 2 Acura   RLX  Sedan     54.9     61.00      15     36       5   AWD    2.7
## 3  Audi    A3  Sedan     33.3     43.00      18     40       5   AWD    3.2
## 4  Audi    A4 Sporty     37.4     45.70      18     40       5   AWD    2.7
## 5  Audi    A6  Sedan     54.9     73.90      17     39       5   AWD    2.8
## 6  Audi    A8  Wagon     83.8     83.80      20     27       5   AWD    2.4
##   Acc060 QtrMile Braking FuelCap Length Width Height Wheelbase UTurn Weight
## 1    6.8    15.3     135    19.5    196    77     67       111    40   4200
## 2    6.5    15.0     128    18.5    198    74     58       112    40   3930
## 3    8.3    16.4     124    13.2    175    70     56       104    37   3135
## 4    6.3    14.9     135    15.3    186    73     56       111    40   3630
## 5    6.8    15.3     129    19.3    195    74     57       115    38   4015
## 6    6.1    14.5     133    21.7    209    77     59       123    43   4810
##       Size
## 1 Midsized
## 2 Midsized
## 3    Small
## 4    Small
## 5 Midsized
## 6    Large
cor(Cars$LowPrice, Cars$Weight)
## [1] 0.5771807
t.test(Cars$LowPrice, Cars$Weight)
##
## Welch Two Sample t-test
##
## data: Cars$LowPrice and Cars$Weight
## t = -46.886, df = 109.08, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4002.345 -3677.697
## sample estimates:
## mean of x mean of y
## 34.56955 3874.59091
cor(Cars$HighPrice, Cars$Weight)
## [1] 0.5628846
t.test(Cars$HighPrice, Cars$Weight)
##
## Welch Two Sample t-test
##
## data: Cars$HighPrice and Cars$Weight
## t = -46.665, df = 109.22, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3985.435 -3660.691
## sample estimates:
## mean of x mean of y
## 51.52764 3874.59091
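Weight is slightly more strongly correlated with LowPrice (r = 0.577) than with HighPrice (r = 0.563), so both prices show a moderate positive relationship with weight. The two t-tests above do not assess these correlations: they compare the mean price with the mean weight, two quantities measured in different units, so their p-values (below 2.2e-16) carry no useful meaning for this question. A test of the correlation itself, such as cor.test, is the appropriate check of significance.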
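A correlation test addresses this question more directly than a t-test of the two columns. The sketch below is illustrative only: it uses the six rows printed by head(Cars) as a stand-in data frame (the names cars_head and comparison are assumptions of this example), so its numbers are not the report's results. The same calls would apply to the full Cars data.

```r
# Six rows copied from the head(Cars) output; a stand-in for the full data.
cars_head <- data.frame(
  LowPrice  = c(44.4, 54.9, 33.3, 37.4, 54.9, 83.8),
  HighPrice = c(60.15, 61.00, 43.00, 45.70, 73.90, 83.80),
  Weight    = c(4200, 3930, 3135, 3630, 4015, 4810)
)

# cor.test() tests H0: true correlation = 0, which is what the question asks.
low_test  <- cor.test(cars_head$LowPrice,  cars_head$Weight)
high_test <- cor.test(cars_head$HighPrice, cars_head$Weight)

# Collect the estimate and p-value for each price variable side by side.
comparison <- data.frame(
  price   = c("LowPrice", "HighPrice"),
  r       = unname(c(low_test$estimate, high_test$estimate)),
  p_value = c(low_test$p.value, high_test$p.value)
)
print(comparison)
```

Comparing the two rows of comparison shows which price variable is more strongly related to weight and whether each correlation differs significantly from zero.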
# Subset the data for the two manufacturers to compare
highcost_chevrolet <- Cars$HighPrice[Cars$Make == "Chevrolet"]
highcost_bmw <- Cars$HighPrice[Cars$Make == "BMW"]
# Perform two-sample t-test
t_test_result <- t.test(highcost_chevrolet, highcost_bmw, alternative = "greater")
# Print the result
print(t_test_result)
##
## Welch Two Sample t-test
##
## data: highcost_chevrolet and highcost_bmw
## t = -1.0579, df = 6.5139, p-value = 0.8361
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## -70.75601 Inf
## sample estimates:
## mean of x mean of y
## 58.57 83.74
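The test gives no evidence that Chevrolet cars have a higher average HighPrice than BMW cars (t = -1.06, p = 0.836). The sample means point the other way (58.57 for Chevrolet versus 83.74 for BMW), but with so few models per make (about 6.5 degrees of freedom) that difference is not statistically significant either: the one-sided p-value in the opposite direction is 1 - 0.836 = 0.164.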
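The direction of a one-sided test matters when reading its p-value. As a hedged illustration, the reported statistic and degrees of freedom can be plugged into the t distribution with base R's pt(). The variable names are this example's own, and the inputs are the rounded values from the output above, so the results agree with the printed p-value only to about three decimals.

```r
# Rounded values copied from the Welch t-test output above.
t_stat <- -1.0579
df     <- 6.5139

# H1: Chevrolet mean > BMW mean (the test that was run): upper-tail area.
p_greater <- pt(t_stat, df, lower.tail = FALSE)

# H1: Chevrolet mean < BMW mean (the opposite direction): lower-tail area.
p_less <- pt(t_stat, df)

# Two-sided p-value: twice the smaller tail area.
p_two_sided <- 2 * min(p_greater, p_less)

print(round(c(greater = p_greater, less = p_less, two_sided = p_two_sided), 4))
```

None of the three p-values falls below 0.05, so this sample does not establish a price difference between the two makes in either direction.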
t.test(Cars$Acc060 ~ Cars$Make == "Acura")
##
## Welch Two Sample t-test
##
## data: Cars$Acc060 by Cars$Make == "Acura"
## t = 5.6606, df = 4.0371, p-value = 0.004669
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
## 0.6168594 1.7961036
## sample estimates:
## mean in group FALSE mean in group TRUE
## 7.856481 6.650000
t.test(Cars$Acc060 ~ Cars$Make == "Audi")
##
## Welch Two Sample t-test
##
## data: Cars$Acc060 by Cars$Make == "Audi"
## t = 2.2823, df = 7.1082, p-value = 0.05588
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
## -0.02900649 1.79439111
## sample estimates:
## mean in group FALSE mean in group TRUE
## 7.882692 7.000000
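Acura shows the more significant difference: Acura models reach 60 mph faster than the other makes on average (6.65 versus 7.86 seconds, p = 0.0047), whereas the Audi difference (7.00 versus 7.88 seconds, p = 0.056) falls just short of significance at the 0.05 level. Both comparisons rest on only a handful of models per make, so they should be read with caution.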
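Because each make contributes few models, it helps to report the group size next to each p-value. The sketch below wraps the make-versus-rest comparison in a helper; make_vs_rest, cars_head and summary_table are hypothetical names, and the data are only the six rows shown by head(Cars), so the numbers are a stand-in rather than the report's results.

```r
# Make and Acc060 for the six rows shown by head(Cars); a stand-in data set.
cars_head <- data.frame(
  Make   = c("Acura", "Acura", "Audi", "Audi", "Audi", "Audi"),
  Acc060 = c(6.8, 6.5, 8.3, 6.3, 6.8, 6.1)
)

# Hypothetical helper: Welch t-test of one make against all other cars,
# returned together with the group size and both group means.
make_vs_rest <- function(data, make) {
  is_make <- data$Make == make
  result  <- t.test(data$Acc060[is_make], data$Acc060[!is_make])
  data.frame(
    make      = make,
    n_make    = sum(is_make),
    mean_make = mean(data$Acc060[is_make]),
    mean_rest = mean(data$Acc060[!is_make]),
    p_value   = result$p.value
  )
}

# One row per make of interest.
summary_table <- do.call(
  rbind,
  lapply(c("Acura", "Audi"), make_vs_rest, data = cars_head)
)
print(summary_table)
```

With only two makes in this stand-in, the two rows mirror each other and share a p-value; on the full Cars data each make would be compared against all remaining cars.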
# LowPrice of cars with 7 seats and of cars without
seven_seats <- Cars$LowPrice[Cars$Seating == 7]
not_seven_seats <- Cars$LowPrice[Cars$Seating != 7]
# Perform two-sample t-test
t_test_result_1 <- t.test(seven_seats, not_seven_seats)
# Print the result
print(t_test_result_1)
##
## Welch Two Sample t-test
##
## data: seven_seats and not_seven_seats
## t = 1.001, df = 28.15, p-value = 0.3253
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.294154 9.594005
## sample estimates:
## mean of x mean of y
## 37.31857 34.16865
# HighPrice of cars with 7 seats and of cars without
seven_seats <- Cars$HighPrice[Cars$Seating == 7]
not_seven_seats <- Cars$HighPrice[Cars$Seating != 7]
# Perform two-sample t-test
t_test_result_1 <- t.test(seven_seats, not_seven_seats)
# Print the result
print(t_test_result_1)
##
## Welch Two Sample t-test
##
## data: seven_seats and not_seven_seats
## t = 0.22717, df = 28.151, p-value = 0.8219
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -9.22499 11.52698
## sample estimates:
## mean of x mean of y
## 52.53214 51.38115
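Seven-seat cars have slightly higher average prices in this sample (LowPrice 37.32 versus 34.17; HighPrice 52.53 versus 51.38), but neither difference is statistically significant (p = 0.325 and p = 0.822), and both confidence intervals include zero. The data therefore do not show that cars with 7 seats cost more overall.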
cor.test(Cars$FuelCap, Cars$CityMPG)
##
## Pearson's product-moment correlation
##
## data: Cars$FuelCap and Cars$CityMPG
## t = -15.772, df = 108, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.8840484 -0.7678422
## sample estimates:
## cor
## -0.8350299
Fuel capacity has a strong, statistically significant negative correlation with city MPG (r = -0.835, p < 2.2e-16): as fuel capacity goes up, city MPG goes down.
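To show where the test statistic comes from, the sketch below recomputes it from the reported correlation using the standard relation t = r * sqrt(n - 2) / sqrt(1 - r^2). The variable names are this example's own; r and the degrees of freedom are copied from the output above.

```r
r  <- -0.8350299   # sample correlation from the cor.test output
df <- 108          # degrees of freedom, n - 2 with n = 110 cars

# Share of the variance in city MPG associated with fuel capacity.
r_squared <- r^2

# t statistic for H0: true correlation = 0.
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

print(c(r_squared = r_squared, t = t_stat, p_value = p_value))
```

The recomputed t agrees with the printed value of -15.772, and r squared of about 0.70 indicates that fuel capacity is associated with roughly 70 percent of the variation in city MPG.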
braking_small <- Cars$Braking[Cars$Size == "Small"]
braking_midsized <- Cars$Braking[Cars$Size == "Midsized"]
braking_large <- Cars$Braking[Cars$Size == "Large"]
# One-sided test: is mean braking for large cars less than for midsized cars?
t_test_result_1 <- t.test(braking_large, braking_midsized, alternative = "less")
# One-sided test: is mean braking for small cars less than for midsized cars?
t_test_result_2 <- t.test(braking_small, braking_midsized, alternative = "less")
# Print the results
print(t_test_result_1)
##
## Welch Two Sample t-test
##
## data: braking_large and braking_midsized
## t = 2.5607, df = 35.396, p-value = 0.9926
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 6.782956
## sample estimates:
## mean of x mean of y
## 135.4286 131.3415
print(t_test_result_2)
##
## Welch Two Sample t-test
##
## data: braking_small and braking_midsized
## t = -2.2101, df = 84.909, p-value = 0.01489
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf -0.7395089
## sample estimates:
## mean of x mean of y
## 128.3542 131.3415
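Small cars have a significantly shorter braking distance than midsized cars (128.4 versus 131.3, p = 0.015). The test of whether large cars stop shorter than midsized cars finds no support (p = 0.993); the sample means point the opposite way, with large cars averaging the longest braking distance (135.4). This suggests that larger cars take longer to come to a stop.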
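A per-group summary makes the ordering of the three sizes visible before any pairwise test is run. The following sketch uses only the six rows shown by head(Cars) as stand-in data (cars_head, size_levels and braking_summary are names assumed for this example), so its values are illustrative rather than the report's results.

```r
# Size and Braking for the six rows shown by head(Cars); a stand-in data set.
cars_head <- data.frame(
  Size    = c("Midsized", "Midsized", "Small", "Small", "Midsized", "Large"),
  Braking = c(135, 128, 124, 135, 129, 133)
)

# Order the size categories from smallest to largest.
size_levels <- c("Small", "Midsized", "Large")
cars_head$Size <- factor(cars_head$Size, levels = size_levels)

# Group size and mean braking for each size category.
braking_summary <- data.frame(
  size         = size_levels,
  n            = as.vector(table(cars_head$Size)),
  mean_braking = as.vector(tapply(cars_head$Braking, cars_head$Size, mean))
)
print(braking_summary)
```

Showing n beside each mean also flags groups that are too small to support a reliable test.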
cor.test(Cars$Wheelbase, Cars$Acc030)
##
## Pearson's product-moment correlation
##
## data: Cars$Wheelbase and Cars$Acc030
## t = -4.3873, df = 108, p-value = 2.68e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.5370596 -0.2175284
## sample estimates:
## cor
## -0.3889287
Wheelbase has a moderate, statistically significant negative correlation with 0 to 30 mph time (r = -0.389, p = 2.68e-05): cars with a longer wheelbase tend to reach 30 mph sooner.
cor.test(Cars$Length, Cars$Acc060)
##
## Pearson's product-moment correlation
##
## data: Cars$Length and Cars$Acc060
## t = -5.7331, df = 108, p-value = 9.06e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.6146840 -0.3252093
## sample estimates:
## cor
## -0.4830374
Car length has a moderate, statistically significant negative correlation with 0 to 60 mph time (r = -0.483, p = 9.06e-08): longer cars tend to reach 60 mph sooner.
cor.test(Cars$CityMPG, Cars$HwyMPG)
##
## Pearson's product-moment correlation
##
## data: Cars$CityMPG and Cars$HwyMPG
## t = 18.968, df = 108, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8252730 0.9141287
## sample estimates:
## cor
## 0.8769963
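City MPG and highway MPG have a strong, statistically significant positive correlation (r = 0.877, p < 2.2e-16): cars that are efficient in the city tend to be efficient on the highway as well.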
uturn_sporty <- Cars$UTurn[Cars$Type == "Sporty"]
uturn_sedan <- Cars$UTurn[Cars$Type == "Sedan"]
# One-sided two-sample t-test: is the mean U-turn larger for sporty cars?
t_test_result_3 <- t.test(uturn_sporty, uturn_sedan, alternative = "greater")
# Print the result
print(t_test_result_3)
No U-turn result is reported here: the run that produced this report printed the braking comparison (t_test_result_1) instead of the U-turn test, so the question of whether sporty cars have a larger U-turn than sedans remains open.
The key takeaways from each question, in the order they were asked, are as follows.
Weight is moderately correlated with both prices, slightly more with LowPrice (r = 0.577) than with HighPrice (r = 0.563); the t-tests run for this question do not measure the significance of those correlations.
There is no evidence that Chevrolet cars have a higher average HighPrice than BMW cars (p = 0.836); BMW's sample mean is higher, but the difference is not statistically significant.
Acura differs significantly from other makes in 0 to 60 mph time (p = 0.0047), while Audi falls just short of significance (p = 0.056).
Cars with 7 seats do not cost significantly more for either LowPrice (p = 0.325) or HighPrice (p = 0.822).
Fuel capacity has a strong, significant negative correlation with city MPG (r = -0.835): as fuel capacity goes up, MPG goes down.
Small cars brake in a significantly shorter distance than midsized cars (p = 0.015), and large cars have the longest average braking distance.
Wheelbase has a significant negative correlation with 0 to 30 mph time (r = -0.389).
Car length has a significant negative correlation with 0 to 60 mph time (r = -0.483).
City MPG has a strong, significant positive correlation with highway MPG (r = 0.877).
The U-turn comparison between sporty cars and sedans remains unanswered, because the output printed for it was the braking test.
Based on the results above, some general conclusions can be made. Car weight is moderately related to price, but the data do not show a significant price difference between Chevrolet and BMW or between seven-seat cars and other cars. Physical attributes such as wheelbase and car length are significantly, though moderately, related to acceleration times, with longer cars tending to accelerate faster. Fuel capacity is strongly and negatively related to city MPG.
This project shows how p-values can be used to evaluate whether the relationship between two variables is statistically significant. It is important to note that comparisons based on small groups, such as the few models available for a single make, have little statistical power, so a real difference may still produce a large p-value. More generally, statistics help answer specific questions about collected data, provided that the test matches the question and its output is read correctly.
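The effect of sample size on a p-value can be sketched with the usual t statistic for a correlation. In the example below, the helper p_from_r and the correlation of 0.5 are hypothetical choices made for illustration and are not taken from the dataset; only the sample size of 110 matches the report.

```r
# Hypothetical helper: two-sided p-value for a sample correlation r
# observed on n pairs, testing H0: true correlation = 0.
p_from_r <- function(r, n) {
  df     <- n - 2
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)
  2 * pt(-abs(t_stat), df)
}

r <- 0.5  # illustrative correlation, not a value from Cars2020

p_small <- p_from_r(r, n = 10)    # a small group
p_large <- p_from_r(r, n = 110)   # the size of the full dataset

print(c(n_10 = p_small, n_110 = p_large))
```

The same correlation is not significant at the 0.05 level with 10 observations but is highly significant with 110, which is why results for makes with only a few models should be read cautiously.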