Aniketh Reddy Nimma (s3670774), Krishna Sai Patha (s3670773), Vikrant Yadav (s3676697)
Last updated: 22 October, 2017
Is there a difference in fuel consumption between cars made in 2000 and cars made in 2011?
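library(ggplot2)  # ggplot(), geom_histogram(), geom_vline()
library(dplyr)    # %>% and summarise()
library(car)      # qqPlot() (assumed here to come from the car package)
avgp <- read.csv("fuel_consumption.csv")
avgp$engineV <- factor(avgp$engineV, levels = avgp$engineV[1:22], ordered = TRUE)
X2000.m <- mean(avgp$X2000)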
X2000.sd <- sd(avgp$X2000)
ggplot(avgp, aes(X2000, fill = Type)) +
geom_histogram(color = "black", binwidth = 1, position = "stack") +
geom_vline(xintercept = c(X2000.m, X2000.m+X2000.sd, X2000.m-X2000.sd),
colour = c("gold", rep("gray21", 2)), lwd = c(2, 1, 1)) +
ggtitle("Distribution of fuel consumption of cars in the year 2000") +
xlab("Fuel consumption in year 2000")
avgdsummarize <- avgp %>% summarise(Min = min(X2000, na.rm = TRUE),
Q1 = quantile(X2000,probs = .25,na.rm = TRUE),
Median = median(X2000, na.rm = TRUE),
Q3 = quantile(X2000,probs = .75,na.rm = TRUE),
Max = max(X2000,na.rm = TRUE),
Mean = mean(X2000, na.rm = TRUE),
SD = sd(X2000, na.rm = TRUE),
Count = n())
knitr::kable(avgdsummarize)

| Min | Q1 | Median | Q3 | Max | Mean | SD | Count |
|---|---|---|---|---|---|---|---|
| 4.5 | 6.0425 | 7.745 | 9.4275 | 12.12 | 7.834091 | 2.201559 | 44 |
X2011.m <- mean(avgp$X2011)
X2011.sd <- sd(avgp$X2011)
ggplot(avgp, aes(X2011, fill = Type)) +
geom_histogram(color = "black", binwidth = 1, position = "stack") +
geom_vline(xintercept = c(X2011.m, X2011.m+X2011.sd, X2011.m-X2011.sd),
colour = c("gold", rep("gray21", 2)), lwd = c(2, 1, 1)) +
ggtitle("Distribution of fuel consumption of cars in the year 2011") +
xlab("Fuel consumption in year 2011")
avgpsummarize <- avgp %>% summarise(Min = min(X2011, na.rm = TRUE),
Q1 = quantile(X2011,probs = .25,na.rm = TRUE),
Median = median(X2011, na.rm = TRUE),
Q3 = quantile(X2011,probs = .75,na.rm = TRUE),
Max = max(X2011,na.rm = TRUE),
Mean = mean(X2011, na.rm = TRUE),
SD = sd(X2011, na.rm = TRUE),
Count = n())
knitr::kable(avgpsummarize)

| Min | Q1 | Median | Q3 | Max | Mean | SD | Count |
|---|---|---|---|---|---|---|---|
| 3.6 | 5.2675 | 6.71 | 7.91 | 11.89 | 6.665682 | 1.827876 | 44 |
The null hypothesis is that there is no difference in average fuel consumption by cars between the years 2000 and 2011.
\[H_0: \mu_1 = \mu_2\]
The alternative hypothesis is that there is a difference in average fuel consumption by cars between the years 2000 and 2011.
\[H_A: \mu_1 \ne \mu_2\]
Here \(\mu_1\) and \(\mu_2\) denote the mean fuel consumption in 2000 and 2011, respectively.
avgp$X2000 %>% qqPlot(dist = "norm")
avgp$X2011 %>% qqPlot(dist = "norm")
t.test(avgp$X2000, avgp$X2011, alternative = "two.sided")
##
## Welch Two Sample t-test
##
## data: avgp$X2000 and avgp$X2011
## t = 2.7085, df = 83.187, p-value = 0.008203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.3104354 2.0263828
## sample estimates:
## mean of x mean of y
## 7.834091 6.665682
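
As a rough check on the output above, the Welch statistic and its degrees of freedom can be recomputed by hand from the means, standard deviations and counts in the two summary tables. The symbols \(\bar{x}_i\), \(s_i\), \(n_i\) and \(\nu\) are notation introduced here for illustration and do not come from the report. The intermediate values are rounded, so this is only a sketch of the arithmetic, not a replacement for the `t.test()` result.

\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
  = \frac{7.834091 - 6.665682}{\sqrt{\dfrac{2.201559^2}{44} + \dfrac{1.827876^2}{44}}}
  \approx \frac{1.1684}{0.4314} \approx 2.71
\]

\[
\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
           {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
    \approx \frac{(0.1102 + 0.0759)^2}{\dfrac{0.1102^2}{43} + \dfrac{0.0759^2}{43}}
    \approx 83.2
\]

Both values agree with the reported t = 2.7085 and df = 83.187 up to rounding. The reported 95 percent confidence interval is also centred on the same difference in sample means: (0.3104354 + 2.0263828) / 2 ≈ 1.1684.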