MATH1322 Assignment 4

ANALYSIS ON FUEL CONSUMPTION OF PETROL TYPE IN THE YEARS 2000 AND 2011

Aniketh Reddy Nimma (s3670774), Krishna Sai Patha (s3670773), Vikrant Yadav (s3676697)

Last updated: 22 October, 2017

Introduction

Introduction Cont.

Problem Statement

Is there a difference in mean fuel consumption between cars made in 2000 and cars made in 2011?

Data

library(dplyr); library(ggplot2); library(car)  # %>% and summarise(), ggplot(), qqPlot()
avgp <- read.csv("fuel_consumption.csv")

Data Cont.

avgp$engineV <- factor(avgp$engineV, levels = avgp$engineV[1:22], ordered = TRUE)

Descriptive Statistics and Visualisation

X2000.m <- mean(avgp$X2000)
X2000.sd <- sd(avgp$X2000)
ggplot(avgp, aes(X2000, fill = Type)) +
  geom_histogram(color = "black", binwidth = 1, position = "stack") +
  geom_vline(xintercept = c(X2000.m, X2000.m+X2000.sd, X2000.m-X2000.sd),
             colour = c("gold", rep("gray21", 2)), lwd = c(2, 1, 1)) +
  ggtitle("Distribution of fuel consumption of cars in the year 2000") +
  xlab("Fuel consumption in year 2000")

Descriptive Statistics Cont.

avgdsummarize <- avgp %>% summarise(Min = min(X2000, na.rm = TRUE),
                                    Q1 = quantile(X2000, probs = .25, na.rm = TRUE),
                                    Median = median(X2000, na.rm = TRUE),
                                    Q3 = quantile(X2000, probs = .75, na.rm = TRUE),
                                    Max = max(X2000, na.rm = TRUE),
                                    Mean = mean(X2000, na.rm = TRUE),
                                    SD = sd(X2000, na.rm = TRUE),
                                    Count = n())
knitr::kable(avgdsummarize)
Min   Q1      Median  Q3      Max    Mean      SD        Count
4.5   6.0425  7.745   9.4275  12.12  7.834091  2.201559  44

Descriptive Statistics Cont.

X2011.m <- mean(avgp$X2011)
X2011.sd <- sd(avgp$X2011)
ggplot(avgp, aes(X2011, fill = Type)) +
  geom_histogram(color = "black", binwidth = 1, position = "stack") +
  geom_vline(xintercept = c(X2011.m, X2011.m+X2011.sd, X2011.m-X2011.sd),
             colour = c("gold", rep("gray21", 2)), lwd = c(2, 1, 1)) +
  ggtitle("Distribution of fuel consumption of cars in the year 2011") +
  xlab("Fuel consumption in year 2011")

Descriptive Statistics Cont.

avgpsummarize <- avgp %>% summarise(Min = min(X2011, na.rm = TRUE),
                                    Q1 = quantile(X2011, probs = .25, na.rm = TRUE),
                                    Median = median(X2011, na.rm = TRUE),
                                    Q3 = quantile(X2011, probs = .75, na.rm = TRUE),
                                    Max = max(X2011, na.rm = TRUE),
                                    Mean = mean(X2011, na.rm = TRUE),
                                    SD = sd(X2011, na.rm = TRUE),
                                    Count = n())
knitr::kable(avgpsummarize)
Min   Q1      Median  Q3    Max    Mean      SD        Count
3.6   5.2675  6.71    7.91  11.89  6.665682  1.827876  44
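
The two summary tables repeat the same eight statistics for each year, so the calculation could be wrapped in one function. The sketch below is a minimal base-R version; `describe_fuel` and the `toy` vector are hypothetical names introduced here for illustration and are not part of the original analysis.

```r
# Hypothetical helper (not in the original slides): returns the same eight
# statistics as the summarise() calls, as a one-row data frame.
describe_fuel <- function(x) {
  data.frame(
    Min    = min(x, na.rm = TRUE),
    Q1     = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    Median = median(x, na.rm = TRUE),
    Q3     = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    Max    = max(x, na.rm = TRUE),
    Mean   = mean(x, na.rm = TRUE),
    SD     = sd(x, na.rm = TRUE),
    Count  = length(x)   # counts all entries, like n() in summarise()
  )
}

# Toy values used only to exercise the function; this is NOT the fuel data.
toy <- c(1, 2, 3, 4, 5)
toy_summary <- describe_fuel(toy)
print(toy_summary)
```

On the real data, something like `rbind(describe_fuel(avgp$X2000), describe_fuel(avgp$X2011))` would put both years in one table, assuming the column names used elsewhere in these slides.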

Hypothesis Testing
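
The two-sided test reported in the t-test output ("true difference in means is not equal to 0") corresponds to the hypotheses below. The symbols $\mu_{2000}$ and $\mu_{2011}$ are notation assumed here for the population mean fuel consumption in each year; they do not appear in the original slides.

```latex
\begin{aligned}
H_0 &: \mu_{2000} - \mu_{2011} = 0 \\
H_A &: \mu_{2000} - \mu_{2011} \neq 0
\end{aligned}
```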

Hypothesis Testing Cont.

avgp$X2000 %>% qqPlot(dist = "norm")

Hypothesis Testing Cont.

avgp$X2011 %>% qqPlot(dist = "norm")
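
A Q-Q plot is a visual check, and a formal test can complement it. The sketch below uses base R's `shapiro.test()`, which is a swapped-in technique and not something the original slides report. `normality_p` is a hypothetical helper, and the input is simulated purely to show the call. With 44 observations per year, the t-test is also usually considered fairly robust to moderate departures from normality.

```r
# Hypothetical helper (not in the original analysis): Shapiro-Wilk p-value
# for a numeric vector, with missing values dropped first.
normality_p <- function(x) {
  x <- x[!is.na(x)]
  shapiro.test(x)$p.value
}

# Simulated placeholder values, NOT the fuel consumption data.
set.seed(1)
simulated <- rnorm(44)
p_sim <- normality_p(simulated)
print(p_sim)

# On the real data (assuming the columns used above):
# normality_p(avgp$X2000)
# normality_p(avgp$X2011)
```

A small p-value would suggest a departure from normality; a large one does not prove normality, so it is best read alongside the Q-Q plots.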

Hypothesis Testing Cont.

t.test(avgp$X2000, avgp$X2011, alternative = "two.sided")
## 
##  Welch Two Sample t-test
## 
## data:  avgp$X2000 and avgp$X2011
## t = 2.7085, df = 83.187, p-value = 0.008203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.3104354 2.0263828
## sample estimates:
## mean of x mean of y 
##  7.834091  6.665682
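
The Welch statistic, degrees of freedom, p-value and confidence interval above can be reproduced from the summary statistics alone, which helps show where each number comes from. The sketch below uses the means, standard deviations and counts from the two summary tables; the variable names are introduced here for illustration.

```r
# Summary statistics copied from the descriptive tables (2000 vs 2011).
m1 <- 7.834091; s1 <- 2.201559; n1 <- 44
m2 <- 6.665682; s2 <- 1.827876; n2 <- 44

# Standard error of the difference in means without pooling variances.
v1 <- s1^2 / n1
v2 <- s2^2 / n2
se <- sqrt(v1 + v2)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat <- (m1 - m2) / se
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided p-value and 95% confidence interval for the mean difference.
p_value <- 2 * pt(-abs(t_stat), df)
ci <- (m1 - m2) + c(-1, 1) * qt(0.975, df) * se

print(c(t = t_stat, df = df, p = p_value))
print(ci)
```

Up to rounding of the inputs, these values should agree with the `t.test()` output: t near 2.7085 on about 83.19 degrees of freedom, with an interval of roughly 0.31 to 2.03 for the mean difference.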

Discussion

References