MATH1324 Assignment 4

Impact of weather season on weekend and weekday cycling

Timothy Davidson

27 May 2017

Introduction

Problem Statement

Is there statistical evidence of an association between weather season and cycling flows on weekdays and weekends?

How can statistics help?

A hypothesis test can help decide between two explanations for the observed differences:

  1. Whether the relationship between the two variables is statistically significant; or

  2. Whether the variation reflects natural sampling variability, assuming weather season and weekday/weekend riding are independent.
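
For reference, the kind of test described here is usually based on Pearson's chi-square statistic, which is the test reported later in this document. The expression below is the standard textbook form, not something stated on the original slide; the symbols O_ij (observed count), E_ij (expected count under independence), R_i (row total), C_j (column total) and n (grand total) are notation introduced here for illustration.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad E_{ij} = \frac{R_i \, C_j}{n},
\qquad df = (r - 1)(c - 1)
```

With r = 2 day types and c = 4 seasons this gives df = 3, which matches the degrees of freedom reported by the test.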

Data

Data source and attribution

Data (cont.)

About the data

Data (cont.)

Factors

Importing and cleansing the data in R

library(readr)     # read_csv()
library(magrittr)  # %>% pipe (assumed; dplyr also provides it)

Cycling <- read_csv("~/Desktop/Cycling.csv")

# Convert Season to a factor and check its levels
Cycling$Season <- Cycling$Season %>% as.factor
levels(Cycling$Season)
## [1] "Autumn" "Spring" "Summer" "Winter"

# Relabel the TRUE/FALSE weekend flag as a two-level factor
Cycling$weekend <- Cycling$weekend %>% factor(levels = c("FALSE", "TRUE"),
                                              labels = c("Weekday", "Weekend"))
levels(Cycling$weekend)
## [1] "Weekday" "Weekend"

Descriptive Statistics and Visualisation

Summary tables

# Counts by season and day type
table1 <- table(Cycling$Season, Cycling$weekend)
knitr::kable(table1)

Season    Weekday   Weekend
Autumn       9297      3728
Spring      10807      4271
Summer       9656      3998
Winter      10341      4275

# Column percentages: share of each day type's rides that fall in each season
table2 <- table(Cycling$Season, Cycling$weekend) %>% prop.table(margin = 2) * 100
knitr::kable(table2)

Season    Weekday (%)   Weekend (%)
Autumn       23.18396      22.91052
Spring       26.94945      26.24754
Summer       24.07920      24.56981
Winter       25.78739      26.27212
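
The `margin` argument of `prop.table()` decides which totals the percentages are taken over. The sketch below rebuilds the table from the counts shown above so it runs without the CSV file; the object names `counts`, `col_pct` and `row_pct` are illustrative and are not part of the original analysis.

```r
# Counts copied from the summary table; names are illustrative only.
counts <- matrix(
  c( 9297, 3728,
    10807, 4271,
     9656, 3998,
    10341, 4275),
  nrow = 4, byrow = TRUE,
  dimnames = list(Season = c("Autumn", "Spring", "Summer", "Winter"),
                  Day    = c("Weekday", "Weekend"))
)

# margin = 2: percentages within each column (each day type sums to 100)
col_pct <- prop.table(counts, margin = 2) * 100

# margin = 1: percentages within each row (each season sums to 100)
row_pct <- prop.table(counts, margin = 1) * 100

colSums(col_pct)
rowSums(row_pct)
```

Column percentages, as used in the slides, compare the seasonal mix of weekday rides with the seasonal mix of weekend rides.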

Clustered bar chart

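library(RColorBrewer)  # brewer.pal()

# Column percentages: seasonal share within weekdays and within weekends
Season_weekend <- table(Cycling$Season, Cycling$weekend) %>% prop.table(margin = 2) * 100

barplot(Season_weekend,
        main = "Weekday and weekend cycling flows by weather season",
        ylab = "Percentage",
        xlab = "Weekend or weekday",
        beside = TRUE,
        legend = rownames(Season_weekend),
        # args.legend expects a list; c() would coerce horiz = TRUE to a string
        args.legend = list(x = "top", horiz = TRUE, title = "Weather season"),
        ylim = c(0, 35),
        col = brewer.pal(4, name = "RdBu"))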

Hypothesis Testing

Type of hypothesis test:

Hypothesis:

Hypothesis Testing: assumptions and decision rules

Assumptions:

Decision Rules:

Results of Hypothesis Testing

  1. Weekend rides are over-represented in summer and winter and under-represented in autumn and spring, relative to the expected counts.

  2. Weekday rides are over-represented in autumn and spring and under-represented in summer and winter, relative to the expected counts.

Hypothesis test

chi1 <- chisq.test(table(Cycling$weekend,Cycling$Season))
chi1
## 
##  Pearson's Chi-squared test
## 
## data:  table(Cycling$weekend, Cycling$Season)
## X-squared = 4.706, df = 3, p-value = 0.1946
chi1$observed #observed values
##          
##           Autumn Spring Summer Winter
##   Weekday   9297  10807   9656  10341
##   Weekend   3728   4271   3998   4275

Hypothesis test (cont.)

chi1$expected %>% round(2) #expected values
##          
##            Autumn   Spring  Summer   Winter
##   Weekday 9265.35 10725.75 9712.79 10397.11
##   Weekend 3759.65  4352.25 3941.21  4218.89
chi1$stdres %>% round(2) #standardised residuals
##          
##           Autumn Spring Summer Winter
##   Weekday   0.70   1.71  -1.23  -1.19
##   Weekend  -0.70  -1.71   1.23   1.19
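
One common way to read standardised residuals is to flag cells whose absolute value exceeds roughly 1.96, the two-sided 5% normal cut-off. The sketch below applies that rule to the observed counts shown above. The threshold and the names `observed`, `res` and `flagged` are assumptions introduced here, not part of the original slides.

```r
# Observed counts copied from chi1$observed; names are illustrative only.
observed <- matrix(
  c(9297, 10807, 9656, 10341,
    3728,  4271, 3998,  4275),
  nrow = 2, byrow = TRUE,
  dimnames = list(Day    = c("Weekday", "Weekend"),
                  Season = c("Autumn", "Spring", "Summer", "Winter"))
)

# Standardised residuals from the same Pearson chi-square test
res <- chisq.test(observed)$stdres

# Cells whose residual exceeds the assumed two-sided 5% cut-off
flagged <- which(abs(res) > qnorm(0.975), arr.ind = TRUE)
nrow(flagged)  # number of cells flagged
```

With the residuals reported above, the largest magnitude is 1.71 (Spring), so no cell crosses this cut-off.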

Hypothesis test (cont.)

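# Critical value of the chi-square distribution at the 0.05 level, df = 3
qchisq(p = .95, df = 3)
## [1] 7.814728
# p-value recomputed from the rounded statistic reported above
pchisq(q = 4.706, df = 3, lower.tail = FALSE)
## [1] 0.1946353
# p-value stored in the fitted test (uses the unrounded statistic)
chi1$p.value
## [1] 0.194632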
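
To make the link between the observed counts, the test statistic and the critical value explicit, the calculation can be reproduced by hand. This is a minimal sketch that assumes a 0.05 significance level (implied by the `qchisq(p = .95, ...)` call) and uses illustrative object names. It rebuilds the table from the printed counts rather than reading the CSV file.

```r
# Observed counts copied from chi1$observed; names are illustrative only.
observed <- matrix(
  c(9297, 10807, 9656, 10341,
    3728,  4271, 3998,  4275),
  nrow = 2, byrow = TRUE,
  dimnames = list(Day    = c("Weekday", "Weekend"),
                  Season = c("Autumn", "Spring", "Summer", "Winter"))
)

n <- sum(observed)

# Expected counts under independence: (row total * column total) / n
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson chi-square statistic and its degrees of freedom
x2 <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# p-value and critical-value comparison at the assumed 0.05 level
p_value   <- pchisq(x2, df = df, lower.tail = FALSE)
critical  <- qchisq(0.95, df = df)
reject_h0 <- x2 > critical

round(c(x2 = x2, df = df, p_value = p_value, critical = critical), 4)
reject_h0
```

Both routes lead to the same decision: the statistic is compared with the critical value, or equivalently the p-value is compared with 0.05.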

Discussion

Major findings

Discussion (cont.)

Strengths and limitations

References

Data.vic.gov.au, ‘Bicycle Volumes - VicRoads’, accessed 23 May 2017.