## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: ggplot2
For the purpose of this assignment, I will be assigning numbers to the attendance numbers, representing the week number that the corresponding data point was collected. The first week that the attendance was counted is week 1, and the last data point is week 209.
head(church.tseries)
## year month worship
## 1 1993 1 336
## 2 1993 1 264
## 3 1993 1 382
## 4 1993 1 325
## 5 1993 1 346
## 6 1993 2 464
church.data<-data.frame(
Week = 1:209,
Attendance = church.tseries$worship
)
head(church.data)
## Week Attendance
## 1 1 336
## 2 2 264
## 3 3 382
## 4 4 325
## 5 5 346
## 6 6 464
ggplot(church.data, aes(Week, Attendance)) +
geom_point() +
labs(title = "Weekly Worship Attendance", subtitle="from January 1993 to December 1996")
church.data <- mutate(church.data, smooth.3RS3R.twice = as.vector(smooth(Attendance, kind="3RS3R", twiceit=TRUE)))
church.data <- mutate(church.data,
FinalRough = Attendance - smooth.3RS3R.twice)
head(church.data)
## Week Attendance smooth.3RS3R.twice FinalRough
## 1 1 336 336 0
## 2 2 264 336 -72
## 3 3 382 336 46
## 4 4 325 346 -21
## 5 5 346 346 0
## 6 6 464 346 118
ggplot(church.data, aes(Week, Attendance)) +
geom_point() +
geom_line(aes(Week, smooth.3RS3R.twice), color="red")
Based on the smooth curve, the worship attendance numbers seem to
increase and decrease in cycles. This makes me believe there may be
certain times of the year where attendance increases or decreases.
Despite the peaks and valleys, the attendance overall seems to be
trending slightly upward.
ggplot(church.data, aes(Week, FinalRough)) +
geom_point() +
geom_hline(yintercept=0, color="blue")
Similar to the graph of the smooth curve, the rough values seem to be
somewhat oscillating between negative and positive values. There is a
group of three extreme low rough values just after week 100, which may
indicate some major event that would make attendance decrease for a few
weeks.
church.data <- mutate(church.data, RoughSize = abs(FinalRough))
stem.leaf(church.data$RoughSize, m=2)
## 1 | 2: represents 12
## leaf unit: 1
## n: 209
## 89 0* | 00000000000000000000000000000000000000000000000000000000000000000000000000001111222223334
## (18) 0. | 566667777888888999
## 102 1* | 001112223344444
## 87 1. | 55566677789
## 76 2* | 000133
## 70 2. | 555566679
## 61 3* | 01133333444
## 50 3. | 5799
## 46 4* | 012234444
## 37 4. | 5667789
## 30 5* | 14
## 28 5. | 56678
## 23 6* | 11113
## 18 6. | 66
## 16 7* | 24
## 7. |
## 14 8* | 0
## HI: 87 94 97 104 115 117 118 126 131 137 160 164 211
fivenum(church.data$RoughSize)
## [1] 0 0 9 34 211
lval(church.data$RoughSize)
## depth lo hi mids spreads
## M 105.0 9 9.0 9.00 0.0
## H 53.0 0 34.0 17.00 34.0
## E 27.0 0 56.0 28.00 56.0
## D 14.0 0 80.0 40.00 80.0
## C 7.5 0 117.5 58.75 117.5
## B 4.0 0 137.0 68.50 137.0
## A 2.5 0 162.0 81.00 162.0
## Z 1.0 0 211.0 105.50 211.0
Using the five-number summary, we see the rough sizes have \(F_L=0\) and \(F_U=34\). Using these values, we can the fences for outliers: \[dF=34; STEP=1.5(34)=51\] Because \(F_L=0\) and all rough sizes are non-negative, we only need to calculate upper fence values: \[fence_{upper}=34+51=85\\FENCE_{upper}=34+2(51)=136\] We have mild outliers of 87 (week 111), 94 (week 74), 97 (week 66), 104 (week 171), 115 (week 15), 117 (week 169), 118 (week 6), 126 (week 103), and 131 (week 52). We have extreme outliers of 137 (week 120), 160 (week 105), 164 (week 104), 211 (week 108).