In class we learned how to make increasing seasonal variation constant using transformations. We also learned how to model this data using dummy variables.
For this example I will use the data set airpass from the R package faraway. It contains the number of airline passengers (in thousands) per month from 1949 to 1960.
library(faraway)
data(airpass)
plot(pass~year, data = airpass, type = "l")
The seasonal variation in this data increases as the number of passengers grows. Next we try transformations to make that variation roughly constant.
plot(sqrt(pass)~year, data = airpass, type = "l")
plot(log(pass)~year, data = airpass, type = "l")
As you can see, the log transformation brings the data closest to constant seasonal variation.
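One way to see why the log works: if the seasonal effect is multiplicative (a fixed percentage of the current level), taking logs turns it into an additive effect, so the swing no longer grows with the level. The sketch below checks this on a simulated series, not on airpass. The names t, level, season, y, and yr are made up for illustration.

```r
# Simulated monthly series: an exponential trend times a fixed seasonal pattern.
t      <- 1:144                            # 12 years of monthly data
level  <- 100 * exp(0.01 * t)              # steadily growing level
season <- 1 + 0.2 * sin(2 * pi * t / 12)   # multiplicative seasonal factor
y      <- level * season

# Within-year swing (max minus min) on each scale, one value per year.
yr    <- rep(1:12, each = 12)
swing <- function(v) tapply(v, yr, function(x) diff(range(x)))
s_raw  <- swing(y)
s_sqrt <- swing(sqrt(y))
s_log  <- swing(log(y))

# Ratio of the last year's swing to the first year's swing.
# A ratio of 1 means the seasonal variation is constant.
ratio <- c(raw  = s_raw[[12]]  / s_raw[[1]],
           sqrt = s_sqrt[[12]] / s_sqrt[[1]],
           log  = s_log[[12]]  / s_log[[1]])
print(round(ratio, 2))
```

In this simulation the raw swing grows several-fold, the square root cuts the growth down, and the log removes it, which matches the ordering seen in the three plots above.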
Now we can build a month factor from the decimal year. When the factor is used in lm, R turns it into dummy variables.
head(airpass)
## pass year
## 1 112 49.08333
## 2 118 49.16667
## 3 132 49.25000
## 4 129 49.33333
## 5 121 49.41667
## 6 135 49.50000
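# Shift by half a month before flooring: December is stored as a whole number
# (e.g. 50.00000 for December 1949), so a plain floor() would put it in the next year.
justyear=floor(airpass$year - 1/24)
modecimal=airpass$year - justyear
# Months now run from 1 (Jan) to 12 (Dec), matching the labels assigned below.
mofactor=factor(round(modecimal*12))
head(cbind(airpass$year, mofactor))
## mofactor
## [1,] 49.08333 1
## [2,] 49.16667 2
## [3,] 49.25000 3
## [4,] 49.33333 4
## [5,] 49.41667 5
## [6,] 49.50000 6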
levels(mofactor)=c("Jan", "Feb", "Mar", "Apr", "May",
"Jun", "Jul", "Aug", "Sep", "Oct",
"Nov", "Dec")
airpass$justyear=justyear
airpass$mofactor=mofactor
Now that each observation has a year and a month label, we can fit a model.
mmod=lm(log(pass)~justyear + mofactor, data = airpass)
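To see what lm does with a factor such as mofactor, it can help to look at the design matrix for a tiny example. This is only a sketch on a made-up three-month factor (the name mo is hypothetical), assuming R's default treatment contrasts.

```r
# Toy factor with three months, two observations each.
mo <- factor(c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar"),
             levels = c("Jan", "Feb", "Mar"))

# model.matrix() builds the same kind of design matrix lm() uses:
# an intercept plus one 0/1 dummy column per non-baseline level.
X <- model.matrix(~ mo)
print(X)
print(colnames(X))
```

The first level of the factor gets no column of its own and serves as the baseline. In a model like mmod, each month coefficient is then the difference in log passengers between that month and the baseline month, holding the year fixed.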
Finally, we can plot the fitted values from our model over the original log series.
plot(log(pass)~year, data = airpass, type = "l")
lines(airpass$year, mmod$fitted.values, type = "l", col = "blue")
As you can see, the dummy-variable model fits the data fairly well.
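Because the model is fit to log(pass), its fitted values are on the log scale. Below is a minimal sketch of putting fitted values back on the original scale with exp(), using a small simulated series rather than airpass. The names t, yr, mo, y, fit, and fitted_orig are made up for illustration.

```r
set.seed(1)

# Simulated data: 4 years of monthly counts with a trend and a seasonal pattern.
t  <- 1:48
yr <- rep(1:4, each = 12)
mo <- factor(rep(1:12, times = 4))
y  <- exp(4 + 0.1 * yr + 0.2 * sin(2 * pi * t / 12) + rnorm(48, sd = 0.05))

# Same model form as above: log response, numeric year, month factor.
fit <- lm(log(y) ~ yr + mo)

# Undo the log so the fit can be compared with the untransformed counts.
fitted_orig <- exp(fitted(fit))

plot(t, y, type = "l")
lines(t, fitted_orig, col = "blue")
```

Plotting on the original scale shows whether the model also reproduces the growing seasonal swings, which the log plot hides by design.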
I also received some feedback in class on our group project. The biggest concern my partner had was that our paper read more as an R guide than as a paper. We will have to move our R code into an appendix instead of keeping it in the paper itself. We also need to elaborate on why our data matters and connect it back to our research topic more. I think the most important thing I learned from this peer review is how to present our data in paper form instead of as a how-to guide for R.