There are two types of seasonal variation: constant and increasing. Constant seasonal variation is fine because the magnitude of the swing stays the same over time. Increasing seasonal variation is a problem, so you should transform the series until the variation looks constant. You can do this either by taking the log of y_t or by finding a lambda for the power transformation y_t ^ lambda that evens out the swings. Today we focused on the seasonal component SN_t of the model equation y_t = TR_t + SN_t + E_t.
Our work time consisted of using the airpass dataset from the faraway library and looking at its seasonality. Then we had to decide how to transform it, and repeat the process for the Mitchell dataset from the alr3 library to look at monthly seasonal trends.
Before plotting y_t over time we needed to attach the data; then we considered whether the seasonal variation is constant. If it was not, we looked for a transformation that makes it look constant.
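To see what the transformation is doing, here is a minimal sketch on made-up data (not the class datasets). The series y, the choice lambda = 0.5, and the helper swing() are all assumptions introduced for illustration; the idea is just to compare the size of the seasonal swing in the first and last year before and after transforming.

```r
# Hypothetical monthly series: the seasonal swing is proportional to the level,
# so it grows as the trend grows (increasing seasonal variation).
t <- 1:120
trend <- exp(0.01 * t)
seasonal <- 1 + 0.2 * sin(2 * pi * t / 12)
y <- 100 * trend * seasonal

# Two candidate transformations: log, and y^lambda for an assumed lambda.
lambda <- 0.5
y_log <- log(y)
y_pow <- y^lambda

# Size of the swing within one year: max minus min over those months.
swing <- function(x, months) diff(range(x[months]))
first_year <- 1:12
last_year <- 109:120

# A ratio near 1 means the swing is about the same at the start and the end.
ratio_raw <- swing(y, last_year) / swing(y, first_year)
ratio_pow <- swing(y_pow, last_year) / swing(y_pow, first_year)
ratio_log <- swing(y_log, last_year) / swing(y_log, first_year)
c(raw = ratio_raw, power = ratio_pow, log = ratio_log)
```

Because the swing in this made-up series is multiplicative, the log evens it out completely, while the power transformation only shrinks the growth. On real data you would compare plots the same way and pick whichever looks closest to constant.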
library(faraway)
data(airpass)
attach(airpass)
plot(pass~year, type="l")
The seasonal swings get bigger over time, so we should try to transform it. I chose to log both variables first to see how it looked, and got lucky with my guess.
transformpass <- log(pass)
transformyear <- log(year)
plot(transformpass~transformyear, type="l")
Transformations are the bomb.com because the seasonal swings in this graph look much more constant than in the original one.
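Only the response needs transforming to even out the seasonal swing, so a variant that logs pass and leaves year alone might look like the sketch below. It assumes the faraway package is installed; log_pass is a name introduced here, and it uses airpass$ instead of attach() so it does not depend on what is attached.

```r
library(faraway)
data(airpass)

# Log only the response; year stays on its original scale.
log_pass <- log(airpass$pass)
plot(log_pass ~ airpass$year, type = "l", xlab = "year", ylab = "log(pass)")
```

Leaving year untransformed keeps the time axis evenly spaced, which is convenient if a trend in time is fitted to log(pass) later.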
Looking at the next data set, I attached the data and plotted temperature against the months.
library(alr3)
## Loading required package: car
##
## Attaching package: 'car'
## The following objects are masked from 'package:faraway':
##
## logit, vif
##
## Attaching package: 'alr3'
## The following objects are masked from 'package:faraway':
##
## cathedral, pipeline, twins
data(Mitchell)
attach(Mitchell)
plot(Temp~Month, type="l")
Honestly, that doesn’t look too bad. I don’t think I should transform it.
We were then assigned to create a dummy for each season, create a model with 4 dummies, and plot the new model in red, the model with 12 dummies in blue, and the data in black.
#attach(airpass)
#airmod <- lm(log(pass) ~ 0+ justyear + mofactor, data=airpass)
#coef(airmod)
#season <- airpass$mofactor
#levels(season) <- list(Winter=c("Dec", "Jan", "Feb"), Spring=c("Mar", "Apr", "May"), Summer=c("Jun","Jul","Aug"), Fall=c("Sep", "Oct", "Nov"))
I was unable to follow how to make the plots in different colors, and I was confused about how to create the dummy variables. I understand how the indicator function works in application, but I’m not sure how to put it into a model.
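For reference, the assignment above (4 seasonal dummies in red, 12 monthly dummies in blue, data in black) could be sketched roughly as follows. This is not the code from class: it uses R's built-in AirPassengers series as a stand-in for airpass, and the data frame air and its columns logpass, time, month and season are names made up here. A factor in lm() is turned into one dummy per level automatically, so collapsing the month factor into a season factor is enough to get the seasonal dummies.

```r
# Stand-in data: built-in AirPassengers (monthly, Jan 1949 to Dec 1960).
air <- data.frame(
  logpass = log(as.numeric(AirPassengers)),
  time    = as.numeric(time(AirPassengers)),
  month   = factor(month.abb[as.integer(cycle(AirPassengers))], levels = month.abb)
)

# Collapse the 12 month levels into 4 season levels.
# Each list entry names the old levels that merge into the new one.
air$season <- air$month
levels(air$season) <- list(
  Winter = c("Dec", "Jan", "Feb"),
  Spring = c("Mar", "Apr", "May"),
  Summer = c("Jun", "Jul", "Aug"),
  Fall   = c("Sep", "Oct", "Nov")
)

# "0 +" drops the intercept, so every level of the factor gets its own dummy.
mod12 <- lm(logpass ~ 0 + time + month,  data = air)
mod4  <- lm(logpass ~ 0 + time + season, data = air)

# Data in black, then add each model's fitted values with lines().
plot(logpass ~ time, data = air, type = "l")
lines(air$time, fitted(mod12), col = "blue")  # 12 monthly dummies
lines(air$time, fitted(mod4),  col = "red")   # 4 seasonal dummies
```

The colors come from calling plot() once for the data and then lines() with a col argument for each model, which draws on top of the existing plot instead of starting a new one.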
In our group paper, the results were a bit rushed because we may have procrastinated too much and did not spend enough time carefully going over them. I plan to go over the results more fully and spend time with the models in order to draw concise and carefully worded conclusions. Specifically, I found it helpful to get peer feedback on the writing that I had done for the methods and results/discussion portions of the paper. Grammatically, I will be revising the draft multiple times, paying special attention to what tense we are writing in. I am going to add more detail to describe why we chose the predictors that we did. I know we need to work on adding more analysis of the linear model by including topics like multicollinearity and prediction/confidence intervals. I will be going over these results and translating them into real English sentences, then tying that back into our research question and whether or not our original hypotheses were correct. Another specific revision I plan to make is going into more depth about why we think the results played out the way they did, as well as expanding on ideas of how we might alter a future project from this one (including immigration application wait times or researching other possible causes/explanations of higher levels of migration stock in certain countries). Lastly, I want to add a section identifying the very best model and explaining why it is the best model, to make it absolutely clear to readers who are less familiar with statistical methods and need to be told directly which portion of the analysis leads to each conclusion.