Student Name: Houcheng Liu (Troye)
Student ID: 206802253

Part I. Introduction

The project aims to study wage changes in the United States, specifically examining how wages changed before and after the pandemic. Hourly average wage data is collected from the Federal Reserve Economic Data (FRED) updated at 2026-01-09, which contains monthly data from 2006-03-01 to 2025-12-01. Visually, the raw data does not change significantly, remaining an upward-sloping, linear movement. To study the impact of COVID-19, the following parts will fit wages with multiple models and decompose the data via an additive and multiplicative approach.

Part II. Results

Question 1

Q1 a
Time series plot
Q1 b

Average Hourly Earnings
Graphically, average hourly earnings has a roughly linear, upward sloping trend, suggesting it is not covariance stationary since \(\mu_{Wage}\) is different across time.
Growth rate of Average Hourly Earnings
The growth rate of average hourly earnings is mean-reverting. Despite the fact that COVID-19 stroke the economy heavily, causing severe fluctuations, in general, variance of wage growth appears to be stable overtime. Therefore, it suggests that the growth rate of average hourly earnings is approximately covariance stationary.

Q1 c

>Answer
Average Hourly Earnings
Autocorrelation is strong and slowly decays throughout the time, indicating the current wage level strongly depends on previous periods’ wage level; Partial autocorrelation is significant at lag 1 and approximately 0 for the other lags, indicating only the average hourly wage one period before will directly affect the wage level of the current period. The other periods will affect the period after, then the effects are indirectly transmitted to the current period through lag 1.
Growth rate of Average Hourly Earnings
Both ACF and PACF are significant for multiple lags, however, it is hard to visually identify patterns from ACF and PACF. ACF and PACF both suggest that current period wage growt rate is associated with previous periods’ wage growth rate. Since they are significant, it indicates that wage growth exihibts patterns (trend, seasonality, cycle) that we can study.

Q1 d
Q1 e
## Wage Linear Regression R^2= 0.9416281
## Wage Quadratic Regression R^2= 0.9922468
## Wage Growth Linear Regression R^2= 0.005333449
## Wage Growth Periodic Regression R^2= 0.06338962
Answer:
Wage
Quadratic regression better fits the data. Residuals for the linear model spread across -1.76 to 2.56. In a linear regression model, residuals decrease and then increase as the fitted value increases. This is because wages are growing at an increasing rate; however, linear regression has a constant slope. We can expect residuals from a linear model to become more significant as time increases. This indicates that the linear model will underestimate the wage growth rate, and therefore, underestimate wages in subsequent periods. The quadratic regression model has smaller residuals across time, ranging from -1.08 to 0,81. From \(R^2\), the quadratic model can explain 99.22% of the average hourly wage rate variations. According to the two graphs, both residuals demonstrate strong patterns. It suggests that both linear and quadratic regression are unable to fully capture wage variations; there are patterns yet to be explained by seasonality and trend.
Wage Growth
For both periodic regression and linear regression model, their residuals vs. fitted wage growth do not display obvious patterns, suggesting that both linear and periodic regression are adequate to explain the wage growth trend. However, the scope of residuals is similar to the scope of wage growth, suggesting both the periodic regression model and the linear regression model are insufficient to explain variations. The data is mainly described by remainders, which are likely to be external factors such as shocks and price level expectations.
Q1 f

Answer:
Wage
In the linear regression model, residuals spread more evenly and widely from -2 to 2.5, they visually spread around 0. In the quadratic regression model, residuals concentrate at 0.3 and spread more narrowly across -1 to 0.5. Visually, residuals have a larger left tail.
Wage Growth
Histograms for linear and periodic regression both centered around 0 and spread around -0.05 to 0.05. Residuals are more concentrated to their mean in periodic regression compare with linear regression. Residuals from the two model both have similar range compared with wage growth, the observation aligned with part1.e, the two models are both unable to explain wage growth well, leaving most variations explained by external transitory shocks.
Q1 g
## 
## Regression Results
## ==========================================================================
##                                      Dependent variable:                  
##                     ------------------------------------------------------
##                                              Time                         
##                                (1)                         (2)            
## --------------------------------------------------------------------------
## T                            0.067***                   0.007***          
##                              (0.001)                     (0.002)          
##                                                                           
## I(T2)                                                   0.0003***         
##                                                         (0.00001)         
##                                                                           
## Constant                    18.681***                   21.086***         
##                              (0.149)                     (0.082)          
##                                                                           
## --------------------------------------------------------------------------
## Observations                   238                         238            
## R2                            0.942                       0.992           
## Adjusted R2                   0.941                       0.992           
## Residual Std. Error      1.145 (df = 236)           0.418 (df = 235)      
## F Statistic         3,807.040*** (df = 1; 236) 15,037.490*** (df = 2; 235)
## ==========================================================================
## Note:                                          *p<0.1; **p<0.05; ***p<0.01
## 
## Regression Results
## ==============================================================
##                                Dependent variable:            
##                     ------------------------------------------
##                            Time                 g_wage        
##                             (1)                  (2)          
## --------------------------------------------------------------
## T                         0.00001                             
##                          (0.00001)                            
##                                                               
## T                                              0.00001        
##                                               (0.00001)       
##                                                               
## sin.t                                         -0.002***       
##                                                (0.001)        
##                                                               
## cos.t                                           0.0003        
##                                                (0.001)        
##                                                               
## Constant                  0.002*                0.002*        
##                           (0.001)              (0.001)        
##                                                               
## --------------------------------------------------------------
## Observations                237                  237          
## R2                         0.005                0.063         
## Adjusted R2                0.001                0.051         
## Residual Std. Error  0.007 (df = 235)      0.007 (df = 233)   
## F Statistic         1.260 (df = 1; 235) 5.256*** (df = 3; 233)
## ==============================================================
## Note:                              *p<0.1; **p<0.05; ***p<0.01
## Wage Linear Regression R^2= 0.9416281
## Wage Quadratic Regression R^2= 0.9922468
## Wage Growth Linear Regression R^2= 0.005333449
## Wage Growth Periodic Regression R^2= 0.06338962
Answer:
Wage The regression summary table indicates that both the linear model and the quadratic model explain the wage sample well. According to \(R^2\), the linear model is able to explain 94.2% of wage variations; however, as previously discussed, as the wage growth rate displays an upward sloping trend, the linear model may be insufficient to explain future wages in later periods. The quadratic model takes the changing wage growth rate into account; it is able to explain 99.2% of the variations. t-statistics associated with coefficients are all significant under 1% level of significance, which confirms the upward sloping trend in wages, and it grows faster overtime. F statistic for the quadratic model is significant, indicating that the parameters included in the model jointly explains the data compared with the restricted model (\(Wage=Constant\)).
Wage Growth The regression summary table indicates that both linear model and the periodic model have limited ability to explain the growth in wages. According to \(R^2\), the linear model is able to explain 0.5% of the growth variations. Despite the fact that the periodic model considers seasonal components, it explains only 6.3% of the growth in the sample. Only the constant and the sine.t coefficient are statistically significant. \(R^2\) suggests that wage growth is affected by transitory shocks (residuals), coefficients for sine.t is significant indicates that wage growth exhibits seasonality. The F-statistic is more significant in the periodic model, further confirming that introducing sine helps the model explain wage growth.
Q1 h
##                 df      AIC
## linear_reg_w     3 743.8948
## quadratic_reg_w  4 265.4379
##                 df      BIC
## linear_reg_w     3 754.3116
## quadratic_reg_w  4 279.3270
##                 df       AIC
## linear_reg_gw    3 -1651.646
## periodic_gw_reg  5 -1661.900
##                 df       BIC
## linear_reg_gw    3 -1641.242
## periodic_gw_reg  5 -1644.559
Answer:
Wage
Both AIC and BIC suggest quadratic regression is better at explaining wage samples.
Wage Growth
Both AIC and BIC suggest periodic regression in better at explaining growth in wages, however, the AIC and BIC differences between the two model are fairly small.
Q1 i


Question 2

Q2 a
## # A dable: 237 x 6 [1]
## # Key:     .model [1]
## # :        g_wage = trend + remainder
##    .model     T    g_wage   trend remainder season_adjust
##    <chr>  <int>     <dbl>   <dbl>     <dbl>         <dbl>
##  1 stl        2  0.0143   0.00238   0.0120       0.0143  
##  2 stl        3 -0.0148   0.00237  -0.0172      -0.0148  
##  3 stl        4 -0.000998 0.00237  -0.00336     -0.000998
##  4 stl        5  0.0119   0.00236   0.00954      0.0119  
##  5 stl        6 -0.00792  0.00235  -0.0103      -0.00792 
##  6 stl        7  0.0114   0.00235   0.00902      0.0114  
##  7 stl        8  0.00783  0.00234   0.00549      0.00783 
##  8 stl        9 -0.00342  0.00234  -0.00575     -0.00342 
##  9 stl       10  0.00780  0.00233   0.00547      0.00780 
## 10 stl       11  0.00436  0.00232   0.00204      0.00436 
## # ℹ 227 more rows
Answer:
Wage
After detrending and seasonally adjusting the series through additive decomposition, the remainder displays significant cyclical fluctuations. The ACF and PACF suggest the existence of cycles since autocorrelations have passed the significance band for multiple lags. By comparing the scale of remainder components and trend components from previous sections, we can conclude that the wage rate is largely driven by a quadratic trend.
Wage Growth After detrending and seasonally adjusting the series through additive decomposition, the remainder displays irregular fluctuations. Multiple ACF and PACF pass through the significance band, showing strong serial correlations. Therefore, cycles exists in wage growth. Notice that, from the table above, the seasonally adjust value is the same as wage growth rate, suggesting that seasonality is weak. However, from question question 1.g, we came to the conclusion that seasonality is statistically significant because parameters of \(sint\) is significant. Therefore, although from a statistical perspective, seasonality exists, it is not the most important component in this time series explaining variations across time. The remainder visually fluctuates through time, it is fairly noisy.
Q2 b
Answer:
Wage
After detrending and seasonally adjusting the series through multiplicative decomposition, the remainder displays cyclical patterns. Multiple ACF and PACF passed the significance band, suggesting signs of serial correlation.
Wage Growth After detrending and seasonally adjusting the series through multiplicative decomposition, the remainder shows irregular fluctuations around 0. ACF and PACF are significant for multiple lags, as discussed in part 2a, cycles exists in wage growth.
Q2 c
Answer: Graphically, the results of additive and multiplicative decomposition are similar. The two approaches results in similar observations and conclusions recorded in part2a and 2b. For Wage data, the two approaches are similar because Wage did not experience strong variations (shock) across time (except COVID); the trend is the main driving factor, as suggested in Q1.g the quadratic regression explains 99.2% of the wage sample. Therefore, after detrend and seasonally adjust wage, only a small portion of wage variation is left in the remainder, and they are largely similar. For Wage growth data, the two approaches are similar because Wage growth is largely explained by external, transitory shocks. Those shocks are not explained by trend, seasonal, and cyclical factors; therefore, changing the approach does not affect the decomposition results significantly.
Q2 d
Answer:
As part Q2c mentioned, the two methods produce similar results; therefore, cycle components are similar. For Wage data, the cycles from the two approaches are similar because Wage did not experience rapid, strong shocks across time (except for COVID); the trend is the main driving factor. From the graphs displayed above, the two methods produces similar trend and seasonal components; therefore, the remainder is fairly similar. Since amplitudes of the sample wage data varies across time, multiplicative decomposition is better if we have to make a decision between the two approaches. For Wage growth data, the two approaches have similar results because Wage growth is largely explained by cycles. Those shocks are not explained by trend and seasonality; therefore, changing the approach does not affect the results significantly. Multiplicative approach is better if we have to conclude one approach is better than the other because wage growth amplitudes changes across time.
Q2 e
Answer:
Wage
The scale of seasonality is relatively small compare with the scale of trend, it alignes with the conclusion from previous parts that trends explains the majority variations in the data set. Seasonality reaches its peak at the beginning of the year, then gradually decreases until it reaches the bottom at the middle of the year. The wage average is similar across months; in the same month, the variations across years is significant. It suggests that trend variations are larger compared to seasonal variations. It further supports the conclusion that the trend is more important to explain wage changes than seasonality.
Wage Growth The scale of seasonality and the scale of trend is smaller than the scale of the remainder, suggesting that remainder is more vital to explain wage growth variations. Seasonality displays a stable pattern through time, it reaches the peak at the middle of the year touches the bottom at the end of the year. The averages for wage growth rate differ across month, suggesting presence of seasonality. In the same month, wage growth rates varies substantially across years. Therefore, fluctuations in wage growth are better explained by transitory shocks and cycles rather than seasonal factors.
Q2 f

Answer:
Based on the previous parts, additive decomposition is preferred, therefore prediction is established based on aditive approach. Quadratic trend is applied to wage and Linear trend is applied to wage growth, as concluded in the former parts.


Part III Conclusions and Future Work

Conclusion: According to Part II, we can conclude that the quadratic model better explains wages in the US starting from 2006-03-01 to 2025-12-01. Additive and Multiplicative decomposition approaches produce similar results, however, since both wage and its growth experienced amplititude variations, multiplicative decomposition might be better for formality reasons. Wage restores to its original trend after the shock in 2020. It is supported by analysis regarding wage growth as well, dispersion, and average age growth rate, which is highly consistent with what it was before shocks. In summary, form the data, there is limited evidence of COVID causes wage changes in the long run.
Issues and Future Work: Although we have come to the conclusion that COVID does result in long run shift in wage patterns, this is mostly likely a result of regulations and policies posted after the shock. To study the impact of COVID on wages, it’s essential to separate the effects of post-COVID policies from the data. Data collected from FRED represents nominal data, which requires further processing to get rid of inflation. Since price level is multiplicative (\(P_(t+1)=P_(t)*(1+pi))\)), after excluding Price level, the real wage may display a different trend and seasonality. Notice that in Question 2a, we came to the conclusion that wage growth is “approximately” white noice. Further tests is required to claim modelling success. Therefore in future work, incorporating exogeneous variables such as polictial effects, CPI, inflation expectation could be beneficial.


Part IV References

Reference: FRED. (2026). Average Hourly Earnings of All Employees, Total Private (CEU0500000003). FRED. https://fred.stlouisfed.org/series/CEU0500000003


Part V. Code

setwd("~/Desktop/learning/UCLA/Winter/Economics Forecasting/Project 1")
wage<-read.csv("US wage1.csv")
library(stargazer)   
library(dplyr)        
library(tsibble)      
library(tsibbledata)   
library(lubridate)    
library(ggplot2)  
library(feasts)  
library(seasonal)
library(fpp3)
library(forecast)
library(patchwork)

#create tsibble
str(wage)
Wage<-wage$CEU0500000003
wage_ts<- wage |> mutate(T=row_number(),date = as_date(observation_date))|> as_tsibble (index = T) #set the row number as T, date as the "observation date" in csv.
g_wage_ts<-wage_ts |>mutate(T=row_number(),g_wage=difference(log(Wage))) #calculate wage growth rate by using log difference
wage_ts
g_wage_ts
#1a
#time series plot
autoplot(wage_ts,Wage) + labs(title="Average Hourly Earnings:2006 - 2025",x="Time",y="Dollars per hour") #plot Wage
autoplot(g_wage_ts,g_wage) + labs(title="Growth rate of Average Hourly Earnings:2006 - 2025",x="Time",y="%") #plot wage growth rate

#1c
wage_ts |> feasts::gg_tsdisplay(y=Wage, plot_type="partial") #plot Wage and its ACF, PACF
g_wage_ts |> feasts::gg_tsdisplay(y=g_wage,plot_type="partial") #plot Wage growth rate and its ACF, PACF

#1d
linear_reg_w<-lm(Wage~T,data=wage_ts) #linear regression of wage with respect to T
linear_reg_gw<-lm(g_wage_ts$g_wage~g_wage_ts$T) #linear regression of wage growth rate with respect to T
#export all coefficients in the regression into a new data set
b_l_w<-coef(linear_reg_w) 
b_l_gw<-coef(linear_reg_gw)

#plot wage and wage growth rate, then plot the regression line on the same graph
#plot the linear regression line
autoplot(wage_ts,Wage) + labs(title="Average Hourly Earnings with linear approximation:2006 - 2025",x="Time (Month)",y="Dollars per hour")+
  geom_line(aes(y=b_l_w[1]+b_l_w[2]*T),color="red")

autoplot(g_wage_ts,g_wage)+ labs(title="Average Hourly Earnings Growth Rate with linear approximation:2006 - 2025",x="Time (Month)",y="%")+
  geom_line(aes(y=b_l_gw[1]+b_l_gw[2]*T),color="red")

#plot periodic regression line
#create variables
sin.t<-sin(2*pi*g_wage_ts$T/12) 
cos.t<-cos(2*pi*g_wage_ts$T/12)
periodic_gw_reg<-lm(g_wage~T+sin.t+cos.t,data=g_wage_ts)
b_pl_gw<-coef(periodic_gw_reg)
autoplot(g_wage_ts,g_wage)+geom_line(aes(y=b_pl_gw[1]+b_pl_gw[2]*T+b_pl_gw[3]*sin.t+b_pl_gw[4]*cos.t),color="red")+labs(title="Average Hourly Earnings growth with periodic approximation")

#plot quadratic regression line
quadratic_reg_w<-lm(Wage~T+I(T^2),data=wage_ts)
b_ql_w<-coef(quadratic_reg_w)
autoplot(wage_ts, Wage) +geom_line(aes(y=b_ql_w[1]+b_ql_w[2]*T+b_ql_w[3]*T^2),color="red")+labs(title="Average Hourly Earnings with Quadratic approximation: 2006–2025",
                                                                                                x="Time (Month)",y="Dollars per hour")

#1e
#plot linear fit
par(mfrow=c(2,1)) #let the following two graphs display in the same window, di
plot(Wage,ylab="Average Hourly Rate",xlab="Time (Month)",lwd=2,col="black",type="l")
#plot the linear fitted line
lines(wage_ts$T,linear_reg_w$fit,col="red",lwd=2)
#plot residuals
plot(wage_ts$T,linear_reg_w$res, ylab="Residuals",type='l',xlab="Time")
summary(linear_reg_w)
R_linear<-summary(linear_reg_w)$r.squared

par(mfrow=c(2,1))
plot(g_wage_ts$g_wage,ylab="Growth of Average Hourly Rate (linear)",xlab="Time (Month)",lwd=2,col="black",type="l")
lines(linear_reg_gw$fit,col="red",lwd=2)
plot(linear_reg_gw$res, ylab="Residuals",type='l',xlab="Time")
summary(linear_reg_gw)
R_linear_gw<-summary(linear_reg_gw)$r.squared

#plot periodic wage growth
par(mfrow=c(2,1))
plot(g_wage_ts$g_wage,ylab="Growth of Average Hourly Rate (Periodic)",xlab="Time (Month)",lwd=2,col="black",type="l")
lines(periodic_gw_reg$fit,col="red",lwd=2)
#plot residual
plot(periodic_gw_reg$res, ylab="Residuals",type='l',xlab="Time")
summary(periodic_gw_reg)
R_periodic_gw<-summary(periodic_gw_reg)$r.squared

#quadratic
par(mfrow=c(2,1))
plot(Wage,ylab="Average Hourly Rate",xlab="Time (Month)",lwd=2,col="black",type="l")
lines(wage_ts$T,quadratic_reg_w$fit,col="red",lwd=2)
plot(wage_ts$T,quadratic_reg_w$res, ylab="Residuals",type='l',xlab="Time")
summary(quadratic_reg_w)
R_quadratic<-summary(quadratic_reg_w)$r.squared
range(linear_reg_w$residuals)
range(quadratic_reg_w$residuals)
range(linear_reg_gw$residuals)
range(periodic_gw_reg$residuals)

#create residual and fitted value table
#wage
#linear reg
wage_residual_l<-resid(linear_reg_w)
wage_fit_l<-fitted(linear_reg_w)
#quadratic reg
wage_residual_q<-resid(quadratic_reg_w)
wage_fit_q<-fitted(quadratic_reg_w)
#g wage
#linear reg
g_wage_residual_l<-resid(linear_reg_gw)
g_wage_fit_l<-fitted(linear_reg_gw)
#periodic reg
g_wage_residual_p<-resid(periodic_gw_reg)
g_wage_fit_p<-fitted(periodic_gw_reg)
#plot
plot(wage_fit_l,wage_residual_l,xlab="Fitted values",ylab="Residuals",main="Residuals vs Fitted Wage (linear reg)")
plot(wage_fit_q,wage_residual_q,xlab="Fitted values",ylab="Residuals",main="Residuals vs Fitted Wage (quadratic reg)")
plot(g_wage_fit_l,g_wage_residual_l,xlab="Fitted values",ylab="Residuals",main="Residuals vs Fitted Wage Growth (linear reg)")
plot(g_wage_fit_p,g_wage_residual_p,xlab="Fitted values",ylab="Residuals",main="Residuals vs Fitted Wage Growth (Periodic reg)")
#1f
par(mfrow=c(4,1))
#create 4 histograms regarding residuals of the four fitte graph listed above
hist(linear_reg_w$residuals,main="Histogram for wage linear regression residuals")
hist(quadratic_reg_w$residuals,main="Histogram for wage quadratic regression residuals")
hist(linear_reg_gw$residuals,main="Histogram for wage growth linear regression residuals")
hist(periodic_gw_reg$residuals,main="Histogram for wage growth periodic regression residuals")

#1g
#create regression summary statistics via stargazer
stargazer(
  linear_reg_w,quadratic_reg_w,
  type = "text",
  title = "Regression Results",
  dep.var.labels = "Time")

stargazer(
  linear_reg_gw,periodic_gw_reg,
  type = "text",
  title = "Regression Results",
  dep.var.labels = "Time")

#1h
#Generate AIC, BIC for linear regression, quadratic regression and periodic regression
AIC(linear_reg_w,quadratic_reg_w)
BIC(linear_reg_w,quadratic_reg_w)
AIC(linear_reg_gw,periodic_gw_reg)
BIC(linear_reg_gw,periodic_gw_reg)

#1i
par(mfrow=c(1,1))
#create variables t to translate date into numbers 1,2,3...
tn<-data.frame(t=seq(as.Date("2006-03-01"),as.Date("2025-12-01"), by = "month"))
tn$T <- seq_len(nrow(tn))
#Predict the future wage value by using quadratic regression model
pred_w=predict(quadratic_reg_w, tn, se.fit = TRUE)
#generate 95% prediction interval (where the actual wage might fall)
pred.plim_w = predict(quadratic_reg_w,tn, level =0.95, interval="prediction")
#generate 95% confidence interval (where the average wage might fall)
pred.clim_w = predict(quadratic_reg_w, tn,level=0.95, interval="confidence") 
#plot the prediction with confidence, prediction interval
matplot(tn$t,cbind(pred.clim_w, pred.plim_w[,-1]),
        lty=c(1,1,1,3,3), type="l", lwd=2, ylab="predicted Wage",xlab="Time[Month]")

pred_gw=predict(periodic_gw_reg, tn, se.fit = TRUE)
pred.plim_gw = predict(periodic_gw_reg,tn, level =0.95, interval="prediction")
pred.clim_gw = predict(periodic_gw_reg, tn,level=0.95, interval="confidence") 
matplot(tn$t,cbind(pred.clim_gw, pred.plim_gw[,-1]),
        lty=c(1,1,1,3,3), type="l", lwd=2, ylab="predicted Wage Growth",xlab="Time[Month]")

#2a
#recreate tsibble wage_ts, rename CEU0500000003 as wage and translate date in the csv as "observation"
wage_ts <- wage %>%
  rename(wage = CEU0500000003) %>%
  mutate(
    observation = as.Date(observation_date),        
    observation = yearmonth(observation)     
  ) %>%
  as_tsibble(index = observation)
#additive decomposition
wage_ts |>
  model(
    classical_decomposition(wage, type = "additive") #additive decomposition Y=T+S+R
  ) |>
  components() |>
  autoplot() #plot the additive decomposition+
labs(title = "Classical additive decomposition of average Hourly wage rate")

wage_ts |>
  model(
    classical_decomposition(wage, type = "multiplicative") #multiplicative decomposition Y=T*S*R
  ) |>
  components() |>
  autoplot() #plot the multiplicative decomposition+
labs(title = "Classical multiplicative decomposition of average Hourly wage rate")

#create tsibble fit_stl_w and fit_stl_gw, use cmp_w and cmp_gw to display all decomposition related value
fit_stl_w<- wage_ts %>%model(stl=STL(wage ~ trend(window = 13)+season(window="periodic"))) #the larger the window, the less frequent trend will change, decompose season periodically
cmp_w<- fit_stl_w%>%components()
#detrend using additive approach: Y_detrended=Y-T
wage_detrended<-cmp_w%>%transmute(observation=observation,wage_detrended=wage-trend)
wage_dt<-wage_detrended$wage_detrended
#seasonally adjust the value via additive approach: Y_seasonally_adjusted=Y-S
wage_seasonally_adj<-cmp_w%>%transmute(observation=observation,wage_seasonally_adj=wage-season_year)
wage_sadj<-wage_seasonally_adj$wage_seasonally_adj
#seasonally adjust and detrend wage via additive approach: Y_seasonally_adjusted_detrended=Y-T-S
wage_detrend_seasonally_adj<-cmp_w%>%transmute(observation=observation,wage_detrend_seasonally_adj=wage-trend-season_year)
wage_dt_sadj<-wage_detrend_seasonally_adj$wage_detrend_seasonally_adj
#plot detrend, seasonally adjusted and both detrended and seasonally adjusted wage and their corresponding ACF, PACF
wage_detrended |> feasts::gg_tsdisplay(y=wage_dt, plot_type="partial")+ggplot2::ylab("Wage Detrended")
wage_seasonally_adj |> feasts::gg_tsdisplay(y=wage_sadj, plot_type="partial")+ggplot2::ylab("Wage Seasonally adjusted")
wage_detrend_seasonally_adj |> feasts::gg_tsdisplay(y=wage_dt_sadj,plot_type="partial")+ggplot2::ylab("Wage Detrended & Seasonally adjusted")+ggplot2::xlab("Time[Month]")


#repeat same process from line 171-186
fit_stl_gw<- g_wage_ts %>% filter(!is.na(g_wage)) %>%model(stl=STL(g_wage ~ trend(window = 13)+season(window="periodic")))
cmp_gw<- fit_stl_gw%>%components()
g_wage_detrended<-cmp_gw%>%transmute(observation=T,g_wage_detrended=g_wage-trend)
g_wage_dt<-g_wage_detrended$g_wage_detrended
g_wage_seasonally_adj<-cmp_gw%>%transmute(observation=T,season_adjust)
g_wage_sadj<-g_wage_seasonally_adj$season_adjust
#since season_adjust = g_wage, we can conclude that g_wage does not exhibits seasonality
g_wage_detrend_seasonally_adj<-cmp_gw%>%transmute(T,g_wage_detrend_seasonally_adj=remainder)
g_wage_dt_sadj<-g_wage_detrend_seasonally_adj$g_wage_detrend_seasonally_adj
g_wage_detrended |> feasts::gg_tsdisplay(y=g_wage_dt, plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended")
g_wage_seasonally_adj |> feasts::gg_tsdisplay(y=g_wage_sadj, plot_type="partial")+ggplot2::ylab("Wage Growth rate Seasonally adjusted")
g_wage_detrend_seasonally_adj |> feasts::gg_tsdisplay(y=g_wage_dt_sadj,plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended & Seasonally adjusted")

#2b
#detrend wage via multiplicative approach: Y_detrended=Y/T
wage_detrended_m<-cmp_w%>%transmute(observation,wage_detrended=wage/trend)
wage_dt_m<-wage_detrended_m$wage_detrended
#detrend wage via multiplicative approach: Y_seasonally_adjusted=Y/S
wage_seasonally_adj_m<-cmp_w%>%transmute(observation,wage_seasonally_adj=season_adjust)
wage_sadj_m<-wage_seasonally_adj_m$wage_seasonally_adj
#detrend and seasonally adjust wage via multiplicative approach: Y_detrend_seasonally_adjusted=Y/T/S, since in cmp_w already helped us to calculate Wage/T/S and set it as "remainder", we will rename remainder as Wage_detrend_seasonally_adjusted
wage_detrend_seasonally_adj_m<-cmp_w%>%transmute(observation,wage_detrend_seasonally_adj=remainder)
wage_dt_sadj_m<-wage_detrend_seasonally_adj_m$wage_detrend_seasonally_adj
#plot detrend, seasonally adjusted and both detrended and seasonally adjusted wage and their corresponding ACF, PACF
wage_detrended_m |> feasts::gg_tsdisplay(y=wage_dt_m, plot_type="partial")+ggplot2::ylab("Wage Detrended")
wage_seasonally_adj_m |> feasts::gg_tsdisplay(y=wage_sadj_m, plot_type="partial")+ggplot2::ylab("Wage Seasonally adjusted")
wage_detrend_seasonally_adj_m |> feasts::gg_tsdisplay(y=wage_dt_sadj_m,plot_type="partial")+ggplot2::ylab("Wage Detrended & Seasonally adjusted")+ggplot2::xlab("Time[Month]")

#repeat the same process from line 204 to line 216 on wage growth
g_wage_detrended_m<-cmp_gw%>%transmute(T,g_wage_detrended=g_wage/trend)
g_wage_dt_m<-wage_detrended_m$wage_detrended
g_wage_seasonally_adj_m<-cmp_gw%>%transmute(T,g_wage_seasonally_adj=season_adjust)
g_wage_sadj_m<-g_wage_seasonally_adj_m$g_wage_seasonally_adj
g_wage_detrend_seasonally_adj_m<-cmp_gw%>%transmute(T,g_wage_detrend_seasonally_adj=remainder)
g_wage_dt_sadj_m<-g_wage_detrend_seasonally_adj_m$g_wage_detrend_seasonally_adj
g_wage_detrended_m|>filter(!is.na(g_wage_detrended))|>feasts::gg_tsdisplay(y=g_wage_detrended, plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended")
g_wage_seasonally_adj_m |> feasts::gg_tsdisplay(y=g_wage_sadj_m, plot_type="partial")+ggplot2::ylab("Wage Growth rate Seasonally adjusted")
g_wage_detrend_seasonally_adj_m |> feasts::gg_tsdisplay(y=g_wage_dt_sadj_m,plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended & Seasonally adjusted")+ggplot2::xlab("Time[Month]")

#2c
wage_detrend_seasonally_adj |> feasts::gg_tsdisplay(y=wage_dt_sadj,plot_type="partial")+ggplot2::ylab("Wage Detrended & Seasonally adjusted Additive approach")+ggplot2::xlab("Time[Month]")
wage_detrend_seasonally_adj_m |> feasts::gg_tsdisplay(y=wage_dt_sadj_m,plot_type="partial")+ggplot2::ylab("Wage Detrended & Seasonally adjusted Multiplicative approach")+ggplot2::xlab("Time[Month]")
g_wage_detrend_seasonally_adj |> feasts::gg_tsdisplay(y=g_wage_dt_sadj,plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended & Seasonally adjusted Additive approach")
g_wage_detrend_seasonally_adj_m |> feasts::gg_tsdisplay(y=g_wage_dt_sadj_m,plot_type="partial")+ggplot2::ylab("Wage Growth rate Detrended & Seasonally adjusted Multiplicative approach")+ggplot2::xlab("Time[Month]")
#2e
#select only seasonal components from wage_ts. wage_ts contains both trend and seasonal data. Then plot
wage_ts |>
  model(
    classical_decomposition(wage, type = "multiplicative")
  ) |>
  components() |> select(seasonal)|>
  autoplot() +
  labs(title = "Seasonal Factors of average Hourly wage rate")

#after calculating growth, g_wage in the first period is NA, thus we need to let R "ignore" the NA value
g_wage_ts <- g_wage_ts |>
  mutate(observation = yearmonth(date)) |>
  as_tsibble(index = observation) |>
  fill_gaps()
#select only seasonal components from g_wage_ts. Then plot
g_wage_ts |>
  model(
    classical_decomposition(g_wage, type = "multiplicative")
  ) |>
  components() |> select(seasonal)|>
  autoplot() +
  labs(title = "Seasonal Factors of average Hourly wage growth rate")

#plot the seasonality in another approach, displaying the same month across all years in the same row.
wage_ts|>gg_subseries(wage)
g_wage_ts|>gg_subseries(g_wage)

#2f
library(forecast)
#create wage ts then apply time series linear regression model
wage_ts_base<-ts(wage_ts$wage,start=c(2006, 3),frequency=12)
wage_fit1<-tslm(wage_ts_base ~ trend+I(trend^2))
wage_fit2=tslm(wage_ts_base ~ season)
wage_fit3=tslm(wage_ts_base ~ trend + I(trend^2) + season)
#plot seasonal, trend and seasonal+trend forecast in one graph
par(mfrow=c(3,1))
plot(forecast(wage_fit1,h=12),main="Wage Forecast Trend")
lines(wage_fit1$fitted.values, col="red")
legend("topleft",legend = c("Wage", "Fitted values","Forecast"),col = c("black", "red","blue"),
       lty = c(1, 1),lwd = c(2, 2),bty = "n")
plot(forecast(wage_fit2,h=12),main="Wage Forecast Seasonality")
lines(wage_fit2$fitted.values, col="red")
plot(forecast(wage_fit3,h=12),main="Wage Forecast Trend + Seasonality")
lines(wage_fit3$fitted.values, col="red")



#create wage ts then apply time series linear regression model
g_wage_ts_base<-ts(g_wage_ts$g_wage,start=c(2006, 4),frequency=12)
g_wage_fit1<-tslm(g_wage_ts_base ~ trend+I(trend^2))
g_wage_fit2=tslm(g_wage_ts_base ~ season)
g_wage_fit3=tslm(g_wage_ts_base ~ trend + I(trend^2) + season)
#plot seasonal, trend and seasonal+trend forecast in one graph
par(mfrow=c(3,1))
plot(forecast(g_wage_fit1,h=12),main="Wage Growth Forecast Trend")
lines(g_wage_fit1$fitted.values, col="red")
legend("topleft",legend = c("Wage", "Fitted values","Forecast"),col = c("black", "red","blue"),
       lty = c(1, 1),lwd = c(2, 2),bty = "n")
plot(forecast(g_wage_fit2,h=12),main="Wage Growth Forecast Seasonality")
lines(g_wage_fit2$fitted.values, col="red")
plot(forecast(g_wage_fit3,h=12),main="Wage Growth Forecast Trend + Seasonality")
lines(g_wage_fit3$fitted.values, col="red")