Problem 2.1 a) The regression model is $x_t = \beta t + \alpha_1 Q_1(t) + \alpha_2 Q_2(t) + \alpha_3 Q_3(t) + \alpha_4 Q_4(t) + \omega_t$, where $x_t$ is the logged Johnson & Johnson earnings series, log(jj),

and $Q_i(t) = 1$ if time $t$ corresponds to quarter $i = 1, 2, 3, 4$, and 0 otherwise.

library(astsa)

# Center the time index at 1970
trend = time(jj)-1970

Q = factor(rep(1:4, 21))

reg = lm(log(jj)~0 + trend + Q, na.action = NULL)
summary(reg)

## 
## Call:
## lm(formula = log(jj) ~ 0 + trend + Q, na.action = NULL)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.29318 -0.09062 -0.01180  0.08460  0.27644 
## 
## Coefficients:
##       Estimate Std. Error t value Pr(>|t|)    
## trend 0.167172   0.002259   74.00   <2e-16 ***
## Q1    1.052793   0.027359   38.48   <2e-16 ***
## Q2    1.080916   0.027365   39.50   <2e-16 ***
## Q3    1.151024   0.027383   42.03   <2e-16 ***
## Q4    0.882266   0.027412   32.19   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared:  0.9935, Adjusted R-squared:  0.9931 
## F-statistic:  2407 on 5 and 79 DF,  p-value: < 2.2e-16

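Interpretation of the parameters $\beta$, $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$:

The trend coefficient $\hat{\beta} = 0.167172$ is the estimated average annual increase in logged earnings. The quarter coefficients are $\hat{\alpha}_1 = 1.052793$, $\hat{\alpha}_2 = 1.080916$, $\hat{\alpha}_3 = 1.151024$, and $\hat{\alpha}_4 = 0.882266$; each $\hat{\alpha}_i$ is the quarter-specific intercept, so the fitted value in quarter $i$ is $\hat{\beta}\,(t - 1970) + \hat{\alpha}_i$.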
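Because the response is on the log scale, the trend coefficient can also be read as a growth rate. The following is a minimal sketch that uses only the estimate printed in the summary above; the names `beta_hat` and `pct_growth` are illustrative and not part of the original solution.

```r
# Trend estimate copied from the regression summary (log scale, per year)
beta_hat <- 0.167172

# On the original scale, one year multiplies earnings by exp(beta_hat),
# so the implied annual percentage growth is:
pct_growth <- 100 * (exp(beta_hat) - 1)
print(round(pct_growth, 1))
```

Under the fitted model this corresponds to earnings growing by roughly 18% per year.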

Include an intercept in the model.

# Refit the model with an intercept
reg1 = lm(log(jj)~trend + Q, na.action = NULL)
summary(reg1)

## 
## Call:
## lm(formula = log(jj) ~ trend + Q, na.action = NULL)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.29318 -0.09062 -0.01180  0.08460  0.27644 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.052793   0.027359  38.480  < 2e-16 ***
## trend        0.167172   0.002259  73.999  < 2e-16 ***
## Q2           0.028123   0.038696   0.727   0.4695    
## Q3           0.098231   0.038708   2.538   0.0131 *  
## Q4          -0.170527   0.038729  -4.403 3.31e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1254 on 79 degrees of freedom
## Multiple R-squared:  0.9859, Adjusted R-squared:  0.9852 
## F-statistic:  1379 on 4 and 79 DF,  p-value: < 2.2e-16

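b) If an intercept is included, R drops the Q1 dummy to avoid perfect collinearity, so quarter 1 becomes the baseline: the intercept (1.052793) equals the earlier Q1 coefficient, and the Q2, Q3, and Q4 coefficients now measure differences from quarter 1. The residuals, and hence the fitted values, are unchanged. The Q2 coefficient is not significant (p = 0.4695), meaning quarter 2 does not differ significantly from quarter 1.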
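The link between the two parameterizations can be checked by hand. This short sketch takes the four quarter coefficients from the no-intercept summary and subtracts the quarter 1 value; the results should match the Q2, Q3, and Q4 rows of the intercept model. The vector names `alpha` and `diffs` are illustrative.

```r
# Quarter coefficients copied from the no-intercept fit
alpha <- c(Q1 = 1.052793, Q2 = 1.080916, Q3 = 1.151024, Q4 = 0.882266)

# With an intercept, quarter 1 is the baseline and the remaining
# coefficients are differences from it
diffs <- alpha[-1] - alpha["Q1"]
print(round(diffs, 6))
```

The differences reproduce the reported estimates 0.028123, 0.098231, and -0.170527.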

Graph the data $x_t$ and the fitted values $\hat{x}_t$

#Graph x
plot(log(jj),main='plot of log(x)',xlab='Time',ylab = 'log(x)',col='green')
lines(fitted(reg),col='red')

The fitted values fall below the actual values around 1960 and above them around 1969. Between 1970 and 1975 the fitted values are again below the actual values. Overall the fit seems reasonable.

Plot the residuals $x_t - \hat{x}_t$

#Plot of xt-xt_hat
plot(log(jj) - fitted(reg), main='residuals',ylab='Actual - Estimated')

The residual plot does not show any obvious pattern, so the residuals appear to be roughly white.

Problem 2.3

# Plots for part (a): six random walks with drift 0.01
set.seed(6555)
par(mfcol = c(3, 2))
for (i in 1:6) {
  x = ts(cumsum(rnorm(100, 0.01, 1)))                  # simulated data
  reg = lm(x ~ 0 + time(x), na.action = NULL)          # regression line without intercept
  plot(x)                                              # plot of the data
  lines(0.01 * time(x), col = 'red', lty = 'dashed')   # true mean function 0.01 t
  abline(reg, col = 'blue')                            # fitted line through the origin
}

The mean line and the fitted line behave similarly, though not identically, in panels 1 and 5. Panels 3, 4, and 6 also resemble one another, while panel 2 behaves differently. The fitted line is close to the mean line in panels 3, 4, and 6.

Problem 2.6 a. $x_t = \beta_0 + \beta_1 t + \omega_t$, where $\beta_0$ and $\beta_1$ are constants.

To show that $x_t$ is nonstationary, take expectations: $E(x_t) = E(\beta_0) + E(\beta_1 t) + E(\omega_t)$,

where $E(\omega_t) = 0$, so

$E(x_t) = \beta_0 + \beta_1 t$.

The mean function depends on $t$ (for $\beta_1 \neq 0$), so the series is not stationary.

The first difference series is $\nabla x_t = x_t - x_{t-1} = \beta_0 + \beta_1 t + \omega_t - (\beta_0 + \beta_1 (t-1) + \omega_{t-1}) = \beta_1 + \omega_t - \omega_{t-1}$.

The mean function of the differenced series is $E(\nabla x_t) = \beta_1 + E(\omega_t) - E(\omega_{t-1}) = \beta_1$.

The mean function is independent of time $t$.

Autocovariance function: with $h = s - t$, $\gamma_{\nabla x}(h) = E[(\nabla x_t - \beta_1)(\nabla x_s - \beta_1)] = E[(\omega_t - \omega_{t-1})(\omega_s - \omega_{s-1})]$. For $s = t$, $h = 0$: $\gamma_{\nabla x}(0) = E(\omega_t^2) + E(\omega_{t-1}^2) = 2\sigma^2_{\omega}$.

For $s - t = 1$ or $t - s = 1$, $|h| = 1$: only one cross term survives (for example $-E(\omega_t^2)$ when $s = t + 1$), so $\gamma_{\nabla x}(h) = -\sigma^2_{\omega}$.

For $|h| \geq 2$ the two factors share no common noise term, so $\gamma_{\nabla x}(h) = 0$.

$\gamma_{\nabla x}(h) = \begin{cases} 2\sigma^2_{\omega}, & h = 0 \\ -\sigma^2_{\omega}, & |h| = 1 \\ 0, & \text{otherwise} \end{cases}$

Since neither the mean nor the autocovariance is a function of time, the differenced series is stationary.
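A quick simulation can be used as a sanity check on the first-difference result. For $\nabla x_t = \beta_1 + \omega_t - \omega_{t-1}$ the theoretical mean is $\beta_1$, the variance is $2\sigma^2_{\omega}$, and the lag-1 autocorrelation is $-\sigma^2_{\omega} / (2\sigma^2_{\omega}) = -0.5$. The sketch below uses only base R; the parameter values, seed, and sample size are arbitrary assumptions chosen for illustration.

```r
set.seed(1)
n     <- 10000   # sample size (assumed)
beta0 <- 2       # hypothetical intercept
beta1 <- 0.5     # hypothetical slope
sigma <- 1       # white noise standard deviation

# Linear trend plus white noise, then its first difference
x  <- beta0 + beta1 * (1:n) + rnorm(n, sd = sigma)
dx <- diff(x)

mean_dx <- mean(dx)                                 # should be near beta1
var_dx  <- var(dx)                                  # should be near 2 * sigma^2
rho     <- acf(dx, lag.max = 3, plot = FALSE)$acf[2:4]  # sample ACF at lags 1 to 3

print(c(mean = mean_dx, var = var_dx))
print(round(rho, 3))
```

The sample ACF should be close to -0.5 at lag 1 and close to 0 at lags 2 and 3, in line with the autocovariance derived for the differenced series.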

Repeating the argument with the white noise replaced by a general stationary process $y_t$ with mean $\mu_y$ and autocovariance function $\gamma_y(h)$, the first difference series is $\nabla x_t = x_t - x_{t-1} = \beta_0 + \beta_1 t + y_t - (\beta_0 + \beta_1 (t-1) + y_{t-1}) = \beta_1 + y_t - y_{t-1}$.

The mean function of the differenced series is $E(\nabla x_t) = \beta_1 + E(y_t) - E(y_{t-1}) = \beta_1 + \mu_y - \mu_y = \beta_1$.

The mean function is independent of time $t$.

Autocovariance function

With $h = s - t$, $\gamma_{\nabla x}(h) = E[(\nabla x_t - \beta_1)(\nabla x_s - \beta_1)] = E[((y_t - \mu_y) - (y_{t-1} - \mu_y))((y_s - \mu_y) - (y_{s-1} - \mu_y))] = 2\gamma_y(h) - \gamma_y(h+1) - \gamma_y(h-1)$, which depends only on the lag $h$ and not on $t$.

Since neither the mean nor the autocovariance is a function of time, the differenced series is stationary.
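The general case can be illustrated with one concrete stationary process. Expanding the covariance of $y_t - y_{t-1}$ and $y_s - y_{s-1}$ gives $2\gamma_y(h) - \gamma_y(h+1) - \gamma_y(h-1)$; for an AR(1) process with coefficient $\phi$, where $\gamma_y(h)$ is proportional to $\phi^{|h|}$, this implies a lag-1 autocorrelation of $(2\phi - 1 - \phi^2)/(2 - 2\phi) = -(1 - \phi)/2$ for the differenced series. The sketch below checks this by simulation in base R; the choice of an AR(1) process and all parameter values are assumptions made for illustration.

```r
set.seed(2)
n     <- 20000   # sample size (assumed)
beta0 <- 2       # hypothetical intercept
beta1 <- 0.5     # hypothetical slope
phi   <- 0.5     # AR(1) coefficient of the stationary noise (assumed)

# Stationary AR(1) noise added to a linear trend, then differenced
y  <- as.numeric(arima.sim(model = list(ar = phi), n = n))
x  <- beta0 + beta1 * (1:n) + y
dx <- diff(x)

mean_dx     <- mean(dx)                                   # should be near beta1
rho1_sample <- acf(dx, lag.max = 1, plot = FALSE)$acf[2]  # sample lag-1 ACF
rho1_theory <- -(1 - phi) / 2                             # implied by the identity

print(c(mean = mean_dx, sample = rho1_sample, theory = rho1_theory))
```

The sample mean should sit near $\beta_1$ and the sample lag-1 autocorrelation near the theoretical value, with no dependence on where in the series the window is taken.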

Problem 2.11

library(astsa)
plot.ts(oil, col = 'blue', ylim = range(oil, gas), ylab = 'Price', main = "Plot of Oil and Gas")  # ylim covers both series
lines(gas, col= 'Red')

Both series most closely resemble random walks with drift. Their level depends on time, so the series do not appear to be stationary.

$y_t = \nabla \log x_t$

# Differenced log series (growth rates)
par(mfrow = c(2,1))
x = diff(log(oil))
y = diff(log(gas))
plot(x, col ='blue')
plot(y, col = 'red')

The plots do not show any trend or pattern, so the differenced series appear stationary.

par(mfrow = c(2,1))
acf(x, lag.max = 20)
acf(y, lag.max = 20)

Most of the sample autocorrelations lie between the two blue lines, so the differenced series appear stationary.

poil = diff(log(oil))
pgas = diff(log(gas))
ccf(poil, pgas)

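The cross-correlations stand out near lags -0.3 and -0.05 and near lags 0.05 and 0.5 (lags are measured in years because the series are weekly). In ccf(poil, pgas) the value at lag k estimates the correlation between poil at time t+k and pgas at time t, so the negative lags correspond to oil leading gas and the positive lags to gas leading oil.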
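The sign convention of `ccf` is easy to mix up, so a small simulated check may help. In this sketch the second series is built as a noisy copy of the first series delayed by two steps, so the first series leads; under R's convention the largest cross-correlation of `ccf(x, y)` should then appear at a negative lag. The series, noise level, and delay are all hypothetical and unrelated to the oil and gas data.

```r
set.seed(3)
n      <- 500
x_full <- rnorm(n + 2)

# x at time t is x_full[t + 2]; y at time t is x_full[t] plus noise,
# so y_t = x_{t-2} + noise and x leads y by two steps
x <- x_full[3:(n + 2)]
y <- x_full[1:n] + rnorm(n, sd = 0.5)

cc       <- ccf(x, y, lag.max = 10, plot = FALSE)
peak_lag <- as.numeric(cc$lag[which.max(cc$acf)])
print(peak_lag)
```

The peak falls at lag -2, which matches the reading that negative lags in `ccf(poil, pgas)` indicate oil leading gas.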

# Each series with a lowess (nonparametric) smoother superimposed
par(mfrow = c(1,2))
oil_smooth = lowess(oil)
plot(oil, type="o", ylab="Smooth",main = "Oil") 
lines(oil_smooth, col="red")

gas_smooth = lowess(gas)
plot(gas, type="o", ylab="Smooth", main = "Gas") 
lines(gas_smooth, col="red")

Fit the regression model $G_t = \alpha_1 + \alpha_2 I_t + \beta_1 O_t + \beta_2 O_{t-1} + \omega_t$

#Model
indi = ifelse(poil < 0, 0, 1) 
mess = ts.intersect(pgas, poil, poilL = lag(poil,-1), indi) 
summary(fit <- lm(pgas~ poil + poilL + indi, data=mess))

## 
## Call:
## lm(formula = pgas ~ poil + poilL + indi, data = mess)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.18451 -0.02161 -0.00038  0.02176  0.34342 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.006445   0.003464  -1.860  0.06338 .  
## poil         0.683127   0.058369  11.704  < 2e-16 ***
## poilL        0.111927   0.038554   2.903  0.00385 ** 
## indi         0.012368   0.005516   2.242  0.02534 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04169 on 539 degrees of freedom
## Multiple R-squared:  0.4563, Adjusted R-squared:  0.4532 
## F-statistic: 150.8 on 3 and 539 DF,  p-value: < 2.2e-16

The coefficients of poil, poilL, and indi are significant at the 5% level. The intercept is not significant (p = 0.063).

In the model, $I_t$ is the indicator of growth in the oil price ($I_t = 1$ for positive or no growth, $I_t = 0$ for negative growth). The estimated coefficient of the indicator is 0.012368, which is positive. For negative growth ($I_t = 0$) the model is $G_t = \alpha_1 + \beta_1 O_t + \beta_2 O_{t-1} + \omega_t$, with fitted form $\hat{G}_t = -0.006445 + 0.683127\, O_t + 0.111927\, O_{t-1}$.

For positive or no growth ($I_t = 1$) the model is $G_t = \alpha_1 + \alpha_2 + \beta_1 O_t + \beta_2 O_{t-1} + \omega_t$, with fitted form $\hat{G}_t = -0.006445 + 0.012368 + 0.683127\, O_t + 0.111927\, O_{t-1} = 0.005923 + 0.683127\, O_t + 0.111927\, O_{t-1}$. So yes: when the oil price is increasing, the positive indicator coefficient adds to the predicted growth rate of gas.

acf(resid(fit))

Not all of the sample autocorrelations lie within the two blue lines; the one at lag 9 falls outside. This suggests the residuals are not white noise, so the model assumption may not be valid.
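A portmanteau test is one way to complement the visual check of the residual ACF. The sketch below refits the same regression and applies the Ljung-Box test from base R; it assumes the `astsa` package is installed, and the choice of 20 lags is an assumption made here for illustration rather than something specified in the problem. `dframe = TRUE` is added so that `lm` receives a data frame.

```r
library(astsa)   # provides the oil and gas series used above

# Rebuild the regression of gas growth on oil growth, lagged oil growth,
# and the growth indicator
poil <- diff(log(oil))
pgas <- diff(log(gas))
indi <- ifelse(poil < 0, 0, 1)
mess <- ts.intersect(pgas, poil, poilL = stats::lag(poil, -1), indi,
                     dframe = TRUE)
fit  <- lm(pgas ~ poil + poilL + indi, data = mess)

# Ljung-Box test of the null hypothesis that the residuals are
# uncorrelated up to lag 20 (lag choice is an assumption)
lb <- Box.test(resid(fit), lag = 20, type = "Ljung-Box")
print(lb)
```

A small p-value would support the reading that the residuals are not white noise, while a large one would suggest the single spike at lag 9 could be due to chance.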