rm(list=ls())
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
dat<-readxl::read_excel("/Users/mwilandhlovu/Desktop/ECONS2000/ECOM2000_final_sample_gva_mine.xlsx")
##Question 1:
Create (i) a time variable, t, that starts at 1 in the first period and
increases by 1 every period,
dat$t<-seq_len(nrow(dat)) # 1, 2, ..., 166 (one value per observation)
and (ii) three dummy variables, $D_{1,t}$, $D_{2,t}$ and $D_{3,t}$, which take a value of 1 if observation t is in the corresponding quarter and 0 otherwise (e.g., $D_{1,t} = 1$ if period t is in the 1st quarter and 0 otherwise).
dat$q2<-ifelse(dat$Quarter==2,1,0)
dat$q3<-ifelse(dat$Quarter==3,1,0)
dat$q4<-ifelse(dat$Quarter==4,1,0)
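As a cross-check on the dummy coding, the sketch below builds the same three indicator columns on a small made-up data frame and compares them with what R generates when the quarter is treated as a factor. The data frame `toy` and its values are hypothetical and are not taken from the spreadsheet.

```r
# Hypothetical toy data: two years of quarterly observations
toy <- data.frame(Quarter = rep(1:4, times = 2))
toy$t <- seq_len(nrow(toy))

# Manual dummies, coded the same way as q2, q3 and q4 above
toy$q2 <- ifelse(toy$Quarter == 2, 1, 0)
toy$q3 <- ifelse(toy$Quarter == 3, 1, 0)
toy$q4 <- ifelse(toy$Quarter == 4, 1, 0)

# model.matrix() builds indicator columns from a factor,
# using the first level (quarter 1) as the baseline
mm <- model.matrix(~ factor(Quarter), data = toy)

same_q2 <- all(mm[, "factor(Quarter)2"] == toy$q2)
same_q3 <- all(mm[, "factor(Quarter)3"] == toy$q3)
same_q4 <- all(mm[, "factor(Quarter)4"] == toy$q4)
```

Either coding leaves quarter 1 as the base category, so the seasonal coefficients are read as differences from quarter 1.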
Then, estimate a linear trend model with seasonality. Provide a summary output from R.
# y_t = b0 + b1*t + b2*q2_t + b3*q3_t + b4*q4_t
lms<-lm(GVA_mine~t+q2+q3+q4,data = dat)
summary(lms)
##
## Call:
## lm(formula = GVA_mine ~ t + q2 + q3 + q4, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4067.3 -1465.5 -808.5 381.2 10404.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1630.416 561.431 2.904 0.00420 **
## t 188.605 4.413 42.738 < 2e-16 ***
## q2 930.346 601.748 1.546 0.12405
## q3 1617.913 598.140 2.705 0.00757 **
## q4 1286.450 598.156 2.151 0.03299 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2724 on 161 degrees of freedom
## Multiple R-squared: 0.9194, Adjusted R-squared: 0.9174
## F-statistic: 458.9 on 4 and 161 DF, p-value: < 2.2e-16
##Question 2: State the sample regression equation estimated in Question 1.
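\[ GVA_t = 1630.416 + 188.605\,t + 930.346\,q2_t + 1617.913\,q3_t + 1286.450\,q4_t + e_t \]
where q2, q3 and q4 are the dummy variables for quarters 2, 3 and 4 (quarter 1 is the base category) and e_t is the residual.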
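To see how the estimated equation is used, here is a minimal sketch that plugs a period and a quarter into the coefficients reported in the `summary(lms)` output. The helper `fitted_gva` is a hypothetical name introduced only for illustration, and the coefficients are the rounded values from the output, so the results are approximate.

```r
# Coefficients copied (rounded) from the summary(lms) output
b <- c(intercept = 1630.416, t = 188.605,
       q2 = 930.346, q3 = 1617.913, q4 = 1286.450)

# Hypothetical helper: fitted GVA for period t falling in a given quarter (1 to 4)
fitted_gva <- function(t, quarter) {
  unname(b["intercept"] + b["t"] * t +
           b["q2"] * (quarter == 2) +
           b["q3"] * (quarter == 3) +
           b["q4"] * (quarter == 4))
}

first_period  <- fitted_gva(1, 1)  # t = 1, quarter 1: all dummies are 0
second_period <- fitted_gva(2, 2)  # t = 2, quarter 2: only q2 is 1
```

With the fitted object available, `predict(lms, newdata = ...)` should give the same values at full precision.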
##Question 3: Provide interpretations of the coefficients for t, D1,
(i) Interpret the estimated coefficient for t. The coefficient 188.605 is the estimated change in GVA when t increases by 1, i.e. from one quarter to the next, holding the quarterly dummies (q2, q3, q4) constant.
The coefficient on q2, 930.346, is how much higher GVA is on average in quarter 2 than in quarter 1, holding t constant. The same goes for q3: 1617.913 is how much higher GVA is on average in quarter 3 than in quarter 1.
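Because every seasonal coefficient is measured against quarter 1, two non-baseline quarters can be compared by differencing their coefficients. The sketch below does this with the rounded estimates from the `summary(lms)` output; the variable names are illustrative only.

```r
# Seasonal coefficients copied (rounded) from the summary(lms) output
b <- c(q2 = 930.346, q3 = 1617.913, q4 = 1286.450)

# Estimated average difference between two quarters, holding t constant
q3_vs_q2 <- unname(b["q3"] - b["q2"])  # quarter 3 relative to quarter 2
q4_vs_q3 <- unname(b["q4"] - b["q3"])  # quarter 4 relative to quarter 3
```

A positive value means the first quarter named is estimated to be higher on average; a negative value means it is lower.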
##Question 4: Interpret the reported R‐square value and briefly comment on the adequacy of the model.
In this model the multiple R-squared is 0.9194 (adjusted R-squared 0.9174), which suggests that about 92% of the variation in GVA per capita is explained by the model. The residual standard error is 2724, which initially looks like a high number, but compared with the other large coefficients it is not such a big change. So the model appears adequate because of the high R-squared.
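The link between the two R-squared figures in the output can be checked by hand. This is a minimal sketch using the rounded multiple R-squared from `summary(lms)`, with n = 166 observations and k = 4 regressors (t, q2, q3, q4) read off the degrees of freedom in the same output.

```r
r2 <- 0.9194  # multiple R-squared from the output (rounded)
n  <- 166     # observations: 161 residual df + 5 estimated coefficients
k  <- 4       # regressors excluding the intercept

# Adjusted R-squared penalises the number of regressors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
adj_r2_rounded <- round(adj_r2, 4)
```

The adjustment is small here because the sample is large relative to the number of regressors.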