week 14 discussion

data <- read.csv("/Users/timyang/Downloads/GoldUP.csv")

data2 <- data[ , c("Gold_Price", "Interest_Rate", "CPI", "USD_Index")]

pairs(data2, pch = 18, col = "steelblue")

library(GGally)

## Loading required package: ggplot2

## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2

#generate the pairs plot
ggpairs(data2)

model <- lm(Gold_Price~ Interest_Rate + CPI + USD_Index, data = data2)

hist(residuals(model), col = "steelblue")

#create fitted value vs residual plot
plot(fitted(model), residuals(model))

#add horizontal line at 0
abline(h = 0, lty = 2)

summary(model)

## 
## Call:
## lm(formula = Gold_Price ~ Interest_Rate + CPI + USD_Index, data = data2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4911.6 -1939.2  -633.6  1417.1 13820.4 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -284.018   1903.514  -0.149  0.88152    
## Interest_Rate  491.182    162.744   3.018  0.00282 ** 
## CPI            380.144      6.642  57.231  < 2e-16 ***
## USD_Index     -128.712     16.753  -7.683  4.2e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2923 on 235 degrees of freedom
## Multiple R-squared:  0.938,  Adjusted R-squared:  0.9373 
## F-statistic:  1186 on 3 and 235 DF,  p-value: < 2.2e-16

# The fitted equation is:
# Gold_Price = -284.018 + 491.182*Interest_Rate + 380.144*CPI - 128.712*USD_Index
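
To make the fitted equation concrete, here is a small worked prediction. The coefficients are copied from the summary output above; the input values (Interest_Rate = 6, CPI = 100, USD_Index = 90) are hypothetical and chosen only for illustration, not taken from GoldUP.csv.

```r
# Coefficients copied from summary(model) above
b0     <- -284.018
b_rate <-  491.182
b_cpi  <-  380.144
b_usd  <- -128.712

# Hypothetical predictor values (assumed for illustration only)
new_obs <- c(Interest_Rate = 6, CPI = 100, USD_Index = 90)

# Plug the values into the fitted equation term by term
pred <- b0 +
  b_rate * new_obs[["Interest_Rate"]] +
  b_cpi  * new_obs[["CPI"]] +
  b_usd  * new_obs[["USD_Index"]]

print(pred)
```

With the fitted model in memory, `predict(model, newdata = as.data.frame(t(new_obs)))` should give the same number up to rounding of the printed coefficients.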

library(stargazer)

## 
## Please cite as:

##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.

##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

stargazer(model, type = "text", title="Descriptive statistics", digits=1, out="table1.txt")

## 
## Descriptive statistics
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                             Gold_Price         
## -----------------------------------------------
## Interest_Rate                491.2***          
##                               (162.7)          
##                                                
## CPI                          380.1***          
##                                (6.6)           
##                                                
## USD_Index                    -128.7***         
##                               (16.8)           
##                                                
## Constant                      -284.0           
##                              (1,903.5)         
##                                                
## -----------------------------------------------
## Observations                    239            
## R2                              0.9            
## Adjusted R2                     0.9            
## Residual Std. Error     2,922.9 (df = 235)     
## F Statistic          1,186.1*** (df = 3; 235)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

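The pairs plot shows that Gold_Price and Interest_Rate have a strong negative linear correlation, Gold_Price and CPI have a strong positive linear correlation, and Gold_Price and USD_Index have a modest negative linear correlation. These relationships are large enough in magnitude to be meaningful, and the regression output confirms that all three predictors are statistically significant (p < 0.01). Note that the coefficient on Interest_Rate is positive (491.182) once CPI and USD_Index are held constant, so the pairwise and partial relationships differ in sign.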
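
The visual impression from a pairs plot can be backed up with numbers. The sketch below uses simulated data with hypothetical column names (`y`, `x1`, `x2`), since GoldUP.csv is not reproduced here; with the real data the same calls would be applied to `data2`.

```r
set.seed(10)
n <- 150

# Simulated stand-in data (hypothetical, not the gold price data)
x1 <- rnorm(n)
x2 <- rnorm(n)
sim <- data.frame(
  y  = 3 + 2 * x1 - 1 * x2 + rnorm(n),
  x1 = x1,
  x2 = x2
)

# Pairwise Pearson correlations for every pair of columns
cmat <- cor(sim)
print(round(cmat, 2))

# Test whether one specific pairwise correlation differs from zero
ct <- cor.test(sim$y, sim$x1)
print(ct$p.value)
```

A correlation matrix only describes pairwise relationships; the regression coefficients describe each predictor's relationship with the response while holding the other predictors fixed, so the two can disagree.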

The overall F-statistic of the model is 1186 on 3 and 235 degrees of freedom, with a p-value below 2.2e-16. The overall model is therefore statistically significant; in other words, the regression as a whole is useful. The residual plot, however, points to problems: the model produces some negative predicted gold prices, the relationship between Gold_Price and Interest_Rate looks nonlinear, and the residuals fall both above and below the zero line, though not symmetrically (the median residual is -633.6 and the maximum is 13820.4).
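
The two residual checks mentioned above (negative predictions and balance around zero) can be counted rather than read off the plot. This is a minimal sketch on simulated data with a deliberately nonlinear relationship; the variable names are hypothetical, and with the real fit one would substitute `model` for `fit`.

```r
set.seed(1)
n <- 200

# Simulated data: y is strictly positive and grows exponentially in x
x <- runif(n, 1, 10)
y <- exp(0.3 * x + rnorm(n, sd = 0.3))

# A straight-line fit to a curved relationship
fit <- lm(y ~ x)

# How many fitted values are negative even though y is always positive?
n_negative <- sum(fitted(fit) < 0)

# What share of residuals lies above the zero line?
share_above <- mean(residuals(fit) > 0)

print(c(n_negative = n_negative, share_above = share_above))
```

A share far from 0.5, or any negative fitted values for a quantity that cannot be negative, suggests that the linear specification is not adequate.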

plot(model)

When the Gauss-Markov assumptions are met, the ordinary least squares (OLS) estimator is BLUE (Best Linear Unbiased Estimator). Linearity means that the dependent variable is a linear function of the independent variables plus an error term. Strict exogeneity means that, given any values of the independent variables, the expected value of the error term is zero. Normality of the errors makes hypothesis testing and interval estimation easier, but this assumption is not required for OLS estimators to be unbiased and efficient among linear estimators. Whether these assumptions hold in a particular analysis depends on the research question and the context of the data, so it is crucial to check them with statistical tests and diagnostic tools.
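
One such statistical test is a check for constant error variance. The sketch below computes the studentized (Koenker) form of the Breusch-Pagan statistic by hand in base R: regress the squared residuals on the predictors and compare n times the R-squared of that regression to a chi-squared distribution. The data are simulated with error variance that grows with `x1`, so the assumption is violated by construction; all names are hypothetical.

```r
set.seed(42)
n <- 300

# Simulated predictors and a response whose error spread grows with x1
x1 <- runif(n, 1, 10)
x2 <- rnorm(n)
y  <- 2 + 1.5 * x1 - 0.8 * x2 + rnorm(n, sd = 0.5 * x1)

fit <- lm(y ~ x1 + x2)

# Auxiliary regression: squared residuals on the same predictors
aux <- lm(residuals(fit)^2 ~ x1 + x2)

# Test statistic: n * R^2, chi-squared with df = number of predictors
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 2, lower.tail = FALSE)

print(c(statistic = bp_stat, p_value = bp_p))
```

A small p-value is evidence against constant variance. The same steps would apply to the gold price fit with its three predictors and `df = 3`.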

Under the Gauss-Markov assumptions, the unbiased estimates produced by ordinary least squares (OLS) regression have the lowest variance among all linear unbiased estimators. The parameters of the linear regression model can be estimated with the OLS approach.
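
To show what the OLS approach computes, the sketch below solves the normal equations (X'X)b = X'y directly and compares the result with `lm()`. The data are simulated and the variable names are hypothetical.

```r
set.seed(7)
n <- 100

# Simulated data with known coefficients 1, 2, and -3
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n)

# Design matrix with a column of ones for the intercept
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)

# Solve the normal equations (X'X) b = X'y for b
beta_hat <- solve(crossprod(X), crossprod(X, y))

# The same estimates from lm()
fit <- lm(y ~ x1 + x2)

print(cbind(normal_equations = as.numeric(beta_hat), lm = coef(fit)))
```

`lm()` uses a QR decomposition rather than the normal equations, which is numerically more stable, but both give the same estimates on well-conditioned data.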

Linear regression assumes a linear relationship between the independent variable(s) and the dependent variable. In real-world data, however, the relationship is not always linear. Taking the logarithm of one or more variables can make the relationship more linear, although other transformations or modeling techniques may be more suitable in some situations. Logarithmic transformations also lessen the disproportionate influence that extreme values or outliers have on the fitted model. Interpreting coefficients in log-transformed models requires an understanding of the logarithmic scale.
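
As an illustration of interpreting a coefficient on the logarithmic scale, the sketch below fits a log-linear model to simulated data in which `y` grows exponentially in `x`. The data and names are hypothetical; for the gold data the analogous call would be something like `lm(log(Gold_Price) ~ ...)`, which requires all prices to be positive.

```r
set.seed(123)
n <- 200

# Simulated data: log(y) is linear in x with slope 0.3
x <- runif(n, 1, 10)
y <- exp(1 + 0.3 * x + rnorm(n, sd = 0.3))

# Regress log(y) on x instead of y on x
fit_log <- lm(log(y) ~ x)

# Slope on the log scale
b <- unname(coef(fit_log)["x"])

# Approximate percent change in y for a one-unit increase in x
pct_change <- 100 * (exp(b) - 1)

print(round(c(slope = b, pct_change = pct_change), 3))
```

In a log-linear model the slope is read as a proportional effect: a one-unit increase in `x` multiplies the expected `y` by `exp(b)`, rather than adding a fixed amount.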

Mingdong Yang

2023-12-12