2024-10-20

Introduction

  • Simple linear regression is a statistical method for summarizing and studying the relationship between two variables
  • It can be used to predict the value of a dependent variable from the value of an independent variable
  • A straight line is used to model this relationship

Equation for Linear Regression

  • The line for simple linear regression models can be expressed as \[Y = \beta_0 + \beta_1X + \epsilon\]

  • Where \(Y\) is the dependent variable

  • \(X\) is the independent variable

  • \(\beta_0\) is the y-intercept

  • \(\beta_1\) is the slope

  • \(\epsilon\) is the error term
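As a small illustration of how the intercept and slope are estimated, the sketch below applies the usual least-squares formulas to a made-up five-point data set and compares the result with `lm()`. The data values and variable names are illustrative assumptions and are not part of the stock price example.

```r
# Made-up data for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Least-squares estimates:
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   intercept = mean(y) - slope * mean(x)
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# lm() should return the same two numbers
fit <- lm(y ~ x)
print(c(intercept = intercept, slope = slope))
print(coef(fit))
```

For these five points the slope works out to 1.99 and the intercept to 0.05.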

Assumptions of Simple Linear Regression

  • Linearity: the relationship between X and Y is linear
  • Homoscedasticity: the variance of the residuals is the same for any value of X
  • Normality: for any fixed value of X, Y is normally distributed
  • Independence: observations are independent of each other
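These assumptions are usually checked with diagnostic plots, but rough numerical checks are also possible. The sketch below is one way to do it using base R only. It simulates data with the same structure as the stock example, and the particular tests are illustrative choices, not the only or definitive ones.

```r
# Simulated data with the same structure as the stock example
set.seed(123)
days <- 0:365
price <- 100 + 0.1 * days + rnorm(366, 0, 5)
fit <- lm(price ~ days)
res <- resid(fit)

# Linearity: a quadratic term should add little if the relationship is linear
linearity <- anova(fit, lm(price ~ days + I(days^2)))

# Homoscedasticity: absolute residuals should not trend with the fitted values
spread <- cor.test(abs(res), fitted(fit))

# Normality: Shapiro-Wilk test on the residuals
normality <- shapiro.test(res)

# Independence: lag-1 autocorrelation of the residuals should be near zero
lag1 <- acf(res, plot = FALSE)$acf[2]

print(linearity)
print(spread$p.value)
print(normality$p.value)
print(lag1)
```

Large p-values and a lag-1 autocorrelation near zero are consistent with the assumptions, although they do not prove that the assumptions hold.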

Example of Linear Regression: Stock Prices (generated data)

set.seed(123)
days <- 0:365
date <- Sys.Date() - days
price <- 100 + 0.1 * days + rnorm(366, 0, 5)
# Days is stored as a column because the model below is fit as Price ~ Days
stock_data <- data.frame(Date = date, Days = days, Price = price)

Plot of Stock Price Data

Regression Analysis

## 
## Call:
## lm(formula = Price ~ Days, data = stock_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.6974  -3.1416  -0.2888   3.1408  16.0727 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 99.935789   0.504609   198.0   <2e-16 ***
## Days         0.101207   0.002393    42.3   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.837 on 364 degrees of freedom
## Multiple R-squared:  0.8309, Adjusted R-squared:  0.8305 
## F-statistic:  1789 on 1 and 364 DF,  p-value: < 2.2e-16
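Output of this form comes from calling `summary()` on an `lm()` fit. The following is a minimal sketch, assuming `stock_data` holds the day index in a numeric `Days` column (an assumption based on the formula in the call above). The exact numbers depend on how the data were generated.

```r
# Rebuild the generated data, with Days stored as a column (assumption)
set.seed(123)
days <- 0:365
price <- 100 + 0.1 * days + rnorm(366, 0, 5)
stock_data <- data.frame(Date = Sys.Date() - days, Days = days, Price = price)

# Fit Price as a linear function of Days and print the summary table
model <- lm(Price ~ Days, data = stock_data)
print(summary(model))

# 95% confidence intervals for the intercept and slope
print(confint(model))
```

The confidence intervals are a compact way to report the uncertainty that the standard errors in the table describe.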

Residual Plot

Plotly interactive stock price plot

Results Explained

The regression equation for stock prices is \[\text{Price} = \hat{\beta_0} + \hat{\beta_1} \times \text{Days}\]

  • \(\hat{\beta_0}\) is the estimated initial stock price
  • \(\hat{\beta_1}\) is the estimated daily price change

Interpretation:

The fitted slope of about 0.101 means the price changes by roughly 0.10 per day on average, and the intercept of about 99.94 is the estimated price at day 0. Both are close to the values 0.1 and 100 used to generate the data.
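To make the interpretation concrete, the sketch below plugs the coefficient estimates from the summary output into the fitted line. The helper `predict_price` is a hypothetical name introduced only for this illustration.

```r
# Coefficient estimates copied from the summary output
b0 <- 99.935789  # intercept: estimated price at day 0
b1 <- 0.101207   # slope: estimated price change per day

# Hypothetical helper: fitted price on a given day
predict_price <- function(day) b0 + b1 * day

print(predict_price(0))    # the intercept
print(predict_price(100))  # 99.935789 + 0.101207 * 100 = 110.056489
```

With a fitted model object, `predict()` would give the same values without copying the coefficients by hand.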

For real financial data, other approaches such as time series models are more suitable; this was a deliberately short and simple example.

Thank You!