- Linear Regression models the relationship between two variables:
  - Independent variable (X)
  - Dependent variable (Y)
- Purpose:
  - Predict the value of Y based on X
  - Understand the strength and direction of the relationship
The mathematical representation of the model is:
\[ Y = \beta_0 + \beta_1 X + \epsilon \]
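Where:
- \(\beta_0\) is the intercept (the expected value of Y when X = 0)
- \(\beta_1\) is the slope (the expected change in Y for a one-unit increase in X)
- \(\epsilon\) is the random error term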
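As a rough illustration of the model equation, the sketch below simulates data from a known line and checks that `lm()` recovers values close to the true coefficients. The values `beta0 = 2`, `beta1 = 0.5`, the sample size, and the noise level are arbitrary assumptions chosen for the example; they are not taken from the data used in this document.

```r
# Simulate data from Y = beta0 + beta1 * X + epsilon (all values hypothetical)
set.seed(42)
n     <- 100
beta0 <- 2      # assumed true intercept
beta1 <- 0.5    # assumed true slope

x   <- runif(n, min = 0, max = 10)   # independent variable
eps <- rnorm(n, mean = 0, sd = 1)    # random error term
y   <- beta0 + beta1 * x + eps       # dependent variable

# Fit the simple linear regression and extract the estimated coefficients
fit       <- lm(y ~ x)
estimates <- coef(fit)
print(estimates)
```

The estimates should land near the assumed values, with the gap shrinking as `n` grows or the noise standard deviation falls.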
A residual plot helps check the assumptions of linear regression, especially the constant variance of the error terms.
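One way to draw such a residual plot is sketched below with base R graphics. The `hours` and `scores` vectors are simulated stand-ins, not the data behind this document, and base `plot()` is used here only to keep the example free of extra packages.

```r
# Hypothetical data standing in for the real Hours/Scores observations
set.seed(1)
hours  <- runif(30, min = 1, max = 10)
scores <- 3 + 2.8 * hours + rnorm(30, sd = 2)

fit <- lm(scores ~ hours)

# Residuals (observed minus fitted) and the fitted values they belong to
res         <- resid(fit)
fitted_vals <- fitted(fit)

# Residuals vs fitted values: look for a patternless band around zero
plot(fitted_vals, res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)  # reference line at zero
```

A funnel shape (spread growing or shrinking with the fitted values) would suggest non-constant variance, while a curved pattern would suggest the relationship is not linear.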
Here is the summary output of the linear regression model fitted in R:
## 
## Call:
## lm(formula = Scores ~ Hours, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2695 -1.1248 -0.2785  0.7707  3.3628 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.0509     1.3346   2.286   0.0516 .  
## Hours         2.8361     0.2151  13.186 1.04e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.954 on 8 degrees of freedom
## Multiple R-squared:  0.956,  Adjusted R-squared:  0.9505 
## F-statistic: 173.9 on 1 and 8 DF,  p-value: 1.042e-06
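The numbers in this output can be cross-checked with a little arithmetic. The sketch below copies the reported estimates and standard errors, recomputes the t values, and uses the fitted line to predict a score. The value `hours_new = 5` is a hypothetical input chosen for illustration, not an observation from the data.

```r
# Coefficients and standard errors copied from the summary output above
intercept    <- 3.0509
slope        <- 2.8361
se_intercept <- 1.3346
se_slope     <- 0.2151

# t value = estimate / standard error
t_intercept <- intercept / se_intercept
t_slope     <- slope / se_slope

# With a single predictor, the F-statistic equals the squared slope t value
f_stat <- t_slope^2

# Predicted score for a hypothetical student who studies 5 hours
hours_new       <- 5
predicted_score <- intercept + slope * hours_new

print(round(c(t_intercept = t_intercept, t_slope = t_slope,
              f_stat = f_stat, predicted_score = predicted_score), 3))
```

Small differences from the printed t values and F-statistic are expected, because the inputs here are already rounded to four decimal places.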
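Interpretation:
- Slope (Hours): each additional hour is associated with an estimated increase of about 2.84 in Scores, and the effect is highly significant (p = 1.04e-06).
- Intercept: the estimated score at zero hours is about 3.05, which is not significant at the 5% level (p = 0.0516).
- The Multiple R-squared of 0.956 means the model explains about 95.6% of the variance in Scores. R-squared is defined as: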
\[ R^2 = 1 - \frac{\sum (Y_i - \hat{Y_i})^2}{\sum (Y_i - \bar{Y})^2} \]
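The formula can be checked numerically. The sketch below computes R-squared by hand from the residual and total sums of squares and compares it with the value reported by `summary()`. The data are simulated for illustration and are an assumption of this example.

```r
# Hypothetical data for illustration only
set.seed(7)
x <- 1:20
y <- 1 + 2 * x + rnorm(20, sd = 3)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)

# Residual sum of squares: sum of (Y_i - Y_hat_i)^2
ss_res <- sum((y - y_hat)^2)
# Total sum of squares: sum of (Y_i - Y_bar)^2
ss_tot <- sum((y - mean(y))^2)

r2_manual <- 1 - ss_res / ss_tot
r2_lm     <- summary(fit)$r.squared

# The two values should agree up to floating-point error
print(c(manual = r2_manual, lm = r2_lm))
```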
Further resources:
- https://www.ncl.ac.uk/webtemplate/ask-assets/external/maths-resources/statistics/regression-and-correlation/simple-linear-regression.html