The topic I have chosen for this presentation is simple linear regression.
I have chosen this topic because it is at the root of statistical analysis and is one of the most commonly used techniques, so it is worth everyone learning a bit about it.
2023-10-15
Linear regression is a statistical technique that uses an independent variable to predict a dependent variable. It can be a simple linear regression with two variables, but there is also multiple linear regression, which uses several independent variables to predict one dependent variable.
It is often used to predict a numeric quantity, such as a stock price, from the value of another variable. This can help us understand the relationship between the two variables.
A simple linear regression model is represented by the equation: \[Y = \beta_0 + \beta_1X + \epsilon\]
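This equation contains a few parts: \(Y\) is the dependent variable being predicted, \(X\) is the independent variable, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon\) is the error term.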
library(MASS)
data("Boston")
lmModel <- lm(medv ~ rm, data = Boston)
Here is the code used to create the linear regression model. I used the “Boston” data set from the MASS package to get housing data for this regression, predicting median home value (medv) from the average number of rooms per dwelling (rm). The model is then plotted together with its regression line.
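To connect the fitted model back to the equation, here is a minimal sketch that reads the estimated \(\beta_0\) and \(\beta_1\) out of the model and compares them with the standard closed-form least squares estimates for a single predictor: the slope is the covariance of X and Y divided by the variance of X, and the intercept is the mean of Y minus the slope times the mean of X. The names coefs, slope, and intercept are my own, chosen for illustration, and are not part of the original code.

```r
library(MASS)  # provides the Boston data set

data("Boston")
lmModel <- lm(medv ~ rm, data = Boston)

# Estimated intercept (beta_0) and slope (beta_1) from the fitted model
coefs <- coef(lmModel)

# Closed-form least squares estimates for one predictor (illustrative names)
slope <- cov(Boston$rm, Boston$medv) / var(Boston$rm)
intercept <- mean(Boston$medv) - slope * mean(Boston$rm)

# The two sets of numbers should agree up to floating-point error
print(coefs)
print(c(intercept = intercept, slope = slope))
```

The first printed pair comes from lm() and the second from the hand calculation, so matching values show that the model is estimating the two coefficients in the equation.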
Figure: Linear Regression on Boston Housing
This is an interactive plotly plot showing predicted versus observed values for the regression model.
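Since the plot is interactive, a small sketch of the same comparison in table form may help. It refits lmModel so that the block runs on its own. The names comparison, new_homes, and predicted_values, and the room counts 5, 6, and 7, are hypothetical and chosen only for illustration.

```r
library(MASS)  # provides the Boston data set

data("Boston")
lmModel <- lm(medv ~ rm, data = Boston)

# Pair each observed median home value with the model's prediction for that row
comparison <- data.frame(Observed = Boston$medv,
                         Predicted = fitted(lmModel))
print(head(comparison))

# Predict medv for hypothetical homes with 5, 6, and 7 rooms
new_homes <- data.frame(rm = c(5, 6, 7))
predicted_values <- predict(lmModel, newdata = new_homes)
print(predicted_values)
```

Because the model is a straight line, each additional room changes the prediction by the same amount, which is the slope \(\beta_1\).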
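library(ggplot2)
# Residuals (observed minus predicted) from the fitted model
linearResiduals <- resid(lmModel)
residual_data <- data.frame(Predicted = predict(lmModel),
                            Residuals = linearResiduals)
# End each line with `+` so R reads the whole plot as a single expression
ggplot(data = residual_data, aes(x = Predicted, y = Residuals)) +
  geom_point() +
  labs(title = "Residual Plot for Linear Regression Model",
       x = "Predicted Values", y = "Residuals")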
The equation for a residual is:
\[ \epsilon_i = y_i - \hat{y}_i \]
Each residual is the observed value minus the predicted value for a single data point.
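As a quick check of this definition, the sketch below computes the residuals by hand and compares them with the ones R stores in the model. It refits lmModel so that it runs on its own, and manual_residuals is a name I made up for illustration.

```r
library(MASS)  # provides the Boston data set

data("Boston")
lmModel <- lm(medv ~ rm, data = Boston)

# Residuals by hand: observed value minus predicted value for every row
manual_residuals <- Boston$medv - predict(lmModel)

# Compare with the residuals returned by resid(); TRUE means they match
print(all.equal(unname(manual_residuals), unname(resid(lmModel))))

# For a model with an intercept, the residuals sum to zero up to rounding error
print(sum(manual_residuals))
```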
Thanks for going through the presentation. I hope you learned something interesting.