02/04/2024

Page1: Introduction to Linear Regression

Simple Linear Regression is a statistical technique for modeling the linear relationship between two continuous variables.

In this presentation, we will cover:

- The essential principles and mathematical framework of Simple Linear Regression.
- Executing Simple Linear Regression analysis with the R programming language.
- The practical relevance and methods of interpreting the findings.

Page2: Theoretical Background

Simple Linear Regression involves an independent variable \(X\) and a dependent variable \(Y\). It predicts the value of \(Y\) based on the value of \(X\) using the linear equation:

\[ Y = \beta_0 + \beta_1X + \epsilon \]

where:

- \(Y\) is the variable to predict,
- \(X\) is the variable used for prediction,
- \(\beta_0\) is the y-intercept, indicating the expected value of \(Y\) when \(X\) is 0,
- \(\beta_1\) is the slope of the regression line, representing the expected change in \(Y\) for a one-unit increase in \(X\),
- \(\epsilon\) is the error term, accounting for the deviation of the observed values from the line.

Example: In real estate, we could use simple linear regression to predict the price of a house based on its square footage.

Page3: Assumptions of the Regression

The key assumptions of simple linear regression include:

  1. Linearity: The relationship between \(X\) and \(Y\) is linear.
  2. Independence: Observations are independent of each other.
  3. Homoscedasticity: Constant variance of error terms.
  4. Normality: Errors are normally distributed.
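
These assumptions are usually checked on the residuals of a fitted model. The following is a minimal sketch in base R, assuming simulated data: the data frame `sim`, its generating parameters, and the object names `fit`, `res`, and `normality` are illustrative and not taken from the presentation.

```r
# Simulated data for illustration only (assumed, not the presentation's data)
set.seed(1)
sim <- data.frame(x = 1:100)
sim$y <- 3 + 2 * sim$x + rnorm(100, sd = 10)

fit <- lm(y ~ x, data = sim)
res <- resid(fit)

# Linearity and homoscedasticity: residuals vs fitted values should show
# no curved pattern and a roughly constant vertical spread
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normality: points should fall close to the reference line
qqnorm(res)
qqline(res)

# Shapiro-Wilk test of normality; a small p-value suggests non-normal errors
normality <- shapiro.test(res)
normality$p.value
```

Independence is harder to test from the residuals alone and is typically judged from how the data were collected.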

Page4: Estimating Coefficients and Getting the Data

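The coefficients of the regression line, \(\beta_0\) (intercept) and \(\beta_1\) (slope), are estimated when the model is fitted. They are calculated using the least squares criterion: the fitted line is the one that minimizes the sum of squared differences between the observed and predicted values of \(Y\).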
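
For simple linear regression, the least squares estimates have the closed form \(\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The sketch below computes them by hand and compares them with `lm()`; the simulated data and the names `b0`, `b1`, and `fit` are assumptions made for illustration.

```r
# Simulated data for illustration only (assumed values)
set.seed(42)
x <- 1:50
y <- 5 + 1.5 * x + rnorm(50, sd = 4)

# Closed-form least squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# The same estimates from R's built-in fitting function
fit <- lm(y ~ x)

# Compare the hand-computed and lm() coefficients
all.equal(unname(coef(fit)), c(b0, b1))
```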

##   x          y
## 1 1 -26.023782
## 2 2  -7.508874
## 3 3  83.935416
## 4 4  11.525420
## 5 5  16.464387
## 6 6  97.753249
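
The presentation does not show how `data` was created. The rows printed above are consistent with the sketch below, which is a reconstruction under assumptions: the seed `123`, the slope of 2, and the noise standard deviation of 50 are inferred from the printed values, and the sample size `n = 100` is a guess that the source does not confirm.

```r
set.seed(123)   # assumed seed
n <- 100        # assumed sample size; not stated in the source
x <- 1:n
y <- 2 * x + rnorm(n, mean = 0, sd = 50)  # assumed: slope 2, noise sd 50
data <- data.frame(x = x, y = y)

# Show the first six rows
head(data)
```

Because the noise is large relative to the slope, individual observations can fall far from the underlying line, as the first rows illustrate.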

Page5: Slide with Plot

Page6: Fitting the Regression Model

## `geom_smooth()` using formula = 'y ~ x'
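
The message above is what ggplot2 prints when `geom_smooth()` is called without an explicit formula. The original plotting code is not shown, so the following is only a sketch of how such a fitted-line plot might be produced; the simulated data, the plot labels, and the object name `p` are assumptions.

```r
library(ggplot2)

# Simulated data for illustration only (assumed generating process)
set.seed(123)
data <- data.frame(x = 1:100)
data$y <- 2 * data$x + rnorm(100, sd = 50)

# Fit the simple linear regression model
model <- lm(y ~ x, data = data)

# Scatter plot with the least squares line and its confidence band;
# passing the formula explicitly avoids the informational message
p <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(title = "Simple Linear Regression", x = "X", y = "Y")
print(p)
```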

Page7: Analysis of the Regression Model

# Loading the package
library(plotly)

if (!exists("model")) {
  model <- lm(y ~ x, data = data)
}

if (!exists("predictions")) {
  predictions <- predict(model, data)
}

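# Calculating the residuals (observed minus predicted).
# No exists() guard here: exists("residuals") is always TRUE because
# stats::residuals() is a function, so a guard would skip this assignment.
residuals <- data$y - predictions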

# Creating and displaying the 3D plotly plot
plot_ly(x = data$x, y = predictions, z = residuals, type = 'scatter3d', mode = 'markers') %>%
  layout(title = "3D Residual Plot",
         scene = list(xaxis = list(title = 'X'),
                      yaxis = list(title = 'Predicted Y'),
                      zaxis = list(title = 'Residuals')))
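
Beyond the residual plot, the fitted model can be summarized numerically. The sketch below uses base R accessors to pull out the coefficient table, R-squared, confidence intervals, and predictions; the simulated data and the names `s`, `ci`, `new_x`, and `pred` are assumed for illustration.

```r
# Simulated data for illustration only (assumed generating process)
set.seed(123)
data <- data.frame(x = 1:100)
data$y <- 2 * data$x + rnorm(100, sd = 50)
model <- lm(y ~ x, data = data)

s <- summary(model)
s$coefficients   # estimates, standard errors, t values, p-values
s$r.squared      # proportion of the variance in y explained by x

# 95% confidence intervals for the intercept and slope
ci <- confint(model, level = 0.95)
ci

# Predictions for new x values with 95% prediction intervals
new_x <- data.frame(x = c(25, 50, 75))
pred <- predict(model, newdata = new_x, interval = "prediction")
pred
```

A slope whose confidence interval excludes 0 indicates a statistically detectable linear relationship at that level, while R-squared describes how much of the variation in \(Y\) the line accounts for.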

Page9: Conclusion

This presentation showed how Simple Linear Regression models the relationship between a predictor and a response, how to fit and inspect such a model in R, and why it remains a core tool in analytics.