2024-10-20
This presentation introduces simple linear regression and walks through fitting a model to a small example data set of study hours and exam scores.
The simple linear regression model is defined by the equation:

\[ Y = \beta_0 + \beta_1 X + \epsilon \]

where:

- \(Y\) is the response variable
- \(X\) is the predictor variable
- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope
- \(\epsilon\) is the error term
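For reference, the intercept and slope are usually estimated by ordinary least squares. The closed-form estimates are sketched below. The notation \(x_i\), \(y_i\), \(\bar{x}\), \(\bar{y}\), \(n\) and the hats on the coefficients are assumptions introduced here and do not appear in the original slides:

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

Here \(\bar{x}\) and \(\bar{y}\) are the sample means of the \(n\) observed predictor and response values.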
# Sample Data
set.seed(123)
study_hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
exam_scores <- c(55, 57, 61, 64, 66, 70, 74, 78, 81, 85)
data <- data.frame(study_hours, exam_scores)
model <- lm(exam_scores ~ study_hours, data = data)
summary(model)
## 
## Call:
## lm(formula = exam_scores ~ study_hours, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.41212 -0.25455  0.02424  0.43030  1.09091 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 50.53333    0.52647   95.98 1.55e-13 ***
## study_hours  3.37576    0.08485   39.79 1.75e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7707 on 8 degrees of freedom
## Multiple R-squared:  0.995,  Adjusted R-squared:  0.9943 
## F-statistic:  1583 on 1 and 8 DF,  p-value: 1.752e-10
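As a rough cross-check on the fitted coefficients, the slope and intercept can also be computed by hand from the usual closed-form least-squares formulas and compared with what `lm()` reports. The sketch below re-creates the sample data so it runs on its own. The names `x_bar`, `y_bar`, `slope`, `intercept`, `new_student` and `predicted_score`, and the 7.5-hour input, are illustrative assumptions and not part of the original slides.

```r
# Re-create the sample data so this block is self-contained
study_hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
exam_scores <- c(55, 57, 61, 64, 66, 70, 74, 78, 81, 85)
data <- data.frame(study_hours, exam_scores)

# Closed-form least-squares estimates (hypothetical helper variables)
x_bar <- mean(data$study_hours)
y_bar <- mean(data$exam_scores)
slope <- sum((data$study_hours - x_bar) * (data$exam_scores - y_bar)) /
  sum((data$study_hours - x_bar)^2)
intercept <- y_bar - slope * x_bar

# Fit the same model with lm() and compare the two sets of estimates
model <- lm(exam_scores ~ study_hours, data = data)
coefficients_match <- isTRUE(
  all.equal(unname(coef(model)), c(intercept, slope))
)

# Predicted exam score for an assumed student who studies 7.5 hours
new_student <- data.frame(study_hours = 7.5)
predicted_score <- unname(predict(model, newdata = new_student))
```

Read this way, the slope estimate says that each additional hour of study is associated with an increase of about 3.38 points in exam score within this sample, and the intercept of about 50.5 is the fitted score at zero study hours.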