- Simple linear regression models the linear relationship between two variables.
- One variable is the predictor (independent variable), and the other is the response (dependent variable).
2025-02-09
The model is represented as:
\[ Y = \beta_0 + \beta_1X + \epsilon \]
where:
- \(Y\) = dependent variable
- \(X\) = independent variable
- \(\beta_0\) = intercept
- \(\beta_1\) = slope
- \(\epsilon\) = error term
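For reference, the standard ordinary least squares estimates of the two coefficients can be written in closed form. The notation here is an assumption introduced for illustration and is not used elsewhere in this document: \((x_i, y_i)\) for \(i = 1, \dots, n\) are the observed pairs, \(\bar{x}\) and \(\bar{y}\) are the sample means, and hats mark estimated values.

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

The slope is the sample covariance of \(X\) and \(Y\) divided by the sample variance of \(X\), and the intercept forces the fitted line through the point \((\bar{x}, \bar{y})\).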
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
##     last_plot
## The following object is masked from 'package:stats':
##
##     filter
## The following object is masked from 'package:graphics':
##
##     layout
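library(htmlwidgets)

# Fix the random seed so the simulated data are reproducible
set.seed(42)

# Simulate a predictor and a response with true intercept 5 and true slope 2
x <- rnorm(100, mean = 50, sd = 10)
y <- 5 + 2*x + rnorm(100, sd = 5)
data <- data.frame(x, y)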
ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "blue", alpha = 0.6) +
  ggtitle("Scatter Plot of X vs Y") +
  theme_minimal()
model <- lm(y ~ x, data = data)
summary(model)
##
## Call:
## lm(formula = y ~ x, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -9.4421 -2.5332  0.0612  2.7053 14.3120
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.87919    2.25215   1.722   0.0881 .
## x            2.01358    0.04383  45.938   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.542 on 98 degrees of freedom
## Multiple R-squared:  0.9556, Adjusted R-squared:  0.9552
## F-statistic:  2110 on 1 and 98 DF,  p-value: < 2.2e-16
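As a rough cross-check of the summary above, the sketch below recomputes the slope and intercept from the usual closed-form least squares formulas and compares them with what `lm()` reports. It recreates the simulated data so it can run on its own. The names `slope_manual`, `intercept_manual`, `fit_summary`, and `t_slope` are illustrative and not part of the original analysis.

```r
# Recreate the simulated data and model so this block is self-contained.
set.seed(42)
x <- rnorm(100, mean = 50, sd = 10)
y <- 5 + 2 * x + rnorm(100, sd = 5)
data <- data.frame(x, y)
model <- lm(y ~ x, data = data)

# Closed-form least squares estimates (illustrative names).
slope_manual <- cov(data$x, data$y) / var(data$x)
intercept_manual <- mean(data$y) - slope_manual * mean(data$x)

# Compare with the coefficients estimated by lm().
coefs <- coef(model)
all.equal(unname(coefs["x"]), slope_manual)
all.equal(unname(coefs["(Intercept)"]), intercept_manual)

# With a single predictor, the F-statistic equals the squared
# t statistic of the slope.
fit_summary <- summary(model)
t_slope <- fit_summary$coefficients["x", "t value"]
all.equal(unname(fit_summary$fstatistic["value"]), t_slope^2)

# 95% confidence intervals for the intercept and slope.
confint(model, level = 0.95)
```

The last comparison is visible in the printed summary too: 45.938 squared is about 2110, the reported F-statistic.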
ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "blue", alpha = 0.6) +
  geom_smooth(method = "lm", color = "red") +
  ggtitle("Regression Line on Scatter Plot") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
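The shaded band that `geom_smooth(method = "lm")` draws by default is a confidence band for the fitted mean. A minimal sketch of getting comparable numbers from the model with `predict()` follows. The predictor values in `new_x` are hypothetical, chosen only for illustration, and the block recreates the simulated data so it can run on its own.

```r
# Recreate the simulated data and model so this block is self-contained.
set.seed(42)
x <- rnorm(100, mean = 50, sd = 10)
y <- 5 + 2 * x + rnorm(100, sd = 5)
data <- data.frame(x, y)
model <- lm(y ~ x, data = data)

# Hypothetical new predictor values (illustration only).
new_x <- data.frame(x = c(40, 50, 60))

# Confidence interval: uncertainty about the mean response at each x.
conf_int <- predict(model, newdata = new_x,
                    interval = "confidence", level = 0.95)

# Prediction interval: uncertainty about a single new observation.
pred_int <- predict(model, newdata = new_x,
                    interval = "prediction", level = 0.95)

# Show the fitted values with lower and upper bounds side by side.
cbind(new_x, conf_int)
cbind(new_x, pred_int)
```

The prediction intervals are wider than the confidence intervals because they also account for the error term of an individual observation.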
z <- 5 + 1.5*x + rnorm(100, sd = 5)
plot_ly(x = ~x, y = ~y, z = ~z, type = "scatter3d", mode = "markers")