Simple Linear Regression is a statistical method to model the relationship between two variables using a linear equation.
2025-03-16
The equation of a simple linear regression model is:
\[ Y = \beta_0 + \beta_1 X + \epsilon \]
where:
- \(Y\) is the dependent variable
- \(X\) is the independent variable
- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope
- \(\epsilon\) is the error term
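To make the equation concrete, here is a minimal sketch of how the intercept and slope can be estimated by ordinary least squares, using the standard closed-form results \(\hat{\beta}_1 = \mathrm{Cov}(X, Y) / \mathrm{Var}(X)\) and \(\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}\). The simulated data and the names `x`, `y`, `b0_hat`, `b1_hat`, and `fit` are illustrative assumptions, not part of the original slides.

```r
# Minimal sketch: closed-form least-squares estimates for Y = b0 + b1 * X + e.
# The data are simulated purely for illustration (hypothetical values).
set.seed(1)
x <- 1:50
y <- 2 + 3 * x + rnorm(50, mean = 0, sd = 4)

# Slope: sample covariance of X and Y divided by the sample variance of X
b1_hat <- cov(x, y) / var(x)

# Intercept: chosen so the fitted line passes through (mean(x), mean(y))
b0_hat <- mean(y) - b1_hat * mean(x)

# Compare with R's built-in least-squares fit
fit <- lm(y ~ x)
print(c(intercept = b0_hat, slope = b1_hat))
print(coef(fit))
```

The two printed results should agree up to floating-point error, since `lm()` solves the same least-squares problem.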
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
##     last_plot
## The following object is masked from 'package:stats':
##
##     filter
## The following object is masked from 'package:graphics':
##
##     layout
# Generate Example Data
set.seed(42)
data <- data.frame(
  X = 1:100,
  Y = 5 + 0.5 * (1:100) + rnorm(100, mean = 0, sd = 5)
)
head(data) # Show first few rows
##   X         Y
## 1 1 12.354792
## 2 2  3.176509
## 3 3  8.315642
## 4 4 10.164313
## 5 5  9.521342
## 6 6  7.469377
ggplot(data, aes(x=X, y=Y)) +
  geom_point(color="blue") +
  geom_smooth(method="lm", col="red") +
  labs(title="Scatter Plot with Regression Line", x="X", y="Y")
## `geom_smooth()` using formula = 'y ~ x'
model <- lm(Y ~ X, data=data)
data$residuals <- resid(model)
ggplot(data, aes(x=X, y=residuals)) +
  geom_point(color="purple") +
  geom_hline(yintercept=0, linetype="dashed", color="red") +
  labs(title="Residuals vs X", x="X", y="Residuals")
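The residual plot can be complemented by a few numeric checks. The sketch below refits the same model and confirms two standard properties of least-squares residuals when an intercept is included: they sum to zero and are uncorrelated with the predictor. The variable names `res`, `sum_res`, `cor_res_x`, and `rse` are illustrative, not from the original slides.

```r
# Refit the model from the slides so this block runs on its own
set.seed(42)
data <- data.frame(
  X = 1:100,
  Y = 5 + 0.5 * (1:100) + rnorm(100, mean = 0, sd = 5)
)
model <- lm(Y ~ X, data = data)
res <- resid(model)

# With an intercept in the model, the residuals sum to (numerically) zero
sum_res <- sum(res)

# ... and they are uncorrelated with the predictor X
cor_res_x <- cor(res, data$X)

# Residual standard error: sqrt(RSS / (n - 2)) for one predictor plus intercept
rse <- sqrt(sum(res^2) / df.residual(model))

print(c(sum = sum_res, cor_with_X = cor_res_x, rse = rse))
```

These properties hold by construction, so they do not validate the model on their own. Curvature or a funnel shape in the residual plot is what would suggest that the linear form or the constant-variance assumption is inadequate.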
summary(model)
##
## Call:
## lm(formula = Y ~ X, data = data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -15.0974  -3.3091   0.4045   3.2635  11.1318
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.34467    1.05434   5.069 1.89e-06 ***
## X            0.49639    0.01813  27.386  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.232 on 98 degrees of freedom
## Multiple R-squared:  0.8844, Adjusted R-squared:  0.8833
## F-statistic:   750 on 1 and 98 DF,  p-value: < 2.2e-16
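Beyond reading the printed summary, the fitted model can be queried directly. The following is a minimal sketch, assuming the same simulated data as above, that extracts confidence intervals for the coefficients, checks that R-squared equals the squared correlation between X and Y (true for simple linear regression), and predicts Y at a few new X values. The values in `new_x` are hypothetical and chosen only for illustration.

```r
# Refit the model from the slides so this block runs on its own
set.seed(42)
data <- data.frame(
  X = 1:100,
  Y = 5 + 0.5 * (1:100) + rnorm(100, mean = 0, sd = 5)
)
model <- lm(Y ~ X, data = data)

# 95% confidence intervals for the intercept and slope
ci <- confint(model, level = 0.95)
print(ci)

# In simple linear regression, R-squared is the squared correlation of X and Y
r2 <- summary(model)$r.squared
print(c(r_squared = r2, cor_squared = cor(data$X, data$Y)^2))

# Predict Y at new X values (hypothetical) with 95% prediction intervals
new_x <- data.frame(X = c(101, 110, 120))
pred <- predict(model, newdata = new_x, interval = "prediction", level = 0.95)
print(cbind(new_x, pred))
```

Prediction intervals are wider than confidence intervals for the mean response because they also account for the error term \(\epsilon\) of an individual observation.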
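Simple Linear Regression is a foundational statistical tool for modeling the relationship between two variables. In the example above, the fitted intercept (5.34) and slope (0.496) closely recover the values used to simulate the data (5 and 0.5), and the model explains about 88% of the variance in Y (R-squared = 0.8844). The fitted line can then be used to predict Y for new values of X and to summarize the trend in the data.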
Any Questions?