library(ggplot2)
library(plotly)
2025-03-18
This presentation covers simple linear regression in R: the model, least squares estimation, and worked examples on the mtcars dataset.
Simple linear regression models the relationship between a dependent variable \(Y\) and an independent variable \(X\), with intercept \(\beta_0\), slope \(\beta_1\), and random error \(\epsilon\):
\[ Y = \beta_0 + \beta_1 X + \epsilon \]
Least squares estimation chooses the coefficients that minimize the sum of squared residuals, \(\sum (Y_i - \beta_0 - \beta_1 X_i)^2\), which gives the closed-form estimates:
\[ \hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \]
\[ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \]
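The two formulas above can be checked numerically. The following is a minimal sketch, assuming the built-in mtcars data with wt as the predictor and mpg as the response; the names `x`, `y`, `beta0_hat`, `beta1_hat`, `fit`, and `manual` are illustrative and not part of the original slides.

```r
# Hand-computed least squares estimates for mpg ~ wt on the built-in mtcars data.
x <- mtcars$wt
y <- mtcars$mpg

# Slope: sum of cross-deviations divided by the sum of squared deviations of x.
beta1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: forces the fitted line through the point (mean(x), mean(y)).
beta0_hat <- mean(y) - beta1_hat * mean(x)

# Compare with the coefficients that lm() reports for the same model.
fit <- lm(mpg ~ wt, data = mtcars)
manual <- c(beta0_hat, beta1_hat)
print(manual)
print(unname(coef(fit)))
```

Both printed vectors should agree up to floating-point error, since `lm()` solves the same least squares problem.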
library(ggplot2)
data(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# Scatter plot of mpg against horsepower, points colored by cylinder count
ggplot(mtcars, aes(x = hp, y = mpg, color = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Linear Regression: MPG vs Horsepower",
       x = "Horsepower",
       y = "Miles Per Gallon (MPG)",
       color = "Cylinders") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
ggplot(mtcars, aes(x = disp, y = mpg, color = factor(cyl))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Linear Regression: MPG vs Displacement",
       x = "Displacement (cu. in.)",
       y = "Miles Per Gallon (MPG)",
       color = "Cylinders") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula = 'y ~ x'
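The lines drawn in the two plots can also be obtained as numbers. This is a minimal sketch, assuming each smoothing layer fits a single line across all cars with the formula `y ~ x` reported in the message above; the object names `fit_hp`, `fit_disp`, and `coefs` are illustrative.

```r
# Fit the simple regressions that the plots are assumed to draw (one line per plot).
fit_hp   <- lm(mpg ~ hp,   data = mtcars)
fit_disp <- lm(mpg ~ disp, data = mtcars)

# Collect intercepts and slopes side by side, one row per predictor.
coefs <- rbind(hp = coef(fit_hp), disp = coef(fit_disp))
colnames(coefs) <- c("intercept", "slope")
print(coefs)
```

Both slopes come out negative, matching the downward trend visible in each scatter plot.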
model <- lm(mpg ~ wt, data = mtcars)
summary(model)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
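In this output, the slope of about -5.34 means that each additional 1000 lbs of weight is associated with roughly 5.34 fewer miles per gallon, and an R-squared of 0.7528 means weight accounts for about 75% of the variance in mpg. The sketch below shows one way to pull these quantities out of the fitted model and make a prediction. The car weighing 3000 lbs is a hypothetical example, and the names `coefs`, `r_squared`, and `pred` are illustrative.

```r
# Refit the model from the slide so this block runs on its own.
model <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope as a named numeric vector.
coefs <- coef(model)

# Proportion of the variance in mpg explained by wt.
r_squared <- summary(model)$r.squared

# Predicted mpg for a hypothetical car weighing 3000 lbs (wt is in units of 1000 lbs).
pred <- predict(model, newdata = data.frame(wt = 3))

print(round(coefs, 4))
print(round(r_squared, 4))
print(unname(pred))
```

The prediction is the intercept plus three times the slope, so it can be verified by hand from the coefficient table.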
This presentation demonstrated simple linear regression in R, from the least squares formulas to fitted models on mtcars, using R Markdown with ioslides.