2024-04-06
Here is the R code for the last slide, which produced a plot showing the relationship between petal length and sepal length in the iris data set.
library(ggplot2)
ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Sepal Length",
    y = "Petal Length",
    title = "Relationship between Sepal Length and Petal Length"
  ) +
  theme_minimal()
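The line drawn by `geom_smooth(method = "lm", formula = y ~ x)` is an ordinary least-squares fit. As a minimal sketch, assuming the built-in `iris` data, the same line can be fitted directly with `lm()` to inspect its coefficients. The object name `pooled_fit` is only illustrative and does not appear on the slide.

```r
# Fit the same simple linear regression that geom_smooth() draws,
# pooling all three species together.
# `pooled_fit` is a hypothetical name chosen for this sketch.
pooled_fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# Estimated intercept and slope of the fitted line
coef(pooled_fit)
```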
Each species has a distinct linear regression line which may be used for prediction.
Here is the R code for the last slide, which produced a plot of petal length against sepal length for each iris species.
ggplot(data = iris,
       aes(x = Sepal.Length,
           y = Petal.Length,
           color = Species)) +
  geom_point() +
  geom_smooth(method = "lm",
              formula = y ~ x, se = FALSE) +
  labs(
    x = "Sepal Length",
    y = "Petal Length",
    title = "Relationship between Sepal Length and Petal Length"
  ) +
  theme_minimal()
The regression equation used for the plots can be represented as:
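\[ \hat{PetalLength}_{\text{Setosa}} = \hat{\beta}_0 + \hat{\beta}_1 \cdot SepalLength \]
Where \(\hat{PetalLength}_{\text{Setosa}}\) represents the predicted Petal Length for the Setosa species, \(\hat{\beta}_0\) is the estimated intercept and \(\hat{\beta}_1\) is the estimated slope. The other species follow the same form with their own coefficients. The error term \(\epsilon\) belongs to the model for the observed values, \(PetalLength = \beta_0 + \beta_1 \cdot SepalLength + \epsilon\), and does not appear in the prediction.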
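To make the per-species equation concrete, the coefficients can be estimated with one `lm()` call per species. This is a sketch under the assumption that the built-in `iris` data is used. The name `fits` and the Sepal Length value of 5.0 are illustrative choices, not taken from the slides.

```r
# Fit a separate simple linear regression for each species.
# `fits` is a hypothetical name for the resulting list of models.
fits <- lapply(split(iris, iris$Species), function(d) {
  lm(Petal.Length ~ Sepal.Length, data = d)
})

# One row per species: estimated intercept and slope
t(sapply(fits, coef))

# Predicted Petal Length for a setosa flower with Sepal Length 5.0
# (5.0 is an arbitrary illustrative value inside the observed setosa range)
predict(fits$setosa, newdata = data.frame(Sepal.Length = 5.0))
```

Splitting by `Species` mirrors what `color = Species` does in the plot, where `geom_smooth()` fits one line per group.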
The sample variance is given by:
\[s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}\]
The Sepal Length variance is:
## [1] 0.6856935
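The value above can be reproduced either from the variance formula or with the built-in `var()` function. The following is a minimal sketch; the variable names `x`, `n`, `manual_var` and `builtin_var` are illustrative.

```r
x <- iris$Sepal.Length
n <- length(x)

# Sample variance computed directly from the formula:
# sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)

# Built-in equivalent; var() also uses the n - 1 denominator
builtin_var <- var(x)

# The two results should agree up to floating-point error
all.equal(manual_var, builtin_var)
builtin_var
```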
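The variance is relatively small: the standard deviation is about 0.83 cm around a mean of about 5.84 cm, so most Sepal Length values lie close to the mean. Combined with the clear linear trend in the plots, this suggests the regression line is reasonably reliable for predictions within the observed range of Sepal Length, although the variance of the predictor alone does not measure goodness of fit.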
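The reliability of the fitted line can also be checked directly from the model. As a sketch, assuming the pooled model of Petal Length on Sepal Length, `summary()` reports R-squared and the residual standard error. The names `fit` and `fit_summary` are illustrative.

```r
# Pooled simple linear regression across all species
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)
fit_summary <- summary(fit)

# Proportion of the variance in Petal Length explained by Sepal Length
fit_summary$r.squared

# Residual standard error: typical size of a prediction error,
# in the same units as Petal Length
fit_summary$sigma
```

An R-squared close to 1 and a small residual standard error would support using the line for prediction.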