- For this assignment, I used R's built-in iris dataset.
- It contains the sepal length, sepal width, petal length, and petal width of 150 flowers from three iris species (setosa, versicolor, and virginica).
2024-10-28
data(iris)
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa
The simple linear regression model predicts Petal Length (\(Y\)) based on Petal Width (\(X\)):
\[ Y = \beta_0 + \beta_1 X + \epsilon \]

The fitted line equation:

\[ \hat{Y} = \hat{\beta_0} + \hat{\beta_1} X \]
Slope and intercept:

\[ \hat{\beta_1} = \frac{\sum{(X_i - \bar{X})(Y_i - \bar{Y})}}{\sum{(X_i - \bar{X})^2}} \]

\[ \hat{\beta_0} = \bar{Y} - \hat{\beta_1} \bar{X} \]
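As a quick check on the slope and intercept formulas, the estimates can be computed by hand and compared with what lm() reports. This is a minimal sketch rather than part of the original analysis; the names b0 and b1 are illustrative, not from the slides.

```r
# Petal width (predictor) and petal length (response) from the built-in iris data
x <- iris$Petal.Width
y <- iris$Petal.Length

# Closed-form least-squares estimates, following the slope and intercept formulas
# (b1 and b0 are illustrative names for the estimated slope and intercept)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Reference fit from lm() for comparison
mod <- lm(y ~ x)

# The two sets of estimates should agree up to floating-point error
all.equal(unname(coef(mod)), c(b0, b1))
```

The comparison uses all.equal() rather than == because the two computations can differ by rounding error in the last few digits.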
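library(plotly)  # provides plot_ly(), add_lines(), layout(), and %>%

# Response and predictor
y <- iris$Petal.Length
x <- iris$Petal.Width

# Fit the simple linear regression of petal length on petal width
mod <- lm(y ~ x)

# Axis titles and fonts
xax <- list(
  title = "Petal Width",
  titlefont = list(family = "Modern Computer Roman")
)
yax <- list(
  title = "Petal Length",
  titlefont = list(family = "Modern Computer Roman")
)

# Scatter plot of the data with the fitted regression line overlaid
plot_ly(x = x, y = y, type = "scatter", mode = "markers") %>%
  add_lines(x = x, y = fitted(mod), line = list(color = "pink")) %>%
  layout(xaxis = xax, yaxis = yax)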