Multiple Regression in R Using the Iris Dataset

df <- iris

Checking the Assumptions

lm_model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = df)

summary(lm_model)
## 
## Call:
## lm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, 
##     data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.82816 -0.21989  0.01875  0.19709  0.84570 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.85600    0.25078   7.401 9.85e-12 ***
## Sepal.Width   0.65084    0.06665   9.765  < 2e-16 ***
## Petal.Length  0.70913    0.05672  12.502  < 2e-16 ***
## Petal.Width  -0.55648    0.12755  -4.363 2.41e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3145 on 146 degrees of freedom
## Multiple R-squared:  0.8586, Adjusted R-squared:  0.8557 
## F-statistic: 295.5 on 3 and 146 DF,  p-value: < 2.2e-16
library(car)  # vif() comes from the car package, not base R
vif(lm_model)
##  Sepal.Width Petal.Length  Petal.Width 
##     1.270815    15.097572    14.234335
par(mfrow=c(2,2))
plot(lm_model)
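
The diagnostic plots can be complemented with numeric checks. The following is a minimal base-R sketch, not part of the original analysis: it computes a studentized (Koenker) Breusch-Pagan statistic for homoscedasticity, a Durbin-Watson statistic for first-order residual autocorrelation, and a Shapiro-Wilk test for normality of the residuals. The variable names (`res`, `aux`, `bp_stat`, `bp_p`, `dw_stat`) are chosen here for illustration. Because the iris rows are ordered by species, the Durbin-Watson value reflects that ordering and should be read with caution.

```r
# Refit the model so this block runs on its own
df <- iris
lm_model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = df)
res <- residuals(lm_model)

# Homoscedasticity: studentized (Koenker) Breusch-Pagan test.
# Regress the squared residuals on the predictors; n * R^2 is
# asymptotically chi-squared with df = number of predictors (3).
df$res_sq <- res^2
aux <- lm(res_sq ~ Sepal.Width + Petal.Length + Petal.Width, data = df)
bp_stat <- nrow(df) * summary(aux)$r.squared
bp_p <- pchisq(bp_stat, df = 3, lower.tail = FALSE)

# Independence: Durbin-Watson statistic (values near 2 suggest
# little first-order autocorrelation in the row order of the data).
dw_stat <- sum(diff(res)^2) / sum(res^2)

# Normality of residuals: Shapiro-Wilk test
sw <- shapiro.test(res)

print(c(bp_stat = bp_stat, bp_p = bp_p, dw_stat = dw_stat))
print(sw)
```

A large Breusch-Pagan p-value and a Durbin-Watson statistic close to 2 would be consistent with what the plots suggest; small p-values or a statistic far from 2 would call for a closer look.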

The residuals, the differences between observed and predicted sepal lengths, range from about -0.83 to 0.85 cm, with the middle half falling between -0.22 and 0.20 cm. All three predictors (sepal width, petal length, and petal width) have statistically significant effects on sepal length (p < 0.001 for each), and petal width has the largest standard error (0.128). The model explains about 85.86% of the variability in sepal length and is statistically significant overall (F-statistic = 295.5 on 3 and 146 degrees of freedom, p-value < 2.2e-16), which indicates that at least one of the predictors contributes significantly to predicting sepal length. The adjusted R-squared of 0.8557 is close to the multiple R-squared, so the fit remains reasonably good after accounting for the number of predictors. Based on the diagnostic plots, the assumptions of homoscedasticity and independence of residuals appear to be met. However, the VIF values for Petal.Length (15.10) and Petal.Width (14.23) are well above the common rule-of-thumb threshold of 10, which suggests multicollinearity, most likely between these two predictors.
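
To make the VIF values less of a black box, they can be reproduced in base R from the definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. This is an illustrative sketch rather than the method used by the `car` package internally; `predictors`, `vif_manual`, and `petal_cor` are names introduced here.

```r
df <- iris
predictors <- c("Sepal.Width", "Petal.Length", "Petal.Width")

# For each predictor, regress it on the remaining predictors and
# convert the auxiliary R-squared into a variance inflation factor.
vif_manual <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  aux <- lm(reformulate(others, response = p), data = df)
  1 / (1 - summary(aux)$r.squared)
})
print(round(vif_manual, 3))

# Pairwise correlation between the two petal measurements,
# the likely source of the inflated VIFs
petal_cor <- cor(df$Petal.Length, df$Petal.Width)
print(petal_cor)
```

The values should agree with the `vif(lm_model)` output above. A high correlation between the two petal measurements would explain why both have inflated VIFs while Sepal.Width does not.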