2023-04-16

Introduction to Simple Linear Regression

This presentation uses R's built-in dataset "Orange". Simple linear regression is a good model for predicting the circumference of an orange tree's trunk from the age of the tree; the fitted slope then estimates the trunk's average growth rate. Simple linear regression models the relationship between a dependent variable and one independent variable, and it assumes that this relationship is linear.

Linear Regression Formula

Simple Linear Regression Model

\[ y = \beta_0 + \beta_1 x + \epsilon \]

where

\(y\) = dependent variable (here, trunk circumference)

\(x\) = independent variable (here, tree age)

\(\beta_0\) = intercept

\(\beta_1\) = slope of the line

\(\epsilon\) = error term

You can use R to calculate these values directly from the formula. However, R also has a built-in linear model function, lm(). The fitted model gives access to the residuals, and calling plot() on it produces a standard set of diagnostic plots.
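As a minimal sketch of the "by hand" route, the snippet below computes the intercept and slope from the usual least-squares estimates, \(\hat\beta_1 = \sum (x_i - \bar x)(y_i - \bar y) / \sum (x_i - \bar x)^2\) and \(\hat\beta_0 = \bar y - \hat\beta_1 \bar x\), and compares them with what lm() returns. The choice of least squares and the variable names are assumptions made for illustration; they are not taken from the original slides.

```r
# Assumption: the coefficients are estimated by ordinary least squares.
x <- Orange$age            # independent variable
y <- Orange$circumference  # dependent variable

# Closed-form least-squares estimates of the slope and intercept
beta_1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
beta_0 <- mean(y) - beta_1 * mean(x)

# The same model fitted with R's built-in linear model function
fit <- lm(circumference ~ age, data = Orange)

# Print both sets of estimates side by side for comparison
print(c(beta_0 = beta_0, beta_1 = beta_1))
print(coef(fit))
```

The two printed vectors should agree up to floating-point rounding, which is a quick way to confirm that lm() fits the formula above.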

Simple Linear Regression using Plotly
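The code behind this slide's figure is not preserved, so the following is only a sketch of how such a plot could be built with the plotly package. It draws the observations as markers and overlays the fitted regression line. The object names, trace names, and axis labels are assumptions.

```r
library(plotly)

# Fit the model and keep the fitted values next to the data.
# as.data.frame() drops the grouped-data class that Orange carries.
orange <- as.data.frame(Orange)
fit <- lm(circumference ~ age, data = orange)
orange$fitted <- fitted(fit)

# Scatter plot of the observations; mode = "markers" is set explicitly
fig <- plot_ly(orange, x = ~age, y = ~circumference,
               type = "scatter", mode = "markers", name = "Observed")

# Overlay the fitted regression line
fig <- add_lines(fig, y = ~fitted, name = "Fitted line")

# Axis labels and title (illustrative wording)
fig <- layout(fig,
              title = "Orange tree circumference vs. age",
              xaxis = list(title = "Age"),
              yaxis = list(title = "Circumference"))
fig
```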


Code for Linear Regression
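The original code for this slide is not preserved. A minimal sketch of fitting the model with lm() and using it for prediction might look like the following. The ages passed to predict() are hypothetical values chosen only for illustration.

```r
# Fit circumference as a linear function of age
fit <- lm(circumference ~ age, data = Orange)

# Coefficient estimates, standard errors, and R-squared
print(summary(fit))

# Predict circumference for a few hypothetical tree ages
new_trees <- data.frame(age = c(500, 1000, 1500))
pred <- predict(fit, newdata = new_trees)
print(pred)
```

Because the model is a straight line, equally spaced ages give equally spaced predictions, and the gap between them is the slope multiplied by the age difference.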

Residuals

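A residual is the difference between an observed value and the value the model predicts for it. Residuals are important because they provide a way to assess the goodness of fit of a model: small residuals with no visible pattern suggest that the model describes the data well.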
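To make the definition concrete, here is a short sketch that computes the residuals as observed minus fitted circumference and checks them against R's residuals() function. The variable names are illustrative assumptions.

```r
fit <- lm(circumference ~ age, data = Orange)

# Residual = observed value minus fitted (predicted) value
manual_res <- Orange$circumference - fitted(fit)

# residuals() extracts the same quantities from the fitted model
model_res <- residuals(fit)
print(all.equal(unname(manual_res), unname(model_res)))

# For a least-squares fit with an intercept, the residuals sum to
# zero up to floating-point rounding
print(sum(model_res))
```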

Residual plot using ggplot

The plot shows the residual (actual minus predicted circumference) for each observation. Negative residuals indicate that the model overestimates the circumference, and positive residuals indicate that it underestimates it. This plot helps check whether the assumptions of the linear regression model are met, such as normally distributed residuals with constant variance.
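The code for the figure itself is not preserved. One way to draw a residual plot with ggplot2 is sketched below. Plotting the residuals against the fitted values is an assumption (the original may have used age on the horizontal axis), as are the labels.

```r
library(ggplot2)

fit <- lm(circumference ~ age, data = Orange)

# Collect fitted values and residuals in a plain data frame
resid_df <- data.frame(
  fitted   = fitted(fit),
  residual = residuals(fit)
)

# Residuals vs. fitted values, with a dashed reference line at zero
p <- ggplot(resid_df, aes(x = fitted, y = residual)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Fitted circumference",
       y = "Residual (actual - predicted)",
       title = "Residuals vs. fitted values")
print(p)
```

Points scattered evenly around the dashed line are consistent with constant variance, whereas a funnel or curved shape would suggest that an assumption is violated.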