What is Simple Linear Regression?

Simple Linear Regression is a model that uses a straight line to describe the relationship between two variables. In a Cartesian coordinate system, the independent variable is generally plotted on the x-axis and the dependent variable on the y-axis.

Simple Linear Regression can be used to predict the value of the dependent variable for a given value of the independent variable; that is, a y-axis value can be predicted for a given x-axis value.

Formula for Simple Linear Regression

The most basic formula for Simple Linear Regression can be written as: \[ y = \alpha + \beta x \] where \(\alpha\) is the y-intercept and \(\beta\) is the slope of the line.
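
For context, R's lm function, which is used later in this example, estimates these two values by ordinary least squares. As a sketch, with \(\bar{x}\) and \(\bar{y}\) denoting the sample means of the \(n\) observed pairs \((x_i, y_i)\) (this notation is introduced here for illustration and does not appear elsewhere in this document), the estimates are: \[ \hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x} \]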

Example of Simple Linear Regression

To exemplify Simple Linear Regression, we will create a series of figures using R.

The data set we will use as an example is the height and weight of children from the “kid.weights” data set in the “UsingR” package.

First we will create a plot of children’s height vs their weight.

Then we will fit a Simple Linear Regression model to the data, which we can use to make predictions about the relationship between children’s height and weight.

Figure without Simple Linear Regression

Figure with Simple Linear Regression
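
The code behind the two figures above is not shown, so the following is only a minimal sketch of how similar plots might be drawn with base R graphics. It assumes the UsingR package is installed; the axis labels, titles, colour, and the name `fit` are illustrative choices rather than the original settings.

```r
# Assumption: the UsingR package (which provides kid.weights) is installed
library(UsingR)
data(kid.weights)

# Figure without Simple Linear Regression: a plain scatter plot
plot(weight ~ height, data = kid.weights,
     xlab = "Height", ylab = "Weight",
     main = "Children's height vs weight")

# Figure with Simple Linear Regression: same scatter plot plus the fitted line
fit <- lm(weight ~ height, data = kid.weights)  # `fit` is a hypothetical name
plot(weight ~ height, data = kid.weights,
     xlab = "Height", ylab = "Weight",
     main = "Children's height vs weight with fitted line")
abline(fit, col = "red", lwd = 2)  # draws intercept + slope * height
```

Here `abline` accepts the fitted model directly and draws the line defined by its two coefficients.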

Analysis of Figures and Model

We can now see a clear relationship between children’s height and weight with the fitted Simple Linear Regression model.

While a rough prediction can be made using the figure alone, a more accurate prediction can be made if we know the formula of our model’s line.

Finding the Formula of our Model

To find the formula of our model we can use:

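# Load the package that provides kid.weights (assumed to be installed)
library(UsingR)
# Fit weight as a linear function of height, then show the coefficients
line <- lm(weight ~ height, data = kid.weights)
coef(line)
## (Intercept)      height 
##  -31.341912    1.909044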

Therefore, our Simple Linear Regression line has the formula: \[ y = -31.341912 + 1.909044x \]
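
As a quick worked example of applying this formula by hand, the sketch below plugs a single height into the line using the coefficients reported above. The height of 40 and the variable names are hypothetical values chosen purely for illustration.

```r
# Coefficients copied from the coef(line) output above
intercept <- -31.341912
slope     <- 1.909044

# Hypothetical height chosen for illustration
example_height <- 40

# y = intercept + slope * x
predicted_weight <- intercept + slope * example_height
print(predicted_weight)
```

For a height of 40, the line gives a predicted weight of about 45.02, in the same units as the data set.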

Using our Formula to make a Prediction

With our model’s formula, we can predict the weight of children with given heights and create a plottable data frame:

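# Heights at which to predict a weight
PredHeights <- seq(36, 54)
# Apply the fitted line: intercept + slope * height
PredWeights <- coef(line)["(Intercept)"] + coef(line)["height"] * PredHeights
Predictions <- data.frame(PredHeights, PredWeights)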

This data frame can then be plotted with plotly to make an interactive plot that will allow us to see the specific weight prediction for a given height.
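
The plotting code itself is not shown, so the following is a hedged sketch of one way such an interactive plot might be built. It assumes the plotly package is installed, rebuilds the data frame from the coefficients reported above so that it runs on its own, and uses illustrative axis titles and a hypothetical object name `fig`.

```r
# Assumption: the plotly package is installed
library(plotly)

# Rebuild the predictions from the reported coefficients so the sketch is self-contained
PredHeights <- seq(36, 54)
PredWeights <- -31.341912 + 1.909044 * PredHeights
Predictions <- data.frame(PredHeights, PredWeights)

# Interactive scatter plot; hovering over a point shows its height and predicted weight
fig <- plot_ly(Predictions, x = ~PredHeights, y = ~PredWeights,
               type = "scatter", mode = "markers")
fig <- layout(fig,
              xaxis = list(title = "Height"),
              yaxis = list(title = "Predicted weight"))
fig
```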

Figure of Predictions

Verify Predictions are Accurate

We can verify that our predictions agree with our original model line by plotting both our predicted points and our Simple Linear Regression line.
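
Alongside the visual check, the agreement can also be tested numerically. The sketch below is one possible approach, not necessarily the one used for the figure: it compares the hand-computed predictions with those returned by R's `predict` function and then overlays the points on the fitted line. It assumes the UsingR package is installed; `ModelWeights` and `same` are hypothetical names.

```r
# Assumption: the UsingR package (which provides kid.weights) is installed
library(UsingR)
data(kid.weights)

line <- lm(weight ~ height, data = kid.weights)

# Predictions computed by hand from the coefficients
PredHeights <- seq(36, 54)
PredWeights <- coef(line)["(Intercept)"] + coef(line)["height"] * PredHeights

# Predictions computed by R from the fitted model
ModelWeights <- predict(line, newdata = data.frame(height = PredHeights))

# TRUE when both sets of predictions agree within numerical tolerance
same <- isTRUE(all.equal(unname(PredWeights), unname(ModelWeights)))
print(same)

# Visual check: the predicted points should fall on the regression line
plot(weight ~ height, data = kid.weights, xlab = "Height", ylab = "Weight")
abline(line, col = "red", lwd = 2)
points(PredHeights, PredWeights, pch = 19, col = "blue")
```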

Figure of Predictions and Simple Linear Regression Line

Conclusions

As we can see from the figures, Simple Linear Regression can be used to model the linear relationship between an independent variable and a dependent variable. The line created from the model can then be used to predict dependent variable values for given independent variable values.