2023-02-11

Simple Linear Regression

Simple linear regression is a statistical technique used to predict or estimate one quantitative variable from another quantitative variable.

There are two variables: the “y” variable, which is the dependent variable (the variable to be predicted or estimated), and the “x” variable, which is the independent variable (the variable that explains y).

Simple Linear Regression

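The linear regression technique consists of fitting the equation of a straight line, which may slope upward or downward.

If both variables move in the same direction (y increases when x increases, and decreases when x decreases), then x and y are said to have a direct relationship. If one increases while the other decreases, the relationship is inverse.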

R-Squared

R-squared measures how well a regression model fits the actual data. More precisely, it is the proportion of the variation in y that is explained by the model. R-squared is also known as the coefficient of determination.

A value close to 1 means that the model fits the data well; a value close to 0 means that it explains little of the variation in y.
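As a rough illustration of what this number is, the sketch below computes R-squared in two ways: by reading it from `summary()` and from the definition \(R^2 = 1 - SS_{res}/SS_{tot}\). It assumes R's built-in `cars` dataset (columns `speed` and `dist`), which is not the data used in these slides; the variable names are chosen only for this example.

```r
# Illustrative only: uses R's built-in `cars` dataset, not the data from these slides.
fit <- lm(dist ~ speed, data = cars)

# R-squared as reported by summary()
r2_reported <- summary(fit)$r.squared

# The same quantity computed from its definition
ss_res <- sum(residuals(fit)^2)                  # variation left unexplained by the line
ss_tot <- sum((cars$dist - mean(cars$dist))^2)   # total variation in y
r2_manual <- 1 - ss_res / ss_tot

# Both routes should agree up to floating-point error
all.equal(r2_reported, r2_manual)
```

The manual version makes it visible that R-squared compares the errors of the fitted line with the errors of simply predicting the mean of y.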

Simple Linear Regression

Suppose we are given some values of y with respect to x, and we are asked to apply simple linear regression. The data looks as follows.

Simple Linear Regression

Our goal is to come up with the equation of the line that is closest to the points, that is, the line with the smallest total error between the points and the line.

Such a line is only an approximation of the data.

Simple Linear Regression

Since the approximation is supposed to be a line, it can be defined by the formula

model: \(\text{y} = \alpha + \beta \text{x}\)
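One common way to make "least errors" precise is ordinary least squares, which picks the values of \(\alpha\) and \(\beta\) that minimize the sum of squared vertical distances between the points and the line. The notation below (\(n\) points \((x_i, y_i)\), means \(\bar{x}\) and \(\bar{y}\), hats for estimates) is introduced here for illustration and does not appear in the original slides.

```latex
\[
\min_{\alpha,\,\beta} \; \sum_{i=1}^{n} \bigl( y_i - (\alpha + \beta x_i) \bigr)^2
\]
\[
\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
\]
```

The second line gives the closed-form solution: the slope is the covariation of x and y divided by the variation of x, and the intercept places the line through the point of means.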

To estimate the values of \(\alpha\) and \(\beta\), R provides the lm() function:

variable <- lm(responseVariableY ~ predictorX, data = dataset)
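A minimal sketch of this call on real data, assuming R's built-in `cars` dataset stands in for `dataset`, with `dist` as the response and `speed` as the predictor. These names are not from the slides; they only show where the intercept and slope end up.

```r
# Illustrative only: the built-in `cars` dataset stands in for `dataset`.
model <- lm(dist ~ speed, data = cars)

# coef() returns a named vector: "(Intercept)" is alpha, "speed" is beta
estimates <- coef(model)
alpha <- estimates[["(Intercept)"]]
beta  <- estimates[["speed"]]

print(estimates)
```

Calling `summary(model)` on the same object also reports standard errors and R-squared alongside the coefficients.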

The ggplot2 package also lets us add another layer, geom_smooth(), which draws the regression line on top of the scatter plot of the points.

graph + geom_smooth(method = "lm", se = FALSE)
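For context, here is one way the `graph` object might be built before the layer is added. This is a sketch that assumes the ggplot2 package is installed and again uses the built-in `cars` dataset in place of the slide data; `p` is a name chosen for this example.

```r
library(ggplot2)

# Illustrative only: built-in `cars` dataset, not the data from these slides.
# `graph` is the scatter plot of the points.
graph <- ggplot(cars, aes(x = speed, y = dist)) +
  geom_point()

# method = "lm" draws the least-squares line; se = FALSE hides the confidence band
p <- graph + geom_smooth(method = "lm", se = FALSE)
print(p)
```

Leaving `se` at its default of `TRUE` would instead shade a confidence band around the line.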

Simple Linear Regression

The formula for the regression line of our set of points is

model: \(\text{y} = 0.4667 + 0.9152 \text{x}\)
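Once the line is known, predicting y for a new x is plain arithmetic. The sketch below wraps the fitted equation in a small helper; `predict_y` and the input value x = 5 are hypothetical and chosen only to show the calculation.

```r
# Coefficients taken from the fitted line: y = 0.4667 + 0.9152 x
alpha <- 0.4667
beta  <- 0.9152

# Hypothetical helper: predicted y for a given x (works on vectors too)
predict_y <- function(x) alpha + beta * x

predict_y(5)        # 0.4667 + 0.9152 * 5 = 5.0427
predict_y(c(0, 1))  # the intercept, then the intercept plus one slope step
```

The slope 0.9152 is the change in the predicted y for each one-unit increase in x, and the intercept 0.4667 is the predicted y at x = 0.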