2026-03-07

What is Simple Linear Regression

Simple linear regression applies to a dataset with two numeric variables that can be plotted against each other. What the data is about matters less than the fact that it can be plotted. Given the plotted points, we look for the straight line that best fits them, so that the line represents a kind of average trend through the data.

Needed equation and explanation

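Line in Linear Regression equation: \[y = mx + b\] This is the basic equation of a straight line, where \(m\) is the slope, \(x\) is the x value, and \(b\) is the y-intercept. The slope and intercept are estimated from the dataset: they are chosen so that the line is, on average, as close as possible to all of the points. The usual criterion is least squares, which minimizes the sum of the squared vertical distances between the points and the line.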
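As a small sketch of how \(m\) and \(b\) could be computed, assuming the standard least-squares formulas \(m = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(b = \bar{y} - m\bar{x}\): the `x` and `y` values below are made up for illustration and are not from any dataset in these slides. The result is compared against R's built-in `lm()`.

```r
# Made-up example data, for illustration only (not from the slides).
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Least-squares slope: how x and y vary together, divided by how x varies.
m <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)

# Intercept: chosen so the line passes through the point (mean(x), mean(y)).
b <- mean(y) - m * mean(x)

# Fit the same line with lm() to check the hand computation.
fit <- lm(y ~ x)

print(c(slope = m, intercept = b))
print(coef(fit))
```

The two printed results should agree, since `lm()` fits the same kind of line by least squares.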

In the following slides I will give examples with datasets to show what this looks like.

Example plot with Air Quality without Regression Line

Example plot with Air Quality with Regression Line

R code used to make the graphs on the previous 2 slides

Graph without the regression line, using ggplot:

library(ggplot2)

ggplot(data = airquality, aes(x = Day, y = Ozone)) + geom_point()

Graph with the regression line, using ggplot:

ggplot(data = airquality, aes(x = Day, y = Ozone)) + geom_point() + geom_smooth(method = "lm", se = FALSE)

Explanation of previous graphs

The first graph plots the Ozone readings against the day of the month. The points are widely scattered and do not show a clear pattern.

The second graph shows the same data with the regression line added. As described before, the line has the form \(y = mx + b\). Here \(m\) is negative because the line slopes downward, and the intercept \(b\) appears to be around \(b = 45\).

The line looks this way compared with the plotted points because the points are so scattered: the regression line has to pass between them, acting as an average of the data.
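Rather than reading the slope and intercept off the plot, they can be checked numerically. The following is a minimal sketch, assuming the line drawn with `method = 'lm'` corresponds to the model `Ozone ~ Day`. The variable names `fit`, `coefs`, `m`, and `b` are my own choices.

```r
# airquality ships with base R (datasets package), so no extra packages are needed.
# lm() drops rows with a missing Ozone value by default.
fit <- lm(Ozone ~ Day, data = airquality)

# Named vector holding the intercept (b) and the slope for Day (m).
coefs <- coef(fit)
b <- coefs[["(Intercept)"]]
m <- coefs[["Day"]]

print(coefs)
cat("Slope is negative:", m < 0, "\n")
```

Calling `summary(fit)` would additionally show how much of the scatter the line explains, which is a useful check when the points look as spread out as they do here.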

Example plot with Pressure

References