2022-11-13

Simple Linear Regression

Simple linear regression is a statistical method for summarizing and studying the relationship between two continuous (quantitative) variables.

One variable, the predictor, is used to predict the other, the response.

Deterministic Relationships

Examples of deterministic relations are:

Hooke’s Law: \(Y = \alpha + \beta X\),

where Y = amount of stretch in a spring, and X = applied weight.

Currency Conversions: \(U = \frac {C} {1.33}\),

where C = Canadian dollars, and U = US dollars.

Area of A Circle: \(A = \pi r^2\),

where A = area, and r = radius.


In each case, the equation describes the relationship between the predictor and response variables exactly.
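As a small illustration, two of the relations above can be written as R functions. This is only a sketch: the function names `circle_area` and `cad_to_usd` are made up for this example, and the 1.33 rate is simply the one quoted in the text.

```r
# Deterministic relationships: each input maps to exactly one output, with no scatter.
circle_area <- function(r) pi * r^2       # A = pi * r^2
cad_to_usd  <- function(cad) cad / 1.33   # U = C / 1.33 (rate taken from the text)

radii <- c(1, 2, 3)
areas <- circle_area(radii)

# Re-evaluating the same inputs always reproduces the same outputs.
same_again <- identical(circle_area(radii), areas)
same_again
```

Because the output is fully determined by the input, there is nothing to estimate here, which is what separates these relations from the statistical ones discussed next.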

Statistical Relationships

A statistical relationship is described by a trend between two continuous variables. The trend exists, but the data points scatter around it rather than falling exactly on it. The trend is represented by the line of best fit:

\(\hat{y}_i = b_0 + b_1x_i\)

\(\hat{y}_i\) denotes the predicted response for experimental unit i
\(x_i\) denotes the predictor value for experimental unit i
\(b_0\) denotes the y-intercept
\(b_1\) denotes the slope, the change in the predicted response for a one-unit increase in the predictor
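To make the notation concrete, the sketch below evaluates the line on a tiny made-up data set. Both the data and the coefficients `b0 = 0` and `b1 = 2` are hypothetical values chosen for illustration; they are not estimated from anything in this post.

```r
# Hypothetical toy data: the trend is roughly linear, but the points scatter around it.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Hypothetical coefficients, picked by eye for illustration only.
b0 <- 0
b1 <- 2

y_hat <- b0 + b1 * x   # predicted response for each experimental unit i
gap   <- y - y_hat     # how far each observed response sits from the line

data.frame(x, y, y_hat, gap)
```

The nonzero entries in `gap` are the scattering effect described above: the line captures the trend, not every individual point.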

Representing Line of Best Fit Over Data Set

Computing \(b_0\) And \(b_1\)

Using the formulas below, we can compute \(b_1\) and \(b_0\), where \(n\) is the number of observations and each sum runs over all of them:

\(b_1 = \frac {n \sum {x_i y_i} - \sum{x_i}\sum{y_i}} {n \sum{x_i^2} - (\sum{x_i})^2}\)

\(b_0 = \bar {y} - b_1 \bar {x}\)
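
As a quick hand check of these formulas, consider a made-up data set with x = (1, 2, 3, 4) and y = (2, 4, 5, 8). These numbers are purely illustrative and have nothing to do with the cars data. Here n = 4, and the sums are:

\(\sum x = 10, \quad \sum y = 19, \quad \sum xy = 57, \quad \sum x^2 = 30\)

\(b_1 = \frac {4(57) - (10)(19)} {4(30) - 10^2} = \frac {38} {20} = 1.9\)

\(b_0 = \frac {19} {4} - 1.9 \cdot \frac {10} {4} = 4.75 - 4.75 = 0\)

So the line of best fit for this toy data set would be \(\hat{y}_i = 1.9x_i\).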


Over the Cars Dataset, Where Y = Price and X = Engine Size

##   enginesize price
## 1        130 13495
## 2        130 16500
## 3        152 16500
## 4        109 13950
## 5        136 17450
## 6        136 15250

Calculating \(b_0\) And \(b_1\) in R

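# carData is assumed to be loaded already (only its first rows are shown above)
n <- nrow(carData)        # number of observations
x <- carData$enginesize   # predictor
y <- carData$price        # response
b1 <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                          # intercept
b1; b0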
## [1] 167.6984
## [1] -8005.446
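
One way to sanity-check the hand-computed coefficients is to compare them with R's built-in `lm()`. The sketch below uses only the six rows printed earlier (stored in a data frame named `cars6`, a name invented for this example), so its coefficients will differ from the full-data values reported above. The point is only that the two approaches agree with each other.

```r
# First six rows as printed above; the full carData is not reproduced here.
cars6 <- data.frame(
  enginesize = c(130, 130, 152, 109, 136, 136),
  price      = c(13495, 16500, 16500, 13950, 17450, 15250)
)

n <- nrow(cars6)
x <- cars6$enginesize
y <- cars6$price

# Closed-form estimates, using the same formulas as in the text.
b1 <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
b0 <- mean(y) - b1 * mean(x)

# Built-in least-squares fit for comparison.
fit <- lm(price ~ enginesize, data = cars6)

manual  <- c(b0, b1)
builtin <- unname(coef(fit))   # intercept first, then slope

all.equal(manual, builtin)
```

In practice `lm()` is usually the more convenient route, since the fitted object also works with helpers such as `summary()` and `predict()`.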

Calculated Linear Regression in ggplot2

Representing Multiple Linear Regressions

Sources