2024-02-05

Introduction to Simple Linear Regression

  • Definition and Purpose: A statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables

  • Assumptions:

    • Linearity
    • Independence
    • Normality of Residuals
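One way to check these assumptions in practice is with residual diagnostics. The sketch below is illustrative only: it assumes a regression of Volume on Girth in R's built-in `trees` data, a model chosen here for the example and not taken from the slides.

```r
# Hypothetical example model: Volume regressed on Girth (built-in `trees` data).
fit <- lm(Volume ~ Girth, data = trees)
res <- resid(fit)

# Linearity: residuals vs fitted values should show no systematic curve.
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Independence: mostly a property of how the data were collected, but
# plotting residuals in observation order can reveal drift or runs.
plot(res, type = "b", xlab = "Observation order", ylab = "Residuals")

# Normality of residuals: Q-Q plot plus a Shapiro-Wilk test.
qqnorm(res)
qqline(res)
sw <- shapiro.test(res)
print(sw)
```

A small Shapiro-Wilk p-value suggests the residuals depart from normality; the plots are usually more informative than the test alone.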

Plotly 3D scatterplot

library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
data(trees)
# Assign each variable to the axis whose title names it
x <- trees$Girth
y <- trees$Height
z <- trees$Volume

xax <- list(title = 'Girth')
yax <- list(title = 'Height')
zax <- list(title = 'Volume')

plot_ly(x=x, y=y, z=z, type='scatter3d', mode='markers') %>%
   layout(
      title = 'Volume vs Height and Girth',
      scene=list(xaxis=xax, yaxis=yax, zaxis=zax))

ggplot Scatterplot 1

ggplot Scatterplot 2

R^2 Formula

  • R^2 Value: 0.8345
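For reference, the standard definition of R^2 for a least-squares fit can be written as follows. The symbols are assumptions introduced here: y_i are the observed responses, \hat{y}_i the fitted values from the regression line, and \bar{y} the mean of the observed responses.

```latex
R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}
    = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
               {\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

In simple linear regression with an intercept, this equals the square of the sample correlation between x and y.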

Equation of the Regression Line

y = α + βx

  • Equation of the Line: y = 17.40 + 0.107x
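The slides do not show how this line was fitted. A minimal sketch of one way to obtain such an equation in R is given below. It assumes the data are R's built-in `Orange` dataset, with `circumference` as y and `age` as x; that choice is a guess based on the variables named in the conclusion, not something the source states.

```r
# Assumption: the figures come from the built-in `Orange` data
# (columns `age` and `circumference`); the source does not name the dataset.
fit <- lm(circumference ~ age, data = Orange)

alpha <- unname(coef(fit)["(Intercept)"])  # intercept of the fitted line
beta  <- unname(coef(fit)["age"])          # slope of the fitted line
r_squared <- summary(fit)$r.squared        # proportion of variance explained

print(round(c(alpha = alpha, beta = beta, r_squared = r_squared), 4))
```

If the assumption about the dataset holds, the printed intercept, slope, and R^2 should agree with the values quoted on these slides after rounding.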

Conclusion

  • The R^2 value of 0.8345 is close to 1, which indicates a strong linear relationship between age and circumference: the fitted line accounts for about 83% of the variation in the response.

  • The equation of the line lets us estimate either age or circumference given the other.
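As a small worked example, the line y = 17.40 + 0.107x can be used in both directions. The sketch assumes x is age and y is circumference, which the slides imply but do not state; the function names and the input value 1000 are hypothetical.

```r
alpha <- 17.40  # intercept quoted on the slide
beta  <- 0.107  # slope quoted on the slide

# Assumption: x = age, y = circumference.
predict_circumference <- function(age) alpha + beta * age

# Inverting the same line to go from circumference back to age.
predict_age <- function(circumference) (circumference - alpha) / beta

circ <- predict_circumference(1000)  # 17.40 + 0.107 * 1000 = 124.4
age  <- predict_age(circ)            # recovers the age of 1000

print(c(circumference = circ, age = age))
```

Inverting the fitted line is a convenient shortcut. It is not the same as regressing age on circumference, which would generally give a slightly different line.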