2026-04-10

Introduction

  • Data science allows us to find useful patterns in day-to-day information
  • The question for this scenario: can we predict the protein in a meal from its total calories alone?
  • To answer it, we will fit a Simple Linear Regression and examine the relationship between the two variables

Statistical Model

  • Simple Linear Regression models the linear relationship between two continuous variables
  • The model we work with is the following: \[y = \beta_0 + \beta_1x + \epsilon\]
  • \(y\) is the observed protein value (the response)
  • \(x\) is the total calories (the predictor)
  • \(\beta_0\) is the y-intercept
  • \(\beta_1\) is the slope
  • \(\epsilon\) is the random error term
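
To make the role of each term concrete, here is a minimal sketch that simulates data from the model. The parameter values, the calorie grid, and the normally distributed error are all assumptions chosen for illustration; none of them come from the nutrition data.

```r
set.seed(42)                       # make the random draws reproducible

beta0 <- 8                         # hypothetical true intercept
beta1 <- 0.08                      # hypothetical true slope
x <- seq(100, 900, by = 100)       # hypothetical calorie values

# Random error: assumed here to be normal with mean 0 and sd 5
epsilon <- rnorm(length(x), mean = 0, sd = 5)

# Observed protein = straight-line part + random error
y <- beta0 + beta1 * x + epsilon

data.frame(calories = x, line = beta0 + beta1 * x, protein = y)
```

The `line` column is the part of the model we try to recover; the gap between `line` and `protein` in each row is that row's error.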

Estimated Equation

  • Since we only have a sample, we will use our data to calculate an estimated regression line \[\hat{y} = b_0 + b_1x\]
  • \(\hat{y}\) is our predicted protein value
  • \(b_0\) is our estimated intercept
  • \(b_1\) is our estimated slope
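
For a simple linear regression, the least-squares estimates have a closed form: \(b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(b_0 = \bar{y} - b_1\bar{x}\). The sketch below computes them by hand and compares them with `lm()`. The five data points are made up for illustration and are not the nutrition data.

```r
# Hypothetical toy data, NOT the nutrition_data used in these slides
calories <- c(200, 350, 500, 650, 800)
protein  <- c(22, 35, 45, 58, 70)

# Closed-form least-squares estimates
b1 <- sum((calories - mean(calories)) * (protein - mean(protein))) /
  sum((calories - mean(calories))^2)
b0 <- mean(protein) - b1 * mean(calories)

# lm() fits the same line, so its coefficients should match
fit <- lm(protein ~ calories)

c(b0 = b0, b1 = b1)
coef(fit)
```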

Fitting the Model

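# Fit protein as a linear function of calories
macro_model <- lm(Protein ~ Calories, data = nutrition_data)
# Extract the estimated intercept (b0) and slope (b1)
coef(macro_model)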
## (Intercept)    Calories 
##  7.86258729  0.07629223
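
Plugging the printed coefficients into the estimated equation gives \(\hat{y} = 7.8626 + 0.0763x\). As a rough sketch of how to use it, the helper below (`predict_protein`, a name introduced here for illustration) applies that equation to a hypothetical 500-calorie meal.

```r
# Coefficients copied from the coef(macro_model) output above
b0 <- 7.86258729   # estimated intercept
b1 <- 0.07629223   # estimated slope (protein per calorie)

# Hypothetical helper: predicted protein for a given calorie count
predict_protein <- function(calories) {
  b0 + b1 * calories
}

predict_protein(500)   # 500 calories is an illustrative input
b1 * 100               # predicted change in protein per extra 100 calories
```

With these coefficients, a 500-calorie meal is predicted to contain about 46.0 units of protein, and each additional 100 calories raises the prediction by about 7.6, in whatever units the Protein column uses. With the fitted model object available, `predict(macro_model, newdata = data.frame(Calories = 500))` should give the same prediction.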

Data Visualization

Best Fit

Interactive