Harold Nelson
2026-06-30
This is a small set of exercises to develop your intuition about influence: how much a single unusual point can change a fitted regression line.
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.1 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.3 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Here’s some basic data to get started.
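The chunk that builds the data is not shown in this rendering. A stand-in with the same shape might look like the sketch below. The x values 1 to 20 match the output printed later in this handout, but the seed, the noise level and the formula for y are assumptions, so the y values will not match the ones shown here.

```r
# Stand-in data for the exercises.
# x = 1:20 matches the handout; the seed and the noise are assumptions.
set.seed(123)                              # hypothetical seed
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, mean = 0, sd = 2) # y roughly follows x, plus noise
head(df)
```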
Create the model base using lm(). Store its coefficients in the vector base_coeff.
## (Intercept) x
## 0.6201850 0.9679107
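One possible way to do this step is sketched below. It rebuilds stand-in data first (the seed and noise level are assumptions), so the coefficients it prints will differ from the ones above.

```r
# Stand-in data; the seed and the noise are assumptions, not the handout's data.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)

# Fit the baseline model and keep its coefficients as a named vector.
base <- lm(y ~ x, data = df)
base_coeff <- coef(base)
base_coeff
```

coef() returns a named numeric vector, so the intercept and slope can be subtracted from another model's coefficients element by element later on.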
Create a scatterplot of x and y with an lm smoother. Don’t put the aes() in the ggplot() call. Put the aes() in the geom_point() and also in the geom_smooth().
## `geom_smooth()` using formula = 'y ~ x'
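A sketch of one way to build this plot follows, again on stand-in data with an assumed seed and noise level. Each geom carries its own aes(), which leaves room to add layers that use a different y variable later in the exercises.

```r
library(ggplot2)

# Stand-in data; the seed and the noise are assumptions.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)

# No aes() in ggplot(); each geom gets its own mapping.
p <- ggplot(df) +
  geom_point(aes(x = x, y = y)) +
  geom_smooth(aes(x = x, y = y), method = "lm")
p
```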
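Create a new variable y_ne (Northeast) in df. It should equal y everywhere except at the last point, where it is the original value plus 30. This puts an outlier in the upper-right (northeast) corner of the plot and leaves y itself unchanged. Use ifelse(x == 20,…,…).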
Examine the tail of df to verify your work.
## x y y_ne
## 15 15 13.88832 13.88832
## 16 16 19.57383 19.57383
## 17 17 17.99570 17.99570
## 18 18 14.06677 14.06677
## 19 19 20.40271 20.40271
## 20 20 19.05442 49.05442
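One way to get a table like the one above is sketched here with dplyr's mutate(). The data are a stand-in (assumed seed and noise level), so only the pattern will match: y_ne equals y in every row except the last, where it is 30 larger.

```r
library(dplyr)

# Stand-in data; the seed and the noise are assumptions.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)

# Copy y into y_ne, adding 30 only where x is 20.
df <- df |>
  mutate(y_ne = ifelse(x == 20, y + 30, y))

tail(df)
```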
Create the model ne using x and y_ne. Save its coefficients in ne_coeff.
## (Intercept) x
## -2.379815 1.396482
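The model with the outlier can be fit the same way as the base model. The sketch below uses stand-in data (assumed seed and noise level) and builds y_ne in base R so that the block needs no packages. Its coefficients will not match the ones printed above.

```r
# Stand-in data; the seed and the noise are assumptions.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)
df$y_ne <- ifelse(df$x == 20, df$y + 30, df$y)

# Same model form as base, with y_ne as the response.
ne <- lm(y_ne ~ x, data = df)
ne_coeff <- coef(ne)
ne_coeff
```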
Create the vector ne_diff by subtracting base_coeff from ne_coeff.
## (Intercept) x
## -3.0000000 0.4285714
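The difference above does not depend on the y values at all. Least-squares coefficients are linear in y, so adding 30 to one point changes them by an amount that depends only on x and the size of the shift. The sketch below checks this on stand-in data (assumed seed and noise level) and compares the result with the closed-form change for simple regression. The names d, x0, slope_change and intercept_change are illustrative only.

```r
# Stand-in data; the seed and the noise are assumptions.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)
df$y_ne <- ifelse(df$x == 20, df$y + 30, df$y)

base_coeff <- coef(lm(y ~ x, data = df))
ne_coeff   <- coef(lm(y_ne ~ x, data = df))
ne_diff    <- ne_coeff - base_coeff
ne_diff

# Closed-form change when one y value at x = x0 is shifted by d:
#   slope change     = d * (x0 - mean(x)) / sum((x - mean(x))^2)
#   intercept change = d / n - slope change * mean(x)
d  <- 30
x0 <- 20
slope_change     <- d * (x0 - mean(df$x)) / sum((df$x - mean(df$x))^2)
intercept_change <- d / nrow(df) - slope_change * mean(df$x)
c(intercept_change, slope_change)  # -3 and 0.4285714 when x is 1:20
```

With x running from 1 to 20, the mean of x is 10.5 and the sum of squared deviations is 665, so the slope change is 30 × 9.5 / 665 = 3/7 and the intercept change is 1.5 − 4.5 = −3.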
Complete the following sentence.
A positive outlier on the right …
A positive outlier on the right tilts the regression line up to the right, increasing the slope and decreasing the y-intercept.
Modify the existing scatterplot as follows.
Add color = "Black" to the existing geoms outside the aes().
Add a geom_point() for the x,y_ne points. Use an appropriate aes(). Add color = "Red".
Add a geom_smooth() for the x,y_ne points. Make this red also.
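A sketch of the finished plot follows, on stand-in data with an assumed seed and noise level. The color names are written in lower case here. Because color is set outside aes(), it is a fixed property of each layer and produces no legend.

```r
library(ggplot2)

# Stand-in data; the seed and the noise are assumptions.
set.seed(123)
df <- data.frame(x = 1:20)
df$y <- df$x + rnorm(20, sd = 2)
df$y_ne <- ifelse(df$x == 20, df$y + 30, df$y)

p2 <- ggplot(df) +
  # Original points and fit in black.
  geom_point(aes(x = x, y = y), color = "black") +
  geom_smooth(aes(x = x, y = y), method = "lm", color = "black") +
  # Points and fit with the northeast outlier in red.
  geom_point(aes(x = x, y = y_ne), color = "red") +
  geom_smooth(aes(x = x, y = y_ne), method = "lm", color = "red")
p2
```

The red and black points coincide everywhere except at x = 20, so the gap between the two lines shows the effect of that single point.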