library(ggplot2)
library(dplyr)
library(tidyr)
library(skimr)
# library(gapminder) for data set
# library(moderndive) for data set
First, let’s look at the first few rows of the data set:
head(moderndive::evals)
To keep things simple, let’s select only the relevant columns from the full data set and store them as a subset:
eval_ch5 <- moderndive::evals |>
  select(ID, score, age, bty_avg)
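As a quick sanity check on the subset, something like the following could be run. It is only a sketch: it assumes the moderndive and dplyr packages are installed, and it rebuilds eval_ch5 so that the block runs on its own.

```r
library(dplyr)

# Rebuild the subset so this block is self-contained
eval_ch5 <- moderndive::evals |>
  select(ID, score, age, bty_avg)

# Number of rows and columns in the subset
dim(eval_ch5)

# Column names, in the order they were selected
names(eval_ch5)

# Compact overview of column types and the first few values
glimpse(eval_ch5)
```

Checking the dimensions and names right after select() makes it easy to spot a misspelled or missing column before any summaries are computed.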
Let’s check how the data are distributed:
summary(eval_ch5[c("score", "age", "bty_avg")])
     score            age           bty_avg
 Min.   :2.300   Min.   :29.00   Min.   :1.667
 1st Qu.:3.800   1st Qu.:42.00   1st Qu.:3.167
 Median :4.300   Median :48.00   Median :4.333
 Mean   :4.175   Mean   :48.37   Mean   :4.418
 3rd Qu.:4.600   3rd Qu.:57.00   3rd Qu.:5.500
 Max.   :5.000   Max.   :73.00   Max.   :8.167
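The six values that summary() reports for each column can also be computed one at a time, which helps make clear what each row of the output means. The sketch below does this for score only; it assumes moderndive and dplyr are installed, and the name score_stats is just an illustrative choice. Because summary() rounds its output for display, the values here may show more digits.

```r
library(dplyr)

# Rebuild the subset so this block is self-contained
eval_ch5 <- moderndive::evals |>
  select(ID, score, age, bty_avg)

# Minimum, quartiles, mean, and maximum of the outcome variable
score_stats <- eval_ch5 |>
  summarize(
    min    = min(score),
    q1     = quantile(score, 0.25, names = FALSE),
    median = median(score),
    mean   = mean(score),
    q3     = quantile(score, 0.75, names = FALSE),
    max    = max(score)
  )

score_stats
```

The same pattern works for age or bty_avg by swapping the column name inside summarize().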
Next, let’s look more closely at just the outcome variable, teaching score (score), and the explanatory variable, average beauty score (bty_avg), by selecting them and piping the result into the skim() function:
eval_ch5 |>
  select(score, bty_avg) |>
  skim()
── Data Summary ────────────────────────
                           Values
Name                       select(eval_ch5, score, b...
Number of rows             463
Number of columns          2
_______________________
Column type frequency:
  numeric                  2
________________________
Group variables            None