Harold Nelson
2026-06-21
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.1 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.3 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
You can use pivot_longer() even if it doesn’t make analytical sense. As an exercise, do the following:
Start with the cdc dataframe.
Add a variable ID =1:nrow(cdc).
Use pivot longer to place the values of height and weight in a new column called metric. Call the new dataframe nonsense.
Keep only gender, age, metric and value.
Use glimpse() to verify your work.
load("cdc.Rdata")
cdc = cdc %>%
mutate(ID = 1:nrow(cdc))
nonsense = cdc %>%
pivot_longer(cols = c(height,weight),
names_to = "metric",
values_to = "value") %>%
select(ID,gender,age,metric,value)
head(nonsense)## # A tibble: 6 × 5
## ID gender age metric value
## <int> <fct> <int> <chr> <dbl>
## 1 1 m 77 height 70
## 2 1 m 77 weight 175
## 3 2 f 33 height 64
## 4 2 f 33 weight 125
## 5 3 f 49 height 60
## 6 3 f 49 weight 105
Note that every row in the original dataframe has turned into two rows in the longer dataframe.
We now have a problem if we want to compute the BMI. It requires an algebraic expression in height and weight. To do this, height and weight have to be in the same row of the dataframe.
Use pivot_wider() to undo nonsense and turn it back into sense. We just need to reverse what we did. Use head() to verify your work.
## # A tibble: 6 × 5
## ID gender age height weight
## <int> <fct> <int> <dbl> <dbl>
## 1 1 m 77 70 175
## 2 2 f 33 64 125
## 3 3 f 49 60 105
## 4 4 f 42 66 132
## 5 5 f 55 61 150
## 6 6 f 55 64 114
If we hadn’t created the ID variable, this would have failed.
Comments
The focus of this chapter in Modern Dive is making a dataframe “tidy” by pivotting to a longer version.
The pivot_longer() command has three arguments.
The columns to pivot, frequently specified by excluding one or more columns.
A column in the new, longer dataframe, which will contain the names of the existing column names. This is the parameter names_to.
A column in the new, longer dataframe, which will contain the values in the existing columns, which are to be pivoted. This is the parameter values_to.