---
title: "Week 3"
subtitle: "Reshaping and Summarizing Data"
author: "Penelope Pooler Eisenbies"
date: last-modified
lightbox: true
toc: true
toc-depth: 3
toc-location: left
toc-title: "Table of Contents"
toc-expand: 1
format:
html:
code-line-numbers: true
code-fold: true
code-tools: true
execute:
echo: fenced
---
## RStudio Global General Options
```{r include=F}
#|label: setup
knitr::opts_chunk$set(echo=T, highlight=T) # specifies default options for all chunks
options(scipen=100) # suppress scientific notation
# install pacman if needed
if (!require("pacman")) install.packages("pacman", repos = "http://lib.stat.cmu.edu/R/CRAN/")
pacman::p_load(pacman, tidyverse, gridExtra, magrittr,
kableExtra) # install and load required packages
p_loaded() # verify loaded packages
```
#### Reminders from Week 2 and HW 2
#### HW 2 is Due Wednesday, 9/10/2025
::: fragment
**`dplyr` commands:**
:::
- `select` - used to select variables (columns) of a
dataset
- `slice` - used to select rows by row number
- `filter` - used to filter data rows by values of a
variable
- `mutate` - to create or transform a variable
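
As a quick refresher, the four verbs above can be chained in one pipeline. This is a minimal sketch on a small made-up tibble; the object and variable names (`toy`, `toy_small`, `height_m`) are hypothetical and not part of the course data.

```r
library(dplyr)

# Hypothetical toy data (not from the course files)
toy <- tibble(
  name   = c("A", "B", "C", "D"),
  height = c(150, 172, 181, 165),
  mass   = c(50, 80, 95, 60)
)

toy_small <- toy |>
  select(name, height) |>          # keep two columns
  slice(1:3) |>                    # keep rows 1-3 by row number
  filter(height > 160) |>          # keep rows where height exceeds 160
  mutate(height_m = height / 100)  # create a new variable

toy_small
```

Note that `slice` runs before `filter` here, so row D is dropped by position before any values are compared.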
::: fragment
**`ggplot` introduction:**
:::
- basic syntax and aesthetics statements (`aes`)
- creating a basic boxplot (`geom_boxplot`) or scatterplot
(`geom_point`)
- removing default background by modifying the theme
- adding a third categorical variable to color the data by
category
## Reordering variables
- In class and HW 2 we used `select` to reorder variables.
- Another option in the `dplyr` package is
[`relocate`](https://dplyr.tidyverse.org/reference/relocate.html){target="_blank"}
::::::: fragment
:::::: columns
::: {.column width="48%"}
```{r}
#|label: starwars numeric vars first
my_starwars <- starwars |>
select(1:11) |>
relocate(where(is.numeric)) |>
glimpse(width=40)
```
:::
::: {.column width="4%"}
:::
::: {.column width="48%"}
```{r}
#|label: starwars character vars first
my_starwars <- starwars |>
select(1:11) |>
relocate(where(is.character)) |>
glimpse(width=40)
```
:::
::::::
:::::::
## New Skills in Week 3 (and HW 3)
- Importing a 'clean' dataset
- After Quiz 1 we'll cover how to clean 'messy' data
- Creating a character or factor variable
- Coercing data to be a new data type
- e.g. character to numeric
- Grouping, summarizing, and filtering data
- Reshaping data for a summary table **OR** reshaping data
for a plot
## Preview of 'cleaning' messy data
- This week, we will introduce data from [Box Office
Mojo](https://www.boxofficemojo.com/){target="_blank"}
- We will work with the cleaned (usable) data
- First, a quick preview of one way to acquire and clean
data with no `download` option.
- These are proprietary data, but they can be used for
educational purposes according to the fair use
doctrine of the U.S. copyright statute.
- Steps:
- Select data from website and save as .csv file.
- Examine raw 'messy' data in .csv file.
- Remove non-data rows at the top with skip.
- Select variables and filter data rows.
- Remove nuisance characters like `$` and `,`.
- Clean and convert date information variables, if
present.
- Export and save a clean dataset.
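
The steps above can be sketched on a tiny in-memory example. This is only an illustration under stated assumptions: the text in `raw_txt` is made up to mimic a copied web table, and the names `raw_txt` and `toy_clean` are hypothetical. It assumes a recent version of `readr` that accepts literal text wrapped in `I()`.

```r
library(readr)
library(dplyr)

# Hypothetical raw text: two non-data lines, a header row,
# then data rows with $ and , characters in the Gross column
raw_txt <- 'Copied from a website
Some note that is not data
Date,Gross,Releases
Jan 1,"$1,250,000",35
Jan 2,"$980,500",34
'

toy_clean <- read_csv(I(raw_txt), skip = 2, show_col_types = FALSE) |>
  mutate(Gross = gsub("$", "", Gross, fixed = TRUE),  # remove $
         Gross = gsub(",", "", Gross, fixed = TRUE),  # remove ,
         Gross = as.numeric(Gross))                   # convert text to numeric

toy_clean
```

Using `fixed = TRUE` matters for `$`, which otherwise has a special meaning in regular expressions.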
##
::::::::: panel-tabset
### [Website]{style="color:blue;"}
Online data are often formatted for viewing, not for analysis.
Details that make online data easier to view have to be
removed for data management.
[Data Source: Box Office
Mojo](https://www.boxofficemojo.com/daily/2023/?view=year){target="_blank"}
### [Raw Data (.csv)]{style="color:blue;"}
Copying data from a website and saving them as a .csv file
(CSV UTF-8) removes most of the formatting, but data
cleaning is still required.
### [Import, Select, Filter]{style="color:blue;"}
- `read_csv` imports the raw data and skips the first 11
rows (above the var names).
- `filter` is used to filter out rows that don't contain
data.
- `select` is used to select only the variables we need.
- `rename` (new command) is used to make the variable
names easier to work with.
- `head` is one of many options for examining the data.
:::: fragment
::: r-fit-text
```{r}
#|label: import, select, filter, rename
bom23 <- read_csv("data/box_office_mojo_2023.csv", skip=11, show_col_types = FALSE) |>
filter(!is.na(Day)) |>
select(Date, `Top 10 Gross`, Gross, Releases, `#1 Release`) |>
rename(top10gross = `Top 10 Gross`,
num_releases=Releases, num1gross=Gross, num1 = `#1 Release`)
head(bom23)
```
:::
::::
### [Clean Numeric Data]{style="color:blue;"}
- The two **Gross** variables both contained `$` and `,`
symbols that were removed with `gsub` and `across`.
- Each variable was then converted to numeric with
`as.numeric`.
:::: fragment
::: r-fit-text
```{r}
#|label: clean numeric variables
bom23 <- bom23 |>
  mutate(across(.cols=top10gross:num1gross,
                ~gsub(pattern="$", replacement="", fixed=T, .)), # removes $ from 2 vars
         across(.cols=top10gross:num1gross,
                ~gsub(pattern=",", replacement="", fixed=T, .)), # removes , from 2 vars
         across(.cols=top10gross:num1gross, as.numeric))         # converts to numeric
head(bom23)
```
:::
::::
### [Dates]{style="color:blue;"}
- Dealing with dates used to be much more difficult prior
to development of the
[lubridate](https://lubridate.tidyverse.org/){target="_blank"}
package.
- Dates are still troublesome in other software
environments.
- Below we create a date variable from the provided
character variable, create other variables, examine
data, and export the dataset with `write_csv`.
:::: fragment
::: r-fit-text
```{r}
#|label: date example with lubridate
bom23 <- bom23 |>
mutate(date = dmy(paste(Date,"2023")), # year is required
# we paste it (add it as text) to each date
month = month(date, label=T, abbr=T), # month shown as 3 letter abbr.
day = wday(date, label=T, abbr=T), # weekday shown as 3 letter abbr.
quart = quarter(date)) |> # quarter shown as number
select(date, month, day, quart, top10gross:num1) |> # select and reorder variables
glimpse() |> # examine data
write_csv("data/Box_Office_Mojo_Week3_HW3.csv") # export using write_csv
```
:::
::::
:::::::::
## Importing Clean Data
- `read_csv` is used in this class
- External datasets should be saved as `.csv` files to
your project folder
- There are many CSV file options.
- Select **CSV UTF-8** when saving Excel datasets as
`.csv` files.
- `show_col_types=F` suppresses the output message from
importing data
- This option will be required when you create a
dashboard.
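
A minimal sketch of the export and import round trip described above. The tibble and the temporary file path are hypothetical stand-ins, so nothing is written to your project folder.

```r
library(readr)
library(tibble)

# Hypothetical small dataset
toy <- tibble(day = c("Mon", "Tue"), gross = c(1.5, 2.25))

tmp_file <- tempfile(fileext = ".csv")  # temporary path, used only for this sketch
write_csv(toy, tmp_file)                # export

toy_in <- read_csv(tmp_file, show_col_types = FALSE)  # import without the column-type message
toy_in
```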
:::: fragment
::: r-fit-text
```{r}
#|label: import clean data
mojo_23 <- read_csv("data/Box_Office_Mojo_Week3_HW3.csv", show_col_types=F) |>
glimpse(width=60)
```
:::
::::
## Week 3 In-class Exercises - Q1
[***Poll
Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} -
My User Name: **penelopepoolereisenbies685**
Notice that in the prior chunk, we use the command
`read_csv`
**True or False:**
`read_csv` and `read.csv` are the same and can be used
interchangeably to import data.
::: fragment
**Hint:** Here are three ways to determine this:
1. R help: In console type ?read_csv and/or type ?read.csv
and look through documentation
2. Google **R read_csv and read.csv**
3. Ask 'ChatGPT', 'Copilot', or another AI search engine.
:::
::: fragment
**Note:** R help files are sometimes hard to decipher and
**Googling** often requires time and effort but both are
excellent resources. AI search engines are getting better,
but are not always 100% accurate.
:::
##
::::::: panel-tabset
### [Categorical Data]{style="color:blue;"}
This dataset is **ALMOST** ready to work with, BUT there are
a few additional tasks to cover:
- Select all variables in dataset EXCEPT **`num1`** (name
of number 1 movie)
- We will work with text (character) variables after
Quiz 1
- Convert `month` to an ordinal factor, `monthF`
- Convert `day` (of the week) to an ordinal factor,
`wkdayF`, with Monday as 1st Day
- Change `wkdayF` labels to be
`M, T, W, Th, F, Sa, Su`
- Convert `quart` (Quarter) to an ordinal factor with text
labels (HW 3):
- In HW 3 you will:
- create a factor variable **`quartF`** with
- levels: 1,2,3,4.
- labels: "1st Qtr", "2nd Qtr", "3rd Qtr",
"4th Qtr" .
- create a publication quality table showing data
by week day and quarter.
### [Exclude `num1`]{style="color:blue;"}
Recall: We use `!` to exclude a variable or filter out
observations
::: r-fit-text
```{r}
#|label: exclude a variable
mojo_23_mod <- mojo_23 |> # save as new dataset
select(!num1) |> # excludes text variable num1
glimpse()
```
:::
### [Create Factors]{style="color:blue;"}
The `factor` command is used with `mutate` to create **TWO**
factor variables.

- `levels` option specifies **order**.
- `labels` option specifies **appearance of values**.
::: r-fit-text
```{r}
#|label: creating factor variables
mojo_23_mod <- mojo_23_mod |>
mutate(monthF = factor(month,
levels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec")),
wkdayF = factor(day,
levels=c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
labels= c("M", "T", "W", "Th", "F", "Sa", "Su"))) |>
glimpse()
```
:::
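
To see how `levels` and `labels` differ, here is a small base R sketch on a made-up vector of sizes. The names `size_chr` and `size_f` are hypothetical and unrelated to the Box Office Mojo data.

```r
# Hypothetical shirt sizes stored as text
size_chr <- c("M", "S", "L", "S")

size_f <- factor(size_chr,
                 levels = c("S", "M", "L"),               # sets the order
                 labels = c("Small", "Medium", "Large"))  # sets the displayed values

levels(size_f)      # order follows `levels`, text follows `labels`
as.integer(size_f)  # underlying integer codes
table(size_f)       # counts per level, in level order
```

The values in `levels` must match the text in the original variable exactly; any value that is not listed becomes `NA`.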
### [Examine Factors]{style="color:blue;"}
We can use `unique` or `summary` to examine the new
variables `monthF` and `wkdayF`.
- `unique` lists the distinct values of a factor and prints
its levels (categories) in the specified order
- `summary` of a factor variable shows the number of
observations in each level (category).
:::: fragment
::: r-fit-text
```{r }
#|label: Examine factor variables
mojo_23_mod |> pull(monthF) |> unique()
mojo_23_mod |> pull(wkdayF) |> unique()
mojo_23_mod |> select(month, monthF, day, wkdayF) |> summary()
```
:::
::::
:::::::
##
::::: panel-tabset
### [Numerical Data]{style="color:blue;"}
- The `mutate` command can contain many separate
statements.
- **Good practice:** Subdivide data management tasks into
multiple chunks so that each chunk is easily understood.
::: fragment
In the next chunk we will:
:::
- modify `top10gross` and `num1gross`:
- divide by `1000000` and `round` for presentation
purposes.
- create percent of top 10 gross earned by number 1 film
(HW 3), rounded to 2 decimal places.
- `pctnum1 = (num1gross/top10gross * 100) |> round(2)`
- convert `num_releases` to an integer (HW 3).
- `num_releases = as.integer(num_releases)`
### [R code for Numerical Data]{style="color:blue;"}
**Note:** Variables are rounded to two decimal places by
using piping and `round(2)`
::: r-fit-text
```{r}
#|label: numerical data management
mojo_23_mod <- mojo_23_mod |>
mutate(top10grossM = (top10gross/1000000) |> round(2), # change scale and round
num1grossM = (num1gross/1000000) |> round(2), # change scale and round
num1pct = (num1gross/top10gross * 100) |> round(2), # create rounded pct var
num_releases = as.integer(num_releases)) |> # converts num_releases to integer
select(date, monthF, wkdayF, quart, num_releases, num1gross, num1grossM,
top10gross, top10grossM, num1pct)
head(mojo_23_mod)
```
:::
:::::
##
### Week 3 In-class Exercises - Q2-Q3
[***Poll
Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} -
My User Name: **penelopepoolereisenbies685**
**This is BB Question 2 in HW 3**
The correct command used to convert a numeric variable to an
integer variable is
`____()`.
When you **`glimpse`** the data after Part 2 (Chunk 3) in HW
3, the type for the **`num_releases`** variable is shown as
`<____>` instead of `<dbl>`.
## Grouping and Filtering Data
- We can filter data by value within each group.
- R command `group_by` allows us to group data before
we filter.
- Data are filtered by value **WITHIN** each specified
group
- Ungrouping data afterwards using `ungroup` is not
required, but often helpful.
- The example below is not used in the subsequent summary
but can be very useful.
:::: fragment
::: r-fit-text
```{r}
#|label: filter to last day of month
mojo_23_mnth_end <- mojo_23_mod |>
select(date, monthF, top10grossM) |>
group_by(monthF) |> # doesn't change data appearance
filter(date == max(date)) |>
ungroup() |> # ungroup not required but helpful
glimpse()
```
:::
::::
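
The effect of `group_by` on `filter` is easier to see on a very small example. This sketch uses made-up data; `toy` and `toy_grp_max` are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data with two groups
toy <- tibble(
  grp   = c("a", "a", "b", "b"),
  value = c(1, 5, 3, 2)
)

# Ungrouped: max(value) is computed over all rows, so one row is kept
toy |> filter(value == max(value))

# Grouped: max(value) is computed within each group, so one row per group is kept
toy_grp_max <- toy |>
  group_by(grp) |>
  filter(value == max(value)) |>
  ungroup()

toy_grp_max
```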
## Grouping and Summarizing Data
- We will summarize data and then reshape it for a summary
table.
- R commands `group_by` and `summarize` allow us to
summarize the data by category
- When summarizing data, it is easier to select the
variables you want first.
- Plan what you want to do
:::: fragment
::: r-fit-text
```{r}
#|label: group and summarize
mojo_23_smry <- mojo_23_mod |>
select(monthF, wkdayF, top10grossM) |>
group_by(monthF, wkdayF) |> # doesn't change data appearance
summarize(avg_top10gross = mean(top10grossM, na.rm=T),
mdn_top10gross = median(top10grossM, na.rm=T),
max_top10gross = max(top10grossM, na.rm=T)) |>
ungroup() |> glimpse() # ungroup not required but helpful
```
:::
::::
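
As an alternative to a separate `ungroup` step, `summarize` has a `.groups` argument. The sketch below uses made-up data (`toy` and `toy_smry` are hypothetical names) and is one possible variation, not the approach used in the lecture code.

```r
library(dplyr)

# Hypothetical toy data
toy <- tibble(
  month = c("Jan", "Jan", "Jan", "Feb"),
  day   = c("Sa", "Sa", "Su", "Sa"),
  gross = c(10, 20, 30, 40)
)

toy_smry <- toy |>
  group_by(month, day) |>
  summarize(avg_gross = mean(gross, na.rm = TRUE),
            max_gross = max(gross, na.rm = TRUE),
            .groups = "drop")  # returns an ungrouped result

toy_smry
```

The result has one row for each combination of `month` and `day` that appears in the data.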
## Reshape Data using `pivot_wider`
- A common task in data management is reshaping data
- Display data tables must be compact for presentation
:::: fragment
::: r-fit-text
```{r}
#|label: reshape data with pivot_wider
mojo_23_wide <- mojo_23_smry |>
pivot_wider(id_cols=monthF, names_from=wkdayF, values_from=max_top10gross) |>
rename(Month = monthF)
head(mojo_23_wide)
```
:::
::::
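
The shape change is easiest to follow on a dataset small enough to read in full. A minimal sketch with made-up values; `toy_long` and `toy_wide` are hypothetical names.

```r
library(tidyr)
library(tibble)

# Hypothetical long data: one row per month and day
toy_long <- tibble(
  month = c("Jan", "Jan", "Feb", "Feb"),
  day   = c("Sa", "Su", "Sa", "Su"),
  gross = c(10, 12, 8, 9)
)

# Each value of `day` becomes its own column, filled with `gross`
toy_wide <- toy_long |>
  pivot_wider(id_cols = month, names_from = day, values_from = gross)

toy_wide
```

Four rows and three columns become two rows and three columns: one row per month, one column per day.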
## Creating Tables for Presentation
Below are two options for displaying a small dataset in
tabular format.
- **Note:** Appearance of kable tables varies for slides,
documents, and html files
:::::::::: columns
::::: {.column width="48%"}
:::: fragment
#### Basic Table with `kable`
::: r-fit-text
```{r}
#|label: filter select present data
mojo_23_fall_wknd <- mojo_23_wide |>
select(Month, F, Sa, Su) |>
filter(Month %in% c("Sep", "Oct",
"Nov", "Dec"))
mojo_23_fall_wknd |>
kable()
```
:::
::::
:::::
::: {.column width="4%"}
:::
::::: {.column width="48%"}
:::: fragment
#### `kable` Table with styling
::: r-fit-text
```{r}
#|label: modifying alignment and styling
mojo_23_fall_wknd |>
kable(align="lccc",
caption="Max. Fall '23 Top 10 Gross") |>
kable_styling(full_width = F)
```
:::
::::
:::::
::::::::::
## Reshaping Data using `pivot_longer`
The longer data format is often needed for efficient data
visualization
:::::: columns
::: {.column width="48%"}
#### `pivot_longer` R code
```{r}
#|label: pivot_longer code
mojo_23_long <- mojo_23_wide |>
pivot_longer(cols=M:Su, names_to="Day",
values_to="max_top10gross")
head(mojo_23_long, 10)
```
:::
::: {.column width="4%"}
:::
::: {.column width="48%"}
#### basic `geom_bar` barplot R code
```{r fig.dim=c(5,4)}
#|label: stacked barplot
(mojo_barplot <- mojo_23_long |> ggplot() +
geom_bar(aes(x=Month, y=max_top10gross, fill=Day),
stat="identity"))
```
:::
::::::
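
For comparison, here is the reverse reshape on a tiny made-up table. The names `toy_wide` and `toy_long` are hypothetical; the sketch only illustrates how the column names move into a new variable.

```r
library(tidyr)
library(tibble)

# Hypothetical wide data: one column per day
toy_wide <- tibble(month = c("Jan", "Feb"),
                   Sa    = c(10, 8),
                   Su    = c(12, 9))

# Column names Sa and Su move into `day`; their values move into `gross`
toy_long <- toy_wide |>
  pivot_longer(cols = Sa:Su, names_to = "day", values_to = "gross")

toy_long
```

The long version has one row per month and day, which is the layout `ggplot` needs to map `day` to `fill` or `color`.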
##
::: panel-tabset
### [Stacked Barplot]{style="color:blue;"}
```{r fig.dim=c(8,5), fig.align='center'}
#|label: stacked no background
mojo_23_long <- mojo_23_long |> # Day converted to factor to specify order
mutate(Day = factor(Day, levels=c("M", "T", "W", "Th", "F", "Sa", "Su")))
(mojo_barplot <- mojo_23_long |> ggplot() +
geom_bar(aes(x=Month, y=max_top10gross, fill=Day), stat="identity") +
theme_classic())
```
### [Side-by-side]{style="color:blue;"}
```{r fig.dim=c(12,5), fig.align='center'}
#|label: side by side
(mojo_barplot <- mojo_23_long |> ggplot() +
geom_bar(aes(x=Month, y=max_top10gross, fill=Day),
stat="identity", position="dodge") +
theme_classic())
```
### [Labels Formatted]{style="color:blue;"}
We can add to the plot, which is saved as an object in the
Global Environment.
```{r fig.dim=c(12,5), fig.align='center'}
#|label: label formatting
(mojo_barplot <- mojo_barplot +
theme(legend.position ="bottom") +
guides(fill = guide_legend(nrow = 1)) +
labs(x="", y="Maximum Daily Gross ($M)",
title = "Maximum Daily Gross of Top 10 Films by Month and Day of Week",
caption = "Data Source: www.boxofficemojo.com"))
```
### [Format Palette and Text]{style="color:blue;"}
```{r fig.dim=c(12,5), fig.align='center'}
#|label: spectral palette and text resized
(mojo_barplot <- mojo_barplot +
scale_fill_brewer(palette = "Spectral") +
theme(plot.title = element_text(size = 20),
axis.title = element_text(size=18),
axis.text = element_text(size=15),
plot.caption = element_text(size = 10),
legend.text = element_text(size = 12),
plot.background = element_rect(colour = "darkgrey", fill=NA, linewidth=2)))
```
:::
## Week 3 In-class Exercises - Q4
[***Poll
Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} -
My User Name: **penelopepoolereisenbies685**
**This is part of BB Question 5 in HW 3**
If you want a grouped barplot with **side-by-side bars**,
what is the correct option to include in the **`geom_bar`**
statement?
[**Here is some additional information about geom_bar
barplots.**](https://ggplot2.tidyverse.org/reference/geom_bar.html){target="_blank"}
## `pivot_longer` for Line and Area Plots
- An alternative to summarizing the data is to show the
data as a time series.
- Two ways to do this are a **line plot** or an **area
plot**
- These plots are an effective data management and
presentation tool.
- To make a line plot with multiple variables, we use
`pivot_longer` to reshape the data.
:::: fragment
::: r-fit-text
```{r}
#|label: reshape for line plot
mojo_23_line_area <- mojo_23_mod |>
select(date, top10grossM, num1grossM) |> # select variables
rename(`Top 10` = top10grossM, `No. 1` = num1grossM) |> # rename for plot
pivot_longer(cols=`Top 10`:`No. 1`, # reshape data
names_to = "type", values_to = "grossM") |>
mutate(type=factor(type, levels=c("Top 10", "No. 1"))) # convert gross type to factor
head(mojo_23_line_area, 4)
```
:::
::::
##
::: panel-tabset
### [Line Plot]{style="color:blue;"}
```{r fig.dim=c(14,4), fig.align='center'}
#|label: basic line plot
(line_plt <- mojo_23_line_area |> ggplot() +
geom_line(aes(x=date, y=grossM, color=type), linewidth=1) +
theme_classic())
```
### [Labels & Colors]{style="color:blue;"}
```{r fig.dim=c(14,5), fig.align='center'}
#|label: labels and colors formatted
(line_plt <- line_plt +
theme(legend.position="bottom") + # legend at bottom
scale_color_manual(values=c("blue", "lightblue")) + # specify colors
labs(x="Date", y = "Gross ($Mill)", color="",
title="Top 10 and No. 1 Movie Gross by Date",
subtitle="Jan. 1, 2023 - Dec. 31, 2023",
caption="Data Source: www.boxofficemojo.com"))
```
### [Resize Text]{style="color:blue;"}
```{r fig.dim=c(14,5), fig.align='center'}
#|label: adjust text size
(line_plt <- line_plt +
theme(plot.title = element_text(size = 20), plot.caption = element_text(size = 10),
axis.text = element_text(size=15), axis.title = element_text(size=18),
legend.text = element_text(size = 12),
plot.background = element_rect(colour = "darkgrey", fill=NA, linewidth = 2)))
```
### [Area Plot Code]{style="color:blue;"}
Change `geom_line` to `geom_area` and `color` to `fill`
```{r}
#|label: area plot code
area_plt <- mojo_23_line_area |>
ggplot() +
geom_area(aes(x=date, y=grossM, fill=type), linewidth=1) + # changed to geom_area and color to fill
theme_classic() + theme(legend.position="bottom") +
scale_fill_manual(values=c("blue", "lightblue")) + # changed color to fill
labs(x="Date", y = "Gross ($Mill)", fill="", # changed color to fill
title="Top 10 and No. 1 Movie Gross by Date",
subtitle="Jan. 1, 2023 - Dec. 31, 2023",
caption="Data Source: www.boxofficemojo.com") +
theme(plot.title = element_text(size = 20),
axis.title = element_text(size=18),
axis.text = element_text(size=15),
plot.caption = element_text(size = 10),
legend.text = element_text(size = 12),
plot.background = element_rect(colour = "darkgrey", fill=NA, linewidth=2))
```
### [Area Plot]{style="color:blue;"}
```{r fig.dim=c(14,7), echo=F, fig.align='center'}
#|label: area plot displayed
area_plt
```
:::
## Week 3 In-class Exercises
***Lecture 6 Exercise - NOT on Poll Everywhere***
::: fragment
**In class we will practice:**
:::
- Running chunks and exporting a table.
- **Preview for 1 Question in Quiz 1 where you will:**
- Select variables from a provided dataset
- Group and summarize data
- Export a summary table as a .csv file and submit it.
## Instructions for In-class Exercise
1. Save Week 3 R project to your computer.
2. Open this project by clicking on .Rproj file.
3. Open .Rmd file within open R project.
4. Run all chunks above this exercise.
5. Modify the chunk below to:
i. Round all values in columns 2-4 of
`mojo_23_fall_wknd` to 1 decimal place using
`round`.
ii. Export `mojo_23_fall_wknd` as a `.csv` file with
your name.
6. Submit this .csv file with your name in the **Week 3
In-class Exercise** in the **In-class Exercises** folder
on Blackboard.
::: fragment
**NOTE:** This counts as part of your in-class participation
for the Week 3 lectures (due Fri. at midnight).
:::
## R Code Chunk for In-class Exercise
0. Remove `eval = F` from the chunk header. This allows the
code in the chunk to run when the file is rendered.
1. Remove the `#` and complete `round` command to round
numeric columns (columns 2 - 4) to 1 decimal place.
2. Choose EITHER of the `write_csv` commands and edit it so
dataset will be exported to the `data` folder with your
name.
3. Delete `write_csv` command you don't edit or put `#`
symbols in front of it.
4. Submit the `.csv` file with your name in the filename.
::: fragment
```{r eval = F}
#|label: round and export summary dataset
mojo_23_fall_wknd |> glimpse() # examine data with glimpse
# round columns 2, 3 and 4 only
# export summary dataset using write_csv without piping
write_csv(mojo_23_fall_wknd, "data/Movie_Gross_Fall_2023_Weekends_FirstName_Last_Name.csv")
# export summary dataset using write_csv with piping
mojo_23_fall_wknd |>
write_csv("data/Movie_Gross_Fall_2023_Weekends_FirstName_Last_Name.csv")
```
:::
## Week 3 In-class Exercises
**Practice:**
If all the columns in a dataset are numeric, you can round
the whole dataset at once with the command
`round(<name of dataset>)`.
Why wouldn't that work for the dataset in the previous
exercise, `mojo_23_fall_wknd`?
Hint: To answer this question, you are encouraged to:
- try running the command `round(mojo_23_fall_wknd)`.
- examine the data using `glimpse`.
## Week 3 In-class Exercises - Q5
[***Poll
Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} -
My User Name: **penelopepoolereisenbies685**
Which of the following commands should **NOT** be used
within a `mutate` command or a `summarize` command?
- `as.integer`
- `factor`
- `mean`
- `filter`
## HW 3 Introduction
### Purpose
::: fragment
This assignment will give you experience with:
:::
- Creating an R Project Directory folder with `data` and
`img` folders. (Review)
- Creating, saving, using a Quarto file (Review)
- Importing data
- Rendering a Quarto file to create an HTML file (Review)
- Creating a README file (Review)
- Using the dplyr commands along with commands to reshape
and summarize data
- Creating plots with some formatting
## Week 3 In-class Exercises - Q6
[***Poll
Everywhere***](https://pollev.com/penelopepoolereisenbies685){target="_blank"} -
My User Name: **penelopepoolereisenbies685**
In HW 3, you will group the data by quarter and week day.
This is Part 4 of HW 3 and is very similar to the `group_by`
and `summarize` code covered in Lecture 5.
**This is BB Question 3 in HW 3**
Your grouped and summarized dataset, **`mojo_qtr_smry`**,
has
`____` rows and
`____` columns
`____` summary numeric variables
##
### Key Points from This Week
::: fragment
**Summarizing Data by Group**
:::
- Use `group_by` to specify grouping variables followed by
`summarize`
- Within `summarize`, specify the summary type, e.g. `mean`,
`median`, `max`, etc.
::: fragment
**Reshaping Data for Different Purposes**
:::
- `pivot_wider` is useful for display tables
- `pivot_longer` is useful for plots
::: fragment
**Plotting Data**
:::
- grouped barplots (stacked and side-by-side)
- line plots and area plots
::: fragment
You may submit an 'Engagement Question' about each lecture
until midnight on the day of the lecture. **A minimum of
four submissions is required during the semester.**
:::