EXERCISES
For Questions 1 to 2, you will be presented with data visualizations
from the New York Times newspaper and asked to identify or describe
aspects of the plot.
- Consider this graph from September 15, 2021.
- Using the directory of visualizations from the book Fundamentals of
Data Visualization, identify the type of data visualization used for
this plot.
- Multiple time series using a line graph
- What two geometric objects (or geom functions from the {ggplot2}
package) were used to create the plot?
- What aesthetics were used in the plot, and what variable/constant was
mapped/set to each aesthetic? [Note: Just focus on one of the plots to
come up with your answer, as the three plots are a product of faceting
on a particular variable.]
- The X-axis aesthetic is mapped to the variable Year
- The Y-axis aesthetic is mapped to the variable Minutes per day
- The Color aesthetic is mapped to the variable Age group
- What question do you think the authors were interested in when they
created this data visualization? Also, provide an answer to their
(assumed) question.
- Among different age groups, how did the time spent on exercising, on
grooming, and on texting, phone calls, and video chats (when not working
or attending school) change between 2019 and 2020?
- The authors found that across all age groups, people spent more time
texting, making phone calls, and engaging in video chats, with the 15-24
age group showing the largest increase. Time spent exercising decreased
among those aged 15-24 and increased for all other age groups. Time
spent grooming increased slightly among the 15-24 age group and
decreased among all other age groups, with the 45-64 age group showing
the largest decline.
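A minimal sketch of how the mappings listed for the September 15, 2021
graph could be expressed in {ggplot2}. The data frame toy, its column
names, and every value in it are hypothetical placeholders, not figures
from the newspaper. The choice of geom_line() and geom_point() is
likewise an assumption about how the plot was drawn.

```r
library(ggplot2)

# Placeholder data: the numbers are arbitrary filler so the sketch runs.
# They are NOT taken from the New York Times graphic.
toy <- expand.grid(
  year      = c(2019, 2020),
  age_group = c("15-24", "45-64"),
  activity  = c("Exercising", "Grooming", "Texting, calls, video chats"),
  stringsAsFactors = FALSE
)
toy$minutes <- seq_len(nrow(toy))  # arbitrary filler values

# x, y, and color are mapped to variables; facet_wrap() splits the plot
# into one panel per activity, giving the three side-by-side plots.
p <- ggplot(toy, aes(x = year, y = minutes, color = age_group)) +
  geom_line() +    # assumed geom: one line per age group
  geom_point() +   # assumed geom: a marker at each year
  facet_wrap(~ activity)

p
```

Because color is mapped to a discrete variable, geom_line() draws a
separate line for each age group without an explicit group aesthetic.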
- Consider this graph from October 1, 2020.
- Using the directory of visualizations from the book Fundamentals of
Data Visualization, identify the types of data visualizations used for
this plot.
- Small multiple line graphs with area under the curve filled
- What two (or three) geometric objects (or geom functions from the
{ggplot2} package) were used to create the plot? Hint: we are assuming
this is not a density plot.
- geom_line()
- geom_hline()
- geom_area()
- Give three aesthetics used in the plot and what variable/constant
was mapped/set to that aesthetic. [Note: Just focus on one or two of the
plots to come up with your answer.]
- The X-axis aesthetic is mapped to the variable Date
- The Y-axis aesthetic is mapped to the variable Percentage of
uncounted addresses
- The fill aesthetic is mapped to the variable Proportion of uncounted
addresses
- The linetype aesthetic is mapped to the variable Reference line
- What question do you think the authors were interested in when they
created this data visualization? Also, provide an answer to their
(assumed) question.
- Between September 10 and September 30, how many states are not on
track to collect responses for uncounted addresses?
- The authors found that 37 states were not on track to collect
responses for uncounted addresses.
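A minimal sketch of how the three geoms named above could be layered for
the October 1, 2020 graph. The data frame toy, its columns, the two
state labels, and the reference value target_pct are all hypothetical,
and the values are arbitrary filler rather than Census figures. For
simplicity the sketch sets fill and linetype to constants, which is only
one possible reading of the original graphic.

```r
library(ggplot2)

# Placeholder data: invented so the sketch runs; NOT real Census data.
days <- seq(as.Date("2020-09-10"), as.Date("2020-09-30"), by = "day")
toy <- data.frame(
  date  = rep(days, times = 2),
  state = rep(c("State A", "State B"), each = length(days))
)
toy$pct_uncounted <- rep(seq(10, 2, length.out = length(days)), times = 2)

target_pct <- 1  # hypothetical reference level for the horizontal line

p <- ggplot(toy, aes(x = date, y = pct_uncounted)) +
  geom_area(fill = "grey80") +                   # filled area under the curve
  geom_line() +                                  # the curve itself
  geom_hline(yintercept = target_pct,
             linetype = "dashed") +              # reference line
  facet_wrap(~ state)                            # small multiples, one per state

p
```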
For Questions 3 to 4, you will be presented with graphs using the NHANES
dataset in the {NHANES} package and asked to recreate them using the
grammar of graphics framework. [Note: This is the same dataset we have
used in class, but instead of importing the dataset from a .csv file,
you are going to access it via the {NHANES} package.]
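A short sketch of accessing the dataset through the package rather than
a .csv file. It assumes the {NHANES} package is installed; the column
selection simply lists the variables used in the questions that follow.

```r
# install.packages("NHANES")  # run once if the package is not installed
library(NHANES)

# Loading the package makes the NHANES data frame available by name.
vars <- c("Education", "Age", "AgeDecade", "Pulse", "Gender")
str(NHANES[, vars])
```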
- Consider this graph exploring the variable Education from the NHANES
dataset.
- Reconstruct the graph using {ggplot2} and its core components of the
grammar of graphics (data, aesthetic properties, and geometric objects).
[Hint: The aesthetic for coloring the bars is fill =.]
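library(NHANES)   # provides the NHANES data frame
library(ggplot2)  # provides ggplot(), aes(), and the geoms
ggplot(NHANES, aes(x = Education, fill = Education)) +
  geom_bar()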

- Identify the major components (data, aesthetics, geometric object(s))
of the grammar of graphics that were used to create the plot, as well as
the variables or values passed as arguments to each component.
- Data: the NHANES dataset was used
- Aesthetics: the x-axis and fill, both mapped to the variable
Education
- Geometric object: geom_bar()
- Consider this graph exploring the variables Age, Pulse, and Gender
from the NHANES dataset.
- Reconstruct the graph using {ggplot2} and its core components of the
grammar of graphics (data, aesthetic properties, and geometric objects).
[Hint: Notice that one of the geoms has transparency.]
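# Assumes library(NHANES) and library(ggplot2) are already loaded.
ggplot(NHANES, aes(x = Age, y = Pulse, color = Gender)) +
  geom_point(alpha = 0.1, size = 0.5) +
  # size sets the line width here; ggplot2 3.4.0 and later prefer linewidth
  geom_smooth(se = FALSE, size = 0.5)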

- Identify the major components (data, aesthetics, geometric object(s))
of the grammar of graphics that were used to create the plot, as well as
the variables or values passed as arguments to each component.
- Data: the NHANES dataset was used
- Aesthetics:
- the X-axis is mapped to Age
- the Y-axis is mapped to Pulse
- the Color is mapped to Gender
- Geometric object:
- geom_point() : alpha = 0.1, size = 0.5
- geom_smooth(): se = FALSE, size = 0.5
- Rather than looking at Age as a continuous variable, let’s swap it for
a categorical one, AgeDecade. Create a data visualization that allows us
to compare Pulse across AgeDecade categories and by Gender.
# Mapping color to Gender draws one box per gender within each AgeDecade.
ggplot(NHANES, aes(x = AgeDecade, y = Pulse, color = Gender)) +
  geom_boxplot()
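
As an optional variation on the answer above, the sketch below drops
rows where AgeDecade or Pulse is missing (if any) before plotting, so
that no NA category appears on the x-axis. The object name
nhanes_complete is hypothetical, and filtering is a design choice rather
than part of the original answer.

```r
library(NHANES)
library(ggplot2)

# Keep only rows with both an age decade and a pulse measurement.
nhanes_complete <- subset(NHANES, !is.na(AgeDecade) & !is.na(Pulse))

p <- ggplot(nhanes_complete, aes(x = AgeDecade, y = Pulse, color = Gender)) +
  geom_boxplot()

p
```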
