Section 8.15

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(dslabs)
data(heights)
data(murders)

1. With ggplot2, plots can be saved as objects. For example, we can associate a dataset with a plot object like this:

What is the class of the object p?

# Two equivalent ways to associate the murders data with a plot object
p <- ggplot(data = murders)
p <- murders |> ggplot()
class(p)
## [1] "gg"     "ggplot"
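
The exact class vector printed above may differ between ggplot2 versions. As a minimal sketch, inherits() checks for a single class and so does not depend on the full vector (the name is_plot is only for illustration):

```r
library(ggplot2)
library(dslabs)
data(murders)

# Associate the murders data with an empty plot object
p <- ggplot(data = murders)

# inherits() tests for one class, so it does not depend on
# the full class vector printed by class(p)
is_plot <- inherits(p, "ggplot")
is_plot
```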

2. Remember that to print an object you can use the command print or simply type the object. Print the object p defined in exercise one and describe what you see.

A blank slate: an empty gray panel, because no geometry or aesthetic mappings have been added yet.

print(p)

3. Using the pipe |>, create an object p but this time associated with the heights dataset instead of the murders dataset.

p <- heights |> ggplot()

4. What is the class of the object p you have just created?

class(p)
## [1] "gg"     "ggplot"

5. Now we are going to add a layer and the corresponding aesthetic mappings. For the murders data we plotted total murders versus population sizes. Explore the murders data frame to remind yourself of the names of these two variables and select the correct answer. Hint: Look at ?murders.

C. total and population.

?murders
## starting httpd help server ... done

6. To create the scatterplot we add a layer with geom_point. The aesthetic mappings require us to define the x-axis and y-axis variables, respectively, so we have to supply x and y inside aes. Fill in aes with the correct variable names.

# Total murders versus population: population on the x-axis, total on the y-axis
murders |> ggplot(aes(x=population, y=total)) +
  geom_point()

7. Note that if we don’t use argument names, we can obtain the same plot by making sure we enter the variable names in the right order: the first unnamed argument of aes is x and the second is y. Remake the plot but now with total on the x-axis and population on the y-axis.

murders |> ggplot(aes(total, population)) + geom_point()
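
To see why the order matters, here is a small sketch comparing a positional mapping with a named one. The object names by_position and by_name are only for illustration:

```r
library(ggplot2)

# The first two unnamed arguments of aes() are taken as x and y
by_position <- aes(population, total)
by_name <- aes(x = population, y = total)

# Both mappings carry the same aesthetic names
names(by_position)
same_names <- identical(names(by_position), names(by_name))
same_names
```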

8. If instead of points we want to add text, we can use the geom_text() or geom_label() geometries. The following code will give us the error message: Error: geom_label requires the following missing aesthetics: label

We need to map a character to each point through the label argument in aes.

murders |> ggplot(aes(population, total)) + geom_label()
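
The error does not appear when the plot object is created, only when it is built or printed. A minimal sketch that captures it with tryCatch instead of stopping (the exact wording of the message may vary across ggplot2 versions, and the names result and is_error are only for illustration):

```r
library(ggplot2)
library(dslabs)
data(murders)

# Creating the object succeeds even though label is missing
p <- murders |> ggplot(aes(population, total)) + geom_label()

# Building the plot triggers the check for required aesthetics;
# tryCatch() returns the condition object instead of stopping
result <- tryCatch(ggplot_build(p), error = function(e) e)
is_error <- inherits(result, "error")
is_error
```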

9. Rewrite the code above to use abbreviation as the label through aes.

murders |> ggplot(aes(population, total)) + geom_label(aes(label=abb))

10. Change the color of the labels to blue. How will we do this?

Because we want all colors to be blue, we do not need to map colors, just use the color argument in geom_label.

11. Rewrite the code above to make the labels blue.

murders |> ggplot(aes(population, total)) + geom_label(aes(label=abb), color="blue")

12. Now suppose we want to use color to represent the different regions. In this case which of the following is most appropriate:

Because each label needs a different color, we map the colors through the color argument of aes.

13. Rewrite the code above to make the labels’ color be determined by the state’s region.

murders |> ggplot(aes(population, total)) + geom_label(aes(label=abb, color=region)) 
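
The difference between setting a color (exercises 10 and 11) and mapping it (exercises 12 and 13) can be checked with a short sketch. It uses layer_data to look at the colors each layer would draw; the names base, p_set, p_map, n_set and n_map are only for illustration:

```r
library(ggplot2)
library(dslabs)
data(murders)

base <- murders |> ggplot(aes(population, total, label = abb))

# Setting: one constant color, passed outside aes()
p_set <- base + geom_label(color = "blue")

# Mapping: color varies with a variable, passed inside aes()
p_map <- base + geom_label(aes(color = region))

# layer_data() returns the data a layer would draw, including the colour column
n_set <- length(unique(layer_data(p_set)$colour))
n_map <- length(unique(layer_data(p_map)$colour))
c(set = n_set, mapped = n_map)
```

The set version should show a single color, while the mapped version should show one color per region.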

14. Now we are going to change the x-axis to a log scale to account for the fact the distribution of population is skewed. Let’s start by defining an object p holding the plot we have made up to now. To change the x-axis to a log scale we learned about the scale_x_log10() function. Add this layer to the object p to change the scale and render the plot.

p <- murders |> ggplot(aes(population, total, label=abb, color=region)) + geom_label() + scale_x_log10()
p

15. Repeat the previous exercise but now change both axes to be in the log scale.

p <- murders |> ggplot(aes(population, total, label=abb, color=region)) + geom_label() + scale_y_log10() + scale_x_log10()
p
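
As a rough check of what the log scale does, the sketch below compares the x positions of a built layer with log10 of the population. It uses geom_point for simplicity, and the names x_built, x_expected and matches are only for illustration:

```r
library(ggplot2)
library(dslabs)
data(murders)

p_log <- murders |> ggplot(aes(population, total)) + geom_point() + scale_x_log10()

# After the scale is applied, the x positions are log10(population)
x_built <- sort(as.numeric(layer_data(p_log)$x))
x_expected <- sort(log10(murders$population))
matches <- isTRUE(all.equal(x_built, x_expected))
matches
```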

16. Now edit the code above to add the title “Gun murder data” to the plot. Hint: use the ggtitle function.

p <- murders |> ggplot(aes(population, total, label=abb, color=region)) + geom_label() + scale_y_log10() + scale_x_log10() + ggtitle("Gun murder data")
p