ggplot2
The objective of this assignment is to complete and explain basic plots before moving on to more complicated ways to graph data.
Each question is worth 5 points.
To submit this homework you will create the document in RStudio,
using the knitr package (button included in RStudio), and then submit the
document to your RPubs account. Once
uploaded, you will submit the link to that document on Canvas. Please
make sure that this link is hyperlinked and that I can see the
visualization and the code required to create it
(echo=TRUE).
Using data from the mosaicData package, create an informative and meaningful data graphic.
Identify each of the visual cues that you are using, and describe how they are related to each variable.
Create a data graphic with at least five variables (either quantitative or categorical). For the purposes of this exercise, do not worry about making your visualization meaningful—just try to encode five variables into one plot.
We are mapping the x-axis to the 'hs' variable and the fill aesthetic to the 'race' variable to make the plot colorful. The y-axis is not mapped to a variable in the data; it shows the count that geom_histogram computes for each bin. We are using the visual cue of position (x and y coordinates) to encode the variable hs and its counts, and the visual cue of color (fill aesthetic) to encode the variable race.
library(ggplot2)     # plotting functions
library(mosaicData)  # provides the Marriage data set
data("Marriage")
ggplot(data = Marriage, aes(x = hs, fill = race)) +
  geom_histogram(binwidth = 1, position = "dodge")
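The heights of the bars in this histogram are not read from a column of Marriage; ggplot2 computes them. As a minimal sketch, assuming the ggplot2 and mosaicData packages are installed, layer_data() can be used to look at the values behind each visual cue. The object names p and hist_data are introduced here only for illustration.

```r
library(ggplot2)
library(mosaicData)  # assumed installed; provides the Marriage data set
data("Marriage", package = "mosaicData")

# Same histogram as above, stored in an object so it can be inspected.
p <- ggplot(Marriage, aes(x = hs, fill = race)) +
  geom_histogram(binwidth = 1, position = "dodge")

# layer_data() returns the data frame ggplot2 computed for one layer.
hist_data <- layer_data(p, 1)

# x is the bar position, count is the computed bar height,
# and fill is the color assigned to each race group.
head(hist_data[, c("x", "count", "fill")])
```

The count column is what ends up on the y-axis, which is why no y aesthetic is needed in aes().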
summary(Marriage$race)
## American Indian Black Hispanic White
## 1 22 1 74
summary(Marriage$sign)
## Aquarius Aries Cancer Capricorn Gemini Leo
## 7 10 8 2 9 7
## Libra Pisces Saggitarius Scorpio Taurus Virgo
## 7 16 9 7 6 10
ggplot(Marriage, aes(x = delay, y = age, color = race, shape = sign, size = hs)) +
  geom_point() +
  facet_wrap(~sign)
## Warning: The shape palette can deal with a maximum of 6 discrete values because
## more than 6 becomes difficult to discriminate; you have 12. Consider
## specifying shapes manually if you must have them.
## Warning: Removed 55 rows containing missing values (geom_point).
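The two warnings are related: the default shape palette only has 6 shapes, so the points for the remaining 6 signs get no shape and are dropped (7 + 16 + 9 + 7 + 6 + 10 = 55 rows in the summary above). A possible way around this, sketched here under the assumption that sign is a factor with 12 levels as the summary suggests, is to supply the shapes manually with scale_shape_manual(). The names n_signs and p5 are made up for this example, and the choice of shapes 1 to 12 is arbitrary.

```r
library(ggplot2)
library(mosaicData)  # assumed installed; provides the Marriage data set
data("Marriage", package = "mosaicData")

# Number of distinct signs; the summary above lists 12.
n_signs <- nlevels(Marriage$sign)

p5 <- ggplot(Marriage, aes(x = delay, y = age, color = race, shape = sign, size = hs)) +
  geom_point() +
  # One shape code per sign, so no sign is left without a shape.
  scale_shape_manual(values = seq_len(n_signs)) +
  facet_wrap(~sign)
p5
```

With one shape per sign, every facet should show its points instead of half of them being empty.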
Your objective for the next four questions will be to write the code necessary to exactly recreate the provided graphics.
This boxplot was built using the mpg dataset. Notice the
changes in axis labels.
data("mpg")
mpg
## # A tibble: 234 × 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto… f 18 29 p comp…
## 2 audi a4 1.8 1999 4 manu… f 21 29 p comp…
## 3 audi a4 2 2008 4 manu… f 20 31 p comp…
## 4 audi a4 2 2008 4 auto… f 21 30 p comp…
## 5 audi a4 2.8 1999 6 auto… f 16 26 p comp…
## 6 audi a4 2.8 1999 6 manu… f 18 26 p comp…
## 7 audi a4 3.1 2008 6 auto… f 18 27 p comp…
## 8 audi a4 quattro 1.8 1999 4 manu… 4 18 26 p comp…
## 9 audi a4 quattro 1.8 1999 4 auto… 4 16 25 p comp…
## 10 audi a4 quattro 2 2008 4 manu… 4 20 28 p comp…
## # … with 224 more rows
# manufacturer is mapped to x and hwy to y, so the labels follow that order
q2 <- ggplot(mpg, aes(manufacturer, hwy))
q2 + geom_boxplot() +
  labs(x = "Vehicle Manufacturer", y = "HFE(miles/gallon)") +
  theme_classic()
This graphic is built with the diamonds dataset in the
ggplot2 package.
data("diamonds")
diamonds
## # A tibble: 53,940 × 10
## carat cut color clarity depth table price x y z
## <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
## 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
## 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
## 4 0.29 Premium I VS2 62.4 58 334 4.2 4.23 2.63
## 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
## 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48
## 7 0.24 Very Good I VVS1 62.3 57 336 3.95 3.98 2.47
## 8 0.26 Very Good H SI1 61.9 55 337 4.07 4.11 2.53
## 9 0.22 Fair E VS2 65.1 61 337 3.87 3.78 2.49
## 10 0.23 Very Good H VS1 59.4 61 338 4 4.05 2.39
## # … with 53,930 more rows
q3 <- ggplot(diamonds, aes(price, colour = cut, fill = cut)) +
  geom_density(alpha = 0.2) +
  scale_fill_discrete() +
  labs(x = "Diamond Price", y = "Density", title = "Diamond Price Density") +
  theme_bw()
q3
This graphic uses the penguins dataset and shows the
counts between males and females by species.
library(palmerpenguins)  # provides the penguins data set
data("penguins")
penguins
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_…¹ body_…² sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
## 2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
## 3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
## 4 Adelie Torgersen NA NA NA NA <NA> 2007
## 5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
## 7 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
## 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007
## 9 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007
## 10 Adelie Torgersen 42 20.2 190 4250 <NA> 2007
## # … with 334 more rows, and abbreviated variable names ¹​flipper_length_mm,
## # ²​body_mass_g
ggplot(penguins, aes(x = species, fill = sex)) +
  geom_bar(position = "dodge") +
  labs(title = "Counts between males and females by species",
       x = "Species",
       y = "Count")
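The printed data above shows that sex is NA for some penguins, so this bar chart draws a third NA bar next to male and female. If the target graphic only shows the two sexes (an assumption, since the reference figure is not reproduced here), one option is to drop those rows before plotting. This is a rough sketch; penguins_known_sex is a name chosen just for this example.

```r
library(ggplot2)
library(palmerpenguins)  # assumed installed; provides the penguins data set

# Keep only the rows where sex was recorded.
penguins_known_sex <- subset(palmerpenguins::penguins, !is.na(sex))

ggplot(penguins_known_sex, aes(x = species, fill = sex)) +
  geom_bar(position = "dodge") +
  labs(title = "Counts between males and females by species",
       x = "Species",
       y = "Count")
```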
This figure examines the relationship between bill length and depth
in the penguins dataset.
data("penguins")
penguins
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_…¹ body_…² sex year
## <fct> <fct> <dbl> <dbl> <int> <int> <fct> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750 male 2007
## 2 Adelie Torgersen 39.5 17.4 186 3800 fema… 2007
## 3 Adelie Torgersen 40.3 18 195 3250 fema… 2007
## 4 Adelie Torgersen NA NA NA NA <NA> 2007
## 5 Adelie Torgersen 36.7 19.3 193 3450 fema… 2007
## 6 Adelie Torgersen 39.3 20.6 190 3650 male 2007
## 7 Adelie Torgersen 38.9 17.8 181 3625 fema… 2007
## 8 Adelie Torgersen 39.2 19.6 195 4675 male 2007
## 9 Adelie Torgersen 34.1 18.1 193 3475 <NA> 2007
## 10 Adelie Torgersen 42 20.2 190 4250 <NA> 2007
## # … with 334 more rows, and abbreviated variable names ¹​flipper_length_mm,
## # ²​body_mass_g
# color = species only has an effect inside aes(), so the stray argument to ggplot() is dropped
q5 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(aes(shape = species, color = species), size = 1) +
  labs(title = "Bill Length vs. Depth",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)",
       shape = "Species") +
  geom_smooth(method = "lm", aes(color = species))
q5
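In the code above only the shape legend is retitled "Species", while the color legend keeps its default title, so ggplot2 draws two separate legends. If the target graphic has a single combined legend (an assumption, since the reference figure is not reproduced here), giving both aesthetics the same title lets ggplot2 merge them. A minimal sketch, with q5_merged as a made-up name:

```r
library(ggplot2)
library(palmerpenguins)  # assumed installed; provides the penguins data set

q5_merged <- ggplot(palmerpenguins::penguins,
                    aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
  # color is inherited from the plot; shape is added for the points only
  geom_point(aes(shape = species), size = 1) +
  geom_smooth(method = "lm") +
  labs(title = "Bill Length vs. Depth",
       x = "Bill Length (mm)",
       y = "Bill Depth (mm)",
       # the same title on both aesthetics lets the legends combine
       color = "Species",
       shape = "Species")
q5_merged
```

Mapping color once in ggplot() also means geom_smooth() picks it up without repeating aes(color = species).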