Do 7.3.4 # 1-4 all

Do 7.4.1 # 1, 2

Do 7.5.1 # 2-6 all

Exercise 7.3.4 # 1-4

1. The distributions of x, y, and z are all right skewed: most diamonds are small and very few are large. All three distributions are also bimodal. According to the documentation (?diamonds), x is length, y is width, and z is depth, each in mm.
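A rough sketch of how the three distributions can be viewed side by side. The 0.1 mm binwidth and the 0 to 10 mm zoom are arbitrary choices of mine, and xyz_long and p_xyz are made-up names, not part of the exercise.

```r
library(tidyverse)

# Stack x, y, and z into one long column so a single plot can facet on them.
xyz_long <- diamonds %>%
  select(x, y, z) %>%
  gather(key = "dimension", value = "mm", x, y, z)

# One histogram per dimension; coord_cartesian() zooms without dropping
# the few extreme values that lie outside 0-10 mm.
p_xyz <- ggplot(xyz_long, aes(x = mm)) +
  geom_histogram(binwidth = 0.1) +
  facet_wrap(~ dimension, ncol = 1) +
  coord_cartesian(xlim = c(0, 10))

p_xyz
```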
2.
require("tidyverse")
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.6
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0
## ── Conflicts ────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
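# Overall shape of the price distribution ($10 bins; price is in whole dollars)
ggplot(data = diamonds, mapping = aes(x = price)) +
  geom_freqpoly(binwidth = 10)

# Same variable as a histogram with $100 bins
ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 100, center = 0)

# Zoom in on diamonds under $2,500 with $20 bins; note the empty gap near $1,500
ggplot(filter(diamonds, price < 2500), aes(x = price)) +
  geom_histogram(binwidth = 20, center = 0)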

3.
diamonds %>%
  filter(carat >= 0.99, carat <= 1) %>%
  count(carat)

Exercise 7.4.1 # 1, 2

  1. geom_histogram() removes rows with missing values and warns that they were dropped, because an NA in a numeric variable cannot be placed in any bin. geom_bar() works on a categorical variable, so it does not drop NA: it treats NA as one more category and draws a bar for it.

  2. na.rm = TRUE removes the NA values from the vector before mean() or sum() is calculated. Without it, both functions return NA.

mean(c(0, 1, 2, NA), na.rm = TRUE)
## [1] 1
sum(c(0, 1, 2, NA), na.rm = TRUE)
## [1] 3
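
For question 1 of this exercise, a small sketch of how missing values behave in each geom. The cutoffs (3 and 20) and the choice to blank out the first 1000 cuts are arbitrary assumptions, used only to create some NA values to plot.

```r
library(tidyverse)

# Numeric case: recode implausible widths to NA.
diamonds_na <- diamonds %>%
  mutate(y = ifelse(y < 3 | y > 20, NA, y))

# geom_histogram() drops the NA rows and warns about it when drawn.
p_hist <- ggplot(diamonds_na, aes(x = y)) +
  geom_histogram(binwidth = 0.1)

# Categorical case: set cut to NA for the first 1000 rows.
diamonds_cut_na <- diamonds %>%
  mutate(cut = ifelse(row_number() <= 1000, NA_character_, as.character(cut)))

# geom_bar() keeps NA and draws it as one more bar.
p_bar <- ggplot(diamonds_cut_na, aes(x = cut)) +
  geom_bar()

p_hist
p_bar
```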

Exercise 7.5.1 # 2-6 all

  2. Carat is the best predictor of price. The boxplot shows a slight negative relationship between cut and carat: the better cuts tend to be smaller diamonds. Because larger diamonds cost more, the lower-quality cuts, which tend to be larger, end up more expensive on average, and the better cuts end up cheaper.
ggplot(diamonds, aes(x = cut, y = carat)) +
  geom_boxplot()
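
The other half of this argument (price against carat) can be checked with a sketch like the one below. The alpha value and the names p_carat_price and by_cut are my own choices, and medians are just one reasonable summary to compare.

```r
library(tidyverse)

# Price against carat: a strong positive relationship.
p_carat_price <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1)

# Median carat and median price for each cut, to compare the two directly.
by_cut <- diamonds %>%
  group_by(cut) %>%
  summarise(median_carat = median(carat),
            median_price = median(price))

p_carat_price
by_cut
```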

3. With ggstance, geom_boxploth() takes the x and y aesthetics flipped relative to geom_boxplot() + coord_flip(), but the output is exactly the same.
4.
library("lvplot")
ggplot(diamonds, aes(x=cut, y=price)) +
  geom_lv()

5.
ggplot(data = diamonds, mapping = aes(x = price, y = ..density..)) +
  geom_freqpoly(mapping = aes(color = cut), binwidth = 500)

ggplot(data = diamonds, mapping = aes(x = cut, y = price)) +
  geom_violin() +
  coord_flip()

6.
require("ggbeeswarm")
## Loading required package: ggbeeswarm
ggplot(data = mpg) +
  geom_quasirandom(mapping = aes(x = reorder(class, hwy, FUN = median),
                                 y = hwy),
                   method = "tukeyDense")