Stat 450: Homework 5

Do 7.3.4 # 1-4 all

Do 7.4.1 # 1, 2

Do 7.5.1 # 2-6 all

Exercise 7.3.4 # 1-4

1. The distributions of x, y, and z are all right-skewed: most diamonds are small and very few are large. All three distributions are also bimodal. According to the documentation, x is length, y is width, and z is depth.
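As a rough numerical check of the skew described above, the sketch below compares the mean and median of each dimension and draws one histogram. The binwidth of 0.1, the choice of x for the plot, and the names dims, dim_summary, and p_x are assumptions made for illustration; they are not part of the original answer.

```r
library(ggplot2)

# x, y, and z are the length, width, and depth of each diamond
dims <- diamonds[, c("x", "y", "z")]

# A mean above the median is a rough sign of right skew
dim_summary <- data.frame(
  mean   = sapply(dims, mean),
  median = sapply(dims, median)
)
print(dim_summary)

# Histogram of length; swap in y or z to inspect the other dimensions
# (binwidth = 0.1 is an assumed value, chosen only for illustration)
p_x <- ggplot(diamonds, aes(x = x)) +
  geom_histogram(binwidth = 0.1)
print(p_x)
```

Repeating the histogram for y and z, possibly with a restricted axis range, would show whether the same shape holds for all three variables.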

2.

require("tidyverse")

## Loading required package: tidyverse

## ── Attaching packages ─────────────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
## ✔ tibble  1.4.2     ✔ dplyr   0.7.6
## ✔ tidyr   0.8.1     ✔ stringr 1.3.1
## ✔ readr   1.1.1     ✔ forcats 0.3.0

## ── Conflicts ────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

# price is recorded in whole dollars, so a binwidth of 0.1 leaves almost every bin empty
ggplot(data = diamonds, mapping = aes(x = price)) +
  geom_freqpoly(binwidth = 10)

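# filter() with no condition returns every row, so diamonds is passed directly
ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 100, center = 0)

# Zoom in on diamonds priced under 2500 with narrower bins
ggplot(filter(diamonds, price < 2500), aes(x = price)) +
  geom_histogram(binwidth = 20, center = 0)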

3.

diamonds %>%
  filter(carat >= 0.99, carat <= 1) %>%
  count(carat)

7.4.1 # 1, 2

In geom_histogram(), observations with missing values are removed, and a warning reports how many rows were dropped, because an NA cannot be placed in a numeric bin. In geom_bar(), when x is a discrete variable, NA is treated as one more category and gets its own bar.

na.rm = TRUE removes the missing values from the vector before mean() and sum() are calculated.

mean(c(0, 1, 2, NA), na.rm = TRUE)

## [1] 1

sum(c(0, 1, 2, NA), na.rm = TRUE)

## [1] 3
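The difference between missing values in a histogram and in a bar chart, discussed in the first answer of this section, can be sketched as follows. The data frame diamonds_na, the cutoffs on y, the 10% share of blanked cut labels, and the seed are all assumptions made for illustration.

```r
library(ggplot2)
library(dplyr)

set.seed(450)  # arbitrary seed so the example is reproducible

# Assumed example data: mark implausible widths as NA and blank out
# roughly 10% of the cut labels
diamonds_na <- diamonds %>%
  mutate(
    y   = ifelse(y < 3 | y > 20, NA, y),
    cut = ifelse(runif(n()) < 0.1, NA, as.character(cut))
  )

# Continuous x: rows with NA are dropped and ggplot2 warns about it
p_hist <- ggplot(diamonds_na, aes(x = y)) +
  geom_histogram(binwidth = 0.1)
print(p_hist)

# Discrete x: NA is kept and drawn as its own bar
p_bar <- ggplot(diamonds_na, aes(x = cut)) +
  geom_bar()
print(p_bar)
```

Printing p_hist should produce a warning about removed rows, while p_bar should show an extra bar labelled NA next to the five cut categories.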

7.5.1 # 2-6 all

2. Carat is the best predictor of price. According to the boxplot, there is a slight negative relationship between cut and carat: diamonds with better cuts tend to be smaller. Because larger diamonds cost more, diamonds with lower-quality cuts end up more expensive on average.

ggplot(diamonds, aes(x = cut, y = carat)) +
  geom_boxplot()
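One way to check this reasoning numerically, offered as a sketch and not as part of the original answer, is to look at the correlation between carat and price and at the median carat and price within each cut. The names carat_price_cor, by_cut, median_carat, and median_price are illustrative.

```r
library(ggplot2)
library(dplyr)

# Linear correlation between carat and price
carat_price_cor <- cor(diamonds$carat, diamonds$price)
print(carat_price_cor)

# Median carat and median price within each cut category
by_cut <- diamonds %>%
  group_by(cut) %>%
  summarise(
    median_carat = median(carat),
    median_price = median(price)
  )
print(by_cut)
```

If Fair diamonds show both a larger median carat and a higher median price than Ideal diamonds, that supports the explanation that size, not cut quality, drives the price difference.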

3. With geom_boxploth() from ggstance, the x and y aesthetics are flipped relative to the coord_flip() version, but the output is exactly the same.
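The comparison can be sketched as below. The mpg data and the class/hwy variables are assumptions, since the answer does not say which data was plotted, and the ggstance part only runs if that package is installed.

```r
library(ggplot2)

# Vertical boxplot rotated with coord_flip(): class on x, hwy on y
p_flip <- ggplot(mpg, aes(x = reorder(class, hwy, FUN = median), y = hwy)) +
  geom_boxplot() +
  coord_flip()

# ggstance version: the aesthetics are swapped and no flip is needed
if (requireNamespace("ggstance", quietly = TRUE)) {
  p_h <- ggplot(mpg, aes(x = hwy, y = reorder(class, hwy, FUN = median))) +
    ggstance::geom_boxploth()
}
```

Printing p_flip and p_h side by side is a simple way to confirm that the two plots look alike.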

4.

library("lvplot")

ggplot(diamonds, aes(x=cut, y=price)) +
  geom_lv()

5.

ggplot(data = diamonds, mapping = aes(x = price, y = ..density..)) +
  geom_freqpoly(mapping = aes(color = cut), binwidth = 500)

ggplot(data = diamonds, mapping = aes(x = cut, y = price)) +
  geom_violin() +
  coord_flip()

6.

require("ggbeeswarm")

## Loading required package: ggbeeswarm

ggplot(data = mpg) +
  geom_quasirandom(mapping = aes(x = reorder(class, hwy, FUN = median),
                                 y = hwy),
                   method = "tukeyDense")

Sarah Salazar

09/29/2018
