library(ggplot2)

ggplot(data = diamonds, mapping = aes(x = price)) + geom_freqpoly(mapping = aes(color = cut), binwidth = 500)

It’s hard to see the difference in distribution because the overall counts differ so much between cuts:

ggplot(diamonds) + geom_bar(mapping = aes(x = cut))

To make the comparison easier, we need to swap what is displayed on the y-axis. Instead of displaying count, we’ll display density, which is the count standardized so that the area under each frequency polygon is one.

ggplot(data = diamonds, mapping = aes(x = price, y = after_stat(density))) + geom_freqpoly(mapping = aes(color = cut), binwidth = 500)
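As a rough check on what density means, the sketch below bins price separately for each cut and adds up density times bin width. The bin edges (`breaks`) and the name `area_by_cut` are assumptions made for illustration; `geom_freqpoly()` may place its bin boundaries differently, but the idea is the same.

```r
library(ggplot2)  # provides the diamonds dataset

# Assumed bin edges: 500-wide bins covering the full price range.
# geom_freqpoly() may place its bin boundaries differently.
breaks <- seq(0, 19000, by = 500)

# For each cut, sum density * bin width, i.e. the area under the histogram.
area_by_cut <- tapply(diamonds$price, diamonds$cut, function(p) {
  h <- hist(p, breaks = breaks, plot = FALSE)
  sum(h$density * diff(h$breaks))
})

table(diamonds$cut)   # raw counts differ a lot between cuts
area_by_cut           # but each area is 1 (up to floating-point error)
```

Because every cut ends up with the same total area, the shapes of the curves can be compared even though the groups differ in size.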

Surprisingly, the density plot makes it appear that fair diamonds (the lowest quality) have the highest average price. But maybe that’s because frequency polygons are a little hard to interpret.

Another alternative is the boxplot. A boxplot is a type of visual shorthand for a distribution of values:

ggplot(data = diamonds, mapping = aes(x = cut, y = price)) + geom_boxplot()

We see much less information about the distribution, but the boxplots are much more compact, so we can more easily compare them. The plot supports the counterintuitive finding that better quality diamonds are cheaper on average.
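The numbers behind the boxes can also be checked directly. This is a minimal sketch using base R; the name `median_by_cut` is introduced here for illustration, and the quartiles from `quantile()` should be read as approximately the box edges rather than an exact reproduction of the plot.

```r
library(ggplot2)  # provides the diamonds dataset

# Median price for each cut: the middle line in each box.
median_by_cut <- tapply(diamonds$price, diamonds$cut, median)
median_by_cut

# Quartiles for one cut: roughly the lower edge, middle line, and upper edge of its box.
quantile(diamonds$price[diamonds$cut == "Fair"], probs = c(0.25, 0.5, 0.75))

# Compare the lowest and the highest quality cut.
median_by_cut[["Fair"]] > median_by_cut[["Ideal"]]
```

Comparing medians rather than eyeballing curves makes the ordering of the cuts explicit.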

ggplot(data = mpg, mapping = aes(x = class, y = hwy)) + geom_boxplot()

ggplot(data = mpg) +
  geom_boxplot(mapping = aes(x = reorder(class, hwy, FUN = median), y = hwy))

ggplot(data = mpg) + geom_boxplot(mapping = aes(x = reorder(class, hwy, FUN = median), y = hwy)) + coord_flip()

To visualize the relationship between two continuous variables, use a scatterplot:

ggplot(data = diamonds) + geom_point(mapping = aes(x = carat, y = price))

Scatterplots become less useful as the size of your dataset grows, because points begin to overplot and hide each other. We can fix this by using the alpha aesthetic to add transparency:

ggplot(data = diamonds) + geom_point(mapping = aes(x = carat, y = price), alpha = 1/100)
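To get a feel for why a value as small as `alpha = 1/100` still produces visible dark regions, here is an idealized sketch of how opacity builds up when points stack. The helper `stacked_opacity()` is hypothetical, and it assumes standard "over" compositing; real graphics devices round alpha values, so treat the results as approximate.

```r
# Idealized opacity after stacking n points that each have transparency alpha.
# Assumes standard "over" compositing; real devices round alpha, so this is approximate.
stacked_opacity <- function(n, alpha = 1 / 100) {
  1 - (1 - alpha)^n
}

stacked_opacity(1)     # a single point is nearly invisible
stacked_opacity(100)   # about 0.63
stacked_opacity(500)   # above 0.99, close to fully opaque
```

Under this assumption, only regions where many diamonds share nearly the same carat and price turn dark, which is what lets the dense parts of the plot stand out.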