title: “Mini Class 3 extension”
Data source: the diamonds dataset that ships with ggplot2, loaded in RStudio
Density Plots in Various Forms:
A density plot visualises the distribution of a variable over a continuous interval or time period. Here we use the diamonds dataset.
library(ggplot2)
str(diamonds)
## Classes 'tbl_df', 'tbl' and 'data.frame': 53940 obs. of 10 variables:
## $ carat : num 0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
## $ cut : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
## $ color : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
## $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
## $ depth : num 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
## $ table : num 55 61 65 58 58 57 57 55 61 61 ...
## $ price : int 326 326 327 334 335 336 336 337 337 338 ...
## $ x : num 3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
## $ y : num 3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
## $ z : num 2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
- Density plot using one continuous variable.
- The peaks of a density plot show where values are concentrated over the interval. An advantage density plots have over histograms is that their shape does not depend on the number of bins chosen; the amount of smoothing is instead controlled by the bandwidth (the adjust argument of geom_density()).
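# Base plot object: price on the x-axis
g <- ggplot(diamonds, aes(x = price))
# Default kernel density estimate
g + geom_density() +
  labs(title = "Density plot",
       subtitle = "Density Plot for the Price of Diamonds",
       caption = "Source: diamonds dataset (ggplot2)",
       x = "Price")
# Smaller adjust values shrink the bandwidth and give a wigglier curve
g + geom_density(adjust = .5) + ggtitle("adjust = .5")
g + geom_density(adjust = .1) + ggtitle("adjust = .1")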
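To make the effect of adjust concrete, here is a minimal sketch using base R's density() instead of ggplot2. It assumes geom_density() treats adjust as the same kind of bandwidth multiplier, which is how the ggplot2 documentation describes it. The names d_default and d_half are only illustrative.

```r
library(ggplot2)  # only needed here for the diamonds data

# Kernel density estimates of price with the default and a halved bandwidth
d_default <- density(diamonds$price)
d_half    <- density(diamonds$price, adjust = 0.5)

# adjust multiplies the default bandwidth, so this ratio is 0.5
d_half$bw / d_default$bw
```

A smaller bandwidth means each observation influences a narrower stretch of the curve, which is why the adjust = .1 plot looks much bumpier than the default.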
What if we want to split the density by a categorical variable?
g <- ggplot(diamonds, aes(price))
# Map fill to the color column by name; aes() looks it up in diamonds
g + geom_density(aes(fill = color), color = NA, alpha = .35) +
  labs(title = "Density plot",
       subtitle = "Density Plot Grouped by Color",
       caption = "Source: diamonds dataset (ggplot2)",
       x = "Price",
       fill = "Color")
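One thing to keep in mind when reading the grouped plot is that each color gets its own density estimate, so every curve encloses roughly the same area no matter how many diamonds are in that group. The sketch below checks this with base R's density() and a simple trapezoid rule. The helper name area_under is made up for illustration, and ggplot2 may use slightly different evaluation settings internally.

```r
library(ggplot2)  # only needed here for the diamonds data

# One vector of prices per color grade
by_color <- split(diamonds$price, diamonds$color)

# Approximate the area under each group's density curve (trapezoid rule)
area_under <- sapply(by_color, function(p) {
  d <- density(p)
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
})

round(area_under, 3)      # each value should be close to 1
sapply(by_color, length)  # while the group sizes differ a lot
```

So the overlaid curves compare the shape of the price distribution per color, not how many diamonds each color contributes.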
Individual Densities:
ggplot(diamonds, aes(x = price, fill = color)) +
  geom_density(col = "red", alpha = .3) +
  scale_x_continuous(limits = c(0, 20000)) +
  coord_cartesian(ylim = c(0, .0004)) +
  facet_wrap(~color, nrow = 3)
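The faceted plot uses both scale_x_continuous(limits = ...) and coord_cartesian(), and the two behave differently. Scale limits drop out-of-range values before the density is estimated, whereas coord_cartesian() only crops the view. The limits of 0 to 20000 above appear to cover every price in the dataset, so nothing is lost there. The sketch below uses a hypothetical cutoff of 5000, chosen only to make the difference visible.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(x = price)) + geom_density()

# Scale limits remove prices above 5000 before the density is estimated
# (ggplot2 warns about the removed rows)
p_scale <- base + scale_x_continuous(limits = c(0, 5000))

# Coordinate limits keep every row and only crop what is drawn
p_zoom <- base + coord_cartesian(xlim = c(0, 5000))

# The computed curves differ: compare their peak heights
peak_scale <- max(layer_data(p_scale)$density)
peak_zoom  <- max(layer_data(p_zoom)$density)
c(scale_limits = peak_scale, coord_limits = peak_zoom)
```

If the goal is only to zoom in, coord_cartesian() is usually the safer choice because the estimate still uses all of the data.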
library(ggjoy)
## Loading required package: ggridges
## The ggjoy package has been deprecated. Please switch over to the
## ggridges package, which provides the same functionality. Porting
## guidelines can be found here:
## https://github.com/clauswilke/ggjoy/blob/master/README.md
joy <- ggplot(diamonds, aes(x = price, y = color)) + geom_joy(scale = 4)
joy
## Picking joint bandwidth of 535
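Since the message above says ggjoy is deprecated in favour of ggridges, here is a sketch of the same ridgeline plot written against ggridges directly. It assumes ggridges is installed and that geom_density_ridges() is the counterpart of geom_joy(), as the porting guidelines linked in the message describe. The object name ridges is only illustrative.

```r
library(ggplot2)
library(ggridges)

# Same mapping as the ggjoy version: one density ridge per color grade
ridges <- ggplot(diamonds, aes(x = price, y = color)) +
  geom_density_ridges(scale = 4)
ridges
```

The scale argument keeps its meaning: values above 1 let neighbouring ridges overlap.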
ggplot(diamonds, aes(x = price, y = color, fill = color)) +
  geom_joy(scale = 4) +
  scale_fill_cyclical(values = c("blue", "green"))
## Picking joint bandwidth of 535
ggplot(diamonds, aes(x = price, y = color, fill = color)) +
  geom_joy(scale = 4) +
  scale_fill_cyclical(
    values = c("blue", "green", "red", "pink", "purple", "yellow", "orange"),
    guide = "legend"
  )
## Picking joint bandwidth of 535
ggplot(diamonds, aes(x = price, y = color, height = ..density..)) +
  geom_joy(stat = "binline", bins = 20, scale = 0.95, draw_baseline = FALSE)