title: “Mini Class 3”

Student Name: Senet Manandhar

Data received: Built-in diamonds dataset (ships with ggplot2), used in RStudio

Density Plots in Various Forms:

A density plot visualises the distribution of a variable over a continuous interval or time period. Here we use the diamonds dataset.

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 3.3.3
str(diamonds)
## Classes 'tbl_df', 'tbl' and 'data.frame':    53940 obs. of  10 variables:
##  $ carat  : num  0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
##  $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
##  $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
##  $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
##  $ depth  : num  61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
##  $ table  : num  55 61 65 58 58 57 57 55 61 61 ...
##  $ price  : int  326 326 327 334 335 336 336 337 337 338 ...
##  $ x      : num  3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
##  $ y      : num  3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
##  $ z      : num  2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...

- Density plot using one continuous variable.

- The peaks of a density plot show where values are concentrated over the interval. An advantage density plots have over histograms is that their shape does not depend on the number of bins chosen, which makes the shape of the distribution easier to read. They do depend on the smoothing bandwidth, which the adjust argument controls.

g <- ggplot(diamonds, aes(x=price))
g + geom_density()+
    labs(title="Density plot", 
         subtitle="Density Plot for the Price of Diamonds",
         caption="Source: In R studio",
         x="Price")

g + geom_density(adjust = .5 )+ggtitle("adjust = .5")

g + geom_density(adjust = .1 )+ggtitle("adjust = .1")
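
To make the effect of adjust concrete: it acts as a multiplier on the default kernel bandwidth, so smaller values give a wigglier curve and larger values a smoother one. The sketch below illustrates this with base R's density() on synthetic data. The vector price_like is a hypothetical stand-in, not the actual diamond prices, and the sketch assumes geom_density() forwards adjust to the same estimator.

```r
# Sketch: how `adjust` scales the kernel bandwidth (base R, synthetic data).
set.seed(1)
price_like <- rexp(1000, rate = 1 / 4000)  # hypothetical stand-in for diamonds$price

d_default <- density(price_like)                # adjust = 1 (default bandwidth)
d_half    <- density(price_like, adjust = 0.5)  # half the default bandwidth
d_tenth   <- density(price_like, adjust = 0.1)  # one tenth of the default bandwidth

# The `bw` element holds the bandwidth actually used by each estimate.
c(default = d_default$bw, half = d_half$bw, tenth = d_tenth$bw)
```

With a very small adjust the curve starts to follow individual clusters of observations, which is why the adjust = .1 plot looks much spikier than the default.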

What if we want to compare the density across the levels of a categorical variable?

g <- ggplot(diamonds, aes(price))
# Map fill to the column name; diamonds$color inside aes() bypasses ggplot's data handling
g + geom_density(aes(fill=color), color = NA, alpha=.35) + 
    labs(title="Density plot", 
         subtitle="Density Plot Grouped by Color",
         caption="Source: In R studio",
         x="Price",
         fill="Color")

# Individual densities

ggplot(diamonds, aes(x = price, fill = color)) +
    geom_density(col = "red", alpha = .3) +
    scale_x_continuous(limits = c(0, 20000)) +
    coord_cartesian(ylim = c(0, .0004)) +
    facet_wrap(~color, nrow = 3)

Similarly, if we have a continuous bivariate distribution, we can visualize its density as follows:

m <- ggplot(diamonds, aes(x = price, y = carat)) +
 geom_point() +
 xlim(0, 19000) +
 ylim(0, 6)
m + geom_density_2d()

m <- ggplot(diamonds, aes(x = price, y = carat)) +
 geom_point() +
 xlim(0, 10000) +
 ylim(0, 3)
m + geom_density_2d()
## Warning: Removed 5225 rows containing non-finite values (stat_density2d).
## Warning: Removed 5225 rows containing missing values (geom_point).
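
The warnings above appear because xlim() and ylim() set scale limits: rows outside the limits are turned into missing values before the density is estimated, so they are dropped from both the points and the contours. As a rough sketch (the names n_outside and m_zoom are illustrative only), the number of affected rows can be counted directly, and coord_cartesian() can be used instead when the aim is only to zoom in without discarding data:

```r
library(ggplot2)

# Count the rows that fall outside the limits used above.
# price and carat are never negative, so only the upper bounds matter.
n_outside <- sum(diamonds$price > 10000 | diamonds$carat > 3)
n_outside  # should correspond to the count reported in the warnings

# coord_cartesian() restricts only the visible region; every row still
# contributes to the 2D density estimate, so no rows are removed.
m_zoom <- ggplot(diamonds, aes(x = price, y = carat)) +
  geom_point() +
  geom_density_2d() +
  coord_cartesian(xlim = c(0, 10000), ylim = c(0, 3))
```

The two approaches are not interchangeable: with coord_cartesian() the contours are estimated from all diamonds, so they can differ from the contours drawn after xlim() and ylim() have removed the large or expensive stones.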

m + stat_density_2d(geom = "tile", aes(fill= ..density..), contour = FALSE)
## Warning: Removed 5225 rows containing non-finite values (stat_density2d).

## Warning: Removed 5225 rows containing missing values (geom_point).
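
A note for readers on newer software: the ..density.. notation used above matches the ggplot2 version this document was built with, but it was deprecated in ggplot2 3.4.0 in favour of after_stat(). Assuming ggplot2 3.3.0 or later is installed, the same tile plot could be sketched as follows (the object names m_new and p_tile are illustrative):

```r
library(ggplot2)

# Same base plot and limits as above.
m_new <- ggplot(diamonds, aes(x = price, y = carat)) +
  geom_point() +
  xlim(0, 10000) +
  ylim(0, 3)

# after_stat(density) refers to the density column computed by stat_density_2d(),
# replacing the older ..density.. spelling.
p_tile <- m_new +
  stat_density_2d(geom = "tile", aes(fill = after_stat(density)), contour = FALSE)
```

Printing p_tile draws the plot. It is expected to raise the same kind of "removed rows" warnings, since the scale limits are unchanged.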