2023-04-22

Objective

This document was created as part of the Week 4 assignment in the Developing Data Products course. The assignment has two parts: first, build a Shiny application; then, prepare a presentation (you're looking at it now!) pitching the application.

Diamond Data

To build this Shiny application, we use the diamonds dataset from ggplot2.

The dataset contains the prices and other attributes of almost 54,000 diamonds. We are, however, interested in only a subset of the variables: price, weight (carat), cut, color, and clarity.

Variable   Description
carat      Weight of the diamond (0.2–5.01)
cut        Cut quality (Fair, Good, Very Good, Premium, Ideal)
color      Diamond colour, from D (best) to J (worst)
clarity    A measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
price      Price in US dollars (326–18,823)
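
The ranges and levels listed above can be checked directly against the data. The following is a minimal sketch, assuming only that ggplot2 is installed; it is not part of the application itself.

```r
library(ggplot2)

# Number of diamonds in the dataset ("almost 54,000")
nrow(diamonds)

# Numeric ranges for weight and price
range(diamonds$carat)
range(diamonds$price)

# Ordered factor levels, listed from worst to best
levels(diamonds$cut)
levels(diamonds$clarity)

# Colour levels run from D (best) to J (worst)
levels(diamonds$color)
```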

Data Sample

library(ggplot2)
# Keep only the variables that interest us
diamondData <- diamonds[, c("carat", "cut", "color", "clarity", "price")]
head(diamondData, n=3)
## # A tibble: 3 × 5
##   carat cut     color clarity price
##   <dbl> <ord>   <ord> <ord>   <int>
## 1  0.23 Ideal   E     SI2       326
## 2  0.21 Premium E     SI1       326
## 3  0.23 Good    E     VS1       327

Shiny Application

The application lets users examine the relationship between a diamond's characteristics and its price. Users can select characteristics and view their univariate linear regression results in relation to price. A box plot is shown in a separate tab, where users can choose whether to display outliers.
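
Outside of Shiny, the two views described above could be approximated as follows. This is a rough sketch, not the application's actual code: the names `predictor` and `showOutliers` are hypothetical stand-ins for the user's inputs, and grouping the box plot by cut is an assumption.

```r
library(ggplot2)

diamondData <- diamonds[, c("carat", "cut", "color", "clarity", "price")]

# Hypothetical stand-ins for the values a user would pick in the app
predictor <- "carat"
showOutliers <- FALSE

# Univariate linear regression of price on the selected characteristic
fit <- lm(reformulate(predictor, response = "price"), data = diamondData)
print(summary(fit)$coefficients)

# Box plot of price by cut (assumed grouping); outlier.shape = NA hides outliers
p <- ggplot(diamondData, aes(x = cut, y = price)) +
  geom_boxplot(outlier.shape = if (showOutliers) 19 else NA)
print(p)
```

In a Shiny app, `predictor` and `showOutliers` would typically come from input widgets such as a drop-down and a checkbox, with the model summary and plot rendered reactively.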

Screenshot - Box Plot

Screenshot - Regression Model

Code