Developing Data Products Course Project

Elena Fedorova
15 October 2018

Introduction

This presentation is part of the course project for the Developing Data Products course delivered by Johns Hopkins University on Coursera. The course project consists of two parts: developing a Shiny application and publishing it on RStudio's servers, and creating a reproducible pitch presentation about the application. The documentation on this project is as follows:

Application "The Diamonds catalogue"

The application developed in this course project simulates the site of a diamond seller: it allows a prospective buyer to find diamonds with the characteristics of interest and to estimate the total cost of their purchase.

The user enters the number of pieces they would like to purchase per order line. In the second filter, the user specifies the maximum budget they would like to spend on the purchase: the table shows only those diamonds for which the total purchase amount is below the entered value.

Additional selection filters allow users to restrict the selection by various diamond characteristics, such as clarity and color.
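
The filtering described above can be illustrated with a minimal sketch in base R. It is not the application's actual code: the function name filter_catalogue, its arguments, and the small sample data frame are hypothetical, and the sample only borrows the column names price, clarity and color from the diamonds data set.

```r
# Hypothetical helper: the name and arguments are illustrative only,
# they are not taken from the application's source.
filter_catalogue <- function(data, quantity, max_budget,
                             clarity = NULL, color = NULL) {
  # Total cost of an order line = unit price * number of pieces.
  data$total <- data$price * quantity

  # Keep only rows whose total is below the entered budget.
  keep <- data$total < max_budget

  # Optional characteristic filters; NULL means "no restriction".
  if (!is.null(clarity)) keep <- keep & data$clarity %in% clarity
  if (!is.null(color)) keep <- keep & data$color %in% color

  data[keep, , drop = FALSE]
}

# Small made-up sample using the same column names as ggplot2::diamonds.
sample_diamonds <- data.frame(
  price   = c(326, 334, 2757, 554),
  clarity = c("SI2", "VS2", "SI1", "VS1"),
  color   = c("E", "I", "E", "G"),
  stringsAsFactors = FALSE
)

# Three pieces per order line, budget of 2000, colors E or G only.
result <- filter_catalogue(sample_diamonds, quantity = 3,
                           max_budget = 2000, color = c("E", "G"))
print(result)
```

In a Shiny application, a function of this kind would typically be called inside a reactive expression so that the table is recomputed whenever the user changes an input; how the actual application is wired is not shown here.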

The Diamonds data set

This data set contains the prices and other attributes of almost 54,000 diamonds.

library(ggplot2)
str(diamonds)
Classes 'tbl_df', 'tbl' and 'data.frame':   53940 obs. of  10 variables:
 $ carat  : num  0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
 $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
 $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
 $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
 $ depth  : num  61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
 $ table  : num  55 61 65 58 58 57 57 55 61 61 ...
 $ price  : int  326 326 327 334 335 336 336 337 337 338 ...
 $ x      : num  3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
 $ y      : num  3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
 $ z      : num  2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...

Plot

Here is a graphical overview of the data set (count of observations by selected characteristics).

[Figure: count of observations by selected diamond characteristics]
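
The code behind the plot is not included in this presentation. As a rough sketch of how a count overview of this kind could be drawn with ggplot2, the example below counts diamonds by cut and colors the bars by clarity; the choice of these two characteristics and the axis labels are assumptions, not the author's original chunk.

```r
library(ggplot2)

# Assumption: cut and clarity stand in for the "selected characteristics";
# the original chunk may have used different variables.
p <- ggplot(diamonds, aes(x = cut, fill = clarity)) +
  geom_bar() +  # geom_bar() counts the observations in each group
  labs(x = "Cut", y = "Number of diamonds", fill = "Clarity")

print(p)
```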