library(ggplot2)
library(dplyr)
# Load the data and encode the three quality grades as ordered factors,
# with levels listed from lowest to highest grade.
Diamonds <- read.csv("Diamonds.csv")
Diamonds$cut <- factor(Diamonds$cut,
                       levels = c('Fair', 'Good', 'Very Good', 'Premium', 'Ideal'),
                       ordered = TRUE)
Diamonds$color <- factor(Diamonds$color,
                         levels = c('J', 'I', 'H', 'G', 'F', 'E', 'D'),
                         ordered = TRUE)
Diamonds$clarity <- factor(Diamonds$clarity,
                           levels = c('I1', 'SI2', 'SI1', 'VS2', 'VS1', 'VVS2', 'VVS1', 'IF'),
                           ordered = TRUE)
str(Diamonds)
'data.frame': 53940 obs. of 10 variables:
$ carat : num 0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
$ cut : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
$ color : Ord.factor w/ 7 levels "J"<"I"<"H"<"G"<..: 6 6 6 2 1 1 2 3 6 3 ...
$ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
$ depth : num 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
$ table : num 55 61 65 58 58 57 57 55 61 61 ...
$ price : int 326 326 327 334 335 336 336 337 337 338 ...
$ x : num 3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
$ y : num 3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
$ z : num 2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
summary(Diamonds)
     carat              cut           color       clarity          depth           table           price              x
 Min.   :0.2000   Fair     : 1610   J: 2808   SI1    :13065   Min.   :43.00   Min.   :43.00   Min.   :  326   Min.   : 0.000
 1st Qu.:0.4000   Good     : 4906   I: 5422   VS2    :12258   1st Qu.:61.00   1st Qu.:56.00   1st Qu.:  950   1st Qu.: 4.710
 Median :0.7000   Very Good:12082   H: 8304   SI2    : 9194   Median :61.80   Median :57.00   Median : 2401   Median : 5.700
 Mean   :0.7979   Premium  :13791   G:11292   VS1    : 8171   Mean   :61.75   Mean   :57.46   Mean   : 3933   Mean   : 5.731
 3rd Qu.:1.0400   Ideal    :21551   F: 9542   VVS2   : 5066   3rd Qu.:62.50   3rd Qu.:59.00   3rd Qu.: 5324   3rd Qu.: 6.540
 Max.   :5.0100                      E: 9797   VVS1   : 3655   Max.   :79.00   Max.   :95.00   Max.   :18823   Max.   :10.740
                                     D: 6775   (Other): 2531
       y                z
 Min.   : 0.000   Min.   : 0.000
 1st Qu.: 4.720   1st Qu.: 2.910
 Median : 5.710   Median : 3.530
 Mean   : 5.735   Mean   : 3.539
 3rd Qu.: 6.540   3rd Qu.: 4.040
 Max.   :58.900   Max.   :31.800
In Plot 1, clarity ‘IF’ is associated with the greatest rise in price compared with the other clarity grades.
In Plot 2, cut ‘Ideal’ is associated with the greatest rise in price compared with the other cuts.
In Plot 3, color ‘D’ is associated with the greatest rise in price compared with the other colors.
Overall, the three plots suggest that a diamond with clarity IF, cut Ideal, and color D would have the greatest value per carat.
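One way to put rough numbers on these readings is to fit a straight line of price against carat within each level of a grade and compare the slopes. The sketch below is only an illustration: the helper name `price_slope_by_level` is hypothetical, and a linear fit is an assumption that may not match how the plots were drawn.

```r
# Hypothetical helper (not part of the original analysis): for each level of
# a grouping column, estimate the slope of a straight-line fit price ~ carat.
price_slope_by_level <- function(data, group) {
  # One data frame per level; levels with no rows are dropped
  pieces <- split(data, data[[group]], drop = TRUE)
  slopes <- sapply(pieces, function(d) {
    # Slope = estimated change in price for a one-carat increase
    unname(coef(lm(price ~ carat, data = d))["carat"])
  })
  # Largest slope first; levels whose slope cannot be estimated are omitted
  sort(slopes, decreasing = TRUE)
}

# Usage, assuming `Diamonds` has been loaded and recoded as above
if (exists("Diamonds")) {
  print(price_slope_by_level(Diamonds, "clarity"))
  print(price_slope_by_level(Diamonds, "cut"))
  print(price_slope_by_level(Diamonds, "color"))
}
```

The first name in each printed vector is the level with the steepest fitted price increase per carat, which can then be checked against what the corresponding plot shows.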