Durability Analysis
Distribution of ratings
Below is the distribution of shoe review ratings. These are box and whisker plots, where the box represents the interquartile range (IQR), the line in the middle of the box represents the median, and the whiskers extend to the most extreme values within 1.5 × IQR of the box. The points beyond the whiskers represent outliers.
These are coarse because rating scores are integers. For this reason, you’ll see odd things like no top-half IQR for the overall rating: at least a quarter of reviews score exactly 9, so the median and the third quartile are both 9.
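The collapsed upper half of the box is easy to reproduce with a handful of integer scores. This is a minimal sketch on made-up ratings (not the review data), using R's default quantile definition:

```r
# Made-up integer ratings, for illustration only (not the review data).
ratings <- c(6, 7, 8, 8, 9, 9, 9, 9, 10, 10)

# Quartiles with R's default quantile definition (type 7).
q <- quantile(ratings, probs = c(0.25, 0.5, 0.75))
print(q)

# When the median equals the third quartile, the box has no upper half.
upper_half_height <- q[["75%"]] - q[["50%"]]
print(upper_half_height)
```

Here the median and third quartile both land on 9, so the box is drawn with nothing above the median line.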
Below are the actual summary statistics for each rating on the raw reviews.
summary(reviewsAnalysis$rating_durability)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   7.000   8.000   7.801   9.000  10.000
summary(reviewsAnalysis$rating_overall)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   8.000   9.000   8.501   9.000  10.000
summary(reviewsAnalysis$rating_comfort)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    8.00    9.00    8.61   10.00   10.00
summary(reviewsAnalysis$rating_cushioning)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   8.000   9.000   8.427  10.000  10.000    4847
summary(reviewsAnalysis$rating_appearance)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   7.000   8.000   8.143  10.000  10.000
This is the distribution of average shoe ratings for all shoes with 5 or more reviews.
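For reference, here is a rough sketch of how per-shoe averages with a minimum review count could be computed. The toy data and the `shoe_id` column name are assumptions for illustration; the real pipeline behind `analysis` may differ.

```r
library(dplyr)

# Toy reviews; `shoe_id` is a hypothetical column name, not taken from the data.
toy_reviews <- tibble(
  shoe_id = c(rep("A", 5), rep("B", 3)),
  rating_durability = c(7, 8, 9, 8, 8, 10, 10, 9)
)

per_shoe <- toy_reviews |>
  group_by(shoe_id) |>
  filter(n() >= 5) |>          # keep only shoes with 5 or more reviews
  summarise(
    n_reviews = n(),
    rating_durability = mean(rating_durability, na.rm = TRUE)
  )

print(per_shoe)
```

Shoe B has only three reviews, so it drops out and only shoe A contributes an average.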
Below are the summary statistics for the rating dimensions per shoe.
summary(analysis$rating_durability)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   7.000   8.000   7.876   9.000  10.000       2
summary(analysis$rating_overall)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   8.000   8.500   8.447   9.000  10.000
summary(analysis$rating_comfort)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   8.000   8.700   8.571   9.200  10.000
summary(analysis$rating_cushioning)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  1.000   7.700   8.500   8.215   9.100  10.000     331
Durability Ratings By Year
I do not see any evidence of durability ratings changing over time.
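One way to check this kind of claim numerically is to look at the yearly means and the slope of a simple linear trend. The sketch below uses made-up data, and `review_year` is a hypothetical column name; it is not necessarily how the plot for this section was produced.

```r
library(dplyr)

# Toy data; `review_year` is a hypothetical column name.
toy_reviews <- tibble(
  review_year = c(2019, 2019, 2020, 2020, 2021, 2021),
  rating_durability = c(7, 9, 8, 8, 9, 7)
)

# Mean durability rating and review count per year.
by_year <- toy_reviews |>
  group_by(review_year) |>
  summarise(
    mean_durability = mean(rating_durability, na.rm = TRUE),
    n_reviews = n()
  )
print(by_year)

# Slope of a simple linear trend; a value near zero suggests no drift over time.
trend <- lm(rating_durability ~ review_year, data = toy_reviews)
slope <- coef(trend)[["review_year"]]
print(slope)
```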
Trainers vs Supershoes
Trainers
summary(
  reviewsAnalysis |>
    filter(trainer == TRUE) |>
    select(starts_with("rating_"))
)
 rating_overall   rating_comfort   rating_cushioning rating_durability
 Min.   : 1.000   Min.   : 0.000   Min.   : 1.000    Min.   : 0.000
 1st Qu.: 8.000   1st Qu.: 8.000   1st Qu.: 8.000    1st Qu.: 7.000
 Median : 9.000   Median : 9.000   Median : 9.000    Median : 8.000
 Mean   : 8.478   Mean   : 8.612   Mean   : 8.491    Mean   : 7.839
 3rd Qu.: 9.000   3rd Qu.:10.000   3rd Qu.:10.000    3rd Qu.: 9.000
 Max.   :10.000   Max.   :10.000   Max.   :10.000    Max.   :10.000
                                   NA's   :4294
  rating_value    rating_appearance
 Min.   : 0.000   Min.   : 0.000
 1st Qu.: 7.000   1st Qu.: 7.000
 Median : 8.000   Median : 8.000
 Mean   : 8.029   Mean   : 8.102
 3rd Qu.: 9.000   3rd Qu.:10.000
 Max.   :10.000   Max.   :10.000
Supershoes
summary(
  reviewsAnalysis |>
    filter(supershoe == TRUE) |>
    select(starts_with("rating_"))
)
 rating_overall   rating_comfort   rating_cushioning rating_durability
 Min.   : 1.000   Min.   : 1.000   Min.   : 2.000    Min.   : 1.000
 1st Qu.: 8.000   1st Qu.: 8.000   1st Qu.: 8.500    1st Qu.: 6.000
 Median : 9.000   Median : 9.000   Median : 9.000    Median : 8.000
 Mean   : 9.013   Mean   : 8.768   Mean   : 9.073    Mean   : 7.194
 3rd Qu.:10.000   3rd Qu.:10.000   3rd Qu.:10.000    3rd Qu.: 9.000
 Max.   :10.000   Max.   :10.000   Max.   :10.000    Max.   :10.000
                                   NA's   :11
  rating_value    rating_appearance
 Min.   : 1.000   Min.   : 3.000
 1st Qu.: 7.000   1st Qu.: 8.000
 Median : 8.000   Median : 9.000
 Mean   : 7.713   Mean   : 8.723
 3rd Qu.: 9.000   3rd Qu.:10.000
 Max.   :10.000   Max.   :10.000
Side by side:
# A tibble: 2 × 6
  shoe_type Overall Comfort Cushioning Durability Appearance
  <chr>       <dbl>   <dbl>      <dbl>      <dbl>      <dbl>
1 Supershoe    9.01    8.77       9.07       7.19       8.72
2 Trainer      8.48    8.61       8.49       7.84       8.10
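A table like this can be built by labelling each review with its shoe type and averaging each rating within the group. The sketch below uses toy data and assumes no shoe is flagged as both `trainer` and `supershoe`; the `shoe_type` labelling is my guess at the approach, not the exact code behind the table.

```r
library(dplyr)

# Toy reviews; the `trainer` and `supershoe` flags mirror the filters used above.
toy_reviews <- tibble(
  trainer = c(TRUE, TRUE, FALSE, FALSE),
  supershoe = c(FALSE, FALSE, TRUE, TRUE),
  rating_overall = c(8, 9, 9, 10),
  rating_durability = c(8, 8, 6, 8)
)

side_by_side <- toy_reviews |>
  filter(trainer | supershoe) |>
  # Assumes the two flags are mutually exclusive.
  mutate(shoe_type = if_else(supershoe, "Supershoe", "Trainer")) |>
  group_by(shoe_type) |>
  summarise(
    Overall = mean(rating_overall, na.rm = TRUE),
    Durability = mean(rating_durability, na.rm = TRUE)
  )

print(side_by_side)
```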
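Below one can see that durability ratings for supershoes are lower overall (mean 7.19 vs 7.84 for trainers, first quartile 6 vs 7), with far more very low scores.
And here you can see that, among the five metrics in the side-by-side table, durability is the only one on which supershoes rate lower than trainers; they do better on overall, comfort, cushioning, and appearance. Value, which is not in that table, is also lower for supershoes in the summaries above (mean 7.71 vs 8.03).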
Association of durability with various ratings
Here are the correlations between the ratings (and shoe weight). This isn’t super informative, but a few points:
Heavier shoes score slightly higher on durability (0.13) and cushioning (0.06), BUT NOT OVERALL (-0.04) or on appearance (-0.10). The correlation of weight with comfort is essentially zero (0.002).
The correlation between durability and comfort is not very high: 0.33 (compared to overall with comfort: 0.70).
reviewsAnalysis |>
  select(
    rating_overall,
    rating_comfort,
    rating_cushioning,
    rating_durability,
    rating_appearance,
    weight_in_ounces
  ) |>
  cor(use = "pairwise.complete.obs")
                  rating_overall rating_comfort rating_cushioning
rating_overall        1.00000000     0.70399429        0.52412706
rating_comfort        0.70399429     1.00000000        0.56569732
rating_cushioning     0.52412706     0.56569732        1.00000000
rating_durability     0.47194670     0.32523927        0.32598963
rating_appearance     0.38340333     0.30971773        0.23934653
weight_in_ounces     -0.03894765     0.00239362        0.06433817
                  rating_durability rating_appearance weight_in_ounces
rating_overall            0.4719467        0.38340333      -0.03894765
rating_comfort            0.3252393        0.30971773       0.00239362
rating_cushioning         0.3259896        0.23934653       0.06433817
rating_durability         1.0000000        0.25120632       0.12630440
rating_appearance         0.2512063        1.00000000      -0.09692123
weight_in_ounces          0.1263044       -0.09692123       1.00000000
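Note that `use = "pairwise.complete.obs"` means each correlation is computed on whichever rows are non-missing for that particular pair, so different cells of the matrix can rest on different numbers of reviews (cushioning, for example, has thousands of NAs). A small sketch on made-up data shows the difference from dropping every incomplete row:

```r
# Made-up data with missing values, for illustration only.
toy <- data.frame(
  rating_overall = c(8, 9, 10, 7, 9),
  rating_cushioning = c(NA, 9, 10, 7, 8),
  weight_in_ounces = c(9.5, 8.0, NA, 10.2, 8.8)
)

# Each pair uses all rows where both of its columns are present.
pairwise <- cor(toy, use = "pairwise.complete.obs")

# Every pair uses only rows with no missing value in any column.
complete <- cor(toy, use = "complete.obs")

# Rows available for overall vs cushioning, versus rows complete in all columns.
n_pair <- sum(complete.cases(toy[, c("rating_overall", "rating_cushioning")]))
n_complete <- sum(complete.cases(toy))

print(pairwise)
print(complete)
print(c(n_pair = n_pair, n_complete = n_complete))
```

With pairwise deletion the overall/cushioning cell uses four rows, while listwise deletion would use only three for every cell.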