Reading in the data - r/AnythingGoesNews

The following code reads in the data:

## Loading required package: MASS
## Loading required package: survival
## Warning: Missing column names filled in: 'X1' [1]
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   X1 = col_double(),
##   V1 = col_character(),
##   V2 = col_double(),
##   V3 = col_double(),
##   V4 = col_double(),
##   V5 = col_double(),
##   V6 = col_double(),
##   V7 = col_character(),
##   `2015 Slant` = col_double(),
##   `2016 Slant` = col_double(),
##   `2017 Slant` = col_double(),
##   `2018 Slant` = col_double(),
##   `2019 Slant` = col_double()
## )
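
The chunk that produced this output is not shown in the rendering. Here is a minimal sketch of what such a read might look like, assuming `readr::read_csv()` (which the column specification message suggests). The file and its values are a toy stand-in, since the real file name and contents are not given:

```r
library(readr)

# Toy stand-in for the real CSV (hypothetical values; the real file is not shown).
# The header starts with an empty field, so readr generates a name for that column.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  ",V1,2015 Slant,2016 Slant",
  "1,post_a,0.70,0.63",
  "2,post_b,0.68,0.61"
), csv_path)

slant <- read_csv(csv_path)

# Column names containing spaces need backticks or [[ ]] to access.
scores_2015 <- slant[["2015 Slant"]]
```

The generated name for the unnamed first column depends on the readr version, so later code is safer referring to the year columns by name.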

The data come from the r/AnythingGoesNews subreddit and cover January through March of each year from 2015 to 2019. Each year’s computed slant scores are fitted with a distribution, with candidates chosen from the Cullen & Gray graph. For this subreddit, the summary below accompanies a simple plot of the 2015 slant scores and the corresponding Cullen & Gray graph:

## summary statistics
## ------
## min:  5.784828e-06   max:  0.9999942 
## median:  0.6978874 
## mean:  0.6871052 
## estimated sd:  0.08349227 
## estimated skewness:  -4.856242 
## estimated kurtosis:  37.53401
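
Summary statistics of this form are what `descdist()` from the fitdistrplus package reports alongside a Cullen & Gray graph. Here is a minimal sketch, assuming that package and using simulated values as a hypothetical stand-in for one year's scaled scores:

```r
library(fitdistrplus)

set.seed(1)
# Hypothetical stand-in for one year's scaled slant scores in (0, 1).
x <- rbeta(500, shape1 = 20, shape2 = 9)

# graph = FALSE skips the Cullen & Gray plot and only computes the statistics.
# With graph = TRUE, boot = 100 would add bootstrapped points to the plot.
stats <- descdist(x, discrete = FALSE, graph = FALSE)

# The skewness and kurtosis are the two coordinates used in the graph.
c(skewness = stats$skewness, kurtosis = stats$kurtosis)
```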

2015 Slant scores:

Based on the Cullen & Gray graph, the candidate distributions are beta, gamma and lognormal. We fit Weibull, beta and gamma. Because the fits use maximum likelihood estimation (MLE), which requires every observation to lie inside the support of the distribution being fitted, the scores are first scaled so that all values are positive. The 2015 slant scores are fitted first:
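
The scaling and fitting step can be sketched as follows. This assumes `fitdist()` from fitdistrplus, simulated data, and a min-max rescaling with a small offset `eps`. The exact scaling used for the report is not shown, so the rescaling here is only one plausible choice:

```r
library(fitdistrplus)

set.seed(1)
# Hypothetical raw scores; the real data are not reproduced here.
x_raw <- rbeta(500, shape1 = 20, shape2 = 9)

# Squeeze into the open interval (0, 1): beta needs 0 < x < 1,
# gamma and Weibull need x > 0. eps is an arbitrary small offset (assumption).
eps <- 1e-6
x <- (x_raw - min(x_raw) + eps) / (max(x_raw) - min(x_raw) + 2 * eps)

# Fit each candidate by maximum likelihood.
fit_weibull <- fitdist(x, "weibull", method = "mle")
fit_beta    <- fitdist(x, "beta",    method = "mle")
fit_gamma   <- fitdist(x, "gamma",   method = "mle")

# Compare the fits by AIC (lower is better).
aic <- c(weibull = fit_weibull$aic, beta = fit_beta$aic, gamma = fit_gamma$aic)
sort(aic)
```

AIC is only one way to rank the candidates. `gofstat()` and the diagnostic plots from `plot()` on a fitted object give complementary views.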

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -120.640    0.000    0.000   -1.864    0.000   52.224
##      25%      50%      75%      90%      99% 
##  0.00000  0.00000  0.00000  2.03150 18.60485
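
The two blocks of output above have the layout printed by base R's `summary()` and `quantile()`. The rendering does not say which quantity is being summarized, so the vector `d` below is purely hypothetical and only illustrates the calls:

```r
set.seed(1)
# Hypothetical vector: mostly zeros with a few large positive and negative values.
d <- c(rep(0, 90), rnorm(10, mean = 0, sd = 40))

# Six-number summary: Min., 1st Qu., Median, Mean, 3rd Qu., Max.
summary(d)

# Selected quantiles, including the upper tail.
q <- quantile(d, probs = c(0.25, 0.50, 0.75, 0.90, 0.99))
q
```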

2016 Slant scores:

Here are the 2016 slant scores plotted:

## summary statistics
## ------
## min:  5.983903e-06   max:  0.999994 
## median:  0.6330551 
## mean:  0.6245723 
## estimated sd:  0.08264903 
## estimated skewness:  -3.432226 
## estimated kurtosis:  30.04946
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -105.792    0.000    0.000   -1.418    0.000   61.321
##     25%     50%     75%     90%     99% 
##  0.0000  0.0000  0.0000  2.2865 30.4959

2017 Slant scores:

Here are the 2017 slant scores plotted:

## summary statistics
## ------
## min:  5.048414e-06   max:  0.999995 
## median:  0.4533022 
## mean:  0.4512509 
## estimated sd:  0.07483273 
## estimated skewness:  0.8263925 
## estimated kurtosis:  28.50812
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -89.7900   0.0000   0.0000  -0.4063   0.0000 108.2900
##     25%     50%     75%     90%     99% 
##  0.0000  0.0000  0.0000  2.1175 45.8463

2018 Slant scores:

Here are the 2018 slant scores plotted:

## summary statistics
## ------
## min:  4.546074e-06   max:  0.9999955 
## median:  0.1584261 
## mean:  0.1738089 
## estimated sd:  0.1048226 
## estimated skewness:  6.378991 
## estimated kurtosis:  47.00718
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -34.848   0.000   0.000   3.384   0.000 185.120
##      25%      50%      75%      90%      99% 
##   0.0000   0.0000   0.0000   0.0000 122.3242

2019 Slant scores:

Here are the 2019 slant scores plotted:

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -57.0240   0.0000   0.0000  -0.3569   0.0000  39.1440
##    25%    50%    75%    90%    99% 
## 0.0000 0.0000 0.0000 0.0000 0.1668
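
Since the same steps are repeated for every year, the per-year statistics can also be gathered in one pass. Here is a minimal sketch, assuming fitdistrplus and a hypothetical data frame whose column names follow the column specification printed when the data were read:

```r
library(fitdistrplus)

set.seed(1)
# Hypothetical data frame standing in for the real one (simulated values).
slant <- data.frame(
  `2015 Slant` = rbeta(300, 20, 9),
  `2016 Slant` = rbeta(300, 18, 11),
  check.names = FALSE
)

# Pick out the year columns by their " Slant" suffix.
year_cols <- grep(" Slant$", names(slant), value = TRUE)

# Compute the descriptive statistics for each year without plotting.
stats_by_year <- lapply(year_cols, function(col) {
  descdist(slant[[col]], graph = FALSE)
})
names(stats_by_year) <- year_cols

# Collect mean, skewness and kurtosis into a matrix, one column per year.
year_table <- sapply(stats_by_year, function(s) {
  c(mean = s$mean, skewness = s$skewness, kurtosis = s$kurtosis)
})
year_table
```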