The following code reads in the data:
## Loading required package: MASS
## Loading required package: survival
## Warning: Missing column names filled in: 'X1' [1]
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## X1 = col_double(),
## `News Outlet` = col_character(),
## `Jan-Mar 2015` = col_double(),
## `Jan-Mar 2016` = col_double(),
## `Jan-Mar 2017` = col_double(),
## `Jan-Mar 2018` = col_double(),
## `Jan-Mar 2019` = col_double(),
## `Estimated Slant of News Outlet` = col_character(),
## `2015 Slant` = col_double(),
## `2016 Slant` = col_double(),
## `2017 Slant` = col_double(),
## `2018 Slant` = col_double(),
## `2019 Slant` = col_double()
## )
The data cover the r/American_Politics subreddit for January through March of each year from 2015 to 2019. Each year's computed slant scores are fitted with a distribution, with candidate distributions chosen from the Cullen and Frey graph. For this subreddit, the following shows a simple plot of the 2015 slant score data along with the Cullen and Frey graph:
## summary statistics
## ------
## min: 8.351289e-06 max: 0.9999916
## median: 0.5632193
## mean: 0.5717666
## estimated sd: 0.0818536
## estimated skewness: 0.9953259
## estimated kurtosis: 26.49989
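The summary above resembles what `descdist()` from the fitdistrplus package prints when it draws its skewness-kurtosis graph. As a rough sketch of how such a summary could be produced, the following uses placeholder data (`scaled_scores` here is a hypothetical stand-in, not the subreddit data) and also computes the plain moment-based skewness and kurtosis that the graph is built on. The helper names `sample_skewness` and `sample_kurtosis` are illustrative, and the printed output above may rely on unbiased estimators instead.

```r
# Placeholder data standing in for one year's scaled slant scores (hypothetical).
set.seed(1)
scaled_scores <- rbeta(500, shape1 = 5, shape2 = 4)

# Moment-based sample skewness and kurtosis (not the unbiased estimators).
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
sample_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

print(c(skewness = sample_skewness(scaled_scores),
        kurtosis = sample_kurtosis(scaled_scores)))

# If fitdistrplus is installed, descdist() prints the summary statistics
# and draws the skewness-kurtosis graph in one call.
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  fitdistrplus::descdist(scaled_scores, discrete = FALSE)
}
```

A kurtosis far above 3, as in the summary above, indicates much heavier tails than a normal distribution would have.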
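Based on the graph, the candidates for distribution fitting are beta, gamma, and lognormal. We use Weibull, beta, and gamma. Because maximum likelihood estimation (MLE) requires every observation to lie within the support of the fitted distribution, we rescale the data so that all values are positive (the summary statistics above show the scaled values lying strictly between 0 and 1). First, the 2015 slant scores are fitted as follows: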
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -67.440 0.000 0.000 1.023 0.000 52.300
## 25% 50% 75% 90% 99%
## 0.0000 0.0000 0.0000 0.8540 46.0143
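The rescaling and fitting steps described above can be sketched as follows. The original code is not shown, so this is only an illustration under stated assumptions: `scale_to_unit` is a hypothetical helper name, `raw_scores` is placeholder data, and the offset of 0.001 is a guess that happens to reproduce the scaled minimum and maximum printed for 2015 from the raw minimum and maximum.

```r
# Hypothetical helper: min-max rescaling with a small offset so that every
# value lies strictly inside (0, 1). The offset 0.001 is an assumption.
scale_to_unit <- function(x, eps = 0.001) {
  (x - min(x) + eps) / (max(x) - min(x) + 2 * eps)
}

# Placeholder raw slant scores (not the subreddit data).
set.seed(1)
raw_scores <- rnorm(500, mean = 1, sd = 15)

# Raw-scale summaries in the same form as the printed output.
print(summary(raw_scores))
print(quantile(raw_scores, probs = c(0.25, 0.50, 0.75, 0.90, 0.99)))

scaled_scores <- scale_to_unit(raw_scores)

# Fit the three distributions by maximum likelihood if fitdistrplus is
# available, and compare them by AIC (lower is better).
if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  aic <- sapply(c("weibull", "beta", "gamma"), function(d) {
    fitdistrplus::fitdist(scaled_scores, distr = d, method = "mle")$aic
  })
  print(aic)
}
```

With the 2015 raw extremes (-67.44 and 52.3), this helper gives a scaled minimum of about 8.35e-06, in line with the printed summary statistics.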
Here are the 2016 slant scores plotted:
## summary statistics
## ------
## min: 1.694571e-05 max: 0.9999831
## median: 0.5690538
## mean: 0.5696607
## estimated sd: 0.08418964
## estimated skewness: -2.14618
## estimated kurtosis: 27.55875
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -33.58000 0.00000 0.00000 0.03582 0.00000 25.43000
## 25% 50% 75% 90% 99%
## 0.0000 0.0000 0.0000 1.2625 11.9570
Here are the 2017 slant scores plotted:
## summary statistics
## ------
## min: 2.065177e-05 max: 0.9999793
## median: 0.3697699
## mean: 0.3748525
## estimated sd: 0.09457507
## estimated skewness: 3.263917
## estimated kurtosis: 27.58009
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -17.9040 0.0000 0.0000 0.2461 0.0000 30.5160
## 25% 50% 75% 90% 99%
## 0.0000 0.0000 0.0000 0.3710 24.5172
Here are the 2018 slant scores plotted:
## summary statistics
## ------
## min: 1.502855e-05 max: 0.999985
## median: 0.1305831
## mean: 0.1488947
## estimated sd: 0.1160694
## estimated skewness: 6.405704
## estimated kurtosis: 44.27104
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8.688 0.000 0.000 1.218 0.000 57.850
## 25% 50% 75% 90% 99%
## 0.00000 0.00000 0.00000 0.00000 53.45655
Here are the 2019 slant scores plotted:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10.86000 0.00000 0.00000 -0.08432 0.00000 6.52400
## 25% 50% 75% 90% 99%
## 0 0 0 0 0