The following code reads in the data:
## Loading required package: MASS
## Loading required package: survival
## Warning: Missing column names filled in: 'X1' [1]
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## X1 = col_double(),
## `News Outlet` = col_character(),
## `Jan-Mar 2015` = col_double(),
## `Jan-Mar 2016` = col_double(),
## `Jan-Mar 2017` = col_double(),
## `Jan-Mar 2018` = col_double(),
## `Jan-Mar 2019` = col_double(),
## `Estimated Slant of News Outlet` = col_character(),
## `2015 Slant` = col_double(),
## `2016 Slant` = col_double(),
## `2017 Slant` = col_double(),
## `2018 Slant` = col_double(),
## `2019 Slant` = col_double()
## )
The data cover the r/Anarchism subreddit for January through March of each year from 2015 to 2019. Each year's computed slant scores are fitted with a distribution, with the candidates chosen from the Cullen and Frey graph. For this subreddit, the following shows a simple plot of the 2015 slant scores along with the Cullen and Frey graph:
## summary statistics
## ------
## min: 1.882601e-05 max: 0.9999812
## median: 0.9084868
## mean: 0.8919591
## estimated sd: 0.09380777
## estimated skewness: -7.010035
## estimated kurtosis: 58.79051
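The summary above has the layout printed by `descdist()` from the fitdistrplus package, which also draws the Cullen and Frey graph. Since the original chunk is not shown, the following is only a minimal sketch of such a call. The vector `x` is synthetic stand-in data (an assumption), not the subreddit scores.

```r
library(fitdistrplus)

# Synthetic, left-skewed stand-in for one year's scaled slant scores.
# This is NOT the r/Anarchism data; it only illustrates the call.
set.seed(1)
x <- rbeta(1000, shape1 = 8, shape2 = 1)

# descdist() computes min, max, median, mean, sd, skewness and kurtosis,
# and (with graph = TRUE, the default) draws the Cullen and Frey graph.
# boot = 100 adds bootstrapped points to show the uncertainty of the
# skewness and kurtosis estimates.
stats <- descdist(x, boot = 100)
print(stats)
```

The position of the observed point (and its bootstrap cloud) relative to the regions and lines for the beta, gamma, lognormal, and Weibull families is what guides the choice of candidates.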
Based on the Cullen and Frey graph, the candidate distributions are beta, gamma, and lognormal. We fit Weibull, beta, and gamma. Because maximum likelihood estimation (MLE) is used for the fits, the data are rescaled so that all values are positive. First, the 2015 slant scores are fitted as follows:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -48.2560 0.0000 0.0000 -0.8779 0.0000 4.8600
## 25% 50% 75% 90% 99%
## 0.0000 0.0000 0.0000 0.0000 1.8631
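The rescaling and fitting steps can be sketched as below. This is an illustration under stated assumptions rather than the original chunk: the helper `scale_to_unit()` and its offset `eps = 0.001` are hypothetical, and `raw` is synthetic data. The offset was chosen because, applied to the raw 2015 range printed above (-48.256 to 4.86), it gives a scaled minimum of about 1.88e-05, which is consistent with the 2015 summary statistics.

```r
library(fitdistrplus)

# Hypothetical helper: min-max scaling with a small offset so that every
# value lies strictly inside (0, 1), as the beta likelihood requires.
# eps = 0.001 is an assumption, not a value taken from the source.
scale_to_unit <- function(x, eps = 0.001) {
  (x - min(x) + eps) / (max(x) - min(x) + 2 * eps)
}

# Synthetic stand-in for one year's raw slant scores (NOT the real data):
# a long negative tail plus a few small positive values.
set.seed(42)
raw <- c(-rexp(400, rate = 0.3), runif(100, min = 0, max = 5))
scaled <- scale_to_unit(raw)

# Fit the three distributions named in the text by maximum likelihood
# (method = "mle" is the fitdist() default).
fits <- list(
  weibull = fitdist(scaled, "weibull"),
  beta    = fitdist(scaled, "beta"),
  gamma   = fitdist(scaled, "gamma")
)

# Compare the fits by AIC (lower is better).
aic <- sapply(fits, function(f) f$aic)
print(aic)
```

Calling `plot()` on any of the fitted objects shows the density, Q-Q, CDF, and P-P diagnostic panels for that fit.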
Here are the 2016 slant scores plotted:
## summary statistics
## ------
## min: 3.280194e-05 max: 0.9999672
## median: 0.9741193
## mean: 0.9490859
## estimated sd: 0.1048014
## estimated skewness: -6.123812
## estimated kurtosis: 47.73204
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -29.6960 0.0000 0.0000 -0.7632 0.0000 0.7880
## 25% 50% 75% 90% 99%
## 0.0000 0.0000 0.0000 0.0000 0.3133
Here are the 2017 slant scores plotted:
## summary statistics
## ------
## min: 2.165674e-05 max: 0.9999783
## median: 0.8347591
## mean: 0.8239026
## estimated sd: 0.0711425
## estimated skewness: -8.881308
## estimated kurtosis: 102.1249
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -38.5440 0.0000 0.0000 -0.5013 0.0000 7.6290
## 25% 50% 75% 90% 99%
## 0.00000 0.00000 0.00000 0.00000 3.56115
Here are the 2018 slant scores plotted:
## summary statistics
## ------
## min: 2.348741e-05 max: 0.9999765
## median: 0.8805195
## mean: 0.873465
## estimated sd: 0.07138593
## estimated skewness: -10.52302
## estimated kurtosis: 127.6935
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -37.4880 0.0000 0.0000 -0.3004 0.0000 5.0860
## 25% 50% 75% 90% 99%
## 0.00000 0.00000 0.00000 0.00000 2.91695
Here are the 2019 slant scores plotted:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -43.8240 0.0000 0.0000 -0.3697 0.0000 0.0000
## 25% 50% 75% 90% 99%
## 0 0 0 0 0