Reading in the data - r/Anarchism

The following code reads in the data:

## Loading required package: MASS
## Loading required package: survival
## Warning: Missing column names filled in: 'X1' [1]
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   X1 = col_double(),
##   `News Outlet` = col_character(),
##   `Jan-Mar 2015` = col_double(),
##   `Jan-Mar 2016` = col_double(),
##   `Jan-Mar 2017` = col_double(),
##   `Jan-Mar 2018` = col_double(),
##   `Jan-Mar 2019` = col_double(),
##   `Estimated Slant of News Outlet` = col_character(),
##   `2015 Slant` = col_double(),
##   `2016 Slant` = col_double(),
##   `2017 Slant` = col_double(),
##   `2018 Slant` = col_double(),
##   `2019 Slant` = col_double()
## )
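
The column specification above is the message that `readr::read_csv()` prints, and the `X1` warning is what readr 1.x emits for an unnamed first column (typically a saved row index). The loading code itself is not shown in this report, so the following is only a minimal sketch of what it may look like. The helper name `read_slant_data` and the file name are hypothetical.

```r
library(readr)

# Hypothetical helper (not from the original report): read the slant CSV and
# drop the unnamed index column that readr auto-names.
read_slant_data <- function(path) {
  slant_data <- read_csv(path)
  # readr 1.x names an unnamed column "X1"; readr 2.x names it "...1".
  index_cols <- names(slant_data) %in% c("X1", "...1")
  slant_data[, !index_cols, drop = FALSE]
}

# Assumed file name; the real path is not given in the source.
csv_path <- "anarchism_slant.csv"
if (file.exists(csv_path)) {
  slant_data <- read_slant_data(csv_path)
  # Column names with spaces need [[ ]] or backticks.
  slant_2015 <- slant_data[["2015 Slant"]]
}
```

Dropping the index column is a design choice made here for tidiness; the per-year slant columns are unaffected either way.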

The data cover the r/Anarchism subreddit for January through March of each year from 2015 to 2019. Each year's computed slant scores are fitted with a distribution chosen with the help of the Cullen & Frey graph. For this subreddit, the following is a simple plot of the 2015 slant score data, along with the Cullen & Frey graph:

## summary statistics
## ------
## min:  1.882601e-05   max:  0.9999812 
## median:  0.9084868 
## mean:  0.8919591 
## estimated sd:  0.09380777 
## estimated skewness:  -7.010035 
## estimated kurtosis:  58.79051
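
The summary statistics above have the format printed by `descdist()` from the fitdistrplus package, which also draws the skewness-kurtosis graph (titled "Cullen and Frey graph" by the package). As a rough sketch of how such output may be produced, the example below uses synthetic, left-skewed data as a stand-in; it is not the report's data, and the `boot = 100` setting is an assumption.

```r
library(fitdistrplus)

set.seed(1)
# Synthetic stand-in for one year's slant scores (NOT the report's data):
# a left-skewed sample on (0, 1).
scores <- rbeta(500, shape1 = 9, shape2 = 1)

# Simple histogram of the raw scores.
hist(scores, breaks = 50,
     main = "Slant scores (synthetic)", xlab = "Slant score")

# Prints min, max, median, mean, sd, skewness and kurtosis, and draws the
# skewness-kurtosis graph; boot = 100 overlays 100 bootstrap samples.
stats <- descdist(scores, discrete = FALSE, boot = 100)
```

The returned list holds the same quantities that are printed, for example `stats$skewness` and `stats$kurtosis`, so they can be reused without parsing the console output.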

2015 Slant scores:

From the graph, the candidate distributions for fitting are beta, gamma and lognormal. The fits here use Weibull, beta and gamma. Note that because maximum likelihood estimation (MLE) is used to fit the data, the scores are scaled so that all values are positive. First, the 2015 slant scores are fitted as follows:

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -48.2560   0.0000   0.0000  -0.8779   0.0000   4.8600
##    25%    50%    75%    90%    99% 
## 0.0000 0.0000 0.0000 0.0000 1.8631
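
The fitting step described above (MLE fits of Weibull, beta and gamma on positively scaled scores) can be sketched with `fitdist()` from fitdistrplus. This is an illustration under stated assumptions, not the report's actual code: the data are synthetic, and the clamping with `eps` is one possible way to keep every value strictly positive, since the exact scaling used is not shown.

```r
library(fitdistrplus)

set.seed(42)
# Synthetic stand-in for one year's slant scores (NOT the report's data).
scores <- rbeta(1000, shape1 = 9, shape2 = 1)

# Assumed scaling: clamp values strictly inside (0, 1). Gamma and Weibull
# need x > 0, and the beta likelihood is undefined at exactly 0 or 1.
eps <- 1e-6
scaled <- pmin(pmax(scores, eps), 1 - eps)

# Maximum likelihood fits for the three distributions named in the text.
fits <- list(
  weibull = fitdist(scaled, "weibull", method = "mle"),
  beta    = fitdist(scaled, "beta",    method = "mle"),
  gamma   = fitdist(scaled, "gamma",   method = "mle")
)

# Compare the fits by AIC (lower is better).
aic <- sapply(fits, function(f) f$aic)
print(aic)
```

AIC is used here only as a convenient single-number comparison; `gofstat()` and the diagnostic plots from `plot()` on each fit give a fuller picture.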

2016 Slant scores:

Here are the 2016 slant scores plotted:

## summary statistics
## ------
## min:  3.280194e-05   max:  0.9999672 
## median:  0.9741193 
## mean:  0.9490859 
## estimated sd:  0.1048014 
## estimated skewness:  -6.123812 
## estimated kurtosis:  47.73204
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -29.6960   0.0000   0.0000  -0.7632   0.0000   0.7880
##    25%    50%    75%    90%    99% 
## 0.0000 0.0000 0.0000 0.0000 0.3133

2017 Slant scores:

Here are the 2017 slant scores plotted:

## summary statistics
## ------
## min:  2.165674e-05   max:  0.9999783 
## median:  0.8347591 
## mean:  0.8239026 
## estimated sd:  0.0711425 
## estimated skewness:  -8.881308 
## estimated kurtosis:  102.1249
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -38.5440   0.0000   0.0000  -0.5013   0.0000   7.6290
##     25%     50%     75%     90%     99% 
## 0.00000 0.00000 0.00000 0.00000 3.56115

2018 Slant scores:

Here are the 2018 slant scores plotted:

## summary statistics
## ------
## min:  2.348741e-05   max:  0.9999765 
## median:  0.8805195 
## mean:  0.873465 
## estimated sd:  0.07138593 
## estimated skewness:  -10.52302 
## estimated kurtosis:  127.6935
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -37.4880   0.0000   0.0000  -0.3004   0.0000   5.0860
##     25%     50%     75%     90%     99% 
## 0.00000 0.00000 0.00000 0.00000 2.91695

2019 Slant scores:

Here are the 2019 slant scores plotted:

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -43.8240   0.0000   0.0000  -0.3697   0.0000   0.0000
## 25% 50% 75% 90% 99% 
##   0   0   0   0   0