Reading in the data - r/American_Politics

The following code reads in the data:

## Loading required package: MASS
## Loading required package: survival
## Warning: Missing column names filled in: 'X1' [1]
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   X1 = col_double(),
##   `News Outlet` = col_character(),
##   `Jan-Mar 2015` = col_double(),
##   `Jan-Mar 2016` = col_double(),
##   `Jan-Mar 2017` = col_double(),
##   `Jan-Mar 2018` = col_double(),
##   `Jan-Mar 2019` = col_double(),
##   `Estimated Slant of News Outlet` = col_character(),
##   `2015 Slant` = col_double(),
##   `2016 Slant` = col_double(),
##   `2017 Slant` = col_double(),
##   `2018 Slant` = col_double(),
##   `2019 Slant` = col_double()
## )

The data cover the r/American_Politics subreddit for January through March of each year from 2015 to 2019. Each year’s computed slant scores are fitted with a distribution, with the candidates chosen from a Cullen and Frey graph. For this subreddit, the summary statistics for the 2015 slant scores, which accompany the Cullen and Frey graph, are as follows:

## summary statistics
## ------
## min:  8.351289e-06   max:  0.9999916 
## median:  0.5632193 
## mean:  0.5717666 
## estimated sd:  0.0818536 
## estimated skewness:  0.9953259 
## estimated kurtosis:  26.49989
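
The standard deviation, skewness and kurtosis reported above are what position a sample on the skewness-kurtosis plot. As a rough illustration of how such values can be computed, here is a minimal Python sketch using the usual bias-corrected sample formulas. The function name `moment_stats` and the toy data are hypothetical, and it is an assumption that these formulas match the ones used to produce the output above.

```python
import math

def moment_stats(data):
    """Return (sd, skewness, kurtosis) using bias-corrected sample estimates.

    Kurtosis is on the non-excess scale, so a normal distribution is near 3.
    """
    n = len(data)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(data) / n
    # Central moments with divisor n.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    sd = math.sqrt(m2 * n / (n - 1))
    skew = math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * m4 / m2 ** 2 - 3 * (n - 1)) + 3
    return sd, skew, kurt

# Toy data, not the slant scores.
sd, skew, kurt = moment_stats([1.0, 2.0, 3.0, 4.0, 5.0])
# The plot places a sample at x = squared skewness, y = kurtosis.
point = (skew ** 2, kurt)
```

On this scale, the kurtosis of about 26.5 reported for 2015 is far above the normal value of 3, which indicates very heavy tails.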

2015 Slant scores:

Based on the graph, the candidate distributions are beta, gamma and lognormal; we fit Weibull, beta and gamma. Because the fits use maximum likelihood estimation (MLE), the scores are first rescaled so that all values are positive. First, the 2015 slant scores are fitted as follows:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -67.440   0.000   0.000   1.023   0.000  52.300
##     25%     50%     75%     90%     99% 
##  0.0000  0.0000  0.0000  0.8540 46.0143
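
The raw 2015 scores summarised above range from -67.44 to 52.30, while the values used for fitting lie strictly between 0 and 1. A minimal Python sketch of one rescaling that would produce this is given below. The function name `rescale_open_unit` and the offset `eps` are assumptions: the source does not state the transformation, and `eps = 0.001` is chosen only because it reproduces the printed minimum (8.351289e-06) and median (0.5632193) from the printed raw extremes.

```python
def rescale_open_unit(values, eps=0.001):
    """Map values into the open interval (0, 1) with a shifted min-max scaling.

    The small offset eps keeps the smallest value above 0 and the largest
    below 1, which beta, gamma and Weibull likelihoods need.
    """
    lo, hi = min(values), max(values)
    return [(v - lo + eps) / (hi - lo + 2 * eps) for v in values]

# Printed raw minimum, median and maximum of the 2015 scores.
raw = [-67.44, 0.0, 52.3]
scaled = rescale_open_unit(raw)
```

With this offset, a raw score of 0 maps to roughly 0.563, so the bulk of the 2015 data, which sits at 0, lands near the reported median.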

2016 Slant scores:

Here are the 2016 slant scores plotted:

## summary statistics
## ------
## min:  1.694571e-05   max:  0.9999831 
## median:  0.5690538 
## mean:  0.5696607 
## estimated sd:  0.08418964 
## estimated skewness:  -2.14618 
## estimated kurtosis:  27.55875
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -33.58000   0.00000   0.00000   0.03582   0.00000  25.43000
##     25%     50%     75%     90%     99% 
##  0.0000  0.0000  0.0000  1.2625 11.9570

2017 Slant scores:

Here are the 2017 slant scores plotted:

## summary statistics
## ------
## min:  2.065177e-05   max:  0.9999793 
## median:  0.3697699 
## mean:  0.3748525 
## estimated sd:  0.09457507 
## estimated skewness:  3.263917 
## estimated kurtosis:  27.58009
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -17.9040   0.0000   0.0000   0.2461   0.0000  30.5160
##     25%     50%     75%     90%     99% 
##  0.0000  0.0000  0.0000  0.3710 24.5172

2018 Slant scores:

Here are the 2018 slant scores plotted:

## summary statistics
## ------
## min:  1.502855e-05   max:  0.999985 
## median:  0.1305831 
## mean:  0.1488947 
## estimated sd:  0.1160694 
## estimated skewness:  6.405704 
## estimated kurtosis:  44.27104
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -8.688   0.000   0.000   1.218   0.000  57.850
##      25%      50%      75%      90%      99% 
##  0.00000  0.00000  0.00000  0.00000 53.45655

2019 Slant scores:

Here are the 2019 slant scores plotted:

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -10.86000   0.00000   0.00000  -0.08432   0.00000   6.52400
## 25% 50% 75% 90% 99% 
##   0   0   0   0   0