We used the Google Jigsaw Perspective API to quantify the use of harmful language in each post. We then applied Principal Component Analysis (PCA) to these scores and retained only the first principal component (harm_PC1). We also considered the quality of the news sources that users shared (domain_quality_rating).
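The PCA step can be illustrated with a minimal base-R sketch. It is not the pipeline used for this report: the scores are simulated, and the column names (toxicity, severe_toxicity, insult, profanity) are assumptions, since the text does not say which Perspective attributes entered the PCA.

```r
# Hypothetical example: simulated scores stand in for Perspective API output.
set.seed(42)
n <- 500
latent <- rnorm(n)  # shared "harm" signal driving every attribute

# Assumed attribute columns; the report does not list the ones actually used.
scores <- data.frame(
  toxicity        = latent + rnorm(n, sd = 0.5),
  severe_toxicity = latent + rnorm(n, sd = 0.7),
  insult          = latent + rnorm(n, sd = 0.6),
  profanity       = latent + rnorm(n, sd = 0.8)
)

# PCA on centered, unit-variance columns
pca <- prcomp(scores, center = TRUE, scale. = TRUE)

# Keep only the first principal component as the composite harm score
harm_PC1 <- pca$x[, 1]

# Share of total variance captured by the first component
var_explained_pc1 <- pca$sdev[1]^2 / sum(pca$sdev^2)
print(round(var_explained_pc1, 3))

# Loadings of each attribute on the first component
print(pca$rotation[, 1])
```

The sign of a principal component is arbitrary, so it is worth checking the loadings to confirm that higher harm_PC1 values correspond to more harmful language before interpreting regression coefficients.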
For the LinkedIn data, we have 157,171 posts with a valid URL and a domain quality rating; the models below use 153,657 of these observations.
For all posts that contain a link to a news website, we fit a linear regression with robust (HC2) standard errors.
library(estimatr)  # provides lm_robust()
Model_Linkedin_1 <- lm_robust(scale(harm_PC1) ~ scale(domain_quality_rating), data = Linkedin_data_clean)
summary(Model_Linkedin_1)
##
## Call:
## lm_robust(formula = scale(harm_PC1) ~ scale(domain_quality_rating),
## data = Linkedin_data_clean)
##
## Standard error type: HC2
##
## Coefficients:
##                                Estimate Std. Error    t value  Pr(>|t|)
## (Intercept)                   1.036e-14   0.002548  4.066e-12 1.000e+00
## scale(domain_quality_rating) -4.572e-02   0.002672 -1.711e+01 1.429e-65
##                               CI Lower  CI Upper     DF
## (Intercept)                  -0.004995  0.004995 153655
## scale(domain_quality_rating) -0.050955 -0.040481 153655
##
## Multiple R-squared: 0.00209 , Adjusted R-squared: 0.002084
## F-statistic: 292.8 on 1 and 153655 DF, p-value: < 2.2e-16
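Because both variables are standardized with scale(), the slope in a one-predictor regression equals the Pearson correlation, and its square equals the R-squared. The sketch below checks this identity on simulated data (x and y are hypothetical stand-ins, and base lm() is used in place of lm_robust() since the point estimates are the same), then applies it to the LinkedIn estimate reported above.

```r
# Hypothetical simulated data with a weak negative relationship
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- -0.05 * x + rnorm(n)

# Standardized simple regression
fit <- lm(scale(y) ~ scale(x))
slope <- unname(coef(fit)[2])

# Pearson correlation and R-squared for comparison
r <- cor(x, y)
r_squared <- summary(fit)$r.squared

print(c(slope = slope, r = r, r_squared = r_squared, slope_squared = slope^2))

# Same identity applied to the reported LinkedIn slope:
# its square should be close to the reported Multiple R-squared (0.00209).
linkedin_slope <- -4.572e-02
print(linkedin_slope^2)
```

Read this way, the coefficients in this report are correlations between harm_PC1 and domain_quality_rating, which is why a highly significant slope can coexist with a very small R-squared.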
For all posts that contain a link to a news website, we fit a linear regression with standard errors clustered on user.
library(fixest)  # provides feols()
Model_Linkedin_2 <- feols(scale(harm_PC1) ~ scale(domain_quality_rating), cluster = ~authorName, data = Linkedin_data_clean)
print(summary(Model_Linkedin_2))
## OLS estimation, Dep. Var.: scale(harm_PC1)
## Observations: 153,657
## Standard-errors: Clustered (authorName)
##                                   Estimate Std. Error       t value   Pr(>|t|)
## (Intercept)                   1.030000e-14   0.010842  9.530000e-13 1.0000e+00
## scale(domain_quality_rating) -4.571842e-02   0.006906 -6.620338e+00 3.6107e-11
##
## (Intercept)
## scale(domain_quality_rating) ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.998951 Adj. R2: 0.002084
For the Bluesky data, we have 538,361 posts with a valid URL and a domain quality rating.
For all posts that contain a link to a news website, we fit a linear regression with robust (HC2) standard errors.
Model_bluesky_1 <- lm_robust(scale(harm_PC1) ~ scale(domain_quality_rating), data = BlueSky_data_clean)
print(summary(Model_bluesky_1))
##
## Call:
## lm_robust(formula = scale(harm_PC1) ~ scale(domain_quality_rating),
## data = BlueSky_data_clean)
##
## Standard error type: HC2
##
## Coefficients:
##                                Estimate Std. Error    t value  Pr(>|t|)
## (Intercept)                  -6.616e-14   0.001363 -4.855e-11 1.000e+00
## scale(domain_quality_rating)  1.076e-02   0.001401  7.680e+00 1.588e-14
##                               CI Lower  CI Upper     DF
## (Intercept)                  -0.002671  0.002671 538359
## scale(domain_quality_rating)  0.008015  0.013508 538359
##
## Multiple R-squared: 0.0001158 , Adjusted R-squared: 0.000114
## F-statistic: 58.99 on 1 and 538359 DF, p-value: 1.588e-14
For all posts that contain a link to a news website, we fit a linear regression with standard errors clustered on user.
Model_bluesky_2 <- feols(scale(harm_PC1) ~ scale(domain_quality_rating), cluster = ~username, data = BlueSky_data_clean)
print(summary(Model_bluesky_2))
## OLS estimation, Dep. Var.: scale(harm_PC1)
## Observations: 538,361
## Standard-errors: Clustered (username)
##                                   Estimate Std. Error       t value Pr(>|t|)
## (Intercept)                  -6.460000e-14   0.009315 -6.940000e-12  1.00000
## scale(domain_quality_rating)  1.076158e-02   0.008677  1.240276e+00  0.21488
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.999941 Adj. R2: 1.14e-4
For the Gab data, we have 93,064 posts with a valid URL and a domain quality rating.
For all posts that contain a link to a news website, we fit a linear regression with robust (HC2) standard errors.
Model_gab_1 <- lm_robust(scale(harm_PC1) ~ scale(domain_quality_rating), data = Gab_data_clean)
print(summary(Model_gab_1))
##
## Call:
## lm_robust(formula = scale(harm_PC1) ~ scale(domain_quality_rating),
## data = Gab_data_clean)
##
## Standard error type: HC2
##
## Coefficients:
##                                Estimate Std. Error    t value Pr(>|t|)
## (Intercept)                   3.697e-15   0.003278  1.128e-12  1.00000
## scale(domain_quality_rating) -5.817e-03   0.003038 -1.915e+00  0.05554
##                               CI Lower  CI Upper    DF
## (Intercept)                  -0.006425 0.0064248 93062
## scale(domain_quality_rating) -0.011772 0.0001376 93062
##
## Multiple R-squared: 3.384e-05 , Adjusted R-squared: 2.309e-05
## F-statistic: 3.666 on 1 and 93062 DF, p-value: 0.05554
For all posts that contain a link to a news website, we fit a linear regression with standard errors clustered on user.
Model_gab_2 <- feols(scale(harm_PC1) ~ scale(domain_quality_rating), cluster = ~username, data = Gab_data_clean)
summary(Model_gab_2)
## OLS estimation, Dep. Var.: scale(harm_PC1)
## Observations: 93,064
## Standard-errors: Clustered (username)
##                                   Estimate Std. Error       t value Pr(>|t|)
## (Intercept)                   3.590000e-15   0.009312  3.850000e-13  1.00000
## scale(domain_quality_rating) -5.817091e-03   0.009287 -6.263795e-01  0.53108
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.999978 Adj. R2: 2.309e-5
For the Telegram data, we have 3,556,345 posts with a valid URL and a domain quality rating (so far).
For all posts that contain a link to a news website, we fit a linear regression with robust (HC2) standard errors.
Model_telegram_1 <- lm_robust(scale(harm_PC1) ~ scale(domain_quality_rating), data = Telegram_data_clean)
print(summary(Model_telegram_1))
##
## Call:
## lm_robust(formula = scale(harm_PC1) ~ scale(domain_quality_rating),
## data = Telegram_data_clean)
##
## Standard error type: HC2
##
## Coefficients:
##                                Estimate Std. Error    t value Pr(>|t|)
## (Intercept)                  -5.213e-13  0.0005296 -9.844e-10        1
## scale(domain_quality_rating) -5.185e-02  0.0005389 -9.620e+01        0
##                               CI Lower  CI Upper      DF
## (Intercept)                  -0.001038  0.001038 3556343
## scale(domain_quality_rating) -0.052903 -0.050790 3556343
##
## Multiple R-squared: 0.002688 , Adjusted R-squared: 0.002688
## F-statistic: 9255 on 1 and 3556343 DF, p-value: < 2.2e-16
For all posts that contain a link to a news website, we fit a linear regression with standard errors clustered on user.
Model_telegram_2 <- feols(scale(harm_PC1) ~ scale(domain_quality_rating), cluster = ~Username, data = Telegram_data_clean)
summary(Model_telegram_2)
## OLS estimation, Dep. Var.: scale(harm_PC1)
## Observations: 3,556,345
## Standard-errors: Clustered (Username)
##                                   Estimate Std. Error       t value   Pr(>|t|)
## (Intercept)                  -4.930000e-13   0.018779 -2.620000e-11 1.0000e+00
## scale(domain_quality_rating) -5.184646e-02   0.010419 -4.976303e+00 6.5301e-07
##
## (Intercept)
## scale(domain_quality_rating) ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## RMSE: 0.998655 Adj. R2: 0.002688
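To compare the two standard-error choices across platforms, the sketch below collects the slope estimates and standard errors printed above into one data frame and recomputes the clustered t-statistics. The data frame and its column names are illustrative, not part of the original analysis. The values are copied from the rounded output, so the recomputed t-statistics match the printed ones only approximately.

```r
# Slope estimates and standard errors copied from the model output above
results <- data.frame(
  platform   = c("LinkedIn", "Bluesky", "Gab", "Telegram"),
  estimate   = c(-4.571842e-02, 1.076158e-02, -5.817091e-03, -5.184646e-02),
  se_hc2     = c(0.002672, 0.001401, 0.003038, 0.0005389),
  se_cluster = c(0.006906, 0.008677, 0.009287, 0.010419)
)

# How much larger the user-clustered standard error is than the HC2 one
results$se_ratio <- results$se_cluster / results$se_hc2

# t-statistic under clustering: estimate divided by clustered standard error
results$t_cluster <- results$estimate / results$se_cluster

print(results)
```

The point estimates are identical under both specifications; only the standard errors change, which is why the Bluesky slope is significant with HC2 errors but not once errors are clustered on user.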