Objective

Calculate the intercoder reliability index (K) for the following variables:

  1. Organization

  2. User category

  3. Ideology

  4. Cod country

  5. Country

  6. Official, whether the account is an organization's official account
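
For reference, Cohen's kappa corrects the raw percentage of agreement for agreement expected by chance:

\kappa = \frac{p_o - p_e}{1 - p_e}

where p_o is the observed proportion of agreement between the two coders and p_e is the proportion of agreement expected by chance. For the ordinal ideology variable, a weighted kappa with equal weights is used, matching the kappa2(..., "equal") call below.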

Procedure

After data collection and cleaning, a total of 9,690 users were included in the final dataset, based on the tweets they produced during the period under study.

Coding started with two coders working simultaneously, after drawing a random sample of users to be coded by both (a sketch of this step follows the list below). The coders followed a codebook developed in advance and held regular meetings to compare results. They worked on this shared sample until reaching an acceptable kappa; from that point on, both coders continued coding the rest of the dataset. The final intercoder reliability indexes for the two samples are presented below.

  1. Sample coded by both coders: 1,142 users (11.78% of all records)
  2. Sample of verified users: 317 users (11.67% of all verified users)
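
A minimal sketch of how such a double-coding sample can be drawn in R (the object name users, the seed, and the ~12% fraction are illustrative assumptions, not the study's actual script):

#hypothetical sketch: draw a random subset of users for double coding
set.seed(2023)                                    # assumed seed, for reproducibility
n_double <- round(0.12 * nrow(users))             # 'users' stands in for the full user dataset; ~12% as reported above
double_sample <- users[sample(nrow(users), n_double), ]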

Sample coded by both coders

#load required packages
library(readxl)      # read_excel()
library(irr)         # kappa2(), agree()
library(dplyr)       # the %>% pipe
library(kableExtra)  # kbl(), kable_styling()

#load file
both <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "coded by both")


#calculating Cohen's K indexes using irr
organization <- kappa2(both[,c(13,20)], "unweighted")
category <- kappa2(both[,c(14,21)], "unweighted")
ideology <- kappa2(both[,c(15,22)], "equal")
cod_country <- kappa2(both[,c(16,23)], "unweighted")
country <- kappa2(both[,c(17,24)], "unweighted")
official <- kappa2(both[,c(18,25)], "unweighted")
verified <- kappa2(both[,c(19,26)], "unweighted")


#calculating percentage agreement
agree_org <- agree(both[,c(13,20)])
agree_cat <- agree(both[,c(14,21)])
agree_ideo <- agree(both[,c(15,22)])
agree_cod_country <- agree(both[,c(16,23)])
agree_country <- agree(both[,c(17,24)])
agree_official <- agree(both[,c(18,25)])
agree_verified <- agree(both[,c(19,26)])

#organizing the results into a table
kappa <- c(organization$value, category$value, ideology$value, cod_country$value, country$value, official$value, verified$value)
agreement <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)
labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")

results <- data.frame(labels, agreement, kappa)

results %>% kbl(caption = "Results of inter-coder agreement for a sample of the dataset", col.names = c("item", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
Results of inter-coder agreement for a sample of the dataset

item           agreement (%)   Cohen's kappa
organization       98.51138       0.9554850
category           99.03678       0.9707317
ideology           90.71804       0.8399955
cod_country        97.89842       0.8551344
country            97.37303       0.8220528
official           99.56217       0.9603956
verified           99.82487       0.9956333

Average to be reported

In sum, the value to report for Cohen's kappa is the following:

sprintf("Average of Cohen's Kappa for the sample of the whole dataset: %f", mean(kappa))
## [1] "Average of Cohen's Kappa for the sample of the whole dataset: 0.914204"

Sample of verified users

Since we have identified the verified users, we created a subset of the users that are verified and were coded by both coders. Note that Cohen's kappa for the verified variable is NaN: because the subset contains only verified users, both coders assigned a single constant value, so the agreement expected by chance equals 1 and kappa is undefined (division by zero).
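
A minimal illustration of why kappa is undefined here, using hypothetical constant ratings rather than the study data:

#hypothetical example: both coders assign the same single value to every case
constant <- data.frame(coder1 = rep(1, 10), coder2 = rep(1, 10))
kappa2(constant, "unweighted")$value
## [1] NaN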

#load dataset
verified <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "ver_forK")

                                
#calculating Cohen's K indexes using irr
org_ver <- kappa2(verified[,c(13,20)], "unweighted")
cat_ver <- kappa2(verified[,c(14,21)], "unweighted")
ide_ver <- kappa2(verified[,c(15,22)], "equal")
cod_ver <- kappa2(verified[,c(16,23)], "unweighted")
cou_ver <- kappa2(verified[,c(17,24)], "unweighted")
off_ver <- kappa2(verified[,c(18,25)], "unweighted")
ver_ver <- kappa2(verified[,c(19,26)], "unweighted")

#calculating percentage agreement
agree_org <- agree(verified[,c(13,20)])
agree_cat <- agree(verified[,c(14,21)])
agree_ideo <- agree(verified[,c(15,22)])
agree_cod_country <- agree(verified[,c(16,23)])
agree_country <- agree(verified[,c(17,24)])
agree_official <- agree(verified[,c(18,25)])
agree_verified <- agree(verified[,c(19,26)])

#organizing the results into a table (kappa_ver avoids overwriting irr::kappa2)
kappa_ver <- c(org_ver$value, cat_ver$value, ide_ver$value, cod_ver$value, cou_ver$value, off_ver$value, ver_ver$value)

agreement2 <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)

labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")

results2 <- data.frame(labels, agreement2, kappa_ver)

results2 %>% kbl(caption = "Results of the inter-coder analysis for a sample of verified users", col.names = c("item", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
Results of the inter-coder analysis for a sample of verified users

item           agreement (%)   Cohen's kappa
organization       99.05363       0.9888325
category           99.68454       0.9959671
ideology           99.68454       0.9968144
cod_country       100.00000       1.0000000
country            99.68454       0.9824445
official           99.68454       0.9889211
verified          100.00000             NaN
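
If an average is also needed for the verified sample, it can be computed from the kappa_ver vector defined above while excluding the undefined verified value; a minimal sketch:

sprintf("Average of Cohen's Kappa for the sample of verified users: %f", mean(kappa_ver, na.rm = TRUE))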