Objective

Calculate the intercoder reliability index (K) for the following variables:

  1. Organization

  2. User category

  3. Ideology

  4. Cod country

  5. Country

  6. Official, whether the account is an organization's official account
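
For reference, Cohen's kappa corrects the raw percentage of agreement for agreement expected by chance:

\kappa = \frac{p_o - p_e}{1 - p_e}

where p_o is the observed proportion of agreement between the two coders and p_e is the proportion of agreement expected by chance. For the ordinal ideology variable, a weighted kappa with equal weights is used, matching the kappa2(..., "equal") call below.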

Procedure

After data collection and cleaning, a total of 9,690 users were included in the final dataset, based on the tweets they produced during the period under study.

Coding started with two coders working simultaneously, after drawing a random sample of users to be coded by both (a sketch of this step follows the list below). The coders followed a codebook developed in advance and held regular meetings to compare results. They worked on this shared sample until reaching an acceptable kappa; from that point on, both coders continued coding the rest of the dataset. The final intercoder reliability indexes for the two samples are presented below.

  1. Sample coded by both coders: 1,142 users (11.78% of all records)
  2. Sample of verified users: 317 users (11.67% of all verified users)
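
A minimal sketch of how such a double-coding sample can be drawn in R (the object name users, the seed, and the ~12% fraction are illustrative assumptions, not the study's actual script):

#hypothetical sketch: draw a random subset of users for double coding
set.seed(2023)                                    # assumed seed, for reproducibility
n_double <- round(0.12 * nrow(users))             # 'users' stands in for the full user dataset; ~12% as reported above
double_sample <- users[sample(nrow(users), n_double), ]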

Sample coded by both coders

#load required packages
library(readxl)      # read_excel()
library(irr)         # kappa2(), agree()
library(dplyr)       # the %>% pipe
library(kableExtra)  # kbl(), kable_styling()

#load file
both <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "coded by both")


#calculating Cohen's K indexes using irr
organization <- kappa2(both[,c(13,20)], "unweighted")
category <- kappa2(both[,c(14,21)], "unweighted")
ideology <- kappa2(both[,c(15,22)], "equal")
cod_country <- kappa2(both[,c(16,23)], "unweighted")
country <- kappa2(both[,c(17,24)], "unweighted")
official <- kappa2(both[,c(18,25)], "unweighted")
verified <- kappa2(both[,c(19,26)], "unweighted")


#calculating percentage agreement
agree_org <- agree(both[,c(13,20)])
agree_cat <- agree(both[,c(14,21)])
agree_ideo <- agree(both[,c(15,22)])
agree_cod_country <- agree(both[,c(16,23)])
agree_country <- agree(both[,c(17,24)])
agree_official <- agree(both[,c(18,25)])
agree_verified <- agree(both[,c(19,26)])

#organizing the results into a table
kappa <- c(organization$value, category$value, ideology$value, cod_country$value, country$value, official$value, verified$value)
agreement <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)
labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")

results <- data.frame(labels, agreement, kappa)

results %>% kbl(caption = "Results of inter-coder agreement for a sample of the dataset", col.names = c("item", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
Results of inter-coder agreement for a sample of the dataset

item           agreement (%)   Cohen's kappa
organization       98.51138       0.9554850
category           99.03678       0.9707317
ideology           90.71804       0.8399955
cod_country        97.89842       0.8551344
country            97.37303       0.8220528
official           99.56217       0.9603956
verified           99.82487       0.9956333

Average to be reported

In sum, the value to report for Cohen's kappa is the following:

sprintf("Average of Cohen's Kappa for the sample of the whole dataset: %f", mean(kappa))
## [1] "Average of Cohen's Kappa for the sample of the whole dataset: 0.914204"

Sample of verified users

Since we have identified the verified users, we created a subset of the users that are verified and were coded by both coders. Note that Cohen's kappa for the verified variable is NaN: because the subset contains only verified users, both coders assigned a single constant value, so the agreement expected by chance equals 1 and kappa is undefined (division by zero).
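
A minimal illustration of why kappa is undefined here, using hypothetical constant ratings rather than the study data:

#hypothetical example: both coders assign the same single value to every case
constant <- data.frame(coder1 = rep(1, 10), coder2 = rep(1, 10))
kappa2(constant, "unweighted")$value
## [1] NaN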

#load dataset
verified <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "ver_forK")

                                
#calculating Cohen's K indexes using irr
org_ver <- kappa2(verified[,c(13,20)], "unweighted")
cat_ver <- kappa2(verified[,c(14,21)], "unweighted")
ide_ver <- kappa2(verified[,c(15,22)], "equal")
cod_ver <- kappa2(verified[,c(16,23)], "unweighted")
cou_ver <- kappa2(verified[,c(17,24)], "unweighted")
off_ver <- kappa2(verified[,c(18,25)], "unweighted")
ver_ver <- kappa2(verified[,c(19,26)], "unweighted")

#calculating percentage agreement
agree_org <- agree(verified[,c(13,20)])
agree_cat <- agree(verified[,c(14,21)])
agree_ideo <- agree(verified[,c(15,22)])
agree_cod_country <- agree(verified[,c(16,23)])
agree_country <- agree(verified[,c(17,24)])
agree_official <- agree(verified[,c(18,25)])
agree_verified <- agree(verified[,c(19,26)])

#organizing the results into a table (kappa_ver avoids overwriting irr::kappa2)
kappa_ver <- c(org_ver$value, cat_ver$value, ide_ver$value, cod_ver$value, cou_ver$value, off_ver$value, ver_ver$value)

agreement2 <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)

labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")

results2 <- data.frame(labels, agreement2, kappa_ver)

results2 %>% kbl(caption = "Results of the inter-coder analysis for a sample of verified users", col.names = c("item", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
Results of the inter-coder analysis for a sample of verified users

item           agreement (%)   Cohen's kappa
organization       99.05363       0.9888325
category           99.68454       0.9959671
ideology           99.68454       0.9968144
cod_country       100.00000       1.0000000
country            99.68454       0.9824445
official           99.68454       0.9889211
verified          100.00000             NaN
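
If an average is also needed for the verified sample, it can be computed from the kappa_ver vector defined above while excluding the undefined verified value; a minimal sketch:

sprintf("Average of Cohen's Kappa for the sample of verified users: %f", mean(kappa_ver, na.rm = TRUE))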