We calculate the intercoder reliability index (Cohen's kappa, K) for the following variables:

- Organization
- User category
- Ideology
- Cod_country (country code)
- Country
- Official (whether the account is an organization's official account)
- Verified (whether the account is verified)
After collecting and cleaning the data, a total of 9,690 users were included in the final dataset based on the tweets produced during the period under study.
Coding began with two coders working simultaneously on a random sample of users drawn to be coded by both. The coders followed a codebook developed in advance and held regular meetings to compare results. They worked together on the sample until reaching an acceptable kappa; after this point, both coders continued coding the dataset. The final intercoder reliability indexes are presented below for the two samples.
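Cohen's kappa corrects raw percentage agreement for the agreement expected by chance: K = (Po - Pe) / (1 - Pe), where Po is the observed agreement and Pe is the agreement expected from the coders' marginal distributions. A minimal sketch of the computation with the irr package, using made-up ratings (the coder names and values are illustrative only):

library(irr)
#two hypothetical coders classify six accounts as "org" or "ind"
coder1 <- c("org", "org", "ind", "ind", "org", "ind")
coder2 <- c("org", "org", "ind", "org", "org", "ind")
toy <- data.frame(coder1, coder2)
agree(toy)$value                #raw agreement: 5/6 = 83.3%
kappa2(toy, "unweighted")$value #chance-corrected: (5/6 - 1/2) / (1 - 1/2) = 0.667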
Sample coded by both coders
#load required packages
library(readxl)     #read_excel()
library(irr)        #kappa2(), agree()
library(dplyr)      #%>%
library(kableExtra) #kbl(), kable_styling()

#load file
both <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "coded by both")

#calculating Cohen's kappa indexes using irr; each pair of columns holds the
#two coders' ratings for one variable
organization <- kappa2(both[, c(13, 20)], "unweighted")
category <- kappa2(both[, c(14, 21)], "unweighted")
#ideology uses linearly weighted ("equal") kappa, appropriate for ordered categories
ideology <- kappa2(both[, c(15, 22)], "equal")
cod_country <- kappa2(both[, c(16, 23)], "unweighted")
country <- kappa2(both[, c(17, 24)], "unweighted")
official <- kappa2(both[, c(18, 25)], "unweighted")
verified <- kappa2(both[, c(19, 26)], "unweighted")
#calculating percentage agreement
agree_org <- agree(both[, c(13, 20)])
agree_cat <- agree(both[, c(14, 21)])
agree_ideo <- agree(both[, c(15, 22)])
agree_cod_country <- agree(both[, c(16, 23)])
agree_country <- agree(both[, c(17, 24)])
agree_official <- agree(both[, c(18, 25)])
agree_verified <- agree(both[, c(19, 26)])
#organizing the results into a table
kappa <- c(organization$value, category$value, ideology$value, cod_country$value, country$value, official$value, verified$value)
agreement <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)
labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")
results <- data.frame(labels, agreement, kappa)
results %>% kbl(caption = "Results of inter-coder agreement for a sample of the dataset", col.names = c("variable", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
| variable | agreement (%) | Cohen's kappa |
|---|---|---|
| organization | 98.51138 | 0.9554850 |
| category | 99.03678 | 0.9707317 |
| ideology | 90.71804 | 0.8399955 |
| cod_country | 97.89842 | 0.8551344 |
| country | 97.37303 | 0.8220528 |
| official | 99.56217 | 0.9603956 |
| verified | 99.82487 | 0.9956333 |
In sum, the average Cohen's kappa to report for this sample is:
sprintf("Average of Cohen's Kappa for the sample of the whole dataset: %f", mean(kappa))
## [1] "Average of Cohen's Kappa for the sample of the whole dataset: 0.914204"
Since verified users were identified, we created a subset of the accounts that are verified and were coded by both coders. Note that kappa for the verified variable is NaN: because every account in this subset is verified, both coders' ratings are constant, expected agreement is already 1, and the kappa denominator (1 - Pe) is zero.
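This can be reproduced directly with made-up constant ratings; a minimal sketch (the object names are illustrative only):

#both hypothetical coders mark all ten accounts as verified (1)
constant <- data.frame(coder1 = rep(1, 10), coder2 = rep(1, 10))
agree(constant)$value                #100 percent raw agreement
kappa2(constant, "unweighted")$value #NaN: (1 - 1) / (1 - 1) is undefined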
#load dataset of verified users coded by both coders
#(read into verified_df so it does not overwrite the kappa object verified above)
verified_df <- read_excel("Users_2009-2022_final - May242023.xlsx", sheet = "ver_forK")
#calculating Cohen's kappa indexes using irr
org_ver <- kappa2(verified_df[, c(13, 20)], "unweighted")
cat_ver <- kappa2(verified_df[, c(14, 21)], "unweighted")
ide_ver <- kappa2(verified_df[, c(15, 22)], "equal")
cod_ver <- kappa2(verified_df[, c(16, 23)], "unweighted")
cou_ver <- kappa2(verified_df[, c(17, 24)], "unweighted")
off_ver <- kappa2(verified_df[, c(18, 25)], "unweighted")
ver_ver <- kappa2(verified_df[, c(19, 26)], "unweighted")
#calculating percentage agreement
agree_org <- agree(verified_df[, c(13, 20)])
agree_cat <- agree(verified_df[, c(14, 21)])
agree_ideo <- agree(verified_df[, c(15, 22)])
agree_cod_country <- agree(verified_df[, c(16, 23)])
agree_country <- agree(verified_df[, c(17, 24)])
agree_official <- agree(verified_df[, c(18, 25)])
agree_verified <- agree(verified_df[, c(19, 26)])
#organizing the results into a table
#(the vector is named kappa_ver rather than kappa2 so the irr function kappa2() is not masked)
kappa_ver <- c(org_ver$value, cat_ver$value, ide_ver$value, cod_ver$value, cou_ver$value, off_ver$value, ver_ver$value)
agreement2 <- c(agree_org$value, agree_cat$value, agree_ideo$value, agree_cod_country$value, agree_country$value, agree_official$value, agree_verified$value)
labels <- c("organization", "category", "ideology", "cod_country", "country", "official", "verified")
results2 <- data.frame(labels, agreement2, kappa_ver)
results2 %>% kbl(caption = "Results of the inter-coder analysis for the sample of verified users", col.names = c("variable", "agreement (%)", "Cohen's kappa")) %>% kable_styling(full_width = FALSE, position = "center")
| variable | agreement (%) | Cohen's kappa |
|---|---|---|
| organization | 99.05363 | 0.9888325 |
| category | 99.68454 | 0.9959671 |
| ideology | 99.68454 | 0.9968144 |
| cod_country | 100.00000 | 1.0000000 |
| country | 99.68454 | 0.9824445 |
| official | 99.68454 | 0.9889211 |
| verified | 100.00000 | NaN |
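For symmetry with the first sample, an average kappa for the verified subset can be reported by dropping the undefined value; a minimal sketch using the kappa_ver vector built above:

sprintf("Average of Cohen's Kappa for the verified sample (excluding NaN): %f", mean(kappa_ver, na.rm = TRUE))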