In this markdown, we’re going to summarize the results of everyone’s literature search for their group project so far. We’ll do that by reading the data directly from each group’s relevant_studies Google spreadsheet.
First, we need the sheet id for each of the Google spreadsheets containing your literature searches. This id is found in the URL of the spreadsheet.
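As a minimal sketch of where the id sits in the URL: it is the path segment that follows `/spreadsheets/d/`. The URL and the name `EXAMPLE_SHEET_ID` below are made up for illustration; substitute the URL of your own spreadsheet.

```r
# Made-up URL for illustration only; it does not point at a real spreadsheet.
example_url <- "https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID_123/edit#gid=0"

# Keep only the path segment that follows "/spreadsheets/d/".
EXAMPLE_SHEET_ID <- sub("^.*/spreadsheets/d/([^/]+).*$", "\\1", example_url)

print(EXAMPLE_SHEET_ID)
```

The read_sheet function from googlesheets4 also accepts a full URL, so extracting the id by hand is optional.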
Next, let’s read the data from your Google spreadsheet directly into R. To do that, we can use the read_sheet function from the googlesheets4 R package.
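library(tidyverse)      # dplyr, tidyr, ggplot2 and the %>% pipe
library(googlesheets4)  # read_sheet()
library(knitr)          # kable()

# SHEET_ID_G1, SHEET_ID_G2 and SHEET_ID_G4 hold the sheet ids described above
COLS_WE_CARE_ABOUT <- c("coder_name", "unique_id", "screening_decision", "exclusion_reason")

g1_relevant_studies <- read_sheet(SHEET_ID_G1, "relevant_studies") %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>% # all_of() selects columns named in a character vector
  mutate(group = "Minimal Group Paradigm Group")

g2_relevant_studies <- read_sheet(SHEET_ID_G2, "relevant_studies") %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>%
  mutate(group = "Linda Group")

# Group 4's studies are spread over several tabs, read here by position.
# Note that tab 5 is read twice, so its rows appear twice in the result.
g4_relevant_studies <- read_sheet(SHEET_ID_G4, 5) %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 5) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 6) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 7) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 8) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  mutate(group = "Syntactic Bootstrapping Group")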
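When a group's studies are spread over several tabs, the repeated reads can be wrapped in a small helper. The sketch below is one possible way to do that, not the method used above: `read_tabs` and `fake_reader` are hypothetical names, and the stand-in reader returns made-up rows so the example runs without contacting Google.

```r
library(dplyr)
library(purrr)

COLS_WE_CARE_ABOUT <- c("coder_name", "unique_id", "screening_decision", "exclusion_reason")

# Hypothetical helper: read several tabs of one spreadsheet and stack them.
# `reader` defaults to googlesheets4::read_sheet; it is a parameter so the
# helper can be tried offline with a stand-in function.
read_tabs <- function(sheet_id, tabs, reader = googlesheets4::read_sheet) {
  tabs %>%
    map(function(tab) reader(sheet_id, tab) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
    bind_rows()
}

# Stand-in reader that returns one made-up row per tab.
fake_reader <- function(sheet_id, tab) {
  tibble(
    coder_name = "example_coder",
    unique_id = paste0("paper_tab", tab),
    screening_decision = "include",
    exclusion_reason = NA_character_,
    notes = "extra column that gets dropped"
  )
}

demo <- read_tabs("made-up-id", 5:8, reader = fake_reader)
print(demo)
```

With the real reader, a call such as `read_tabs(SHEET_ID_G4, 5:8)` would read each tab once and keep only the four columns of interest.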
Combine each group’s relevant studies into a single dataframe. Note that these data are tidy (each row is a single observation).
all_relevant_studies <- bind_rows(g1_relevant_studies,
g2_relevant_studies,
g4_relevant_studies)
Let’s only look at rows that have complete data for group, coder_name, unique_id, and screening_decision. We allow exclusion_reason to be missing, since included papers have no exclusion reason.
clean_data <- all_relevant_studies %>%
select(group, everything()) %>% # this moves `group` to be the first column
drop_na(group:screening_decision) # drop rows that are missing a value in any column from group to screening_decision
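To make the effect of drop_na concrete, here is a small sketch on made-up data (the tibble `toy` and its values are invented for illustration). A row is dropped only if it has a missing value in one of the columns from group to screening_decision; a missing exclusion_reason is allowed.

```r
library(dplyr)
library(tidyr)

# Made-up rows: the second one is missing coder_name, the first one is
# missing exclusion_reason only.
toy <- tibble(
  group = c("A", "A", "B"),
  coder_name = c("x", NA, "y"),
  unique_id = c("p1", "p2", "p3"),
  screening_decision = c("include", "exclude", "exclude"),
  exclusion_reason = c(NA, "reason", "reason")
)

# Only the row with the missing coder_name is removed.
toy_clean <- toy %>%
  drop_na(group:screening_decision)

print(toy_clean)
```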
Let’s see what the data look like. Print the first 10 rows.
clean_data %>%
slice(1:10) %>%
kable()
| group | coder_name | unique_id | screening_decision | exclusion_reason |
|---|---|---|---|---|
| Minimal Group Paradigm Group | jailyn | mlangeni2017 | include | NA |
| Minimal Group Paradigm Group | jailyn | thompson1990 | include | NA |
| Minimal Group Paradigm Group | jailyn | abrams2008 | include | NA |
| Minimal Group Paradigm Group | jailyn | otten2004 | include | NA |
| Minimal Group Paradigm Group | jailyn | zhong2008 | exclude | categorization is not arbitary |
| Minimal Group Paradigm Group | jailyn | wen2016 | include | NA |
| Minimal Group Paradigm Group | jailyn | bigler2001 | exclude | categorization is not anonymous |
| Minimal Group Paradigm Group | jailyn | decremer1999 | exclude | categorization is not arbitary |
| Minimal Group Paradigm Group | jailyn | foels2006 | include | NA |
| Minimal Group Paradigm Group | jailyn | peysakhovich2017 | exclude | no full paper access |
How many papers has our class entered so far?
count(clean_data) %>%
kable()
| n |
|---|
| 1215 |
How many papers has each group entered so far?
clean_data %>%
count(group) %>%
kable()
| group | n |
|---|---|
| Linda Group | 422 |
| Minimal Group Paradigm Group | 191 |
| Syntactic Bootstrapping Group | 602 |
We can get the same counts with group_by() and summarize():
clean_data %>%
group_by(group) %>%
summarize(count = n()) %>%
kable()
| group | count |
|---|---|
| Linda Group | 422 |
| Minimal Group Paradigm Group | 191 |
| Syntactic Bootstrapping Group | 602 |
How many have inclusion decisions?
clean_data %>%
count(screening_decision) %>%
kable()
| screening_decision | n |
|---|---|
| excldue | 2 |
| exclude | 839 |
| excluded | 1 |
| include | 373 |
Ah, it’s hard to tell, because people used different conventions: excldue and excluded both show up alongside exclude. Let’s fix this to use include, exclude, and ?.
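One way to standardize the labels is to recode them with case_when. This is a sketch on made-up data rather than a change to clean_data: `toy_decisions` is invented, and treating every value other than include and the exclude variants as ? is an assumption.

```r
library(dplyr)

# Made-up decisions covering the spellings seen in the table above.
toy_decisions <- tibble(
  screening_decision = c("include", "exclude", "excldue", "excluded", "?")
)

standardized <- toy_decisions %>%
  mutate(screening_decision = case_when(
    screening_decision %in% c("exclude", "excluded", "excldue") ~ "exclude",
    screening_decision == "include" ~ "include",
    TRUE ~ "?"  # assumption: anything else counts as undecided
  ))

print(count(standardized, screening_decision))
```

Applying the same mutate step to clean_data would merge the excldue and excluded rows into exclude, so the counts would differ from the tables shown here.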
How many have inclusion decisions by group?
clean_data %>%
count(screening_decision, group) %>%
kable()
| screening_decision | group | n |
|---|---|---|
| excldue | Syntactic Bootstrapping Group | 2 |
| exclude | Linda Group | 293 |
| exclude | Minimal Group Paradigm Group | 51 |
| exclude | Syntactic Bootstrapping Group | 495 |
| excluded | Linda Group | 1 |
| include | Linda Group | 128 |
| include | Minimal Group Paradigm Group | 140 |
| include | Syntactic Bootstrapping Group | 105 |
Let’s plot this by group.
clean_data %>%
count(screening_decision, group) %>%
ggplot(aes(x = group, fill = screening_decision, y = n)) +
geom_col() + # same as geom_bar(stat = "identity"): bar heights come from n
ylab("Number of papers entered") +
ggtitle("Literature search counts by group") +
theme_classic() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))