In this markdown, we’re going to summarize the results of everyone’s literature search for their group project so far. We’re going to do that by reading the data directly from each group’s relevant_studies Google spreadsheet.
First, we need the sheet ID for each of your Google spreadsheets containing your literature searches. This ID is found in the URL of the spreadsheet.
Next, let’s read the data from your Google spreadsheet directly into R. To do that, we can use the read_sheet function from the googlesheets4 R package.
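As a minimal sketch of where the ID lives, the snippet below pulls it out of a URL with base R. The URL and the ID `EXAMPLE_SHEET_ID_123` are made-up placeholders, not a real spreadsheet, and the sketch assumes the ID is the piece of the URL that comes right after `/spreadsheets/d/`.

```r
# Hypothetical URL; swap in the URL of your own spreadsheet.
example_url <- "https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID_123/edit#gid=0"

# Keep only the part that follows "/spreadsheets/d/", up to the next slash.
example_sheet_id <- sub(".*/spreadsheets/d/([A-Za-z0-9_-]+).*", "\\1", example_url)
example_sheet_id
```

Copying the ID by hand from the browser's address bar works just as well; the regular expression only makes the location explicit.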
library(tidyverse)     # dplyr, tidyr, ggplot2, and the %>% pipe
library(googlesheets4) # read_sheet()
library(knitr)         # kable()

COLS_WE_CARE_ABOUT <- c("coder_name", "unique_id", "screening_decision", "exclusion_reason")
g1_relevant_studies <- read_sheet(SHEET_ID_G1, "relevant_studies") %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>%
  mutate(group = "Minimal Group Paradigm Group")
g2_relevant_studies <- read_sheet(SHEET_ID_G2, "relevant_studies") %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>%
  mutate(group = "Linda Group")
# Group 4 spread its search across tabs 5 to 8, so read each tab and stack them
g4_relevant_studies <- read_sheet(SHEET_ID_G4, 5) %>%
  select(all_of(COLS_WE_CARE_ABOUT)) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 6) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 7) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  bind_rows(read_sheet(SHEET_ID_G4, 8) %>% select(all_of(COLS_WE_CARE_ABOUT))) %>%
  mutate(group = "Syntactic Bootstrapping Group")
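The four reads for the Syntactic Bootstrapping Group repeat the same steps, so they could also be written as a loop over the tab numbers. Here is one hedged way to sketch that with purrr. To keep it runnable without Google access, `fake_read_tab` is a hypothetical stand-in for `read_sheet(SHEET_ID_G4, tab)` that returns made-up rows; `cols` and `g4_sketch` are also names invented for this illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for read_sheet(SHEET_ID_G4, tab); returns made-up data.
fake_read_tab <- function(tab) {
  tibble(
    coder_name = paste0("coder", tab),
    unique_id = paste0("paper", tab),
    screening_decision = "include",
    exclusion_reason = NA_character_,
    extra_column = tab  # a column we do not care about
  )
}

cols <- c("coder_name", "unique_id", "screening_decision", "exclusion_reason")

# Read each tab, keep the shared columns, then stack the pieces into one dataframe.
g4_sketch <- map(5:8, function(tab) fake_read_tab(tab) %>% select(all_of(cols))) %>%
  bind_rows() %>%
  mutate(group = "Syntactic Bootstrapping Group")
```

With the real data, the call inside `map()` would be the `read_sheet` call instead of the stand-in.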
Combine each group’s relevant studies into a single dataframe. Note that this data is tidy (each row is a single observation).
all_relevant_studies <- bind_rows(g1_relevant_studies,
                                  g2_relevant_studies,
                                  g4_relevant_studies)
Let’s only look at rows that have complete data for group, coder_name, unique_id, and screening_decision. The exclusion_reason column is allowed to be missing (for example, for included papers).
clean_data <- all_relevant_studies %>%
  select(group, everything()) %>% # this moves `group` to be the first column
  drop_na(group:screening_decision) # drop rows that don't have complete data for all columns from group to screening_decision
Let’s see what the data look like. Print the first 10 rows.
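To see what the column range in `drop_na` does, here is a small sketch on made-up rows (the `toy` dataframe is invented for illustration and is not class data). Only columns from group through screening_decision are checked, so a missing exclusion_reason does not cause a row to be dropped.

```r
library(dplyr)
library(tidyr)

# Made-up rows, only to illustrate how drop_na() treats a column range.
toy <- tibble(
  group = c("A", "A", "B"),
  coder_name = c("x", "y", "z"),
  unique_id = c("p1", "p2", NA),
  screening_decision = c("include", NA, "exclude"),
  exclusion_reason = c(NA, NA, "review paper")
)

# Row 1 is kept even though exclusion_reason is NA;
# rows 2 and 3 are dropped because a checked column is NA.
toy_clean <- toy %>%
  drop_na(group:screening_decision)
```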
clean_data %>%
  slice(1:10) %>%
  kable()
| group | coder_name | unique_id | screening_decision | exclusion_reason |
|---|---|---|---|---|
| Linda Group | zoe | morier1984 | exclude | not empirical (review paper) |
| Linda Group | zoe | charness2009 | include | Is this the same paper as the one above? |
| Linda Group | zoe | sides2002 | include | NA |
| Linda Group | zoe | hertwig1999 | include | NA |
| Linda Group | zoe | fiedler1988 | include | NA |
| Linda Group | zoe | bonini2004 | include | NA |
| Linda Group | zoe | wolford1990 | include | NA |
| Linda Group | zoe | agnoli1989 | include | NA |
| Linda Group | zoe | moro2008 | exclude | not empirical (review paper) |
| Linda Group | zoe | dulany1991 | include | NA |
How many papers has our class entered so far?
count(clean_data) %>%
  kable()
| n |
|---|
| 94 |
How many papers has each group entered so far?
clean_data %>%
  count(group) %>%
  kable()
| group | n |
|---|---|
| Linda Group | 32 |
| Syntactic Bootstrapping Group | 62 |
How many have inclusion decisions?
clean_data %>%
  count(screening_decision) %>%
  kable()
| screening_decision | n |
|---|---|
| ? | 2 |
| ?entire book | 1 |
| exclude | 34 |
| excluded | 24 |
| include | 19 |
| included | 13 |
| not sure- think exclude | 1 |
Ah, it’s hard to tell, because people used different conventions. Let’s fix this to use only include, exclude, and ?.
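One way to do that recoding is sketched below, using the labels and counts from the table above. The rule is an assumption rather than a class decision: labels that start with "include" or "exclude" are mapped to those two values, and everything else (including "not sure- think exclude") is treated as ?. The names `raw_counts` and `recoded_counts` are invented for this sketch.

```r
library(dplyr)
library(stringr)

# The raw labels and counts reported in the table above.
raw_counts <- tibble(
  screening_decision = c("?", "?entire book", "exclude", "excluded",
                         "include", "included", "not sure- think exclude"),
  n = c(2, 1, 34, 24, 19, 13, 1)
)

recoded_counts <- raw_counts %>%
  mutate(screening_decision = case_when(
    str_detect(screening_decision, "^include") ~ "include",
    str_detect(screening_decision, "^exclude") ~ "exclude",
    TRUE ~ "?"  # assumption: any other label counts as undecided
  )) %>%
  group_by(screening_decision) %>%
  summarise(n = sum(n))
```

Under this rule the 94 papers split into 58 exclude, 32 include, and 4 ?. The same `mutate` step could be applied to `clean_data` before counting.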
How many have inclusion decisions by group?
clean_data %>%
  count(screening_decision, group) %>%
  kable()
| screening_decision | group | n |
|---|---|---|
| ? | Linda Group | 2 |
| ?entire book | Syntactic Bootstrapping Group | 1 |
| exclude | Linda Group | 19 |
| exclude | Syntactic Bootstrapping Group | 15 |
| excluded | Syntactic Bootstrapping Group | 24 |
| include | Linda Group | 11 |
| include | Syntactic Bootstrapping Group | 8 |
| included | Syntactic Bootstrapping Group | 13 |
| not sure- think exclude | Syntactic Bootstrapping Group | 1 |
Let’s plot this by group.
clean_data %>%
  count(screening_decision, group) %>%
  ggplot(aes(x = group, fill = screening_decision, y = n)) +
  geom_col() + # same as geom_bar(stat = "identity")
  ylab("Number of papers entered") +
  ggtitle("Literature search counts by group") +
  theme_classic()