1. Reproducing the statistics was a lot easier for me than reproducing the plots. I still need to work through some exercises that will make the plots easier to complete.
2. I was a bit scared to work on my own GitHub issue, since other members have had their files deleted accidentally. I did NOT want to add any more problems to the mix, but I'll make sure to try and work on it next week since we have to start a collaborative GitHub document!
Loading in Libraries
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.6
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggeasy)
library(ggbeeswarm)
library(qualtRics)
library(patchwork)
Reading the data
When reading in the data, putting {r echo=TRUE, message=FALSE, warning=FALSE} in the chunk header suppresses the long warnings and the column specification message that the readr package would otherwise print.
Haigh_Data_2 <- read_csv(file = "Study 7 data.csv")
Removing lines 2 and 3
In the previous learning log, I used the slice function found by Jia, but Torunn also found a cleaner way: the read_survey function from the qualtRics package reads the file and removes the unnecessary lines.
Haigh_Data_2 = read_survey("Study 7 data.csv")
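For comparison, here is a minimal sketch of doing the same thing by hand with readr, assuming the export has the usual three Qualtrics header lines. The file contents and the names tmp, col_names and demo are made up for illustration, and this is not what read_survey does internally.

```r
library(readr)

# Hypothetical export: three header lines followed by two data rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Finished,Age",
  "Finished,What is your age?",
  "ImportId_finished,ImportId_QID1",
  "1,25",
  "1,40"
), tmp)

# Read only the column names from the first line...
col_names <- names(read_csv(tmp, n_max = 0, col_types = cols()))

# ...then skip all three header lines and reuse those names for the data.
demo <- read_csv(tmp, skip = 3, col_names = col_names, col_types = cols())
demo
```

Because the extra header lines never reach the parser, the columns are guessed as numbers rather than text, which is the main advantage over slicing the rows off afterwards.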
Making it into a data frame
Converting the data to a plain data frame lets the dplyr functions (from the tidyverse package) used below work as intended.
Haigh_Data_2 <- as.data.frame(Haigh_Data_2)
Renaming Odd Variables
This makes the data clearer to the reader, as some variable names are just incomprehensible. They are changed using the rename() function.
Haigh_Data_2 <- Haigh_Data_2 %>%
rename(recall_score = SC0, #new name = old name
condition = FL_12_DO,
confidence = GSS)
Planned Exclusions
The planned exclusions for experiment 2 were the same as the ones in experiment 1. Participant data was filtered to remove:
1. participants who failed to complete the task
2. participants who admitted to not responding seriously
3. participants who failed the attention check by recalling fewer than 4 headlines (hence recall_score >= 4)
Haigh_Data_2 <- Haigh_Data_2 %>%
filter(Finished == 1,
Serious_check == 1,
recall_score >= 4)
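Before filtering, it can help to see how many rows each rule would drop. This is a small sketch on made-up data: the toy data frame and the summary column names are hypothetical, and only the three filter conditions come from the chunk above.

```r
library(dplyr)

# Hypothetical toy data with the same three columns used in the filter.
toy <- data.frame(
  Finished      = c(1, 1, 0, 1),
  Serious_check = c(1, 0, 1, 1),
  recall_score  = c(5, 6, 4, 3)
)

# Count failures per rule, plus the rows that pass all three rules.
exclusion_summary <- toy %>%
  summarise(
    n_total       = n(),
    n_unfinished  = sum(Finished != 1),
    n_not_serious = sum(Serious_check != 1),
    n_low_recall  = sum(recall_score < 4),
    n_kept        = sum(Finished == 1 & Serious_check == 1 & recall_score >= 4)
  )
exclusion_summary
```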
Total number of participants
The count() function checks the current number of participants. It returns n = 400, which matches the number reported in the study.
count(Haigh_Data_2) #400
## n
## 1 400
Participants split by gender
To get the numbers I created separate data tables and filtered by gender. I'm wondering if there's a more efficient way to do so, and will look into this for next week.
Male_No.2 <- Haigh_Data_2 %>%
filter(Gender == 1) #keeps only participants coded as gender '1' (male): 150
count(Male_No.2)
## n
## 1 150
Female_No.2 <- Haigh_Data_2 %>%
filter(Gender == 2) #keeps only participants coded as gender '2' (female)
count(Female_No.2) #248
## n
## 1 248
Neither_No.2 <- Haigh_Data_2 %>%
filter(Gender == 3) #keeps only participants coded as gender '3'
count(Neither_No.2) #2
## n
## 1 2
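One possibly more efficient way is to pass the grouping column straight to count(), which gives one row per gender code in a single call. A minimal sketch on made-up data (the toy data frame is hypothetical, the 1/2/3 coding follows the chunks above):

```r
library(dplyr)

# Hypothetical toy data using the same gender coding (1, 2, 3).
toy <- data.frame(Gender = c(1, 2, 2, 3, 1, 2))

# One row per gender code, with the number of participants in column n.
gender_counts <- toy %>%
  count(Gender)
gender_counts
```

On the real data this would be Haigh_Data_2 %>% count(Gender), with no need for three separate tables.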
Age demographics
The age demographics of the participants are summarised by:
1. Mean
2. Standard deviation
3. The range (from the youngest to the oldest)
This is done via the summarise() function.
Haigh_Data_2 %>%
summarise (mean_age = mean(Age), #33.465
SD = sd(Age), #12.03415
min_age = min(Age),
max_age = max(Age)
)
## mean_age SD min_age max_age
## 1 33.465 12.03415 18 73
Separating the Independent Variables (IV)
Also similar to experiment 1, the condition variable needs to be separated into its component variables: Format and Conflict. By separating them, the block and number portions become obsolete and allow for the other two to become their own separate variables.
Haigh_Data_2 <- Haigh_Data_2 %>%
separate(col = condition,
into = c(
"block",
"number",
"Format",
"Conflict"
))
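To see what separate() is doing, here is a tiny sketch on made-up condition strings. The example values are hypothetical (the real strings in the Qualtrics export may look different). By default separate() splits wherever it finds a non-alphanumeric character, such as an underscore.

```r
library(tidyr)

# Hypothetical condition strings; the actual export values may differ.
toy <- data.frame(
  condition = c("Block_1_Generic_Conflict", "Block_2_Qualified_Consistent")
)

# Split into four pieces at each underscore (the default separator pattern).
toy_split <- separate(toy,
                      col = condition,
                      into = c("block", "number", "Format", "Conflict"))
toy_split
```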
Changing IVs into factors and renaming them
The as.factor() function is used to change the variables Format and Conflict into factors.
The two levels within Conflict were then renamed using recode_factor() so the labels are closer to the original paper's descriptive plots.
#Changing into factors
Haigh_Data_2$Format <- as.factor(Haigh_Data_2$Format)
Haigh_Data_2$Conflict <- as.factor(Haigh_Data_2$Conflict)
#Renaming conditions
Haigh_Data_2$Conflict <- recode_factor(Haigh_Data_2$Conflict,
Conflict = "Conf.",
Consistent = "Non-Conf.")
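A quick sketch of how recode_factor() behaves, using made-up values. The arguments are written as old name = new name, and the resulting levels follow the order of the replacements, which matters later because scale_fill_manual() assigns colours in level order.

```r
library(dplyr)

# Hypothetical toy column with the two condition labels.
toy <- data.frame(Conflict = c("Conflict", "Consistent", "Conflict"))

# old = new; levels are ordered as the replacements are listed.
toy$Conflict <- recode_factor(toy$Conflict,
                              Conflict = "Conf.",
                              Consistent = "Non-Conf.")
levels(toy$Conflict)
```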
Creating new dependent variables (DV)
This creates 5 new variables: Nutritional Confusion, Nutritional Backlash, Mistrust of Expertise, Certainty of Knowledge and Development of Knowledge. The sixth dependent variable, confidence, is already a single column (renamed from GSS above). To obtain the mean score for each new dependent variable, we add up the scores of its items and then divide by the number of items.
#Nutritional Confusion Mean
Haigh_Data_2$confusion <- ((Haigh_Data_2$NC_1 + Haigh_Data_2$NC_2 + Haigh_Data_2$NC_3 + Haigh_Data_2$NC_4 + Haigh_Data_2$NC_5 + Haigh_Data_2$NC_6)/6)
#Nutritional Backlash Mean
Haigh_Data_2$backlash <- ((Haigh_Data_2$NBS_1 + Haigh_Data_2$NBS_2 + Haigh_Data_2$NBS_3 + Haigh_Data_2$NBS_4 + Haigh_Data_2$NBS_5 + Haigh_Data_2$NBS_6)/6)
#Mistrust of Expertise Mean
Haigh_Data_2$mistrust <- ((Haigh_Data_2$Mistrust_expertise_1 + Haigh_Data_2$Mistrust_expertise_2 + Haigh_Data_2$Mistrust_expertise_3)/3)
#Certainty of Knowledge Mean
Haigh_Data_2$certainty <- ((Haigh_Data_2$Certainty_sci_know_1 + Haigh_Data_2$Certainty_sci_know_2 + Haigh_Data_2$Certainty_sci_know_3 + Haigh_Data_2$Certainty_sci_know_4 + Haigh_Data_2$Certainty_sci_know_5 + Haigh_Data_2$Certainty_sci_know_6)/6)
#Development of Knowledge Mean
Haigh_Data_2$development <- ((Haigh_Data_2$Development_sci_know_1 + Haigh_Data_2$Development_sci_know_2 + Haigh_Data_2$Development_sci_know_3 + Haigh_Data_2$Development_sci_know_4 + Haigh_Data_2$Development_sci_know_5 + Haigh_Data_2$Development_sci_know_6)/6)
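The same item means could probably be written more compactly with rowMeans(). This is a sketch on made-up data rather than the approach used above: the toy scores are hypothetical, and it assumes all items of a scale share a prefix such as NC_.

```r
library(dplyr)

# Hypothetical toy scores for the six Nutritional Confusion items.
toy <- data.frame(
  NC_1 = c(1, 5), NC_2 = c(2, 5), NC_3 = c(3, 5),
  NC_4 = c(4, 5), NC_5 = c(5, 5), NC_6 = c(3, 5)
)

# Average every column whose name starts with "NC_", row by row.
toy <- toy %>%
  mutate(confusion = rowMeans(across(starts_with("NC_"))))
toy$confusion
```

The result is the same as adding the six columns and dividing by 6, but the divisor can never get out of step with the number of items.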
First we name the plot object, then call ggplot() to set the x and y axes. The fill aesthetic colours the plot according to the Conflict variable.
geom_violin() adds the violin plots, geom_beeswarm() adds the individual points along each violin, and geom_crossbar() with stat = "summary" draws the mean and an interval around it. What the crossbar shows depends on the summary function it receives. Since ggplot2 3.3.0 the fun.y argument is deprecated in favour of fun and fun.data, and a crossbar needs a fun.data function that returns y, ymin and ymax (for example mean_cl_normal for the mean and a 95% confidence interval). If no recognised summary function is supplied, ggplot2 falls back to mean_se(), which is the mean plus or minus one standard error rather than a confidence interval.
ggtitle() gives the plot the title "Nutritional Confusion", and scale_y_continuous() gives the y axis the title "Nutritional Confusion" and sets the scale from 1 to 5.
The ggeasy functions pretty much do what's in the name of the function. For example, easy_center_title() centres the title.
The rest of the descriptive plots thankfully follow the same structure, with changes only to the plot name, the y variable, the axis limits and the titles.
Nutritional Confusion Plot
#Nutritional Confusion Plot
Confusion_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = confusion,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Nutritional Confusion") +
scale_y_continuous(name = "Nutritional Confusion", limits = c(1,5)) +
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Confusion_Plot)
Nutritional Backlash
Backlash_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = backlash,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Nutritional Backlash") +
scale_y_continuous(name = "Nutritional Backlash", limits = c(1,5)) +
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Backlash_Plot)
Mistrust of Expertise
Expertise_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = mistrust,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Mistrust of Expertise") +
scale_y_continuous(name = "Mistrust of Expertise", limits = c(1,5)) +
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Expertise_Plot)
Confidence in Scientific Community
Confidence_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = confidence,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Confidence in Scientific Community") +
scale_y_continuous(name = "Confidence in Scientific Community", limits = c(1,3)) + #limit was changed from 5 to 3
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Confidence_Plot)
Certainty of Knowledge
Certainty_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = certainty,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Certainty of Knowledge") +
scale_y_continuous(name = "Certainty of Knowledge", limits = c(1,5)) +
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Certainty_Plot)
Development of Knowledge
Development_Plot <- ggplot(Haigh_Data_2,
aes(x = Conflict,
y = development,
fill = Conflict)) +
#geom_functions
geom_violin() +
facet_wrap(vars(Format), strip.position = "bottom") +
geom_beeswarm(cex = 0.2) +
geom_crossbar(stat = "summary",
fun.data = mean_cl_normal, #mean and 95% CI; replaces fun.y, which ggplot2 ignores. Needs the Hmisc package installed
fill = "white") +
ggtitle(label = "Development of Knowledge") +
scale_y_continuous(name = "Development of Knowledge", limits = c(2.5,5)) + #Changed limit from 1 to 2.5
scale_fill_manual(
values = c("slategray2", "lightpink1")) +
#ggeasy functions
easy_center_title()+
easy_remove_legend()+
easy_remove_x_axis(what = c ("title"))
print(Development_Plot)
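Since the six plots only differ in the y variable, the title and the axis limits, the shared structure could be wrapped in a helper function. This is a rough sketch, not code from the original analysis: make_dv_plot, its arguments and the toy data are all hypothetical names, and the default summary function mean_cl_normal (mean and 95% confidence interval) is my assumption and needs the Hmisc package installed when the plot is drawn.

```r
library(ggplot2)
library(ggbeeswarm)
library(ggeasy)

# Hypothetical helper: builds one descriptive plot for the column named in dv.
make_dv_plot <- function(data, dv, title, limits = c(1, 5),
                         summary_fun = mean_cl_normal) {
  ggplot(data, aes(x = Conflict, y = .data[[dv]], fill = Conflict)) +
    geom_violin() +
    facet_wrap(vars(Format), strip.position = "bottom") +
    geom_beeswarm(cex = 0.2) +
    # fun.data must return y, ymin and ymax for the crossbar
    geom_crossbar(stat = "summary", fun.data = summary_fun, fill = "white") +
    ggtitle(label = title) +
    scale_y_continuous(name = title, limits = limits) +
    scale_fill_manual(values = c("slategray2", "lightpink1")) +
    easy_center_title() +
    easy_remove_legend() +
    easy_remove_x_axis(what = "title")
}

# Hypothetical toy data, only to show the call; Format labels are made up.
toy <- data.frame(
  Conflict  = factor(rep(c("Conf.", "Non-Conf."), each = 4)),
  Format    = factor(rep(c("Generic", "Qualified"), times = 4)),
  confusion = c(2, 3, 4, 3.5, 1.5, 2.5, 3, 2)
)
toy_plot <- make_dv_plot(toy, dv = "confusion", title = "Nutritional Confusion")
```

With the real data, each plot would then be a single line, for example make_dv_plot(Haigh_Data_2, "confidence", "Confidence in Scientific Community", limits = c(1, 3)).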
When combined, the plots were all squished, but a fix that Sam suggested worked: changing the figure's width and height in the chunk header, like so: {r, fig.width=10,fig.height=11}.
VoltronPlot2 <- Confusion_Plot + Backlash_Plot + Expertise_Plot + Confidence_Plot + Certainty_Plot + Development_Plot
print(VoltronPlot2)
I called it the voltron plot since all the pieces have come together!!
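Besides changing the figure size, patchwork also lets you control the grid itself. A minimal sketch using throwaway plots built from the built-in mtcars data (the panels and the names plots and combined are placeholders, not the plots from this log):

```r
library(ggplot2)
library(patchwork)

# Six placeholder plots standing in for the six descriptive plots.
plots <- lapply(1:6, function(i) {
  ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    ggtitle(paste("Panel", i))
})

# Arrange the six panels in a grid with two columns (three rows).
combined <- wrap_plots(plots, ncol = 2)
```

The same layout can be requested on a plot built with +, by adding plot_layout(ncol = 2) to the end of the chain.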