My Coding Goals This Week

  1. Finish the descriptives and demographics for Experiment 2
  2. Improve my rubber ducking
  3. Try to fix the GitHub Issue

Successes

  1. I was able to recreate the descriptives, demographic statistics and plots for Experiment 2
  2. I think my rubber ducks have been a lot better throughout this week

Challenges

  1. Reproducing the statistics came a lot more easily than reproducing the plots - I still need to work through some exercises that will help me complete the plots more easily.
  2. I was a bit scared to work on my own GitHub issue since we’ve had some problems where other members had their files deleted accidentally. So I did NOT want to add any more problems to the mix, but I’ll make sure to work on it next week since we have to start a collaborative GitHub document!

Goals Next Week

  1. To tidy up my code and to bring it all together - it’s currently quite split between learning logs Week 5 and 6, and some better implementations in the learning log were not applied to the original rmarkdown document.
  2. To bring it all together with the group on GitHub

Experiment 2

Loading in Libraries

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.6
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggeasy) 
library(ggbeeswarm)
library(qualtRics)
library(patchwork)

Reading the data

When reading in the data, setting the chunk options to {r echo=TRUE, message=FALSE, warning=FALSE} stops the knitted document from showing the long warning and the column specification message that the readr package prints.

Haigh_Data_2 <- read_csv(file = "Study 7 data.csv")

Cleaning up the Data

Removing lines 2 and 3

In the previous learning log, I used the slice() function that Jia found, but Torunn found a cleaner way using the qualtRics package: its read_survey() function removes the unnecessary lines while reading in the file.

Haigh_Data_2 <- read_survey("Study 7 data.csv")
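To show why the extra lines matter, here is a minimal sketch of the slice() approach on made-up data. The tibble toy_raw and everything in it are hypothetical stand-ins for a Qualtrics export, assuming the two rows after the header hold question text and import IDs rather than responses. Because those rows are text, every column arrives as character, so after dropping them the columns still need converting back to numbers.

```r
library(dplyr)
library(readr)

# Hypothetical stand-in for a Qualtrics export read with read_csv():
# the first two rows are metadata, not participant responses.
toy_raw <- tibble(
  Age    = c("How old are you?", '{"ImportId":"QID1"}', "23", "41"),
  Gender = c("What is your gender?", '{"ImportId":"QID2"}', "1", "2")
)

toy_clean <- toy_raw %>%
  slice(-(1:2)) %>%   # drop the two metadata rows
  type_convert()      # re-guess column types now that only data remain

str(toy_clean)
```

As far as I can tell, read_survey() takes care of both steps in one call, which is what makes it the cleaner option here.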

Making it into a data frame

When the data is in the form of a data frame, functions from dplyr (part of the tidyverse) are able to work on it as intended.

Haigh_Data_2 <- as.data.frame(Haigh_Data_2)

Renaming Odd Variables

This makes the data clearer to the reader, as some variable names are just incomprehensible. The rename() function is used to change them.

Haigh_Data_2 <- Haigh_Data_2 %>% 
  rename(recall_score = SC0, #new name = old name
         condition = FL_12_DO, 
         confidence = GSS)

Planned Exclusions

The planned exclusions for Experiment 2 were the same as the ones completed in Experiment 1. Participants were excluded if they:

  1. failed to complete the task
  2. admitted to not responding seriously
  3. failed the attention check by recalling fewer than 4 headlines (hence why recall_score >= 4)

Haigh_Data_2 <- Haigh_Data_2 %>%  
  filter(Finished == 1, 
         Serious_check == 1, 
         recall_score >= 4)

Demographics

Total number of participants

The count() function is used to check the current number of participants. It gives n = 400, which matches the study.

count(Haigh_Data_2) #400
##     n
## 1 400

Participants split by gender

To get the numbers, I created separate data tables filtered by gender. I’m wondering if there’s a more efficient way to do so - will look into this for next week.

Male_No.2 <- Haigh_Data_2 %>%
  filter(Gender == 1) # keeps only the rows where Gender is 1 (male): 150
count(Male_No.2)
##     n
## 1 150
Female_No.2 <- Haigh_Data_2 %>%
  filter(Gender == 2) # keeps only the rows where Gender is 2 (female)
count(Female_No.2) #248
##     n
## 1 248
Neither_No.2 <- Haigh_Data_2 %>%
  filter(Gender == 3) # keeps only the rows where Gender is 3 (neither)
count(Neither_No.2) #2
##   n
## 1 2
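One possibly more efficient way is to pass the grouping column straight to count(), which returns one row per value instead of needing a filtered table for each. This is only a sketch on made-up data: toy_data is a hypothetical stand-in for Haigh_Data_2, and only its Gender column matters.

```r
library(dplyr)

# Made-up stand-in for Haigh_Data_2; only the Gender column matters here.
toy_data <- tibble(Gender = c(1, 2, 2, 3, 1, 2))

gender_counts <- toy_data %>%
  count(Gender)   # one row per Gender value, with its count in column n

gender_counts
```

On the real data this would be Haigh_Data_2 %>% count(Gender), assuming Gender is still coded 1, 2 and 3.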

Age demographics

The age demographics of the participants are split along:

  1. Mean
  2. Standard deviation
  3. The range (between the youngest and oldest)

This is done via the summarise() function.

Haigh_Data_2 %>% 
  summarise (mean_age = mean(Age), #33.465
             SD = sd(Age), #12.03415
             min_age = min(Age), 
             max_age = max(Age)
  )
##   mean_age       SD min_age max_age
## 1   33.465 12.03415      18      73

Establishing Variables

Separating the Independent Variables (IV)

As in Experiment 1, the condition variable needs to be separated into its component variables: Format and Conflict. separate() splits it into four pieces; the block and number pieces are no longer needed, while Format and Conflict become their own separate variables.

Haigh_Data_2 <- Haigh_Data_2 %>%  
  separate(col = condition, 
           into = c(
             "block", 
             "number", 
             "Format",
             "Conflict"
           ))
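Here is a small sketch of what separate() does with a label like this. The condition values below are hypothetical (the real values in FL_12_DO may look different); the sketch assumes the four pieces are joined by a non-alphanumeric character such as an underscore, which is what separate() splits on by default.

```r
library(dplyr)
library(tidyr)

# Hypothetical condition labels; the real values in FL_12_DO may differ.
toy_conditions <- tibble(
  condition = c("Block_1_Generic_Conflict", "Block_2_Qualified_Consistent")
)

toy_split <- toy_conditions %>%
  separate(col = condition,
           into = c("block", "number", "Format", "Conflict")) %>%
  select(-block, -number)   # drop the pieces that are no longer needed

toy_split
```

The select() step is optional; it just removes the two leftover columns once they have served their purpose.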

Changing IVs into factors and renaming them

The as.factor() function is used to change the variables Format and Conflict into factors.

The two conditions within Conflict were also renamed using recode_factor() to bring the labels closer to the original paper’s descriptive plots.

#Changing into factors 
Haigh_Data_2$Format <- as.factor(Haigh_Data_2$Format)

Haigh_Data_2$Conflict <- as.factor(Haigh_Data_2$Conflict)

#Renaming conditions 
Haigh_Data_2$Conflict <- recode_factor(Haigh_Data_2$Conflict, 
                                       Conflict = "Conf.",
                                       Consistent = "Non-Conf.")
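One detail worth knowing: recode_factor() orders the new levels by the order of the replacements, and that order decides which colour in scale_fill_manual() goes to which condition. A minimal sketch on a made-up factor (conflict_raw is hypothetical, not the real column):

```r
library(dplyr)

# Made-up factor with the same two condition names as the real data.
conflict_raw <- factor(c("Consistent", "Conflict", "Consistent"))
levels(conflict_raw)   # alphabetical: "Conflict" "Consistent"

conflict_new <- recode_factor(conflict_raw,
                              Conflict = "Conf.",       # old name = new name
                              Consistent = "Non-Conf.")
levels(conflict_new)   # order of the replacements: "Conf." "Non-Conf."
```

Note that recode_factor() reads old name = new name, the opposite way round to rename().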

Creating new dependent variables (DV)

This creates five new variables: Nutritional Confusion, Nutritional Backlash, Mistrust of Expertise, Certainty of Knowledge and Development of Knowledge. Together with confidence (renamed from GSS earlier), that makes six dependent variables. To obtain the mean score for each new variable, we add up the scores of its items and then divide by the number of items.

#Nutritional Confusion Mean 
Haigh_Data_2$confusion <- ((Haigh_Data_2$NC_1 + Haigh_Data_2$NC_2 + Haigh_Data_2$NC_3 + Haigh_Data_2$NC_4 + Haigh_Data_2$NC_5 + Haigh_Data_2$NC_6)/6)

#Nutritional Backlash Mean
Haigh_Data_2$backlash <- ((Haigh_Data_2$NBS_1 + Haigh_Data_2$NBS_2 + Haigh_Data_2$NBS_3 + Haigh_Data_2$NBS_4 + Haigh_Data_2$NBS_5 + Haigh_Data_2$NBS_6)/6) 

#Mistrust of Expertise Mean
Haigh_Data_2$mistrust <- ((Haigh_Data_2$Mistrust_expertise_1 + Haigh_Data_2$Mistrust_expertise_2 + Haigh_Data_2$Mistrust_expertise_3)/3)

#Certainty of Knowledge Mean
Haigh_Data_2$certainty <- ((Haigh_Data_2$Certainty_sci_know_1 + Haigh_Data_2$Certainty_sci_know_2 + Haigh_Data_2$Certainty_sci_know_3 + Haigh_Data_2$Certainty_sci_know_4 + Haigh_Data_2$Certainty_sci_know_5 + Haigh_Data_2$Certainty_sci_know_6)/6)

#Development of Knowledge Mean
Haigh_Data_2$development <- ((Haigh_Data_2$Development_sci_know_1 + Haigh_Data_2$Development_sci_know_2 +Haigh_Data_2$Development_sci_know_3 +Haigh_Data_2$Development_sci_know_4 +Haigh_Data_2$Development_sci_know_5 +Haigh_Data_2$Development_sci_know_6)/6)
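A shorter way to get the same kind of mean score might be rowMeans() over a range of columns. This is a sketch on made-up item scores, assuming the item columns sit next to each other in the data so that NC_1:NC_6 selects all six; toy_items is hypothetical.

```r
library(dplyr)

# Made-up item scores; column names follow the NC_1 ... NC_6 pattern above.
toy_items <- tibble(
  NC_1 = c(1, 5), NC_2 = c(2, 4), NC_3 = c(3, 4),
  NC_4 = c(4, 4), NC_5 = c(5, 5), NC_6 = c(3, 2)
)

toy_items <- toy_items %>%
  mutate(confusion = rowMeans(across(NC_1:NC_6)))   # mean of the six items per row

toy_items$confusion
```

Unlike the manual sum, rowMeans() also has an na.rm argument, should missing item scores ever need handling.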

Descriptive Plots

First we name the plot object, then call ggplot() to map Conflict to the x axis and the dependent variable to the y axis; fill adds colour according to the Conflict variable.

geom_violin() adds the violin plots, facet_wrap() splits them into panels by Format, geom_beeswarm() adds the individual points along each violin, and geom_crossbar() is meant to showcase both the mean and confidence intervals. Passing fun.y = “mean_ci” was intended to request the mean and confidence intervals, but the warnings in the output below show that fun.y is ignored, so the crossbar falls back to mean_se() (the mean plus and minus one standard error).

ggtitle() gives the plot the title of “Nutritional Confusion”, and scale_y_continuous() gives the Y axis the title of “Nutritional Confusion” and sets the scale from 1 to 5.

The ggeasy functions pretty much do what their names say; for example, easy_center_title() centres the title.

The rest of the descriptive plots thankfully follow the same structure, with changes only to the plot name, the y variable, and the axis and plot titles.

Nutritional Confusion Plot

#Nutritional Confusion Plot 
Confusion_Plot <- ggplot(Haigh_Data_2, 
                         aes(x = Conflict,
                             y = confusion, 
                             fill = Conflict)) +
  #geom_functions
  geom_violin() + 
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci", 
                fill = "white") +
  ggtitle(label = "Nutritional Confusion") + 
  scale_y_continuous(name = "Nutritional Confusion", limits = c(1,5)) + 
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions 
  easy_center_title()+ 
  easy_remove_legend()+
  easy_remove_x_axis(what = c ("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Confusion_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
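The warning and messages above come from fun.y, which this version of ggplot2 does not recognise here. One possible fix, sketched below on made-up data, is to pass a summary function through fun.data instead. The helper mean_ci_95() is hypothetical (written for this sketch, not taken from the original paper) and assumes a t-based 95% confidence interval is what is wanted. ggplot2 also ships mean_cl_normal, which does a similar job but needs the Hmisc package installed.

```r
library(ggplot2)

# Hypothetical helper: mean with a t-based 95% confidence interval,
# returned in the y / ymin / ymax format that stat = "summary" expects.
mean_ci_95 <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  half_width <- qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
  data.frame(y = m, ymin = m - half_width, ymax = m + half_width)
}

# Made-up scores for two conditions.
toy_scores <- data.frame(
  Conflict  = rep(c("Conf.", "Non-Conf."), each = 5),
  confusion = c(3, 4, 4, 5, 4, 2, 3, 3, 2, 3)
)

toy_plot <- ggplot(toy_scores,
                   aes(x = Conflict, y = confusion, fill = Conflict)) +
  geom_violin() +
  geom_crossbar(stat = "summary",
                fun.data = mean_ci_95,   # fun.data, not fun.y
                fill = "white")
```

Printing toy_plot should draw the crossbars without the "No summary function supplied" message, since a summary function is now actually supplied.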

Nutritional Backlash

Backlash_Plot <- ggplot(Haigh_Data_2, 
                        aes(x = Conflict,
                            y = backlash, 
                            fill = Conflict)) +
  #geom_functions
  geom_violin() + 
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci", 
                fill = "white") +
  ggtitle(label = "Nutritional Backlash") + 
  scale_y_continuous(name = "Nutritional Backlash", limits = c(1,5)) + 
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions 
  easy_center_title()+ 
  easy_remove_legend()+
  easy_remove_x_axis(what = c ("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Backlash_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`

Mistrust of Expertise

Expertise_Plot <- ggplot(Haigh_Data_2,
                         aes(x = Conflict,
                             y = mistrust,
                             fill = Conflict)) +
  #geom_functions
  geom_violin() +
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci",
                fill = "white") +
  ggtitle(label = "Mistrust of Expertise") +
  scale_y_continuous(name = "Mistrust of Expertise", limits = c(1,5)) +
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions
  easy_center_title() +
  easy_remove_legend() +
  easy_remove_x_axis(what = c("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Expertise_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`

Confidence in Scientific Community

Confidence_Plot <- ggplot(Haigh_Data_2,
                          aes(x = Conflict,
                              y = confidence,
                              fill = Conflict)) +
  #geom_functions
  geom_violin() +
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci",
                fill = "white") +
  ggtitle(label = "Confidence in Scientific Community") +
  scale_y_continuous(name = "Confidence in Scientific Community", limits = c(1,3)) + #limit was changed from 5 to 3
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions
  easy_center_title() +
  easy_remove_legend() +
  easy_remove_x_axis(what = c("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Confidence_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`

Certainty of Knowledge

Certainty_Plot <- ggplot(Haigh_Data_2,
                         aes(x = Conflict,
                             y = certainty,
                             fill = Conflict)) +
  #geom_functions
  geom_violin() +
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci",
                fill = "white") +
  ggtitle(label = "Certainty of Knowledge") +
  scale_y_continuous(name = "Certainty of Knowledge", limits = c(1,5)) +
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions
  easy_center_title() +
  easy_remove_legend() +
  easy_remove_x_axis(what = c("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Certainty_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`

Development of Knowledge

Development_Plot <- ggplot(Haigh_Data_2,
                           aes(x = Conflict,
                               y = development,
                               fill = Conflict)) +
  #geom_functions
  geom_violin() +
  facet_wrap(vars(Format), strip.position = "bottom") +
  geom_beeswarm(cex = 0.2) +
  geom_crossbar(stat = "summary",
                fun.y = "mean_ci",
                fill = "white") +
  ggtitle(label = "Development of Knowledge") +
  scale_y_continuous(name = "Development of Knowledge", limits = c(2.5,5)) + #Changed limit from 1 to 2.5
  scale_fill_manual(
    values = c("slategray2", "lightpink1")) +
  #ggeasy functions
  easy_center_title() +
  easy_remove_legend() +
  easy_remove_x_axis(what = c("title"))
## Warning: Ignoring unknown parameters: fun.y
print(Development_Plot)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
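Since all six plots share one structure, the repetition could probably be folded into a small helper function. The sketch below is an assumption about how that might look, not something from the original paper: make_dv_plot() and toy_data are hypothetical, the beeswarm and ggeasy layers are left out to keep it short, and the crossbar uses mean_se explicitly, which is what the plots above ended up drawing.

```r
library(ggplot2)

# Hypothetical helper: builds one descriptive plot for the DV named in y_var.
make_dv_plot <- function(data, y_var, title, limits = c(1, 5)) {
  ggplot(data, aes(x = Conflict, y = .data[[y_var]], fill = Conflict)) +
    geom_violin() +
    facet_wrap(vars(Format), strip.position = "bottom") +
    geom_crossbar(stat = "summary", fun.data = mean_se, fill = "white") +
    ggtitle(label = title) +
    scale_y_continuous(name = title, limits = limits) +
    scale_fill_manual(values = c("slategray2", "lightpink1"))
}

# Made-up data with the same column names as Haigh_Data_2.
toy_data <- data.frame(
  Format    = rep(c("Generic", "Qualified"), each = 6),
  Conflict  = rep(c("Conf.", "Non-Conf."), times = 6),
  confusion = c(3, 2, 4, 3, 5, 2, 4, 3, 3, 2, 4, 3)
)

toy_confusion_plot <- make_dv_plot(toy_data, "confusion", "Nutritional Confusion")
```

The limits argument covers the two plots that use a different scale, for example make_dv_plot(Haigh_Data_2, "confidence", "Confidence in Scientific Community", limits = c(1, 3)).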

Combining all the plots together

When combined, the plots were all squished, but a fix that Sam suggested worked: changing the figure’s width and height in the chunk options, like so: {r, fig.width=10, fig.height=11}.

VoltronPlot2 <- Confusion_Plot + Backlash_Plot + Expertise_Plot + Confidence_Plot + Certainty_Plot + Development_Plot

print(VoltronPlot2)
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`
## No summary function supplied, defaulting to `mean_se()`

I called it the voltron plot since all the pieces have come together!!