A 3-point scale (know less, know the same, know more) was used to assess how participants perceive scientific advancement. I am interested in whether age influences participants’ ratings of scientific advancement (i.e. do older adults rate scientific advancement differently from younger adults in this study?). If so, age may be a confounding variable that could influence the results, and further studies should account for it when interpreting results.
First, I am going to group the ages into categories to allow easier interpretation. From the verification report, Experiment 1’s age range is 18-69, which allows the ages to be grouped into four generations: GenZ (18-24), Millennials (25-40), GenX (41-56) and Boomers (57-69).
I am thus going to use mutate() and case_when() to create a new variable called generations, using %in% to map each age range to a generation name. Then, I am converting generations to a factor and ordering its levels from youngest to oldest (R defaults to alphabetical order).
exponefinaldata <- exponefinaldata %>%
mutate(
generations = case_when(
Age %in% 18:24 ~ "GenZ",
Age %in% 25:40 ~ "Millennials",
Age %in% 41:56 ~ "GenX",
Age %in% 57:69 ~ "Boomers"
)
)
exponefinaldata$generations <- factor(exponefinaldata$generations, levels = c("GenZ", "Millennials", "GenX", "Boomers")) #change to factor variable and specify levels chronologically
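As an alternative sketch (not the approach used above), base R's cut() can bin ages and set the factor level order in one step. The ages vector below is made-up illustration data, and the break points assume the same integer age bands as the case_when() call.

```r
# Hypothetical ages for illustration only (not the study data)
ages <- c(18, 24, 25, 40, 41, 56, 57, 69)

# Intervals are right-closed by default: (17,24], (24,40], (40,56], (56,69]
generations <- cut(
  ages,
  breaks = c(17, 24, 40, 56, 69),
  labels = c("GenZ", "Millennials", "GenX", "Boomers")
)

# Levels follow the order of `labels`, so no separate factor() call is needed
levels(generations)
table(generations)
```

As with case_when(), any age outside 18-69 would come out as NA, so it is worth checking for missing values after binning.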
Next, I am going to create a frequency table of advancement by generations.
generations_table <- table(exponefinaldata$advancement, exponefinaldata$generations)
print(generations_table)
##
##      GenZ Millennials GenX Boomers
##   -1   28          33   19       5
##   0    41          69   32      17
##   1    14          25    9       2
Displaying the frequency table as percentages within each generation might be more insightful…
prop.table(
generations_table,
margin = 2 #proportions are calculated within each column, so each generation sums to 100%
)*100
##
##           GenZ Millennials      GenX   Boomers
##   -1 33.734940   25.984252 31.666667 20.833333
##   0  49.397590   54.330709 53.333333 70.833333
##   1  16.867470   19.685039 15.000000  8.333333
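A quick way to confirm what margin = 2 does is to check that every column adds up to 100. The sketch below retypes the counts from the frequency table printed earlier so that it runs on its own; counts and col_pct are illustrative names, not objects from the analysis.

```r
# Counts retyped from the printed frequency table (rows: advancement, columns: generations)
counts <- matrix(
  c(28, 33, 19,  5,
    41, 69, 32, 17,
    14, 25,  9,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    advancement = c("-1", "0", "1"),
    generations = c("GenZ", "Millennials", "GenX", "Boomers")
  )
)

# Percentages within each generation (column)
col_pct <- prop.table(counts, margin = 2) * 100
round(col_pct, 1)

# Each generation's column should sum to 100
colSums(col_pct)

# Group sizes, useful context when comparing percentages
colSums(counts)
```

Printing the group sizes alongside the percentages helps when reading the table, since a percentage based on a small group moves a lot with each additional response.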
From the table above, a noticeably higher proportion of Boomers say they know the same as before reading the headlines (about 71%), compared with roughly 49-54% of the other generations. Boomers are also the smallest group (24 participants), so their percentages are the least stable.
Plotting a bar plot of the proportions within each generation using ggplot() and geom_bar():
ggplot(
exponefinaldata, aes(advancement)) +
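geom_bar(aes(y=after_stat(prop), fill=generations, group=generations), position="dodge") + #plot y as proportion within each generation; after_stat(prop) replaces the deprecated ..prop..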
scale_y_continuous(labels=scales::percent) + #displays scale as a percentage i.e. out of 100
facet_wrap(vars(generations),strip.position = "bottom")
This is in line with what we observed in the frequency table: Boomers show a higher percentage for “know the same” than the other age groups. To determine whether this observed pattern is statistically significant, we need to run a chi-square test.
A chi-square test of independence is selected because both variables are categorical.
chisq.test(generations_table)
##
## Pearson's Chi-squared test
##
## data: generations_table
## X-squared = 5.0736, df = 6, p-value = 0.5344
The p-value is 0.5344, which is above the conventional 0.05 threshold. Therefore, we fail to reject the null hypothesis that no association exists between the two categorical variables, generations and perceived scientific advancement (i.e. that the two variables are independent). We have insufficient evidence to conclude that a participant’s generation and their perceived scientific advancement are related.
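Because the Boomers group is small, it may be worth checking the expected counts behind the test, since the chi-square approximation is usually considered less reliable when some expected counts fall below 5. The sketch below is one possible robustness check rather than part of the original analysis: it retypes the counts from the frequency table printed earlier, inspects the expected counts, and re-runs the test with a Monte Carlo p-value via the simulate.p.value argument of chisq.test(). The names counts, res and sim are illustrative.

```r
# Counts retyped from the printed frequency table (rows: advancement, columns: generations)
counts <- matrix(
  c(28, 33, 19,  5,
    41, 69, 32, 17,
    14, 25,  9,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    advancement = c("-1", "0", "1"),
    generations = c("GenZ", "Millennials", "GenX", "Boomers")
  )
)

# Standard Pearson test; suppressWarnings() hides any small-expected-count warning
res <- suppressWarnings(chisq.test(counts))
res$statistic
res$parameter  # degrees of freedom: (3 - 1) * (4 - 1)

# Expected counts under independence, and the smallest of them
round(res$expected, 2)
min(res$expected)

# Same test with a simulated (Monte Carlo) p-value, which does not rely on
# the large-sample chi-square approximation
set.seed(1)  # arbitrary seed so the simulation is repeatable
sim <- chisq.test(counts, simulate.p.value = TRUE, B = 10000)
sim$p.value
```

If the simulated p-value agrees with the asymptotic one, the conclusion above does not depend on the small cell.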
corrplot() was really cool and I found the process for Q2 much easier and faster! This could also be due to my familiarity with correlation coefficients from stats courses. :)