The spread of science denial and anti-science misinformation is a critical concern (McIntyre, 2019; Pertwee et al., 2022), highlighting the need for improved science literacy to help the public navigate the ‘post-truth’ world. Science literacy is more than simply recognizing facts; it also includes knowledge of scientific inquiry, the ability to critically evaluate claims, and recognition of the interrelationships between science, the humanities, society, and everyday life (National Academies of Sciences, Engineering, and Medicine, 2016). The significance of this aspect of science literacy is corroborated by research linking anti-intellectualism and the perception of science as limited to the spread of misinformation (Morgan et al., 2018; Chen et al., 2023; Merkley & Loewen, 2021).
Identity, particularly political identity, also plays a role in susceptibility to misinformation (Szebeni et al., 2021; Pereira et al., 2023), with emerging work on other identity domains (Oyserman & Dawson, 2020; Kahan, 2017; Takahashi et al., 2023; Chen et al., 2020). In response, science identity – the extent to which an individual sees themselves as a scientist (Gee, 2000; Carlone & Johnson, 2007), often linked to persistence and academic performance (Lockhart et al., 2022; Robinson et al., 2019) – emerges as a potential avenue for fostering science literacy and reducing susceptibility to misinformation (Većkalov et al., 2024; Lapsley & Chaloner, 2020; Lucas, 2021). This is potentially crucial for those in science-adjacent fields like nursing and allied health. Despite taking many basic and applied science courses, people in these fields are often not recognized as scientists by others (Perkins et al., 2023). They are a potential source of vaccine hesitancy and science-related populism (Mede & Schäfer, 2020; McCready et al., 2023), but also a potential resource in combating misinformation in communities both off- and online (Fotsch, 2022; Bautista et al., 2021).
This study explores the relationship between three dimensions of science identity (recognition, performance/competence, and interest) and two dimensions of science literacy (science applications and content connections) among a sample of community college biology students, the majority of whom are interested in nursing or allied health careers.
Biology students attending a diverse community college in northern California (n = 426) were asked to complete a pre/post survey in the Fall 2018, 2019, and 2020 quarters for extra credit. Students were enrolled in either an introductory biology course (n = 116), an anatomy and physiology sequence (n = 269), or a microbiology course (n = 114). The majority of students reported interest in nursing careers (n = 180) or allied health careers (e.g., dietician, kinesiologist, radiologist; n = 156). The majority of students also identified as Asian (n = 192), Latino/a/x (n = 98), or White (n = 85), and most were female (n = 316).
Students completed a measure of science identity that asked about recognition (α = .87), performance/competence (α = .87), and interest (α = .87). Two sets of items were used as measures of science literacy: science applications (α = .85) and content connections (α = .83). Science applications asked students to rate their agreement with statements about their scientific reasoning habits (“connecting key ideas I learn in my classes with other knowledge” and “using systematic reasoning in my approach to problems”). Content connections asked students to rate their agreement with statements about connecting ideas in their classes and the real world (“ideas and concepts we explore in this class relate to those that I have encountered in classes outside of this subject area” and “studying this subject can help to address real world issues”). Both measures are derived from the Student Assessment of Learning Gains (Seymour et al., 2000; Perkins et al., 2023).
Twenty-four outliers were detected using Mahalanobis distance and removed (final n = 402). Two regression models were used to examine the impact of science identity on science applications and content connections. The predictors were post-test scores of recognition, performance/competence, and interest, with pre-test scores in all science identity and literacy measures entered as covariates. Final course grade was also entered as a covariate to control for differences in self-efficacy and satisfaction with the course. We checked VIF, linearity of residuals, and homoscedasticity using diagnostic plots and the studentized Breusch-Pagan test.
Bautista, J. R., Zhang, Y., & Gwizdka, J. (2021). US Physicians’ and Nurses’ Motivations, Barriers, and Recommendations for Correcting Health Misinformation on Social Media: Qualitative Interview Study. JMIR Public Health and Surveillance, 7(9), e27715. https://doi.org/10.2196/27715
Carlone, H. B., & Johnson, A. (2007). Understanding the science experiences of successful women of color: Science identity as an analytic lens. Journal of Research in Science Teaching, 44(8), 1187–1218. https://doi.org/10.1002/tea.20237
Chen, K., Shao, A., Jin, Y., & Ng, A. (2020). I Am Proud of My National Identity and I Am Superior To You: The Role of Nationalism in Knowledge and Misinformation (SSRN Scholarly Paper 3758287). https://doi.org/10.2139/ssrn.3758287
Chen, Y., Long, J., Jun, J., Kim, S.-H., Zain, A., & Piacentine, C. (2023). Anti-intellectualism amid the COVID-19 pandemic: The discursive elements and sources of anti-Fauci tweets. Public Understanding of Science (Bristol, England), 32(5), 641. https://doi.org/10.1177/09636625221146269
Fotsch, R. (2022). Who to Believe? Consequences for Physicians and Nurses Who Spread Misinformation. Journal of Nursing Regulation, 13(1), 70–72. https://doi.org/10.1016/S2155-8256(22)00036-9
Gee, J. P. (2000). Identity as an Analytic Lens for Research in Education. Review of Research in Education, 25, 99–125. https://doi.org/10.2307/1167322
Kahan, D. M. (2017). Misconceptions, Misinformation, and the Logic of Identity-Protective Cognition (SSRN Scholarly Paper 2973067). https://doi.org/10.2139/ssrn.2973067
Lapsley, D., & Chaloner, D. (2020). Post-truth and science identity: A virtue-based approach to science education. Educational Psychologist, 55(3), 132–143. https://doi.org/10.1080/00461520.2020.1778480
Lockhart, M. E., Kwok, O.-M., Yoon, M., & Wong, R. (2022). An important component to investigating STEM persistence: The development and validation of the science identity (SciID) scale. International Journal of STEM Education, 9(1), 34. https://doi.org/10.1186/s40594-022-00351-1
Lucas, K. L. (2021). Examining Science Identity Work and Scientific Literacy in Non-STEM Majors [Ph.D., University of California, Santa Barbara]. https://www.proquest.com/docview/2522191041/abstract/151C83F54D824507PQ/1
McCready, J. L., Nichol, B., Steen, M., Unsworth, J., Comparcini, D., & Tomietto, M. (2023). Understanding the barriers and facilitators of vaccine hesitancy towards the COVID-19 vaccine in healthcare workers and healthcare students worldwide: An Umbrella Review. PLOS ONE, 18(4), e0280439. https://doi.org/10.1371/journal.pone.0280439
McIntyre, L. (2019). The Scientific Attitude: Defending Science from Denial, Fraud, and Pseudoscience. MIT Press.
Mede, N. G., & Schäfer, M. S. (2020). Science-related populism: Conceptualizing populist demands toward science. Public Understanding of Science, 29(5), 473–491. https://doi.org/10.1177/0963662520924259
Merkley, E., & Loewen, P. J. (2021). Anti-intellectualism and the mass public’s response to the COVID-19 pandemic. Nature Human Behaviour, 5(6), Article 6. https://doi.org/10.1038/s41562-021-01112-w
Morgan, M., Collins, W. B., Sparks, G. G., & Welch, J. R. (2018). Identifying Relevant Anti-Science Perceptions to Improve Science-Based Communication: The Negative Perceptions of Science Scale. Social Sciences, 7(4), Article 4. https://doi.org/10.3390/socsci7040064
National Academies of Sciences, Engineering, and Medicine. (2016). Science literacy: Concepts, contexts, and consequences. Washington, DC: The National Academies Press. https://doi.org/10.17226/23595
Oyserman, D., & Dawson, A. (2020). Your Fake News, Our Facts. In R. Greifeneder, M. Jaffe, E. Newman, & N. Schwarz, The Psychology of Fake News (1st ed., pp. 173–195). Routledge. https://doi.org/10.4324/9780429295379-13
Pereira, A., Harris, E., & Van Bavel, J. J. (2023). Identity concerns drive belief: The impact of partisan identity on the belief and dissemination of true and false news. Group Processes and Intergroup Relations, 26(1), 24–47. https://doi.org/10.1177/13684302211030004
Perkins, H., Royse, E. A., Cooper, S., Kurushima, J. D., & Schinske, J. N. (2023). Are there any “science people” in undergraduate health science courses? Assessing science identity among pre-nursing and pre-allied health students in a community college setting. Journal of Research in Science Teaching. Advance online publication. https://doi.org/10.1002/tea.21902
Pertwee, E., Simas, C., & Larson, H. J. (2022). An epidemic of uncertainty: Rumors, conspiracy theories and vaccine hesitancy. Nature Medicine, 28(3), Article 3. https://doi.org/10.1038/s41591-022-01728-z
Robinson, K. A., Perez, T., Carmel, J. H., & Linnenbrink-Garcia, L. (2019). Science identity development trajectories in a gateway college chemistry course: Predictors and relations to achievement and STEM pursuit. Contemporary Educational Psychology, 56, 180–192. https://doi.org/10.1016/j.cedpsych.2019.01.004
Seymour, E., Wiese, D., Hunter, A., & Daffinrud, S. M. (2000). Creating a better mousetrap: On-line student assessment of their learning gains. In National Meeting of the American Chemical Society (pp. 1-40). San Francisco, CA, USA: National Institute of Science Education, University of Wisconsin-Madison.
Szebeni, Z., Lönnqvist, J.-E., & Jasinskaja-Lahti, I. (2021). Social Psychological Predictors of Belief in Fake News in the Run-Up to the 2019 Hungarian Elections: The Importance of Conspiracy Mentality Supports the Notion of Ideological Symmetry in Fake News Belief. Frontiers in Psychology, 12. https://www.frontiersin.org/articles/10.3389/fpsyg.2021.790848
Takahashi, K., Jefferson, H., & Earl, A. (2023). White racial identity, political attitudes, and selective exposure to information about racis. https://doi.org/10.31234/osf.io/s52bz
Većkalov, B., Zarzeczna, N., McPhetres, J., van Harreveld, F., & Rutjens, B. T. (2024). Psychological Distance to Science as a Predictor of Science Skepticism Across Domains. Personality and Social Psychology Bulletin, 50(1), 18–37. https://doi.org/10.1177/01461672221118184
This step loads the libraries used in the rest of the analysis.
library(expss) # for cross_cases()
library(plyr) # for rbind.fill()
library(sjPlot) # for regression plots
library(psych) # for describe()
library(kableExtra) # for tables
library(corrplot) # for correlation plots
library(ggplot2) # for plots
library(ggpubr) # for plots
library(lmtest) # check homoscedasticity
library(tidyr) # for gather()
library(car) # for VIF
This step loads the data that was cleaned in a previous step. Scores have been previously calculated for all of the scales in this study.
d <- read.csv(file="data/prepost_clean.csv", header=T)
d2 <- subset(d, select=-c(sapp_pr, sapp_po, conn_pr, conn_po))
This step creates a table of the descriptive statistics, and bolds/italicizes any variables that have kurtosis/skew outside of the desired range (-2/+2).
desc <- describe(subset(d2, select=-c(1,12:16)))
kable(round(desc, digits = 2)) %>%
kable_styling() %>%
row_spec(which(desc$kurtosis > 2), bold = T) %>%
row_spec(which(desc$kurtosis < -2), bold = T) %>%
row_spec(which(desc$skew > 2), italic = T) %>%
row_spec(which(desc$skew < -2), italic = T)
| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| gen_pr | 1 | 426 | 3.48 | 1.04 | 3.50 | 3.53 | 0.74 | 1.00 | 5 | 4.00 | -0.41 | -0.37 | 0.05 |
| rec_pr | 2 | 426 | 3.30 | 0.95 | 3.33 | 3.32 | 0.99 | 1.00 | 5 | 4.00 | -0.31 | -0.30 | 0.05 |
| pc_pr | 3 | 426 | 3.78 | 0.76 | 4.00 | 3.81 | 0.74 | 1.25 | 5 | 3.75 | -0.40 | -0.11 | 0.04 |
| onint_pr | 4 | 426 | 4.20 | 0.76 | 4.25 | 4.29 | 0.74 | 1.00 | 5 | 4.00 | -0.99 | 0.97 | 0.04 |
| appcon_pr | 5 | 426 | 4.14 | 0.61 | 4.17 | 4.18 | 0.74 | 1.00 | 5 | 4.00 | -0.73 | 1.18 | 0.03 |
| gen_po | 6 | 426 | 3.96 | 0.90 | 4.00 | 4.06 | 0.74 | 1.00 | 5 | 4.00 | -0.74 | 0.15 | 0.04 |
| rec_po | 7 | 426 | 3.63 | 0.94 | 3.67 | 3.69 | 0.99 | 1.00 | 5 | 4.00 | -0.45 | -0.10 | 0.05 |
| pc_po | 8 | 426 | 3.93 | 0.77 | 4.00 | 3.98 | 0.74 | 1.00 | 5 | 4.00 | -0.55 | 0.10 | 0.04 |
| onint_po | 9 | 426 | 4.29 | 0.74 | 4.50 | 4.39 | 0.74 | 1.25 | 5 | 3.75 | -1.12 | 1.10 | 0.04 |
| appcon_po | 10 | 426 | 4.27 | 0.63 | 4.33 | 4.33 | 0.74 | 1.50 | 5 | 3.50 | -0.80 | 0.51 | 0.03 |
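The flagging rule behind this table can be sketched in base R. This is a minimal illustration on simulated data, not the study data. `skew_kurt()` is a hypothetical helper, and its moment-based formulas are assumed (not verified here) to correspond to the default estimates that `describe()` reports.

```r
# Hypothetical helper: moment-based skew and excess kurtosis
# (assumed to match the default estimates from psych::describe()).
skew_kurt <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  z <- x - mean(x)
  s <- sd(x)
  c(skew     = sum(z^3) / (n * s^3),
    kurtosis = sum(z^4) / (n * s^4) - 3)
}

# Simulated variables (not the study data)
set.seed(1)
sim <- data.frame(
  symmetric = rnorm(300),     # roughly normal
  skewed    = rexp(300)^2     # strongly right-skewed
)

# One row per variable, one column per statistic
stats <- t(sapply(sim, skew_kurt))
print(round(stats, 2))

# Flag variables with skew or kurtosis outside the -2/+2 range
flagged <- rownames(stats)[abs(stats[, "skew"]) > 2 |
                           abs(stats[, "kurtosis"]) > 2]
print(flagged)
```

In the table above, none of the ten scale scores falls outside the -2/+2 range, so no rows are bolded or italicized.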
d <- na.omit(subset(d2, select=-c(12:16)))
m_dist <- mahalanobis(d[-1], colMeans(d[-1]), cov(d[-1]))
d$MD <- round(m_dist, 1)
plot(d$MD)
describe(m_dist)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 426 9.98 6.79 7.86 8.93 5.08 1.12 41.24 40.12 1.52 2.47 0.33
cut <- qchisq(.95, df=(ncol(d)-1))
abline(a=cut, b=0, col="red")
d$outlier <- F
d$outlier[d$MD > cut] <- T
table(d$outlier)
##
## FALSE TRUE
## 383 43
outs <- subset(d, select=c(AnonymousID, outlier), outlier == T)
d3 <- subset(d2, !(AnonymousID %in% outs$AnonymousID))
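One detail worth checking in the cutoff is the degrees of freedom. The chi-square reference distribution for squared Mahalanobis distances uses the number of variables that entered the distance, whereas `ncol(d) - 1` appears to be evaluated after `MD` has been appended to `d`, so it may count one column more than the variables passed to `mahalanobis()`. The sketch below, on simulated data with a hypothetical helper `flag_mv_outliers()`, ties the degrees of freedom directly to the matrix used for the distance. It is an illustration only and does not reproduce the counts reported above.

```r
# Hypothetical helper: flag multivariate outliers with a chi-square cutoff
# whose df equals the number of variables used in the distance.
flag_mv_outliers <- function(x, p = 0.95) {
  x <- as.matrix(x)
  md <- mahalanobis(x, colMeans(x), cov(x))   # squared distances
  cutoff <- qchisq(p, df = ncol(x))
  list(md = md, cutoff = cutoff, outlier = md > cutoff)
}

# Simulated scores (not the study data) with one planted extreme case
set.seed(42)
scores <- as.data.frame(matrix(rnorm(200 * 4), ncol = 4))
names(scores) <- c("v1", "v2", "v3", "v4")
scores[1, ] <- c(8, -8, 8, -8)

res <- flag_mv_outliers(scores)
print(res$cutoff)
print(table(res$outlier))
```

Keeping the identifier column and any derived columns out of the matrix passed to the helper avoids having to adjust `ncol()` by hand.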
This step creates a correlation plot of the variables.
corrout <- corr.test(subset(d3, select=-c(1,12:16)))
corrplot(corrout$r, order="hclust", type="lower")
corrout$r
## gen_pr rec_pr pc_pr onint_pr appcon_pr gen_po rec_po
## gen_pr 1.0000000 0.7702534 0.6062522 0.5643257 0.5259461 0.5898872 0.5673558
## rec_pr 0.7702534 1.0000000 0.5876059 0.4867697 0.4513462 0.5788235 0.6416866
## pc_pr 0.6062522 0.5876059 1.0000000 0.5264379 0.5952228 0.4530899 0.4982170
## onint_pr 0.5643257 0.4867697 0.5264379 1.0000000 0.6235241 0.4799346 0.4133101
## appcon_pr 0.5259461 0.4513462 0.5952228 0.6235241 1.0000000 0.4017706 0.3676509
## gen_po 0.5898872 0.5788235 0.4530899 0.4799346 0.4017706 1.0000000 0.7504018
## rec_po 0.5673558 0.6416866 0.4982170 0.4133101 0.3676509 0.7504018 1.0000000
## pc_po 0.4439928 0.4406924 0.5882353 0.3840966 0.4365744 0.6957276 0.6611013
## onint_po 0.4756908 0.4127618 0.4520488 0.6171117 0.4826256 0.6296622 0.5550993
## appcon_po 0.4192606 0.3916757 0.4867315 0.4344190 0.5503966 0.5936771 0.5745589
## pc_po onint_po appcon_po
## gen_pr 0.4439928 0.4756908 0.4192606
## rec_pr 0.4406924 0.4127618 0.3916757
## pc_pr 0.5882353 0.4520488 0.4867315
## onint_pr 0.3840966 0.6171117 0.4344190
## appcon_pr 0.4365744 0.4826256 0.5503966
## gen_po 0.6957276 0.6296622 0.5936771
## rec_po 0.6611013 0.5550993 0.5745589
## pc_po 1.0000000 0.5579346 0.6789098
## onint_po 0.5579346 1.0000000 0.7006650
## appcon_po 0.6789098 0.7006650 1.0000000
This step recodes several variables at once. Some recoding happened in an earlier step.
This section was dropped so the code is commented out. Kept for future reference.
# d$rac_b <- 0
# d$rac_a <- 0
# d$rac_l <- 0
# d$rac_m <- 0
# d$rac_w <- 0
# d$rac_b[d$eth2 == "African American/Black"] <- 1
# d$rac_a[d$eth2 == "Asian"] <- 1
# d$rac_l[d$eth2 == "Latinx"] <- 1
# d$rac_m[d$eth2 == "Multiracial"] <- 1
# d$rac_w[d$eth2 == "White"] <- 1
# d$gen_m <- 0
# d$gen_f <- 0
# d$gen_m[d$gender == "M"] <- 1
# d$gen_f[d$gender == "F"] <- 1
Letter grades are recoded as high or low. Grades lower than a B- are coded as low.
# Initialize as NA so that unlisted grades (e.g., "P") stay missing
d3$grade_rc <- NA_character_
d3$grade_rc[d3$grade %in% c("A+", "A", "A-", "B+", "B", "B-")] <- "high"
d3$grade_rc[d3$grade %in% c("C+", "C", "D+", "D", "F")] <- "low"
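Students’ career interest in nursing or allied health is indicated here, using the existing variables pre_Career and post_Career. Participants who indicated nursing (N) at both pre- and post-test, or allied health (AH) at both, are coded as nah = 1. All other participants, including those whose answer changed between the two surveys, are coded as nah = 0.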
d3$nah <- 0
d3$nah[d3$pre_Career == "N" & d3$post_Career == "N"] <- 1
d3$nah[d3$pre_Career == "AH" & d3$post_Career == "AH"] <- 1
d3$nah <- as.factor(d3$nah)
table(d3$nah, useNA = "always")
##
## 0 1 <NA>
## 86 297 0
table(d3$post_Career, useNA = "always")
##
## AH N Non-STEM Other Health STEM Veterinary
## 147 174 5 31 4 22
## <NA>
## 0
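The coding rule can be checked on a few made-up cases. In this sketch the data frame `toy` and its rows are hypothetical; only the column names `pre_Career` and `post_Career` and the codes `N` and `AH` come from the analysis.

```r
# Hypothetical cases: two stable, three that switch or fall outside N/AH
toy <- data.frame(
  pre_Career  = c("N", "AH", "N",  "STEM", "AH"),
  post_Career = c("N", "AH", "AH", "N",    "Other Health"),
  stringsAsFactors = FALSE
)

# nah = 1 only when the same N or AH answer is given at both time points
same_track <- toy$pre_Career == toy$post_Career &
  toy$pre_Career %in% c("N", "AH")
toy$nah <- factor(as.integer(same_track), levels = c(0, 1))

print(toy)
print(table(toy$nah, useNA = "always"))
```

Under this rule a student who moves from N to AH (third row) is coded 0, which is consistent with the nah = 1 count above being smaller than the combined post-test N and AH counts.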
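# Pre-test scatterplot matrix: appcon_pr with rec_pr, pc_pr, onint_pr
# (columns selected by position from d, which still includes the flagged outliers)
subset_data <- subset(d, select=c(6,3:5))
pairs.panels(subset_data, pch = 16, lm = TRUE)
# Post-test scatterplot matrix: appcon_po with rec_po, pc_po, onint_po
subset_data <- subset(d, select=c(11,8:10))
pairs.panels(subset_data, pch = 16, lm = TRUE)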
All scale scores are standardized (z-scored) before modeling.
Applications and connections is the outcome. The predictors are post-test scores in recognition, performance/competence, and ongoing interest. Pre-test scores (recognition, performance/competence, ongoing interest, applications and connections) and grades are entered as controls.
Why pre-test controls? They adjust for pre-existing differences in science identity, content connections, and science applications, so that the effects are more plausibly attributable to changes that occurred during the class.
Overall, the model is significant (p < .001), with R^2 = .66 (adjusted R^2 = .65).
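The reasoning behind the pre-test controls can be illustrated with simulated data. In the sketch below every variable name and effect size is invented for illustration. Post-test identity has no direct effect on the post-test outcome, but both depend on pre-existing differences, so a model without pre-test covariates shows a spurious slope.

```r
# Simulated data (not the study data); all names and effects are hypothetical
set.seed(6)
n <- 5000
pre_id   <- rnorm(n)                              # pre-test identity
pre_out  <- 0.8 * pre_id + rnorm(n, sd = 0.6)     # pre-test outcome
post_id  <- 0.8 * pre_id + rnorm(n, sd = 0.6)     # post-test identity
post_out <- 0.7 * pre_out + rnorm(n, sd = 0.7)    # no direct effect of post_id

# Without pre-test covariates, post_id picks up pre-existing differences
naive <- lm(post_out ~ post_id)

# With pre-test covariates, the post_id slope is close to its true value of 0
adjusted <- lm(post_out ~ post_id + pre_id + pre_out)

print(round(coef(naive)[["post_id"]], 3))
print(round(coef(adjusted)[["post_id"]], 3))
```

Covariate adjustment of this kind reduces confounding by baseline differences but cannot rule out other unmeasured influences.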
d3$appcon_pr <- scale(d3$appcon_pr, center = T, scale = T)
d3$appcon_po <- scale(d3$appcon_po, center = T, scale = T)
d3$rec_pr <- scale(d3$rec_pr, center = T, scale = T)
d3$rec_po <- scale(d3$rec_po, center = T, scale = T)
d3$pc_pr <- scale(d3$pc_pr, center = T, scale = T)
d3$pc_po <- scale(d3$pc_po, center = T, scale = T)
d3$onint_pr <- scale(d3$onint_pr, center = T, scale = T)
d3$onint_po <- scale(d3$onint_po, center = T, scale = T)
regout <- lm(data=d3, appcon_po ~ rec_po + pc_po + onint_po + rec_pr + pc_pr + onint_pr + appcon_pr + grade_rc)
plot(regout, 1)
plot(regout, 4)
plot(regout, 5)
plot(regout, 3)
vif(regout)
## rec_po pc_po onint_po rec_pr pc_pr onint_pr appcon_pr grade_rc
## 2.609836 2.361217 2.151263 2.176797 2.299927 2.239105 2.001382 1.087889
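For reference, the variance inflation factor of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the remaining predictors. The sketch below computes it by hand in base R on simulated data. `vif_manual()` is a hypothetical helper that covers numeric predictors only; for factors with more than two levels, `car::vif()` reports a generalized VIF instead.

```r
# Hypothetical helper: VIF for each column of a data frame of numeric predictors
vif_manual <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    aux <- lm(reformulate(others, response = v), data = data)
    1 / (1 - summary(aux)$r.squared)
  })
}

# Simulated predictors (not the study data): x1 and x2 overlap, x3 is independent
set.seed(2)
n  <- 300
x1 <- rnorm(n)
predictors <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(n, sd = 0.5),
  x3 = rnorm(n)
)

v <- vif_manual(predictors)
print(round(v, 2))
```

All of the values reported above are below 3, which suggests that the pre- and post-test scores overlap only moderately.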
bptest(regout)
##
## studentized Breusch-Pagan test
##
## data: regout
## BP = 15.597, df = 8, p-value = 0.04853
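The studentized (Koenker) version of the Breusch-Pagan statistic can be written as n times the R^2 from regressing the squared residuals on the model's predictors. The base R sketch below follows that definition on simulated data. `bp_studentized()` is a hypothetical helper written for illustration, not the `lmtest` implementation.

```r
# Hypothetical helper: studentized Breusch-Pagan statistic as n * R^2
# from an auxiliary regression of squared residuals on the design matrix.
bp_studentized <- function(fit) {
  X   <- model.matrix(fit)
  e2  <- residuals(fit)^2
  aux <- lm.fit(X, e2)
  r2  <- 1 - sum(aux$residuals^2) / sum((e2 - mean(e2))^2)
  stat <- length(e2) * r2
  df   <- ncol(X) - 1                    # predictors, excluding the intercept
  c(BP = stat, df = df, p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Simulated data (not the study data): error spread grows with x
set.seed(3)
n <- 500
x <- runif(n, 0, 4)
y_het <- 1 + x + rnorm(n, sd = x)        # heteroscedastic errors
y_hom <- 1 + x + rnorm(n)                # constant error variance

bp_het <- bp_studentized(lm(y_het ~ x))
bp_hom <- bp_studentized(lm(y_hom ~ x))
print(round(bp_het, 4))
print(round(bp_hom, 4))
```

In the output above the test gives p = .049, just under the conventional .05 threshold, so the scale-location plot is worth reading alongside it.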
summary(regout)
##
## Call:
## lm(formula = appcon_po ~ rec_po + pc_po + onint_po + rec_pr +
## pc_pr + onint_pr + appcon_pr + grade_rc, data = d3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.27376 -0.38103 0.04293 0.33585 1.63500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01250 0.03376 0.370 0.711292
## rec_po 0.11467 0.04889 2.345 0.019534 *
## pc_po 0.32074 0.04651 6.897 2.28e-11 ***
## onint_po 0.43987 0.04439 9.909 < 2e-16 ***
## rec_pr -0.04902 0.04465 -1.098 0.273032
## pc_pr -0.02329 0.04590 -0.507 0.612221
## onint_pr -0.15247 0.04529 -3.367 0.000839 ***
## appcon_pr 0.28432 0.04282 6.641 1.10e-10 ***
## grade_rclow -0.06746 0.08112 -0.832 0.406222
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5915 on 374 degrees of freedom
## Multiple R-squared: 0.6574, Adjusted R-squared: 0.6501
## F-statistic: 89.72 on 8 and 374 DF, p-value: < 2.2e-16
plot_model(regout, type="est")
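The relationship between post-test performance/competence and applications and connections was significant (b = .32, p < .001). Because all scale scores are standardized, this means that a one standard deviation increase in performance/competence is associated with a .32 standard deviation increase in applications and connections, holding the other predictors constant.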
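Because the scale scores were z-scored before fitting, each slope is expressed in standard deviation units. The sketch below uses simulated data with hypothetical variable names to show how standardized and raw-scale slopes relate: the standardized slope equals the raw slope multiplied by sd(x) / sd(y). It also wraps `scale()` in `as.numeric()`, which keeps plain vectors instead of the one-column matrices that `scale()` returns.

```r
# Simulated data on a 1-5 style scale (not the study data)
set.seed(4)
n  <- 200
x1 <- rnorm(n, mean = 4.0, sd = 0.8)
x2 <- rnorm(n, mean = 3.5, sd = 1.0)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n, sd = 0.5)

# Hypothetical helper: z-score and drop the matrix attributes
z <- function(v) as.numeric(scale(v))

raw <- lm(y ~ x1 + x2)                 # raw-scale slopes
std <- lm(z(y) ~ z(x1) + z(x2))        # standardized slopes

# Convert raw slopes to standardized slopes by hand
converted <- coef(raw)[c("x1", "x2")] * c(sd(x1), sd(x2)) / sd(y)

print(round(unname(coef(std)[-1]), 4))
print(round(unname(converted), 4))
```

The same identity can be applied in reverse to report a slope on the original response scale.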
# Extract coefficients from the model
intercept <- coef(regout)[1]
slope <- coef(regout)[3]
# Create a sequence of values for x
x_values <- seq(-3.5, 1, length.out = 100)
# Create a dataframe for plotting the regression line
regression_line_data <- data.frame(x = x_values, y = intercept + slope * x_values)
plot1<- ggplot(d3, aes(x = pc_po, y = appcon_po)) +
geom_jitter(size = 4, width = .4, height = .4, alpha = .7) +
geom_line(data = regression_line_data, aes(x = x, y = y), color = "blue", size = 1.5) +
ylim(-3.5,1) + ylab("Applications & Connections") + xlab("Performance/Competence") + xlim(-3.5,1) +
theme_minimal()
plot1
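The line in this plot is built from the intercept and a single slope, which corresponds to holding the other standardized predictors at 0 (their means) and `grade_rc` at its reference level, `high`. The same line can be obtained from `predict()` with an explicit grid, which makes those choices visible. The sketch below uses simulated data and hypothetical names (`toy`, `x1`, `x2`, `grp`) rather than the fitted model above.

```r
# Simulated data (not the study data); names are hypothetical
set.seed(5)
n <- 300
toy <- data.frame(
  x1  = rnorm(n),
  x2  = rnorm(n),
  grp = factor(sample(c("high", "low"), n, replace = TRUE))
)
toy$y <- 0.4 * toy$x1 + 0.3 * toy$x2 - 0.1 * (toy$grp == "low") +
  rnorm(n, sd = 0.6)

fit <- lm(y ~ x1 + x2 + grp, data = toy)

# Grid: vary x1, hold x2 at 0 and grp at its reference level
grid <- data.frame(
  x1  = seq(-3, 3, length.out = 50),
  x2  = 0,
  grp = factor("high", levels = levels(toy$grp))
)
grid$y_hat <- unname(predict(fit, newdata = grid))

# Same line computed from the intercept and one slope
by_hand <- coef(fit)[["(Intercept)"]] + coef(fit)[["x1"]] * grid$x1

print(all.equal(grid$y_hat, by_hand))
```

Changing the values in `grid` is then enough to draw the line for a different reference group or covariate level.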
The relationship between post-test ongoing interest and applications and connections was significant (b = .44, p < .001). Increase in ongoing interest is associated with increased applications and connections.
# Extract coefficients from the model
intercept <- coef(regout)[1]
slope <- coef(regout)[4]
# Create a sequence of values for x
x_values <- seq(-3.5, 1, length.out = 100)
# Create a dataframe for plotting the regression line
regression_line_data <- data.frame(x = x_values, y = intercept + slope * x_values)
plot2 <- ggplot(d3, aes(x = onint_po, y = appcon_po)) +
geom_jitter(size = 4, width = .4, height = .4, alpha = .7) +
geom_line(data = regression_line_data, aes(x = x, y = y), color = "blue", size = 1.5) +
ylim(-3.5,1) + ylab("Applications & Connections") + xlab("Interest") + xlim(-3.5,1) +
theme_minimal()
plot2
The relationship between post-test recognition and applications and connections was significant (b = .11, p = .020). Increase in recognition is associated with increased applications and connections.
# Extract coefficients from the model
intercept <- coef(regout)[1]
slope <- coef(regout)[2]
# Create a sequence of values for x
x_values <- seq(-3.5, 1, length.out = 100)
# Create a dataframe for plotting the regression line
regression_line_data <- data.frame(x = x_values, y = intercept + slope * x_values)
plot3 <- ggplot(d3, aes(x = rec_po, y = appcon_po)) +
geom_jitter(size = 4, width = .4, height = .4, alpha = .7) +
geom_line(data = regression_line_data, aes(x = x, y = y), color = "blue", size = 1.5) +
ylim(-3.5,1) + ylab("Applications & Connections") + xlab("Recognition") + xlim(-3.5,1) +
theme_minimal()
plot3
ggarrange(plot1 + rremove("ylab") + rremove("y.text") + rremove("y.ticks"), plot2 + rremove("ylab") + rremove("y.text") + rremove("y.ticks"), plot3 + rremove("ylab") + rremove("y.text") + rremove("y.ticks"),
ncol = 3)