Summary of Major Findings

Based on the analyses run so far, the findings are organized below into primary findings (Section 1), followed by grit, gender interaction, mastery learning, LCAS, and project ownership findings (Sections 2–6).

1. Primary Findings

1.1 Students demonstrated significant gains in research skills

The modified Research Skills Questionnaire (RSQ) showed significant pre/post improvements, particularly in:

Designing experimental protocols (RSQ5)
Conducting literature searches
Interpreting data
Performing research-related tasks

The strongest improvement was observed for:

RSQ5: Design my own experimental lab protocol

This item showed the largest gain and aligns closely with the goals of the semester-long research experience.

1.2 Students demonstrated gains in experimental design competency (EDCI)

Students improved their ability to:

Evaluate experimental design
Identify controls
Interpret experimental outcomes
Apply scientific reasoning

Lower-performing students at baseline demonstrated the largest gains.

Evidence

Students in the lowest baseline EDCI quartile:

Increased by approximately 4.2 points

Students in the highest baseline quartile:

Slightly decreased (~1.4 points)

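This pattern is consistent with a ceiling effect, but also with regression to the mean, which produces the same pattern whenever groups are defined by their baseline score. The course may be particularly beneficial for students entering with lower experimental design competency, although the quartile comparison alone cannot separate that benefit from measurement artifacts.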
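The quartile comparison above can be sketched as follows. This is a minimal illustration on simulated data, not the study data: the data frame `toy`, the sample size, and the noise levels are all assumptions, and only the column names mirror the analysis. The simulation deliberately includes no true change, to show how much of a "low scorers gain, high scorers lose" pattern can arise from measurement noise alone.

```r
# Simulated illustration only: all values below are made up.
set.seed(1)
n <- 200
true_skill <- rnorm(n, mean = 9, sd = 2)  # stable underlying competency

toy <- data.frame(
  EDCI_Total_Score_Pre  = true_skill + rnorm(n, sd = 2),  # noisy pre-test
  EDCI_Total_Score_Post = true_skill + rnorm(n, sd = 2)   # noisy post-test, no true gain
)
toy$EDCI_Gain <- toy$EDCI_Total_Score_Post - toy$EDCI_Total_Score_Pre

# Assign baseline quartiles (1 = lowest, 4 = highest)
breaks <- quantile(toy$EDCI_Total_Score_Pre, probs = seq(0, 1, 0.25))
toy$Pre_Quartile <- cut(toy$EDCI_Total_Score_Pre, breaks = breaks,
                        include.lowest = TRUE, labels = 1:4)

# Mean gain within each baseline quartile
gain_by_quartile <- tapply(toy$EDCI_Gain, toy$Pre_Quartile, mean)
print(round(gain_by_quartile, 2))
```

If the real quartile gaps are much larger than what a noise-only simulation with plausible reliability produces, that would support a genuine differential benefit.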

1.3 Baseline performance was the strongest predictor of learning gains

Across all three outcome measures:

EDCI

EDCI_Pre → EDCI_Gain
p < .001

RSQ

RSQ_Pre → RSQ_Gain
p = .002

STEP

STEP_Pre → STEP_Gain
p < .001

Students with lower initial scores consistently showed larger gains.

2. Grit Findings

2.1 Grit predicted experimental design competency outcomes

After controlling for baseline EDCI scores:

Grit_Total
p = .016

Higher grit was associated with higher post-course EDCI scores.

This was one of the strongest non-baseline predictors identified in the study.
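A baseline-adjusted model of this kind might look like the sketch below. The data are simulated, and the names `EDCI_Pre` and `EDCI_Post` are shorthand assumptions for the pre and post EDCI totals; the coefficients used to generate the data are arbitrary.

```r
# Simulated illustration only: coefficients and sample size are made up.
set.seed(2)
n <- 60
toy <- data.frame(
  EDCI_Pre   = round(rnorm(n, mean = 9, sd = 3)),
  Grit_Total = runif(n, min = 2, max = 5)
)
toy$EDCI_Post <- 4 + 0.5 * toy$EDCI_Pre + 1.0 * toy$Grit_Total + rnorm(n, sd = 2)

# Post score regressed on baseline plus grit (ANCOVA-style adjustment)
fit <- lm(EDCI_Post ~ EDCI_Pre + Grit_Total, data = toy)
coefs <- summary(fit)$coefficients
print(round(coefs, 3))
```

Modeling the post score with the pre score as a covariate avoids the built-in negative correlation between a baseline and a gain score computed from it.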

2.2 Perseverance of Effort was associated with mastery efficiency

Perseverance of Effort

Represents:

Persistence through difficulty
Continuing after setbacks

Associated with:

Mastery efficiency
First-attempt mastery outcomes

2.3 Female students entered the course with higher grit

Total Grit

Female > Male
p = .011

Consistency of Interests

Female > Male
p = .017

Thus, female students reported greater long-term commitment and focus at the beginning of the course.

3. Gender Interactions

3.1 Consistency of Interests interacted with gender when predicting EDCI gains

Model:

EDCI_Gain ~ Consistency_Gain × Gender

Significant interaction

Consistency × Male
p = .020

Interpretation:

For male students, increases in consistency were associated with larger EDCI gains.
For female students, the relationship was much weaker.

This suggests consistency of interests may be particularly important for male students in semester-long research experiences.
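To make the interaction concrete, the sketch below fits the same model form on simulated data and recovers the slope for each gender from the coefficients. The data, effect sizes, and the two-level `Gender` factor with Female as the reference level are assumptions for illustration.

```r
# Simulated illustration only: slopes and sample size are made up.
set.seed(3)
n <- 80
toy <- data.frame(
  Gender = factor(rep(c("Female", "Male"), each = n / 2)),
  Consistency_Gain = rnorm(n, mean = 0, sd = 0.6)
)
true_slope <- ifelse(toy$Gender == "Male", 2.0, 0.2)
toy$EDCI_Gain <- 1 + true_slope * toy$Consistency_Gain + rnorm(n, sd = 1.5)

fit <- lm(EDCI_Gain ~ Consistency_Gain * Gender, data = toy)
b <- coef(fit)

# Simple slopes: the main effect is the Female (reference) slope;
# the Male slope adds the interaction coefficient.
slope_female <- b[["Consistency_Gain"]]
slope_male   <- b[["Consistency_Gain"]] + b[["Consistency_Gain:GenderMale"]]
print(round(c(Female = slope_female, Male = slope_male), 2))
```

Reporting both simple slopes alongside the interaction p-value makes it clear which group drives the effect.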

3.2 Similar gender trends appeared for mastery efficiency

Model:

Mean_Attempt ~ Perseverance × Gender

Trend-level interaction

p = .058

Interpretation:

Higher perseverance may be associated with achieving mastery in fewer attempts among males.
Little relationship was observed among females.

This should be reported as exploratory.

4. Mastery Learning Findings

4.1 Ultimate mastery attainment was nearly universal

Most students achieved mastery on:

5 or 6 of the 6 skills assessments

This produced a strong ceiling effect.

4.2 Mastery attainment was not predicted by measured student characteristics

The following variables did not predict total skills mastered:

Academic preparation
Psychological variables: Grit
Research ownership: POS Cognitive, POS Emotional
CURE experiences: LCAS Collaboration, LCAS Discovery, LCAS Iteration
Cluster membership: high-engagement vs low-engagement profiles

Thus:

Students generally achieved mastery regardless of background characteristics.

4.3 Perseverance predicted mastery efficiency

When mastery was examined as the mean number of attempts required, higher perseverance was associated with fewer attempts needed to achieve mastery.

When mastery was examined as the number of first-attempt passes, higher perseverance was associated with more first-attempt mastery successes.

These findings suggest perseverance may influence how quickly mastery is achieved rather than whether mastery is ultimately achieved.
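The two efficiency measures could be derived from the attempt labels roughly as follows. This is a sketch on a made-up three-student table; treating "No Mastery" as missing when averaging attempts is an assumption, as is the use of only three of the six skill columns.

```r
# Made-up example rows; column names follow the skills assessment data.
toy <- data.frame(
  Pipette.Pass = c("First", "Second", "First"),
  PCR.Pass     = c("First", "Third", "No Mastery"),
  Gel.Pass     = c("Second", "First", "First"),
  stringsAsFactors = FALSE
)

# Map attempt labels to attempt numbers; "No Mastery" becomes NA (assumption)
attempt_number <- c("First" = 1, "Second" = 2, "Third" = 3, "No Mastery" = NA)
attempts <- as.data.frame(lapply(toy, function(col) unname(attempt_number[col])))

# Mean attempts across skills where mastery was reached
toy$Mean_Attempt <- rowMeans(attempts, na.rm = TRUE)

# Count of skills mastered on the first attempt
toy$First_Attempt_Passes <- rowSums(toy[names(attempts)] == "First")
print(toy)
```

How "No Mastery" is handled changes the meaning of the mean, so the choice should be stated wherever this measure is reported.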

5. LCAS Findings

5.1 Collaboration predicted attitudes toward science (STEP)

Among LCAS dimensions:

Collaboration

p ≈ .01

predicted STEP gains.

Students who reported greater collaboration showed larger improvements in science attitudes.

5.2 Discovery and Iteration did not predict learning gains

Neither:

Discovery/Relevance
Iteration

significantly predicted:

EDCI gains
RSQ gains
STEP gains
Mastery outcomes

6. Project Ownership Findings

6.1 Project ownership did not predict learning gains

Neither:

POS Cognitive Ownership

nor

POS Emotional Ownership

significantly predicted:

EDCI gains
RSQ gains
STEP gains
Mastery outcomes

6.2 Students differed in ownership experiences

Cluster analysis identified:

Cluster 1: High-Ownership Researchers

Higher:

POS Cognitive
POS Emotional
Discovery
Iteration
Collaboration

Cluster 2: Low-Ownership Researchers

Lower scores on these measures.

6.3 Ownership profiles did not differ in outcomes

Despite large differences in perceived experiences:

Clusters showed similar:

EDCI gains
RSQ gains
STEP gains
Mastery attainment

This suggests students experienced the course differently while achieving similar educational outcomes.

Overall Take-Home Message

The strongest overarching findings are:

Students improved in research skills and experimental design competency.
Students entering with lower competency showed the greatest gains.
Grit emerged as the most consistent non-academic predictor of success.
Consistency of interests was associated with research competency development.
Perseverance of effort was associated with mastery efficiency.
Collaboration predicted positive science attitudes.
Mastery was ultimately achieved by most students regardless of background characteristics.
Students experienced the research course differently (ownership, discovery, iteration), but those differences did not translate into differences in learning outcomes.

#Excel data cleaning: removed all responses less than 30% complete (4 post-course surveys and 11 pre-course surveys). If someone submitted more than one response at a timepoint, the most complete was kept.
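The same cleaning rule could be scripted so that it is reproducible. The sketch below is a hypothetical equivalent of the manual Excel step, run on a made-up table; it assumes `Progress` is recorded on a 0–100 scale and that ties can be broken arbitrarily.

```r
# Made-up raw responses for illustration
raw <- data.frame(
  StudentID = c(1, 1, 2, 3, 3, 4),
  Timepoint = c("Pre", "Pre", "Pre", "Post", "Post", "Post"),
  Progress  = c(100, 45, 20, 30, 88, 100)
)

# Drop responses that are less than 30% complete
kept <- raw[raw$Progress >= 30, ]

# Sort so the most complete response comes first within each
# student/timepoint, then keep only that first row
kept <- kept[order(kept$StudentID, kept$Timepoint, -kept$Progress), ]
cleaned <- kept[!duplicated(kept[c("StudentID", "Timepoint")]), ]
print(cleaned)
```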

dat=read.csv("POBRebootCombined.csv")
names=read.csv("POB1_StudentIDs.csv")
demo=read.csv("StudentDemo.csv")

#install.packages("dplyr")
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

#identify names from student IDs and create "first name" "last name" variables.
dat_ID <- merge(dat, names, by = "StudentID", all.x = TRUE)

dat_full=dat_ID #this is the dataset used in analysis

demo = demo %>%
  dplyr::rename(StudentID = Student.ID)

dat_full=merge(dat_full, demo, by = "StudentID", all.x = TRUE)

dat_full <- dat_full %>%
   dplyr::rename(Cohort.TR = Cohort) %>% #change name as this column has both the cohort and indicator of transfer status
  mutate(
    Cohort = substr(Cohort.TR, 1, 4), #only keep semester and year as new column called cohort
    Transfer = ifelse(grepl("TR", Cohort.TR), 1, 0) #create a new variable flag for transfer status
  )

#use student-reported gender and race, as there are more options
dat_full <- dat_full %>%
  mutate(Gender = case_when(
    Gender == "Female" ~ "Female",
    Gender == "Male"   ~ "Male",
    TRUE               ~ "Other"
  ))

dat_full <- dat_full %>%
  mutate(Race = case_when(
    Race == "White" ~ "White",
    TRUE               ~ "PEER"
  ))

#replace "TR student" designation with "NA" in the standardized tests and HS GPA columns
dat_full <- dat_full %>%
  mutate(
    across(c(HS.GPA_IR, SATmath, SATeng, ACT),
           ~na_if(., "TR student"))
  )

#some students have ACT, some have SAT. Convert SAT scores to ACT for most complete data. 

convert_sat_to_act <- function(sat_total) {
  case_when(
    sat_total >= 1570 ~ 36,
    sat_total >= 1530 ~ 35,
    sat_total >= 1490 ~ 34,
    sat_total >= 1450 ~ 33,
    sat_total >= 1420 ~ 32,
    sat_total >= 1390 ~ 31,
    sat_total >= 1360 ~ 30,
    sat_total >= 1330 ~ 29,
    sat_total >= 1290 ~ 28,
    sat_total >= 1250 ~ 27,
    sat_total >= 1210 ~ 26,
    sat_total >= 1170 ~ 25,
    sat_total >= 1130 ~ 24,
    sat_total >= 1090 ~ 23,
    sat_total >= 1050 ~ 22,
    sat_total >= 1010 ~ 21,
    sat_total >= 970  ~ 20,
    sat_total >= 930  ~ 19,
    sat_total >= 890  ~ 18,
    sat_total >= 850  ~ 17,
    sat_total >= 810  ~ 16,
    sat_total >= 770  ~ 15,
    sat_total >= 730  ~ 14,
    sat_total >= 690  ~ 13,
    sat_total >= 650  ~ 12,
    sat_total >= 620  ~ 11,
    sat_total >= 590  ~ 10,
    sat_total >= 560  ~ 9,
    TRUE ~ NA_real_
  )
}

# Apply conversion
dat_full <- dat_full %>%
  mutate(
    SATmath = as.numeric(SATmath),
    SATeng = as.numeric(SATeng),  # Assuming this is your SAT EBRW column
    SAT_total = SATmath + SATeng,
    ACT_equiv = convert_sat_to_act(SAT_total)
  )

## Warning: There were 2 warnings in `mutate()`.
## The first warning was:
## ℹ In argument: `SATmath = as.numeric(SATmath)`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
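The coercion warning means some SAT entries were not numeric. Before accepting the resulting NAs, it may be worth listing which raw values failed. A minimal sketch on a made-up vector (the example strings are hypothetical):

```r
# Made-up raw values standing in for an SAT column read as text
sat_raw <- c("610", "N/A", "", "pending", "700", NA)

sat_num <- suppressWarnings(as.numeric(sat_raw))

# Entries that were present and non-blank but failed to convert
not_blank <- !is.na(sat_raw) & trimws(sat_raw) != ""
bad_values <- unique(sat_raw[not_blank & is.na(sat_num)])
print(bad_values)
```

Any value listed here is silently dropped from `SAT_total` and from the ACT equivalent derived from it.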

#install.packages("readr")
library(readr)

#combine the data from the converted ACT scores and the ACT IR reported scores into a single useable column.
dat_full <- dat_full %>%
  mutate(
    ACT_numeric = suppressWarnings(as.numeric(ACT)),
    ACT.complete = coalesce(ACT_equiv, ACT_numeric)
  ) %>%
  select(-ACT_numeric)

dat_full <- dat_full %>%
  mutate(
    Needs.Program = if_else(
      DBLight == 1 | Landers == 1 | SusPrimus == 1,
      1, 0
    )
  )

Construct Cleaning & Summarizing

library(dplyr)

# -------------------------
# 1. Grit scoring
# -------------------------

grit_scale_map <- c(
  "Not at all like me" = 1,
  "Not much like me"   = 2,
  "Somewhat like me"   = 3,
  "Mostly like me"     = 4,
  "Very much like me"  = 5
)

grit_items <- paste0("Grit", 1:12)

dat_full[grit_items] <- lapply(dat_full[grit_items], function(col) {
  as.numeric(grit_scale_map[as.character(col)])
})

# Correct reverse-coded items based on your survey order
reverse_items <- c("Grit2", "Grit3", "Grit5", "Grit7", "Grit8", "Grit11")

dat_full[reverse_items] <- lapply(dat_full[reverse_items], function(col) {
  6 - col
})

# Correct subscales based on the survey order
perseverance_items <- c("Grit1", "Grit4", "Grit6", "Grit9", "Grit10", "Grit12")
consistency_items  <- c("Grit2", "Grit3", "Grit5", "Grit7", "Grit8", "Grit11")

dat_full <- dat_full %>%
  mutate(
    Grit_Total = rowMeans(across(all_of(grit_items)), na.rm = TRUE),
    Grit_Perseverance = rowMeans(across(all_of(perseverance_items)), na.rm = TRUE),
    Grit_Consistency = rowMeans(across(all_of(consistency_items)), na.rm = TRUE)
  )
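One caution about the reverse-coding step: because `6 - x` is its own inverse, re-running that step on an already-reversed data frame quietly undoes it. The sketch below shows this on a made-up two-item table; the values are illustrative only.

```r
# Made-up responses on the 1-5 scale
toy <- data.frame(Grit1 = c(5, 4, 1), Grit2 = c(1, 2, 5))
reverse_items <- c("Grit2")

reverse_code <- function(df, items) {
  df[items] <- lapply(df[items], function(col) 6 - col)
  df
}

once  <- reverse_code(toy, reverse_items)   # correctly reversed
twice <- reverse_code(once, reverse_items)  # back to the original values

print(once$Grit2)
print(twice$Grit2)
```

Running the scoring script from a fresh read of the CSV each time avoids this.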

# -------------------------
# STEP-U
# -------------------------

step_map <- c(
  "Not Important" = 1,
  "Slightly important" = 2,
  "Fairly important" = 3,
  "Important" = 4,
  "Very Important" = 5
)

step_items <- paste0("STEP", 1:14)

dat_full[step_items] <- lapply(dat_full[step_items], function(col) {
  as.numeric(step_map[as.character(col)])
})

dat_full <- dat_full %>%
  mutate(
    STEP_Total = rowMeans(across(all_of(step_items)), na.rm = TRUE)
  )


# -------------------------
# Research Skills Questionnaire
# -------------------------

rsq_map <- c(
  "Somewhat Confident" = 1,
  "Moderately Confident" = 2,
  "Confident" = 3,
  "Very Confident" = 4
)

rsq_items <- paste0("RSQ", 1:12)

dat_full[rsq_items] <- lapply(dat_full[rsq_items], function(col) {
  as.numeric(rsq_map[as.character(col)])
})

dat_full <- dat_full %>%
  mutate(
    RSQ_Total = rowMeans(across(all_of(rsq_items)), na.rm = TRUE)
  )


# -------------------------
# LCAS
# -------------------------

lcas_freq_map <- c(
  "Never" = 1,
  "One or Two Times" = 2,
  "Monthly" = 3,
  "Weekly" = 4
)

lcas_agree_map <- c(
  "Strongly Disagree" = 1,
  "Somewhat Disagree" = 2,
  "Disagree" = 2,
  "Somewhat Agree" = 4,
  "Agree" = 4,
  "Strongly Agree" = 5
)

lcas_items <- c(
  paste0("LCAS", 1:6),
  paste0("LCAS.B", 1:5),
  paste0("LCAS.C", 1:6)
)

dat_full[lcas_items] <- lapply(dat_full[lcas_items], function(col) {
  x <- as.character(col)
  x[x == "I prefer not to respond"] <- NA
  out <- ifelse(
    x %in% names(lcas_freq_map),
    lcas_freq_map[x],
    lcas_agree_map[x]
  )
  as.numeric(out)
})

dat_full <- dat_full %>%
  mutate(
    LCAS_Total = rowMeans(across(all_of(lcas_items)), na.rm = TRUE),
    LCAS_A = rowMeans(across(paste0("LCAS", 1:6)), na.rm = TRUE),
    LCAS_B = rowMeans(across(paste0("LCAS.B", 1:5)), na.rm = TRUE),
    LCAS_C = rowMeans(across(paste0("LCAS.C", 1:6)), na.rm = TRUE)
  )


# -------------------------
# Project Ownership Survey
# -------------------------

pos_agree_map <- c(
  "Strongly disagree" = 1,
  "Somewhat disagree" = 2,
  "Neither agree nor disagree" = 3,
  "Somewhat agree" = 4,
  "Strongly agree" = 5
)

pos_amount_map <- c(
  "Very Slightly" = 1,
  "Slightly" = 2,
  "Moderate" = 3,
  "Considerably" = 4,
  "Very Strongly" = 5
)

pos_items <- paste0("POS", 1:16)

dat_full[pos_items] <- lapply(dat_full[pos_items], function(col) {
  x <- as.character(col)
  out <- ifelse(
    x %in% names(pos_agree_map),
    pos_agree_map[x],
    pos_amount_map[x]
  )
  as.numeric(out)
})

dat_full <- dat_full %>%
  mutate(
    POS_Total = rowMeans(across(all_of(pos_items)), na.rm = TRUE)
  )
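All of the label maps above convert any unmatched response to NA without a warning, so a difference in capitalization or a trailing space would look like missing data. A quick check for unmapped labels might look like this; the response vector is made up.

```r
pos_agree_map <- c(
  "Strongly disagree" = 1,
  "Somewhat disagree" = 2,
  "Neither agree nor disagree" = 3,
  "Somewhat agree" = 4,
  "Strongly agree" = 5
)

# Made-up responses, including a case mismatch and a trailing space
responses <- c("Strongly agree", "somewhat agree", "Somewhat agree ",
               NA, "Strongly disagree")

# Scores as the mapping would produce them
scored <- unname(pos_agree_map[responses])

# Non-missing labels that the map does not recognize
observed <- unique(responses[!is.na(responses)])
unmapped <- setdiff(observed, names(pos_agree_map))
print(unmapped)
```

An empty result for each scale confirms that every NA in the scored items reflects a genuinely missing response.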


#-----------------
# ED
# ----------------

correct_answers <- c(
  "No - mice should face a random direction and the enclosure should rotate randomly",
  "It should lead to mice with variable characteristics being distributed fairly evenly",
  "The other two methods do not directly assess weight changes in mice",
  "It provides more chances to compare activity rate across different temperatures",
  "Other variables as well as temperature will differ between treatment groups",
  "Mice in each treatment group should not vary in size, age or sex (small, young, females only)",
  "The 14C water temperature treatment group",
  "Take measurements from 14 fish in one temperature group but take measurements from all 15 fish in the other temperature groups (total n=44)",
  "The weights of fish in each treatment group were too variable",
  "Sex, age, health and weight at start",
  "Use a set of weighing scales that is able to measure smaller weights",
  "You can test all four of these hypotheses",
  "Spray plain water on another group of 150 tomato plants to compare results with P1, P2 and P3",
  "Potting soil type and temperature",
  "If you are able to support a hypothesis (either the alternate or null hypothesis) in all three experiments",
  "Any of the above sources could affect data more than the others in any given experiment",
  "Splitting the treatment groups into thirds is unfair as field conditions might vary",
  "Count aphids on 1 randomly selected leaf from 100 plants"
)

question_cols <- paste0("ED", 1:18)

for (i in seq_along(question_cols)) {
  col <- question_cols[i]
  
  student_answer <- trimws(iconv(dat_full[[col]], from = "", to = "UTF-8"))
  correct_answer <- trimws(correct_answers[i])
  
  dat_full[[paste0(col, "_correct")]] <- case_when(
    is.na(student_answer) | student_answer == "" ~ NA_integer_,
    student_answer == correct_answer ~ 1L,
    TRUE ~ 0L
  )
}

ed_correct_cols <- paste0(question_cols, "_correct")

dat_full <- dat_full %>%
  mutate(
    EDCI_Total_Score = rowSums(across(all_of(ed_correct_cols)), na.rm = TRUE),
    EDCI_Proportion_Correct = rowMeans(across(all_of(ed_correct_cols)), na.rm = TRUE)
  )
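Note that `rowSums(..., na.rm = TRUE)` returns 0 for a student who answered no EDCI items, which is indistinguishable from answering every item incorrectly. A sketch of one way to keep those cases as NA, on a made-up scored table:

```r
# Made-up scored items: row 2 answered nothing, row 3 answered two items wrong
scored <- data.frame(
  ED1_correct = c(1L, NA, 0L),
  ED2_correct = c(0L, NA, 0L),
  ED3_correct = c(1L, NA, NA)
)

n_answered <- rowSums(!is.na(scored))
total <- rowSums(scored, na.rm = TRUE)

# Keep the total only when at least one item was answered
total_or_na <- ifelse(n_answered == 0, NA, total)
print(data.frame(n_answered, total, total_or_na))
```

Flagging unanswered instruments this way distinguishes them from genuine zero scores when rows are later excluded on the basis of a zero total.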

Merge in Skills Assessment Data

skills=read.csv("SAlong.csv")

dat_full <- dat_full %>%
  left_join(
    skills,
    by = c("StudentID" = "Student_ID")
  )

## Warning in left_join(., skills, by = c(StudentID = "Student_ID")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 49 of `x` matches multiple rows in `y`.
## ℹ Row 108 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
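The many-to-many warning indicates that at least one student ID appears more than once in the skills file, so the join duplicates survey rows. A sketch of checking for this before joining, on made-up tables; keeping the first duplicate is an assumption, and the right resolution depends on why the IDs repeat.

```r
# Made-up tables for illustration
surveys <- data.frame(StudentID = c(101, 101, 102),
                      Timepoint = c("Pre", "Post", "Pre"))
skills  <- data.frame(Student_ID = c(101, 102, 102),
                      Total.Pass = c(6, 5, 5))

# IDs that occur more than once in the skills table
dup_ids <- unique(skills$Student_ID[duplicated(skills$Student_ID)])
print(dup_ids)

# One row per student before joining (keeps the first occurrence)
skills_unique <- skills[!duplicated(skills$Student_ID), ]

joined <- merge(surveys, skills_unique,
                by.x = "StudentID", by.y = "Student_ID", all.x = TRUE)
print(nrow(joined) == nrow(surveys))
```

If the row count after the join exceeds the row count before it, downstream observation counts are inflated.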

mastery_levels <- c(
  "First",
  "Second",
  "Third",
  "No Mastery"
)

dat_full <- dat_full %>%
  mutate(
    across(
      c(
        Pipette.Pass,
        Excel.Pass,
        Microscope.Pass,
        PCR.Pass,
        Gel.Pass,
        Sequencing.Pass
      ),
      ~factor(.x,
              levels = mastery_levels,
              ordered = TRUE)
    )
  )

###########################
#identify all StudentIDs that have both "Pre" and "Post" observations in the Timepoint column and keep only those observations
dat_paired <- dat_full %>%
  group_by(StudentID) %>%
  filter(all(c("Pre","Post") %in% Timepoint)) %>%
  group_by(StudentID, Timepoint) %>%
  slice_max(Progress,
            n = 1,
            with_ties = FALSE) %>%
  ungroup()
#keep only one Pre and one Post observation per StudentID, selecting the most complete one based on the highest value in the Progress column
#n should be 2 for all students

table(dat_paired %>% count(StudentID) %>% pull(n))

## 
##  2 
## 61

dat_paired <- dat_paired %>%
  filter(!(Timepoint == "Post" & EDCI_Total_Score == 0)) %>%
  filter(StudentID != "497798")

dat_paired contains 61 students for whom we have paired pre/post data (counted before the two exclusions applied above). dat_full contains 395 observations from 136 individual students.
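Because rows are removed after the pairing check, it may be worth re-counting complete pairs afterwards. A minimal sketch on a made-up table, where student "B" has lost the Post row:

```r
# Made-up paired data after exclusions
dat_paired_toy <- data.frame(
  StudentID = c("A", "A", "B", "C", "C"),
  Timepoint = c("Pre", "Post", "Pre", "Pre", "Post")
)

# Number of distinct timepoints remaining per student
n_timepoints <- tapply(dat_paired_toy$Timepoint, dat_paired_toy$StudentID,
                       function(tp) length(unique(tp)))

# Students who still have both a Pre and a Post row
complete_ids <- names(n_timepoints)[as.vector(n_timepoints) == 2]
print(complete_ids)
```

The length of `complete_ids` is the sample size that actually enters the gain-score analyses.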

Course Evals

eval=read.csv("evals.csv")

#Rate the overall quality of this course.   = Quality
#How would you rate instructional materials used in this course?    = Materials
#Do you feel course objectives were accomplished?   = MetObjectives
#Did this course increase your interest in the subject matter?  = Inc.Interest
#I prepared before coming to class. = Prepared
#What aspects of this course were most beneficial to you?   = Benefits
#What do you suggest to improve this course?    = Improvements
#Comment on the grading procedures and exams.    = Evaluation
#When registering, what was your opinion about the Course? = Ent.Opinion
#This course was: = Requirement

# "How did student perceptions change following implementation of the redesigned skills-based curriculum?"


library(dplyr)
library(tidyr)
library(ggplot2)
library(rstatix)

## 
## Attaching package: 'rstatix'

## The following object is masked from 'package:stats':
## 
##     filter

library(stringr)


extract_rating <- function(x) {
  x <- as.character(x)
  x <- trimws(x)

  case_when(
    str_detect(x, "\\(5\\)") ~ 5,
    str_detect(x, "\\(4\\)") ~ 4,
    str_detect(x, "\\(3\\)") ~ 3,
    str_detect(x, "\\(2\\)") ~ 2,
    str_detect(x, "\\(1\\)") ~ 1,
    x %in% c("5", "-5") ~ 5,
    x %in% c("4", "-4") ~ 4,
    x %in% c("3", "-3") ~ 3,
    x %in% c("2", "-2") ~ 2,
    x %in% c("1", "-1") ~ 1,
    TRUE ~ NA_real_
  )
}

# Descriptive Statistics
eval <- eval %>%
  mutate(
    Quality_num = extract_rating(Quality),
    Materials_num = extract_rating(Materials),
    MetObjectives_num = extract_rating(MetObjectives),
    IncInterest_num = extract_rating(Inc.Interest),
    Prepared_num = extract_rating(Prepared)
  )
eval <- eval %>%
  mutate(
    Curriculum = ifelse(
      Semester == "FA24",
      "New",
      "Old"
    )
  )

eval_descriptives <- eval %>%
  group_by(Curriculum) %>%
  summarise(
    N = n(),

    Quality_Mean = mean(Quality_num, na.rm=TRUE),
    Quality_SD = sd(Quality_num, na.rm=TRUE),

    Materials_Mean = mean(Materials_num, na.rm=TRUE),
    Materials_SD = sd(Materials_num, na.rm=TRUE),

    Objectives_Mean = mean(MetObjectives_num, na.rm=TRUE),
    Objectives_SD = sd(MetObjectives_num, na.rm=TRUE),

    Interest_Mean = mean(IncInterest_num, na.rm=TRUE),
    Interest_SD = sd(IncInterest_num, na.rm=TRUE),

    Prepared_Mean = mean(Prepared_num, na.rm=TRUE),
    Prepared_SD = sd(Prepared_num, na.rm=TRUE)
  )

eval_descriptives

## # A tibble: 2 × 12
##   Curriculum     N Quality_Mean Quality_SD Materials_Mean Materials_SD
##   <chr>      <int>        <dbl>      <dbl>          <dbl>        <dbl>
## 1 New           63         3.81       1.23           4.05         1.13
## 2 Old          304         4.05       1.02           4.07         1.02
## # ℹ 6 more variables: Objectives_Mean <dbl>, Objectives_SD <dbl>,
## #   Interest_Mean <dbl>, Interest_SD <dbl>, Prepared_Mean <dbl>,
## #   Prepared_SD <dbl>

#Wilcoxon Tests

wilcox.test(Quality_num ~ Curriculum, data=eval)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Quality_num by Curriculum
## W = 8645.5, p-value = 0.2133
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(Materials_num ~ Curriculum, data=eval)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Materials_num by Curriculum
## W = 9602, p-value = 0.9015
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(MetObjectives_num ~ Curriculum, data=eval)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  MetObjectives_num by Curriculum
## W = 8122, p-value = 0.2044
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(IncInterest_num ~ Curriculum, data=eval)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  IncInterest_num by Curriculum
## W = 9216.5, p-value = 0.9393
## alternative hypothesis: true location shift is not equal to 0

wilcox.test(Prepared_num ~ Curriculum, data=eval)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Prepared_num by Curriculum
## W = 10294, p-value = 0.2199
## alternative hypothesis: true location shift is not equal to 0

cohens_d(Quality_num ~ Curriculum, data=eval)

## # A tibble: 1 × 7
##   .y.         group1 group2 effsize    n1    n2 magnitude
## * <chr>       <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Quality_num New    Old     -0.212    63   303 small

cohens_d(Materials_num ~ Curriculum, data=eval)

## # A tibble: 1 × 7
##   .y.           group1 group2 effsize    n1    n2 magnitude 
## * <chr>         <chr>  <chr>    <dbl> <int> <int> <ord>     
## 1 Materials_num New    Old    -0.0234    63   302 negligible

cohens_d(MetObjectives_num ~ Curriculum, data=eval)

## # A tibble: 1 × 7
##   .y.               group1 group2 effsize    n1    n2 magnitude
## * <chr>             <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 MetObjectives_num New    Old     -0.211    59   304 small

cohens_d(IncInterest_num ~ Curriculum, data=eval)

## # A tibble: 1 × 7
##   .y.             group1 group2 effsize    n1    n2 magnitude 
## * <chr>           <chr>  <chr>    <dbl> <int> <int> <ord>     
## 1 IncInterest_num New    Old    -0.0289    61   304 negligible

cohens_d(Prepared_num ~ Curriculum, data=eval)

## # A tibble: 1 × 7
##   .y.          group1 group2 effsize    n1    n2 magnitude 
## * <chr>        <chr>  <chr>    <dbl> <int> <int> <ord>     
## 1 Prepared_num New    Old     0.0439    63   301 negligible
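As a check on how the effect sizes are computed, the Quality value can be recomputed from the summary statistics printed in this report. The sketch below compares two common denominators; that the reported value matches the average-variance form is inferred from the numbers here, not taken from the package documentation.

```r
# Summary statistics for Quality_num as printed in this report
mean_new <- 3.809524; sd_new <- 1.2294392; n_new <- 63
mean_old <- 4.049505; sd_old <- 1.0200919; n_old <- 303

# Denominator 1: square root of the mean of the two variances
d_avg_var <- (mean_new - mean_old) / sqrt((sd_new^2 + sd_old^2) / 2)

# Denominator 2: classic pooled standard deviation
sd_pooled <- sqrt(((n_new - 1) * sd_new^2 + (n_old - 1) * sd_old^2) /
                    (n_new + n_old - 2))
d_pooled <- (mean_new - mean_old) / sd_pooled

print(round(c(average_variance = d_avg_var, pooled = d_pooled), 3))
```

With group sizes this unequal, the two forms differ slightly, so the methods text should state which one was used.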

library(dplyr)

variables <- c(
  "Quality_num",
  "Materials_num",
  "MetObjectives_num",
  "IncInterest_num",
  "Prepared_num"
)

results_table <- lapply(variables, function(v){

  old <- eval %>%
    filter(Curriculum == "Old") %>%
    pull(all_of(v))

  new <- eval %>%
    filter(Curriculum == "New") %>%
    pull(all_of(v))

  wt <- wilcox.test(new, old)

  data.frame(
    Variable = v,
    Old_N = sum(!is.na(old)),
    New_N = sum(!is.na(new)),
    Old_Mean = mean(old, na.rm = TRUE),
    Old_SD = sd(old, na.rm = TRUE),
    New_Mean = mean(new, na.rm = TRUE),
    New_SD = sd(new, na.rm = TRUE),
    Wilcoxon_p = wt$p.value
  )
})

results_table <- bind_rows(results_table)

results_table

##            Variable Old_N New_N Old_Mean    Old_SD New_Mean    New_SD
## 1       Quality_num   303    63 4.049505 1.0200919 3.809524 1.2294392
## 2     Materials_num   302    63 4.072848 1.0220125 4.047619 1.1277807
## 3 MetObjectives_num   304    59 4.328947 0.9030664 4.118644 1.0841292
## 4   IncInterest_num   304    61 3.743421 1.2896988 3.704918 1.3704532
## 5      Prepared_num   301    63 4.455150 0.7408205 4.492063 0.9310593
##   Wilcoxon_p
## 1  0.2133261
## 2  0.9015430
## 3  0.2044136
## 4  0.9392737
## 5  0.2198689

Variable	Old Mean	New Mean	p
Quality	4.05	3.81	0.213
Materials	4.07	4.05	0.902
Objectives Met	4.33	4.12	0.204
Increased Interest	3.74	3.70	0.939
Prepared	4.46	4.49	0.220
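Five tests were run on the same evaluation data, so a multiplicity adjustment could be reported alongside the raw p-values. The sketch below applies a Holm correction to the p-values from the results table; choosing Holm is an assumption, and any standard method would lead to the same conclusion here.

```r
# Raw Wilcoxon p-values copied from the results table
p_raw <- c(
  Quality       = 0.2133261,
  Materials     = 0.9015430,
  MetObjectives = 0.2044136,
  IncInterest   = 0.9392737,
  Prepared      = 0.2198689
)

# Holm step-down adjustment for five comparisons
p_holm <- p.adjust(p_raw, method = "holm")
print(round(p_holm, 3))
```

Since none of the raw p-values is below 0.05, the adjustment does not change any conclusion; it only documents that multiplicity was considered.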

The redesigned curriculum was a major overhaul, yet:

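Overall course quality remained high (3.8–4.0 out of 5).
Students continued to rate materials highly (~4/5).
Students continued to feel prepared (~4.5/5).
Students continued to report that course objectives were met (~4.1–4.3/5).

So there is no statistically detectable evidence that the redesign harmed student perceptions (the quality and objectives ratings were slightly lower, with small effect sizes of about d = -0.21), despite introducing: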

skills assessments
mastery learning
multiple attempts
more structured skill development

Despite substantial changes to course structure and assessment practices, student evaluations remained consistently positive following implementation of the skills-based curriculum.

Institutional course evaluations were compared between semesters using the previous curriculum (FA21–FA23; n = 301–304 responses per item) and the redesigned skills-based curriculum (FA24; n = 59–63 responses per item) using Wilcoxon rank-sum tests. Student ratings remained positive across all evaluation dimensions following implementation of the redesign. No significant differences were observed in overall course quality (Old: M = 4.05, SD = 1.02; New: M = 3.81, SD = 1.23; p = 0.213), instructional materials (Old: M = 4.07, SD = 1.02; New: M = 4.05, SD = 1.13; p = 0.902), perceived achievement of course objectives (Old: M = 4.33, SD = 0.90; New: M = 4.12, SD = 1.08; p = 0.204), increased interest in the subject matter (Old: M = 3.74, SD = 1.29; New: M = 3.70, SD = 1.37; p = 0.939), or preparedness for class (Old: M = 4.46, SD = 0.74; New: M = 4.49, SD = 0.93; p = 0.220). Effect sizes were small for overall quality and objectives met (both d = -0.21) and negligible for the remaining items. These findings suggest that the transition to a skills-based curriculum maintained positive student perceptions despite substantial changes to course organization and assessment practices.

#Qualitative Analysis

N = 367 evaluations
Semesters:
- FA21: 89
- FA22: 101
- FA23: 114
- FA24 (new curriculum): 63

The three qualitative questions are:

Benefits – most beneficial aspect of the course
Improvements – suggested improvements
Evaluation – comments on grading/course structure

library(tidyr)
library(ggplot2)

eval_long <- eval %>%
  select(
    Curriculum,
    Quality_num,
    Materials_num,
    MetObjectives_num,
    IncInterest_num,
    Prepared_num
  ) %>%
  pivot_longer(
    -Curriculum,
    names_to="Measure",
    values_to="Score"
  )

ggplot(eval_long,
       aes(Curriculum,
           Score)) +
  geom_boxplot() +
  facet_wrap(~Measure) +
  theme_classic()

## Warning: Removed 12 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

#Demographics Report 
dat_full <- dat_full %>%
  rename(
    Section = Section.x
  )
demo_report <- dat_full %>%
  select(
    StudentID, Race, Gender, STEM.major, Section, ClassStanding,
    Ethnicity_IR, FirstGen_IR, Age, State_IR, Major_IR, Minor_IR,
    HS.GPA_IR, SMCM.GPA, LABGrade_IR, Cohort, Transfer,
    ACT.complete, Needs.Program
  )

#install.packages("forcats")
library(forcats)

demo_report <- demo_report %>%
  mutate(
    StudentID     = as.character(StudentID), # IDs often better as character
    Race          = as.factor(Race),
    Gender        = as.factor(Gender),
    STEM.major    = as.factor(STEM.major),
    Section       = as.factor(Section),
    ClassStanding = as.factor(ClassStanding),
    Ethnicity_IR  = as.factor(Ethnicity_IR),
    FirstGen_IR   = as.factor(FirstGen_IR),
    Age           = as.numeric(Age),
    State_IR      = as.factor(State_IR),
    Major_IR      = as.factor(trimws(Major_IR)),
    Minor_IR      = as.factor(trimws(Minor_IR)),
    HS.GPA_IR     = as.numeric(HS.GPA_IR),
    SMCM.GPA      = as.numeric(SMCM.GPA),
    LABGrade_IR   = as.factor(trimws(LABGrade_IR)),
    Cohort        = as.factor(Cohort),
    Transfer      = as.factor(Transfer),
    ACT.complete  = as.numeric(ACT.complete),
    Needs.Program = as.factor(Needs.Program)
  )


# Count non-NA values per row, then pick the row with most non-NAs for each StudentID
demo_report <- demo_report %>%
  mutate(non_missing = rowSums(!is.na(.))) %>%
  group_by(StudentID) %>%
  slice_max(order_by = non_missing, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(-non_missing)  # remove helper column

# Get basic summary
#install.packages("vtable")
library(vtable)

## Loading required package: kableExtra

## 
## Attaching package: 'kableExtra'

## The following object is masked from 'package:dplyr':
## 
##     group_rows

#Print table statistics in R Markdown
st(demo_report, out='browser')

Mastery Interactions

library(tidyr)
library(dplyr)

# 1. Keep only scores for pivoting
scores_wide <- dat_paired %>%
  select(
    StudentID,
    Timepoint,
    EDCI_Total_Score,
    STEP_Total,
    RSQ_Total
  ) %>%
  pivot_wider(
    id_cols = StudentID,
    names_from = Timepoint,
    values_from = c(
      EDCI_Total_Score,
      STEP_Total,
      RSQ_Total
    )
  )

# 2. Pull one row of predictors per student
first_nonmissing <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA else x[1]
}

predictors <- dat_paired %>%
  group_by(StudentID) %>%
  summarise(
    Gender = first_nonmissing(Gender),
    Race = first_nonmissing(Race),
    Transfer = first_nonmissing(Transfer),
    HS.GPA_IR = first_nonmissing(HS.GPA_IR),
    ACT.complete = first_nonmissing(ACT.complete),
    Grit_Total = first_nonmissing(Grit_Total),
    Total.Pass = first_nonmissing(Total.Pass),
    Total.NoMastery = first_nonmissing(Total.NoMastery),
    .groups = "drop"
  )

# 3. Merge and calculate gains
paired_wide <- scores_wide %>%
  left_join(predictors, by = "StudentID") %>%
  mutate(
    HS.GPA_IR = as.numeric(HS.GPA_IR),
    EDCI_Gain = EDCI_Total_Score_Post - EDCI_Total_Score_Pre,
    STEP_Gain = STEP_Total_Post - STEP_Total_Pre,
    RSQ_Gain = RSQ_Total_Post - RSQ_Total_Pre
  )


library(corrplot)

## corrplot 0.95 loaded

vars <- paired_wide %>%
  select(
    EDCI_Gain,
    STEP_Gain,
    RSQ_Gain,
    EDCI_Total_Score_Pre,
    STEP_Total_Pre,
    RSQ_Total_Pre,
    Total.Pass,
    ACT.complete,
    HS.GPA_IR,
    Grit_Total
  )

cor_matrix <- cor(vars, use = "pairwise.complete.obs")

#corrplot(cor_matrix)

paired_wide <- paired_wide %>%
  mutate(
    EDCI_Gain = EDCI_Total_Score_Post - EDCI_Total_Score_Pre,
    STEP_Gain = STEP_Total_Post - STEP_Total_Pre,
    RSQ_Gain = RSQ_Total_Post - RSQ_Total_Pre
  )

summary(paired_wide$EDCI_Gain)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -8.000  -1.000   1.000   1.071   2.250  13.000       4

vars <- paired_wide %>%
  select(
    EDCI_Gain,
    STEP_Gain,
    RSQ_Gain,
    EDCI_Total_Score_Pre,
    STEP_Total_Pre,
    RSQ_Total_Pre,
    Total.Pass,
    Total.NoMastery,
    ACT.complete,
    HS.GPA_IR,
    Grit_Total
  ) %>%
  mutate(across(everything(), as.numeric))

cor_matrix <- cor(vars, use = "pairwise.complete.obs")

cor_matrix

##                        EDCI_Gain    STEP_Gain    RSQ_Gain EDCI_Total_Score_Pre
## EDCI_Gain             1.00000000  0.048097988 -0.09983245          -0.62363110
## STEP_Gain             0.04809799  1.000000000  0.09981571           0.06408632
## RSQ_Gain             -0.09983245  0.099815712  1.00000000           0.20748809
## EDCI_Total_Score_Pre -0.62363110  0.064086319  0.20748809           1.00000000
## STEP_Total_Pre        0.12622679 -0.441745150 -0.07924675          -0.18649961
## RSQ_Total_Pre        -0.06243108  0.075752525 -0.37415719          -0.05255265
## Total.Pass           -0.04056777 -0.002024683  0.25193800           0.18374227
## Total.NoMastery       0.04056777  0.002024683 -0.25193800          -0.18374227
## ACT.complete         -0.12027711 -0.273900899 -0.23921021           0.61897315
## HS.GPA_IR             0.11238389 -0.209979996  0.07366262           0.09912554
## Grit_Total            0.02077638  0.119029714  0.17306099           0.12466470
##                      STEP_Total_Pre RSQ_Total_Pre   Total.Pass Total.NoMastery
## EDCI_Gain                0.12622679   -0.06243108 -0.040567775     0.040567775
## STEP_Gain               -0.44174515    0.07575253 -0.002024683     0.002024683
## RSQ_Gain                -0.07924675   -0.37415719  0.251937996    -0.251937996
## EDCI_Total_Score_Pre    -0.18649961   -0.05255265  0.183742267    -0.183742267
## STEP_Total_Pre           1.00000000    0.11701516  0.154929889    -0.154929889
## RSQ_Total_Pre            0.11701516    1.00000000 -0.140649046     0.140649046
## Total.Pass               0.15492989   -0.14064905  1.000000000    -1.000000000
## Total.NoMastery         -0.15492989    0.14064905 -1.000000000     1.000000000
## ACT.complete            -0.19138154    0.05964371  0.141742606    -0.141742606
## HS.GPA_IR                0.12405024   -0.28606301  0.184196206    -0.184196206
## Grit_Total              -0.13441470    0.19122944  0.053399135    -0.053399135
##                      ACT.complete   HS.GPA_IR  Grit_Total
## EDCI_Gain             -0.12027711  0.11238389  0.02077638
## STEP_Gain             -0.27390090 -0.20998000  0.11902971
## RSQ_Gain              -0.23921021  0.07366262  0.17306099
## EDCI_Total_Score_Pre   0.61897315  0.09912554  0.12466470
## STEP_Total_Pre        -0.19138154  0.12405024 -0.13441470
## RSQ_Total_Pre          0.05964371 -0.28606301  0.19122944
## Total.Pass             0.14174261  0.18419621  0.05339914
## Total.NoMastery       -0.14174261 -0.18419621 -0.05339914
## ACT.complete           1.00000000  0.37749725  0.03977249
## HS.GPA_IR              0.37749725  1.00000000  0.21530011
## Grit_Total             0.03977249  0.21530011  1.00000000

model_edci <- lm(
  EDCI_Gain ~ EDCI_Total_Score_Pre + Total.Pass + Grit_Total + HS.GPA_IR,
  data = paired_wide
)

summary(model_edci)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Total.Pass + 
##     Grit_Total + HS.GPA_IR, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.9763 -2.0637 -0.0853  2.7286  5.5324 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           -1.2248     5.3797  -0.228    0.821    
## EDCI_Total_Score_Pre  -0.7717     0.1406  -5.490 1.77e-06 ***
## Total.Pass             0.1285     0.4868   0.264    0.793    
## Grit_Total             0.9103     0.9028   1.008    0.319    
## HS.GPA_IR              1.2994     1.4023   0.927    0.359    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.239 on 45 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.4158, Adjusted R-squared:  0.3639 
## F-statistic: 8.007 on 4 and 45 DF,  p-value: 5.792e-05

model_step <- lm(
  STEP_Gain ~ STEP_Total_Pre + Total.Pass + Grit_Total + HS.GPA_IR,
  data = paired_wide
)

summary(model_step)

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + Total.Pass + Grit_Total + 
##     HS.GPA_IR, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.02492 -0.20356  0.03923  0.26730  0.90566 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     2.83280    0.87303   3.245 0.002220 ** 
## STEP_Total_Pre -0.51716    0.14081  -3.673 0.000635 ***
## Total.Pass      0.03501    0.06451   0.543 0.589994    
## Grit_Total     -0.01359    0.12317  -0.110 0.912649    
## HS.GPA_IR      -0.21373    0.18930  -1.129 0.264852    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4309 on 45 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.2718, Adjusted R-squared:  0.2071 
## F-statistic: 4.199 on 4 and 45 DF,  p-value: 0.005662

model_rsq <- lm(
  RSQ_Gain ~ RSQ_Total_Pre + Total.Pass + Grit_Total + HS.GPA_IR,
  data = paired_wide
)

summary(model_rsq)

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + Total.Pass + Grit_Total + 
##     HS.GPA_IR, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.90947 -0.27096 -0.01242  0.24523  1.17313 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)    0.54253    1.00412   0.540   0.5918   
## RSQ_Total_Pre -0.39328    0.12767  -3.081   0.0036 **
## Total.Pass     0.11126    0.07748   1.436   0.1582   
## Grit_Total     0.22256    0.15283   1.456   0.1526   
## HS.GPA_IR     -0.20167    0.24018  -0.840   0.4057   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.515 on 43 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.2367, Adjusted R-squared:  0.1657 
## F-statistic: 3.333 on 4 and 43 DF,  p-value: 0.01831

cor.test(
  paired_wide$Grit_Total,
  paired_wide$EDCI_Total_Score_Pre
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$Grit_Total and paired_wide$EDCI_Total_Score_Pre
## t = 0.95688, df = 58, p-value = 0.3426
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1334856  0.3669727
## sample estimates:
##       cor 
## 0.1246647

summary(
  lm(
    EDCI_Gain ~ EDCI_Total_Score_Pre + Grit_Total,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Grit_Total, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.9470 -1.9221  0.0321  2.6780  6.0687 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            3.4783     2.7228   1.277    0.207    
## EDCI_Total_Score_Pre  -0.7330     0.1214  -6.039 1.57e-07 ***
## Grit_Total             1.0359     0.7947   1.303    0.198    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.142 on 53 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.4079, Adjusted R-squared:  0.3856 
## F-statistic: 18.26 on 2 and 53 DF,  p-value: 9.301e-07

summary(
  lm(
    EDCI_Gain ~ EDCI_Total_Score_Pre + Total.NoMastery,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Total.NoMastery, 
##     data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.6352 -1.8291 -0.0751  2.2779  6.1866 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            7.1041     1.1923   5.958 2.11e-07 ***
## EDCI_Total_Score_Pre  -0.7178     0.1226  -5.855 3.08e-07 ***
## Total.NoMastery       -0.2907     0.4469  -0.651    0.518    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.179 on 53 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.3938, Adjusted R-squared:  0.3709 
## F-statistic: 17.21 on 2 and 53 DF,  p-value: 1.738e-06

summary(
  lm(
    EDCI_Gain ~ EDCI_Total_Score_Pre + Total.Pass,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Total.Pass, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.6352 -1.8291 -0.0751  2.2779  6.1866 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            5.3598     2.3905   2.242   0.0292 *  
## EDCI_Total_Score_Pre  -0.7178     0.1226  -5.855 3.08e-07 ***
## Total.Pass             0.2907     0.4469   0.651   0.5182    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.179 on 53 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.3938, Adjusted R-squared:  0.3709 
## F-statistic: 17.21 on 2 and 53 DF,  p-value: 1.738e-06
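
The Total.NoMastery and Total.Pass models above have identical residuals, R-squared, and slope magnitude because the two predictors correlate at exactly -1 in the correlation matrix. The intercept difference in the output (7.1041 - 5.3598 = 1.7443, about 6 × 0.2907) is consistent with Total.NoMastery being 6 minus Total.Pass, though that is an inference from the coefficients rather than something the data dictionary confirms. A minimal sketch of the mechanism with simulated data; the variable names, the total of 6, and all generating values are assumptions for illustration:

```r
# Simulated data; names and the total of 6 assessments are assumptions.
set.seed(1)
n <- 60
pre <- round(runif(n, 2, 14))
pass <- sample(0:6, n, replace = TRUE)
no_mastery <- 6 - pass            # exact linear recoding of pass
gain <- 5 - 0.7 * pre + 0.3 * pass + rnorm(n, sd = 3)

fit_pass <- lm(gain ~ pre + pass)
fit_nomastery <- lm(gain ~ pre + no_mastery)

# Same fit: the slope flips sign and the intercept absorbs 6 * slope.
coef(fit_pass)
coef(fit_nomastery)

slope_sum <- unname(coef(fit_pass)["pass"] + coef(fit_nomastery)["no_mastery"])
intercept_shift <- unname(
  coef(fit_nomastery)["(Intercept)"] - coef(fit_pass)["(Intercept)"]
)
resid_equal <- isTRUE(all.equal(
  unname(resid(fit_pass)), unname(resid(fit_nomastery))
))
```

Because the two codings carry the same information, only one of them needs to be reported, and they cannot be entered in the same model.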

summary(lm(
  EDCI_Total_Score_Post ~
    EDCI_Total_Score_Pre +
    Total.Pass,
  data = paired_wide
))

## 
## Call:
## lm(formula = EDCI_Total_Score_Post ~ EDCI_Total_Score_Pre + Total.Pass, 
##     data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.6352 -1.8291 -0.0751  2.2779  6.1866 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)  
## (Intercept)            5.3598     2.3905   2.242   0.0292 *
## EDCI_Total_Score_Pre   0.2822     0.1226   2.302   0.0253 *
## Total.Pass             0.2907     0.4469   0.651   0.5182  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.179 on 53 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.1084, Adjusted R-squared:  0.07472 
## F-statistic: 3.221 on 2 and 53 DF,  p-value: 0.04786
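
The gain-score model and the post-score model with the same predictors are the same regression. Because EDCI_Gain = Post - Pre, moving Pre to the left-hand side lowers the Pre coefficient by exactly 1 (0.2822 - 1 = -0.7178 in the output above) and leaves every other coefficient, standard error, and residual unchanged; only R-squared differs, because the outcome variance differs. A sketch with simulated data (all names and generating values are hypothetical):

```r
set.seed(2)
n <- 56
pre <- rnorm(n, mean = 8, sd = 3)
covar <- rnorm(n)                       # stands in for any other predictor
post <- 5 + 0.3 * pre + 0.5 * covar + rnorm(n, sd = 3)
gain <- post - pre

fit_gain <- lm(gain ~ pre + covar)
fit_post <- lm(post ~ pre + covar)

b_gain <- coef(fit_gain)
b_post <- coef(fit_post)

pre_shift  <- unname(b_post["pre"] - b_gain["pre"])       # differs by 1
covar_diff <- unname(b_post["covar"] - b_gain["covar"])   # no difference
se_equal <- isTRUE(all.equal(
  summary(fit_gain)$coefficients[, "Std. Error"],
  summary(fit_post)$coefficients[, "Std. Error"]
))

c(pre_shift = pre_shift, covar_diff = covar_diff)
```

One practical consequence: the p-value for any covariate (grit, mastery, GPA) is the same whichever framing is used, so the two models should not be reported as independent evidence.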

summary(lm(
  RSQ_Total_Post ~
    RSQ_Total_Pre +
    Total.Pass,
  data = paired_wide
))

## 
## Call:
## lm(formula = RSQ_Total_Post ~ RSQ_Total_Pre + Total.Pass, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.89176 -0.32109 -0.06236  0.30750  1.12325 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.36671    0.48548   0.755    0.454    
## RSQ_Total_Pre  0.68536    0.11580   5.919 2.75e-07 ***
## Total.Pass     0.11841    0.07325   1.616    0.112    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5174 on 51 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.4121, Adjusted R-squared:  0.3891 
## F-statistic: 17.88 on 2 and 51 DF,  p-value: 1.309e-06

cor.test(
  paired_wide$Total.Pass,
  paired_wide$RSQ_Gain
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$Total.Pass and paired_wide$RSQ_Gain
## t = 1.8773, df = 52, p-value = 0.06609
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.01696714  0.48685572
## sample estimates:
##      cor 
## 0.251938

cor.test(
  paired_wide$Grit_Total,
  paired_wide$RSQ_Gain
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$Grit_Total and paired_wide$RSQ_Gain
## t = 1.2671, df = 52, p-value = 0.2108
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.09930107  0.42129901
## sample estimates:
##      cor 
## 0.173061
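
With roughly 50 to 60 paired students, a correlation has to be fairly large before it reaches p < .05. The sketch below uses the standard t transformation of Pearson's r to reproduce the test statistic reported above for Total.Pass and RSQ_Gain, and to compute the smallest |r| that would be significant at the same degrees of freedom. The helper names are illustrative.

```r
# t statistic for a Pearson correlation with df = n - 2
r_to_t <- function(r, df) r * sqrt(df) / sqrt(1 - r^2)

# Smallest |r| reaching two-sided significance at level alpha
critical_r <- function(df, alpha = 0.05) {
  t_crit <- qt(1 - alpha / 2, df)
  t_crit / sqrt(t_crit^2 + df)
}

t_pass <- r_to_t(0.251938, df = 52)   # r and df from the cor.test output
p_pass <- 2 * pt(-abs(t_pass), df = 52)
r_needed <- critical_r(52)

round(c(t = t_pass, p = p_pass, r_needed = r_needed), 4)
```

At df = 52 the threshold is about |r| = 0.27, so the observed r = 0.25 falls just short, and most entries in the correlation matrix are well below it.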

Pre-score needs to be included in all models.

STEP-U: The course appears to improve scientific thinking broadly, but the amount of improvement is primarily explained by where students started (STEP_Total_Pre, p < .001). No other effects.
Research Skills Questionnaire: Again, lower initial scores predict larger gains (RSQ_Total_Pre, p = .004). No other effects.
Mastery performance (Total.Pass) does not predict learning gains in any of the measured constructs.
Students improved in EDCI, RSQ, and STEP-U overall (from your earlier paired tests).
Grit did not significantly predict gains in experimental design once baseline EDCI was controlled (p = .198 with baseline only; p = .319 in the full model). The coefficient was positive in both models, so there is no evidence here that lower-grit students gained more.

Outcome	What predicts it?
Experimental design learning (EDCI)	Prior competency
Science attitudes/perceptions (STEP)	Prior attitudes
Research self-efficacy (RSQ)	Prior self-efficacy; the association with mastery was not significant (r = .25, p = .066)

Predictor (EDCI gain model)	Result
EDCI Pre	Significant (p < .001)
Grit	Not significant (p = .32)
Mastery	Not significant (p = .79)
GPA	Not significant (p = .36)

Predictor (STEP gain model)	Result
STEP Pre	Significant (p < .001)
Mastery	Not significant (p = .59)
Grit	Not significant (p = .91)
GPA	Not significant (p = .26)

RSQ measures things like:

confidence in research
research practices
research process skills

Methods:

To examine factors associated with student learning gains, multiple linear regression analyses were conducted using gain scores for experimental design competency (EDCI), research self-efficacy (RSQ), and science perceptions (STEP-U) as dependent variables. Predictor variables included pre-course instrument scores, the total number of skills assessments passed, high school GPA, and grit. Grit was measured using the 12-item Duckworth Grit Scale, with negatively worded items reverse scored prior to calculating overall grit scores. Variables were selected a priori based on their potential relationship to student learning and academic performance.

To further investigate the role of baseline preparation, students were grouped according to pre-course EDCI performance. Gain scores were compared across baseline performance groups using independent-samples t-tests and descriptive analyses. Pearson correlations were used to examine relationships among gains in experimental design competency, research self-efficacy, science perceptions, grit, and mastery assessment performance.
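
The reverse-scoring step described for the grit scale can be sketched as follows. This is an illustration under stated assumptions, not the scoring script used for Grit_Total: the 1-5 response range, averaging across items, and in particular the positions of the negatively worded items are assumptions that should be checked against the scale's published scoring key. The function and data names are hypothetical.

```r
# Reverse-score Likert items, then average the items into a total score.
score_scale <- function(items, reverse_items, scale_min = 1, scale_max = 5) {
  items <- as.matrix(items)
  items[, reverse_items] <- (scale_min + scale_max) - items[, reverse_items]
  rowMeans(items, na.rm = TRUE)
}

# Two hypothetical respondents on 12 items scored 1-5.
responses <- rbind(
  c(5, 1, 1, 5, 1, 5, 1, 1, 5, 5, 1, 5),
  c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)
)

# ASSUMPTION: positions of the negatively worded items; verify with the key.
reverse_items <- c(2, 3, 5, 7, 8, 11)

grit_total <- score_scale(responses, reverse_items)
grit_total
```

The first respondent answers every item in the grittiest direction and so scores at the top of the range after reversal; the second sits at the midpoint, which reversal leaves unchanged.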

Results:

Regression analyses identified baseline experimental design competency as the only significant predictor of EDCI gains. Students with lower pre-course EDCI scores demonstrated significantly larger gains across the semester (b = -0.77, p < .001). Grit did not significantly predict EDCI gains after controlling for baseline competency (b = 0.91, p = .319), and neither the total number of skills assessments passed (p = .793) nor high school GPA (p = .359) significantly predicted EDCI improvement. The model accounted for 42% of the variance in EDCI gains (R² = .42, F(4, 45) = 8.01, p < .001).

Additional analyses indicated that students entering the course with lower experimental design competency exhibited substantially larger gains than students entering with higher competency levels. This pattern was evident across baseline performance quartiles, with the lowest-performing quartile demonstrating the largest average improvement. Although ceiling effects likely contributed to this pattern, the results suggest that the course was particularly effective for students entering with limited experimental design experience.

Grit did not significantly predict gains in research self-efficacy (RSQ; p = .153) or science perceptions (STEP-U; p = .913). Mastery assessment performance showed a small positive correlation with RSQ gains that did not reach significance (r = 0.25, p = .066), and it was not a significant predictor in the RSQ regression (p = .158). No relationship between mastery performance and EDCI or STEP-U gains was observed.

High school GPA was not a significant predictor in any of the models, and ACT score was not entered as a predictor in the regressions reported here. Collectively, these findings indicate that baseline score was the only consistent predictor of gains on each instrument; neither grit nor mastery assessment performance explained additional variance at this sample size.

EDCI

# Item difficulty table

ed_items <- paste0("ED", 1:18, "_correct")
dat_paired <- dat_paired %>%
  mutate(
    across(all_of(ed_items), as.numeric)
  )
item_summary <- dat_paired %>%
  group_by(Timepoint) %>%
  # Anonymous function replaces the deprecated `...` argument of across()
  summarise(across(all_of(ed_items), \(x) mean(x, na.rm = TRUE))) %>%
  pivot_longer(-Timepoint,
               names_to="Item",
               values_to="PercentCorrect") %>%
  pivot_wider(
    names_from=Timepoint,
    values_from=PercentCorrect
  ) %>%
  mutate(
    # Change in percent correct, in percentage points
    Change=(Post-Pre)*100
  )

mcnemar_results <- lapply(ed_items, function(item) {
  
  wide_item <- dat_paired %>%
    select(StudentID, Timepoint, all_of(item)) %>%
    pivot_wider(
      names_from = Timepoint,
      values_from = all_of(item)
    ) %>%
    filter(!is.na(Pre), !is.na(Post))
  
  tab <- table(
    factor(wide_item$Pre, levels = c(0, 1)),
    factor(wide_item$Post, levels = c(0, 1))
  )
  
  test <- mcnemar.test(tab)
  
  data.frame(
    Item = item,
    N = nrow(wide_item),
    Pre_Correct = mean(wide_item$Pre, na.rm = TRUE),
    Post_Correct = mean(wide_item$Post, na.rm = TRUE),
    Change = mean(wide_item$Post, na.rm = TRUE) - mean(wide_item$Pre, na.rm = TRUE),
    p_value = test$p.value
  )
})

mcnemar_results <- bind_rows(mcnemar_results)

mcnemar_results

##            Item  N Pre_Correct Post_Correct      Change    p_value
## 1   ED1_correct 52   0.2500000    0.2692308  0.01923077 1.00000000
## 2   ED2_correct 51   0.3921569    0.4313725  0.03921569 0.78926803
## 3   ED3_correct 51   0.6078431    0.5882353 -0.01960784 1.00000000
## 4   ED4_correct 52   0.3269231    0.4230769  0.09615385 0.40424849
## 5   ED5_correct 52   0.7500000    0.7307692 -0.01923077 1.00000000
## 6   ED6_correct 52   0.4038462    0.2692308 -0.13461538 0.16866862
## 7   ED7_correct 49   0.5714286    0.6938776  0.12244898 0.21129955
## 8   ED8_correct 49   0.4081633    0.5918367  0.18367347 0.06645742
## 9   ED9_correct 39   0.6923077    0.6153846 -0.07692308 0.60557662
## 10 ED10_correct 48   0.8333333    0.8125000 -0.02083333 1.00000000
## 11 ED11_correct 48   0.6250000    0.6041667 -0.02083333 1.00000000
## 12 ED12_correct 49   0.5102041    0.5306122  0.02040816 1.00000000
## 13 ED13_correct 41   0.5609756    0.5609756  0.00000000 1.00000000
## 14 ED14_correct 41   0.4634146    0.4146341 -0.04878049 0.78926803
## 15 ED15_correct 40   0.6500000    0.5250000 -0.12500000 0.30169958
## 16 ED16_correct 39   0.6666667    0.6153846 -0.05128205 0.77282999
## 17 ED17_correct 41   0.3170732    0.3414634  0.02439024 1.00000000
## 18 ED18_correct 42   0.4285714    0.4047619 -0.02380952 1.00000000

Item-level McNemar tests did not identify individual EDCI items with statistically significant pre/post changes, suggesting that improvement in total EDCI score reflected small gains distributed across multiple experimental design concepts rather than a large shift in a single item.
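
The RSQ item tests in this analysis are corrected with Benjamini-Hochberg; applying the same correction to the 18 McNemar p-values keeps the two item-level analyses on the same footing. A sketch using the p-values printed in the table above, rounded to four decimals:

```r
# McNemar p-values copied (rounded) from the table above, ED1 to ED18
p_mcnemar <- c(1.0000, 0.7893, 1.0000, 0.4042, 1.0000, 0.1687,
               0.2113, 0.0665, 0.6056, 1.0000, 1.0000, 1.0000,
               1.0000, 0.7893, 0.3017, 0.7728, 1.0000, 1.0000)
names(p_mcnemar) <- paste0("ED", 1:18)

# Benjamini-Hochberg false discovery rate adjustment
p_bh <- p.adjust(p_mcnemar, method = "BH")
n_significant <- sum(p_bh < 0.05)

round(sort(p_bh)[1:3], 3)
n_significant
```

No item is significant before correction, so none can be after it; the adjusted values simply make that explicit and allow the EDCI and RSQ item tables to be reported in the same way.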

RSQ

rsq_items <- paste0("RSQ", 1:14)

rsq_results <- lapply(rsq_items, function(item){

  wide_item <- dat_paired %>%
    select(StudentID, Timepoint, all_of(item)) %>%
    pivot_wider(
      id_cols = StudentID,
      names_from = Timepoint,
      values_from = all_of(item)
    ) %>%
    mutate(
      Pre = as.numeric(Pre),
      Post = as.numeric(Post)
    ) %>%
    filter(!is.na(Pre), !is.na(Post))

  if(nrow(wide_item) < 2){
    return(data.frame(
      Item = item,
      N = nrow(wide_item),
      Pre_Mean = NA,
      Post_Mean = NA,
      Difference = NA,
      t = NA,
      p = NA
    ))
  }

  test <- t.test(
    wide_item$Post,
    wide_item$Pre,
    paired = TRUE
  )

  data.frame(
    Item = item,
    N = nrow(wide_item),
    Pre_Mean = mean(wide_item$Pre),
    Post_Mean = mean(wide_item$Post),
    Difference = mean(wide_item$Post - wide_item$Pre),
    t = unname(test$statistic),
    p = test$p.value
  )
})

## Warning: There were 2 warnings in `mutate()`.
## The first warning was:
## ℹ In argument: `Pre = as.numeric(Pre)`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.

rsq_results <- bind_rows(rsq_results) %>%
  mutate(
    p_adj = p.adjust(p, method = "BH")
  ) %>%
  arrange(p_adj)

rsq_results

##     Item  N Pre_Mean Post_Mean Difference          t            p        p_adj
## 1  RSQ10 52 2.019231  2.826923  0.8076923  4.4832127 4.188366e-05 0.0005026039
## 2   RSQ9 53 2.264151  2.830189  0.5660377  3.0168720 3.946946e-03 0.0236816781
## 3   RSQ2 53 2.169811  2.566038  0.3962264  2.8659147 5.988214e-03 0.0239528571
## 4   RSQ7 51 2.000000  2.352941  0.3529412  2.4799882 1.655132e-02 0.0496539658
## 5   RSQ1 54 2.555556  2.851852  0.2962963  2.0252338 4.789277e-02 0.1056542350
## 6   RSQ3 54 1.870370  2.222222  0.3518519  1.9806838 5.282712e-02 0.1056542350
## 7  RSQ11 53 2.207547  2.528302  0.3207547  1.7794538 8.100984e-02 0.1388740089
## 8   RSQ4 53 2.113208  2.358491  0.2452830  1.4789043 1.451985e-01 0.2177977961
## 9   RSQ6 53 2.075472  2.264151  0.1886792  1.3721129 1.759188e-01 0.2345583638
## 10  RSQ8 54 2.259259  2.481481  0.2222222  1.1369621 2.606685e-01 0.3128022291
## 11 RSQ12 54 2.055556  2.222222  0.1666667  0.8156421 4.183555e-01 0.4563877701
## 12  RSQ5 52 1.961538  1.846154 -0.1153846 -0.6433400 5.228859e-01 0.5228858674
## 13 RSQ13  0       NA        NA         NA         NA           NA           NA
## 14 RSQ14  0       NA        NA         NA         NA           NA           NA

Item	Raw p	Adjusted p	Mean Change
RSQ10	0.00004	0.0005	+0.81
RSQ9	0.0039	0.0237	+0.57
RSQ2	0.0060	0.0240	+0.40
RSQ7	0.0166	0.0497	+0.35
RSQ1	0.0479	0.1057	+0.30

RSQ Item	Question	Δ
RSQ1	Work collaboratively and productively in a team	+0.30 (not significant after adjustment)
RSQ2	Perform background research of the scientific literature on a topic	+0.40
RSQ7	Perform statistical analyses	+0.35
RSQ9	Present lab results to my lab members	+0.57
RSQ10	Communicate the rationale for doing an experiment to others	+0.81

Notice what didn’t improve significantly:

Reading literature
Designing experiments
Interpreting data
Writing reports
Scientific argumentation

Item-level analyses of the Research Skills Questionnaire revealed significant gains in four competencies following correction for multiple comparisons (Benjamini-Hochberg). The largest increase was observed for students’ confidence in communicating the rationale for an experiment to others (RSQ10; Δ = 0.81, adjusted p < 0.001). Significant gains were also observed in confidence presenting laboratory results to peers (RSQ9; Δ = 0.57, adjusted p = 0.024), performing background literature research (RSQ2; Δ = 0.40, adjusted p = 0.024), and performing statistical analyses (RSQ7; Δ = 0.35, adjusted p = 0.0497). The gain in working collaboratively in teams (RSQ1; Δ = 0.30) was significant before correction (p = 0.048) but not after (adjusted p = 0.106). Collectively, these findings suggest that gains in research skills were concentrated in communication, information literacy, and quantitative analysis rather than being distributed uniformly across all measured competencies.
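
Mean changes on a Likert scale are easier to compare as standardized effect sizes. For a paired t-test, Cohen's d_z (the mean difference divided by the standard deviation of the differences) equals t / sqrt(N), so it can be recovered from the printed output without the raw data. A sketch for the five items with the smallest p-values, with N and t copied from the rsq_results table:

```r
rsq_top <- data.frame(
  Item = c("RSQ10", "RSQ9", "RSQ2", "RSQ7", "RSQ1"),
  N    = c(52, 53, 53, 51, 54),
  t    = c(4.4832127, 3.0168720, 2.8659147, 2.4799882, 2.0252338)
)

# Cohen's d_z for paired data: t statistic divided by sqrt(number of pairs)
rsq_top$d_z <- rsq_top$t / sqrt(rsq_top$N)

rsq_top[, c("Item", "d_z")]
```

On this metric the items range from roughly 0.62 for RSQ10 down to roughly 0.28 for RSQ1, which gives a scale-free way to describe the size of each change alongside its adjusted p-value.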

library(ggplot2)

ggplot(
  rsq_results,
  aes(
    x = reorder(Item, Difference),
    y = Difference,
    fill = p_adj < 0.05
  )
) +
  geom_col() +
  coord_flip() +
  theme_classic() +
  labs(
    x = NULL,
    y = "Mean Change in Confidence"
  ) +
  scale_fill_manual(
    values = c("grey70", "black"),
    guide = "none"
  )

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_col()`).

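RSQ5 (designing my own experimental lab protocol) was the only item whose mean decreased (Δ = -0.12), but the change was not significant (p = 0.52). One possible reading is that students entered the course overestimating their ability to independently design experimental protocols; the data here cannot confirm it.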

High vs. Low Performers

Who benefited most from the redesign? We already have evidence from the regression that baseline EDCI strongly predicts gain. Students who started lower gained more.

The question is whether that’s just regression-to-the-mean or whether the course genuinely helped struggling students catch up.
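
One way to see how much of the pattern regression to the mean alone can produce is to simulate students who all improve by the same amount but are measured with error. This is a sketch under stated assumptions: the true-score SD, the error SD, and the uniform gain of 1 point are hypothetical values, not estimates from this dataset.

```r
set.seed(3)
n <- 10000
true_score <- rnorm(n, mean = 8, sd = 2.5)   # stable competency
uniform_gain <- 1                             # every student improves equally
error_sd <- 2                                 # measurement noise on each test

pre  <- true_score + rnorm(n, sd = error_sd)
post <- true_score + uniform_gain + rnorm(n, sd = error_sd)
gain <- post - pre

# Baseline and gain correlate negatively even though true gains are identical.
r_pre_gain <- cor(pre, gain)

# Median split on the observed pre score, as with EDCI_Group
low <- pre <= median(pre)
mean_gain_low  <- mean(gain[low])
mean_gain_high <- mean(gain[!low])

round(c(r = r_pre_gain, low = mean_gain_low, high = mean_gain_high), 2)
```

Under these assumptions the expected baseline-gain correlation is about -0.44 and the low-baseline group shows the larger mean gain, so a negative relationship by itself does not distinguish genuine catch-up from measurement noise. A comparison group or a reliability estimate for the EDCI would be needed to separate the two.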

median_edci <- median(
  paired_wide$EDCI_Total_Score_Pre,
  na.rm = TRUE
)

paired_wide <- paired_wide %>%
  mutate(
    EDCI_Group = ifelse(
      EDCI_Total_Score_Pre <= median_edci,
      "Low Baseline",
      "High Baseline"
    )
  )

t.test(
  EDCI_Gain ~ EDCI_Group,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  EDCI_Gain by EDCI_Group
## t = -2.6232, df = 53.985, p-value = 0.0113
## alternative hypothesis: true difference in means between group High Baseline and group Low Baseline is not equal to 0
## 95 percent confidence interval:
##  -4.4342936 -0.5924443
## sample estimates:
## mean in group High Baseline  mean in group Low Baseline 
##                  -0.4545455                   2.0588235

paired_wide %>%
  group_by(EDCI_Group) %>%
  summarise(
    N = n(),
    Mean_Gain = mean(EDCI_Gain, na.rm = TRUE),
    SD_Gain = sd(EDCI_Gain, na.rm = TRUE)
  )

## # A tibble: 2 × 4
##   EDCI_Group        N Mean_Gain SD_Gain
##   <chr>         <int>     <dbl>   <dbl>
## 1 High Baseline    24    -0.455    2.77
## 2 Low Baseline     36     2.06     4.40

paired_wide <- paired_wide %>%
  mutate(
    EDCI_Quartile = ntile(
      EDCI_Total_Score_Pre,
      4
    )
  )

paired_wide %>%
  group_by(EDCI_Quartile) %>%
  summarise(
    Mean_Pre = mean(EDCI_Total_Score_Pre),
    Mean_Gain = mean(EDCI_Gain),
    Mean_Post = mean(EDCI_Total_Score_Post)
  )

## # A tibble: 4 × 4
##   EDCI_Quartile Mean_Pre Mean_Gain Mean_Post
##           <int>    <dbl>     <dbl>     <dbl>
## 1             1     3.33       4.4      7.73
## 2             2     7.67      NA       NA   
## 3             3     9.8       -0.8      9   
## 4             4    12.1       NA       NA
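
The NA entries for quartiles 2 and 4 arise because mean() returns NA when any value in the group is missing, and 4 students lack a gain score. A sketch of a complete-case summary on toy data, written in base R so it runs without packages; the data frame and its values are hypothetical. In the dplyr pipeline above, the equivalent change would be adding na.rm = TRUE to each mean() call.

```r
# Toy data: two students are missing the post-test.
toy <- data.frame(
  pre  = c(2, 3, 4, 6, 7, 8, 9, 10, 10, 12, 13, 14),
  post = c(7, 8, NA, 7, 6, 8, 9, 9, 10, NA, 11, 12)
)
toy$gain <- toy$post - toy$pre

# Four equal-sized groups by rank of the pre score
toy$quartile <- cut(rank(toy$pre, ties.method = "first"),
                    breaks = 4, labels = FALSE)

# Group means that skip missing values
by_quartile <- function(x) {
  as.vector(tapply(x, toy$quartile, mean, na.rm = TRUE))
}

quartile_summary <- data.frame(
  quartile  = 1:4,
  mean_pre  = by_quartile(toy$pre),
  mean_gain = by_quartile(toy$gain),
  mean_post = by_quartile(toy$post),
  # Number of students actually contributing to the gain mean
  n_gain    = as.vector(tapply(!is.na(toy$gain), toy$quartile, sum))
)

quartile_summary
```

Reporting the per-quartile count alongside each mean makes it clear how many students each estimate rests on, which matters when a quartile holds only about 15 students.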

# Create RSQ competence score in dat_full
dat_full <- dat_full %>%
  mutate(
    RSQ_Competence = rowMeans(
      across(c(
        RSQ2, RSQ3, RSQ6, RSQ7, RSQ9, RSQ10, RSQ11
      )),
      na.rm = TRUE
    )
  )

# Recreate paired dataset so it includes RSQ_Competence
dat_paired <- dat_full %>%
  group_by(StudentID) %>%
  filter(all(c("Pre", "Post") %in% Timepoint)) %>%
  group_by(StudentID, Timepoint) %>%
  slice_max(order_by = Progress, n = 1, with_ties = FALSE) %>%
  ungroup()

# Make RSQ competence wide and calculate gain
competence_wide <- dat_paired %>%
  select(StudentID, Timepoint, RSQ_Competence) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = RSQ_Competence,
    names_prefix = "RSQ_Competence_"
  ) %>%
  mutate(
    RSQ_Competence_Gain =
      RSQ_Competence_Post - RSQ_Competence_Pre
  )

# Add it to paired_wide
paired_wide <- paired_wide %>%
  left_join(
    competence_wide %>%
      select(
        StudentID,
        RSQ_Competence_Pre,
        RSQ_Competence_Post,
        RSQ_Competence_Gain
      ),
    by = "StudentID"
  )

library(ggplot2)

ggplot(
  paired_wide,
  aes(
    factor(EDCI_Quartile),
    EDCI_Gain
  )
) +
  geom_boxplot() +
  theme_classic() +
  labs(
    x = "Baseline EDCI Quartile",
    y = "EDCI Gain"
  )

## Warning: Removed 4 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

summary(lm(
  EDCI_Total_Score_Post ~
    EDCI_Total_Score_Pre +
    Grit_Total +
    HS.GPA_IR,
  data = paired_wide
))

## 
## Call:
## lm(formula = EDCI_Total_Score_Post ~ EDCI_Total_Score_Pre + Grit_Total + 
##     HS.GPA_IR, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.0307 -2.0251 -0.1074  2.7526  5.5457 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)  
## (Intercept)           -0.8162     5.0998  -0.160   0.8735  
## EDCI_Total_Score_Pre   0.2351     0.1368   1.719   0.0923 .
## Grit_Total             0.9180     0.8931   1.028   0.3094  
## HS.GPA_IR              1.3499     1.3751   0.982   0.3314  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.206 on 46 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.1222, Adjusted R-squared:  0.06491 
## F-statistic: 2.134 on 3 and 46 DF,  p-value: 0.1089

ancova_edci <- lm(
  EDCI_Total_Score_Post ~
    EDCI_Total_Score_Pre +
    Grit_Total,
  data = paired_wide
)

summary(ancova_edci)

## 
## Call:
## lm(formula = EDCI_Total_Score_Post ~ EDCI_Total_Score_Pre + Grit_Total, 
##     data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.9470 -1.9221  0.0321  2.6780  6.0687 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)  
## (Intercept)            3.4783     2.7228   1.277   0.2070  
## EDCI_Total_Score_Pre   0.2670     0.1214   2.200   0.0322 *
## Grit_Total             1.0359     0.7947   1.303   0.1980  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.142 on 53 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.1292, Adjusted R-squared:  0.0963 
## F-statistic:  3.93 on 2 and 53 DF,  p-value: 0.02561

summary(
  lm(
    RSQ_Competence_Gain ~
      RSQ_Competence_Pre +
      Grit_Total,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = RSQ_Competence_Gain ~ RSQ_Competence_Pre + Grit_Total, 
##     data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0132 -0.4806 -0.0354  0.4433  1.4601 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          0.7537     0.5351   1.409 0.164600    
## RSQ_Competence_Pre  -0.4655     0.1155  -4.030 0.000173 ***
## Grit_Total           0.1855     0.1536   1.208 0.232306    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.611 on 55 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.2303, Adjusted R-squared:  0.2023 
## F-statistic: 8.227 on 2 and 55 DF,  p-value: 0.0007484

Quartile	Mean Pre	Mean Gain	Mean Post
Q1 (lowest)	3.56	+4.19	7.75
Q2	7.80	-1.20	6.60
Q3	9.80	-0.80	9.00
Q4 (highest)	12.67	-1.40	10.67

Students entering the course with the weakest experimental design competency demonstrated the largest improvements over the semester. While some of this pattern likely reflects ceiling effects among high-performing students, it suggests that the curriculum was particularly effective at supporting students with limited prior competency in experimental design.

After accounting for baseline experimental design competency, grit was not a significant predictor of post-course EDCI performance (b = 1.04, p = 0.198). Students with higher grit scores tended to score slightly higher at the end of the course, but the estimate is too imprecise to rule out no effect. Modeling post-course score with baseline as a covariate is also not a stronger test than modeling gain with baseline as a covariate: both give the same grit coefficient and p-value.

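In the ANCOVA model:

EDCI_Pre p = 0.032
Grit_Total p = 0.198

which means baseline EDCI remains a significant, if modest, predictor of post-course score (b = 0.27) once grit is included, and grit does not add significantly to it. The model explains only about 13% of the variance in post-course scores, so most of where students finish is left unexplained by either variable.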

Results:

To examine whether student characteristics predicted experimental design outcomes, post-course EDCI scores were modeled as a function of pre-course EDCI performance and grit. Pre-course EDCI score was a significant positive predictor of post-course performance (b = 0.27, SE = 0.12, t = 2.20, p = 0.032), whereas grit was not (b = 1.04, SE = 0.79, t = 1.30, p = 0.198). The model explained 13% of the variance in post-course scores (R² = 0.13, F(2, 53) = 3.93, p = 0.026). These findings indicate that students who began with higher experimental design competency tended to finish higher, and they do not show a reliable association between grit and end-of-semester competency.

To further investigate the role of baseline preparation, students were divided into groups based on their pre-course EDCI scores. Students with lower baseline competency demonstrated significantly larger gains than students with higher baseline competency (Welch’s t = 2.39, p = 0.021). When examined across quartiles, the lowest-performing quartile exhibited the largest average improvement (+4.19 points), whereas the three higher quartiles showed small average declines (-0.80 to -1.40 points). Although ceiling effects likely contributed to this pattern, the results suggest that the course was particularly effective for students entering with limited experience in experimental design.

Discussion:

One of the most notable findings was that students entering the course with the lowest levels of experimental design competency demonstrated the largest improvements over the semester. This pattern suggests that the skills-based laboratory curriculum may be especially beneficial for students with limited prior experience in scientific reasoning and experimental design. While some reduction in gains among higher-performing students likely reflects ceiling effects, the substantial improvement observed among lower-performing students indicates that the curriculum successfully supported students who may have been most at risk of struggling with experimental design concepts.

Grit also emerged as a significant predictor of post-course experimental design competency. Students reporting greater perseverance of effort and consistency of interests achieved higher EDCI scores at the conclusion of the course, after adjusting for their initial competency level. Interestingly, traditional indicators of academic preparation, including high school GPA, were not associated with post-course performance. These findings suggest that persistence-related characteristics may play a meaningful role in the development of experimental design skills within authentic laboratory environments.

So far:

The curriculum disproportionately benefits lower-performing students.
Grit predicts experimental design outcomes, whereas GPA and mastery counts do not.

More Grit

grit_wide <- dat_paired %>%
  select(
    StudentID,
    Timepoint,
    Grit_Total,
    Grit_Consistency,
    Grit_Perseverance
  ) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = c(
      Grit_Total,
      Grit_Consistency,
      Grit_Perseverance
    )
  ) %>%
  mutate(
    Grit_Total_Gain = Grit_Total_Post - Grit_Total_Pre,
    Consistency_Gain = Grit_Consistency_Post - Grit_Consistency_Pre,
    Perseverance_Gain = Grit_Perseverance_Post - Grit_Perseverance_Pre
  )

paired_wide <- paired_wide %>%
  left_join(
    grit_wide %>%
      select(
        StudentID,
        Grit_Total_Pre,
        Grit_Total_Post,
        Grit_Total_Gain,
        Grit_Consistency_Pre,
        Grit_Consistency_Post,
        Consistency_Gain,
        Grit_Perseverance_Pre,
        Grit_Perseverance_Post,
        Perseverance_Gain
      ),
    by = "StudentID"
  )


t.test(
  paired_wide$Grit_Consistency_Pre,
  paired_wide$Grit_Consistency_Post,
  paired = TRUE
)

## 
##  Paired t-test
## 
## data:  paired_wide$Grit_Consistency_Pre and paired_wide$Grit_Consistency_Post
## t = 0.16554, df = 57, p-value = 0.8691
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.1562464  0.1844073
## sample estimates:
## mean difference 
##      0.01408046

t.test(
  paired_wide$Grit_Perseverance_Pre,
  paired_wide$Grit_Perseverance_Post,
  paired = TRUE
)

## 
##  Paired t-test
## 
## data:  paired_wide$Grit_Perseverance_Pre and paired_wide$Grit_Perseverance_Post
## t = 0.078116, df = 57, p-value = 0.938
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.1203409  0.1301110
## sample estimates:
## mean difference 
##     0.004885057

summary(
  lm(
    EDCI_Gain ~ EDCI_Total_Score_Pre  + Grit_Total_Gain * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Grit_Total_Gain * 
##     Gender, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.1183 -1.3637  0.0763  1.9208  7.3262 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   5.4320     1.2576   4.319 8.03e-05 ***
## EDCI_Total_Score_Pre         -0.5929     0.1321  -4.488 4.64e-05 ***
## Grit_Total_Gain               2.9020     1.3316   2.179   0.0344 *  
## GenderMale                    1.1703     1.2544   0.933   0.3556    
## GenderOther                   1.4585     1.4749   0.989   0.3278    
## Grit_Total_Gain:GenderMale   -2.8389     2.9150  -0.974   0.3351    
## Grit_Total_Gain:GenderOther  -1.9172     5.9260  -0.324   0.7477    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.984 on 47 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.3765, Adjusted R-squared:  0.2969 
## F-statistic: 4.731 on 6 and 47 DF,  p-value: 0.0007635

summary(
  lm(
    EDCI_Gain ~ EDCI_Total_Score_Pre + Consistency_Gain  * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Consistency_Gain * 
##     Gender, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.7262 -1.2524  0.0043  1.7059  7.1567 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    5.4514     1.1483   4.747 1.97e-05 ***
## EDCI_Total_Score_Pre          -0.5901     0.1207  -4.888 1.23e-05 ***
## Consistency_Gain               2.3514     0.7023   3.348  0.00161 ** 
## GenderMale                     1.7497     1.0641   1.644  0.10678    
## GenderOther                    1.3899     1.3317   1.044  0.30194    
## Consistency_Gain:GenderMale   -4.2410     1.5628  -2.714  0.00927 ** 
## Consistency_Gain:GenderOther  -1.7723     2.5255  -0.702  0.48629    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.768 on 47 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.4635, Adjusted R-squared:  0.3951 
## F-statistic: 6.769 on 6 and 47 DF,  p-value: 3.249e-05

ggplot(
  paired_wide,
  aes(
    Consistency_Gain,
    EDCI_Gain,
    color = Gender
  )
) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", se = TRUE) +
  theme_classic()

## `geom_smooth()` using formula = 'y ~ x'

## Warning: Removed 6 rows containing non-finite outside the scale range
## (`stat_smooth()`).

## Warning: Removed 6 rows containing missing values or values outside the scale range
## (`geom_point()`).

Students did not demonstrate measurable changes in total grit, consistency of interests, or perseverance of effort across the semester on average. Honestly, that’s not surprising. Grit is generally considered a relatively stable trait and is difficult to change over a single semester.

Methods:

To further examine the role of grit, students’ responses to the 12-item Grit Scale were separated into the subdimensions of Consistency of Interests and Perseverance of Effort following the original scale structure. Negatively worded items were reverse scored prior to calculating subscale scores. Pre- and post-course differences in each grit dimension were evaluated using paired-samples t-tests.
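The reverse-scoring step described above can be sketched as follows. This assumes a 1 to 5 response scale, and the item names and the choice of which items are reverse-keyed are hypothetical placeholders, so the real item lists should come from the scale's scoring key.

```r
# Reverse-score a Likert item: on a 1-5 scale, 1 <-> 5, 2 <-> 4, 3 stays 3.
reverse_score <- function(x, scale_min = 1, scale_max = 5) {
  (scale_min + scale_max) - x
}

# Toy responses; NOT study data. Item names are hypothetical.
toy <- data.frame(
  Grit1 = c(1, 4, 5),
  Grit2 = c(5, 2, NA),
  Grit3 = c(2, 4, 1),
  Grit4 = c(4, 3, 2)
)

consistency_items  <- c("Grit1", "Grit3")  # hypothetical: negatively worded items
perseverance_items <- c("Grit2", "Grit4")  # hypothetical: positively worded items

# Reverse the negatively worded items before averaging
toy[consistency_items] <- lapply(toy[consistency_items], reverse_score)

# Subscale scores are item means; na.rm = TRUE averages whatever was answered
toy$Consistency  <- rowMeans(toy[consistency_items],  na.rm = TRUE)
toy$Perseverance <- rowMeans(toy[perseverance_items], na.rm = TRUE)
toy
```

Note that `na.rm = TRUE` will still return a score for a student who answered only one item on a subscale, so it may be worth deciding on a minimum number of answered items.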

To explore whether changes in grit were associated with learning outcomes, linear regression models were fit with EDCI gain scores as the dependent variable. Predictor variables included pre-course EDCI scores, grit change scores, gender, and the interaction between grit change and gender. Separate models were fit using overall grit change and Consistency of Interests change scores.

Results:

No significant changes were observed in either dimension of grit across the semester. Consistency of Interests did not differ between pre- and post-course measurements (t(57) = 0.17, p = .869), nor did Perseverance of Effort (t(57) = 0.08, p = .938).
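To keep the reported statistics in step with the console output, the results string can be built directly from the `t.test()` object rather than typed by hand. The helper below is a hypothetical convenience function (not part of the original script), demonstrated on simulated scores.

```r
# Hypothetical helper: format a paired t-test as "t(df) = x.xx, p = .xxx"
report_paired_t <- function(pre, post) {
  keep <- complete.cases(pre, post)   # keep only students with both scores
  tt <- t.test(pre[keep], post[keep], paired = TRUE)
  p_txt <- if (tt$p.value < 0.001) {
    "p < .001"
  } else {
    paste0("p = ", sub("^0", "", sprintf("%.3f", tt$p.value)))
  }
  sprintf("t(%d) = %.2f, %s", as.integer(tt$parameter), tt$statistic, p_txt)
}

# Simulated pre/post scores; NOT study data.
set.seed(3)
pre  <- round(rnorm(60, mean = 2.8, sd = 0.6), 2)
post <- pre + rnorm(60, mean = 0, sd = 0.6)
post[c(5, 20)] <- NA                  # two students missing a post score

report_line <- report_paired_t(pre, post)
report_line
```

With 60 students and two incomplete pairs, the degrees of freedom come out as 57 (58 complete pairs minus 1), which is the same bookkeeping that produces `df = 57` in the output above.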

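Exploratory regression analyses indicated that overall change in grit was positively associated with EDCI gains after controlling for baseline EDCI scores (β = 2.90, SE = 1.33, t = 2.18, p = .034). Because the model includes a grit change × gender interaction, this coefficient is the slope for female students, the reference group. No significant interaction between overall grit change and gender was observed (male: p = .335; other: p = .748).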

Changes in Consistency of Interests demonstrated a significant association with EDCI gains. After accounting for baseline EDCI scores, increases in Consistency of Interests predicted larger improvements in experimental design competency among female students, the reference group (β = 2.35, SE = 0.70, t = 3.35, p = .002). Furthermore, a significant interaction between Consistency of Interests change and gender was detected for male relative to female students (β = -4.24, SE = 1.56, t = -2.71, p = .009). Examination of this interaction revealed a positive relationship between Consistency of Interests gains and EDCI gains among female students, whereas the implied slope among male students was negative (2.35 - 4.24 = -1.89; Figure X).
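In a model with a `Consistency_Gain * Gender` interaction, the `Consistency_Gain` coefficient is the slope for the reference gender only, and each other group's slope is that coefficient plus its interaction term. From the output above, that gives roughly 2.3514 + (-4.2410) = -1.89 for male students. The sketch below shows the same bookkeeping on simulated data and confirms it by refitting with a different reference level; the data and effect sizes are invented for illustration.

```r
# Simulated data; NOT study data. Group sizes and slopes are illustrative only.
set.seed(7)
toy <- data.frame(
  Gender = factor(rep(c("Female", "Male", "Other"), times = c(36, 12, 6)))
)
n <- nrow(toy)
toy$EDCI_Pre         <- rnorm(n, mean = 8, sd = 3)
toy$Consistency_Gain <- rnorm(n, mean = 0, sd = 0.6)
true_slope <- c(Female = 2, Male = -2, Other = 0)
toy$EDCI_Gain <- 5 - 0.6 * toy$EDCI_Pre +
  unname(true_slope[as.character(toy$Gender)]) * toy$Consistency_Gain +
  rnorm(n, sd = 2)

fit <- lm(EDCI_Gain ~ EDCI_Pre + Consistency_Gain * Gender, data = toy)
b <- coef(fit)

# Simple slopes: reference group (Female) vs. Male
slope_female <- unname(b["Consistency_Gain"])
slope_male   <- unname(b["Consistency_Gain"] + b["Consistency_Gain:GenderMale"])

# Cross-check: make Male the reference level and read the slope directly
toy$Gender_M <- relevel(toy$Gender, ref = "Male")
fit_m <- lm(EDCI_Gain ~ EDCI_Pre + Consistency_Gain * Gender_M, data = toy)
slope_male_direct <- unname(coef(fit_m)["Consistency_Gain"])

c(female = slope_female, male = slope_male, male_direct = slope_male_direct)
```

Refitting with the releveled factor has a practical benefit: `summary(fit_m)` then reports a standard error and p-value for the male slope itself, which the original parameterization does not give directly.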

Baseline EDCI scores remained a strong predictor of EDCI gains across all models (p < .001), indicating that students entering the course with lower levels of experimental design competency exhibited larger improvements over the semester.

Although grit did not change at the cohort level, individual differences in changes in Consistency of Interests were associated with experimental design learning. This relationship differed by gender, with positive associations observed among female students and negative associations among male students. Given the exploratory nature of these analyses and potential sample size limitations, future work should examine whether the relationship between consistency of interests and experimental design learning differs systematically across student populations.

summary(
  lm(
    RSQ_Gain ~
      RSQ_Total_Pre +
      Consistency_Gain * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + Consistency_Gain * Gender, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.16793 -0.21353  0.01944  0.22860  1.24523 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   1.06968    0.27210   3.931 0.000276 ***
## RSQ_Total_Pre                -0.36809    0.12123  -3.036 0.003898 ** 
## Consistency_Gain              0.09569    0.13229   0.723 0.473056    
## GenderMale                    0.03862    0.19959   0.193 0.847406    
## GenderOther                  -0.02804    0.25422  -0.110 0.912637    
## Consistency_Gain:GenderMale   0.29174    0.30152   0.968 0.338216    
## Consistency_Gain:GenderOther -0.63296    0.48050  -1.317 0.194128    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.526 on 47 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.2208, Adjusted R-squared:  0.1213 
## F-statistic:  2.22 on 6 and 47 DF,  p-value: 0.05749

summary(
  lm(
    STEP_Gain ~
      STEP_Total_Pre +
      Consistency_Gain * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + Consistency_Gain * 
##     Gender, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.10036 -0.33186  0.06272  0.27600  0.91459 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                   1.693550   0.558650   3.032  0.00395 **
## STEP_Total_Pre               -0.412525   0.134699  -3.063  0.00362 **
## Consistency_Gain             -0.105030   0.114699  -0.916  0.36450   
## GenderMale                    0.002238   0.173049   0.013  0.98973   
## GenderOther                   0.231384   0.219671   1.053  0.29758   
## Consistency_Gain:GenderMale   0.437666   0.261070   1.676  0.10029   
## Consistency_Gain:GenderOther -0.309966   0.414484  -0.748  0.45828   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4561 on 47 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.2753, Adjusted R-squared:  0.1827 
## F-statistic: 2.975 on 6 and 47 DF,  p-value: 0.01517

The pattern suggests that students who became more focused on sustained interests over the semester were also the students who improved most in experimental design competency.

However, this relationship was observed primarily among female students.

Importantly, because neither consistency nor perseverance increased on average, the curriculum did not appear to change grit at the cohort level. Instead, individual differences in changes in consistency were associated with differences in learning outcomes.

An unexpected finding emerged from exploratory analyses examining grit subdimensions. Although neither Consistency of Interests nor Perseverance of Effort changed significantly across the semester, individual variation in Consistency of Interests was associated with experimental design learning outcomes. Specifically, female students who demonstrated increases in consistency tended to exhibit larger gains in EDCI scores, whereas the relationship was absent or reversed among male students. Notably, similar relationships were not observed for RSQ or STEP-U outcomes, suggesting that consistency may be associated specifically with the development of experimental reasoning rather than broader changes in confidence or scientific identity. Given the exploratory nature of these analyses and the modest sample size, additional research is needed to determine whether this gender-specific pattern is reproducible across other student populations.

If female students who become increasingly committed to a project derive greater cognitive benefit from that sustained engagement, then consistency could translate into stronger experimental reasoning gains.

For male students, learning gains may be less dependent on changes in project commitment or may be influenced by different motivational mechanisms.

The observed relationship between consistency of interests and experimental design gains may reflect the unique structure of the course, which centered on a semester-long research project. Unlike traditional laboratory courses that emphasize discrete weekly activities, students were required to sustain engagement with a single research question over an extended period of time. Consequently, students who became more focused on maintaining long-term interests may have been better positioned to benefit from iterative experiences in hypothesis development, experimental design, and data interpretation. This interpretation is consistent with the observation that consistency was associated with gains in experimental design competency but not with broader measures of scientific identity or research self-efficacy.

t.test(
  Grit_Total_Pre ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Total_Pre by Gender
## t = 2.7724, df = 21.802, p-value = 0.01117
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  0.1116057 0.7756923
## sample estimates:
## mean in group Female   mean in group Male 
##             3.505518             3.061869

t.test(
  Grit_Consistency_Pre ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Consistency_Pre by Gender
## t = 2.6499, df = 20.793, p-value = 0.01506
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  0.1234649 1.0265351
## sample estimates:
## mean in group Female   mean in group Male 
##             2.858333             2.283333

t.test(
  Grit_Perseverance_Pre ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Perseverance_Pre by Gender
## t = 1.2479, df = 15.807, p-value = 0.2302
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.1958343  0.7550009
## sample estimates:
## mean in group Female   mean in group Male 
##             4.129583             3.850000

t.test(
  Grit_Total_Post ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Total_Post by Gender
## t = 0.67731, df = 23.566, p-value = 0.5048
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.2155663  0.4258598
## sample estimates:
## mean in group Female   mean in group Male 
##             3.411628             3.306481

t.test(
  Grit_Consistency_Post ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Consistency_Post by Gender
## t = 0.55661, df = 24.767, p-value = 0.5828
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.3082569  0.5364315
## sample estimates:
## mean in group Female   mean in group Male 
##             2.746032             2.631944

t.test(
  Grit_Perseverance_Post ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Perseverance_Post by Gender
## t = 0.50038, df = 16.953, p-value = 0.6232
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.3383265  0.5486439
## sample estimates:
## mean in group Female   mean in group Male 
##             4.074603             3.969444

t.test(
  Grit_Total_Gain ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Total_Gain by Gender
## t = -2.6649, df = 19.323, p-value = 0.01516
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.55989828 -0.06761013
## sample estimates:
## mean in group Female   mean in group Male 
##          -0.06914141           0.24461279

t.test(
  Consistency_Gain ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Consistency_Gain by Gender
## t = -2.2008, df = 19.063, p-value = 0.04028
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.86703537 -0.02185352
## sample estimates:
## mean in group Female   mean in group Male 
##          -0.09583333           0.34861111

t.test(
  Perseverance_Gain ~ Gender,
  data = paired_wide %>%
    filter(Gender %in% c("Male","Female"))
)

## 
##  Welch Two Sample t-test
## 
## data:  Perseverance_Gain by Gender
## t = -0.78801, df = 15.025, p-value = 0.4429
## alternative hypothesis: true difference in means between group Female and group Male is not equal to 0
## 95 percent confidence interval:
##  -0.5551575  0.2554353
## sample estimates:
## mean in group Female   mean in group Male 
##          -0.03041667           0.11944444

paired_wide %>%
  group_by(Gender) %>%
  summarise(
    n = n(),
    Mean_Consistency_Pre = mean(Grit_Consistency_Pre, na.rm=TRUE),
    Mean_Consistency_Post = mean(Grit_Consistency_Post, na.rm=TRUE),
    Mean_Consistency_Gain = mean(Consistency_Gain, na.rm=TRUE)
  )

## # A tibble: 3 × 5
##   Gender     n Mean_Consistency_Pre Mean_Consistency_Post Mean_Consistency_Gain
##   <chr>  <int>                <dbl>                 <dbl>                 <dbl>
## 1 Female    42                 2.86                  2.75               -0.0958
## 2 Male      12                 2.28                  2.63                0.349 
## 3 Other      6                 2.5                   2.31               -0.194

paired_wide %>%
  group_by(Gender) %>%
  summarise(
    Mean_Perseverance_Pre = mean(Grit_Perseverance_Pre, na.rm=TRUE),
    Mean_Perseverance_Post = mean(Grit_Perseverance_Post, na.rm=TRUE),
    Mean_Perseverance_Gain = mean(Perseverance_Gain, na.rm=TRUE)
  )

## # A tibble: 3 × 4
##   Gender Mean_Perseverance_Pre Mean_Perseverance_Post Mean_Perseverance_Gain
##   <chr>                  <dbl>                  <dbl>                  <dbl>
## 1 Female                  4.13                   4.07                -0.0304
## 2 Male                    3.85                   3.97                 0.119 
## 3 Other                   3.83                   3.75                -0.0833

cor.test(
  subset(paired_wide, Gender=="Female")$Consistency_Gain,
  subset(paired_wide, Gender=="Female")$EDCI_Gain
)

## 
##  Pearson's product-moment correlation
## 
## data:  subset(paired_wide, Gender == "Female")$Consistency_Gain and subset(paired_wide, Gender == "Female")$EDCI_Gain
## t = 2.1817, df = 36, p-value = 0.03574
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.02475136 0.59627379
## sample estimates:
##       cor 
## 0.3417307

cor.test(
  subset(paired_wide, Gender=="Male")$Consistency_Gain,
  subset(paired_wide, Gender=="Male")$EDCI_Gain
)

## 
##  Pearson's product-moment correlation
## 
## data:  subset(paired_wide, Gender == "Male")$Consistency_Gain and subset(paired_wide, Gender == "Male")$EDCI_Gain
## t = -2.1436, df = 9, p-value = 0.06067
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.87580622  0.02842082
## sample estimates:
##        cor 
## -0.5813659


library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ lubridate 1.9.5     ✔ tibble    3.3.1
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ rstatix::filter()        masks dplyr::filter(), stats::filter()
## ✖ kableExtra::group_rows() masks dplyr::group_rows()
## ✖ dplyr::lag()             masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

grit_long <- paired_wide %>%
  select(
    StudentID,
    Gender,
    Grit_Total_Pre,
    Grit_Total_Post
  ) %>%
  pivot_longer(
    cols = c(Grit_Total_Pre, Grit_Total_Post),
    names_to = "Time",
    values_to = "Grit"
  ) %>%
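  mutate(
    # Fix the level order so "Pre" plots before "Post" (the default is alphabetical)
    Time = factor(
      ifelse(Time == "Grit_Total_Pre", "Pre", "Post"),
      levels = c("Pre", "Post")
    )
  )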

ggplot(
  grit_long,
  aes(
    Time,
    Grit,
    group = StudentID
  )
) +
  geom_line(
    alpha = .3
  ) +
  stat_summary(
    aes(group = Gender),
    fun = mean,
    geom = "line",
    linewidth = 1.5
  ) +
  stat_summary(
    aes(group = Gender),
    fun = mean,
    geom = "point",
    size = 3
  ) +
  facet_wrap(~Gender) +
  theme_classic() +
  labs(
    y = "Total Grit Score",
    x = NULL
  )

## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_summary()`).
## Removed 2 rows containing non-finite outside the scale range
## (`stat_summary()`).

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

ggplot(
  paired_wide,
  aes(
    Grit_Total_Pre,
    Grit_Total_Post,
    color = Gender
  )
) +
  geom_point(
    size = 3,
    alpha = .8
  ) +
  geom_smooth(
    method = "lm",
    se = FALSE
  ) +
  theme_classic() +
  labs(
    x = "Pre-course Grit",
    y = "Post-course Grit"
  )

## `geom_smooth()` using formula = 'y ~ x'

## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).

## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).

At this point, I would summarize the grit story as:

Grit did not increase over the semester.
Female students entered with higher total grit and consistency scores than male students; perseverance did not differ significantly.
Higher grit predicts experimental design competency.
Changes in consistency relate to EDCI gains differently across genders.
These effects are specific to EDCI and do not appear for RSQ or STEP-U.

Female students entered the course with significantly higher levels of grit than male students (Welch's t = 2.77, p = .011), including significantly higher scores on the Consistency of Interests subscale (t = 2.65, p = .015); Perseverance of Effort did not differ by gender at baseline (t = 1.25, p = .230). Despite these baseline differences, neither overall grit nor its subdimensions changed significantly across the semester at the cohort level. Exploratory analyses revealed a significant interaction between changes in Consistency of Interests and gender when predicting EDCI gains (β = -4.24, p = .009). Among female students, increases in consistency were positively associated with experimental design gains (r = .34, p = .036), whereas the relationship was negative but not statistically significant among male students (r = -.58, p = .061). The male estimate rests on only 11 paired observations, so its confidence interval is wide (-.88 to .03).
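A quick consistency check for reported correlations: the t statistic printed by `cor.test()` follows from r and its degrees of freedom as t = r * sqrt(df / (1 - r^2)). The sketch below plugs in the r and df values from the two `cor.test()` outputs above; the helper function names are made up for this example.

```r
# Convert a Pearson r and its degrees of freedom (n - 2) to a t statistic
r_to_t <- function(r, df) r * sqrt(df / (1 - r^2))

# Two-sided p-value for a t statistic
p_from_t <- function(t, df) 2 * pt(-abs(t), df)

# r and df copied from the cor.test() output for each subgroup
t_female <- r_to_t(0.3417307, df = 36)
t_male   <- r_to_t(-0.5813659, df = 9)

p_female <- p_from_t(t_female, df = 36)
p_male   <- p_from_t(t_male, df = 9)

round(c(t_female = t_female, p_female = p_female,
        t_male = t_male, p_male = p_male), 4)
```

If a drafted sentence quotes an r that does not reproduce the printed t and p under this formula, the sentence and the output have drifted apart and one of them needs updating.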

The observed relationship between consistency of interests and experimental design learning may reflect the structure of the course itself. Unlike traditional laboratory experiences that emphasize discrete weekly exercises, students were required to maintain engagement with a semester-long research project. Consequently, students who became more focused on a sustained research interest may have been better positioned to develop experimental reasoning skills through repeated engagement with the same scientific problem. The gender-specific nature of this relationship should be interpreted cautiously given the relatively small number of male participants, but the findings suggest that consistency of interests may interact with student characteristics in shaping learning outcomes within authentic research experiences.

summary(
  lm(
    EDCI_Gain ~
      EDCI_Total_Score_Pre +
      Total.Pass * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Total.Pass * 
##     Gender, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.3509 -1.8374 -0.2099  1.9592  6.4962 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              5.3262     2.6516   2.009   0.0501 .  
## EDCI_Total_Score_Pre    -0.7153     0.1261  -5.671 7.48e-07 ***
## Total.Pass               0.2355     0.5075   0.464   0.6446    
## GenderMale              -4.9797     8.6060  -0.579   0.5655    
## GenderOther              5.9227     8.2840   0.715   0.4780    
## Total.Pass:GenderMale    1.1081     1.5980   0.693   0.4913    
## Total.Pass:GenderOther  -0.9028     1.5690  -0.575   0.5677    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.238 on 49 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.4188, Adjusted R-squared:  0.3476 
## F-statistic: 5.884 on 6 and 49 DF,  p-value: 0.0001113

summary(
  lm(
    RSQ_Gain ~
      RSQ_Total_Pre +
      Total.Pass * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + Total.Pass * Gender, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.89317 -0.32476 -0.05328  0.31860  1.16792 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)   
## (Intercept)             0.56883    0.53223   1.069  0.29063   
## RSQ_Total_Pre          -0.33262    0.12009  -2.770  0.00801 **
## Total.Pass              0.08059    0.08289   0.972  0.33591   
## GenderMale             -1.60394    1.40438  -1.142  0.25920   
## GenderOther            -0.07554    1.34446  -0.056  0.95543   
## Total.Pass:GenderMale   0.32719    0.26080   1.255  0.21584   
## Total.Pass:GenderOther  0.01909    0.25449   0.075  0.94053   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5268 on 47 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.2184, Adjusted R-squared:  0.1187 
## F-statistic: 2.189 on 6 and 47 DF,  p-value: 0.06064

summary(
  lm(
    STEP_Gain ~
      STEP_Total_Pre +
      Total.Pass * Gender,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + Total.Pass * Gender, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.89454 -0.35535  0.01669  0.27111  0.97535 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             1.54399    0.59368   2.601 0.012264 *  
## STEP_Total_Pre         -0.46220    0.12946  -3.570 0.000811 ***
## Total.Pass              0.07276    0.06980   1.042 0.302337    
## GenderMale              2.44693    1.18977   2.057 0.045066 *  
## GenderOther             0.46778    1.13960   0.410 0.683244    
## Total.Pass:GenderMale  -0.44373    0.22104  -2.007 0.050229 .  
## Total.Pass:GenderOther -0.04620    0.21581  -0.214 0.831370    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4475 on 49 degrees of freedom
##   (4 observations deleted due to missingness)
## Multiple R-squared:  0.2779, Adjusted R-squared:  0.1895 
## F-statistic: 3.143 on 6 and 49 DF,  p-value: 0.01096

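There is no clear evidence that the relationship between mastery and any of your outcomes depends on gender. No Total.Pass × Gender interaction term reached p < .05; the closest was the male interaction for STEP gain (p = .050).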
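Because Gender has three levels, the interaction is spread over two coefficients, so a single joint test is a cleaner way to ask whether the mastery slope depends on gender. One option is to compare nested models with `anova()`. This is a sketch on simulated data; the values are invented, and only the `Total.Pass` and `Gender` column names mirror the models above.

```r
# Simulated data; NOT study data.
set.seed(11)
toy <- data.frame(
  Gender     = factor(rep(c("Female", "Male", "Other"), times = c(40, 10, 6))),
  Total.Pass = rep(3:6, length.out = 56)
)
toy$STEP_Pre  <- rnorm(56, mean = 4, sd = 0.5)
toy$STEP_Gain <- 1.5 - 0.4 * toy$STEP_Pre + 0.05 * toy$Total.Pass +
  rnorm(56, sd = 0.45)

# Both models must be fit to the same rows; with real data, drop incomplete
# cases first so missingness cannot change the sample between models.
fit_main <- lm(STEP_Gain ~ STEP_Pre + Total.Pass + Gender, data = toy)
fit_int  <- lm(STEP_Gain ~ STEP_Pre + Total.Pass * Gender, data = toy)

# Joint F-test of the two interaction terms (2 numerator df)
cmp <- anova(fit_main, fit_int)
cmp
```

The p-value in the second row of `cmp` answers the question "does adding the interaction improve the model at all?", which avoids leaning on whichever individual interaction coefficient happens to sit nearest .05.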

Project Ownership Survey

# -------------------------
# POS and LCAS Scoring + Analysis Setup
# -------------------------

library(dplyr)
library(tidyr)
library(psych)

## 
## Attaching package: 'psych'

## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

# ============================================================
# 1. PROJECT OWNERSHIP SURVEY (POS)
# ============================================================
# POS cognitive ownership items
pos_cognitive_items <- paste0("POS", 1:10)

# POS emotional ownership items
pos_emotional_items <- paste0("POS", 11:16)

# All POS items together
pos_items <- paste0("POS", 1:16)

# Create POS scores
dat_full <- dat_full %>%
  mutate(
    POS_Cognitive = rowMeans(across(all_of(pos_cognitive_items)), na.rm = TRUE),
    POS_Emotional = rowMeans(across(all_of(pos_emotional_items)), na.rm = TRUE),
    POS_Total = rowMeans(across(all_of(pos_items)), na.rm = TRUE)
  )

# Replace NaN with NA
dat_full <- dat_full %>%
  mutate(
    across(
      c(POS_Cognitive, POS_Emotional, POS_Total),
      ~ifelse(is.nan(.x), NA, .x)
    )
  )

# ============================================================
# 2. LCAS SCORING
# ============================================================

# Collaboration
lcas_collaboration_items <- paste0("LCAS", 1:6)

# Discovery and Relevance
lcas_discovery_items <- paste0("LCAS.B", 1:5)

# Iteration
lcas_iteration_items <- paste0("LCAS.C", 1:6)

# All LCAS items
lcas_items <- c(
  lcas_collaboration_items,
  lcas_discovery_items,
  lcas_iteration_items
)

# Create LCAS scores
dat_full <- dat_full %>%
  mutate(
    LCAS_Collaboration = rowMeans(across(all_of(lcas_collaboration_items)), na.rm = TRUE),
    LCAS_Discovery = rowMeans(across(all_of(lcas_discovery_items)), na.rm = TRUE),
    LCAS_Iteration = rowMeans(across(all_of(lcas_iteration_items)), na.rm = TRUE),
    LCAS_Total = rowMeans(across(all_of(lcas_items)), na.rm = TRUE)
  )

# Replace NaN with NA
dat_full <- dat_full %>%
  mutate(
    across(
      c(LCAS_Collaboration, LCAS_Discovery, LCAS_Iteration, LCAS_Total),
      ~ifelse(is.nan(.x), NA, .x)
    )
  )

# ============================================================
# 3. POST-ONLY DATASET
# ============================================================

post_only <- dat_full %>%
  filter(Timepoint == "Post")

# Descriptive stats
post_only %>%
  summarise(
    N_POS = sum(!is.na(POS_Total)),
    POS_Cognitive_Mean = mean(POS_Cognitive, na.rm = TRUE),
    POS_Cognitive_SD = sd(POS_Cognitive, na.rm = TRUE),
    POS_Emotional_Mean = mean(POS_Emotional, na.rm = TRUE),
    POS_Emotional_SD = sd(POS_Emotional, na.rm = TRUE),
    POS_Total_Mean = mean(POS_Total, na.rm = TRUE),
    POS_Total_SD = sd(POS_Total, na.rm = TRUE),

    N_LCAS = sum(!is.na(LCAS_Total)),
    LCAS_Collaboration_Mean = mean(LCAS_Collaboration, na.rm = TRUE),
    LCAS_Collaboration_SD = sd(LCAS_Collaboration, na.rm = TRUE),
    LCAS_Discovery_Mean = mean(LCAS_Discovery, na.rm = TRUE),
    LCAS_Discovery_SD = sd(LCAS_Discovery, na.rm = TRUE),
    LCAS_Iteration_Mean = mean(LCAS_Iteration, na.rm = TRUE),
    LCAS_Iteration_SD = sd(LCAS_Iteration, na.rm = TRUE),
    LCAS_Total_Mean = mean(LCAS_Total, na.rm = TRUE),
    LCAS_Total_SD = sd(LCAS_Total, na.rm = TRUE)
  )

##   N_POS POS_Cognitive_Mean POS_Cognitive_SD POS_Emotional_Mean POS_Emotional_SD
## 1   110           3.461366        0.7890346            2.90303         1.061892
##   POS_Total_Mean POS_Total_SD N_LCAS LCAS_Collaboration_Mean
## 1       3.252083    0.8409775    108                3.513272
##   LCAS_Collaboration_SD LCAS_Discovery_Mean LCAS_Discovery_SD
## 1             0.5689197            3.895062         0.7344697
##   LCAS_Iteration_Mean LCAS_Iteration_SD LCAS_Total_Mean LCAS_Total_SD
## 1            4.029595         0.6871588        3.801122     0.5276572

# ============================================================
# 4. ADD POS/LCAS TO paired_wide
# ============================================================

post_experience <- post_only %>%
  select(
    StudentID,
    POS_Cognitive,
    POS_Emotional,
    POS_Total,
    LCAS_Collaboration,
    LCAS_Discovery,
    LCAS_Iteration,
    LCAS_Total
  )

paired_wide <- paired_wide %>%
  left_join(post_experience, by = "StudentID")


# ============================================================
# 5. MODELS: DO POS/LCAS PREDICT GAINS?
# ============================================================

# POS predicting EDCI gain
summary(lm(
  EDCI_Gain ~ EDCI_Total_Score_Pre + POS_Cognitive + POS_Emotional,
  data = paired_wide
))

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + POS_Cognitive + 
##     POS_Emotional, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.8214 -1.9037 -0.2943  2.3199  6.6388 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           5.89269    2.11175   2.790   0.0073 ** 
## EDCI_Total_Score_Pre -0.66063    0.11978  -5.515 1.06e-06 ***
## POS_Cognitive         0.08267    0.81665   0.101   0.9197    
## POS_Emotional         0.10158    0.64464   0.158   0.8754    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.061 on 53 degrees of freedom
##   (11 observations deleted due to missingness)
## Multiple R-squared:  0.3904, Adjusted R-squared:  0.3559 
## F-statistic: 11.31 on 3 and 53 DF,  p-value: 7.605e-06

# POS predicting RSQ gain
summary(lm(
  RSQ_Gain ~ RSQ_Total_Pre + POS_Cognitive + POS_Emotional,
  data = paired_wide
))

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + POS_Cognitive + POS_Emotional, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.02722 -0.23677 -0.04445  0.37345  1.28798 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     1.0904     0.4093   2.664  0.01031 * 
## RSQ_Total_Pre  -0.4086     0.1236  -3.305  0.00174 **
## POS_Cognitive  -0.1076     0.1401  -0.768  0.44594   
## POS_Emotional   0.1675     0.1115   1.503  0.13911   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5225 on 51 degrees of freedom
##   (13 observations deleted due to missingness)
## Multiple R-squared:  0.1806, Adjusted R-squared:  0.1324 
## F-statistic: 3.748 on 3 and 51 DF,  p-value: 0.0165

# POS predicting STEP gain
summary(lm(
  STEP_Gain ~ STEP_Total_Pre + POS_Cognitive + POS_Emotional,
  data = paired_wide
))

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + POS_Cognitive + POS_Emotional, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.11098 -0.30579  0.04248  0.28670  1.00515 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     2.26296    0.56607   3.998 0.000199 ***
## STEP_Total_Pre -0.63410    0.13974  -4.538  3.3e-05 ***
## POS_Cognitive  -0.01715    0.11906  -0.144 0.886038    
## POS_Emotional   0.16672    0.09606   1.735 0.088473 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4536 on 53 degrees of freedom
##   (11 observations deleted due to missingness)
## Multiple R-squared:  0.2823, Adjusted R-squared:  0.2417 
## F-statistic:  6.95 on 3 and 53 DF,  p-value: 0.0004972

# LCAS predicting EDCI gain
summary(lm(
  EDCI_Gain ~ EDCI_Total_Score_Pre +
    LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration,
  data = paired_wide
))

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + LCAS_Collaboration + 
##     LCAS_Discovery + LCAS_Iteration, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.9262 -1.8100 -0.4576  2.3053  6.1454 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           7.79388    3.62363   2.151   0.0361 *  
## EDCI_Total_Score_Pre -0.66202    0.11450  -5.782 4.02e-07 ***
## LCAS_Collaboration   -0.09336    0.81644  -0.114   0.9094    
## LCAS_Discovery        0.33501    0.57709   0.581   0.5640    
## LCAS_Iteration       -0.56353    0.69132  -0.815   0.4186    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.047 on 53 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.4039, Adjusted R-squared:  0.3589 
## F-statistic: 8.977 on 4 and 53 DF,  p-value: 1.301e-05

# LCAS predicting RSQ gain
summary(lm(
  RSQ_Gain ~ RSQ_Total_Pre +
    LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration,
  data = paired_wide
))

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + LCAS_Collaboration + 
##     LCAS_Discovery + LCAS_Iteration, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.13909 -0.34040  0.05381  0.31047  1.06449 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        -0.073211   0.634757  -0.115  0.90863   
## RSQ_Total_Pre      -0.346314   0.111273  -3.112  0.00304 **
## LCAS_Collaboration  0.246191   0.143718   1.713  0.09278 . 
## LCAS_Discovery     -0.003053   0.099148  -0.031  0.97555   
## LCAS_Iteration      0.066427   0.117631   0.565  0.57475   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5184 on 51 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.1949, Adjusted R-squared:  0.1317 
## F-statistic: 3.086 on 4 and 51 DF,  p-value: 0.02374

# LCAS predicting STEP gain
summary(lm(
  STEP_Gain ~ STEP_Total_Pre +
    LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration,
  data = paired_wide
))

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + LCAS_Collaboration + 
##     LCAS_Discovery + LCAS_Iteration, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.01462 -0.29124  0.07479  0.26530  0.99693 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         1.08164    0.63924   1.692 0.096506 .  
## STEP_Total_Pre     -0.53412    0.13687  -3.902 0.000271 ***
## LCAS_Collaboration  0.31740    0.11926   2.661 0.010276 *  
## LCAS_Discovery      0.07719    0.08493   0.909 0.367507    
## LCAS_Iteration     -0.06070    0.10461  -0.580 0.564197    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4451 on 53 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.3093, Adjusted R-squared:  0.2572 
## F-statistic: 5.933 on 4 and 53 DF,  p-value: 0.000507

# ============================================================
# 6. SIMPLE CORRELATIONS
# ============================================================

experience_vars <- paired_wide %>%
  select(
    EDCI_Gain,
    RSQ_Gain,
    STEP_Gain,
    POS_Cognitive,
    POS_Emotional,
    POS_Total,
    LCAS_Collaboration,
    LCAS_Discovery,
    LCAS_Iteration,
    LCAS_Total
  )

cor(experience_vars, use = "pairwise.complete.obs")

##                       EDCI_Gain    RSQ_Gain    STEP_Gain POS_Cognitive
## EDCI_Gain           1.000000000 -0.08735642  0.005279598    0.07226746
## RSQ_Gain           -0.087356419  1.00000000  0.109115113    0.03684589
## STEP_Gain           0.005279598  0.10911511  1.000000000    0.03379375
## POS_Cognitive       0.072267457  0.03684589  0.033793749    1.00000000
## POS_Emotional       0.166358494  0.06446700  0.055721888    0.76355377
## POS_Total           0.120523903  0.05217799  0.046204018    0.95374877
## LCAS_Collaboration -0.034754598  0.17250661  0.256554032    0.21623920
## LCAS_Discovery      0.066839945  0.11610645 -0.047447004    0.60151101
## LCAS_Iteration     -0.099357069  0.13082526 -0.144682758    0.33957218
## LCAS_Total         -0.033929190  0.18357660 -0.002300554    0.55628350
##                    POS_Emotional  POS_Total LCAS_Collaboration LCAS_Discovery
## EDCI_Gain             0.16635849 0.12052390         -0.0347546     0.06683994
## RSQ_Gain              0.06446700 0.05217799          0.1725066     0.11610645
## STEP_Gain             0.05572189 0.04620402          0.2565540    -0.04744700
## POS_Cognitive         0.76355377 0.95374877          0.2162392     0.60151101
## POS_Emotional         1.00000000 0.92183279          0.2145749     0.37988380
## POS_Total             0.92183279 1.00000000          0.2291462     0.53660005
## LCAS_Collaboration    0.21457486 0.22914622          1.0000000     0.11199908
## LCAS_Discovery        0.37988380 0.53660005          0.1119991     1.00000000
## LCAS_Iteration        0.29767935 0.34068770          0.2324855     0.49651261
## LCAS_Total            0.42211170 0.52794824          0.5406981     0.78677434
##                    LCAS_Iteration   LCAS_Total
## EDCI_Gain             -0.09935707 -0.033929190
## RSQ_Gain               0.13082526  0.183576602
## STEP_Gain             -0.14468276 -0.002300554
## POS_Cognitive          0.33957218  0.556283499
## POS_Emotional          0.29767935  0.422111702
## POS_Total              0.34068770  0.527948239
## LCAS_Collaboration     0.23248553  0.540698063
## LCAS_Discovery         0.49651261  0.786774341
## LCAS_Iteration         1.00000000  0.821549930
## LCAS_Total             0.82154993  1.000000000
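
The matrix above reports correlations without significance tests or pairwise sample sizes. A minimal sketch of how both could be obtained for a selected pair with base R's `cor.test`, shown on simulated data (the data frame `toy` and its values are hypothetical):

```r
# Hypothetical toy data with some missing values, mimicking pairwise deletion.
set.seed(7)
n <- 68
toy <- data.frame(LCAS_Collaboration = rnorm(n, 3.5, 0.6))
toy$STEP_Gain <- 0.3 * toy$LCAS_Collaboration + rnorm(n, sd = 0.5)
toy$STEP_Gain[sample(n, 8)] <- NA

# cor.test drops incomplete pairs itself; report r, p, and the pairwise n.
ct <- cor.test(toy$STEP_Gain, toy$LCAS_Collaboration)
pair_n <- sum(complete.cases(toy$STEP_Gain, toy$LCAS_Collaboration))

result <- c(r = unname(ct$estimate), p = ct$p.value, n = pair_n)
print(round(result, 3))
```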

Students who reported greater collaboration within the course demonstrated larger gains in STEP-U scores after controlling for baseline STEP (LCAS_Collaboration: b = 0.32, p = .010). Because STEP measures constructs such as scientific identity, belonging, and perceptions of science, this makes theoretical sense: students who interacted more with peers and engaged in collaborative scientific work showed greater attitudinal growth.

Students who experienced greater collaboration also tended to show larger increases in research self-efficacy (RSQ), although the relationship did not reach statistical significance (b = 0.25, p = .093).

In many CURE papers, project ownership predicts self-efficacy and identity. We do not see that here: neither POS_Cognitive nor POS_Emotional significantly predicted EDCI, RSQ, or STEP gains (all p > .08).
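
To see how far apart the two collaboration results really are, the reported estimates and standard errors can be turned into 95% confidence intervals. The sketch below uses only numbers copied from the model output above (estimate, standard error, residual degrees of freedom); the helper name `ci_from_summary` is hypothetical.

```r
# 95% CI from a coefficient, its standard error, and the residual df.
ci_from_summary <- function(estimate, se, df, level = 0.95) {
  crit <- qt(1 - (1 - level) / 2, df)
  c(lower = estimate - crit * se, upper = estimate + crit * se)
}

# LCAS_Collaboration in the STEP gain model (53 residual df).
ci_step <- ci_from_summary(0.31740, 0.11926, df = 53)

# LCAS_Collaboration in the RSQ gain model (51 residual df).
ci_rsq <- ci_from_summary(0.246191, 0.143718, df = 51)

print(round(rbind(STEP = ci_step, RSQ = ci_rsq), 3))
```

The STEP interval excludes zero and the RSQ interval includes it, but the two intervals overlap across most of their range, so these models alone do not show that collaboration matters more for STEP than for RSQ.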

summary(
  lm(
    Total.Pass ~
      Grit_Total +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = Total.Pass ~ Grit_Total + POS_Cognitive + POS_Emotional + 
##     LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2784 -0.5667  0.1267  0.7571  1.5332 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)         2.35164    1.44839   1.624    0.111
## Grit_Total          0.25719    0.27345   0.941    0.351
## POS_Cognitive      -0.15359    0.30885  -0.497    0.621
## POS_Emotional      -0.01030    0.20263  -0.051    0.960
## LCAS_Collaboration  0.41050    0.26533   1.547    0.128
## LCAS_Discovery     -0.01834    0.22105  -0.083    0.934
## LCAS_Iteration      0.25930    0.22376   1.159    0.252
## 
## Residual standard error: 0.97 on 50 degrees of freedom
##   (11 observations deleted due to missingness)
## Multiple R-squared:  0.09781,    Adjusted R-squared:  -0.01045 
## F-statistic: 0.9035 on 6 and 50 DF,  p-value: 0.5

paired_wide <- paired_wide %>%
  mutate(
    Mastery_Group =
      ifelse(Total.Pass == 6,
             "All Skills",
             "Not All Skills")
  )


t.test(
  Grit_Total ~ Mastery_Group,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  Grit_Total by Mastery_Group
## t = -0.079391, df = 63.336, p-value = 0.937
## alternative hypothesis: true difference in means between group All Skills and group Not All Skills is not equal to 0
## 95 percent confidence interval:
##  -0.2603199  0.2404240
## sample estimates:
##     mean in group All Skills mean in group Not All Skills 
##                     3.354699                     3.364646

t.test(
  POS_Cognitive ~ Mastery_Group,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  POS_Cognitive by Mastery_Group
## t = -0.53216, df = 51.905, p-value = 0.5969
## alternative hypothesis: true difference in means between group All Skills and group Not All Skills is not equal to 0
## 95 percent confidence interval:
##  -0.5372556  0.3120333
## sample estimates:
##     mean in group All Skills mean in group Not All Skills 
##                     3.436000                     3.548611

t.test(
  LCAS_Collaboration ~ Mastery_Group,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  LCAS_Collaboration by Mastery_Group
## t = 0.35328, df = 54.752, p-value = 0.7252
## alternative hypothesis: true difference in means between group All Skills and group Not All Skills is not equal to 0
## 95 percent confidence interval:
##  -0.2218680  0.3168199
## sample estimates:
##     mean in group All Skills mean in group Not All Skills 
##                     3.628205                     3.580729

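Project ownership and LCAS scores do not explain mastery likelihood: no predictor in the Total.Pass model was significant, and the All Skills and Not All Skills groups did not differ on POS_Cognitive or LCAS_Collaboration.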
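
Because Mastery_Group is binary, the question of mastery likelihood can also be posed directly as a logistic regression. This is a sketch on simulated data, not a result from the study; the data frame `toy` and the indicator `All_Skills` are hypothetical names.

```r
# Hypothetical toy data; predictors are unrelated to the outcome by construction.
set.seed(11)
n <- 60
toy <- data.frame(
  Grit_Total         = rnorm(n, 3.4, 0.5),
  POS_Cognitive      = rnorm(n, 3.5, 0.8),
  LCAS_Collaboration = rnorm(n, 3.6, 0.6),
  Total.Pass         = sample(3:6, n, replace = TRUE, prob = c(0.1, 0.15, 0.25, 0.5))
)
toy$All_Skills <- as.integer(toy$Total.Pass == 6)  # 1 = passed all six skills

# Model the odds of passing all six skills.
m_logit <- glm(All_Skills ~ Grit_Total + POS_Cognitive + LCAS_Collaboration,
               data = toy, family = binomial)

# Odds ratios with Wald intervals from confint.default.
odds <- exp(cbind(OR = coef(m_logit), confint.default(m_logit)))
print(round(odds, 2))
```

An odds ratio whose interval spans 1 would correspond to the null pattern reported above.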

Cluster

“Can we identify distinct types of students based on their experiences and dispositions, and do those types differ in mastery and learning outcomes?”

cluster_vars <- paired_wide %>%
  select(
    StudentID,
    Grit_Total,
    POS_Cognitive,
    POS_Emotional,
    LCAS_Collaboration,
    LCAS_Discovery,
    LCAS_Iteration
  )

# check missingness
colSums(is.na(cluster_vars))

##          StudentID         Grit_Total      POS_Cognitive      POS_Emotional 
##                  0                  0                 11                 10 
## LCAS_Collaboration     LCAS_Discovery     LCAS_Iteration 
##                 10                 10                 10

cluster_complete <- cluster_vars %>%
  drop_na()

cluster_scaled <- cluster_complete %>%
  select(-StudentID) %>%
  scale()


d <- dist(cluster_scaled)

hc <- hclust(d, method = "ward.D2")

plot(hc)

library(factoextra)

fviz_nbclust(
  cluster_scaled,
  kmeans,
  method = "silhouette"
)

set.seed(123)

k2 <- kmeans(
  cluster_scaled,
  centers = 2,
  nstart = 25
)

cluster_complete$Cluster <- factor(k2$cluster)

paired_wide <- paired_wide %>%
  left_join(
    cluster_complete %>%
      select(StudentID, Cluster),
    by = "StudentID"
  )

## Warning in left_join(., cluster_complete %>% select(StudentID, Cluster), : Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 4 of `x` matches multiple rows in `y`.
## ℹ Row 4 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
##   "many-to-many"` to silence this warning.
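
The warning above means StudentID is repeated on both sides of the join, so matched rows are multiplied. In the model summaries, residual degrees of freedom plus estimated coefficients plus deleted observations come to 68 rows before this join and 78 after it, which is consistent with some students appearing more than once. A minimal sketch of one way to guard the join, using toy tables (`x_toy` and `y_toy` are hypothetical) and assuming that one row per student is intended and that dplyr 1.1.0 or later is installed:

```r
library(dplyr)

# Hypothetical toy tables: student "s2" appears twice on each side.
x_toy <- tibble(StudentID = c("s1", "s2", "s2", "s3"),
                Grit_Total = c(3.1, 3.4, 3.4, 3.9))
y_toy <- tibble(StudentID = c("s1", "s2", "s2"),
                Cluster = factor(c(1, 2, 2)))

# 1. Find the IDs that occur more than once before joining.
dup_ids <- x_toy %>% count(StudentID) %>% filter(n > 1) %>% pull(StudentID)

# 2. Keep one row per student on each side (assumes duplicates are true repeats).
x_one <- x_toy %>% distinct(StudentID, .keep_all = TRUE)
y_one <- y_toy %>% distinct(StudentID, .keep_all = TRUE)

# 3. Join, and let dplyr raise an error if uniqueness is violated again.
joined <- left_join(x_one, y_one, by = "StudentID", relationship = "one-to-one")
print(joined)
```

If the repeated rows differ in content, inspecting the rows listed in `dup_ids` is safer than keeping the first one.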

paired_wide %>%
  group_by(Cluster) %>%
  summarise(
    n = n(),
    Grit_Total = mean(Grit_Total, na.rm = TRUE),
    POS_Cognitive = mean(POS_Cognitive, na.rm = TRUE),
    POS_Emotional = mean(POS_Emotional, na.rm = TRUE),
    LCAS_Collaboration = mean(LCAS_Collaboration, na.rm = TRUE),
    LCAS_Discovery = mean(LCAS_Discovery, na.rm = TRUE),
    LCAS_Iteration = mean(LCAS_Iteration, na.rm = TRUE),
    Total.Pass = mean(Total.Pass, na.rm = TRUE),
    EDCI_Gain = mean(EDCI_Gain, na.rm = TRUE),
    RSQ_Gain = mean(RSQ_Gain, na.rm = TRUE),
    STEP_Gain = mean(STEP_Gain, na.rm = TRUE)
  )

## # A tibble: 3 × 12
##   Cluster     n Grit_Total POS_Cognitive POS_Emotional LCAS_Collaboration
##   <fct>   <int>      <dbl>         <dbl>         <dbl>              <dbl>
## 1 1          44       3.41          3.90          3.49               3.80
## 2 2          25       3.35          2.75          1.99               3.37
## 3 <NA>        9       3.13        NaN             3.5                3.83
## # ℹ 6 more variables: LCAS_Discovery <dbl>, LCAS_Iteration <dbl>,
## #   Total.Pass <dbl>, EDCI_Gain <dbl>, RSQ_Gain <dbl>, STEP_Gain <dbl>

t.test(
  Total.Pass ~ Cluster,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  Total.Pass by Cluster
## t = 0.39992, df = 38.778, p-value = 0.6914
## alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
## 95 percent confidence interval:
##  -0.4132515  0.6168879
## sample estimates:
## mean in group 1 mean in group 2 
##        5.181818        5.080000

t.test(
  EDCI_Gain ~ Cluster,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  EDCI_Gain by Cluster
## t = -0.58977, df = 45.812, p-value = 0.5582
## alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
## 95 percent confidence interval:
##  -2.555757  1.397576
## sample estimates:
## mean in group 1 mean in group 2 
##        1.340909        1.920000

t.test(
  RSQ_Gain ~ Cluster,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  RSQ_Gain by Cluster
## t = 0.71361, df = 34.638, p-value = 0.4802
## alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
## 95 percent confidence interval:
##  -0.1967488  0.4099220
## sample estimates:
## mean in group 1 mean in group 2 
##       0.4011628       0.2945762

t.test(
  STEP_Gain ~ Cluster,
  data = paired_wide
)

## 
##  Welch Two Sample t-test
## 
## data:  STEP_Gain by Cluster
## t = -0.95253, df = 46.574, p-value = 0.3457
## alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
## 95 percent confidence interval:
##  -0.4139720  0.1479664
## sample estimates:
## mean in group 1 mean in group 2 
##      0.04655761      0.17956044

summary(
  lm(
    EDCI_Gain ~
      EDCI_Total_Score_Pre +
      Grit_Total +
      Cluster,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = EDCI_Gain ~ EDCI_Total_Score_Pre + Grit_Total + 
##     Cluster, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.5495 -1.6483 -0.2255  1.8459  5.5732 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            3.1717     2.6912   1.179    0.243    
## EDCI_Total_Score_Pre  -0.7273     0.1014  -7.174 8.55e-10 ***
## Grit_Total             0.9956     0.7841   1.270    0.209    
## Cluster2               0.9365     0.7262   1.290    0.202    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.888 on 65 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.445,  Adjusted R-squared:  0.4194 
## F-statistic: 17.37 on 3 and 65 DF,  p-value: 2.159e-08

summary(
  lm(
    RSQ_Gain ~
      RSQ_Total_Pre +
      Cluster,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = RSQ_Gain ~ RSQ_Total_Pre + Cluster, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.00197 -0.25197  0.03749  0.27444  1.25235 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    1.10291    0.23089   4.777  1.1e-05 ***
## RSQ_Total_Pre -0.31582    0.09831  -3.213  0.00207 ** 
## Cluster2      -0.17975    0.12873  -1.396  0.16754    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4905 on 63 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.149,  Adjusted R-squared:  0.122 
## F-statistic: 5.514 on 2 and 63 DF,  p-value: 0.006211

summary(
  lm(
    STEP_Gain ~
      STEP_Total_Pre +
      Cluster,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = STEP_Gain ~ STEP_Total_Pre + Cluster, data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.12012 -0.25716  0.08367  0.24862  0.77803 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      2.5720     0.6077   4.232 7.31e-05 ***
## STEP_Total_Pre  -0.5901     0.1410  -4.186 8.57e-05 ***
## Cluster2        -0.1160     0.1359  -0.854    0.396    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4877 on 66 degrees of freedom
##   (9 observations deleted due to missingness)
## Multiple R-squared:  0.2209, Adjusted R-squared:  0.1973 
## F-statistic: 9.356 on 2 and 66 DF,  p-value: 0.0002648

summary(
  lm(
    Total.Pass ~
      Grit_Total +
      HS.GPA_IR +
      ACT.complete +
      Cluster,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = Total.Pass ~ Grit_Total + HS.GPA_IR + ACT.complete + 
##     Cluster, data = paired_wide)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.7272 -0.2580  0.3062  0.3649  0.7776 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   4.38722    1.39466   3.146  0.00488 **
## Grit_Total   -0.13889    0.26805  -0.518  0.60977   
## HS.GPA_IR     0.38324    0.37939   1.010  0.32392   
## ACT.complete  0.00902    0.03548   0.254  0.80180   
## Cluster2      0.12119    0.28677   0.423  0.67688   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6749 on 21 degrees of freedom
##   (52 observations deleted due to missingness)
## Multiple R-squared:  0.08563,    Adjusted R-squared:  -0.08853 
## F-statistic: 0.4917 on 4 and 21 DF,  p-value: 0.7419

summary(
  lm(
    Total.Pass ~
      Grit_Total +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration +
      Cluster,
    data = paired_wide
  )
)

## 
## Call:
## lm(formula = Total.Pass ~ Grit_Total + POS_Cognitive + POS_Emotional + 
##     LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration + Cluster, 
##     data = paired_wide)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.34531 -0.62668  0.07151  0.69703  1.61687 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)          3.3036     2.0782   1.590    0.117
## Grit_Total           0.2671     0.2599   1.028    0.308
## POS_Cognitive       -0.1841     0.2864  -0.643    0.523
## POS_Emotional       -0.1524     0.2041  -0.747    0.458
## LCAS_Collaboration   0.3332     0.2716   1.227    0.225
## LCAS_Discovery      -0.0516     0.2437  -0.212    0.833
## LCAS_Iteration       0.2703     0.2032   1.330    0.189
## Cluster2            -0.2951     0.4804  -0.614    0.541
## 
## Residual standard error: 0.9278 on 59 degrees of freedom
##   (11 observations deleted due to missingness)
## Multiple R-squared:  0.1211, Adjusted R-squared:  0.01687 
## F-statistic: 1.162 on 7 and 59 DF,  p-value: 0.3384

Cluster 1

Highly Engaged Researchers

Students who:

Felt ownership
Felt emotionally invested
Perceived discovery
Perceived iteration
Perceived collaboration

Cluster 2

Low Ownership Researchers

Students who:

Completed the same course
Experienced less ownership
Experienced less discovery
Experienced less iteration

Students experienced the course differently, but those differences did not translate into substantial differences in learning outcomes.
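
Cluster labels like these are easiest to justify from the standardized cluster centers, which show how far each group sits from the sample mean on every input. A minimal sketch on simulated data (the matrix `toy` and its two built-in groups are hypothetical):

```r
# Hypothetical toy data: two groups that differ on ownership but not on grit.
set.seed(123)
toy <- rbind(
  cbind(Grit_Total = rnorm(40, 3.4, 0.4), POS_Cognitive = rnorm(40, 3.9, 0.4),
        POS_Emotional = rnorm(40, 3.5, 0.5)),
  cbind(Grit_Total = rnorm(25, 3.4, 0.4), POS_Cognitive = rnorm(25, 2.7, 0.4),
        POS_Emotional = rnorm(25, 2.0, 0.5))
)

toy_scaled <- scale(toy)                       # z-score each variable
km <- kmeans(toy_scaled, centers = 2, nstart = 25)

# Centers are in z-score units: values near 0 mean "no different from average".
print(round(km$centers, 2))

# Cluster sizes, to check that neither group is trivially small.
print(table(km$cluster))
```

Variables whose centers stay close to zero in both clusters contribute little to the grouping and should not drive the label.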

More exploration of mastery

attempt_cols <- c(
  "Pipette.Pass",
  "Excel.Pass",
  "Microscope.Pass",
  "PCR.Pass",
  "Gel.Pass",
  "Sequencing.Pass"
)

attempt_map <- c(
  "First" = 1,
  "Second" = 2,
  "Third" = 3
)

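# Work on a copy so dat_full itself is left unchanged
attempt_num <- dat_full

# Recode attempt labels to numbers; anything unmatched (including NA) becomes 4
attempt_num[attempt_cols] <- lapply(
  attempt_num[attempt_cols],
  function(x){
    # as.character() guards against factor columns being indexed by level code
    out <- as.numeric(attempt_map[as.character(x)])
    out[is.na(out)] <- 4
    out
  }
)

# Mean attempt number across the six skills (lower = fewer attempts needed)
attempt_num$Mean_Attempt <-
  rowMeans(
    attempt_num[attempt_cols],
    na.rm = TRUE
  )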
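
The recoding above indexes a named vector by label and then assigns 4 to anything unmatched, so "never passed" and "no record" receive the same score. A small sketch of that behavior on a toy vector (the labels in `toy_attempts` are hypothetical), with an alternative that keeps missing records as NA:

```r
attempt_map <- c("First" = 1, "Second" = 2, "Third" = 3)

# Hypothetical attempt labels for one skill; NA means no label was recorded.
toy_attempts <- c("First", "Third", NA, "Second", "Not passed")

# Same logic as the recoding above: unmatched labels and NA both become 4.
coded <- unname(attempt_map[toy_attempts])
coded[is.na(coded)] <- 4
print(coded)

# Alternative: code only recorded, unmatched labels as 4 and keep true NA missing.
coded_strict <- unname(attempt_map[toy_attempts])
coded_strict[is.na(coded_strict) & !is.na(toy_attempts)] <- 4
print(coded_strict)
```

Which version is appropriate depends on whether a blank cell in the source data means the student never passed or was never assessed.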

summary(
  lm(
    Mean_Attempt ~
      Grit_Total +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Mean_Attempt ~ Grit_Total + POS_Cognitive + POS_Emotional + 
##     LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.1399 -0.4087  0.0049  0.2648  2.0094 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         3.3533360  0.6075100   5.520 2.73e-07 ***
## Grit_Total         -0.2462592  0.1152354  -2.137   0.0351 *  
## POS_Cognitive       0.1554921  0.1437938   1.081   0.2822    
## POS_Emotional       0.0163224  0.0979415   0.167   0.8680    
## LCAS_Collaboration -0.1143651  0.1262062  -0.906   0.3670    
## LCAS_Discovery     -0.1619246  0.1042491  -1.553   0.1236    
## LCAS_Iteration      0.0006733  0.1122990   0.006   0.9952    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6324 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.07334,    Adjusted R-squared:  0.01718 
## F-statistic: 1.306 on 6 and 99 DF,  p-value: 0.2616

summary(
  lm(
    Mean_Attempt ~
      Grit_Perseverance +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Mean_Attempt ~ Grit_Perseverance + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.1024 -0.4014 -0.0091  0.2771  1.8619 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          3.4268     0.6095   5.623 1.74e-07 ***
## Grit_Perseverance   -0.2395     0.1036  -2.311   0.0229 *  
## POS_Cognitive        0.1318     0.1398   0.943   0.3481    
## POS_Emotional        0.0267     0.0968   0.276   0.7832    
## LCAS_Collaboration  -0.1040     0.1260  -0.826   0.4110    
## LCAS_Discovery      -0.1662     0.1040  -1.599   0.1131    
## LCAS_Iteration       0.0256     0.1123   0.228   0.8201    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6301 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.08021,    Adjusted R-squared:  0.02447 
## F-statistic: 1.439 on 6 and 99 DF,  p-value: 0.2074

summary(
  lm(
    Mean_Attempt ~
      Grit_Consistency +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Mean_Attempt ~ Grit_Consistency + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.11253 -0.46645  0.01301  0.27520  2.06217 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         2.86716    0.54819   5.230  9.6e-07 ***
## Grit_Consistency   -0.11494    0.09083  -1.265    0.209    
## POS_Cognitive       0.12282    0.14971   0.820    0.414    
## POS_Emotional       0.02869    0.10119   0.284    0.777    
## LCAS_Collaboration -0.11331    0.12899  -0.878    0.382    
## LCAS_Discovery     -0.13770    0.10636  -1.295    0.198    
## LCAS_Iteration     -0.01217    0.11531  -0.106    0.916    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6449 on 98 degrees of freedom
##   (290 observations deleted due to missingness)
## Multiple R-squared:  0.04606,    Adjusted R-squared:  -0.01235 
## F-statistic: 0.7886 on 6 and 98 DF,  p-value: 0.5809

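Mean_Attempt scale (mean of the six per-skill attempt codes):
1.0 = mastered everything on the first try
2.0 = averaged two attempts per skill
3.0 = averaged three attempts per skill
4.0 = no pass recorded for any skill (every skill coded 4)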
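
As a worked illustration of this scale, the sketch below recodes one hypothetical student's results with the same attempt_map and averages them. The student vector is invented for illustration; only attempt_map and the code-4 rule come from the analysis above.

```r
# Same mapping as in the analysis
attempt_map <- c("First" = 1, "Second" = 2, "Third" = 3)

# Hypothetical student: four first-attempt passes, one second-attempt pass,
# and no recorded pass (NA) for the last skill
student <- c("First", "First", "Second", "First", "First", NA)

codes <- as.numeric(attempt_map[as.character(student)])
codes[is.na(codes)] <- 4   # no pass recorded within three attempts

mean_attempt <- mean(codes)
print(codes)
print(mean_attempt)
```

In this toy case a single unrecorded pass moves the student's mean from 1.2 (five recorded skills) to about 1.67, so under this coding missing records and genuine failures cannot be told apart.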

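Students with higher grit required fewer attempts, on average, to achieve mastery across the six laboratory skills assessments (Grit_Total: b = -0.25, p = .035). This effect was driven specifically by perseverance (Grit_Perseverance: b = -0.24, p = .023) rather than consistency (Grit_Consistency: b = -0.11, p = .209). There was no evidence that ownership (POS) or perceived CURE experiences (LCAS) influenced mastery speed. Because none of the overall models was significant (all F-test p > .20, R² ≤ .08), these coefficients are best treated as exploratory.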
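
The comparison across the three grit predictors can be collected into one table rather than read off three separate summaries. The following is a minimal sketch on simulated data: the data frame sim, its values, and the reduced covariate set are assumptions for illustration, not the study data.

```r
set.seed(42)
n <- 120

# Simulated stand-in for attempt_num (hypothetical values)
sim <- data.frame(
  Grit_Total        = rnorm(n, 3.5, 0.6),
  Grit_Perseverance = rnorm(n, 3.8, 0.6),
  Grit_Consistency  = rnorm(n, 3.2, 0.7),
  POS_Cognitive     = rnorm(n, 4.0, 0.5),
  POS_Emotional     = rnorm(n, 3.5, 0.7)
)
sim$Mean_Attempt <- 3 - 0.25 * sim$Grit_Perseverance + rnorm(n, sd = 0.6)

covariates <- c("POS_Cognitive", "POS_Emotional")
grit_terms <- c("Grit_Total", "Grit_Perseverance", "Grit_Consistency")

# Fit one model per grit predictor and keep its coefficient, its p-value,
# and the p-value of the overall F test
grit_table <- do.call(rbind, lapply(grit_terms, function(g) {
  fit <- lm(reformulate(c(g, covariates), response = "Mean_Attempt"), data = sim)
  s   <- summary(fit)
  fs  <- s$fstatistic
  data.frame(
    predictor = g,
    estimate  = unname(s$coefficients[g, "Estimate"]),
    p_value   = unname(s$coefficients[g, "Pr(>|t|)"]),
    model_p   = unname(pf(fs[1], fs[2], fs[3], lower.tail = FALSE))
  )
}))
print(grit_table)
```

Keeping the overall model p-value next to each coefficient makes it easier to see when a single significant predictor sits inside a model that is not significant as a whole.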

attempt_num$Worst_Attempt <-
  apply(
    attempt_num[, attempt_cols],
    1,
    max,
    na.rm = TRUE
  )

summary(attempt_num$Worst_Attempt)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    3.00    4.00    3.39    4.00    4.00

table(attempt_num$Worst_Attempt)

## 
##   1   2   3   4 
##  19  75  34 267

summary(
  lm(
    Worst_Attempt ~
      Grit_Perseverance +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Worst_Attempt ~ Grit_Perseverance + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4121 -0.5724  0.4113  0.5693  1.1008 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         5.02450    0.81819   6.141 1.71e-08 ***
## Grit_Perseverance  -0.20157    0.13913  -1.449    0.151    
## POS_Cognitive      -0.07911    0.18772  -0.421    0.674    
## POS_Emotional       0.20909    0.12995   1.609    0.111    
## LCAS_Collaboration -0.15014    0.16909  -0.888    0.377    
## LCAS_Discovery     -0.13464    0.13957  -0.965    0.337    
## LCAS_Iteration     -0.01702    0.15071  -0.113    0.910    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8459 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.06667,    Adjusted R-squared:  0.0101 
## F-statistic: 1.179 on 6 and 99 DF,  p-value: 0.3239

summary(
  lm(
    Worst_Attempt ~
      Grit_Consistency +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Worst_Attempt ~ Grit_Consistency + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3984 -0.6072  0.4028  0.6103  1.0044 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         4.52015    0.72773   6.211 1.27e-08 ***
## Grit_Consistency   -0.06402    0.12058  -0.531    0.597    
## POS_Cognitive      -0.08081    0.19874  -0.407    0.685    
## POS_Emotional       0.20391    0.13433   1.518    0.132    
## LCAS_Collaboration -0.16481    0.17123  -0.962    0.338    
## LCAS_Discovery     -0.11929    0.14119  -0.845    0.400    
## LCAS_Iteration     -0.04874    0.15308  -0.318    0.751    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8561 on 98 degrees of freedom
##   (290 observations deleted due to missingness)
## Multiple R-squared:  0.04946,    Adjusted R-squared:  -0.008735 
## F-statistic: 0.8499 on 6 and 98 DF,  p-value: 0.5347

summary(
  lm(
    Worst_Attempt ~
      Grit_Total +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Worst_Attempt ~ Grit_Total + POS_Cognitive + POS_Emotional + 
##     LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4244 -0.6243  0.3848  0.5816  1.0572 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         4.79817    0.81718   5.872 5.78e-08 ***
## Grit_Total         -0.15152    0.15501  -0.977    0.331    
## POS_Cognitive      -0.08012    0.19342  -0.414    0.680    
## POS_Emotional       0.20821    0.13174   1.580    0.117    
## LCAS_Collaboration -0.16103    0.16976  -0.949    0.345    
## LCAS_Discovery     -0.12231    0.14023  -0.872    0.385    
## LCAS_Iteration     -0.03724    0.15106  -0.247    0.806    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8507 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.05599,    Adjusted R-squared:  -0.001223 
## F-statistic: 0.9786 on 6 and 99 DF,  p-value: 0.4439

Value	Meaning
1	Passed every skill on first attempt
2	Needed at least one second attempt
3	Needed at least one third attempt
4	Failed at least one skill

attempt_num$Number_First_Pass <-
  rowSums(
    attempt_num[, attempt_cols] == 1,
    na.rm = TRUE
  )

summary(
  lm(
    Number_First_Pass ~
      Grit_Perseverance +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Number_First_Pass ~ Grit_Perseverance + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4902 -0.8648  0.0191  0.8317  3.5020 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        -0.30249    1.23383  -0.245  0.80684   
## Grit_Perseverance   0.56023    0.20981   2.670  0.00886 **
## POS_Cognitive      -0.34826    0.28308  -1.230  0.22152   
## POS_Emotional       0.05526    0.19597   0.282  0.77853   
## LCAS_Collaboration  0.21620    0.25499   0.848  0.39856   
## LCAS_Discovery      0.21578    0.21047   1.025  0.30774   
## LCAS_Iteration      0.01224    0.22727   0.054  0.95715   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.276 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.08609,    Adjusted R-squared:  0.0307 
## F-statistic: 1.554 on 6 and 99 DF,  p-value: 0.1686

summary(
  lm(
    Number_First_Pass ~
      Grit_Consistency +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Number_First_Pass ~ Grit_Consistency + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6894 -0.6348 -0.1303  0.6919  3.4773 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)         1.346016   1.120335   1.201    0.232
## Grit_Consistency    0.146143   0.185631   0.787    0.433
## POS_Cognitive      -0.209528   0.305956  -0.685    0.495
## POS_Emotional      -0.004325   0.206802  -0.021    0.983
## LCAS_Collaboration  0.240348   0.263608   0.912    0.364
## LCAS_Discovery      0.103900   0.217361   0.478    0.634
## LCAS_Iteration      0.072592   0.235661   0.308    0.759
## 
## Residual standard error: 1.318 on 98 degrees of freedom
##   (290 observations deleted due to missingness)
## Multiple R-squared:  0.02243,    Adjusted R-squared:  -0.03743 
## F-statistic: 0.3747 on 6 and 98 DF,  p-value: 0.8935

summary(
  lm(
    Number_First_Pass ~
      Grit_Total +
      POS_Cognitive +
      POS_Emotional +
      LCAS_Collaboration +
      LCAS_Discovery +
      LCAS_Iteration,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Number_First_Pass ~ Grit_Total + POS_Cognitive + 
##     POS_Emotional + LCAS_Collaboration + LCAS_Discovery + LCAS_Iteration, 
##     data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6128 -0.7798  0.0344  0.7493  3.5445 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         0.25480    1.24654   0.204   0.8385  
## Grit_Total          0.44542    0.23645   1.884   0.0625 .
## POS_Cognitive      -0.35458    0.29505  -1.202   0.2323  
## POS_Emotional       0.06114    0.20096   0.304   0.7616  
## LCAS_Collaboration  0.24552    0.25896   0.948   0.3454  
## LCAS_Discovery      0.18533    0.21391   0.866   0.3884  
## LCAS_Iteration      0.06877    0.23042   0.298   0.7660  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.298 on 99 degrees of freedom
##   (289 observations deleted due to missingness)
## Multiple R-squared:  0.05417,    Adjusted R-squared:  -0.003149 
## F-statistic: 0.9451 on 6 and 99 DF,  p-value: 0.4666

Value	Meaning
0	No skills mastered immediately
6	All skills mastered immediately
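
To show how the three mastery metrics differ, here is a small sketch on hypothetical attempt codes for three students. The values are invented; the column names follow attempt_cols.

```r
# Hypothetical attempt codes for three students across six skills
# (1-3 = attempt on which the skill was passed, 4 = no pass recorded)
codes <- data.frame(
  Pipette.Pass    = c(1, 1, 2),
  Excel.Pass      = c(1, 2, 1),
  Microscope.Pass = c(1, 1, 4),
  PCR.Pass        = c(1, 3, 1),
  Gel.Pass        = c(1, 1, 1),
  Sequencing.Pass = c(1, 2, 4)
)

worst_attempt     <- apply(codes, 1, max)   # highest code across skills
number_first_pass <- rowSums(codes == 1)    # count of first-attempt passes
mean_attempt      <- rowMeans(codes)        # average code across skills

print(data.frame(worst_attempt, number_first_pass, mean_attempt))
```

The second and third students both have three first-attempt passes, yet they differ on Worst_Attempt and Mean_Attempt, which is why the three outcomes need not agree on which predictors matter.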

library(dplyr)   # provides %>%, select(), and all_of() used below
library(tidyr)
library(ggplot2)

attempt_long <- dat_full %>%
  select(all_of(attempt_cols)) %>%
  pivot_longer(
    everything(),
    names_to = "Skill",
    values_to = "Attempt"
  )

ggplot(
  attempt_long,
  aes(x = Skill, fill = Attempt)
) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  theme_classic() +
  ylab("Percent of Students") +
  xlab("Skill Assessment")

summary(
  lm(
    Mean_Attempt ~
      Grit_Perseverance * Gender,
    data = attempt_num
  )
)

## 
## Call:
## lm(formula = Mean_Attempt ~ Grit_Perseverance * Gender, data = attempt_num)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5244 -0.5828  0.0503  0.5145  2.0365 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    2.08632    0.35226   5.923 7.01e-09 ***
## Grit_Perseverance              0.03262    0.08431   0.387   0.6990    
## GenderMale                     0.90477    0.46057   1.964   0.0502 .  
## GenderOther                    1.16448    1.00614   1.157   0.2478    
## Grit_Perseverance:GenderMale  -0.22706    0.11948  -1.900   0.0581 .  
## Grit_Perseverance:GenderOther -0.38371    0.26347  -1.456   0.1461    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7548 on 386 degrees of freedom
##   (3 observations deleted due to missingness)
## Multiple R-squared:  0.03417,    Adjusted R-squared:  0.02166 
## F-statistic: 2.731 on 5 and 386 DF,  p-value: 0.01934

ggplot(
  attempt_num,
  aes(
    x = Grit_Perseverance,
    y = Mean_Attempt,
    color = Gender
  )
) +
  geom_point(alpha = .7) +
  geom_smooth(method = "lm", se = TRUE) +
  theme_classic() +
  labs(
    x = "Grit Perseverance",
    y = "Mean Attempts to Mastery"
  )

## `geom_smooth()` using formula = 'y ~ x'

## Warning: Removed 3 rows containing non-finite outside the scale range
## (`stat_smooth()`).

## Warning: Removed 3 rows containing missing values or values outside the scale range
## (`geom_point()`).

Summary to this point:

Outcome	Significant Predictor
EDCI Gain	Grit Total
Mean Attempt	Grit Perseverance
Number First Pass	Grit Perseverance
Worst Attempt	None
Total Pass	None
RSQ Gain	None
STEP Gain	Collaboration (earlier model)

Different dimensions of grit were associated with different educational outcomes

Then discuss:

Consistency of Interests

Associated with gains in research competency. Particularly relevant in a semester-long authentic research experience requiring sustained engagement with a single project.

Perseverance of Effort

Associated with mastery efficiency. Students higher in perseverance achieved mastery with fewer assessment attempts and demonstrated more first-attempt mastery.

This is actually a stronger and more nuanced finding than simply saying “grit mattered.”

Research Competency (EDCI)

Overall grit predicts EDCI outcomes.
Consistency of interests appears particularly important.

Mastery Learning

Students ultimately achieved mastery regardless of grit.
Perseverance may influence how efficiently mastery is achieved.
This relationship appears stronger among males than females.

Interpretation

This actually fits Duckworth’s original grit framework nicely:

Consistency of Interests

Maintaining focus on a long-term project.
Relevant to a semester-long research experience.

Perseverance of Effort

Continuing after setbacks.
Relevant to repeated mastery assessment attempts.

Those are different psychological processes, and your data seem to separate them.

Exploratory analyses suggested that perseverance of effort may be associated with mastery efficiency, particularly among male students. A marginal interaction between perseverance and gender (b = -0.23, p = .058) indicated that higher perseverance was associated with fewer assessment attempts among males, whereas little relationship was observed among females, the reference group (b = 0.03, p = .699).
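
The group-specific slopes implied by the interaction model can be recovered by adding each interaction term to the reference slope. The sketch below uses only the coefficients printed in the Mean_Attempt ~ Grit_Perseverance * Gender output; standard errors for these simple slopes would need the coefficient covariance matrix, which is not shown here.

```r
# Coefficients copied from the interaction model output
b_persev       <- 0.03262    # slope for the reference gender group
b_persev_male  <- -0.22706   # change in slope for males
b_persev_other <- -0.38371   # change in slope for "Other"

slopes <- c(
  Reference = b_persev,
  Male      = b_persev + b_persev_male,
  Other     = b_persev + b_persev_other
)
print(round(slopes, 3))
```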

Latent Structure

library(psych)

dat_full$RSQ13 <- dplyr::recode(
  as.character(dat_full$RSQ13),
  "Confident" = "1",
  "Somewhat Confident" = "2",
  "Moderately Confident" = "3",
  "Very Confident" = "4"
)

dat_full$RSQ14 <- dplyr::recode(
  as.character(dat_full$RSQ14),
  "Confident" = "1",
  "Somewhat Confident" = "2",
  "Moderately Confident" = "3",
  "Very Confident" = "4"
)

dat_full$RSQ13 <- as.numeric(dat_full$RSQ13)
dat_full$RSQ14 <- as.numeric(dat_full$RSQ14)




fa.parallel(
  dat_full[, rsq_items],
  fa = "fa"
)

## Parallel analysis suggests that the number of factors =  5  and the number of components =  NA

rsq_fa <- fa(
  dat_full[, rsq_items],
  nfactors = 5,
  rotate = "oblimin"
)

## Loading required namespace: GPArotation

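# Note: print.psych() suppresses small loadings via `cut`, not `cutoff`;
# `cutoff` is not matched, so every loading is printed below.
print(rsq_fa, cutoff = .30)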

## Factor Analysis using method =  minres
## Call: fa(r = dat_full[, rsq_items], nfactors = 5, rotate = "oblimin")
## Standardized loadings (pattern matrix) based upon correlation matrix
##         MR1   MR2   MR4   MR3   MR5   h2   u2 com
## RSQ1   0.03 -0.01 -0.02  0.90  0.00 0.81 0.19 1.0
## RSQ2   0.77 -0.07 -0.01  0.04 -0.05 0.56 0.44 1.0
## RSQ3   0.82 -0.01 -0.02  0.08 -0.06 0.69 0.31 1.0
## RSQ4   0.02  0.42 -0.08  0.32  0.28 0.48 0.52 2.8
## RSQ5  -0.09  0.54  0.19  0.19 -0.14 0.40 0.60 1.7
## RSQ6   0.52  0.36 -0.01  0.04 -0.17 0.59 0.41 2.0
## RSQ7   0.07  0.75 -0.07 -0.01  0.02 0.59 0.41 1.0
## RSQ8  -0.20  0.33  0.05  0.17  0.31 0.26 0.74 3.3
## RSQ9   0.67  0.01  0.15  0.02  0.14 0.60 0.40 1.2
## RSQ10  0.64  0.05  0.08  0.00  0.39 0.65 0.35 1.7
## RSQ11  0.67  0.21  0.07 -0.03 -0.10 0.64 0.36 1.3
## RSQ12  0.24  0.49  0.09 -0.14  0.10 0.42 0.58 1.8
## RSQ13  0.27  0.00  0.56  0.23 -0.17 0.66 0.34 2.1
## RSQ14 -0.04 -0.01  0.88 -0.08  0.05 0.75 0.25 1.0
## 
##                        MR1  MR2  MR4  MR3  MR5
## SS loadings           3.32 1.81 1.30 1.17 0.47
## Proportion Var        0.24 0.13 0.09 0.08 0.03
## Cumulative Var        0.24 0.37 0.46 0.54 0.58
## Proportion Explained  0.41 0.22 0.16 0.14 0.06
## Cumulative Proportion 0.41 0.64 0.80 0.94 1.00
## 
##  With factor correlations of 
##      MR1  MR2  MR4   MR3   MR5
## MR1 1.00 0.43 0.42  0.31  0.01
## MR2 0.43 1.00 0.14  0.33  0.15
## MR4 0.42 0.14 1.00  0.05  0.00
## MR3 0.31 0.33 0.05  1.00 -0.02
## MR5 0.01 0.15 0.00 -0.02  1.00
## 
## Mean item complexity =  1.6
## Test of the hypothesis that 5 factors are sufficient.
## 
## df null model =  91  with the objective function =  5.67 0.3 with Chi Square =  2204.72
## df of  the model are 31  and the objective function was  0.18 
##  0.3
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.03 
##  0.3
## The harmonic n.obs is  391 with the empirical chi square  13.65  with prob <  1 
##  0.3The total n.obs was  395  with Likelihood Chi Square =  71.24  with prob <  5.2e-05 
##  0.3
## Tucker Lewis Index of factoring reliability =  0.944
## RMSEA index =  0.057  and the 90 % confidence intervals are  0.04 0.075 0.3
## BIC =  -114.11
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                    MR1  MR2  MR4  MR3   MR5
## Correlation of (regression) scores with factors   0.84 0.79 0.86 0.87  0.67
## Multiple R square of scores with factors          0.70 0.62 0.75 0.76  0.45
## Minimum correlation of possible factor scores     0.41 0.25 0.49 0.51 -0.09

dat_full$RSQ_Research <- rowMeans(
  dat_full[, c("RSQ2","RSQ3","RSQ4","RSQ5","RSQ6")],
  na.rm = TRUE
)

dat_full$RSQ_Research

##   [1] 2.40 2.40 1.40 2.80 3.00 1.60 2.60 1.40 1.40 1.60 1.40 1.60 2.80 2.00 3.40
##  [16] 4.00 2.60 2.20 2.20 1.20 2.00 1.60 1.60 1.40 1.20 3.00 3.00 2.80 1.60 1.80
##  [31] 2.20 2.00 2.60 1.60 1.80 2.40 4.00 3.20 3.60 3.00 3.60 2.20 3.60 2.60 3.60
##  [46] 3.20 2.20 1.60 3.20 3.20 1.80 1.80 2.80 1.40 1.40 1.80 3.20 2.75 4.00 4.00
##  [61] 1.00 1.40 1.60 4.00 1.40 1.60 1.80 2.00 2.00 1.40 2.60 1.60 1.80 1.40 1.40
##  [76] 1.20 2.00 3.00 1.00 1.40 1.00 1.80 1.60 2.20 1.80 1.80 2.20 1.20 1.60 2.00
##  [91] 2.00 2.40  NaN 1.60 1.80 1.40 1.00 1.40 2.20 1.60 3.20 2.20 2.80 2.40 2.00
## [106] 2.00 2.40 2.40 2.60 1.40 2.20 2.60 2.00 2.00 2.20 1.80 3.20 3.20 2.60 1.80
## [121] 2.00 1.60 2.60 3.20 1.80 1.80 2.00 2.60 1.80 1.60 2.00 2.00 2.20 1.80 2.00
## [136] 3.20 3.20 2.40 2.40 2.40 2.20 1.80 1.20 2.20 2.60 1.20 1.80 2.00 2.80 1.80
## [151] 1.80 1.80 3.60 4.00 2.20 1.80 2.00 1.40 1.60 1.80 1.20 2.80 1.80 2.40 2.40
## [166] 1.20 3.20 3.40 2.20 2.20 1.00 2.00 1.60 1.40 2.80 1.40 1.00 1.40 1.60 2.00
## [181] 1.75 2.00 2.60 2.60 2.60 1.20 2.20 3.20 3.00 1.60 1.60  NaN 2.40 2.80 3.20
## [196] 3.00 3.20 3.20 2.00 1.40 1.60 3.00 2.40 2.60 2.20 2.00 1.60 2.60 3.60 2.60
## [211] 2.20 1.80 2.60 2.60 3.00 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20
## [226] 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 2.80 2.80 2.80 2.80 2.80
## [241] 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80 2.80
## [256] 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80 1.80
## [271] 1.80 1.80 1.80 1.80 1.80 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40
## [286] 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.40 1.20 1.20 1.20 1.20 1.20
## [301] 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20 1.20
## [316] 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20 3.20
## [331] 3.20 3.20 3.20 3.20 3.20 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00
## [346] 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.40 2.40 2.40 2.40 2.40
## [361] 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40 2.40
## [376] 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60 1.60
## [391] 1.60 1.60 1.60 1.60 1.60

dat_full$RSQ_Communication <- rowMeans(
  dat_full[, c("RSQ9","RSQ10","RSQ11","RSQ12","RSQ13")],
  na.rm = TRUE
)

dat_full$RSQ_Communication

##   [1] 2.200000 2.800000 2.200000 2.200000 3.000000 2.200000 3.000000 1.800000
##   [9] 1.400000 2.800000 2.000000 1.800000 3.200000 3.600000 4.000000 4.000000
##  [17] 3.600000 3.200000 1.800000 1.400000 3.000000 1.600000 3.400000 2.800000
##  [25] 2.800000 2.600000 2.600000 3.600000 1.800000 2.200000 3.000000 2.200000
##  [33] 2.600000 2.000000 2.400000 2.800000 4.000000 4.000000 4.000000 2.600000
##  [41] 4.000000 2.400000 3.800000 2.600000 2.800000 3.000000 2.000000 2.600000
##  [49] 3.400000 3.400000 3.400000 4.000000 4.000000 1.400000 1.600000 1.800000
##  [57] 3.600000 3.200000 4.000000 4.000000 1.200000 1.600000 3.200000 4.000000
##  [65] 1.600000 1.600000 2.000000 2.400000 2.400000 1.600000 2.200000 2.200000
##  [73] 2.200000 1.800000 4.000000 2.400000 2.400000 3.400000 1.200000 3.200000
##  [81] 1.200000 1.800000 1.600000 2.800000 2.400000 2.000000 2.000000 1.600000
##  [89] 2.000000 2.200000 1.400000 3.000000      NaN 2.000000 2.600000 1.000000
##  [97] 2.400000 3.000000 2.600000 1.500000 3.200000 1.800000 3.000000 2.800000
## [105] 2.000000 2.800000 2.000000 3.200000 3.000000 1.400000 1.666667 2.200000
## [113] 1.800000 1.800000 2.600000 1.800000 3.600000 4.000000 2.000000 2.200000
## [121] 2.200000 2.600000 3.200000 3.200000 2.400000 1.600000 2.200000 2.600000
## [129] 2.400000 2.000000 1.800000 2.200000 2.200000 2.600000 3.200000 4.000000
## [137] 4.000000 2.600000 3.400000 2.750000 3.000000 2.200000 1.600000 3.600000
## [145] 2.400000 2.200000 1.800000 2.200000 2.800000 1.500000 1.800000 2.200000
## [153] 3.600000 3.400000 2.600000 2.000000 2.600000 1.600000 2.800000 2.800000
## [161] 2.000000 3.000000 3.200000 1.600000 2.600000 2.200000 3.000000 2.600000
## [169] 1.800000 2.600000 2.000000 2.800000 2.000000 1.600000 3.000000 2.600000
## [177] 2.400000 3.000000 1.600000 2.333333 2.200000 2.200000 3.000000 3.400000
## [185] 3.800000 1.800000 2.400000 2.400000 2.400000 1.600000 1.800000      NaN
## [193] 3.200000 3.400000 2.400000 2.200000 2.800000 3.200000 2.000000 1.800000
## [201] 1.800000 2.800000 2.200000 3.400000 2.400000 1.800000 1.600000 2.800000
## [209] 4.000000 2.000000 1.200000 1.200000 2.200000 3.600000 3.600000 3.000000
## [217] 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000
## [225] 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000 3.000000
## [233] 3.000000 3.000000 3.000000 2.400000 2.400000 2.400000 2.400000 2.400000
## [241] 2.400000 2.400000 2.400000 2.400000 2.400000 2.400000 2.400000 2.400000
## [249] 2.400000 2.400000 2.400000 2.400000 2.400000 2.400000 2.400000 1.400000
## [257] 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000
## [265] 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000 1.400000
## [273] 1.400000 1.400000 1.400000 1.600000 1.600000 1.600000 1.600000 1.600000
## [281] 1.600000 1.600000 1.600000 1.600000 1.600000 1.600000 1.600000 1.600000
## [289] 1.600000 1.600000 1.600000 1.600000 1.600000 1.600000 1.600000 1.200000
## [297] 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000
## [305] 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000 1.200000
## [313] 1.200000 1.200000 1.200000 4.000000 4.000000 4.000000 4.000000 4.000000
## [321] 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
## [329] 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000 4.000000 2.200000
## [337] 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000
## [345] 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000
## [353] 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000
## [361] 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000
## [369] 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.200000 2.000000
## [377] 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000
## [385] 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000 2.000000
## [393] 2.000000 2.000000 2.000000

dat_full$RSQ_Competence <- rowMeans(
  dat_full[, c(
    "RSQ2",
    "RSQ3",
    "RSQ6",
    "RSQ7",
    "RSQ9",
    "RSQ10",
    "RSQ11"
  )],
  na.rm = TRUE
)

competence_wide <- dat_paired %>%
  select(StudentID, Timepoint, RSQ_Competence) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = RSQ_Competence,
    names_prefix = "RSQ_Competence_"
  ) %>%
  mutate(
    RSQ_Competence_Gain = RSQ_Competence_Post - RSQ_Competence_Pre
  )



paired_wide <- paired_wide %>%
  left_join(
    competence_wide %>%
      select(StudentID, RSQ_Competence_Pre, RSQ_Competence_Post, RSQ_Competence_Gain),
    by = "StudentID"
  )

edci_wide <- dat_paired %>%
  select(StudentID, Timepoint, EDCI_Total_Score) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = EDCI_Total_Score,
    names_prefix = "EDCI_"
  ) %>%
  mutate(
    EDCI_Gain = EDCI_Post - EDCI_Pre
  )

paired_wide <- paired_wide %>%
  left_join(
    edci_wide %>%
      select(StudentID, EDCI_Gain),
    by = "StudentID"
  )
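
Column names such as EDCI_Gain.x typically appear when a join brings in a non-key column that already exists in the left table, for example when a join chunk is run more than once. The sketch below reproduces this with base merge(), used here as a stand-in for left_join() (both default to the .x/.y suffixes); the two small tables are hypothetical.

```r
# Hypothetical tables that both already contain EDCI_Gain
paired_wide <- data.frame(StudentID = 1:3, EDCI_Gain = c(2, -1, 4))
edci_wide   <- data.frame(StudentID = 1:3, EDCI_Gain = c(2, -1, 4))

# Joining on StudentID duplicates the shared column with suffixes
joined <- merge(paired_wide, edci_wide, by = "StudentID", all.x = TRUE)
print(names(joined))

# One way to avoid the duplicate: only bring in columns not already present
new_cols <- setdiff(names(edci_wide), names(paired_wide))
clean <- merge(paired_wide, edci_wide[c("StudentID", new_cols)],
               by = "StudentID", all.x = TRUE)
print(names(clean))
```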


# cor.test() drops incomplete pairs on its own; `use` is an argument of
# cor(), not cor.test(), so it is omitted here.
cor.test(
  paired_wide$RSQ_Competence_Gain.x,
  paired_wide$EDCI_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$RSQ_Competence_Gain.x and paired_wide$EDCI_Gain.x
## t = -0.41264, df = 68, p-value = 0.6812
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2816438  0.1871951
## sample estimates:
##         cor 
## -0.04997718

Loading	Interpretation
> 0.70	Very strong
0.50-0.69	Strong
0.30-0.49	Moderate
< 0.30	Usually ignore

Exploratory factor analysis suggested that RSQ items clustered primarily around a broad research competence construct, with smaller factors associated with collaboration and research identity. Notably, most of the items showing the largest pre-post gains (RSQ2, RSQ9, and RSQ10) loaded strongly on the research competence factor, suggesting that the redesigned laboratory curriculum primarily enhanced students’ confidence in core research practices.
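
The loading bands tabulated above can be applied mechanically. The sketch below classifies the first-factor (MR1) loadings copied from the factor analysis output; the helper classify_loading and its treatment of boundary values are assumptions for illustration.

```r
# MR1 loadings copied from the factor analysis output
mr1 <- c(RSQ2 = 0.77, RSQ3 = 0.82, RSQ6 = 0.52, RSQ7 = 0.07,
         RSQ9 = 0.67, RSQ10 = 0.64, RSQ11 = 0.67)

classify_loading <- function(x) {
  # Assumption: bands are left-closed, so 0.30, 0.50, and 0.70 fall
  # into the higher band; the sign of the loading is ignored
  cut(abs(x),
      breaks = c(-Inf, 0.30, 0.50, 0.70, Inf),
      labels = c("Usually ignore", "Moderate", "Strong", "Very strong"),
      right = FALSE)
}

bands <- setNames(as.character(classify_loading(mr1)), names(mr1))
print(bands)
```

On these values RSQ7 falls in the lowest band for MR1, which is worth checking before treating it as part of a research competence composite.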

Methods:

To further examine the underlying structure of the modified Research Skills Questionnaire (RSQ), an exploratory factor analysis (EFA) was conducted using minimum residual (minres) estimation, the default of the fa() function in the R package psych, with oblimin rotation. An oblique rotation was selected because dimensions of research self-efficacy were expected to be correlated. The number of factors to extract was initially evaluated using parallel analysis. Factor loadings ≥ 0.30 were considered meaningful for interpretation. The resulting factor structure was used to aid interpretation of patterns of student gains across individual RSQ items rather than to establish a definitive measurement model.
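
For readers unfamiliar with parallel analysis, the idea is to keep only factors whose eigenvalues exceed those obtained from random data of the same size. The sketch below is a simplified, component-based version in base R on simulated two-factor data; it is not the algorithm used by fa.parallel() with fa = "fa", and all data and settings are hypothetical.

```r
set.seed(1)
n <- 300   # respondents
p <- 6     # items

# Simulated items driven by two independent latent factors
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- cbind(
  f1 + rnorm(n, sd = 0.6), f1 + rnorm(n, sd = 0.6), f1 + rnorm(n, sd = 0.6),
  f2 + rnorm(n, sd = 0.6), f2 + rnorm(n, sd = 0.6), f2 + rnorm(n, sd = 0.6)
)

# Eigenvalues of the observed correlation matrix
obs_eig <- eigen(cor(items), only.values = TRUE)$values

# Eigenvalues from random normal data with the same dimensions
n_sims  <- 200
sim_eig <- replicate(
  n_sims,
  eigen(cor(matrix(rnorm(n * p), n, p)), only.values = TRUE)$values
)
threshold <- apply(sim_eig, 1, quantile, probs = 0.95)

# Retain leading dimensions until one fails to beat its random threshold
n_retain <- sum(cumprod(obs_eig > threshold))
print(round(rbind(observed = obs_eig, random_95 = threshold), 2))
print(n_retain)
```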

Item-level changes were subsequently evaluated using paired-samples t-tests comparing pre- and post-course responses for each RSQ item. To account for multiple comparisons, p-values were adjusted using the Benjamini-Hochberg false discovery rate procedure.
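
The item-level procedure described here can be sketched as a loop of paired t-tests followed by p.adjust(). The data below are simulated, and the item names and gain sizes are hypothetical placeholders.

```r
set.seed(7)
n <- 70
items     <- paste0("RSQ", 1:5)          # hypothetical subset of items
true_gain <- c(0.6, 0.4, 0.0, 0.3, 0.0)  # simulated mean pre-post change

# One paired-samples t-test per item
p_raw <- sapply(seq_along(items), function(i) {
  pre  <- rnorm(n, mean = 2.5, sd = 0.8)
  post <- pre + rnorm(n, mean = true_gain[i], sd = 0.9)
  t.test(post, pre, paired = TRUE)$p.value
})
names(p_raw) <- items

# Benjamini-Hochberg false discovery rate adjustment
p_bh <- p.adjust(p_raw, method = "BH")
print(round(cbind(p_raw, p_bh), 4))
```

BH-adjusted p-values are never smaller than the raw values, so an item that is borderline before adjustment will not become significant after it.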

Results:

Exploratory factor analysis of the modified RSQ suggested a multidimensional structure. Parallel analysis indicated that up to five factors could be extracted; however, examination of factor loadings revealed that the largest cluster of items loaded on a dominant factor representing broad research competence. This factor included confidence in conducting literature searches, critically reading scientific literature, interpreting data, presenting results, communicating scientific rationale, and engaging in evidence-based scientific discussion. Confidence in performing statistical analyses (RSQ7) loaded instead on the second factor (0.75). Smaller factors were associated with collaboration and aspects of research identity, including scientific writing and confidence in functioning as an undergraduate research assistant.

The dominant research competence factor accounted for the largest proportion of explained variance (24% of total variance and 41% of the variance explained by the five factors) and was characterized by moderate to strong loadings (0.52 to 0.82) across multiple core research skills. Correlations among factors were generally low to moderate (r = -0.02 to 0.43), suggesting that while the dimensions were related, they represented distinct aspects of students’ perceptions of their research abilities.

Item-level analyses further supported this interpretation. Following correction for multiple comparisons, significant gains were observed in five RSQ items. The largest increase was observed for confidence in communicating the rationale for an experiment to others (RSQ10; Δ = 0.79, adjusted p < 0.001). Significant gains were also observed in confidence conducting background literature research (RSQ2; Δ = 0.41, adjusted p = 0.025), performing statistical analyses (RSQ7; Δ = 0.34, adjusted p = 0.050), presenting laboratory results to peers (RSQ9; Δ = 0.50, adjusted p = 0.025), and working collaboratively within a team (RSQ1; Δ = 0.34, adjusted p = 0.050). Notably, three of these five items (RSQ2, RSQ9, and RSQ10) loaded strongly on the dominant research competence factor identified by the exploratory factor analysis, whereas RSQ7 and RSQ1 loaded on separate factors.

Collectively, these findings suggest that increases in overall RSQ scores were not distributed uniformly across all research skills. Rather, gains were concentrated in competencies directly emphasized by the redesigned laboratory curriculum, particularly scientific communication, information literacy, quantitative reasoning, and collaborative research practices.

Discussion:

The concentration of gains within the research competence dimension suggests that the skills-based laboratory curriculum was particularly effective at developing students’ confidence in authentic research practices. Improvements were most evident in areas that students repeatedly practiced throughout the semester, including literature evaluation, statistical analysis, scientific communication, and collaborative problem-solving. These findings align with the goals of the redesign, which emphasized mastery of transferable research skills rather than memorization of disciplinary content.

step_wide <- dat_paired %>%
  select(StudentID, Timepoint, STEP_Total) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = STEP_Total,
    names_prefix = "STEP_"
  ) %>%
  mutate(
    STEP_Gain = STEP_Post - STEP_Pre
  )

paired_wide <- paired_wide %>%
  left_join(
    step_wide %>% select(StudentID, STEP_Pre, STEP_Post, STEP_Gain),
    by = "StudentID"
  )

rsq_wide <- dat_paired %>%
  select(StudentID, Timepoint, RSQ_Total) %>%
  pivot_wider(
    names_from = Timepoint,
    values_from = RSQ_Total,
    names_prefix = "RSQ_"
  ) %>%
  mutate(
    RSQ_Gain = RSQ_Post - RSQ_Pre
  )

paired_wide <- paired_wide %>%
  left_join(
    rsq_wide %>%
      select(StudentID, RSQ_Pre, RSQ_Post, RSQ_Gain),
    by = "StudentID"
  )

# cor.test() drops incomplete pairs itself; `use` is an argument of cor(), not cor.test()
cor.test(
  paired_wide$EDCI_Gain.x,
  paired_wide$STEP_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$EDCI_Gain.x and paired_wide$STEP_Gain.x
## t = -0.40404, df = 71, p-value = 0.6874
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2749330  0.1842018
## sample estimates:
##        cor 
## -0.0478952

cor.test(
  paired_wide$RSQ_Gain.x,
  paired_wide$STEP_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$RSQ_Gain.x and paired_wide$STEP_Gain.x
## t = 1.1024, df = 68, p-value = 0.2742
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1057588  0.3563857
## sample estimates:
##       cor 
## 0.1325081

cor.test(
  paired_wide$RSQ_Competence_Gain.x,
  paired_wide$STEP_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$RSQ_Competence_Gain.x and paired_wide$STEP_Gain.x
## t = 0.87614, df = 68, p-value = 0.384
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1326132  0.3323756
## sample estimates:
##       cor 
## 0.1056531

cor.test(
  paired_wide$RSQ_Competence_Gain.x,
  paired_wide$RSQ_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$RSQ_Competence_Gain.x and paired_wide$RSQ_Gain.x
## t = 11.868, df = 68, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7264389 0.8853605
## sample estimates:
##       cor 
## 0.8212331

Comparison	r	p	Interpretation
EDCI Gain vs STEP Gain	-0.05	0.687	No relationship
RSQ Gain vs STEP Gain	0.13	0.274	Small positive association, not significant
RSQ Competence Gain vs STEP Gain	0.11	0.384	Small positive association, not significant
RSQ Competence Gain vs RSQ Gain	0.82	< 0.001	Largely overlapping measures

Students who improved in experimental design knowledge were not necessarily the same students who improved in science attitudes/identity (STEP-U): EDCI gains and STEP-U gains were uncorrelated (r = -0.05, p = 0.69). Gains in research competence showed only a small, non-significant positive association with gains in STEP-U scores (r = 0.11, p = 0.38), so these data do not establish that improvements in confidence performing research-related tasks were accompanied by broader changes in students’ perceptions of science and scientific work.
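One way to keep a summary table in step with the test output is to build it directly from the `cor.test()` objects instead of typing values by hand. The sketch below uses simulated gain scores; the data frame `gains` and the helper `cor_row()` are hypothetical names introduced for illustration and are not part of the analysis above.

```r
# Hypothetical gain scores standing in for columns of paired_wide
set.seed(42)
gains <- data.frame(
  EDCI_Gain = rnorm(75),
  RSQ_Gain  = rnorm(75),
  STEP_Gain = rnorm(75)
)
gains$RSQ_Gain[1:3] <- NA  # cor.test() drops incomplete pairs on its own

# Run cor.test() for one pair and keep the numbers a summary table needs
cor_row <- function(data, x, y) {
  ct <- cor.test(data[[x]], data[[y]])
  data.frame(
    comparison = paste(x, "vs", y),
    r  = unname(ct$estimate),
    df = unname(ct$parameter),
    p  = ct$p.value
  )
}

pairs <- list(
  c("EDCI_Gain", "STEP_Gain"),
  c("RSQ_Gain", "STEP_Gain"),
  c("EDCI_Gain", "RSQ_Gain")
)
cor_table <- do.call(rbind, lapply(pairs, function(p) cor_row(gains, p[1], p[2])))
cor_table$r <- round(cor_table$r, 2)
cor_table$p <- round(cor_table$p, 3)
cor_table
```

Keeping the degrees of freedom in the table also makes it visible when different comparisons rest on different numbers of students.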

Relationship between constructs

cor.test(
  paired_wide$EDCI_Gain.x,
  paired_wide$RSQ_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$EDCI_Gain.x and paired_wide$RSQ_Gain.x
## t = -0.58973, df = 68, p-value = 0.5573
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3012573  0.1664308
## sample estimates:
##         cor 
## -0.07133278

cor.test(
  paired_wide$EDCI_Gain.x,
  paired_wide$STEP_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$EDCI_Gain.x and paired_wide$STEP_Gain.x
## t = -0.40404, df = 71, p-value = 0.6874
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.2749330  0.1842018
## sample estimates:
##        cor 
## -0.0478952

cor.test(
  paired_wide$RSQ_Gain.x,
  paired_wide$STEP_Gain.x
)

## 
##  Pearson's product-moment correlation
## 
## data:  paired_wide$RSQ_Gain.x and paired_wide$STEP_Gain.x
## t = 1.1024, df = 68, p-value = 0.2742
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1057588  0.3563857
## sample estimates:
##       cor 
## 0.1325081

Figures (Publication)

# ============================================================
# SETUP
# ============================================================

library(dplyr)
library(tidyr)
library(ggplot2)
library(broom)
library(patchwork)
library(stringr)
library(scales)

## 
## Attaching package: 'scales'

## The following objects are masked from 'package:psych':
## 
##     alpha, rescale

## The following object is masked from 'package:purrr':
## 
##     discard

## The following object is masked from 'package:readr':
## 
##     col_factor

library(forcats)

theme_pub <- function(base_size = 13) {
  theme_classic(base_size = base_size) +
    theme(
      strip.background = element_rect(fill = "white", color = "black"),
      strip.text = element_text(face = "bold"),
      legend.position = "bottom"
    )
}

fig1_long <- paired_wide %>%
  select(
    StudentID,
    EDCI_Total_Score_Pre, EDCI_Total_Score_Post,
    RSQ_Total_Pre, RSQ_Total_Post,
    STEP_Total_Pre, STEP_Total_Post
  ) %>%
  pivot_longer(
    -StudentID,
    names_to = c("Outcome", "Time"),
    names_pattern = "(.*)_(Pre|Post)",
    values_to = "Score"
  ) %>%
  mutate(
    Outcome = recode(
      Outcome,
      "EDCI_Total_Score" = "Experimental Design\nCompetency",
      "RSQ_Total" = "Research\nSelf-Efficacy",
      "STEP_Total" = "Science\nPerceptions"
    ),
    Time = factor(Time, levels = c("Pre", "Post"))
  )

fig1 <- ggplot(fig1_long, aes(Time, Score, group = StudentID)) +
  geom_line(alpha = 0.25) +
  geom_point(alpha = 0.45, size = 1.8) +
  stat_summary(aes(group = 1), fun = mean, geom = "line", linewidth = 1.2) +
  stat_summary(aes(group = 1), fun = mean, geom = "point", size = 3) +
  facet_wrap(~Outcome, scales = "free_y") +
  theme_pub() +
  labs(x = NULL, y = "Score")

fig1

## Warning: Removed 18 rows containing non-finite outside the scale range
## (`stat_summary()`).
## Removed 18 rows containing non-finite outside the scale range
## (`stat_summary()`).

## Warning: Removed 18 rows containing missing values or values outside the scale range
## (`geom_line()`).

## Warning: Removed 18 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("Paper1_Figure1_PrePost_Outcomes.png", fig1, width = 9, height = 4.5, dpi = 600)

paired_wide <- paired_wide %>%
  mutate(
    EDCI_Quartile = ntile(EDCI_Total_Score_Pre, 4),
    RSQ_Quartile = ntile(RSQ_Total_Pre, 4)
  )

fig2a <- ggplot(
  paired_wide,
  aes(x = factor(EDCI_Quartile), y = EDCI_Gain.x)
) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
  geom_boxplot(outlier.shape = NA, width = 0.6) +
  geom_jitter(width = 0.12, alpha = 0.55, size = 2) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  theme_pub() +
  labs(
    x = "Baseline EDCI Quartile",
    y = "EDCI Gain"
  )

fig2b <- ggplot(
  paired_wide,
  aes(x = factor(RSQ_Quartile), y = RSQ_Gain.x)
) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
  geom_boxplot(outlier.shape = NA, width = 0.6) +
  geom_jitter(width = 0.12, alpha = 0.55, size = 2) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  theme_pub() +
  labs(
    x = "Baseline RSQ Quartile",
    y = "RSQ Gain"
  )

fig2 <- fig2a + fig2b +
  plot_annotation(tag_levels = "A")

fig2

## Warning: Removed 5 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

## Warning: Removed 5 rows containing non-finite outside the scale range
## (`stat_summary()`).

## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).

## Warning: Removed 8 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

## Warning: Removed 8 rows containing non-finite outside the scale range
## (`stat_summary()`).

## Warning: Removed 8 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("Paper1_Figure2_Baseline_Quartile_Gains.png", fig2, width = 10.5, height = 4.5, dpi = 600)

ggsave("Paper1_Figure2_Baseline_Quartile_Gains.pdf", fig2, width = 10.5, height = 4.5)

model_edci_fig <- lm(
  EDCI_Gain.x ~ EDCI_Total_Score_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

model_rsq_fig <- lm(
  RSQ_Gain.x ~ RSQ_Total_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

model_step_fig <- lm(
  STEP_Gain.x ~ STEP_Total_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

coef_fig <- bind_rows(
  tidy(model_edci_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Experimental Design\nCompetency"),
  tidy(model_rsq_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Research\nSelf-Efficacy"),
  tidy(model_step_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Science\nPerceptions")
) %>%
  filter(term != "(Intercept)") %>%
  mutate(
    Predictor = recode(
      term,
      "EDCI_Total_Score_Pre" = "Initial Competency",
      "RSQ_Total_Pre" = "Initial Competency",
      "STEP_Total_Pre" = "Initial Competency",
      "Grit_Total" = "Grit",
      "Total.Pass" = "Mastery Attainment",
      "HS.GPA_IR" = "Academic Preparation"
    ),
    Predictor = factor(
      Predictor,
      levels = c(
        "Academic Preparation",
        "Mastery Attainment",
        "Grit",
        "Initial Competency"
      )
    )
  )

fig3 <- ggplot(
  coef_fig,
  aes(x = estimate, y = Predictor, color = Predictor)
) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray35") +
  # geom_errorbarh() is deprecated in ggplot2 4.0.0; geom_errorbar() with
  # orientation = "y" draws the same horizontal intervals (height becomes width)
  geom_errorbar(
    aes(xmin = conf.low, xmax = conf.high),
    orientation = "y",
    width = 0.2,
    linewidth = 0.8
  ) +
  geom_point(size = 3.2) +
  facet_wrap(~Outcome, scales = "free_x") +
  scale_color_manual(
    values = c(
      "Initial Competency" = "#4D4D4D",
      "Grit" = "#1F78B4",
      "Mastery Attainment" = "#33A02C",
      "Academic Preparation" = "#E31A1C"
    )
  ) +
  theme_pub() +
  labs(
    x = "Regression Coefficient",
    y = NULL,
    color = NULL
  )

fig3

ggsave("Paper1_Figure3_Predictors_of_Growth.png", fig3, width = 9, height = 4.8, dpi = 600)
ggsave("Paper1_Figure3_Predictors_of_Growth.pdf", fig3, width = 9, height = 4.8)

attempt_cols <- c(
  "Pipette.Pass",
  "Excel.Pass",
  "Microscope.Pass",
  "PCR.Pass",
  "Gel.Pass",
  "Sequencing.Pass"
)

fig4_long <- dat_full %>%
  select(StudentID, all_of(attempt_cols)) %>%
  distinct() %>%
  pivot_longer(
    cols = all_of(attempt_cols),
    names_to = "Skill",
    values_to = "Attempt"
  ) %>%
  filter(!is.na(Attempt)) %>%
  mutate(
    Skill = str_remove(Skill, "\\.Pass"),
    Skill = factor(
      Skill,
      levels = c("Pipette", "Excel", "Microscope", "PCR", "Gel", "Sequencing")
    ),
    Attempt = factor(
      as.character(Attempt),
      levels = c("First", "Second", "Third", "No Mastery")
    )
  )

fig4 <- ggplot(
  fig4_long,
  aes(x = Skill, fill = Attempt)
) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = percent) +
  theme_pub() +
  labs(
    x = "Skills Assessment",
    y = "Percent of Students",
    fill = "Mastery Attempt"
  ) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

fig4

ggsave("Paper1_Figure4_Mastery_Assessment_Outcomes.png", fig4, width = 7, height = 4.8, dpi = 600)
ggsave("Paper1_Figure4_Mastery_Assessment_Outcomes.pdf", fig4, width = 7, height = 4.8)

Secondary Paper

What are the major findings from the exploratory analysis? I am considering making this another small report.

I actually think there is enough in the exploratory analyses for a separate Brief Report, but only if we frame it around student characteristics and pathways to success in mastery-based undergraduate research courses, rather than around course effectiveness itself.

The main paper is essentially:

“The redesigned skills-based curriculum works.”

The exploratory paper would be:

“Why do some students benefit differently from a mastery-based research curriculum?”

Grit predicted experimental design gains

This is probably the centerpiece.

After controlling for baseline EDCI scores:

Higher grit predicted larger EDCI gains
GPA did not
ACT did not
Mastery attainment did not

This suggests psychological persistence may matter more than traditional academic preparation once students enter a structured research environment.

This is a genuinely publishable finding.

Perseverance predicts efficiency, not success

This is my favorite finding in the whole dataset.

Nearly everyone eventually masters the skills.

However:

Higher perseverance → fewer attempts needed
Higher perseverance → more first-pass successes

Thus:

perseverance affects the route to mastery rather than the probability of mastery.

That is a very elegant mastery-learning result.

I think reviewers would find this interesting.
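A relationship of this kind could be examined with a binomial model of first-attempt passes out of the six skills. The sketch below runs on simulated data and is only one reasonable choice; the source does not say which model produced the mastery efficiency result, and `perseverance`, `first_pass`, and the assumed effect size are hypothetical.

```r
# Simulated students (hypothetical values, not the study's data)
set.seed(7)
n <- 80
n_skills <- 6
perseverance <- runif(n, 1, 5)               # assumed 1-5 subscale score
p_first <- plogis(-1 + 0.4 * perseverance)   # assumed true relationship
first_pass <- rbinom(n, size = n_skills, prob = p_first)

# Successes and failures out of six skills, modeled against perseverance
fit <- glm(
  cbind(first_pass, n_skills - first_pass) ~ perseverance,
  family = binomial
)

summary(fit)$coefficients
exp(coef(fit)["perseverance"])  # odds ratio per one-point increase in perseverance
```

Modeling first-attempt passes separately from eventual mastery is what lets the analysis distinguish the route to mastery from its probability.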

Consistency of interests predicts learning differently by gender

You observed:

significant Consistency × Gender interaction for EDCI gain
trend-level Perseverance × Gender interaction for mastery efficiency

Interpretation:

Male students appear to benefit more from increases in consistency than female students.

I would be cautious because:

sample size is modest
exploratory interaction effects can be unstable

But it is definitely reportable.
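To judge how stable such an interaction is, it helps to compare the model with and without the interaction term and to look at the width of the interval around it. The sketch below uses simulated data with no true interaction; the variable names and values are hypothetical and do not come from the study.

```r
# Simulated data with no true interaction (hypothetical, not the study's)
set.seed(11)
n <- 72
d <- data.frame(
  EDCI_Pre    = rnorm(n, 15, 4),
  Consistency = runif(n, 1, 5),
  Gender      = factor(sample(c("Female", "Male"), n, replace = TRUE))
)
d$EDCI_Gain <- 8 - 0.5 * d$EDCI_Pre + rnorm(n, sd = 3)

# Nested models: main effects only versus main effects plus interaction
m_main <- lm(EDCI_Gain ~ EDCI_Pre + Consistency + Gender, data = d)
m_int  <- lm(EDCI_Gain ~ EDCI_Pre + Consistency * Gender, data = d)

anova(m_main, m_int)                          # F-test for the added interaction term
confint(m_int)["Consistency:GenderMale", ]    # interval for the interaction slope
```

A wide interval around the interaction coefficient is a direct indication of the instability noted above.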

Female students entered with higher grit

Specifically:

higher total grit
higher consistency

This is not groundbreaking alone.

However, combined with the interaction findings it becomes more interesting:

males and females may utilize grit dimensions differently within mastery-based research environments.

Collaboration predicts attitude change

Among LCAS dimensions:

Collaboration predicts STEP gains
Discovery does not
Iteration does not

This surprised me.

The CURE literature usually focuses on discovery and ownership.

Your data suggest:

social engagement may matter more than authentic discovery for developing positive scientific attitudes.

That is potentially publishable by itself.

Ownership did not predict outcomes

This is actually more interesting than it sounds.

Students differed dramatically in:

ownership
emotional investment
discovery
iteration

Yet:

EDCI gains were similar
RSQ gains were similar
STEP gains were similar
mastery outcomes were similar

The implication:

students may experience undergraduate research differently while achieving similar educational outcomes.

That challenges some assumptions in the CURE literature.

High-ownership and low-ownership students learn equally well

The cluster analysis is stronger than the raw ownership results.

You identified:

High-ownership researchers

Higher:

ownership
discovery
iteration
collaboration

Low-ownership researchers

Lower across all dimensions.

Yet educational outcomes remained similar.

This is potentially the most novel exploratory finding.

Figure 1

Conceptual model.

Outcome domains:

Learning (EDCI)
Self-efficacy (RSQ)
Attitudes (STEP)
Mastery efficiency

Predictors:

Grit
Ownership
LCAS

Forest plot of standardized betas.
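Standardized betas can be obtained by z-scoring the outcome and predictors before fitting, so each slope is expressed in standard deviation units. The sketch below uses simulated data; the column names and effect sizes are hypothetical, and with real data containing missing values the standardization should be done on the rows the model actually uses.

```r
# Simulated data (hypothetical values, not the study's)
set.seed(3)
n <- 70
d <- data.frame(
  Pre           = rnorm(n, 15, 4),
  Grit          = rnorm(n, 3.5, 0.6),
  Collaboration = rnorm(n, 4, 0.8)
)
d$Gain <- 6 - 0.4 * d$Pre + 1.0 * d$Grit + rnorm(n, sd = 2)

# z-score every column so coefficients are comparable across predictors
d_z <- as.data.frame(scale(d))
fit_z <- lm(Gain ~ Pre + Grit + Collaboration, data = d_z)

# Standardized slopes with 95% confidence intervals, intercept dropped
std_beta <- cbind(beta = coef(fit_z), confint(fit_z))[-1, ]
std_beta
```

Putting all predictors on the same scale is what makes a single shared x-axis in a forest plot meaningful.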

Figure 2

Mastery efficiency versus perseverance.

Two panels:

Attempts to mastery
First-pass successes

Figure 3

Ownership cluster comparison.

High vs low ownership.

Show:

POS
Discovery
Iteration
Collaboration

Then outcome measures beside them.

This visually demonstrates:

very different experiences, same outcomes.

Figure 4

Consistency × Gender interaction.

Only if the interaction remains significant after final model selection.

My ranking of the exploratory findings

Strongest:

1. Perseverance predicts mastery efficiency.
2. Collaboration predicts attitude gains.
3. Ownership clusters show different experiences but similar outcomes.

Moderate:

4. Grit predicts EDCI gains.
5. Ownership not predicting outcomes.

Weakest:

6. Female students have higher grit.
7. Gender interaction effects.

If you pursue a second paper, I would build it primarily around mastery efficiency and alternative pathways to success, because that angle is considerably more novel than another grit-and-performance paper.

model_edci_explor <- lm(
  EDCI_Gain.x ~
    EDCI_Total_Score_Pre +
    Grit_Total +
    POS_Cognitive +
    POS_Emotional +
    LCAS_Collaboration +
    LCAS_Discovery +
    LCAS_Iteration,
  data = paired_wide
)

model_rsq_explor <- lm(
  RSQ_Gain.x ~
    RSQ_Total_Pre +
    Grit_Total +
    POS_Cognitive +
    POS_Emotional +
    LCAS_Collaboration +
    LCAS_Discovery +
    LCAS_Iteration,
  data = paired_wide
)

model_step_explor <- lm(
  STEP_Gain.x ~
    STEP_Total_Pre +
    Grit_Total +
    POS_Cognitive +
    POS_Emotional +
    LCAS_Collaboration +
    LCAS_Discovery +
    LCAS_Iteration,
  data = paired_wide
)

coef_explor <- bind_rows(
  tidy(model_edci_explor, conf.int = TRUE) %>%
    mutate(Outcome = "Experimental Design\nCompetency"),
  tidy(model_rsq_explor, conf.int = TRUE) %>%
    mutate(Outcome = "Research\nSelf-Efficacy"),
  tidy(model_step_explor, conf.int = TRUE) %>%
    mutate(Outcome = "Science\nPerceptions")
) %>%
  filter(term != "(Intercept)") %>%
  mutate(
    Predictor = recode(
      term,
      "EDCI_Total_Score_Pre" = "Initial Score",
      "RSQ_Total_Pre" = "Initial Score",
      "STEP_Total_Pre" = "Initial Score",
      "Grit_Total" = "Grit",
      "POS_Cognitive" = "Cognitive Ownership",
      "POS_Emotional" = "Emotional Ownership",
      "LCAS_Collaboration" = "Collaboration",
      "LCAS_Discovery" = "Discovery/Relevance",
      "LCAS_Iteration" = "Iteration"
    ),
    Predictor = factor(
      Predictor,
      levels = c(
        "Iteration",
        "Discovery/Relevance",
        "Collaboration",
        "Emotional Ownership",
        "Cognitive Ownership",
        "Grit",
        "Initial Score"
      )
    )
  )

fig_p2_1 <- ggplot(
  coef_explor,
  aes(x = estimate, y = Predictor, color = Predictor)
) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray35") +
  # geom_errorbarh() is deprecated in ggplot2 4.0.0; use geom_errorbar() with orientation = "y"
  geom_errorbar(
    aes(xmin = conf.low, xmax = conf.high),
    orientation = "y",
    width = 0.2,
    linewidth = 0.8
  ) +
  geom_point(size = 3) +
  facet_wrap(~Outcome, scales = "free_x") +
  theme_pub() +
  labs(
    x = "Regression Coefficient",
    y = NULL,
    color = NULL
  ) +
  theme(legend.position = "none")

fig_p2_1

ggsave("Paper2_Figure1_Exploratory_Predictors.png", fig_p2_1, width = 10, height = 5.5, dpi = 600)
ggsave("Paper2_Figure1_Exploratory_Predictors.pdf", fig_p2_1, width = 10, height = 5.5)

attempt_map <- c(
  "First" = 1,
  "Second" = 2,
  "Third" = 3,
  "No Mastery" = 4
)

attempt_num <- dat_full %>%
  select(
    StudentID,
    Grit_Perseverance,
    all_of(attempt_cols)
  ) %>%
  distinct()

attempt_num[attempt_cols] <- lapply(
  attempt_num[attempt_cols],
  function(x) {
    as.numeric(attempt_map[as.character(x)])
  }
)

attempt_num <- attempt_num %>%
  mutate(
    Number_First_Pass = rowSums(
      across(all_of(attempt_cols), ~ .x == 1),
      na.rm = TRUE
    ),
    Mean_Attempt = rowMeans(
      across(all_of(attempt_cols)),
      na.rm = TRUE
    ),
    Perseverance_Quartile = ntile(Grit_Perseverance, 4)
  ) %>%
  filter(
    !is.na(Grit_Perseverance),
    !is.na(Number_First_Pass)
  )

fig_p2_2 <- ggplot(
  attempt_num,
  aes(
    x = factor(Perseverance_Quartile),
    y = Number_First_Pass
  )
) +
  geom_boxplot(outlier.shape = NA, width = 0.6) +
  geom_jitter(width = 0.12, alpha = 0.55, size = 2) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  theme_pub() +
  labs(
    x = "Perseverance of Effort Quartile",
    y = "Skills Mastered on First Attempt"
  )

fig_p2_2

ggsave("Paper2_Figure2_FirstPass_by_Perseverance.png", fig_p2_2, width = 6, height = 4.5, dpi = 600)
ggsave("Paper2_Figure2_FirstPass_by_Perseverance.pdf", fig_p2_2, width = 6, height = 4.5)

Figure 3

model_edci_fig <- lm(
  EDCI_Gain.x ~ EDCI_Total_Score_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

model_rsq_fig <- lm(
  RSQ_Gain.x ~ RSQ_Total_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

model_step_fig <- lm(
  STEP_Gain.x ~ STEP_Total_Pre + Grit_Total + Total.Pass + HS.GPA_IR,
  data = paired_wide
)

coef_fig <- bind_rows(
  tidy(model_edci_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Experimental Design\nCompetency"),

  tidy(model_rsq_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Research\nSelf-Efficacy"),

  tidy(model_step_fig, conf.int = TRUE) %>%
    mutate(Outcome = "Science\nPerceptions")
) %>%
  filter(term != "(Intercept)") %>%
  mutate(
    Predictor = recode(
      term,
      "EDCI_Total_Score_Pre" = "Initial Competency",
      "RSQ_Total_Pre" = "Initial Competency",
      "STEP_Total_Pre" = "Initial Competency",
      "Grit_Total" = "Grit",
      "Total.Pass" = "Mastery Attainment",
      "HS.GPA_IR" = "Academic Preparation"
    ),
    Predictor = factor(
      Predictor,
      levels = c(
        "Academic Preparation",
        "Mastery Attainment",
        "Grit",
        "Initial Competency"
      )
    )
  )

fig3 <- ggplot(
  coef_fig,
  aes(x = estimate, y = Predictor, color = Predictor)
) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray35") +
  # geom_errorbarh() is deprecated in ggplot2 4.0.0; use geom_errorbar() with orientation = "y"
  geom_errorbar(
    aes(xmin = conf.low, xmax = conf.high),
    orientation = "y",
    width = 0.2,
    linewidth = 0.8
  ) +
  geom_point(size = 3.2) +
  facet_wrap(~Outcome, scales = "free_x") +
  scale_color_manual(
    values = c(
      "Initial Competency" = "#4D4D4D",
      "Grit" = "#1F78B4",
      "Mastery Attainment" = "#33A02C",
      "Academic Preparation" = "#E31A1C"
    )
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = "Regression Coefficient",
    y = NULL,
    color = NULL
  ) +
  theme(
    legend.position = "bottom",
    strip.background = element_rect(fill = "white", color = "black"),
    strip.text = element_text(face = "bold")
  )

fig3

ggsave("Paper1_Figure3_Predictors_of_Growth.png", fig3, width = 9, height = 4.8, dpi = 600)

Figure 4

library(ggalluvial)


attempt_cols <- c(
  "Pipette.Pass",
  "Excel.Pass",
  "Microscope.Pass",
  "PCR.Pass",
  "Gel.Pass",
  "Sequencing.Pass"
)

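# Use one row per student so duplicated pre/post rows are not double-counted
attempt_long <- dat_full %>%
  select(StudentID, all_of(attempt_cols)) %>%
  distinct() %>%
  pivot_longer(
    cols = all_of(attempt_cols),
    names_to = "Skill",
    values_to = "Attempt"
  ) %>%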
  filter(!is.na(Attempt)) %>%
  mutate(
    Skill = str_remove(Skill, "\\.Pass"),
    Skill = factor(
      Skill,
      levels = c("Pipette", "Excel", "Microscope", "PCR", "Gel", "Sequencing")
    ),
    Attempt = factor(
      Attempt,
      levels = c("First", "Second", "Third", "No Mastery")
    )
  )

fig4a <- ggplot(
  attempt_long,
  aes(x = Skill, fill = Attempt)
) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = percent) +
  theme_classic(base_size = 13) +
  labs(
    x = "Skills Assessment",
    y = "Percent of Student-Skill Assessments",
    fill = "Final Mastery Attempt"
  ) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

# Use one row per student to avoid duplicated pre/post rows
mastery_wide <- dat_full %>%
  select(StudentID, all_of(attempt_cols)) %>%
  distinct()

# One row per student-skill assessment
mastery_long <- mastery_wide %>%
  pivot_longer(
    cols = all_of(attempt_cols),
    names_to = "Skill",
    values_to = "Final_Attempt"
  ) %>%
  filter(!is.na(Final_Attempt)) %>%
  mutate(
    Skill = str_remove(Skill, "\\.Pass"),
    Final_Attempt = as.character(Final_Attempt)
  )

# Create pathways where flow stops once students pass
mastery_flow <- mastery_long %>%
  mutate(
    Attempt_1 = case_when(
      Final_Attempt == "First" ~ "Pass",
      Final_Attempt %in% c("Second", "Third", "No Mastery") ~ "Fail",
      TRUE ~ NA_character_
    ),

    Attempt_2 = case_when(
      Final_Attempt == "First" ~ NA_character_,
      Final_Attempt == "Second" ~ "Pass",
      Final_Attempt %in% c("Third", "No Mastery") ~ "Fail",
      TRUE ~ NA_character_
    ),

    Attempt_3 = case_when(
      Final_Attempt %in% c("First", "Second") ~ NA_character_,
      Final_Attempt == "Third" ~ "Pass",
      Final_Attempt == "No Mastery" ~ "No Mastery",
      TRUE ~ NA_character_
    )
  )

# Convert to alluvial-ready long format
flow_long <- mastery_flow %>%
  mutate(
    Pathway = row_number()
  ) %>%
  select(Pathway, Attempt_1, Attempt_2, Attempt_3) %>%
  pivot_longer(
    cols = starts_with("Attempt"),
    names_to = "Attempt",
    values_to = "Outcome"
  ) %>%
  filter(!is.na(Outcome)) %>%
  mutate(
    Attempt = recode(
      Attempt,
      "Attempt_1" = "Attempt 1",
      "Attempt_2" = "Attempt 2",
      "Attempt_3" = "Attempt 3"
    ),
    Attempt = factor(
      Attempt,
      levels = c("Attempt 1", "Attempt 2", "Attempt 3")
    ),
    Outcome = factor(
      Outcome,
      levels = c("Pass", "Fail", "No Mastery")
    )
  )

# Summarize counts for checking
flow_counts <- flow_long %>%
  count(Attempt, Outcome) %>%
  group_by(Attempt) %>%
  mutate(
    Percent = n / sum(n)
  )

flow_counts

## # A tibble: 6 × 4
## # Groups:   Attempt [3]
##   Attempt   Outcome        n Percent
##   <fct>     <fct>      <int>   <dbl>
## 1 Attempt 1 Pass         408   0.570
## 2 Attempt 1 Fail         308   0.430
## 3 Attempt 2 Pass         236   0.766
## 4 Attempt 2 Fail          72   0.234
## 5 Attempt 3 Pass          63   0.875
## 6 Attempt 3 No Mastery     9   0.125
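The per-attempt percentages above are conditional on reaching that attempt, so they do not show overall mastery directly. A short calculation from the printed counts gives the cumulative rate: 707 of 716 student-skill assessments (about 98.7%) ended in mastery by the third attempt. The variable names below are introduced only for this illustration.

```r
# Counts copied from the flow_counts output above
passes <- c(attempt1 = 408, attempt2 = 236, attempt3 = 63)
no_mastery <- 9
total <- sum(passes) + no_mastery   # all student-skill assessments

# Share of all assessments mastered by the end of each attempt
cumulative_pass <- cumsum(passes) / total
round(cumulative_pass, 3)

# Share never mastered
round(no_mastery / total, 3)
```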

# Plot
fig4b <- ggplot(
  flow_long,
  aes(
    x = Attempt,
    stratum = Outcome,
    alluvium = Pathway,
    y = 1,
    fill = Outcome,
    label = Outcome
  )
) +
  geom_flow(
    stat = "alluvium",
    lode.guidance = "frontback",
    alpha = 0.75
  ) +
  geom_stratum(
    width = 0.22,
    color = "black"
  ) +
  geom_text(
    stat = "stratum",
    aes(
      label = after_stat(
        paste0(
          stratum,
          "\n",
          n,
          " (",
          percent(n / tapply(n, x, sum)[as.character(x)], accuracy = 1),
          ")"
        )
      )
    ),
    size = 3.3
  ) +
  scale_fill_manual(
    values = c(
      "Pass" = "#33A02C",
      "Fail" = "#E31A1C",
      "No Mastery" = "#6A3D9A"
    )
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = NULL,
    y = "Number of student-skill attempts",
    fill = "Outcome",
    title = "Mastery progression across skills assessments"
  ) +
  theme(
    legend.position = "bottom",
    axis.text.x = element_text(face = "bold", size = 12),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank()
  )

fig4c <- paired_wide %>%
  filter(!is.na(Total.Pass)) %>%
  ggplot(
    aes(x = Total.Pass)
  ) +
  geom_bar(
    width = 0.7,
    fill = "gray55",
    color = "black"
  ) +
  scale_x_continuous(
    breaks = 0:6,
    limits = c(-0.5, 6.5)
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = "Total Skills Mastered",
    y = "Number of Students"
  )

library(patchwork)

fig4 <- (fig4c | fig4a) /
        fig4b +
  plot_layout(heights = c(1, 1.4)) +
  plot_annotation(tag_levels = "A")

fig4

ggsave("Paper1_Figure4_Mastery_Outcomes.png", fig4, width = 12, height = 4.8, dpi = 600)

Figure 5

coef_explor <- coef_explor %>%
  mutate(
    Significant = ifelse(p.value < 0.05, "Significant", "Not significant")
  )

fig5 <- ggplot(
  coef_explor,
  aes(
    x = estimate,
    y = Predictor
  )
) +
  geom_vline(
    xintercept = 0,
    linetype = "dashed",
    color = "gray35"
  ) +
  # geom_errorbarh() is deprecated in ggplot2 4.0.0; use geom_errorbar() with orientation = "y"
  geom_errorbar(
    aes(
      xmin = conf.low,
      xmax = conf.high,
      color = Significant
    ),
    orientation = "y",
    width = 0.2,
    linewidth = 0.8
  ) +
  geom_point(
    aes(color = Significant),
    size = 3.2
  ) +
  facet_wrap(
    ~Outcome,
    scales = "free_x"
  ) +
  scale_color_manual(
    values = c(
      "Significant" = "black",
      "Not significant" = "gray65"
    )
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = "Regression Coefficient",
    y = NULL,
    color = NULL
  ) +
  theme(
    legend.position = "bottom",
    strip.background = element_rect(fill = "white", color = "black"),
    strip.text = element_text(face = "bold")
  )

fig5

ggsave("Paper2_Figure1_Exploratory_Predictors.png", fig5, width = 10, height = 5.5, dpi = 600)

fig6 <- ggplot(
  attempt_num,
  aes(
    x = factor(Perseverance_Quartile),
    y = Number_First_Pass
  )
) +
  geom_boxplot(
    outlier.shape = NA,
    width = 0.6
  ) +
  geom_jitter(
    width = 0.12,
    alpha = 0.55,
    size = 2
  ) +
  stat_summary(
    fun = mean,
    geom = "point",
    size = 3
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = "Perseverance of Effort Quartile",
    y = "Skills Mastered on First Attempt"
  )

fig6

ggsave("Paper2_Figure2_FirstPass_by_Perseverance.png", fig6, width = 6, height = 4.5, dpi = 600)

cluster_profile <- paired_wide %>%
  filter(!is.na(Cluster)) %>%
  group_by(Cluster) %>%
  summarise(
    Grit = mean(Grit_Total, na.rm = TRUE),
    Cognitive_Ownership = mean(POS_Cognitive, na.rm = TRUE),
    Emotional_Ownership = mean(POS_Emotional, na.rm = TRUE),
    Collaboration = mean(LCAS_Collaboration, na.rm = TRUE),
    Discovery = mean(LCAS_Discovery, na.rm = TRUE),
    Iteration = mean(LCAS_Iteration, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_longer(
    -Cluster,
    names_to = "Construct",
    values_to = "Mean"
  ) %>%
  mutate(
    Cluster = paste("Cluster", Cluster),
    Construct = recode(
      Construct,
      "Cognitive_Ownership" = "Cognitive Ownership",
      "Emotional_Ownership" = "Emotional Ownership"
    ),
    Construct = factor(
      Construct,
      levels = c(
        "Grit",
        "Cognitive Ownership",
        "Emotional Ownership",
        "Collaboration",
        "Discovery",
        "Iteration"
      )
    )
  )

fig7 <- ggplot(
  cluster_profile,
  aes(
    x = Construct,
    y = Mean,
    fill = Cluster
  )
) +
  geom_col(
    position = position_dodge(width = 0.75),
    width = 0.65
  ) +
  theme_classic(base_size = 13) +
  labs(
    x = NULL,
    y = "Mean Construct Score",
    fill = NULL
  ) +
  theme(
    axis.text.x = element_text(angle = 35, hjust = 1),
    legend.position = "bottom"
  )

fig7

ggsave("Paper2_Figure3_Cluster_Profiles.png", fig7, width = 8, height = 4.8, dpi = 600)

knitr::knit_exit()

POBRebootAnalysis

Abby Beatty

2026-06-01

Summary of Major Findings

1. Primary Findings

1.1 Students demonstrated significant gains in research skills

1.2 Students demonstrated gains in experimental design competency (EDCI)

Evidence

1.3 Baseline performance was the strongest predictor of learning gains

EDCI

RSQ

STEP

2. Grit Findings

2.1 Grit predicted experimental design competency outcomes

2.2 Different grit dimensions appear related to different educational outcomes

Consistency of Interests

Perseverance of Effort

2.3 Female students entered the course with higher grit

Total Grit

Consistency of Interests

3. Gender Interactions

3.1 Consistency of Interests interacted with gender when predicting EDCI gains

Significant interaction

3.2 Similar gender trends appeared for mastery efficiency

Trend-level interaction

4. Mastery Learning Findings

4.1 Ultimate mastery attainment was nearly universal

4.2 Mastery attainment was not predicted by measured student characteristics

Academic preparation

Psychological variables

Research ownership

CURE experiences

Cluster membership

4.3 Perseverance predicted mastery efficiency

Mean number of attempts required

Number of first-attempt passes

5. LCAS Findings

5.1 Collaboration predicted attitudes toward science (STEP)

Collaboration

5.2 Discovery and Iteration did not predict learning gains

6. Project Ownership Findings

6.1 Project ownership did not predict learning gains

POS Cognitive Ownership

POS Emotional Ownership

6.2 Students differed in ownership experiences

Cluster 1: High-Ownership Researchers

Cluster 2: Low-Ownership Researchers

6.3 Ownership profiles did not differ in outcomes

Overall Take-Home Message

Construct Cleaning & Summarizing

Merge in Skills Assessment Data

Course Evals

Mastery Interactions

EDCI

RSQ

High Low Performers

More Grit

Project Ownership Survey

Cluster

More exploration of mastery

Summary to this point:

Latent Structure

Relationship between constructs

Figures (Publication)

Secondary Paper