The general descriptive statistics for all numeric data are summarized below.
Table 1. Descriptive statistics of all numeric unfiltered data
To address our research questions (listed below), we separated the data by gender and participation in the PBL course.
[RQ1] What is the potential relationship between gender and the sense of belonging in undergraduate engineering students participating in a project-based learning (PBL) engineering course? (QUAN/qual)
[RQ2] How do female-identifying undergraduate students in engineering (FUSE) perceive their sense of belonging in engineering after participation in a PBL engineering course? (QUAL/quan)
[RQ3] How does the sense of belonging in engineering disciplines of FUSE who participated in a PBL engineering course compare to that of FUSE who did not participate in a PBL engineering course? (MIXED)
PBL Participation by Gender (with Totals)

| Gender | Did Not Participate | Participated in PBL | Total Respondents |
|---|---|---|---|
| Female | 29 | 11 | 40 |
| Male | 84 | 26 | 110 |
| Total | 113 | 37 | 150 |
Table 3. Sense of Belonging Gender Comparison
Table 3. Sense of Belonging Gender and PBL Participation Comparison
To determine which types of statistical testing are necessary to analyze our quantitative data, we checked for normality and reliability of Likert-scale data (1 - “Strongly Disagree” to 6 - “Strongly Agree”).
We ran a Shapiro-Wilk normality test on each numeric column of data and found that all of the p-values fell below 0.05. We therefore rejected the null hypothesis of normality and used non-parametric tests (e.g., the Mann-Whitney U test) for our statistical analysis.
We computed Cronbach’s alpha to assess the internal consistency, or reliability, of the Likert-scale survey items. This measure indicates how consistently a set of survey items measures the same concept. A high alpha (closer to 1.0) indicates strong internal consistency. We found extremely high alpha levels among our tested survey items, indicating that the items are highly correlated with one another. In other words, respondents answered the items within each scale consistently.
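The reliability calculation described above can be sketched as follows. This is a minimal illustration of the standard Cronbach’s alpha formula, not the code used in the study; the function name `cronbach_alpha` and the sample responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items in the scale
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses on the 1-6 Likert scale (NOT study data)
demo_rows = [
    [5, 6, 5],
    [2, 3, 2],
    [4, 4, 5],
    [6, 6, 6],
    [3, 2, 3],
]
demo_alpha = cronbach_alpha(demo_rows)
print(round(demo_alpha, 3))
```

Alpha approaches 1.0 when the items rise and fall together across respondents, which is the pattern the high values above describe.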
To check for significant differences in student responses between gender groups (RQ1), we used Mann-Whitney U (also known as Wilcoxon rank-sum) tests, which do not assume normally distributed data.
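For readers unfamiliar with the test, the U statistic itself can be computed from pooled ranks. The sketch below is a standard-library illustration under the usual convention that tied values share their average rank; the helper names `midranks` and `mann_whitney_u` are hypothetical, and the p-value calculation is omitted.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic for group x: number of pairs with x > y, ties counted as 0.5."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical Likert responses (NOT study data)
print(mann_whitney_u([3, 4, 4], [4, 5]))
```

The two U values for a pair of groups always sum to the product of the group sizes, so reporting either one is sufficient.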
The results of our non-parametric tests suggest that most comparisons between gender groups did not show statistically significant differences in student sense of belonging in engineering. A few variables related to Engineering Identity (EI) (including id_en_sa_1, id_en_sa_2, and id_en_sa_comp, defined below) had p-values less than 0.05, indicating significant differences among gender groups in student responses. Additionally, some measures were found to approach statistical significance (including id_en_sa_3, id_en_perf_11, id_en_perf_12, and id_en_comp, defined below) but did not reach the 0.05 p-value threshold.
We used Cliff’s Delta, a non-parametric measure of effect size, to quantify the difference between gender groups. Cliff’s Delta is the probability that a randomly selected value from one group (e.g., male) is greater than a randomly selected value from the other group (e.g., female), minus the probability that it is less. It provides a practical interpretation of numeric data beyond testing statistical significance, and it remains applicable when the two groups of interest have unequal sample sizes (male = 110 respondents, female = 40 respondents).
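The definition above translates directly into a pairwise count. The sketch below is an illustration, not the study’s code: `cliffs_delta` and `delta_from_u` are hypothetical names, and the conversion from U assumes that U counts the pairs in which the first group exceeds the second (ties as 0.5) and that no responses are missing for the item.

```python
def cliffs_delta(x, y):
    """P(x > y) - P(x < y), estimated over all cross-group pairs."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def delta_from_u(u, n_x, n_y):
    """Cliff's delta from a U statistic that counts x > y pairs (ties as 0.5)."""
    return 2 * u / (n_x * n_y) - 1


# Hypothetical Likert responses (NOT study data)
group_a = [3, 4, 4, 5, 2]
group_b = [5, 6, 4, 5]
demo_delta = cliffs_delta(group_a, group_b)

# U = 1711 as reported for id_en_sa_1, with group sizes 40 and 110
# (assumption: complete responses for this item)
reported_delta = delta_from_u(1711, 40, 110)
print(round(demo_delta, 3), round(reported_delta, 3))
```

Under those assumptions, the conversion reproduces the delta of -0.222 reported for id_en_sa_1.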
The findings from the Mann-Whitney U tests for gender vs. Engineering Identity measures, together with Cronbach’s alpha, are summarized in Table 4 below.
Statistically Significant Gender Differences (p < 0.05):
- EI Self-Awareness - id_en_sa_1: “I feel like an engineer now”
  - Test statistic (U) = 1711, p-value = 0.033
  - Effect size (Cliff’s Delta) = -0.222 (small effect)
- EI Self-Awareness - id_en_sa_2: “I will feel like an engineer in the future”
  - Test statistic (U) = 1577, p-value = 0.005
  - Effect size (Cliff’s Delta) = -0.283 (small effect)
- EI Self-Awareness Composite Score - id_en_sa_comp: “Composite EI sub-construct self-awareness score”
  - Test statistic (U) = 1855, p-value = 0.011
  - Effect size (Cliff’s Delta) = -0.271 (small effect)
The Cliff’s Delta values for our statistically significant results indicate that the differences in the distributions between gender groups are small. For each result, the 95% confidence interval does not include 0, reinforcing that the effect, although small, is not negligible. Taken together, the small effect sizes and the confidence intervals that exclude 0 indicate differences between the gender groups that are statistically significant but small in practical terms.
Trends approaching statistically significant gender differences (p < 0.10):
- EI Self-awareness - id_en_sa_3: “I see myself as an engineer”
  - Test statistic (U) = 1760, p-value = 0.05286
  - Effect size (Cliff’s Delta) = -0.200 (small effect)
- EI Performance/Competence - id_en_perf_11: “I am confident that I can understand engineering in class”
  - Test statistic (U) = 1794.5, p-value = 0.06891
  - Effect size (Cliff’s Delta) = -0.184 (small effect)
- EI Performance/Competence - id_en_perf_12: “I am confident that I can understand engineering outside of class”
  - Test statistic (U) = 1762, p-value = 0.05155
  - Effect size (Cliff’s Delta) = -0.199 (small effect)
- EI Composite Score - id_en_comp: “Composite EI construct score”
  - Test statistic (U) = 1799.5, p-value = 0.08911
  - Effect size (Cliff’s Delta) = -0.182 (small effect)
The Cliff’s Delta values for our results approaching statistical significance indicate that the differences in the distributions between gender groups are small and only marginally significant. For each result, the 95% confidence intervals do not include 0, reinforcing that the effect, although small, is not negligible. Taken together, these results indicate that, while the statistical evidence for these tests is weak or borderline, the differences between the gender groups remain small but non-negligible in practical terms.
Table 4. Mann-Whitney U (Wilcoxon Rank Sum) test results
The violin plots (Fig. 1-7) displayed above show the distributions of the Likert-scale answers (1 - “Strongly Disagree” to 6 - “Strongly Agree”) for the responses that show statistically significant (p < 0.05) or near-significant (p < 0.10) differences between gender groups.
Because our data did not meet the statistical assumptions of normality and homogeneity, we used the non-parametric Kruskal-Wallis test to determine whether there are statistically significant interactions (p < 0.05) between gender (male or female) and PBL participation (participated or did not participate) within our data set (RQ2 & RQ3). More specifically, this test examined whether gender and PBL participation together relate to student sense of belonging in engineering.

Our findings from the Kruskal-Wallis test indicated that the interactions between gender and PBL participation were statistically significant (p-value < 0.05) for two variables related to Engineering Identity (EI) self-awareness, id_en_sa_2 and id_en_sa_comp, defined in the Mann-Whitney U test section above. We also found potential trends in interactions between gender and PBL participation (p-value < 0.10) in comm_engr_2 (p = 0.09035) and id_en_int_9 (p = 0.0846), defined below. To determine which specific groups within gender and PBL participation account for the statistical significance observed in the Kruskal-Wallis test, we conducted a post-hoc Dunn’s test, and we calculated an Eta-squared effect size to assess the practical significance of these findings.
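As an illustration of what the Kruskal-Wallis test computes, the sketch below derives the tie-corrected H statistic from pooled ranks using only the standard library. The function name `kruskal_wallis_h`, the group labels, and the sample responses are hypothetical; the p-value (from a chi-squared distribution with k - 1 degrees of freedom) is not computed here.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    # Midrank of each distinct value: tied values share their average rank
    counts = Counter(pooled)
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2
        start += t
    # H from the squared rank sum of each group
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction (undefined when every pooled value is identical)
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    return h / correction


# Hypothetical responses for the four gender-by-participation groups (NOT study data)
demo_groups = {
    "Female.No": [4, 5, 3, 4],
    "Female.Yes": [3, 2, 4],
    "Male.No": [5, 6, 5, 6, 4],
    "Male.Yes": [5, 4, 6],
}
print(round(kruskal_wallis_h(list(demo_groups.values())), 3))
```

The tie correction matters for Likert data, where many respondents share the same score.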
A post-hoc Dunn’s test was used after the Kruskal-Wallis test to identify which specific pairs of groups differ and therefore account for the statistically significant results observed in the Kruskal-Wallis test. In addition, we applied a Bonferroni correction to adjust the p-values for multiple comparisons, controlling for Type I (false positive) errors by accounting for the increased likelihood of finding statistically significant results (p-value < 0.05) as the number of comparisons grows. A potential limitation of the Bonferroni correction is that it is conservative: it can miss “true positive” findings, especially when many comparisons are being made.
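The Bonferroni adjustment itself is simple arithmetic: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The sketch below applies it to the six unadjusted Dunn’s test p-values reported in this section for id_en_sa_2; the function name `bonferroni` is a hypothetical helper, not the routine used in the study.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Unadjusted p-values as reported for id_en_sa_2 (rounded to three decimals)
unadjusted = {
    "Female.No - Female.Yes": 0.241,
    "Female.No - Male.No": 0.043,
    "Female.Yes - Male.No": 0.008,
    "Female.No - Male.Yes": 0.260,
    "Female.Yes - Male.Yes": 0.046,
    "Male.No - Male.Yes": 0.556,
}
adjusted = dict(zip(unadjusted, bonferroni(list(unadjusted.values()))))
for pair, p_adj in adjusted.items():
    print(f"{pair}: {p_adj:.3f}")
```

Because the inputs here are already rounded, the adjusted values can differ from the reported ones in the third decimal place.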
The findings from the Kruskal-Wallis test and the post-hoc Dunn’s test with a Bonferroni correction for gender and PBL participation vs. Engineering Identity measures, together with the Eta-squared calculations, are summarized in Table 5 below.
Statistically significant gender and participation interactions from the Kruskal-Wallis test (p < 0.05):
- EI Self-awareness - id_en_sa_2: “I will feel like an engineer in the future”
  - Test statistic (H) = 9.626, p-value = 0.02203
  - Effect size (Eta-squared) = 4.81 (LARGE effect)
  - The only significant result after Bonferroni correction is between Female.Yes and Male.No, where female students who participated in PBL had a significantly different score from male students who did not participate in PBL.
    - Female.No - Female.Yes: no sig diff (p-unadjusted = 0.241, p-adjusted = 1.00)
    - Female.No - Male.No: no sig diff (p-unadjusted = 0.043, p-adjusted = 0.257)
    - Female.Yes - Male.No: SIG DIFF (p-unadjusted = 0.008, p-adjusted = 0.048)
    - Female.No - Male.Yes: no sig diff (p-unadjusted = 0.260, p-adjusted = 1.00)
    - Female.Yes - Male.Yes: no sig diff (p-unadjusted = 0.046, p-adjusted = 0.273)
    - Male.No - Male.Yes: no sig diff (p-unadjusted = 0.556, p-adjusted = 1.00)
- EI Self-awareness - id_en_sa_comp: “Composite EI sub-construct self-awareness score”
  - Test statistic (H) = 7.8716, p-value = 0.04874
  - Effect size (Eta-squared) = 3.94 (LARGE effect)
  - The only marginally significant result (after the Bonferroni correction) is between Female.Yes and Male.No, where female students who participated in PBL seem to show a difference in scores compared to male students who did not participate, although this result is just above the usual threshold for significance.
    - Female.No - Female.Yes: no sig diff (p-unadjusted = 0.247, p-adjusted = 1.00)
    - Female.No - Male.No: no sig diff (p-unadjusted = 0.090, p-adjusted = 0.541)
    - Female.Yes - Male.No: marginal sig diff (p-unadjusted = 0.016, p-adjusted = 0.094)
    - Female.No - Male.Yes: no sig diff (p-unadjusted = 0.212, p-adjusted = 1.00)
    - Female.Yes - Male.Yes: no sig diff (p-unadjusted = 0.038, p-adjusted = 0.226)
    - Male.No - Male.Yes: no sig diff (p-unadjusted = 0.902, p-adjusted = 1.00)
Trends approaching statistically significant gender and participation interactions from the Kruskal-Wallis test (p < 0.10) (Note: we did not run the post-hoc Dunn’s test on variables with Kruskal-Wallis p-values greater than 0.05):
- Sense of Community - Engineering Dept. Level - comm_engr_2: “I am disliked by students in the [major] engineering department”
  - Test statistic (H) = 6.4826, p-value = 0.09035
  - Effect size (Eta-squared) = 3.24 (LARGE effect)
- EI Interest - id_en_int_9: “I enjoy learning engineering”
  - Test statistic (H) = 6.6319, p-value = 0.0846
  - Effect size (Eta-squared) = 3.32 (LARGE effect)
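For reference, Eta-squared for a Kruskal-Wallis test is conventionally reported on a 0 to 1 scale. The sketch below shows one common estimator based on H, assuming k = 4 groups and n = 150 respondents (the total in the participation table); items with missing responses would have a smaller n. The function names and the rule-of-thumb thresholds (0.01, 0.06, 0.14) are assumptions for illustration, not the study’s method. Values on this scale are not directly comparable to the figures listed above, which exceed 1 and may be expressed in different units.

```python
def eta_squared_h(h, k, n):
    """Eta-squared estimate from a Kruskal-Wallis H: (H - k + 1) / (n - k)."""
    return (h - k + 1) / (n - k)


def effect_label(eta_sq):
    """Rule-of-thumb label (assumed thresholds: 0.01, 0.06, 0.14)."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# H statistics as reported above; k and n are assumptions (see lead-in)
reported_h = {
    "id_en_sa_2": 9.626,
    "id_en_sa_comp": 7.8716,
    "comm_engr_2": 6.4826,
    "id_en_int_9": 6.6319,
}
for name, h in reported_h.items():
    eta_sq = eta_squared_h(h, k=4, n=150)
    print(f"{name}: eta-squared = {eta_sq:.3f} ({effect_label(eta_sq)})")
```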
Table 5. Kruskal-Wallis & post-hoc Dunn’s test results for Gender-Participation Interactions
The interaction plots displayed above in Figures 8 & 9 show the distributions of the Likert-scale answers (1 - “Strongly Disagree” to 6 - “Strongly Agree”) for our responses that are statistically significant (p < 0.05).