The tables above provide the distribution of respondents by sex, year level, and strand. There are 66 females and 34 males, and the strands are equally represented, with 25 respondents each.
Call:
lm(formula = `Topic Presentation` ~ `Language Proficiency`, data = Data)
Coefficients:
           (Intercept)  `Language Proficiency`
                 1.487                   0.528
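Read as an equation, the coefficients above give predicted Topic Presentation = 1.487 + 0.528 × Language Proficiency. A minimal sketch of that arithmetic in R follows; the helper name `predict_topic` is hypothetical, and the coefficients are the rounded values printed above rather than the full-precision fit.

```r
# Rounded coefficients copied from the lm() output above.
intercept <- 1.487
slope <- 0.528

# Hypothetical helper: predicted Topic Presentation for a given
# Language Proficiency score.
predict_topic <- function(language_proficiency) {
  intercept + slope * language_proficiency
}

predict_topic(3)  # 1.487 + 0.528 * 3 = 3.071
```

In other words, each additional point of language proficiency is associated with roughly half a point more in predicted topic presentation.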
From this, we may deduce that the data fail to satisfy two assumptions: linearity and homogeneity of variance.
The mean topic presentation scores for males and females are 3.147 and 3.076, respectively.
The graph above plots the data by sex (male and female).
The graph suggests a difference in topic presentation when respondents are grouped by sex. However, we still need to test whether this difference is statistically significant.
The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. Moreover, the points in the QQ plot do not follow the straight line, and the majority of them fall outside the confidence bands.
Shapiro-Wilk normality test
data: res_aov$residuals
W = 0.94742, p-value = 0.0005633
The Shapiro-Wilk p-value for the residuals (0.0005633) is less than the usual significance level of 0.05. Thus, we reject the null hypothesis that the residuals are normally distributed.
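The residual check above can be sketched as follows. This is only an illustration on simulated data: the data frame `sim`, its column names, and the assumed 2 to 4 score scale in steps of 0.2 are hypothetical stand-ins, not the study's responses.

```r
set.seed(1)  # simulated data only, not the study's responses

# Hypothetical stand-in for the survey data: 66 females and 34 males,
# with scores assumed to lie on a 2-4 scale in steps of 0.2.
sim <- data.frame(
  sex = factor(rep(c("Female", "Male"), times = c(66, 34))),
  score = sample(seq(2, 4, by = 0.2), size = 100, replace = TRUE)
)

# One-way model of score by sex, then Shapiro-Wilk on its residuals.
res_aov <- aov(score ~ sex, data = sim)
sw <- shapiro.test(res_aov$residuals)

sw$statistic     # the W statistic
sw$p.value < 0.05  # TRUE means normality of the residuals is rejected
```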
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.3517 0.5545
      98
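The p-value (0.5545) is greater than the 0.05 level of significance, so the homogeneity of variance assumption is met. Because the normality assumption is violated, however, the two groups are compared with the nonparametric Wilcoxon rank sum test.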
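To make the median-centred Levene test less of a black box, here is a minimal base R sketch of what it computes: a one-way ANOVA on the absolute deviations of each score from its group median. The output above comes from `leveneTest()` in the car package; the hand-rolled version below, the data frame `sim`, and its simulated scores are assumptions for illustration only.

```r
set.seed(2)  # simulated data only, not the study's responses

sim <- data.frame(
  sex = factor(rep(c("Female", "Male"), times = c(66, 34))),
  score = round(rnorm(100, mean = 3, sd = 0.45), 1)
)

# Median-centred Levene test by hand: absolute deviation of each
# score from its group median, followed by a one-way ANOVA.
group_median <- ave(sim$score, sim$sex, FUN = median)
sim$abs_dev <- abs(sim$score - group_median)
levene_tab <- anova(lm(abs_dev ~ sex, data = sim))

levene_tab$Df           # 1 and 98 for two groups and 100 respondents
levene_tab$`Pr(>F)`[1]  # compare with 0.05
```

A p-value above 0.05 here means there is no evidence that the two groups differ in spread.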
Wilcoxon rank sum test
data: a and b
W = 1291, p-value = 0.4559
alternative hypothesis: true location shift is not equal to 0
Since the p-value (0.4559) is greater than 0.05, we fail to reject the null hypothesis; that is, there is no significant difference in topic presentation when respondents are grouped by sex.
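A two-sample comparison of this kind can be run with base R's `wilcox.test()`. The sketch below uses simulated vectors; the names `a` and `b` mirror the output above, but which sex each one holds, and the group sizes of 34 and 66, are assumptions.

```r
set.seed(3)  # simulated data only, not the study's responses

# Hypothetical score vectors for the two groups (34 and 66 values).
a <- round(rnorm(34, mean = 3.1, sd = 0.45), 1)
b <- round(rnorm(66, mean = 3.1, sd = 0.45), 1)

# Two-sided Wilcoxon rank sum test; exact = FALSE requests the normal
# approximation, since tied scores rule out an exact p-value.
wt <- wilcox.test(a, b, alternative = "two.sided", exact = FALSE)

wt$statistic  # the W statistic
wt$p.value    # compare with 0.05
```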
The mean language proficiency scores for males and females are 3.206 and 2.979, respectively.
The graph above plots the data by sex (male and female).
The graph suggests a difference in language proficiency when respondents are grouped by sex. However, we still need to test whether this difference is statistically significant.
The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. Moreover, the points in the QQ plot follow the straight line only roughly, and the majority of them fall outside the confidence bands.
Shapiro-Wilk normality test
data: res_aov$residuals
W = 0.95943, p-value = 0.003663
The Shapiro-Wilk p-value for the residuals (0.003663) is less than the usual significance level of 0.05. Thus, we reject the null hypothesis that the residuals are normally distributed.
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1   0.491 0.4852
      98
The p-value (0.4852) is greater than the 0.05 level of significance, so the homogeneity of variance assumption is met.
Wilcoxon rank sum test
data: c and d
W = 1397.5, p-value = 0.02573
alternative hypothesis: true location shift is not equal to 0
Since the p-value (0.02573) is less than 0.05, we reject the null hypothesis; that is, there is a significant difference in language proficiency when respondents are grouped by sex.
Shapiro-Wilk normality test
data: Data$`Topic Presentation`
W = 0.92912, p-value = 4.436e-05
Since the p-value (4.436e-05) is less than 0.05, we reject the null hypothesis of normality; that is, we cannot assume that the topic presentation scores are normally distributed.
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  1.1061 0.3506
      96
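The p-value (0.3506) is greater than the 0.05 level of significance, so the homogeneity of variance assumption is met. Because normality cannot be assumed, the strands are compared with the nonparametric Kruskal-Wallis test.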
# A tibble: 4 × 11
Strand variable n min max median iqr mean sd se ci
<fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GAS Topic Presentat… 25 2 4 3 0.4 2.98 0.477 0.095 0.197
2 ABM Topic Presentat… 25 2 4 3 0.6 3.12 0.507 0.101 0.209
3 HUMSS Topic Presentat… 25 2.6 3.6 3 0.4 3.10 0.278 0.056 0.115
4 STEM Topic Presentat… 25 2.2 4 3 0.6 3.2 0.458 0.092 0.189
The mean topic presentation scores for GAS, ABM, HUMSS, and STEM are 2.976, 3.120, 3.104, and 3.200, respectively.
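The `se` and `ci` columns in the summary table appear consistent with the usual standard error of the mean and the half-width of a 95% t-based confidence interval. The short check below recomputes them for the GAS row from its printed `n` and `sd`; reading `ci` as a half-width is an assumption inferred from the numbers, not something the output states.

```r
# Values copied from the GAS row of the summary table above.
n <- 25
sd_gas <- 0.477

se <- sd_gas / sqrt(n)            # standard error of the mean
ci <- qt(0.975, df = n - 1) * se  # half-width of the 95% t interval

round(c(se = se, ci = ci), 3)  # se 0.095, ci 0.197
```

Under that reading, the 95% interval for the GAS mean is roughly 2.98 ± 0.197.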
# A tibble: 1 × 6
.y. n statistic df p method
* <chr> <int> <dbl> <int> <dbl> <chr>
1 Topic Presentation 100 4.87 3 0.182 Kruskal-Wallis
Since the p-value (0.182) is greater than 0.05, the Kruskal-Wallis test shows no significant difference in topic presentation among the four strands.
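For reference, a Kruskal-Wallis comparison across four groups can be sketched with base R's `kruskal.test()`. The tibble above has the shape of rstatix output; the base R call, the data frame `sim`, and its simulated scores are assumptions used only to show the mechanics.

```r
set.seed(4)  # simulated data only, not the study's responses

# Hypothetical stand-in: 25 respondents in each of the four strands.
sim <- data.frame(
  strand = factor(rep(c("GAS", "ABM", "HUMSS", "STEM"), each = 25),
                  levels = c("GAS", "ABM", "HUMSS", "STEM")),
  score = round(rnorm(100, mean = 3.1, sd = 0.45), 1)
)

kw <- kruskal.test(score ~ strand, data = sim)

kw$parameter  # degrees of freedom: 4 groups - 1 = 3
kw$p.value    # compare with 0.05
```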
# A tibble: 6 × 9
.y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
* <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
1 Topic Presentat… GAS ABM 25 25 1.43 0.153 0.919 ns
2 Topic Presentat… GAS HUMSS 25 25 1.58 0.114 0.683 ns
3 Topic Presentat… GAS STEM 25 25 2.11 0.0352 0.211 ns
4 Topic Presentat… ABM HUMSS 25 25 0.152 0.879 1 ns
5 Topic Presentat… ABM STEM 25 25 0.678 0.498 1 ns
6 Topic Presentat… HUMSS STEM 25 25 0.525 0.599 1 ns
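Before adjustment, the GAS and STEM comparison has p = 0.0352, which is below 0.05. After adjusting for multiple comparisons, however, its p-value is 0.211, so none of the pairwise differences is significant. This agrees with the Kruskal-Wallis result.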
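The `p.adj` column in the pairwise table is consistent with a Bonferroni adjustment of the `p` column, that is, each p-value multiplied by the six comparisons and capped at 1. The sketch below reproduces this with `p.adjust()`; the pair labels are added here for readability, and small differences from the printed `p.adj` values come from using the rounded `p` values.

```r
# Unadjusted p-values copied from the `p` column above, in row order.
p_raw <- c(0.153, 0.114, 0.0352, 0.879, 0.498, 0.599)
names(p_raw) <- c("GAS-ABM", "GAS-HUMSS", "GAS-STEM",
                  "ABM-HUMSS", "ABM-STEM", "HUMSS-STEM")

# Bonferroni: multiply by the number of comparisons (6), cap at 1.
p_adj <- p.adjust(p_raw, method = "bonferroni")

round(p_adj, 3)
any(p_adj < 0.05)  # FALSE: no pair is significant after adjustment
```

The GAS and STEM pair is the only one whose unadjusted p-value falls below 0.05, and its adjusted value of about 0.211 does not.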
Shapiro-Wilk normality test
data: Data$`Language Proficiency`
W = 0.92795, p-value = 3.813e-05
Since the p-value (3.813e-05) is less than 0.05, we reject the null hypothesis of normality; that is, we cannot assume that the language proficiency scores are normally distributed.
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  1.9735 0.1231
      96
The p-value (0.1231) is greater than the 0.05 level of significance, so the homogeneity of variance assumption is met.
# A tibble: 4 × 11
Strand variable n min max median iqr mean sd se ci
<fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GAS Language Profic… 25 2 3.6 3 0.2 2.95 0.333 0.067 0.137
2 ABM Language Profic… 25 2 4 3 0.8 3.07 0.55 0.11 0.227
3 HUMSS Language Profic… 25 2.6 4 3 0.2 3.16 0.374 0.075 0.154
4 STEM Language Profic… 25 2.2 4 3 0.2 3.04 0.44 0.088 0.181
The mean language proficiency scores for GAS, ABM, HUMSS, and STEM are 2.952, 3.072, 3.160, and 3.040, respectively.
# A tibble: 1 × 6
.y. n statistic df p method
* <chr> <int> <dbl> <int> <dbl> <chr>
1 Language Proficiency 100 2.58 3 0.462 Kruskal-Wallis
Since the p-value (0.462) is greater than 0.05, the Kruskal-Wallis test shows no significant difference in language proficiency among the four strands.
# A tibble: 6 × 9
.y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
* <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
1 Language Profici… GAS ABM 25 25 0.687 0.492 1 ns
2 Language Profici… GAS HUMSS 25 25 1.60 0.110 0.659 ns
3 Language Profici… GAS STEM 25 25 0.725 0.469 1 ns
4 Language Profici… ABM HUMSS 25 25 0.912 0.362 1 ns
5 Language Profici… ABM STEM 25 25 0.0379 0.970 1 ns
6 Language Profici… HUMSS STEM 25 25 -0.874 0.382 1 ns
Shapiro-Wilk normality test
data: Data1$Scores
W = 0.93032, p-value = 3.58e-08
Since the p-value (3.58e-08) is less than 0.05, we reject the null hypothesis of normality; that is, we cannot assume that the scores are normally distributed.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  0.0673 0.7956
      198
The p-value (0.7956) is greater than the 0.05 level of significance, so the homogeneity of variance assumption is met.
# A tibble: 2 × 11
Variables variable n min max median iqr mean sd se ci
<fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Topic Present… Scores 100 2 4 3 0.4 3.1 0.44 0.044 0.087
2 Language Prof… Scores 100 2 4 3 0.4 3.06 0.432 0.043 0.086
The mean scores for topic presentation and language proficiency are 3.100 and 3.056, respectively.
# A tibble: 1 × 6
.y. n statistic df p method
* <chr> <int> <dbl> <int> <dbl> <chr>
1 Scores 200 0.442 1 0.506 Kruskal-Wallis
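Since the p-value (0.506) is greater than 0.05, there is no significant difference between topic presentation and language proficiency.
Based on the outputs above, topic presentation has the higher mean score (3.100 versus 3.056), although this difference is not statistically significant.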