The tables above provide the distribution of respondents by sex, year level, and strand. There are 70 females and 50 males; of these 120 respondents, 24 are from ABM, 29 from GAS, 29 from HUMSS, and 38 from STEM.
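As a quick arithmetic check, the sex and strand breakdowns reported above should describe the same set of respondents. A minimal Python sketch (the dictionary names are illustrative and not part of the original analysis):

```python
# Counts as reported in the text above
sex_counts = {"Female": 70, "Male": 50}
strand_counts = {"ABM": 24, "GAS": 29, "HUMSS": 29, "STEM": 38}

n_by_sex = sum(sex_counts.values())
n_by_strand = sum(strand_counts.values())

# Both breakdowns should add up to the same number of respondents
print(n_by_sex, n_by_strand, n_by_sex == n_by_strand)
```

Both totals come to 120, which matches the residual degrees of freedom in the later tests (118 = 120 - 2 for the sex comparisons, 116 = 120 - 4 for the strand comparisons).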
Call:
lm(formula = `Availability of Reading Materials` ~ `Instructional Facilities` +
`Parental Involvement`, data = Data)
Coefficients:
               (Intercept)  `Instructional Facilities`      `Parental Involvement`
                   0.68197                     0.70948                     0.09139
From this, we may deduce that the data fail to satisfy two assumptions: linearity and homogeneity of variance.
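To make the reported coefficients concrete, the fitted equation from the lm() output can be written as a small function. This is only an illustrative Python sketch: the function name and the example scores are hypothetical, and because the assumptions noted above are not satisfied, any value it returns should be read with caution.

```python
# Coefficients copied from the lm() output above
INTERCEPT = 0.68197
B_FACILITIES = 0.70948
B_PARENTAL = 0.09139


def predict_availability(facilities: float, parental: float) -> float:
    """Fitted 'Availability of Reading Materials' score from the reported model."""
    return INTERCEPT + B_FACILITIES * facilities + B_PARENTAL * parental


# Hypothetical respondent who scores 3.0 on both predictors
fitted = predict_availability(3.0, 3.0)
print(round(fitted, 3))
```

The much larger coefficient on Instructional Facilities (0.70948 versus 0.09139) means that a one-point change in that score moves the fitted value far more than a one-point change in Parental Involvement.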
The means for males and females are 3.172 and 3.011, respectively.
The graph above plots the data by sex (male and female).
Warning: The following aesthetics were dropped during statistical transformation: fill
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
The plot suggests a difference in the availability of reading materials, in terms of its effect on reading comprehension, when respondents are grouped according to sex. However, we still need to test whether this difference is statistically significant.
The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. In contrast, the points in the Q-Q plot roughly follow the straight line, with the majority falling within the confidence bands. Because the two plots point in opposite directions, neither is conclusive on its own, so both should be considered together with a formal normality test.
Shapiro-Wilk normality test
data: res_aov$residuals
W = 0.97713, p-value = 0.0386
The Shapiro-Wilk p-value = 0.0386 on the residuals is less than the usual significance level of 0.05. Thus, we reject the hypothesis that residuals have a normal distribution.
Warning in leveneTest.default(y = y, group = group, ...): group coerced to
factor.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  0.6531 0.4206
      118
The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
Wilcoxon rank sum test
data: a and b
W = 2046, p-value = 0.1099
alternative hypothesis: true location shift is not equal to 0
Since the p-value is greater than 0.05, we fail to reject the null hypothesis; that is, there is no significant difference in the availability of reading materials when grouped according to sex.
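The sequence of decisions in this comparison (check normality of the residuals, pick a test, then compare its p-value with 0.05) can be sketched as follows. This is a hedged Python illustration of the logic the report appears to follow, not the original R code; the function names are hypothetical.

```python
ALPHA = 0.05  # significance level used throughout the report


def choose_two_group_test(shapiro_p: float, alpha: float = ALPHA) -> str:
    """Use a rank-based test when normality of the residuals is rejected."""
    if shapiro_p < alpha:
        return "Wilcoxon rank sum test"
    return "Welch two-sample t-test"


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Standard decision rule for a hypothesis test."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Values taken from the outputs above
print(choose_two_group_test(0.0386))  # Shapiro-Wilk p-value on the residuals
print(decide(0.1099))                 # Wilcoxon rank sum p-value
```

With a Shapiro-Wilk p-value of 0.0386 the sketch selects the Wilcoxon rank sum test, and the Wilcoxon p-value of 0.1099 leads to failing to reject the null hypothesis, in line with the conclusion above.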
The means for males and females are 3.108 and 3.009, respectively.
The graph above plots the data by sex (male and female).
The plot suggests a difference in instructional facilities, in terms of their effect on reading comprehension, when respondents are grouped according to sex. However, we still need to test whether this difference is statistically significant.
The histogram above resembles a bell curve, which suggests that the residuals are normally distributed. Moreover, the points in the Q-Q plot roughly follow the straight line, with the majority falling within the confidence bands.
Shapiro-Wilk normality test
data: res_aov$residuals
W = 0.98276, p-value = 0.1283
The Shapiro-Wilk p-value = 0.1283 on the residuals is greater than the usual significance level of 0.05. Thus, we fail to reject the hypothesis that residuals have a normal distribution.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  0.7184 0.3984
      118
The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
Welch Two Sample t-test
data: c and d
t = 1.3413, df = 103.37, p-value = 0.1828
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.03992289 0.20678003
sample estimates:
mean of x mean of y
3.112000 3.028571
Since the p-value is greater than 0.05, we fail to reject the null hypothesis; that is, there is no significant difference in instructional facilities, in terms of their effect on reading comprehension, when grouped according to sex.
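The figures in the Welch output above can be cross-checked against one another: the confidence interval should be centred on the difference in sample means, and it should contain zero whenever the two-sided p-value exceeds 0.05. A minimal Python sketch using only the printed numbers (the variable names are illustrative):

```python
# Values copied from the Welch two-sample t-test output above
mean_x, mean_y = 3.112000, 3.028571
ci_low, ci_high = -0.03992289, 0.20678003
t_stat = 1.3413

diff = mean_x - mean_y               # estimated difference in means
midpoint = (ci_low + ci_high) / 2    # centre of the 95% confidence interval
std_error = diff / t_stat            # standard error implied by t = diff / SE
contains_zero = ci_low <= 0.0 <= ci_high

print(round(diff, 4), round(midpoint, 4), round(std_error, 4), contains_zero)
```

The difference in means and the interval midpoint agree (about 0.0834), and the interval spans zero, which is consistent with the non-significant p-value of 0.1828.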
The means for males and females are 2.446 and 2.680, respectively.
The graph above plots the data by sex (male and female).
The plot suggests a difference in the impact of parental involvement when respondents are grouped according to sex. However, we still need to test whether this difference is statistically significant.
The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. In contrast, the points in the Q-Q plot roughly follow the straight line, with the majority falling within the confidence bands. Because the two plots point in opposite directions, neither is conclusive on its own, so both should be considered together with a formal normality test.
Shapiro-Wilk normality test
data: res_aov$residuals
W = 0.98829, p-value = 0.3943
The Shapiro-Wilk p-value = 0.3943 on the residuals is greater than the usual significance level of 0.05. Thus, we fail to reject the hypothesis that residuals have a normal distribution.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  0.0426 0.8369
      118
The p-value (0.8369) is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
Welch Two Sample t-test
data: e and f
t = 1.7805, df = 112.37, p-value = 0.0777
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.02459003 0.46064637
sample estimates:
mean of x mean of y
2.680000 2.461972
Since the p-value is greater than 0.05, we fail to reject the null hypothesis; that is, there is no significant difference in the effect of parental involvement on students' reading comprehension when grouped according to sex.
Shapiro-Wilk normality test
data: Data$`Availability of Reading Materials`
W = 0.96388, p-value = 0.002625
Since p-value = 0.002625 < 0.05, it is conclusive that we reject the null hypothesis. That is, we cannot assume normality.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value  Pr(>F)
group   3  2.4218 0.06948 .
      116
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
Warning in plot.xy(xy.coords(x, y), type = type, ...): "frame" is not a
graphical parameter
Warning in axis(1, at = 1:length(means), labels = legends, ...): "frame" is not
a graphical parameter
Warning in plot.xy(xy.coords(x, y), type = type, ...): "frame" is not a
graphical parameter
# A tibble: 4 × 11
  Strand variable             n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>            <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GAS    Availability of…    29   2     4      2.8   0.6  2.98 0.519 0.096 0.197
2 ABM    Availability of…    24   2.8   3.8    3.1   0.3  3.13 0.293 0.06  0.124
3 HUMSS  Availability of…    29   2.2   3.8    3     0.4  2.91 0.332 0.062 0.126
4 STEM   Availability of…    38   2.2   4      3.2   0.6  3.25 0.404 0.066 0.133
The means of GAS, ABM, HUMSS, and STEM are 2.979, 3.133, 2.910, and 3.247, respectively.
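The `se` column in the summary table above can be reproduced from the `sd` and `n` columns, assuming it is the usual standard error of the mean, sd / sqrt(n). That definition is an assumption here, although the printed values agree with it. A short Python sketch:

```python
import math

# (strand, n, sd, reported se) copied from the summary table above
rows = [
    ("GAS", 29, 0.519, 0.096),
    ("ABM", 24, 0.293, 0.060),
    ("HUMSS", 29, 0.332, 0.062),
    ("STEM", 38, 0.404, 0.066),
]


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)


computed = {strand: standard_error(sd, n) for strand, n, sd, _ in rows}
for strand, n, sd, reported in rows:
    print(strand, round(computed[strand], 3), reported)
```

Because `sd` is itself rounded to three decimals in the table, agreement is only expected to about the third decimal place.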
# A tibble: 1 × 6
  .y.                                   n statistic    df       p method
* <chr>                             <int>     <dbl> <int>   <dbl> <chr>
1 Availability of Reading Materials   120      15.5     3 0.00146 Kruskal-Wallis
Based on the p-value (0.00146 < 0.05), a significant difference was observed among the strands.
# A tibble: 6 × 9
  .y.           group1 group2    n1    n2 statistic       p   p.adj p.adj.signif
* <chr>         <chr>  <chr>  <int> <int>     <dbl>   <dbl>   <dbl> <chr>
1 Availability… GAS    ABM       29    24     1.75  8.04e-2 0.482   ns
2 Availability… GAS    HUMSS     29    29    -0.289 7.72e-1 1       ns
3 Availability… GAS    STEM      29    38     3.08  2.09e-3 0.0126  *
4 Availability… ABM    HUMSS     24    29    -2.02  4.30e-2 0.258   ns
5 Availability… ABM    STEM      24    38     1.06  2.90e-1 1       ns
6 Availability… HUMSS  STEM      29    38     3.38  7.13e-4 0.00428 **
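After adjustment for multiple comparisons, there is a significant difference between GAS and STEM (p.adj = 0.0126) and between HUMSS and STEM (p.adj = 0.00428). The difference between ABM and HUMSS is significant only before adjustment (p = 0.0430, p.adj = 0.258), so it is not counted as significant.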
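The `p.adj` column in the pairwise table above is consistent with a Bonferroni correction, in which each raw p-value is multiplied by the number of comparisons (6) and capped at 1. The adjustment method is not stated in the output, so treating it as Bonferroni is an assumption. A minimal Python sketch:

```python
# Raw p-values copied from the pairwise comparison table above
raw_p = {
    ("GAS", "ABM"): 8.04e-2,
    ("GAS", "HUMSS"): 7.72e-1,
    ("GAS", "STEM"): 2.09e-3,
    ("ABM", "HUMSS"): 4.30e-2,
    ("ABM", "STEM"): 2.90e-1,
    ("HUMSS", "STEM"): 7.13e-4,
}


def bonferroni(p_values: dict) -> dict:
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


adjusted = bonferroni(raw_p)
significant = [pair for pair, p in adjusted.items() if p < 0.05]

for pair, p in adjusted.items():
    print(pair, round(p, 5))
print("significant after adjustment:", significant)
```

Small discrepancies in the last digit (for example 0.01254 here versus 0.0126 in the table) are expected because the raw p-values are printed to only three significant figures.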
Shapiro-Wilk normality test
data: Data$`Instructional Facilities`
W = 0.96524, p-value = 0.003411
Since p-value = 0.003411 < 0.05, we reject the null hypothesis. That is, we cannot assume normality.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3  0.9743 0.4074
      116
The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
# A tibble: 4 × 11
  Strand variable             n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>            <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GAS    Instructional F…    29   2.2   3.8    3     0.6  3.03 0.397 0.074 0.151
2 ABM    Instructional F…    24   2.4   3.6    3.2   0.3  3.08 0.311 0.063 0.131
3 HUMSS  Instructional F…    29   2.4   3.6    3     0.4  2.97 0.328 0.061 0.125
4 STEM   Instructional F…    38   2.4   3.8    3.2   0.2  3.10 0.289 0.047 0.095
The means of GAS, ABM, HUMSS, and STEM are 3.034, 3.075, 2.972, and 3.105, respectively.
# A tibble: 1 × 6
  .y.                          n statistic    df     p method
* <chr>                    <int>     <dbl> <int> <dbl> <chr>
1 Instructional Facilities   120      2.88     3  0.41 Kruskal-Wallis
Based on the p-value (0.41 > 0.05), no significant difference was observed among the strands.
# A tibble: 6 × 9
  .y.               group1 group2    n1    n2 statistic     p p.adj p.adj.signif
* <chr>             <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>
1 Instructional Fa… GAS    ABM       29    24     0.629 0.530 1     ns
2 Instructional Fa… GAS    HUMSS     29    29    -0.680 0.496 1     ns
3 Instructional Fa… GAS    STEM      29    38     0.839 0.402 1     ns
4 Instructional Fa… ABM    HUMSS     24    29    -1.28  0.202 1     ns
5 Instructional Fa… ABM    STEM      24    38     0.128 0.898 1     ns
6 Instructional Fa… HUMSS  STEM      29    38     1.56  0.118 0.708 ns
Shapiro-Wilk normality test
data: Data$`Parental Involvement`
W = 0.97876, p-value = 0.05464
Since p-value = 0.05464 > 0.05, we fail to reject the null hypothesis. That is, we may assume normality.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3  0.6647 0.5754
      116
The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
# A tibble: 4 × 11
  Strand variable             n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>            <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GAS    Parental Involv…    29   1     4      2.4  1     2.58 0.766 0.142 0.291
2 ABM    Parental Involv…    24   1.2   3.6    2.8  0.9   2.74 0.65  0.133 0.274
3 HUMSS  Parental Involv…    29   1     3.8    2.6  0.6   2.46 0.61  0.113 0.232
4 STEM   Parental Involv…    38   1     3.8    2.4  0.75  2.45 0.645 0.105 0.212
The means of GAS, ABM, HUMSS, and STEM are 2.579, 2.742, 2.462, and 2.453, respectively.
             Df Sum Sq Mean Sq F value Pr(>F)
Strand        3   1.49  0.4952   1.105   0.35
Residuals   116  51.97  0.4480
Since p-value = 0.35 > 0.05, we fail to reject the null hypothesis, that is, there is no significant difference between the effects of parental involvement towards reading comprehension when grouped according to strand.
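The F value in the ANOVA table above is the ratio of the two mean squares, and each mean square is a sum of squares divided by its degrees of freedom. A short Python check using the printed (rounded) values; the variable names are illustrative:

```python
# Values copied from the ANOVA table above
df_strand, df_resid = 3, 116
ss_resid = 51.97
ms_strand, ms_resid = 0.4952, 0.4480

# Mean square = sum of squares / degrees of freedom
ms_resid_check = ss_resid / df_resid

# F statistic = between-group mean square / residual mean square
f_value = ms_strand / ms_resid

print(round(ms_resid_check, 4), round(f_value, 3))
```

An F value close to 1 indicates that the variation between strands is about the same size as the variation within strands, which is why the p-value (0.35) is far from significant.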
Shapiro-Wilk normality test
data: Data1$Scores
W = 0.95992, p-value = 2.402e-07
Since p-value = 2.402e-07 < 0.05, it is conclusive that we reject the null hypothesis. That is, we cannot assume normality.
Levene's Test for Homogeneity of Variance (center = median)
       Df F value    Pr(>F)
group   2  15.745 3.167e-07 ***
      297
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is less than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is not met.
# A tibble: 3 × 11
  Variables      variable     n   min   max median   iqr  mean    sd    se    ci
  <fct>          <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Availability … Scores     100   2     4      3    0.45  3.07 0.417 0.042 0.083
2 Instructional… Scores     100   2.2   3.8    3    0.4   3.05 0.339 0.034 0.067
3 Parental Invo… Scores     100   1     4      2.6  0.8   2.58 0.657 0.066 0.13
The means of Availability of Reading Materials, Instructional Facilities, and Parental Involvement are 3.072, 3.048, and 2.576, respectively.
# A tibble: 1 × 6
  .y.        n statistic    df        p method
* <chr>  <int>     <dbl> <int>    <dbl> <chr>
1 Scores   300      47.4     2 5.21e-11 Kruskal-Wallis
Based on the p-value (5.21e-11 < 0.05), a significant difference was observed among the three factors.
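The Kruskal-Wallis p-values reported in this document can be approximately reproduced from the printed statistics, since the test statistic is conventionally referred to a chi-square distribution with (number of groups - 1) degrees of freedom. The Python sketch below uses the closed-form upper-tail probabilities for 2 and 3 degrees of freedom only; the function name is hypothetical, and the agreement is approximate because the statistics are printed to three significant figures.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (closed forms for df 2 and 3)."""
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2))
                + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))
    raise ValueError("this sketch only handles df = 2 or df = 3")


# (label, statistic, df, reported p) copied from the Kruskal-Wallis outputs
results = [
    ("Availability by strand", 15.5, 3, 0.00146),
    ("Instructional Facilities by strand", 2.88, 3, 0.41),
    ("Scores by factor", 47.4, 2, 5.21e-11),
]

for label, stat, df, reported in results:
    print(label, chi2_upper_tail(stat, df), reported)
```

The larger the statistic relative to its degrees of freedom, the smaller the tail probability, which is why a statistic of 47.4 on 2 degrees of freedom gives such a small p-value.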
# A tibble: 3 × 9
  .y.    group1        group2    n1    n2 statistic       p   p.adj p.adj.signif
* <chr>  <chr>         <chr>  <int> <int>     <dbl>   <dbl>   <dbl> <chr>
1 Scores Availability… Instr…   100   100   -0.0978 9.22e-1 1   e+0 ns
2 Scores Availability… Paren…   100   100   -6.01   1.88e-9 5.64e-9 ****
3 Scores Instructiona… Paren…   100   100   -5.91   3.42e-9 1.03e-8 ****
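There is a significant difference between availability of reading materials and parental involvement, as well as between instructional facilities and parental involvement. The difference between availability of reading materials and instructional facilities is not significant.
Based on the output above, availability of reading materials has the highest mean score (3.072) and so appears to be the most influential factor affecting the reading comprehension of the students, although it does not differ significantly from instructional facilities (3.048).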