2. Is there a significant difference in the students’ reading comprehension (technology distractions and teacher support) when grouped according to:

2.1 Sex


Call:
lm(formula = `Technology Distractions` ~ `Teacher Support`, data = Data)

Coefficients:
      (Intercept)  `Teacher Support`  
           1.8618             0.2892

From this, we may deduce that the data fail to satisfy the two assumptions – Linearity and Homogeneity of Variance.

2.1.1 Sex and Technology Distractions

The means for male and female students are 2.654 and 2.738, respectively.

The graph above plots the technology distractions scores by sex (male and female).

The graph suggests a difference between male and female students in reading comprehension with respect to technology distractions. However, a plot alone cannot establish whether the difference is statistically significant, so we proceed with the following tests.

The histogram above resembles a bell curve, which suggests that the residuals are normally distributed. Moreover, the points in the QQ-plot roughly follow the straight line, with most of them falling within the confidence bands. This also indicates that the residuals are normally distributed.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.98076, p-value = 0.1524

The Shapiro-Wilk p-value = 0.1524 on the residuals is greater than the usual significance level of 0.05. Thus, we fail to reject the hypothesis that residuals have a normal distribution.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.5767 0.2122
      98

The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.

Two Sample T-test


    Welch Two Sample t-test

data:  a and b
t = -1.4304, df = 97.797, p-value = 0.1558
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2012475  0.0326578
sample estimates:
mean of x mean of y 
 2.654167  2.738462

Since the p-value (0.1558) is larger than 0.05, we fail to reject the null hypothesis. That is, there is no significant difference in the students’ reading comprehension with respect to technology distractions when grouped according to sex.

2.1.2 Sex and Teacher Support

# A tibble: 19 × 3
# Groups:   Sex [2]
   Sex    `Teacher Support` count
   <fct>              <dbl> <int>
 1 Female               2.2     1
 2 Female               2.4     3
 3 Female               2.6     6
 4 Female               2.8     8
 5 Female               3      24
 6 Female               3.2     3
 7 Female               3.4     4
 8 Female               3.6     2
 9 Female               3.8     1
10 Male                 1.4     1
11 Male                 1.8     1
12 Male                 2.2     1
13 Male                 2.4     3
14 Male                 2.6     9
15 Male                 2.8     9
16 Male                 3      17
17 Male                 3.2     4
18 Male                 3.4     1
19 Male                 3.6     2

The means for male and female students are 2.825 and 2.954, respectively.

The graph above plots the teacher support scores by sex (male and female).

The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. The points in the QQ-plot roughly follow the straight line, with most of them falling within the confidence bands, but this does not guarantee normality because the histogram indicates the opposite. Since the two plots disagree, a formal normality test is needed.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.94019, p-value = 0.0001979

The Shapiro-Wilk p-value = 0.0001979 on the residuals is less than the usual significance level of 0.05. Thus, we reject the hypothesis that residuals have a normal distribution.
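The residual check above can be sketched in base R. This is a minimal sketch, assuming the Teacher Support scores can be rebuilt from the Sex-by-score frequency table in this subsection; the data frame `scores` and its column names are hypothetical, and only `res_aov` is a name taken from the printed output.

```r
# Rebuild the Teacher Support scores from the frequency table above
# (score values and their counts, per sex).
female <- rep(c(2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8),
              times = c(1, 3, 6, 8, 24, 3, 4, 2, 1))
male   <- rep(c(1.4, 1.8, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6),
              times = c(1, 1, 1, 3, 9, 9, 17, 4, 1, 2))

# `scores` is a hypothetical name for the rebuilt data set.
scores <- data.frame(
  sex = factor(rep(c("Female", "Male"),
                   times = c(length(female), length(male)))),
  teacher_support = c(female, male)
)

# Fit the one-way model, then test its residuals for normality.
res_aov <- aov(teacher_support ~ sex, data = scores)
sw <- shapiro.test(residuals(res_aov))
print(sw)
```

A small p-value here points to a rank-based test such as the Wilcoxon rank sum test instead of a t-test.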

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.6773 0.1983
      98

The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.
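To make the median-centred Levene statistic less of a black box, here is a sketch that computes it by hand in base R: take each score's absolute deviation from its own group median, then run a one-way ANOVA on those deviations. It assumes the scores can be rebuilt from the frequency table in this subsection; all variable names are hypothetical.

```r
# Teacher Support scores rebuilt from the Sex-by-score frequency table.
female <- rep(c(2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8),
              times = c(1, 3, 6, 8, 24, 3, 4, 2, 1))
male   <- rep(c(1.4, 1.8, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6),
              times = c(1, 1, 1, 3, 9, 9, 17, 4, 1, 2))

score <- c(female, male)
sex   <- factor(rep(c("Female", "Male"),
                    times = c(length(female), length(male))))

# Absolute deviation of every score from the median of its own group.
abs_dev <- abs(score - ave(score, sex, FUN = median))

# Levene's test (center = median) is a one-way ANOVA on those deviations.
levene   <- anova(lm(abs_dev ~ sex))
levene_F <- levene[["F value"]][1]
levene_p <- levene[["Pr(>F)"]][1]
print(levene)
```

Centring on the median rather than the mean makes the test less sensitive to skewed scores, which matters here because the residuals failed the normality test.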

Wilcoxon Rank Sum Test


    Wilcoxon rank sum test with continuity correction

data:  c and d
W = 1028, p-value = 0.1148
alternative hypothesis: true location shift is not equal to 0

Since the p-value (0.1148) is greater than 0.05, we fail to reject the null hypothesis. Hence, there is no significant difference in the students’ reading comprehension with respect to teacher support when grouped according to sex.
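The rank sum test can be sketched as follows. This is an illustrative sketch, assuming the two samples can be rebuilt from the frequency table in this subsection and that the male scores were passed as the first sample; the variable names are hypothetical.

```r
# Teacher Support scores rebuilt from the Sex-by-score frequency table.
female <- rep(c(2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8),
              times = c(1, 3, 6, 8, 24, 3, 4, 2, 1))
male   <- rep(c(1.4, 1.8, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6),
              times = c(1, 1, 1, 3, 9, 9, 17, 4, 1, 2))

# The scores contain many ties, so use the normal approximation
# with continuity correction rather than the exact distribution.
wt <- wilcox.test(male, female, exact = FALSE, correct = TRUE)
print(wt)
```

W counts the male-female pairs in which the male score is higher, with ties counted as one half. Swapping the argument order changes W but not the two-sided p-value.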

2.2 Frequency of reading

2.2.1 Frequency of reading and Technology Distractions

# A tibble: 18 × 3
# Groups:   Frequency of reading [4]
   `Frequency of reading` `Technology Distractions` count
   <fct>                                      <dbl> <int>
 1 Always                                       1.8     1
 2 Always                                       2.4     6
 3 Always                                       2.6    14
 4 Always                                       2.8    11
 5 Always                                       3       8
 6 Always                                       3.2     2
 7 Always                                       3.4     1
 8 Always                                       3.6     1
 9 Never                                        2.6     1
10 Seldom                                       2.4     1
11 Seldom                                       2.8     1
12 Sometimes                                    2       2
13 Sometimes                                    2.2     5
14 Sometimes                                    2.4     6
15 Sometimes                                    2.6    16
16 Sometimes                                    2.8    11
17 Sometimes                                    3      11
18 Sometimes                                    3.2     2

The means for always, never, seldom, and sometimes are 2.745, 2.600, 2.600, and 2.664, respectively.

The graph above plots the technology distractions scores by frequency of reading.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.9818, p-value = 0.183

The Shapiro-Wilk p-value = 0.183 on the residuals is greater than the usual significance level of 0.05. Thus, we fail to reject the hypothesis that residuals have a normal distribution.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  0.4548 0.7145
      96

The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.

One-way ANOVA

                       Df Sum Sq Mean Sq F value Pr(>F)
`Frequency of reading`  3  0.086 0.02874   0.222  0.881
Residuals              96 12.427 0.12945

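Since p-value = 0.881 > 0.05, we fail to reject the null hypothesis. That is, reading comprehension with respect to technology distractions does not differ when grouped according to the frequency of reading. Note, however, that the sums of squares in the table above (0.086 + 12.427 = 12.513) match the Teacher Support scores. The frequency table for Technology Distractions implies a total sum of squares of about 8.76, so this ANOVA should be re-run with Technology Distractions as the response before the conclusion is relied on.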
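As a cross-check on the ANOVA table, the model can be refit from the frequency table in this subsection. This is a minimal sketch, assuming the Technology Distractions scores can be rebuilt from those counts; the data frame `reading` and its column names are hypothetical.

```r
# Technology Distractions scores rebuilt from the frequency table above.
always    <- rep(c(1.8, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6),
                 times = c(1, 6, 14, 11, 8, 2, 1, 1))
never     <- 2.6
seldom    <- c(2.4, 2.8)
sometimes <- rep(c(2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2),
                 times = c(2, 5, 6, 16, 11, 11, 2))

# `reading` is a hypothetical name for the rebuilt data set.
reading <- data.frame(
  frequency = factor(rep(c("Always", "Never", "Seldom", "Sometimes"),
                         times = c(length(always), length(never),
                                   length(seldom), length(sometimes)))),
  tech_distractions = c(always, never, seldom, sometimes)
)

# One-way ANOVA with Technology Distractions as the response.
fit <- aov(tech_distractions ~ frequency, data = reading)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Between-group and residual sums of squares, for comparison with the report.
ss <- anova_table[["Sum Sq"]]
```

With only one respondent in the Never group and two in the Seldom group, the between-group comparison is driven almost entirely by Always versus Sometimes.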

2.2.2 Frequency of reading and Teacher Support

# A tibble: 22 × 3
# Groups:   Frequency of reading [4]
   `Frequency of reading` `Teacher Support` count
   <fct>                              <dbl> <int>
 1 Always                               1.4     1
 2 Always                               1.8     1
 3 Always                               2.2     1
 4 Always                               2.4     4
 5 Always                               2.6     4
 6 Always                               2.8     8
 7 Always                               3      15
 8 Always                               3.2     2
 9 Always                               3.4     4
10 Always                               3.6     3
# ℹ 12 more rows

The means for always, never, seldom, and sometimes are 2.895, 2.600, 2.900, and 2.894, respectively.

The above graph shows the plotting of data by frequency of reading.

The graph suggests a difference in the students’ reading comprehension with respect to teacher support when grouped according to their frequency of reading. However, a plot alone does not establish that the difference is significant.

The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. Moreover, the points in the QQ-plot do not follow the straight line, with most of them falling outside the confidence bands. This also indicates that the residuals are not normally distributed.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.90319, p-value = 2.003e-06

The Shapiro-Wilk p-value = 2.003e-06 on the residuals is less than the usual significance level of 0.05. Thus, we reject the hypothesis that residuals have a normal distribution.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value  Pr(>F)  
group  3  2.2337 0.08923 .
      96                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.

Kruskal-Wallis Test

# A tibble: 4 × 11
  `Frequency of reading` variable        n   min   max median   iqr  mean     sd
  <fct>                  <fct>       <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>
1 Always                 Teacher Su…    44   1.4   3.8    3    0.25  2.90  0.457
2 Sometimes              Teacher Su…    53   2.2   3.6    3    0.2   2.89  0.256
3 Seldom                 Teacher Su…     2   2.8   3      2.9  0.1   2.9   0.141
4 Never                  Teacher Su…     1   2.6   2.6    2.6  0     2.6  NA    
# ℹ 2 more variables: se <dbl>, ci <dbl>

# A tibble: 1 × 6
  .y.                 n statistic    df     p method        
* <chr>           <int>     <dbl> <int> <dbl> <chr>         
1 Teacher Support   100      1.42     3   0.7 Kruskal-Wallis

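Since the p-value (0.7) is greater than 0.05, we fail to reject the null hypothesis. That is, there is no significant difference in the students’ reading comprehension with respect to teacher support when grouped according to frequency of reading.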

3. Is there a significant difference in the reading comprehension of the students between the two factors, technology distractions and teacher support? Which of the two has the more significant impact on students’ reading comprehension?

Normality Test


    Shapiro-Wilk normality test

data:  Data1$`Scores in terms of reading comprehension`
W = 0.94273, p-value = 3.992e-07

Since p-value = 3.992e-07 < 0.05, we reject the null hypothesis of normality. That is, we cannot assume that the scores are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  0.0802 0.7774
      198

The p-value is greater than the 0.05 level of significance. Thus, the homogeneity assumption of the variance is met.


# A tibble: 2 × 11
  Variables      variable     n   min   max median   iqr  mean    sd    se    ci
  <fct>          <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Technology Di… Scores …   100   1.8   3.6    2.6  0.25  2.70 0.297 0.03  0.059
2 Teacher Suppo… Scores …   100   1.4   3.8    3    0.25  2.89 0.356 0.036 0.071

The mean of technology distractions and teacher support is 2.698 and 2.892, respectively.

Kruskal-Wallis Test

# A tibble: 1 × 6
  .y.                                          n statistic    df        p method
* <chr>                                    <int>     <dbl> <int>    <dbl> <chr> 
1 Scores in terms of reading comprehension   200      21.5     1  3.49e-6 Krusk…

Based on the p-value (3.49e-6 < 0.05), there is a significant difference between the two factors.

Lastly, teacher support has the higher mean score (2.892 versus 2.698), so it is the factor with the greater impact on the students’ reading comprehension.
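The comparison of the two factors can be sketched in base R. This is a sketch under one assumption: that the 100 scores per factor can be rebuilt by pooling the frequency tables printed in sections 2.1.2 and 2.2.1. The column name `score` is hypothetical, while `Data1` and `Variables` follow the names visible in the output.

```r
# Technology Distractions scores, pooled over the frequency-of-reading groups.
tech <- rep(c(1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6),
            times = c(1, 2, 5, 13, 31, 23, 19, 4, 1, 1))

# Teacher Support scores, pooled over both sexes.
support <- rep(c(1.4, 1.8, 2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8),
               times = c(1, 1, 2, 6, 15, 17, 41, 7, 5, 4, 1))

# Long format: one row per score, labelled by the factor it belongs to.
Data1 <- data.frame(
  Variables = factor(rep(c("Technology Distractions", "Teacher Support"),
                         times = c(length(tech), length(support)))),
  score = c(tech, support)
)

# Rank-based test of whether the two score distributions differ.
kw <- kruskal.test(score ~ Variables, data = Data1)
print(kw)

# Group means, to see which factor scores higher.
group_means <- tapply(Data1$score, Data1$Variables, mean)
print(round(group_means, 3))
```

The test only establishes that the two distributions differ; the direction comes from comparing the group means or medians.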

VANESSA LUTCHA GROUP STATISTICAL ANALYSIS

Kyle Kenneth Ruaya

2024-02-13

Data

1. What is the demographic profile of the respondents in terms of:

Sex

Strand

Frequency of Reading
