Data

1. What is the demographic profile of the respondents in terms of:



Sex

Strand

The tables above provide the distribution of respondents by sex and strand. Of the 100 respondents, 65 are female and 35 are male; 22 are from ABM, 23 from GAS, 33 from HUMSS, and 22 from STEM.
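
The counts in the profile above can be converted to percentages with a few lines of code. The sketch below is in Python purely for illustration (the analysis itself was run in R), and the helper name `percentages` is hypothetical.

```python
# Counts copied from the demographic profile above.
sex_counts = {"Female": 65, "Male": 35}
strand_counts = {"ABM": 22, "GAS": 23, "HUMSS": 33, "STEM": 22}

def percentages(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}

sex_pct = percentages(sex_counts)
strand_pct = percentages(strand_counts)

# Both breakdowns should describe the same set of respondents.
assert sum(sex_counts.values()) == sum(strand_counts.values())
print(sex_pct)
print(strand_pct)
```

Because the sample has exactly 100 respondents, each percentage equals its count.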

2. Is there a significant difference in the variables personal problems, mental and psychological aspect, and family problem when grouped according to:

2.1 Sex


Call:
lm(formula = `Mental and Psychological Aspect` ~ `Personal Problems` + 
    `Family Problem`, data = Data)

Coefficients:
        (Intercept)  `Personal Problems`     `Family Problem`  
             0.6409               0.5185               0.2394  

From this, we may deduce that the data fail to satisfy the two assumptions – Linearity and Homogeneity of Variance.
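
As a reading aid for the coefficients above, the fitted model can be written out and evaluated by hand. The sketch below is in Python rather than R, the function name `predict_mental` is hypothetical, and the inputs are the overall mean scores reported later in this report (2.860 for Personal Problems and 2.732 for Family Problem).

```python
# Coefficients copied from the lm() output above.
INTERCEPT = 0.6409
B_PERSONAL = 0.5185
B_FAMILY = 0.2394

def predict_mental(personal, family):
    """Fitted Mental and Psychological Aspect score for given predictor scores."""
    return INTERCEPT + B_PERSONAL * personal + B_FAMILY * family

# Evaluate the fitted equation at the overall sample means of the predictors.
fitted_at_means = predict_mental(2.860, 2.732)
print(round(fitted_at_means, 3))
```

A least-squares fit with an intercept passes through the point of means, so the result should land close to the reported mean of Mental and Psychological Aspect (2.778).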

2.1.1 Sex and Personal Problems


The mean Personal Problems scores for females and males are 2.877 and 2.829, respectively.

The graph above plots the data by sex (female and male).


The plot suggests little or no difference in Personal Problems between the sexes. However, a formal test is still needed to determine whether any difference is statistically significant.


The histogram above does not quite resemble a bell curve, which suggests that the residuals may not be normally distributed. In the QQ-plot, however, the points roughly follow the straight line, with the majority falling within the confidence bands. Because the two plots give conflicting impressions, neither settles the question alone, so both should be examined together with a formal normality test.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.96913, p-value = 0.01897

The Shapiro-Wilk p-value of 0.01897 for the residuals is less than the usual significance level of 0.05. Thus, we reject the null hypothesis that the residuals are normally distributed, which is why the nonparametric Wilcoxon rank sum test is used below.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.0161 0.8992
      98               

The p-value (0.8992) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.

Wilcoxon Rank Sum Test


    Wilcoxon rank sum test

data:  a and b
W = 1028, p-value = 0.4213
alternative hypothesis: true location shift is not equal to 0

Since the p-value (0.4213) is larger than 0.05, we fail to reject the null hypothesis; that is, there is no significant difference in Personal Problems when respondents are grouped according to sex.
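
For a sense of where the p-value above comes from, the rank sum statistic W can be converted to an approximate two-sided p-value with the normal approximation. The sketch below is in Python rather than R, and the function name is hypothetical. It assumes the group sizes are the 65 females and 35 males from the demographic profile, and it omits the tie correction that statistical software applies, so it only approximates the reported 0.4213.

```python
import math

def rank_sum_p_value(w, n1, n2):
    """Approximate two-sided p-value for a rank sum statistic W.

    Normal approximation with a continuity correction and NO tie
    correction, so it differs slightly from software output when the
    data contain ties.
    """
    mean_w = n1 * n2 / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = max(0.0, abs(w - mean_w) - 0.5) / sd_w
    # erfc(z / sqrt(2)) is the two-sided tail area of the standard normal.
    return math.erfc(z / math.sqrt(2))

# W from the output above; group sizes assumed from the demographic profile.
p_approx = rank_sum_p_value(1028, 65, 35)
print(round(p_approx, 3))
```

The approximation gives a p-value of about 0.43, close to the reported value and leading to the same decision. The same function can be applied to the W statistics reported for the other two variables.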

2.1.2 Sex and Mental and Psychological Aspect


The mean Mental and Psychological Aspect scores for females and males are 2.846 and 2.651, respectively.

The graph above plots the data by sex (female and male).


The plot suggests a difference in Mental and Psychological Aspect between the sexes. However, a formal test is still needed to determine whether this difference is statistically significant.

The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. In the QQ-plot, however, the points roughly follow the straight line, with the majority falling within the confidence bands. Because the two plots give conflicting impressions, neither settles the question alone, so both should be examined together with a formal normality test.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.96178, p-value = 0.005394

The Shapiro-Wilk p-value of 0.005394 for the residuals is less than the usual significance level of 0.05. Thus, we reject the null hypothesis that the residuals are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.0631 0.8023
      98               

The p-value (0.8023) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.

Wilcoxon Rank Sum Test


    Wilcoxon rank sum test

data:  c and d
W = 836.5, p-value = 0.02765
alternative hypothesis: true location shift is not equal to 0

Since the p-value (0.02765) is less than 0.05, we reject the null hypothesis; that is, there is a significant difference in Mental and Psychological Aspect when respondents are grouped according to sex.

2.1.3 Sex and Family Problem


The mean Family Problem scores for females and males are 2.803 and 2.600, respectively.

The graph above plots the data by sex (female and male).


The plot suggests a difference in Family Problem between the sexes. However, a formal test is still needed to determine whether this difference is statistically significant.

The histogram above does not resemble a bell curve, which suggests that the residuals are not normally distributed. In the QQ-plot, however, the points roughly follow the straight line, with the majority falling within the confidence bands. Because the two plots give conflicting impressions, neither settles the question alone, so both should be examined together with a formal normality test.

Normality Test


    Shapiro-Wilk normality test

data:  res_aov$residuals
W = 0.95255, p-value = 0.001227

The Shapiro-Wilk p-value of 0.001227 for the residuals is less than the usual significance level of 0.05. Thus, we reject the null hypothesis that the residuals are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  0.0424 0.8373
      98               

The p-value (0.8373) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.

Wilcoxon Rank Sum Test


    Wilcoxon rank sum test

data:  e and f
W = 832.5, p-value = 0.02571
alternative hypothesis: true location shift is not equal to 0

Since the p-value (0.02571) is less than 0.05, we reject the null hypothesis; that is, there is a significant difference in Family Problem when respondents are grouped according to sex.

2.2 Strand

2.2.1 Strand and Personal Problems

Normality Test


    Shapiro-Wilk normality test

data:  Data$`Personal Problems`
W = 0.96056, p-value = 0.004408

Since the p-value (0.004408) is less than 0.05, we reject the null hypothesis of normality. That is, we cannot assume that the Personal Problems scores are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  1.4106 0.2444
      96               

The p-value (0.2444) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.



# A tibble: 4 × 11
  Strand variable             n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>            <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 STEM   Personal Proble…    22   1.8   3.2    2.8  0.55  2.74 0.437 0.093 0.194
2 ABM    Personal Proble…    22   2     3.4    2.8  0.35  2.78 0.343 0.073 0.152
3 HUMSS  Personal Proble…    33   2     3.6    3    0.6   2.92 0.424 0.074 0.15 
4 GAS    Personal Proble…    23   2.4   4      3    0.3   2.96 0.317 0.066 0.137

The mean Personal Problems scores for STEM, ABM, HUMSS, and GAS are 2.745, 2.782, 2.915, and 2.965, respectively.
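
Two columns of the summary table above can be cross-checked by hand: the standard error is the standard deviation divided by the square root of the group size, and the size-weighted average of the strand means should match the overall Personal Problems mean reported later (2.860). The Python sketch below is for illustration only (the analysis itself was run in R), and all names in it are hypothetical.

```python
import math

# strand: (n, mean, sd, reported se), copied from the table and text above.
strands = {
    "STEM": (22, 2.745, 0.437, 0.093),
    "ABM": (22, 2.782, 0.343, 0.073),
    "HUMSS": (33, 2.915, 0.424, 0.074),
    "GAS": (23, 2.965, 0.317, 0.066),
}

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

# Recompute each standard error from the rounded sd and n.
recomputed_se = {
    name: round(standard_error(sd, n), 3)
    for name, (n, _mean, sd, _se) in strands.items()
}

# Size-weighted average of the strand means.
total_n = sum(n for n, _mean, _sd, _se in strands.values())
overall_mean = sum(n * mean for n, mean, _sd, _se in strands.values()) / total_n

print(recomputed_se)
print(round(overall_mean, 3))
```

Small discrepancies in the last digit are possible because the inputs are already rounded.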

Kruskal-Wallis Test

# A tibble: 1 × 6
  .y.                   n statistic    df     p method        
* <chr>             <int>     <dbl> <int> <dbl> <chr>         
1 Personal Problems   100      4.34     3 0.227 Kruskal-Wallis

Since the p-value (0.227) is greater than 0.05, no significant difference in Personal Problems was observed among the strands.
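
The Kruskal-Wallis p-value above can be recovered from the test statistic, because under the null hypothesis the statistic approximately follows a chi-square distribution with df = 3. The Python sketch below (for illustration only; the function name is hypothetical) uses the closed-form upper-tail probability that holds for exactly 3 degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square distribution with 3 degrees of freedom.

    Closed form valid for df = 3 only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Statistic copied from the Kruskal-Wallis output above.
p_value = chi2_sf_df3(4.34)
print(round(p_value, 3))
```

The result agrees with the reported p = 0.227 up to rounding of the statistic.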

Pairwise Comparisons

# A tibble: 6 × 9
  .y.               group1 group2    n1    n2 statistic     p p.adj p.adj.signif
* <chr>             <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
1 Personal Problems STEM   ABM       22    22    -0.198 0.843 1     ns          
2 Personal Problems STEM   HUMSS     22    33     1.32  0.187 1     ns          
3 Personal Problems STEM   GAS       22    23     1.40  0.160 0.963 ns          
4 Personal Problems ABM    HUMSS     22    33     1.54  0.124 0.746 ns          
5 Personal Problems ABM    GAS       22    23     1.60  0.109 0.653 ns          
6 Personal Problems HUMSS  GAS       33    23     0.203 0.839 1     ns          
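
The report does not state which multiplicity adjustment produced the p.adj column, but the values appear consistent with a Bonferroni correction: each unadjusted p-value is multiplied by the number of comparisons (6) and capped at 1. The Python sketch below checks this assumption against the table; it is illustrative only, and the names in it are hypothetical.

```python
# (group1, group2, unadjusted p, reported adjusted p), copied from the table above.
comparisons = [
    ("STEM", "ABM", 0.843, 1.0),
    ("STEM", "HUMSS", 0.187, 1.0),
    ("STEM", "GAS", 0.160, 0.963),
    ("ABM", "HUMSS", 0.124, 0.746),
    ("ABM", "GAS", 0.109, 0.653),
    ("HUMSS", "GAS", 0.839, 1.0),
]

def bonferroni(p, m):
    """Bonferroni-adjusted p-value for one of m comparisons."""
    return min(1.0, p * m)

m = len(comparisons)
adjusted = [bonferroni(p, m) for _g1, _g2, p, _reported in comparisons]

# Largest gap between the recomputed and reported adjusted values.
max_gap = max(abs(a - rep) for a, (_g1, _g2, _p, rep) in zip(adjusted, comparisons))
print([round(a, 3) for a in adjusted])
print(round(max_gap, 3))
```

The remaining gaps are at the level expected from the unadjusted p-values having been rounded to three decimals before multiplication.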

2.2.2 Strand and Mental and Psychological Aspect

Normality Test


    Shapiro-Wilk normality test

data:  Data$`Mental and Psychological Aspect`
W = 0.96644, p-value = 0.01188

Since the p-value (0.01188) is less than 0.05, we reject the null hypothesis of normality. That is, we cannot assume that the Mental and Psychological Aspect scores are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  0.5891 0.6236
      96               

The p-value (0.6236) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.


# A tibble: 4 × 11
  Strand variable             n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>            <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 STEM   Mental and Psyc…    22   2     3.8    2.8  0.35  2.72 0.417 0.089 0.185
2 ABM    Mental and Psyc…    22   2     3.8    2.8  0.55  2.76 0.457 0.098 0.203
3 HUMSS  Mental and Psyc…    33   1.2   3.8    2.8  0.6   2.78 0.494 0.086 0.175
4 GAS    Mental and Psyc…    23   2     3.6    2.8  0.4   2.86 0.364 0.076 0.157

The mean Mental and Psychological Aspect scores for STEM, ABM, HUMSS, and GAS are 2.718, 2.755, 2.776, and 2.861, respectively.

Kruskal-Wallis Test

# A tibble: 1 × 6
  .y.                                 n statistic    df     p method        
* <chr>                           <int>     <dbl> <int> <dbl> <chr>         
1 Mental and Psychological Aspect   100      1.89     3 0.595 Kruskal-Wallis

Since the p-value (0.595) is greater than 0.05, no significant difference in Mental and Psychological Aspect was observed among the strands.

Pairwise Comparisons

# A tibble: 6 × 9
  .y.               group1 group2    n1    n2 statistic     p p.adj p.adj.signif
* <chr>             <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
1 Mental and Psych… STEM   ABM       22    22     0.389 0.697     1 ns          
2 Mental and Psych… STEM   HUMSS     22    33     0.757 0.449     1 ns          
3 Mental and Psych… STEM   GAS       22    23     1.33  0.185     1 ns          
4 Mental and Psych… ABM    HUMSS     22    33     0.330 0.741     1 ns          
5 Mental and Psych… ABM    GAS       22    23     0.933 0.351     1 ns          
6 Mental and Psych… HUMSS  GAS       33    23     0.690 0.490     1 ns          

2.2.3 Strand and Family Problem

Normality Test


    Shapiro-Wilk normality test

data:  Data$`Family Problem`
W = 0.9553, p-value = 0.001887

Since the p-value (0.001887) is less than 0.05, we reject the null hypothesis of normality. That is, we cannot assume that the Family Problem scores are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  3  0.6109 0.6096
      96               

The p-value (0.6096) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.


# A tibble: 4 × 11
  Strand variable           n   min   max median   iqr  mean    sd    se    ci
  <fct>  <fct>          <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 STEM   Family Problem    22   2     3.4    2.9   0.4  2.79 0.434 0.093 0.192
2 ABM    Family Problem    22   1.4   3.8    2.9   0.7  2.77 0.539 0.115 0.239
3 HUMSS  Family Problem    33   1.4   3.6    2.8   0.6  2.64 0.482 0.084 0.171
4 GAS    Family Problem    23   2     3.4    2.8   0.4  2.76 0.389 0.081 0.168

The mean Family Problem scores for STEM, ABM, HUMSS, and GAS are 2.791, 2.773, 2.642, and 2.765, respectively.

Kruskal-Wallis Test

# A tibble: 1 × 6
  .y.                n statistic    df     p method        
* <chr>          <int>     <dbl> <int> <dbl> <chr>         
1 Family Problem   100      1.85     3 0.605 Kruskal-Wallis

Since the p-value (0.605) is greater than 0.05, no significant difference in Family Problem was observed among the strands.

Pairwise Comparisons

# A tibble: 6 × 9
  .y.            group1 group2    n1    n2 statistic     p p.adj p.adj.signif
* <chr>          <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <chr>       
1 Family Problem STEM   ABM       22    22   -0.0920 0.927     1 ns          
2 Family Problem STEM   HUMSS     22    33   -1.16   0.245     1 ns          
3 Family Problem STEM   GAS       22    23   -0.268  0.789     1 ns          
4 Family Problem ABM    HUMSS     22    33   -1.06   0.289     1 ns          
5 Family Problem ABM    GAS       22    23   -0.175  0.861     1 ns          
6 Family Problem HUMSS  GAS       33    23    0.883  0.377     1 ns          

3. Is there a significant relationship between personal problems, mental and psychological aspect, and family problem?

Normality Test


    Shapiro-Wilk normality test

data:  Data1$Scores
W = 0.96878, p-value = 4.336e-06

Since the p-value (4.336e-06) is less than 0.05, we reject the null hypothesis of normality. That is, we cannot assume that the scores are normally distributed.

Equality of Variance

Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2   1.454 0.2353
      297               

The p-value (0.2353) is greater than the 0.05 level of significance. Thus, the assumption of homogeneity of variance is met.


# A tibble: 3 × 11
  Variables      variable     n   min   max median   iqr  mean    sd    se    ci
  <fct>          <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Personal Prob… Scores     100   1.8   4      2.8   0.6  2.86 0.392 0.039 0.078
2 Mental and Ps… Scores     100   1.2   3.8    2.8   0.4  2.78 0.438 0.044 0.087
3 Family Problem Scores     100   1.4   3.8    2.8   0.6  2.73 0.463 0.046 0.092

The mean scores for Personal Problems, Mental and Psychological Aspect, and Family Problem are 2.860, 2.778, and 2.732, respectively.

Kruskal-Wallis Test

# A tibble: 1 × 6
  .y.        n statistic    df     p method        
* <chr>  <int>     <dbl> <int> <dbl> <chr>         
1 Scores   300      3.86     2 0.145 Kruskal-Wallis

Since the p-value (0.145) is greater than 0.05, no significant difference was observed among the three variables.
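
With three groups the Kruskal-Wallis statistic is referred to a chi-square distribution with df = 2, whose upper-tail probability has the simple closed form exp(-x/2). The Python sketch below (illustrative only; the function name is hypothetical) recomputes the p-value from the reported statistic.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2)

# Statistic copied from the Kruskal-Wallis output above.
p_value_df2 = chi2_sf_df2(3.86)
print(round(p_value_df2, 3))
```

The result agrees with the reported p = 0.145.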

Pairwise Comparisons

# A tibble: 3 × 9
  .y.    group1           group2    n1    n2 statistic      p p.adj p.adj.signif
* <chr>  <chr>            <chr>  <int> <int>     <dbl>  <dbl> <dbl> <chr>       
1 Scores Personal Proble… Menta…   100   100    -1.61  0.108  0.323 ns          
2 Scores Personal Proble… Famil…   100   100    -1.78  0.0747 0.224 ns          
3 Scores Mental and Psyc… Famil…   100   100    -0.174 0.862  1     ns          

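4. Which has the most significant impact?

Based on the output above, Personal Problems has the highest mean score (2.860) of the three variables. Note, however, that the Kruskal-Wallis test above found no statistically significant difference among them.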