1. Provide the minimum, 1st quartile, median, mean, 3rd quartile and maximum of the variable “Years”.

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   11.00   13.00   13.77   16.00   30.00 
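
As a quick reading of this summary, the following is a minimal base R sketch that uses only the six reported values. The object names (`q1`, `q3`, `upper_fence`, and so on) are introduced here for illustration and do not come from the original analysis. It computes the interquartile range and the conventional 1.5 × IQR upper fence.

```r
# Values copied from the summary output above
q1  <- 11.00
med <- 13.00
avg <- 13.77
q3  <- 16.00
mx  <- 30.00

iqr         <- q3 - q1          # interquartile range
upper_fence <- q3 + 1.5 * iqr   # rule-of-thumb cutoff for flagging high values

cat("IQR:", iqr, "\n")
cat("Upper fence:", upper_fence, "\n")
cat("Mean above median:", avg > med, "\n")
cat("Maximum beyond upper fence:", mx > upper_fence, "\n")
```

A mean above the median and a maximum beyond the fence are both consistent with a right-skewed distribution, although the raw data would be needed to confirm this.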

2. How many respondents answered “yes” for the variable “c1.3”?

Answer: 45 respondents (45%) answered “yes”.

# A tibble: 2 × 3
  c1.3  Frequency Percentage
  <chr>     <int>      <dbl>
1 no           55         55
2 yes          45         45
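
The percentages equal the frequencies because the counts sum to 100. A minimal base R sketch, using only the counts reported above (the object names `freq` and `pct` are illustrative), shows the arithmetic:

```r
# Counts copied from the table above
freq <- c(no = 55, yes = 45)

n   <- sum(freq)        # total number of responses
pct <- 100 * freq / n   # percentage of each category

print(data.frame(
  c1.3       = names(freq),
  Frequency  = as.integer(freq),
  Percentage = as.numeric(pct)
))
```

The tibble header in the output suggests the original table was built with a tidyverse pipeline; the sketch uses base R only to stay free of package dependencies.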

3. Provide the frequency and percentage of the categories in the variable “NewYears.”

# A tibble: 3 × 3
  NewYears           Frequency Percentage
  <chr>                  <int>      <dbl>
1 12 years and below        44         44
2 13 to 15 years            29         29
3 16 years and above        27         27

4. Provide the mean and standard deviation of the variable “Score” for each category of the variable “NewYears.”

# A tibble: 3 × 3
  NewYears           Mean_Score SD_Score
  <chr>                   <dbl>    <dbl>
1 12 years and below       43      0.889
2 13 to 15 years           42.7    1.03 
3 16 years and above       42.9    1.17 

5. What is the strength of association between the variables “Score” and “ScoreB”?

Answer: The correlation coefficient is 0.0549, which indicates a negligible positive association.

[1] 0.05485635
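
To put the size of this coefficient in context, the sketch below computes the shared variance and a significance test for the correlation. It rests on two assumptions that the source does not state: that the coefficient is Pearson's r (the default of R's `cor()`), and that the sample size is 100, as the frequency tables suggest.

```r
r <- 0.05485635   # correlation copied from the output above
n <- 100          # ASSUMPTION: sample size inferred from the frequency tables

r_squared <- r^2                               # proportion of shared variance
t_stat    <- r * sqrt(n - 2) / sqrt(1 - r^2)   # t statistic for H0: rho = 0
p_value   <- 2 * pt(-abs(t_stat), df = n - 2)  # two-sided p-value

cat("r squared:", round(r_squared, 4), "\n")
cat("t:", round(t_stat, 3), " p-value:", round(p_value, 3), "\n")
```

Under these assumptions the two scores share well under 1% of their variance, and the correlation is not significant at the 0.05 level.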

6. For the “Gender” variable, classify its values as Boy, Girl, and Others. Provide the frequency distribution table, that is, the frequency and percentage of each category of the variable “Gender.”

# A tibble: 3 × 3
  Gender Frequency Percentage
  <chr>      <int>      <dbl>
1 Boy           35         35
2 Girl          36         36
3 Others        29         29

7. Reproduce the output shown below.

# A tibble: 3 × 5
  ScoreABC variable        n  mean    sd
  <fct>    <fct>       <dbl> <dbl> <dbl>
1 ScoreA   Scorevalues   100  2.55 0.446
2 ScoreB   Scorevalues   100  2.31 0.472
3 ScoreC   Scorevalues   100  2.36 0.479
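
The source does not state which function produced this table. As one possible approach, the base R sketch below reshapes three score columns into long format and summarizes each group. The data frame `scores` holds HYPOTHETICAL toy values, not the real data, so its numbers will not match the output above.

```r
# HYPOTHETICAL toy data; the real data set is not reproduced here
scores <- data.frame(
  ScoreA = c(2.1, 2.9, 2.6, 2.4),
  ScoreB = c(2.0, 2.5, 2.4, 2.3),
  ScoreC = c(1.9, 2.8, 2.4, 2.5)
)

# Wide to long: one row per (value, source column) pair
long <- stack(scores)
names(long) <- c("Scorevalues", "ScoreABC")

# n, mean, and standard deviation for each original column
by_group <- split(long$Scorevalues, long$ScoreABC)
result <- data.frame(
  ScoreABC  = factor(names(by_group)),
  variable  = factor("Scorevalues"),
  n         = as.numeric(lengths(by_group)),
  mean      = sapply(by_group, mean),
  sd        = sapply(by_group, sd),
  row.names = NULL
)
print(result)
```

Replacing `scores` with the three real score columns should give a table with the same column layout as the one shown.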

8. Is there a significant difference in “ScoreA” between M and F in the variable “Sex” using the following methods?

Answer: No. At the 0.05 level, neither method finds a significant difference (Welch t-test p = 0.536; Mann-Whitney U test p = 0.4584).

t-test


    Welch Two Sample t-test

data:  Lyra3$ScoreA by Lyra3$Sex
t = 0.62143, df = 86.183, p-value = 0.536
alternative hypothesis: true difference in means between group F and group M is not equal to 0
95 percent confidence interval:
 -0.1246377  0.2380032
sample estimates:
mean in group F mean in group M 
       2.574074        2.517391 
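
The reported p-value and confidence interval can be cross-checked from the other numbers in this output. The sketch below is base R only, with illustrative object names, and recovers the standard error from the reported t statistic rather than from the raw data.

```r
# Values copied from the Welch t-test output above
t_stat <- 0.62143
df     <- 86.183
mean_f <- 2.574074
mean_m <- 2.517391
ci     <- c(-0.1246377, 0.2380032)

diff_means <- mean_f - mean_m                 # estimated difference (F minus M)
p_value    <- 2 * pt(-abs(t_stat), df = df)   # two-sided p-value
se         <- diff_means / t_stat             # standard error implied by t
ci_check   <- diff_means + c(-1, 1) * qt(0.975, df = df) * se

cat("Difference in means:", round(diff_means, 4), "\n")
cat("p-value:", round(p_value, 3), "\n")
cat("Recomputed 95% CI:", round(ci_check, 4), "\n")
```

The recomputed interval contains 0, which agrees with the non-significant p-value.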

Mann-Whitney U Test


    Wilcoxon rank sum test with continuity correction

data:  ScoreA by Sex
W = 1346, p-value = 0.4584
alternative hypothesis: true location shift is not equal to 0

9. Is there a significant difference in “ScoreC” among the Gender categories (Boy, Girl, and Others) using the F-test?

Answer: Yes. At the 0.05 level, the F-test is significant (F = 3.823, df = 2 and 97, p = 0.0252), so the mean “ScoreC” differs for at least one Gender category.

            Df Sum Sq Mean Sq F value Pr(>F)  
Gender       2  1.662  0.8309   3.823 0.0252 *
Residuals   97 21.085  0.2174                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
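
The entries of this ANOVA table follow from the sums of squares and degrees of freedom alone. The sketch below recomputes them in base R with illustrative object names. The effect size eta squared is an addition here and is not part of the original output.

```r
# Values copied from the ANOVA table above
ss_between <- 1.662;  df_between <- 2
ss_within  <- 21.085; df_within  <- 97

ms_between <- ss_between / df_between   # Mean Sq for Gender
ms_within  <- ss_within / df_within     # Mean Sq for Residuals
f_stat     <- ms_between / ms_within    # F value
p_value    <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

# Eta squared: share of total variation attributed to Gender
eta_sq <- ss_between / (ss_between + ss_within)

cat("F:", round(f_stat, 3), " p-value:", round(p_value, 4), "\n")
cat("Eta squared:", round(eta_sq, 3), "\n")
```

Because the inputs are rounded, the recomputed values may differ slightly from the printed ones in the last digit.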

10. Considering the variables “ScoreA”, “ScoreB”, and “ScoreC” as independent variables, which of these significantly predict the variable “Score”?

Answer: At the 0.05 level, only “ScoreA” significantly predicts “Score” (estimate = -0.66041, p = 0.00339). “ScoreB” (p = 0.85229) and “ScoreC” (p = 0.12253) are not significant.


Call:
lm(formula = Score ~ ScoreA + ScoreB + ScoreC, data = Lyra)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.29078 -0.58529  0.07604  0.72523  2.27412 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 43.78202    0.75029  58.353  < 2e-16 ***
ScoreA      -0.66041    0.21978  -3.005  0.00339 ** 
ScoreB      -0.04561    0.24431  -0.187  0.85229    
ScoreC       0.37603    0.24136   1.558  0.12253    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.969 on 96 degrees of freedom
Multiple R-squared:  0.1037,    Adjusted R-squared:  0.07567 
F-statistic: 3.702 on 3 and 96 DF,  p-value: 0.01435
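
The t values, p-values, adjusted R-squared, and F-statistic in this summary can be recomputed from the estimates, standard errors, and R-squared. The sketch below does so in base R with illustrative object names. The sample size is taken as the residual degrees of freedom plus the four estimated coefficients.

```r
# Estimates and standard errors copied from the coefficient table above
est <- c(`(Intercept)` = 43.78202, ScoreA = -0.66041,
         ScoreB = -0.04561, ScoreC = 0.37603)
se  <- c(0.75029, 0.21978, 0.24431, 0.24136)
df_resid <- 96

t_val <- est / se                              # t value column
p_val <- 2 * pt(-abs(t_val), df = df_resid)    # Pr(>|t|) column

r2 <- 0.1037          # Multiple R-squared from the output
k  <- 3               # number of predictors
n  <- df_resid + k + 1

adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))
f_p    <- pf(f_stat, k, n - k - 1, lower.tail = FALSE)

print(round(cbind(t_val, p_val), 5))
cat("Adjusted R-squared:", round(adj_r2, 4), "\n")
cat("F:", round(f_stat, 3), " p-value:", round(f_p, 4), "\n")
```

Only the ScoreA row has a p-value below 0.05, and the model as a whole explains roughly 10% of the variance in Score.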