ABSTRACT: This project investigates the impact of an AIDS awareness booklet on college students from five different departments. A total of 72 students participated, and their awareness scores were assessed both before and after receiving the booklet. The study aims to determine whether the booklet effectively increases AIDS awareness among students and whether awareness remains consistent across the different departments. Statistical analysis is conducted to evaluate changes in awareness scores and to assess any variations among departments, providing insights into the effectiveness of educational interventions in promoting AIDS awareness. The study uses secondary data, and the analysis is carried out in R.

1. INTRODUCTION

\(\mathbf{What}\) \(\mathbf{is}\) \(\mathbf{AIDS}\) (\(\mathbf{Acquired}\) \(\mathbf{Immune}\) \(\mathbf{Deficiency}\) \(\mathbf{Syndrome}\)) \(\mathbf{?}\) AIDS is the most advanced stage of \(\mathbf{HIV}\) (Human Immunodeficiency Virus) infection, which occurs when the virus kills or damages the cells of the body’s immune system. HIV causes a breakdown of the body’s immune system, making it vulnerable to infections and cancers. Once the virus enters the body, the person is described as HIV infected. At this stage, the person may appear absolutely normal and may not even be aware that the virus is present. HIV attacks the body’s white blood cells, which circulate around the body to detect infection and faults in other cells. It targets and infiltrates CD4 cells, a type of T cell, and uses these cells to create more copies of itself. In doing so, HIV destroys the cells and reduces the body’s ability to combat other infections and diseases. This increases the risk and severity of opportunistic infections and some types of cancer.

\(\mathbf{In}\) \(\mathbf{India}\), the first HIV infection was detected in Madras (Chennai) in 1986.

As long as we do not have a vaccine against HIV, AIDS prevention depends only on health education and behavioral changes based on AIDS awareness. In disease management, the proverb ’\(\mathit{prevention}\) \(\mathit{is}\) \(\mathit{better}\) \(\mathit{than}\) \(\mathit{cure}\)’ resonates deeply, especially when addressing sensitive and prevalent issues such as HIV/AIDS. For college students, who may be prone to risky behaviors, a targeted educational initiative such as a booklet can help by equipping them with accurate information about transmission, prevention methods, stigma reduction (talking openly about AIDS can help normalize the subject) and support resources.

2. OBJECTIVE

The objective of the project is to:

• Examine whether the awareness of AIDS among students increases after providing the educational booklet. This will be assessed by comparing the test scores before and after providing the booklet.

• Determine whether the changes in awareness are consistent across students from different departments.

3. THE DATA-SET

This data-set contains scores of \(\mathbf{72}\) college students from five departments - \(\mathit{Chemistry}\) \(\mathit{(Special)}\), \(\mathit{Botany}\) \(\mathit{(Special)}\), \(\mathit{Microbiology}\) \(\mathit{(SYBSc}\) \(\mathit{level)}\), \(\mathit{Microbiology}\) \(\mathit{(Special}\) \(\mathit{level)}\), \(\mathit{Zoology}\) \(\mathit{(Special)}\) - before and after receiving an AIDS awareness booklet at a college. It includes \(\mathbf{PRE-TEST}\) \(\mathbf{scores}\), reflecting baseline knowledge levels, and \(\mathbf{POST-TEST}\) \(\mathbf{scores}\), which measure the change in knowledge following the intervention. This data provides insights into the effectiveness of the booklet in educating students across different scientific disciplines about AIDS awareness.

There are \(\mathbf{15}\) students from \(\mathbf{Chemistry}\) \(\mathbf{(Special)}\), \(\mathbf{19}\) students from \(\mathbf{Botany}\) \(\mathbf{(Special)}\), \(\mathbf{21}\) students from \(\mathbf{Microbiology}\) \(\mathbf{(SYBSc}\) \(\mathbf{level)}\), \(\mathbf{9}\) students from \(\mathbf{Microbiology}\) \(\mathbf{(Special}\) \(\mathbf{level)}\) and \(\mathbf{8}\) students from \(\mathbf{Zoology}\) \(\mathbf{(Special)}\).

4. EXPLORATORY DATA ANALYSIS

4.1. Diagrams

Interpretation:

The pie chart illustrates that the Botany, Chemistry and Microbiology SYBSc departments have broadly similar numbers of students (19, 15 and 21 respectively). In contrast, the Zoology and Microbiology Sp departments have comparatively smaller shares (8 and 9 students respectively).

[1] 12

Interpretation:

• Test scores increase after providing the booklet, as indicated by the boxplot.

• The median of the post-test scores is higher than that of the pre-test scores, suggesting an improvement.

• Variability decreases for post-test data.

• There is an outlier in the post-test scores, and its value is 12.

So after providing the booklet, there is a positive effect on the test scores.

4.2. Summary measures of the data-set

\(\underline{PRE}\) \(\underline{TEST}\) \(\underline{SCORES}:\)

Minimum Value : 3

Maximum Value : 17

Range : 14

Mean : 11.11

Median : 12

Standard Deviation : 3.09

1st Quartile : 9

3rd Quartile : 14

\(\underline{POST}\) \(\underline{TEST}\) \(\underline{SCORES}\):

Minimum Value : 12

Maximum Value : 20

Range : 8

Mean : 18

Median : 18

Standard Deviation : 1.62

1st Quartile : 17

3rd Quartile : 19

• The PRE TEST data have lower central tendency values but higher variability (mean 11.11, standard deviation 3.09), suggesting a more spread-out distribution around a lower mean. The POST TEST data have higher central tendency values with lower variability (mean 18, standard deviation 1.62), indicating a more concentrated distribution around a higher mean.
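The summary measures above can be computed in R along the following lines. This is only a sketch: `summarise_scores` is a hypothetical helper (it is not part of the original code), and the vector `x` holds made-up illustrative values, not the project data.

```r
# Hypothetical helper: the summary measures listed in this section
summarise_scores <- function(x) {
  c(Min    = min(x),
    Max    = max(x),
    Range  = max(x) - min(x),
    Mean   = mean(x),
    Median = median(x),
    SD     = sd(x),                       # sample standard deviation (n - 1 divisor)
    Q1     = unname(quantile(x, 0.25)),   # R's default (type 7) quantile definition
    Q3     = unname(quantile(x, 0.75)))
}

# Illustrative values only (NOT the project data)
x <- c(3, 9, 12, 14, 17)
out <- summarise_scores(x)
print(round(out, 2))
```

With the project data loaded, the corresponding calls would be `summarise_scores(PRE.TEST)` and `summarise_scores(POST.TEST)`. Note that quartiles can differ slightly between software packages because several quantile definitions are in use.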

5. METHODOLOGY

5.1. Test For Normality

Normality tests are used to determine whether a data-set is well modeled by a normal distribution. An informal approach is to compare a histogram of the data to a normal distribution curve: if the empirical distribution of the data is roughly bell-shaped, the data are taken to be approximately normally distributed. In addition, we use the Shapiro–Wilk test for confirmation.

5.1.1. Histogram and Probability Density Curve of Normal Distribution

From the above diagrams, we can see that the pre-test scores data is approximately normally distributed , but the post-test scores data is not normally distributed.

5.1.2. Shapiro Wilk Test

The Shapiro–Wilk test is a test of normality. It tests the null hypothesis that a sample \(x_{1}\), \(x_{2}\), …, \(x_{n}\) came from a normally distributed population. The test statistic is

W = \(\frac{(\sum_{i=1}^{n}a_{i}x_{(i)})^{2}}{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}\)

where,

\(x_{(i)}\) (with parentheses enclosing the subscript index \(i\)) is the \(i^{th}\) order statistic, i.e., the \(i^{th}\)-smallest number in the sample.

\(\bar{x}\) = \(\frac{x_{1}+\cdots+x_{n}}{n}\) is the sample mean.

The coefficients \(a_{i}\) are given by:

\((a_{1}, \ldots, a_{n})\) = \(\frac{m^{T}V^{-1}}{C}\), where C is:

C = \(\left\Vert V^{-1}m\right\Vert\) = \((m^{T}V^{-1}V^{-1}m)^{1/2}\), the vector m = \((m_{1}, \ldots, m_{n})^{T}\) is made of the expected values of the order statistics of independent and identically distributed random variables sampled from the standard normal distribution, and \(V\) is the covariance matrix of those normal order statistics.

The null hypothesis of this test is that the population is normally distributed. Thus, if the p-value is less than the chosen alpha level, the null hypothesis is rejected and there is evidence that the data tested are not normally distributed. On the other hand, if the p-value is greater than the chosen alpha level, the null hypothesis (that the data came from a normally distributed population) cannot be rejected.
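The decision rule described above can be sketched in R as follows. The helper `normality_decision` is hypothetical, and the two samples are constructed purely for illustration (evenly spaced normal quantiles and a right-skewed transform of them); they are not the project data.

```r
# Hypothetical helper: decision rule at significance level alpha
normality_decision <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

# Illustrative samples only (NOT the project data)
x_normal <- qnorm(ppoints(30))   # evenly spaced normal quantiles
x_skewed <- exp(x_normal)        # strongly right-skewed values

sw_normal <- shapiro.test(x_normal)
sw_skewed <- shapiro.test(x_skewed)

# W close to 1 supports normality; a small W gives a small p-value
print(c(W_normal = unname(sw_normal$statistic),
        W_skewed = unname(sw_skewed$statistic)))
print(normality_decision(sw_normal$p.value))
print(normality_decision(sw_skewed$p.value))
```

The same helper can be applied directly to a reported p-value, for example `normality_decision(0.1708)`.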

Result of the test:


    Shapiro-Wilk normality test

data:  PRE.TEST
W = 0.97543, p-value = 0.1708

    Shapiro-Wilk normality test

data:  POST.TEST
W = 0.8697, p-value = 2.274e-06

Interpretation:

• PRE TEST SCORES:

As p-value = 0.1708 > 0.05, we fail to reject \(H_{0}\) at the 5% level of significance. So, at the 5% level of significance, there is no evidence against the assumption that the data come from a normally distributed population.

• POST TEST SCORES:

As p-value = 2.274e-06 < 0.05, we reject \(H_{0}\) at the 5% level of significance. So, we can conclude at the 5% level of significance that the data do not come from a normally distributed population.

\(\mathit{Pre}\)-\(\mathit{test}\) \(\mathit{scores}\) \(\mathit{data}\) \(\mathit{follow}\) \(\mathit{normal}\) \(\mathit{distribution}\) , \(\mathit{so}\) \(\mathit{we}\) \(\mathit{can}\) \(\mathit{use}\) \(\mathit{both}\) \(\mathit{parametric}\) \(\mathit{tests}\) \(\mathit{and}\) \(\mathit{non}\) \(\mathit{parametric}\) \(\mathit{tests}\) \(\mathit{for}\) \(\mathit{it}\). \(\mathit{However}\), \(\mathit{post}\)-\(\mathit{test}\) \(\mathit{scores}\) \(\mathit{data}\) \(\mathit{do}\) \(\mathit{not}\) \(\mathit{follow}\) \(\mathit{normal}\) \(\mathit{distribution}\) ,\(\mathit{so}\) \(\mathit{we}\) \(\mathit{use}\) \(\mathit{only}\) \(\mathit{non}\) \(\mathit{parametric}\) \(\mathit{tests}\) \(\mathit{for}\) \(\mathit{it}\).

5.2. Parametric Tests

Parametric tests are those that make assumptions about the parameters of the population distribution from which the sample is drawn. This is often the assumption that the population data are normally distributed.

\(\mathit{Now}\) \(\mathit{we}\) \(\mathit{test}\) \(\mathit{whether}\) \(\mathit{there}\) \(\mathit{exists}\) \(\mathit{a}\) \(\mathit{significant}\) \(\mathit{difference}\) \(\mathit{in}\) \(\mathit{awareness}\) \(\mathit{among}\) \(\mathit{students}\) \(\mathit{from}\) \(\mathit{different}\) \(\mathit{departments}\) \(\mathit{or}\) \(\mathit{not.}\)

5.2.1. Analysis of Variance (ANOVA)

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the “variation” among and between groups) used to analyze the differences among means. ANOVA is based on the law of total variance, where the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means. In other words, the ANOVA is used to test the difference between two or more means.

There are three classes of models used in the analysis of variance: fixed-effects models, random-effects models and mixed-effects models.

\(\mathbf{Fixed-effects}\) \(\mathbf{models}\): The fixed-effects model of analysis of variance applies to situations in which the experimenter applies one or more treatments to the subjects of the experiment to see whether the response variable values change.

The analysis of variance can be presented in terms of a linear model, which makes the following assumptions about the probability distribution of the responses:

• Independence of observations – this is an assumption of the model that simplifies the statistical analysis.

• Normality – the distributions of the residuals are normal.

• Equality (or “homogeneity”) of variances, called homoscedasticity – the variance of data in groups should be the same.

\(\mathbf{One-way}\) \(\mathbf{analysis}\) \(\mathbf{of}\) \(\mathbf{variance}\): The simplest experiment suitable for ANOVA is the completely randomized experiment with a single factor.

It is useful to represent each data point in the following form, called a statistical model:

\(Y_{ij}\) = \(\mu\) + \(\tau_{i}\) + \(\varepsilon_{ij}\)

where,

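i = 1, 2, 3, …, r (groups, here the departments); j = 1, 2, 3, …, \(n_{i}\) (observations within the i-th group)

μ = general mean

n = \(\sum_{i=1}^{r}n_{i}\) = total number of observations

\(\tau_{i}\) = differential effect (response) associated with the i-th level of the factor X; this assumes \(\sum_{i=1}^{r}n_{i}\tau_{i}\) = 0

\(\varepsilon_{ij}\) = noise or error associated with the particular (i, j)-th data value.

The null hypothesis is \(H_{0}\) : \(\tau_{1}\) = \(\tau_{2}\) = … = \(\tau_{r}\) = 0 (all group means are equal), tested against the alternative that at least one \(\tau_{i}\) differs from zero.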
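To make the one-way model concrete, the F statistic can be built by hand from the between-group and within-group sums of squares and compared with `aov()`. The sketch below uses a small made-up data set (three groups of three scores), not the project data; the variable names are chosen for illustration only.

```r
# Illustrative data only (NOT the project data): r = 3 groups of 3 scores
y <- c(4, 5, 6, 7, 8, 9, 10, 11, 12)
g <- factor(rep(c("A", "B", "C"), each = 3))

n   <- length(y)             # total number of observations
r   <- nlevels(g)            # number of groups
n_i <- tapply(y, g, length)  # group sizes
y_i <- tapply(y, g, mean)    # group means

# Partition of the total variation
ss_between <- sum(n_i * (y_i - mean(y))^2)
ss_within  <- sum((y - ave(y, g))^2)   # ave() returns each score's group mean

# F = mean square between groups / mean square within groups
f_manual <- (ss_between / (r - 1)) / (ss_within / (n - r))

# The same statistic as computed by aov()
f_aov <- summary(aov(y ~ g))[[1]][["F value"]][1]
print(c(manual = f_manual, aov = f_aov))
```

Under \(H_{0}\) this ratio follows an F distribution with (r − 1, n − r) degrees of freedom, which is where the p-value in the ANOVA table comes from.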

Result of the test:

PRE TEST:

            Df Sum Sq Mean Sq F value   Pr(>F)    
Subject      4  323.6   80.91   15.25 6.49e-09 ***
Residuals   67  355.5    5.31                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Interpretation:

As, p-value= 6.49e-09 < 0.05, we reject \(H_{0}\) at 5% level of significance . So, we can conclude at 5% level of significance that there is a significant difference in awareness among students from different departments before providing the booklet.
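As a quick arithmetic check on the ANOVA table above, the F value can be recovered from the reported sums of squares and degrees of freedom (4 = r − 1 and 67 = n − r with n = 72, r = 5; values rounded as in the R output):

```latex
F = \frac{MS_{Subject}}{MS_{Residuals}}
  = \frac{323.6 / 4}{355.5 / 67}
  \approx \frac{80.9}{5.306}
  \approx 15.25
```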

5.2.2. Critical Difference (CD) and Least Significant Difference(LSD)

In ANOVA, if \(H_{0}\) is not rejected, we conclude that there is no significant difference among the means of the different populations/groups.

On the other hand, if \(H_{0}\) is rejected, we conclude that the means of at least two populations/groups differ significantly.

Whenever \(H_{0}\) is rejected, one may be further interested in identifying the pairs (\(\mu_{i}\) , \(\mu_{i\prime}\)), \(i \neq i\prime\), responsible for the rejection. For this, in the one-way fixed-effects ANOVA model, we test the null hypothesis \(H_{0}(i, i\prime)\) : \(\mu_{i}\) = \(\mu_{i\prime}\) against the alternative hypothesis \(H_{A}(i, i\prime)\) : \(\mu_{i}\neq\mu_{i\prime}\) for each possible pair \(i \neq i\prime\). This is done by using the test statistic

t = \(\frac{(\bar{y}_{i0}-\bar{y}_{i\prime0})}{\sqrt{MSE(\frac{1}{n_{i}}+\frac{1}{n_{i\prime}})}}\)

Under the null hypothesis, this statistic follows a t-distribution with n − r degrees of freedom.

The critical region here is given by

\(\omega_{0}:\left|t\right|>t_{\frac{\alpha}{2};n-r}\equiv\left|\bar{y}_{i0}-\bar{y}_{i\prime0}\right|>t_{\frac{\alpha}{2};n-r}\sqrt{MSE(\frac{1}{n_{i}}+\frac{1}{n_{i\prime}})}\) = CD

where CD = critical difference and \(t_{\frac{\alpha}{2};n-r}\) is the upper \(\frac{\alpha}{2}\) point of the t-distribution with (n − r) degrees of freedom. In particular, if \(n_{1} = n_{2} = \ldots = n_{r}\) (= m, say), then n − r = r(m − 1) and CD = \(t_{\frac{\alpha}{2};r(m-1)}\sqrt{\frac{2MSE}{m}}\), which is called the Least Significant Difference (LSD).
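The critical difference for a single pair of groups can be evaluated directly from this formula. The sketch below plugs in the error mean square, error degrees of freedom, group sizes and group means reported in the pre-test results of this report; `critical_difference` is a hypothetical helper introduced only for illustration.

```r
# Values taken from the pre-test ANOVA / LSD results reported in this report
mse      <- 5.305816   # MSerror
df_error <- 67         # n - r = 72 - 5
alpha    <- 0.05

# Hypothetical helper: CD for two groups of sizes n_a and n_b
critical_difference <- function(n_a, n_b) {
  qt(1 - alpha / 2, df_error) * sqrt(mse * (1 / n_a + 1 / n_b))
}

# Microbiology (Special level), n = 9, versus Botany (Special), n = 19
cd        <- critical_difference(9, 19)
mean_diff <- abs(14.888889 - 9.368421)

print(c(CD = cd, difference = mean_diff))
print(mean_diff > cd)   # TRUE means the pair differs significantly at the 5% level
```

Because the group sizes are unequal here, the critical difference changes from pair to pair, so each pair of departments has to be compared against its own CD.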

Result of the test:

PRE TEST:

$statistics
   MSerror Df     Mean       CV
  5.305816 67 11.11111 20.73092

$parameters
        test p.ajusted  name.t ntr alpha
  Fisher-LSD      none Subject   5  0.05

$means
                             PRE.TEST      std  r        se       LCL      UCL
Botany(Special)              9.368421 2.607905 19 0.5284444  8.313642 10.42320
Chemistry(Special)           9.533333 2.587516 15 0.5947446  8.346218 10.72045
Microbiology (SYBSc level)  10.857143 2.329929 21 0.5026509  9.853847 11.86044
Microbiology(Special level) 14.888889 1.536591  9 0.7678119 13.356330 16.42145
Zoology(Special)            14.625000 1.302470  8 0.8143875 12.999476 16.25052
                            Min Max  Q25 Q50   Q75
Botany(Special)               5  14  7.5   9 11.50
Chemistry(Special)            3  13  8.0  10 12.00
Microbiology (SYBSc level)    5  14  9.0  11 12.00
Microbiology(Special level)  12  17 14.0  15 15.00
Zoology(Special)             12  16 14.0  15 15.25

$comparison
NULL

$groups
                             PRE.TEST groups
Microbiology(Special level) 14.888889      a
Zoology(Special)            14.625000      a
Microbiology (SYBSc level)  10.857143      b
Chemistry(Special)           9.533333     bc
Botany(Special)              9.368421      c

attr(,"class")
[1] "group"

Interpretation:

• Microbiology(Special level) and Zoology(Special) do not have significantly different mean scores (since they both have a value of “a”)

• Microbiology (SYBSc level) and Chemistry(Special) do not have significantly different mean scores (since they both have a value of “b”)

• Botany(Special) and Chemistry(Special) do not have significantly different mean scores (since they both have a value of “c”)

• Microbiology(Special level) and Microbiology (SYBSc level) have significantly different mean scores (since Microbiology(Special level) has a value of “a” and Microbiology (SYBSc level) has a value of “b”)

Similarly, this applies to the others.

5.3. Non Parametric Tests

Non-parametric tests are procedures that do not assume that the data come from any particular parametric family of probability distributions. They are also called distribution-free tests, because their validity does not depend on the form of the underlying population distribution.

\(\mathit{We}\) \(\mathit{have}\) \(\mathit{data}\) \(\mathit{from}\) \(\mathit{the}\) \(\mathit{same}\) \(\mathit{group}\) \(\mathit{of}\) \(\mathit{students}\) \(\mathit{measured}\) \(\mathit{at}\) \(\mathit{two}\) \(\mathit{different}\) \(\mathit{times}\) \((\mathit{before}\) \(\mathit{and}\) \(\mathit{after}\)), \(\mathit{so}\) \(\mathit{we}\) \(\mathit{use}\) \(\mathit{the}\) \(\mathit{Wilcoxon}\) \(\mathit{signed}\)-\(\mathit{rank}\) \(\mathit{test}\) \(\mathit{to}\) \(\mathit{compare}\) \(\mathit{the}\) \(\mathit{medians}\) \(\mathit{of}\) \(\mathit{the}\) \(\mathit{two}\) \(\mathit{paired}\) \(\mathit{groups}\).

5.3.1. Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric rank test for statistical hypothesis testing used either to test the location of a population based on a sample of data, or to compare the locations of two populations using two matched samples. The Wilcoxon test can be a good alternative to the t-test when population means are not of interest; for example, when one wishes to test whether a population’s median is nonzero, or whether there is a better than 50% chance that a sample from one population is greater than a sample from another population.

The paired data test arises from taking paired differences. In each case, they become assertions about the behavior of the differences \(X_{i}-Y_{i}\) .

Let \(F(x, y)\) be the joint cumulative distribution of the pairs \((X_{i}, Y_{i})\). If \(F\) is continuous, then the most general null and alternative hypotheses are expressed in terms of \(p_{1}\) = \(\Pr\left(\frac{1}{2}(X_{i} - Y_{i} + X_{j} - Y_{j}) > 0\right)\) and are identical to the one-sample case:

Null hypothesis \(H_{0}\) : \(p_{1}\) = \(\frac{1}{2}\)

One-sided alternative hypothesis \(H_{1}\) : \(p_{1}\) > \(\frac{1}{2}\) (Right Tail)

One-sided alternative hypothesis \(H_{2}\) : \(p_{1}\) < \(\frac{1}{2}\) (Left Tail)

Two-sided alternative hypothesis \(H_{3}\) : \(p_{1}\) \(\neq\) \(\frac{1}{2}\) (Both Tails)
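In practice the test is based on the statistic V, the sum of the ranks of the absolute paired differences that have a positive sign. The sketch below computes V by hand on made-up paired scores (not the project data) and compares it with `wilcox.test()`; the variable names are illustrative.

```r
# Illustrative paired scores only (NOT the project data)
pre  <- c(10, 12,  9, 14, 11,  8, 13)
post <- c(15, 11, 16, 18, 14, 17, 19)

d <- pre - post            # paired differences
d <- d[d != 0]             # zero differences are discarded
ranks <- rank(abs(d))      # ranks of the absolute differences

# V = sum of the ranks belonging to positive differences
v_manual <- sum(ranks[d > 0])

# Left-tailed test: H1 says pre-test scores tend to be lower than post-test scores
res <- wilcox.test(pre, post, paired = TRUE, alternative = "less")

print(c(manual = v_manual, wilcox = unname(res$statistic)))
print(res$p.value)
```

A value of V near zero means that almost all differences (pre minus post) are negative, which is the pattern expected when scores increase after the intervention.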

Result of the test:


    Wilcoxon signed rank test with continuity correction

data:  PRE.TEST and POST.TEST
V = 0, p-value = 7.511e-14
alternative hypothesis: true location shift is less than 0

Interpretation:

Here we perform a left tail test.

As p-value = 7.511e-14 < 0.05, we reject \(H_{0}\) at the 5% level of significance. So, we can conclude at the 5% level of significance that the median of the post-test scores is greater than that of the pre-test scores. Hence the awareness of AIDS among students increases after providing the educational booklet.

5.3.2. Kruskal - Wallis Test

The Kruskal–Wallis test by ranks, Kruskal–Wallis H test , or one-way ANOVA on ranks is a non-parametric method for testing whether samples originate from the same distribution. It is used for comparing two or more independent samples of equal or different sample sizes.

Since it is a non parametric method, the Kruskal–Wallis test does not assume a normal distribution of the residuals, unlike the analogous one-way analysis of variance.

Rank all data from all groups together; i.e., rank the data from 1 to N ignoring group membership. Assign any tied values the average of the ranks they would have received had they not been tied.

The test statistic is given by

H = \((N-1)\frac{\sum_{i=1}^{g}n_{i}(\bar{r}_{i}-\bar{r})^{2}}{\sum_{i=1}^{g}\sum_{j=1}^{n_{i}}(r_{ij}-\bar{r})^{2}}\),

where

• N is the total number of observations across all groups

• g is the number of groups

• \(n_{i}\) is the number of observations in group \(i\)

• \(r_{ij}\) is the rank (among all observations) of observation \(j\) from group \(i\)

• \(\bar{r}_{i}\) = \(\frac{\sum_{j=1}^{n_{i}}r_{ij}}{n_{i}}\) is the average rank of all observations in group \(i\)

• \(\bar{r}\) = \(\frac{(N+1)}{2}\) is the average of all the \(r_{ij}\).

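If the data contain no ties, the denominator of the expression for H is exactly \(\frac{(N-1)N(N+1)}{12}\).

Finally, the decision to reject or not reject the null hypothesis is made by comparing H to a critical value \(H_{c}\) obtained from a table for a given significance (alpha) level. When the samples are not very small, H is approximately chi-squared distributed with g − 1 degrees of freedom under the null hypothesis; this is the approximation reported in the R output (df = 4 for the five departments).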
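The steps above (pooled ranking, group rank averages, the H statistic and the chi-squared approximation) can be sketched in R and checked against `kruskal.test()`. The scores below are made up for illustration (three groups of three values, no ties); they are not the project data.

```r
# Illustrative data only (NOT the project data): g = 3 groups, no ties
scores <- c(7, 9, 12, 14, 15, 18, 20, 21, 24)
group  <- factor(rep(c("A", "B", "C"), each = 3))

N       <- length(scores)
r       <- rank(scores)            # rank all observations together
r_bar   <- mean(r)                 # equals (N + 1) / 2
n_i     <- tapply(r, group, length)
r_bar_i <- tapply(r, group, mean)  # average rank within each group

# Kruskal-Wallis H statistic as defined above
H <- (N - 1) * sum(n_i * (r_bar_i - r_bar)^2) / sum((r - r_bar)^2)

# Chi-squared approximation with g - 1 degrees of freedom
p_value <- pchisq(H, df = nlevels(group) - 1, lower.tail = FALSE)

kt <- kruskal.test(scores ~ group)
print(c(manual = H, kruskal = unname(kt$statistic)))
print(c(manual = p_value, kruskal = kt$p.value))
```

Because the ranks replace the raw scores, the result is unchanged by any monotone transformation of the data, which is why the test does not rely on normality.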

Result of the test:

PRE TEST:


    Kruskal-Wallis rank sum test

data:  PRE.TEST by Subject
Kruskal-Wallis chi-squared = 35.723, df = 4, p-value = 3.3e-07

POST TEST:


    Kruskal-Wallis rank sum test

data:  POST.TEST by Subject
Kruskal-Wallis chi-squared = 25.322, df = 4, p-value = 4.334e-05

Interpretation:

PRE-TEST:

As p-value = 3.3e-07 < 0.05, we reject \(H_{0}\) at the 5% level of significance. So, we can conclude at the 5% level of significance that there is a significant difference in awareness among students from different departments before providing the booklet (the same conclusion as the one-way ANOVA).

POST-TEST:

As, p-value = 4.334e-05 < 0.05, we reject \(H_{0}\) at 5% level of significance . So, we can conclude at 5% level of significance that there is a significant difference in awareness among students from different departments after providing the booklet.

6. CONCLUSION

Our analysis shows a significant difference in AIDS awareness among students from different departments both before and after distributing the AIDS awareness booklet (Kruskal–Wallis test, p < 0.05 in each case). The Wilcoxon signed-rank test indicates a significant increase in median test scores after distribution, demonstrating the booklet’s effectiveness in raising awareness.

We utilized a range of parametric and non-parametric tests to ensure the robustness of our analysis. However, it is important to note that the project was based on a small data set, which may limit the generalizability and statistical power of our findings. Despite this limitation, the results suggest that the AIDS awareness booklet is a valuable tool for improving knowledge among students.

7. APPENDIX

The R code used for this project is given below.

# Read the data (adjust the path to where aidsnew.csv is stored)
data=read.csv("C:\\Users\\HP\\Downloads\\aidsnew.csv")
attach(data)
Subject=as.factor(Subject)

# Pie chart of the number of students per department
slices=c(sum(Subject=="Chemistry(Special)"),sum(Subject=="Botany(Special)"),
         sum(Subject=="Microbiology (SYBSc level)"),sum(Subject=="Microbiology(Special level)"),
         sum(Subject=="Zoology(Special)"))
pct=round(slices/sum(slices)*100)
labls=c("Chemistry","Botany","Micro SYBSc","Micro Sp","Zoology")
labls=paste(labls,pct)
labls=paste(labls,"%",sep="")
pie(slices,labels=labls,col=c("Red","Blue","yellow","green","orange"))

# Boxplot of pre-test and post-test scores
box=data.frame("Test"=c(rep("PRE.TEST",72),rep("POST.TEST",72)),"Scores"=c(PRE.TEST,POST.TEST))
library("ggplot2")
# theme_classic() comes first so that it does not overwrite the panel border
score_boxplot=ggplot(box,aes(x=Test,y=Scores,fill=Test))+geom_boxplot()+
  scale_fill_manual(values=c("#E69F00","#FF0001"))+scale_x_discrete(limits=c("PRE.TEST","POST.TEST"))+
  theme_classic()+theme(panel.border=element_rect(fill=NA))+labs(title="Boxplot for test scores")
score_boxplot
unlist(layer_data(score_boxplot)$outliers)  # values flagged as outliers
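
# Histogram of pre-test scores with a standard normal curve overlaid
# for a visual comparison of shape (the two plots do not share axes)
hist(PRE.TEST,main="PRE TEST",xlab="Pre Test Scores",col="light blue")
par(new=TRUE)
x=seq(-3,3,0.01)
y=dnorm(x)
plot(x,y,type="l",xlab="",ylab="",main="",xaxt="n",yaxt="n")

# Same display for the post-test scores
hist(POST.TEST,main="POST TEST",xlab="Post Test Scores",col="light green")
par(new=TRUE)
x=seq(-3,3,0.01)
y=dnorm(x)
plot(x,y,type="l",xlab="",ylab="",main="",xaxt="n",yaxt="n")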

# Shapiro-Wilk normality tests
shapiro.test(PRE.TEST)

shapiro.test(POST.TEST)

# Paired comparison of pre-test and post-test scores (H1: PRE - POST < 0)
t.test(PRE.TEST,POST.TEST,alternative="less",paired =TRUE,conf.level=0.95)

wilcox.test(PRE.TEST,POST.TEST,alternative="less",paired =TRUE,conf.level=0.95)

# One-way ANOVA by department
anova1=aov(PRE.TEST~Subject,data=data)
anova2=aov(POST.TEST~Subject,data=data)

summary(anova1)
summary(anova2)

# Fisher's LSD test for pairwise comparisons
library("agricolae")

print(LSD.test(anova1,"Subject"))
print(LSD.test(anova2,"Subject"))

# Kruskal-Wallis tests by department
kruskal.test(PRE.TEST~Subject)
kruskal.test(POST.TEST~Subject)