set.seed(20)

1 Introduction

ID3 (Iterative Dichotomiser 3) is a popular algorithm for building decision trees, a type of predictive model used in machine learning and statistics. It is particularly suited to classification problems where the target variable is categorical. Developed by Ross Quinlan, ID3 constructs a decision tree using a top-down, greedy approach, selecting at each step the attribute that maximizes information gain.

ID3 can be adapted to handle continuous data as well. To do this, the continuous data needs to be converted into categorical attributes by defining thresholds, that is, by splitting the continuous attribute into intervals or bins. One common method is to choose the threshold that provides the highest information gain (see https://www.kaggle.com/code/mrbisht/discretization-continuous-variables).
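As a rough illustration of threshold selection, the sketch below scores candidate cut points by entropy-based information gain and keeps the best one. The helper names (`entropy`, `info_gain`, `best_threshold`), the use of midpoints between consecutive values as candidates, and the `hours`/`passed` data are all assumptions made for this example, not part of the original material.

```r
# Shannon entropy (in bits) of a categorical vector
entropy <- function(y) {
  p <- table(y) / length(y)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain obtained by splitting x at a given threshold
info_gain <- function(x, y, threshold) {
  left <- y[x <= threshold]
  right <- y[x > threshold]
  w_left <- length(left) / length(y)
  w_right <- length(right) / length(y)
  entropy(y) - (w_left * entropy(left) + w_right * entropy(right))
}

# Try the midpoints between consecutive distinct values and keep the best
best_threshold <- function(x, y) {
  v <- sort(unique(x))
  candidates <- (head(v, -1) + tail(v, -1)) / 2
  gains <- vapply(candidates, function(t) info_gain(x, y, t), numeric(1))
  candidates[which.max(gains)]
}

# Hypothetical data: hours of study and a pass/fail label
hours <- c(1, 2, 3, 4, 5, 6)
passed <- c("no", "no", "no", "yes", "yes", "yes")

best_threshold(hours, passed)  # 3.5, which separates the two classes cleanly
```

Once a threshold is chosen, the continuous attribute can be replaced by a two-level categorical one (for example "<= 3.5" and "> 3.5") and handled like any other ID3 attribute.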

ID3 primarily uses information gain based on entropy reduction to split attributes. However, some variations or adaptations of decision tree algorithms may use standard deviation reduction (SDR), particularly when dealing with continuous target variables in regression trees. This is more commonly associated with algorithms like CART (Classification and Regression Trees).

2 Standard Deviation Reduction (SDR)

When dealing with continuous data, especially for regression tasks, standard deviation reduction can be used to measure the effectiveness of a split. The basic idea is to choose splits that reduce the standard deviation within the resulting subsets.

The standard deviation is used to measure the homogeneity of a numerical sample. If the numerical sample is completely homogeneous, its standard deviation is zero (0).
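The homogeneity idea can be checked with a short sketch. It assumes the population form of the standard deviation (dividing by n); `pop_sd` is a helper name chosen for this example and both vectors are hypothetical. Note that R's built-in `sd()` divides by n - 1, so it needs rescaling before it can be compared.

```r
# Population standard deviation (divides by n)
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

homogeneous <- c(12, 12, 12, 12)  # hypothetical scores, all identical
mixed <- c(8, 12, 16)             # hypothetical scores with some spread

pop_sd(homogeneous)  # 0
pop_sd(mixed)        # about 3.27

# sd() uses n - 1 in the denominator, so rescale it for comparison
n <- length(mixed)
all.equal(pop_sd(mixed), sd(mixed) * sqrt((n - 1) / n))  # TRUE
```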

In this manual, we’ll do a short demonstration of this algorithm, and also implement it in R.

To start with, consider the table below, which is the training data. From this data we are to train a model that is capable of predicting student scores based on some given features. After training this model, teachers, parents, and students should be able to estimate what a student’s score would be based on those features. In this problem, the target variable is G3, while the features are guardian, sex, studytime, and paid.

sex	guardian	studytime	paid	G3
F	other	3	yes	13
F	mother	2	no	14
F	mother	3	no	15
F	mother	1	no	11
F	mother	2	no	17
F	mother	2	no	11
M	mother	2	no	10
F	father	2	no	10
F	other	2	no	9

3 Computational Implementation

Firstly, we state the formulas required for this process. For a target variable T and a predictor X, we have that:

1. The mean of the continuous target variable is given as \[\bar{T} = \frac{1}{n} \sum_{i=1}^{n} T_i\]
2. The standard deviation of the target variable is given as \[\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (T_i - \bar{T})^2}\]
3. The coefficient of variation is a measure of relative variability. It is the ratio of the standard deviation to the mean, is often expressed as a percentage, and is given as \[CV = \frac{\sigma}{\bar{T}} \times 100\%\]
4. The standard deviation for two attributes (predictor and target) is the weighted sum of the standard deviations within each level c of X, where P(c) is the proportion of instances with level c: \[\begin{equation} \sigma(T,X) = \sum_{c \in X}P(c)\sigma(c) \end{equation}\]
5. Finally, the standard deviation reduction is calculated with the equation below: \[\begin{equation} \sigma R(T,X) = \sigma(T) - \sigma (T,X) \end{equation}\]
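The formulas above translate fairly directly into R. The following is a minimal sketch rather than a definitive implementation: the function names (`pop_sd`, `cv`, `weighted_sd`, `sdr`) and the toy `score`/`group` vectors are assumptions introduced for this example.

```r
# Standard deviation with the 1/n denominator used in the formulas above
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

# Coefficient of variation, as a percentage
cv <- function(x) pop_sd(x) / mean(x) * 100

# sigma(T, X): standard deviation of each level of X, weighted by P(c)
weighted_sd <- function(target, attribute) {
  groups <- split(target, attribute)
  sum(vapply(groups,
             function(g) length(g) / length(target) * pop_sd(g),
             numeric(1)))
}

# sigma R(T, X) = sigma(T) - sigma(T, X)
sdr <- function(target, attribute) {
  pop_sd(target) - weighted_sd(target, attribute)
}

# Hypothetical toy data: two tight groups that are far apart
score <- c(10, 12, 20, 22)
group <- c("a", "a", "b", "b")

pop_sd(score)              # about 5.10
weighted_sd(score, group)  # 1
sdr(score, group)          # about 4.10
```

A large reduction, as in this toy case, means the attribute separates the target into much more homogeneous subsets.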

3.1 Decision criteria and Splitting conditions

The attribute (predictor) with the largest standard deviation reduction is chosen for the decision node.
A branch stops splitting when its CV is smaller than 10% or when only a few instances (n) remain in that branch after the split. If a branch fails this check, we split it further to reduce the impurity.
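These two rules can be expressed as small helper functions. This is only a sketch: the text does not say how many instances count as "few", so `min_n = 3` is a hypothetical cutoff, and the function names and the SDR values passed to `pick_attribute` are placeholders for illustration only.

```r
# TRUE when a branch should become a leaf: CV below the threshold,
# or too few instances left (min_n is an assumed cutoff)
should_stop <- function(x, cv_threshold = 10, min_n = 3) {
  cv <- sqrt(mean((x - mean(x))^2)) / mean(x) * 100
  length(x) <= min_n || cv < cv_threshold
}

# Name of the attribute with the largest standard deviation reduction
pick_attribute <- function(sdr_values) names(which.max(sdr_values))

should_stop(c(18, 19))                         # TRUE: only two instances
should_stop(c(18, 19, 18, 19))                 # TRUE: CV is about 2.7%
should_stop(c(8, 11, 18, 14, 19, 11, 6, 9))    # FALSE: CV is about 36%

# Placeholder SDR values for three hypothetical attributes
pick_attribute(c(a = 0.4, b = 1.1, c = 0.7))   # "b"
```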

3.2 Process of Training

We start training the model by calculating the mean of the continuous target variable and then its standard deviation.

- Count of G3 = 8

\[\begin{equation} \bar{G3} = \frac{1}{n} \sum_{i=1}^{n} G3_i = \frac{8+11+18+14+19+11+6+9}{8} \end{equation}\]

\[= \frac{96}{8}\] \[=12\] \[\begin{equation} \sigma (G3) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (T_i - \bar{G3})^2} =\sqrt{\frac{(8-12)^2+(11-12)^2+(18-12)^2+(14-12)^2+(19-12)^2+(11-12)^2+(6-12)^2+(9-12)^2}{8}} \\ =\sqrt{\frac{(-4)^2+(-1)^2+(6)^2+(2)^2+(7)^2+(-1)^2+(-6)^2+(-3)^2}{8}}\\ =\sqrt{\frac{16+1+36+4+49+1+36+9}{8}}\\ =\sqrt{\frac{152}{8}}\\ =\sqrt{19}\\ = 4.36 \end{equation}\]

\[\begin{equation} CV = \frac{\sigma (G3)}{\bar{G3}} \times 100\% \\ = \frac{4.36}{12} \times 100\% = 36.33\% \end{equation}\]
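The same three quantities can be checked in R with the eight G3 values used in the calculation above. The variable names are chosen for this sketch, and the standard deviation is computed by hand because R's `sd()` divides by n - 1 rather than n.

```r
# The eight G3 values used in the hand calculation
g3 <- c(8, 11, 18, 14, 19, 11, 6, 9)

g3_mean <- mean(g3)                     # 12
g3_sd <- sqrt(mean((g3 - g3_mean)^2))   # sqrt(19), about 4.36
g3_cv <- g3_sd / g3_mean * 100          # about 36.3 (percent)

round(c(mean = g3_mean, sd = g3_sd, cv = g3_cv), 2)
```

The last digit of the CV can differ slightly from the hand calculation, which divides the already rounded 4.36 by 12.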

3.2.1 Guardian feature

Now we calculate the mean, standard deviation, and coefficient of variation of G3 within each level of every attribute, and from these the standard deviation reduction of each attribute with respect to the target variable (G3). The first feature is guardian.

A quick explanation of how this is done: in the dataset, find the instances where the guardian attribute is Father and pick the corresponding G3 values. Do the same for the other guardian levels, and you’ll have the results below:

3.2.1.1 Father serving as a Guardian

Count of Father = 3 G3 of Guardian as Father = 8, 14, 11.

Mean of G3, when Guardians is Father \[\begin{equation} \bar{(G3|G=F)} = \frac{8+14+11}{3} \\ =11 \end{equation}\]
Standard deviation of G3 when Guardians is Father \[\begin{equation} \sqrt{\frac{(8-11)^2+(14-11)^2+(11-11)^2}{3}} \\ =\sqrt{\frac{(-3)^2+(3)^2+(0)^2}{3}}\\ =\sqrt{\frac{9+9}{3}}\\ =\sqrt{\frac{18}{3}}\\ =\sqrt{6}\\ =2.45 \end{equation}\]
CV of G3 when Guardians is Father

\[\begin{equation} CV(Father) = \frac{\sigma(Father)}{\bar{G3|G=Father}} \times 100\% \\ CV(Father) = \frac{2.45}{11} \times 100\% \\ CV(Father) = 22.27\% \end{equation}\]

3.2.1.2 Mother serving as a Guardian

Count of Mother = 3 G3 of Guardian as Mother = 11, 19, 9.

-Mean of G3, when Guardians is Mother \[\begin{equation} \bar{G3|G=M} = \frac{11+19+9}{3}\\ =\frac{39}{3}\\ =13 \end{equation}\]

Standard deviation of G3 when Guardian is Mother \[\begin{equation} \sqrt{\frac{(11-13)^2+(19-13)^2+(9-13)^2}{3}} \\ =\sqrt{\frac{(-2)^2+(6)^2+(-4)^2}{3}}\\ =\sqrt{\frac{4+36+16}{3}}\\ =\sqrt{\frac{56}{3}}\\ =\sqrt{18.667}\\ =4.32 \end{equation}\]
CV of Mother as a Guardian \[\begin{equation} CV(Mother) = \frac{\sigma(Mother)}{\bar{G3|G=Mother}} \times 100\% \\ CV(Mother) = \frac{4.32}{13} \times 100\% \\ CV(Mother) = 33.23\% \end{equation}\]

3.2.1.3 Other forms of Guardian

Count of Other = 2 G3 of Guardian as Other = 18, 6.

-Mean \[\begin{equation} \bar{G3|G=O} = \frac{18+6}{2} \\ =\frac{24}{2} \\ =12 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(18-12)^2+(6-12)^2}{2}} \\ =\sqrt{\frac{(6)^2+(-6)^2}{2}}\\ =\sqrt{\frac{36+36}{2}}\\ =\sqrt{\frac{72}{2}}\\ =\sqrt{36}\\ =6 \end{equation}\]
CV of Other forms of Guardian \[\begin{equation} CV(Other) = \frac{\sigma(O)}{\bar{G3|G=Other}} \times 100\% \\ CV(Other) = \frac{6}{12} \times 100\% \\ CV(Other) = 50\% \end{equation}\]
Standard deviation of Guardian \[\begin{equation} \sigma (G) = \sum_{c \in G} P(c) \sigma (c) \\ =\frac{3}{8} \times 2.45 + \frac{3}{8}\times 4.32 + \frac{2}{8}\times 6 \\ = 0.9188 + 1.62 + 1.5 \\ = 4.04 \end{equation}\]
Standard deviation Reduction of Guardian \[\begin{equation} \sigma R(G3, G) = \sigma (G3) - \sigma (G) \\ = 4.36 - 4.04\\ = 0.32 \end{equation}\]

Guardian	Ave	STDev	CV	Count
Father	11	2.45	22.27	3.00
Mother	13	4.32	33.23	3.00
Other	12	6	50	2.00
SDR				0.32
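The Guardian table can be reproduced with a few lines of R. This sketch uses the G3 values listed for each guardian level above; `pop_sd`, `stats`, and the other variable names are assumptions made for the example. Because the code keeps full precision while the hand calculation rounds intermediate values, the last digit of the result may differ slightly.

```r
# G3 values grouped by guardian, as listed in the subsections above
g3 <- c(8, 14, 11, 11, 19, 9, 18, 6)
guardian <- rep(c("Father", "Mother", "Other"), times = c(3, 3, 2))

# Standard deviation with the 1/n denominator
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

groups <- split(g3, guardian)
stats <- data.frame(
  Ave = sapply(groups, mean),
  STDev = sapply(groups, pop_sd),
  Count = sapply(groups, length)
)
stats$CV <- stats$STDev / stats$Ave * 100

# Weighted standard deviation of Guardian and the resulting reduction
weighted <- sum(stats$Count / sum(stats$Count) * stats$STDev)
sdr <- pop_sd(g3) - weighted

round(stats, 2)
round(c(weighted = weighted, sdr = sdr), 2)
```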

3.2.2 Sex feature

3.2.2.1 Female candidate

Count of Female = 5 G3 of Female sex = 8, 14, 11, 6, 9.

-Mean of G3, when Sex is Female \[\begin{equation} \bar{G3|Sex=F} = \frac{8+14+11+6+9}{5}\\ =\frac{48}{5}\\ =9.6 \end{equation}\]

Standard deviation of G3 when Sex is Female \[\begin{equation} \sqrt{\frac{(8-9.6)^2+(14-9.6)^2+(11-9.6)^2+(6-9.6)^2+(9-9.6)^2}{5}}\\ =\sqrt{\frac{(-1.6)^2+(4.4)^2+(1.4)^2+(-3.6)^2+(-0.6)^2}{5}}\\ =\sqrt{\frac{2.56+19.36+1.96+12.96+0.36}{5}}\\ =\sqrt{\frac{37.2}{5}}\\ =\sqrt{7.44}\\ =2.73 \end{equation}\]
CV of G3 when Sex is Female \[\begin{equation} CV(Female) = \frac{\sigma(Female)}{\bar{G3|Sex=Female}} \times 100\%\\ CV(Female) = \frac{2.73}{9.6} \times 100\%\\ CV(Female) = 28.41\% \end{equation}\]

3.2.2.2 Male candidate

Count of Male = 3 G3 of Male sex = 11, 18, 19.

-Mean of G3, when Sex is Male \[\begin{equation} \bar{G3|Sex=M} = \frac{11+18+19}{3}\\ =\frac{48}{3}\\ =16 \end{equation}\]

Standard deviation of G3 when Sex is Male \[\begin{equation} \sqrt{\frac{(11-16)^2+(18-16)^2+(19-16)^2}{3}}\\ =\sqrt{\frac{(-5)^2+(2)^2+(3)^2}{3}}\\ =\sqrt{\frac{25+4+9}{3}}\\ =\sqrt{\frac{38}{3}}\\ =\sqrt{12.67}\\ =3.56 \end{equation}\]
CV of G3 when Sex is Male \[\begin{equation} CV(Male) = \frac{\sigma(Male)}{\bar{G3|Sex=Male}} \times 100\%\\ CV(Male) = \frac{3.56}{16} \times 100\%\\ CV(Male) = 22.25\% \end{equation}\]
Standard deviation of Sex \[\begin{equation} \sigma (S) = \sum_{c \in S} P(c) \sigma (c)\\ =\frac{5}{8} \times 2.73 + \frac{3}{8}\times 3.56 \\ = 1.7063 + 1.335\\ = 3.0413 \end{equation}\]
Standard deviation Reduction of Sex \[\begin{equation} \sigma R(G3, S) = \sigma (G3) - \sigma (S) \\ = 4.36 - 3.04\\ = 1.32 \end{equation}\]

Sex	Ave	STDev	CV	Count
Female	9.6	2.73	28.41	5.00
Male	16	3.56	22.25	3.00
SDR				1.32

3.2.3 Study time feature

3.2.3.1 One hour of study per day

Count of One hour of study per day = 3 G3 of One hour = 8, 18, 11.

-Mean of G3, when Student study for one hr \[\begin{equation} \bar{G3|Study= 1} = \frac{8+18+11}{3} \\ =\frac{37}{3}\\ =12.33 \end{equation}\]

Standard deviation of G3 when Student study for one hr \[\begin{equation} \sqrt{\frac{(8-12.33)^2+(18-12.33)^2+(11-12.33)^2}{3}}\\ =\sqrt{\frac{(-4.33)^2+(5.67)^2+(-1.33)^2}{3}}\\ =\sqrt{\frac{18.7489+32.1489+1.7689}{3}}\\ =\sqrt{\frac{52.6667}{3}}\\ =\sqrt{17.56}\\ =4.19 \end{equation}\]
CV of G3 when Study time is one hr \[\begin{equation} CV(One) = \frac{\sigma(One)}{\bar{G3|Studytime=One}} \times 100\% \\ CV(One) = \frac{4.19}{12.33} \times 100\% \\ CV(One) = 33.98\% \end{equation}\]

3.2.3.2 Two hours of study per day

Count of Two hours of study per day = 2 G3 of Two hours = 11, 6.

-Mean of G3, when Student study for two hrs \[\begin{equation} \bar{G3|Study= 2} = \frac{11+6}{2} \\ =\frac{17}{2}\\ = 8.5 \end{equation}\]

Standard deviation of G3 when Student study for two hrs \[\begin{equation} \sqrt{\frac{(11-8.5)^2+(6-8.5)^2}{2}}\\ =\sqrt{\frac{(2.5)^2+(-2.5)^2}{2}}\\ =\sqrt{\frac{6.25+6.25}{2}}\\ =\sqrt{\frac{12.5}{2}}\\ =\sqrt{6.25}\\ =2.5 \end{equation}\]
CV of G3 when Study time is two hrs \[\begin{equation} CV(Two) = \frac{\sigma(Two)}{\bar{G3|Studytime=Two}} \times 100\% \\ CV(Two) = \frac{2.5}{8.5} \times 100\% \\ CV(Two) = 29.41\% \end{equation}\]

3.2.3.3 Three hours of study per day

Count of Three hours of study per day = 2 G3 of Three hours = 19, 9.

-Mean of G3, when Student study for three hrs \[\begin{equation} \bar{G3|Study= 3} = \frac{19+9}{2} \\ =\frac{28}{2}\\ =14 \end{equation}\]

Standard deviation of G3 when Student study for three hrs \[\begin{equation} \sqrt{\frac{(19-14)^2+(9-14)^2}{2}}\\ =\sqrt{\frac{(5)^2+(-5)^2}{2}}\\ =\sqrt{\frac{25+25}{2}}\\ =\sqrt{\frac{50}{2}}\\ =\sqrt{25}\\ =5 \end{equation}\]
CV of G3 when Study time is three hrs \[\begin{equation} CV(Three) = \frac{\sigma(Three)}{\bar{G3|Studytime=Three}} \times 100\% \\ CV(Three) = \frac{5}{14} \times 100\% \\ CV(Three) = 35.71\% \end{equation}\]

3.2.3.4 Four hours of study per day

Count of Four hours of study per day = 1 G3 of Four hours = 14.

-Mean of G3, when Student study for four hrs \[\begin{equation} \bar{G3|Study= 4} = \frac{14}{1}\\ =14 \end{equation}\]

Standard deviation of G3 when Student study for four hrs \[\begin{equation} \sqrt{\frac{(14-14)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =0 \end{equation}\]
CV of G3 when Study time is four hrs \[\begin{equation} CV(Four) = \frac{\sigma(Four)}{\bar{G3|Studytime=Four}} \times 100\%\\ CV(Four) = \frac{0}{14} \times 100\%\\ CV(Four) = 0\% \end{equation}\]
Standard deviation of Study time \[\begin{equation} \sigma (St) = \sum_{c \in St} P(c) \sigma (c)\\ =\frac{3}{8} \times 4.19 + \frac{2}{8}\times 2.5+ \frac{2}{8} \times 5 + \frac{1}{8} \times 0 \\ = 1.571 + 0.625 + 1.25 + 0\\ = 3.446 \end{equation}\]
Standard deviation Reduction of Study time \[\begin{equation} \sigma R(G3, St) = \sigma (G3) - \sigma (St)\\ = 4.36 - 3.446\\ = 0.914 \end{equation}\]

StudyTime	Ave	STDev	CV	Count
1	12.33	4.19	33.98	3.000
2	8.5	2.5	29.41	2.000
3	14	5	35.71	2.000
4	14	0	0	1.000
SDR				0.914

3.2.4 Paid feature

3.2.4.1 Student who has not paid

Count of not Paid = 4 G3 of student who has not Paid = 8, 11, 14 ,6.

-Mean of G3, when student has not Paid \[\begin{equation} \bar{G3|Paid=No} = \frac{8+11+14+6}{4} \\ =\frac{39}{4}\\ =9.75 \end{equation}\]

Standard deviation of G3 when Student has not paid \[\begin{equation} \sqrt{\frac{(8-9.75)^2+(14-9.75)^2+(11-9.75)^2+(6-9.75)^2}{4}}\\ =\sqrt{\frac{(-1.75)^2+(4.25)^2+(1.25)^2+(-3.75)^2}{4}}\\ =\sqrt{\frac{3.0625+18.0625+1.5625+14.0625}{4}}\\ =\sqrt{\frac{36.75}{4}}\\ =\sqrt{9.1875}\\ =3.031 \end{equation}\]
CV of G3 when Student has not paid \[\begin{equation} CV(No) = \frac{\sigma(No)}{\bar{G3|Paid=No}} \times 100\%\\ CV(No) = \frac{3.031}{9.75} \times 100\%\\ CV(No) = 31.09\% \end{equation}\]

3.2.4.2 Student who has paid

Count of student who has Paid = 4 G3 of student who has paid = 11, 18, 19,9.

-Mean of G3, when Student has paid \[\begin{equation} \bar{G3|Paid=Yes} = \frac{11+18+19+9}{4}\\ =\frac{57}{4}\\ =14.25 \end{equation}\]

Standard deviation of G3 when Student has paid \[\begin{equation} \sqrt{\frac{(11-14.25)^2+(18-14.25)^2+(19-14.25)^2+(9-14.25)^2}{4}}\\ =\sqrt{\frac{(-3.25)^2+(3.75)^2+(4.75)^2+(-5.25)^2}{4}}\\ =\sqrt{\frac{10.5625+14.0625+22.5625+27.5625}{4}}\\ =\sqrt{\frac{74.75}{4}}\\ =\sqrt{18.6875}\\ =4.32 \end{equation}\]
CV of G3 when Student has paid \[\begin{equation} CV(Yes) = \frac{\sigma(Yes)}{\bar{G3|Paid=Yes}} \times 100\%\\ CV(Yes) = \frac{4.32}{14.25} \times 100\%\\ CV(Yes) = 30.32\% \end{equation}\]
Standard deviation of Paid \[\begin{equation} \sigma (P) = \sum_{c \in P} P(c) \sigma (c)\\ =\frac{4}{8} \times 3.031 + \frac{4}{8}\times 4.32 \\ = 1.5155 + 2.16\\ = 3.6755 \end{equation}\]
Standard deviation Reduction of Paid

\[\begin{equation} \sigma R(G3, S) = \sigma (G3) - \sigma (S)\\ = 4.36 - 2.16\\ = 2.2 \end{equation}\]

Paid	Ave	STDev	CV	Count
No	9.75	3.03	31.09	4.0
Yes	14.25	4.32	30.32	4.0
SDR				2.2

3.3 Selecting the node for the split.

From the results in the tables above, Paid has the largest SDR (i.e. \(\sigma\)R) of the four features, at 2.2, ahead of Guardian, Sex, and Study time. Paid is therefore chosen as the first decision node.

Both levels of Paid (No and Yes) have a coefficient of variation greater than 10%, which is the threshold for splitting a branch. Hence, we split these two branches further.

3.3.1 Paid to No

Guardian	Sex	StudyTIME	G3
Father	F	1	8
Mother	M	2	11
Father	F	4	4
Other	F	2	6

Average = 9.75, STDev = 3.03, CV = 31.09

3.3.1.1 Father as a Guardians for student who has not paid

Count of not Paid = 2 G3 of student for such students = 8,4.

-Mean of G3 such students \[\begin{equation} \bar{G3|G=Father} = \frac{8+4}{2} \\ =6 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(8-6)^2+(4-6)^2}{2}}\\ =\sqrt{\frac{(2)^2+(-2)^2}{2}}\\ =\sqrt{\frac{4+4}{2}}\\ =\sqrt{\frac{8}{2}}\\ =\sqrt{4}\\ =2 \end{equation}\]
CV of G3 \[\begin{equation} CV(Father) = \frac{\sigma(Father)}{\bar{G3|G=Father}} \times 100\%\\ CV(Father) = \frac{2}{6} \times 100\%\\ CV(Father) = 33.3\% \end{equation}\]

3.3.1.2 Mother as a Guardians for student who has not paid

Count of not Paid = 1 G3 of student for such students = 11.

-Mean of G3 such students \[\begin{equation} \bar{G3|G=Mother} = \frac{11}{1} \\ = 11 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(11-11)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Mother) = \frac{\sigma(Mother)}{\bar{G3|G=Mother}} \times 100\%\\ CV(Mother) = \frac{0}{11} \times 100\%\\ CV(Mother) = 0\% \end{equation}\]

3.3.1.3 Other forms of Guardians for student who has not paid

Count of not Paid = 1 G3 of student for such students = 6.

-Mean of G3 such students \[\begin{equation} \bar{G3|G=Other} = \frac{6}{1} \\ =6 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(6-6)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Other) = \frac{\sigma(Other)}{\bar{G3|G=Other}} \times 100\%\\ CV(Other) = \frac{0}{6} \times 100\%\\ CV(Other) = 0\% \end{equation}\]
Standard deviation of Paid to Guardians \[\begin{equation} \sigma (G) = \sum_{c \in G} P(c) \sigma (c)\\ =\frac{1}{4} \times 0 + \frac{2}{4}\times 2 + \frac{1}{4} \times 0 \\ = 0 + 1+ 0 \\ = 1 \end{equation}\]
Standard deviation Reduction of Paid

\[\begin{equation} \sigma R(G3, G) = \sigma (G3) - \sigma (G)\\ = 3.03 - 1\\ = 2.03 \end{equation}\]

Guardian	STDev	CV	Ave	Count
Father	2	33.3	6	2.00
Mother	0	0	11	1.00
Other	0	0	6	1.00
SDR				2.03

3.3.1.4 Female student who has not paid

Count of Female student who has not paid = 3 G3 of student for such students = 8, 4, 6.

-Mean of G3 for such students \[\begin{equation} \bar{G3|S=Female} = \frac{8+4+6}{3} \\ = \frac{18}{3}\\ = 6 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(8-6)^2+(4-6)^2+(6-6)^2}{3}}\\ =\sqrt{\frac{(2)^2+(-2)^2+(0)^2}{3}}\\ =\sqrt{\frac{4+4+0}{3}}\\ =\sqrt{\frac{8}{3}}\\ =\sqrt{2.67}\\ = 1.63 \end{equation}\]
CV of G3 \[\begin{equation} CV(Female) = \frac{\sigma(Female)}{\bar{G3|S=Female}} \times 100\%\\ CV(Female) = \frac{1.63}{6} \times 100\%\\ CV(Female) = 27.22\% \end{equation}\]

3.3.1.5 Male student who has not paid

Count of Male student who has not paid = 1 G3 of student for such students = 11. \[\begin{equation} \bar{G3|S=Male} = \frac{11}{1} \\ = 11 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(11-11)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Male) = \frac{\sigma(Male)}{\bar{G3|S=Male}} \times 100\%\\ CV(Male) = \frac{0}{11} \times 100\%\\ CV(Male) = 0\% \end{equation}\]
Standard deviation of Paid to Sex \[\begin{equation} \sigma (PS) = \sum_{c \in S} P(c) \sigma (c)\\ =\frac{3}{4} \times 1.63 + \frac{1}{4}\times 0 \\ = 1.22 + 0 \\ = 1.22 \end{equation}\]
Standard deviation Reduction of Paid

\[\begin{equation} \sigma R(G3, PS) = \sigma (G3) - \sigma (PS)\\ = 3.03 - 1.22\\ = 1.81 \end{equation}\]

Sex	STDev	CV	Ave	Count
Female	1.63	27.22	6	3.00
Male	0	0	11	1.00
SDR				1.81

3.3.1.6 Student who study for 1 hr and has not paid

Count of Student who study for 1 hr and has not paid = 1 G3 of student for such students = 8. \[\begin{equation} \bar{G3|St=One} = \frac{8}{1} \\ = 8 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(8-8)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(One) = \frac{\sigma(One)}{\bar{G3|St=One}} \times 100\%\\ CV(One) = \frac{0}{8} \times 100\%\\ CV(One) = 0\% \end{equation}\]

3.3.1.7 Student who study for 2 hrs and has not paid

Count of students who study for 2 hrs and have not paid = 2 G3 for such students = 11, 6. \[\begin{equation} \bar{G3|St=Two} = \frac{11+6}{2} \\ = 8.5 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(11-8.5)^2+(6-8.5)^2}{2}}\\ =\sqrt{\frac{(2.5)^2+(-2.5)^2}{2}}\\ =\sqrt{\frac{6.25+6.25}{2}}\\ =\sqrt{\frac{12.5}{2}}\\ =\sqrt{6.25}\\ = 2.5 \end{equation}\]
CV of G3 \[\begin{equation} CV(Two) = \frac{\sigma(Two)}{\bar{G3|St=Two}} \times 100\%\\ CV(Two) = \frac{2.5}{8.5} \times 100\%\\ CV(Two) = 29.41\% \end{equation}\]

3.3.1.8 Student who study for 4 hrs and has not paid

Count of students who study for 4 hrs and have not paid = 1 G3 for such students = 4. \[\begin{equation} \bar{G3|St=Four} = \frac{4}{1} \\ = 4 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(4-4)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Four) = \frac{\sigma(Four)}{\bar{G3|St=Four}} \times 100\%\\ CV(Four) = \frac{0}{4} \times 100\%\\ CV(Four) = 0\% \end{equation}\]
Standard deviation of Paid to Study Time \[\begin{equation} \sigma (PSt) = \sum_{c \in St} P(c) \sigma (c)\\ =\frac{1}{4} \times 0 + \frac{2}{4}\times 2.5 +\frac{1}{4} \times 0 \\ = 0 + 1.25 + 0 \\ = 1.25 \end{equation}\]
Standard deviation Reduction of Paid

\[\begin{equation} \sigma R(G3, PS) = \sigma (G3) - \sigma (PS)\\ = 3.03 - 1.25\\ = 1.78 \end{equation}\]

StudyTIME	STDev	CV	Ave	count
1	0	0	8	1.00
2	2.5	29.41	8.5	2.00
4	0	0	4	1.00
SDR				1.78

3.3.2 Paid to Yes

Guardian	Sex	StudyTIME	G3
Other	M	1	18
Mother	M	3	19
Father	F	1	11
Mother	F	3	9

Average = 14.25, STDev = 4.32, CV = 30.32
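The branch in the table above is small enough to examine in one pass. The sketch below transcribes the four Paid = Yes rows into a data frame, computes the standard deviation reduction of each remaining attribute within the branch, and reports the attribute with the largest reduction. The names `branch`, `pop_sd`, `sdr`, and `sdr_by_feature` are assumptions introduced for this example.

```r
# Paid = Yes branch, transcribed from the table above
branch <- data.frame(
  Guardian = c("Other", "Mother", "Father", "Mother"),
  Sex = c("M", "M", "F", "F"),
  StudyTime = c(1, 3, 1, 3),
  G3 = c(18, 19, 11, 9)
)

# Standard deviation with the 1/n denominator
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

# SDR of one attribute: branch SD minus the size-weighted SD of its levels
sdr <- function(target, attribute) {
  groups <- split(target, attribute)
  weighted <- sum(vapply(groups,
                         function(g) length(g) / length(target) * pop_sd(g),
                         numeric(1)))
  pop_sd(target) - weighted
}

features <- setdiff(names(branch), "G3")
sdr_by_feature <- vapply(features,
                         function(f) sdr(branch$G3, branch[[f]]),
                         numeric(1))

round(sdr_by_feature, 2)
names(which.max(sdr_by_feature))  # "Sex"
```

Within this branch the reduction is measured against the branch's own standard deviation, not the standard deviation of the full dataset.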

3.3.2.1 Mother as a Guardians for student who has paid

Count of Paid = 2 G3 of student for such students = 19, 9. -Mean of G3 such students \[\begin{equation} \bar{G3|G=Mother} = \frac{19 + 9}{2} \\ =\frac{28}{2} = 14 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(19-14)^2+(9-14)^2}{2}}\\ =\sqrt{\frac{(5)^2+(-5)^2}{2}}\\ =\sqrt{\frac{25+25}{2}}\\ =\sqrt{\frac{50}{2}}\\ =\sqrt{25}\\ =5 \end{equation}\]
CV of G3 \[\begin{equation} CV(Mother) = \frac{\sigma(Mother)}{\bar{G3|G=Mother}} \times 100\%\\ CV(Mother) = \frac{5}{14} \times 100\%\\ CV(Mother) = 35.71\% \end{equation}\]

3.3.2.2 Other forms of Guardians for student who has paid.

Count of such students = 1 G3 for such students = 18. \[\begin{equation} \bar{G3|G=Other} = \frac{18}{1} \\ = 18 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(18-18)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Other) = \frac{\sigma(Other)}{\bar{G3|G=Other}} \times 100\%\\ CV(Other) = \frac{0}{18} \times 100\%\\ CV(Other) = 0\% \end{equation}\]

3.3.2.3 Father of Guardians for student who has paid.

Count of such students = 1 G3 for such students = 11. \[\begin{equation} \bar{G3|G=Father} = \frac{11}{1} \\ = 11 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(11-11)^2}{1}}\\ =\sqrt{\frac{(0)^2}{1}}\\ =\sqrt{0}\\ =0 \end{equation}\]
CV of G3 \[\begin{equation} CV(Father) = \frac{\sigma(Father)}{\bar{G3|G=Father}} \times 100\%\\ CV(Father) = \frac{0}{11} \times 100\%\\ CV(Father) = 0\% \end{equation}\]
Standard deviation of Paid to guardian \[\begin{equation} \sigma (PG) = \sum_{c \in G} P(c) \sigma (c)\\ =\frac{1}{4} \times 0 + \frac{2}{4}\times 5 +\frac{1}{4} \times 0 \\ = 0 + 2.5 + 0 \\ = 2.5 \end{equation}\]
Standard deviation Reduction of Guardian for students who have paid

\[\begin{equation} \sigma R(G3, PG) = \sigma (G3) - \sigma (PG)\\ = 4.32 - 2.5\\ = 1.82 \end{equation}\]

Guardian	STDev	CV	Ave	count
Mother	5	35.71	14	2.00
Other	0	0	18	1.00
Father	0	0	11	1.00
SDR				1.82

3.3.2.4 Male student who has paid

Count of Male students who have paid = 2 G3 for such students = 18, 19. -Mean of G3 for such students \[\begin{equation} \bar{(G3|S=Male)} = \frac{18 + 19}{2} \\ =\frac{37}{2} = 18.5 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(18-18.5)^2+(19-18.5)^2}{2}}\\ =\sqrt{\frac{(-0.5)^2+(0.5)^2}{2}}\\ =\sqrt{\frac{0.25+0.25}{2}}\\ =\sqrt{\frac{0.5}{2}}\\ =\sqrt{0.25}\\ =0.5 \end{equation}\]
CV of G3 \[\begin{equation} CV(Male) = \frac{\sigma(Male)}{\bar{G3|S=Male}} \times 100\%\\ CV(Male) = \frac{0.5}{18.5} \times 100\%\\ CV(Male) = 2.70\% \end{equation}\]

3.3.2.5 Female student who has paid

Count of Female students who have paid = 2 G3 for such students = 11, 9. -Mean of G3 for such students \[\begin{equation} \bar{(G3|S=Female)} = \frac{11 + 9}{2} \\ =\frac{20}{2} = 10 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(11-10)^2+(9-10)^2}{2}}\\ =\sqrt{\frac{(1)^2+(-1)^2}{2}}\\ =\sqrt{\frac{1+1}{2}}\\ =\sqrt{\frac{2}{2}}\\ =\sqrt{1}\\ =1 \end{equation}\]
CV of G3 \[\begin{equation} CV(Female) = \frac{\sigma(Female)}{\bar{G3|S=Female}} \times 100\%\\ CV(Female) = \frac{1}{10} \times 100\%\\ CV(Female) = 10\% \end{equation}\]
Standard deviation of Paid to Sex \[\begin{equation} \sigma (PSe) = \sum_{c \in S} P(c) \sigma (c)\\ =\frac{2}{4} \times 0.5 + \frac{2}{4}\times 1 \\ = 0.25 + 0.5 \\ = 0.75 \end{equation}\]
Standard deviation Reduction of Paid

\[\begin{equation} \sigma R(G3, PS) = \sigma (G3) - \sigma (PS)\\ = 4.32 - 0.75\\ = 3.57 \end{equation}\]

Sex	STDev	CV	Ave	count
Male	0.5	2.7	18.5	2.00
Female	1	10	10	2.00
SDR				3.57

3.3.2.6 Student who has paid and read for one hrs

Count of Paid = 2 G3 of student for such students = 18, 11. - Mean of G3 such students \[\begin{equation} \bar{(G3|St= one)} = \frac{18 + 11}{2} \\ =\frac{29}{2} = 14.5 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(18-14.5)^2+(11-14.5)^2}{2}}\\ =\sqrt{\frac{(3.5)^2+(-3.5)^2}{2}}\\ =\sqrt{\frac{12.25+12.25}{2}}\\ =\sqrt{\frac{24.5}{2}}\\ =\sqrt{12.25}\\ =3.5 \end{equation}\]
CV of G3 \[\begin{equation} CV(One) = \frac{\sigma(One)}{\bar{G3|St=One}} \times 100\%\\ CV(One) = \frac{3.5}{14.5} \times 100\%\\ CV(One) = 24.14\% \end{equation}\]

3.3.2.7 Student who has paid and read for three hrs

Count of Paid = 2 G3 of student for such students = 19, 9. -Mean of G3 such students \[\begin{equation} \bar{(G3|St=three)} = \frac{19 + 9}{2} \\ =\frac{28}{2} = 14 \end{equation}\]

Standard deviation \[\begin{equation} \sqrt{\frac{(19-14)^2+(9-14)^2}{2}}\\ =\sqrt{\frac{(5)^2+(-5)^2}{2}}\\ =\sqrt{\frac{25+25}{2}}\\ =\sqrt{\frac{50}{2}}\\ =\sqrt{25}\\ =5 \end{equation}\]
CV of G3 \[\begin{equation} CV(Three) = \frac{\sigma(Three)}{\bar{G3|St=Three}} \times 100\%\\ CV(Three) = \frac{5}{14} \times 100\%\\ CV(Three) = 35.71\% \end{equation}\]
Standard deviation of Paid to Study Time \[\begin{equation} \sigma (PSt) = \sum_{c \in St} P(c) \sigma (c)\\ =\frac{2}{4} \times 3.5 + \frac{2}{4}\times 5 \\ = 1.75 + 2.5 \\ = 4.25 \end{equation}\]
Standard deviation Reduction of Paid