library(readr)
Registration <- read_csv("~/School/Fall 2017/MATH 325/Math 325 Notebook/Math 325 Notebook/Data/Registration.csv")
library(mosaic)
library(ResourceSelection)
library(pander)
library(DT)
library(car)
set.seed(25)
Registration$`Rank in Camp` <- factor(Registration$`Rank in Camp`, levels = c("Scout", "Tenderfoot", "1st Class", "2nd Class", "Star", "Life", "Eagle"))
Reg <- table(Registration$`Rank in Camp`, Registration$Classes)
chis.Reg <- chisq.test(Reg, simulate.p.value = TRUE)
datatable(head(Registration, 284), options = list(columnDefs = list(list(className = 'dt-center', targets = 2)), pageLength = 5, lengthMenu = c(5, 10, 15, 20)))
Zion’s Camp is a boys’ camp held every year that involves many Boy Scout troops and LDS wards throughout a specific region. During these camps, boys and leaders take classes to earn merit badges. Earning a set number of merit badges, including certain required ones, advances a Boy Scout to the next rank. The Boy Scout ranks start at Scout and progress towards Eagle and are as follows…
During registration for Zion’s Camp on the East Coast in the summer of 2017, 284 Boy Scouts filled out their personal information along with their current rank and a schedule of the merit badge classes they planned to take during the camp. Each Boy Scout can take a maximum of 6 classes during the week. These data are recorded in the data table above.
It is assumed that as Boy Scouts become more advanced, they have less need for the merit badge courses offered at Zion’s Camp. Keeping all Boy Scouts involved is a goal for this camp, and the organizers are searching for ways to do so. Therefore, we will address the question of whether there is an association between the number of classes taken and the rank of the Boy Scout.
To determine if there is an association, we will be running a Chi-Squared test on our data. Our hypotheses are as follows…
\[ H_0:\ \text{Boy Scout rank and number of classes are independent.} \]
\[ H_a:\ \text{Boy Scout rank and number of classes are associated.} \]
\[\alpha = 0.05\]
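For reference, the Pearson chi-squared statistic used by this kind of test compares each observed count with the count expected under independence. The symbols below (\(O_{ij}\), \(E_{ij}\), \(R_i\), \(C_j\), \(n\)) are notation introduced here for illustration and do not appear in the original analysis:

\[ \chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{R_i\, C_j}{n} \]

Here \(O_{ij}\) is the observed number of scouts of rank \(i\) taking \(j\) classes, \(R_i\) is the total for rank \(i\), \(C_j\) is the total for \(j\) classes, and \(n = 284\) is the number of registered scouts.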
To represent the data, we group the bars by the number of classes taken and show the frequency of each rank within each group. This places the ranks next to each other within each category. We notice that although taking a higher number of classes is more common overall, within each category the highest frequency tends to fall among the middle ranks rather than the least or the most advanced ranks.
barplot(Reg, beside=TRUE, legend.text=TRUE, args.legend=list(x = "topleft", bty="n"), col= 1:7, xlab = "Number of Classes", ylab = "Frequency", main = "Frequency of Number of Classes Taken by Rank")
When we run a Chi-Squared Test on the data, we get a P-value of 0.01249.
pander(chis.Reg)
| Test statistic | df | P value |
|---|---|---|
| 78.48 | NA | 0.01249 * |
Because our P-value is less than our level of significance, we reject the null hypothesis that rank and number of classes are independent. We have sufficient evidence to suggest that rank and number of classes are associated. We will explore these findings below.
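The NA in the df column reflects how the test was run: with `simulate.p.value = TRUE`, `chisq.test()` estimates the P-value by Monte Carlo simulation instead of referring the statistic to a chi-squared distribution. The following is a minimal sketch of that behavior on a small made-up table; the counts in `toy` are hypothetical and are not the registration data.

```r
# Hypothetical 2 x 3 table of counts, used only for illustration.
toy <- matrix(c(12, 5, 3,
                 4, 9, 7),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              classes = c("4", "5", "6")))

set.seed(25)
B <- 2000  # number of simulated tables (the default for chisq.test)
sim <- chisq.test(toy, simulate.p.value = TRUE, B = B)

# No reference distribution is used, so the reported df is NA.
sim$parameter

# The simulated P-value has the form (1 + k) / (B + 1), where k is the
# number of simulated tables whose statistic is at least as large as
# the observed one.
k <- round(sim$p.value * (B + 1) - 1)
k
```

Assuming the default of 2000 simulated tables was used for the registration data, the reported P-value of 0.01249 matches 25/2001, which would mean 24 simulated tables produced a statistic at least as large as 78.48.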
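Checking our assumptions against the expected counts below, we notice that many of them are less than 1 and that more than half are less than 5, even though the average expected count (284 scouts spread over 42 cells, about 6.8) is above 5. The usual requirements for the chi-squared approximation have therefore not been met. Because the test was run with simulate.p.value = TRUE, the P-value is estimated by simulation rather than taken from the chi-squared distribution (which is why df is reported as NA), so it depends less on these requirements, but the results should still be read with some caution. We could exclude the Boy Scouts taking fewer than 3 classes to improve the expected counts, but to stay true to our question we will continue with all data included.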
pander(chis.Reg$expected)
| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Scout | 1.092 | 0.2183 | 1.965 | 8.077 | 16.59 | 34.06 |
| Tenderfoot | 0.6162 | 0.1232 | 1.109 | 4.56 | 9.366 | 19.23 |
| 1st Class | 0.9683 | 0.1937 | 1.743 | 7.165 | 14.72 | 30.21 |
| 2nd Class | 1.074 | 0.2148 | 1.933 | 7.947 | 16.32 | 33.51 |
| Star | 0.669 | 0.1338 | 1.204 | 4.951 | 10.17 | 20.87 |
| Life | 0.5282 | 0.1056 | 0.9507 | 3.908 | 8.028 | 16.48 |
| Eagle | 0.05282 | 0.01056 | 0.09507 | 0.3908 | 0.8028 | 1.648 |
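The expected counts above can be checked against the usual rules of thumb directly. The sketch below re-enters the rounded values from the table (so the results are approximate), recovers the row and column totals from them, and rebuilds each expected count as (row total × column total) / n. The object names are illustrative and are not part of the original analysis.

```r
# Expected counts copied from the table above (rounded as displayed).
expected <- matrix(c(
  1.092,   0.2183,  1.965,   8.077,  16.59,   34.06,
  0.6162,  0.1232,  1.109,   4.56,    9.366,  19.23,
  0.9683,  0.1937,  1.743,   7.165,  14.72,   30.21,
  1.074,   0.2148,  1.933,   7.947,  16.32,   33.51,
  0.669,   0.1338,  1.204,   4.951,  10.17,   20.87,
  0.5282,  0.1056,  0.9507,  3.908,   8.028,  16.48,
  0.05282, 0.01056, 0.09507, 0.3908,  0.8028,  1.648),
  nrow = 7, byrow = TRUE,
  dimnames = list(c("Scout", "Tenderfoot", "1st Class", "2nd Class",
                    "Star", "Life", "Eagle"),
                  as.character(1:6)))

# The margins of the expected table equal the margins of the observed
# table, so rounding the sums recovers the integer totals.
row_totals <- round(rowSums(expected))
col_totals <- round(colSums(expected))
n <- sum(row_totals)

# Expected count for each cell: (row total * column total) / n.
rebuilt <- outer(row_totals, col_totals) / n

# Rule-of-thumb checks for the chi-squared approximation.
sum(expected < 1)   # number of cells with expected count below 1
sum(expected < 5)   # number of cells with expected count below 5
mean(expected)      # average expected count, about n / 42
```

With these values, 16 of the 42 cells have an expected count below 1 and 27 are below 5, while the average expected count is about 6.8. The small expected counts come mostly from the 1, 2, and 3 class columns, whose recovered totals are only 5, 1, and 9 scouts.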
The Pearson residuals below give further insight by showing how far each observed count falls from its expected count, in either direction. A residual of zero means the observed count matched the expected count. A negative residual shows how far below the expected count the observed count was, and a positive residual shows how far above it was.
For example:
There were more Stars who took 4 classes than expected, with a residual of 0.921.
There were fewer Tenderfoots who took 5 classes than expected, with a residual of -0.7732.
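As a worked illustration, the `residuals` component returned by `chisq.test()` holds Pearson residuals, which scale the gap between observed and expected counts by the square root of the expected count. The symbols \(O\), \(E\), and \(r\) are notation introduced here:

\[ r = \frac{O - E}{\sqrt{E}} \quad\Longrightarrow\quad O = E + r\sqrt{E} \]

For Stars taking 4 classes, the expected count is 4.951 and the residual is 0.921, so

\[ O \approx 4.951 + 0.921 \times \sqrt{4.951} \approx 4.951 + 2.049 \approx 7 \]

That is, about 7 Stars registered for 4 classes when roughly 5 were expected under independence.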
pander(chis.Reg$residuals)
| | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Scout | -0.08763 | -0.4672 | 3.592 | 2.084 | -1.373 | -0.8664 |
| Tenderfoot | -0.785 | -0.3511 | -0.1036 | -0.2622 | -0.7732 | 0.8609 |
| 1st Class | -0.984 | -0.4401 | -1.32 | -1.183 | 0.3341 | 0.8712 |
| 2nd Class | -1.036 | -0.4635 | -0.6711 | -1.045 | 1.9 | -0.4331 |
| Star | -0.8179 | -0.3658 | -1.097 | 0.921 | -0.3666 | 0.2466 |
| Life | 3.401 | 2.752 | -0.975 | -0.4595 | -0.3629 | -0.118 |
| Eagle | 4.121 | -0.1028 | -0.3083 | -0.6252 | 1.336 | -1.284 |
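The residual table above can also be tied back to the test statistic, since the Pearson chi-squared statistic is the sum of the squared Pearson residuals. The sketch below re-enters the rounded residuals, checks that their squares add up to roughly 78.48, and lists the cells whose residual exceeds 2 in absolute value. The cutoff of 2 is a common rule of thumb assumed here, not something stated in the original analysis, and the object names are illustrative.

```r
# Pearson residuals copied from the table above (rounded as displayed).
pearson <- matrix(c(
  -0.08763, -0.4672,  3.592,   2.084,  -1.373,  -0.8664,
  -0.785,   -0.3511, -0.1036, -0.2622, -0.7732,  0.8609,
  -0.984,   -0.4401, -1.32,   -1.183,   0.3341,  0.8712,
  -1.036,   -0.4635, -0.6711, -1.045,   1.9,    -0.4331,
  -0.8179,  -0.3658, -1.097,   0.921,  -0.3666,  0.2466,
   3.401,    2.752,  -0.975,  -0.4595, -0.3629, -0.118,
   4.121,   -0.1028, -0.3083, -0.6252,  1.336,  -1.284),
  nrow = 7, byrow = TRUE,
  dimnames = list(c("Scout", "Tenderfoot", "1st Class", "2nd Class",
                    "Star", "Life", "Eagle"),
                  as.character(1:6)))

# The chi-squared statistic is the sum of squared Pearson residuals.
chi_sq <- sum(pearson^2)
chi_sq

# Cells that stand out, using |residual| > 2 as an assumed cutoff.
big <- which(abs(pearson) > 2, arr.ind = TRUE)
data.frame(rank     = rownames(pearson)[big[, "row"]],
           classes  = colnames(pearson)[big[, "col"]],
           residual = pearson[big])
```

Five cells pass this cutoff: Scout with 3 and 4 classes, Life with 1 and 2 classes, and Eagle with 1 class.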
So we can conclude from our data that there is an association between the rank of the Boy Scout and the number of merit badge classes they take.
Looking over the residuals, we notice a few trends:
The first is that Boy Scouts in the rank of Scout have a tendency to take only 3 or 4 classes. This may be because it is their first time at camp and they want more free time for other activities. They may be less concerned with the merit badges they need and are at camp for the experience of living in the outdoors with their peers. It is also possible that younger scouts will have more opportunities to earn merit badges in their own troop meetings.
Second, we notice that starting at the rank of Tenderfoot, Boy Scouts begin to take more classes (the Tenderfoot residual for 6 classes is positive, while those for 1 through 5 classes are negative), and as the ranks progress they take fewer and fewer classes. This shows up in the Life and Eagle scouts who take only 1 or 2 classes. Their residuals are among the most extreme in the table: 3.401 (Life) and 4.121 (Eagle) for 1 class, and 2.752 for Life scouts taking 2 classes. This could be because they already have many of the merit badges and have less need for the classes.
If the goal is to get all Boy Scouts to participate in earning merit badges, something that I would suggest is a “Big Brother initiative”. Having higher-ranked scouts pair up with one or more lower-ranked scouts as an adviser or mentor during merit badge courses could help keep the higher-ranked scouts involved despite the number of merit badges they already have. It could also help lower-ranked scouts stay motivated to follow through until they receive their Eagle Scout award.