In our project, we examined athletic school spirit at Macalester. Because school spirit is difficult to quantify directly, we used attendance at varsity athletic events as a proxy.
We hypothesized the following:
The more involved in athletics you are, the more sporting events you will attend.
Spirit by frequency: the more often you attend, the more school spirit you have.
Varsity athletes have more school spirit than non-varsity athletes.
By modeling data about game attendance, we hoped to draw inferences about athletic spirit as a whole. We also predicted that several other factors would influence attendance, including varsity athlete status, the personal importance of school spirit, and major. To further understand the relationship between athletic involvement and attendance, we also looked at how long people stayed at games and which games they attended.
We collected our data using an online survey distributed over social media, mainly Facebook. The survey was open for two weeks and received 216 responses. We analyzed the data in RStudio.
Variable Descriptions
Spirit was the measure of how important a student considers school spirit to be, on a scale of 1-7, where 1 is not important and 7 is extremely important.
Varsity Athlete Status was whether or not the student is a varsity athlete.
Club Athlete Status was whether or not the student is a club athlete.
Frequency was the measure of how often students attended varsity athletic events. The options were Never, Rarely, Sometimes, Often, or Always.
We renamed our variables for our sanity and yours.
## [1] ""
## [2] "Baseball"
## [3] "Football"
## [4] "Football, Men's Track and Field"
## [5] "Men's Cross Country"
## [6] "Men's Cross Country, Men's Track and Field"
## [7] "Men's Soccer"
## [8] "Men's Soccer, Men's Tennis"
## [9] "Men's Swimming and Diving"
## [10] "Men's Track and Field"
## [11] "Volleyball"
## [12] "Women's Cross Country"
## [13] "Women's Cross Country, Women's Track and Field"
## [14] "Women's Soccer"
## [15] "Women's Swimming and Diving"
## [16] "Women's Swimming and Diving, Women's Water Polo"
## [17] "Women's Tennis"
## [18] "Women's Track and Field"
## [19] "Women's Water Polo"
Graphs of Relationships between Variables
The majority of students answering the survey were female, for reasons we could not determine:
library(mosaic)  # provides tally(); attaches lattice, which provides barchart()
barchart(tally(~Gender, data = d, margins = FALSE, format = "count"), auto.key = TRUE)
The majority of students answering the survey were second years, which likely reflects the fact that we are all second years:
barchart(tally(~Year, data = d, margins = FALSE, format = "count"), auto.key = TRUE)
The students answering the survey were primarily non-athletes, which is reflective of the Macalester population:
barchart(tally(~VarsityAthlete, data = d, margins = FALSE, format = "count"),
         auto.key = TRUE)
Most respondents do not attend varsity athletic events on a regular basis:
barchart(tally(~Frequency, data = d, margins = FALSE, format = "count"), auto.key = TRUE)
This graph illustrates the responses to the question “How important is school spirit?” where 1 corresponds to not important at all and 7 corresponds to extremely important. The responses are approximately bell-shaped, although a 7-point scale can only roughly resemble a normal distribution:
barchart(tally(~Spirit, data = d, margins = FALSE, format = "count"), auto.key = TRUE)
This is a graphical representation of spirit by varsity athlete status. The blue corresponds to varsity athletes and the red corresponds to non-varsity athletes. From this graph it appears that, on average, varsity athletes view school spirit as more important.
mosaicplot(Spirit ~ VarsityAthlete, data = d, las = 2, col = rainbow(2))
This box-and-whisker plot shows spirit by frequency of game attendance. As frequency of game attendance increases, so does the reported importance of school spirit.
bwplot(Spirit ~ Frequency, data = d, las = 2, col = rainbow(1))
This is a graphical representation of frequency versus varsity athlete status. It shows that varsity athletes tend to attend games more frequently than non-varsity athletes.
mosaicplot(Frequency ~ VarsityAthlete, data = d, las = 2, col = rainbow(2))
This final graphical representation models frequency by status of club/intramural athletes. It shows that club/intramural athletes tend to also attend more games than non-club/intramural athletes, though this trend is less dramatic than in the Frequency~Varsity athlete model.
mosaicplot(Frequency ~ Club, data = d, las = 2, col = rainbow(2))
To analyze our hypotheses we created various models, shown below.
mod <- lm(Spirit ~ VarsityAthlete, data = d)
The regression table:
summary(mod)
##
## Call:
## lm(formula = Spirit ~ VarsityAthlete, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.051 -0.797 0.203 1.203 3.203
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.80 0.11 34.43 < 2e-16 ***
## VarsityAthleteYes 1.25 0.26 4.83 2.5e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.47 on 214 degrees of freedom
## Multiple R-squared: 0.0985, Adjusted R-squared: 0.0943
## F-statistic: 23.4 on 1 and 214 DF, p-value: 2.54e-06
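This model shows that varsity athlete status is significantly associated with how important school spirit is to a student (p = 2.5e-06). The coefficient is positive: on average, varsity athletes rated school spirit 1.25 points higher on the 7-point scale than non-varsity athletes (about 5.05 versus 3.80). Varsity status nonetheless explains only about 10% of the variation in responses (R-squared = 0.0985).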
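The coefficient table can be read as two group means plus a measure of uncertainty. As a rough check, the sketch below recomputes the fitted means and an approximate 95% confidence interval for the varsity coefficient from the rounded values printed above. The variable names are ours, chosen for illustration, and the results inherit the rounding of the table.

```r
# Values copied (rounded) from the summary(mod) output above
intercept  <- 3.80  # fitted mean Spirit for non-varsity respondents
b_varsity  <- 1.25  # estimated difference for varsity athletes
se_varsity <- 0.26  # standard error of that difference
df_resid   <- 214   # residual degrees of freedom

# Fitted group means implied by the model
mean_nonvarsity <- intercept
mean_varsity    <- intercept + b_varsity

# Approximate 95% confidence interval for the difference
t_crit     <- qt(0.975, df_resid)
ci_varsity <- b_varsity + c(-1, 1) * t_crit * se_varsity

print(c(non_varsity = mean_nonvarsity, varsity = mean_varsity))
print(round(ci_varsity, 2))
```

The interval runs from roughly 0.7 to 1.8 points and excludes zero, which agrees with the small p-value in the table.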
mod1 <- lm(Spirit ~ Frequency, data = d)
The regression table:
summary(mod1)
##
## Call:
## lm(formula = Spirit ~ Frequency, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.349 -0.927 0.353 1.073 3.514
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.782 0.222 21.53 < 2e-16 ***
## Frequency.L 2.450 0.657 3.73 0.00024 ***
## Frequency.Q 0.455 0.567 0.80 0.42358
## Frequency.C -0.135 0.402 -0.34 0.73765
## Frequency^4 -0.265 0.268 -0.99 0.32473
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.42 on 211 degrees of freedom
## Multiple R-squared: 0.163, Adjusted R-squared: 0.147
## F-statistic: 10.3 on 4 and 211 DF, p-value: 1.31e-07
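Because Frequency is an ordered factor, R fits it with orthogonal polynomial contrasts, which is why the coefficients are labeled .L (linear), .Q (quadratic), .C (cubic), and ^4 rather than by category name. The sketch below multiplies the rounded coefficients from the table by the contrast matrix from contr.poly() to recover the fitted mean Spirit for each attendance category. It assumes the five levels are ordered Never < Rarely < Sometimes < Often < Always, which the output does not show directly.

```r
# Rounded coefficients copied from the summary(mod1) output above
intercept  <- 4.782
poly_coefs <- c(L = 2.450, Q = 0.455, C = -0.135, quartic = -0.265)

# ASSUMPTION: level order of the Frequency factor (not shown in the output)
freq_levels <- c("Never", "Rarely", "Sometimes", "Often", "Always")

# contr.poly(5) is the contrast matrix R uses for a 5-level ordered factor
contrasts_5 <- contr.poly(length(freq_levels))

# Fitted mean for each level = intercept + contrast row times coefficients
fitted_means <- as.vector(intercept + contrasts_5 %*% poly_coefs)
names(fitted_means) <- freq_levels

print(round(fitted_means, 2))
```

Under that assumed ordering, the fitted means rise at every step, from about 3.5 for Never to about 6.5 for Always.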
anova(mod1)
## Analysis of Variance Table
##
## Response: Spirit
## Df Sum Sq Mean Sq F value Pr(>F)
## Frequency 4 83 20.80 10.3 1.3e-07 ***
## Residuals 211 428 2.03
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Of the four polynomial terms, only the linear one (Frequency.L) is significant (p = 0.00024), which indicates that the importance of school spirit rises steadily with attendance frequency, with no clear evidence of curvature. The ANOVA table confirms that Spirit differs across the attendance categories overall (F = 10.3, p = 1.3e-07).
mod2 <- lm(as.numeric(Frequency) ~ VarsityAthlete, data = d)
The regression table:
summary(mod2)
##
## Call:
## lm(formula = as.numeric(Frequency) ~ VarsityAthlete, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.077 -0.825 0.175 0.175 2.175
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.825 0.063 28.97 < 2e-16 ***
## VarsityAthleteYes 1.252 0.148 8.44 4.6e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.838 on 214 degrees of freedom
## Multiple R-squared: 0.25, Adjusted R-squared: 0.246
## F-statistic: 71.3 on 1 and 214 DF, p-value: 4.64e-15
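Varsity athlete status is significantly associated with the frequency of varsity event attendance (p = 4.6e-15). Here Frequency is converted to a numeric score from 1 to 5, which treats the steps between adjacent categories as equally spaced. The positive coefficient on VarsityAthleteYes indicates that varsity athletes score about 1.25 points higher on that scale than non-varsity athletes (about 3.08 versus 1.83), meaning they attend more athletic events.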
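The response in this model depends on how as.numeric() codes the Frequency factor. The sketch below uses five hypothetical responses, not our survey data, to show that the codes follow the intended order only when the factor levels are set explicitly. A default factor sorts its levels alphabetically and scrambles the scale.

```r
# Hypothetical responses, one per attendance category (not survey data)
responses <- c("Never", "Rarely", "Sometimes", "Often", "Always")

# Ordered factor with explicit levels: codes follow the intended order
ordered_freq <- factor(responses,
                       levels = c("Never", "Rarely", "Sometimes", "Often", "Always"),
                       ordered = TRUE)
codes_ordered <- as.numeric(ordered_freq)

# Default factor: levels are sorted alphabetically, so the codes are scrambled
default_freq  <- factor(responses)
codes_default <- as.numeric(default_freq)

print(codes_ordered)
print(codes_default)
```

With explicit levels, Never maps to 1 and Always to 5. With the default, Always maps to 1, so it is worth checking levels(d$Frequency) before fitting a model like mod2.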
mod3 <- lm(as.numeric(Frequency) ~ Club, data = d)
summary(mod3)
##
## Call:
## lm(formula = as.numeric(Frequency) ~ Club, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.2385 -0.8598 -0.0492 0.7615 3.1402
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.8598 0.0917 20.27 <2e-16 ***
## ClubYes 0.3787 0.1291 2.93 0.0037 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.949 on 214 degrees of freedom
## Multiple R-squared: 0.0386, Adjusted R-squared: 0.0341
## F-statistic: 8.6 on 1 and 214 DF, p-value: 0.00373
Club athlete status is also significantly associated with the frequency of varsity event attendance (p = 0.0037). The positive coefficient on ClubYes indicates that club athletes attend more varsity athletic events than non-club athletes. However, comparing mod3 to mod2, the difference associated with being a varsity athlete (1.25 points) is more than three times the difference associated with being a club athlete (0.38 points).
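Another way to compare the two predictors is by the share of variance each explains. For a linear model, R-squared can be recovered from the F-statistic and its degrees of freedom as F * df1 / (F * df1 + df2). The sketch below applies that identity to the rounded F-statistics reported above; the helper name r2_from_f is ours.

```r
# R-squared implied by an F-statistic with df1 and df2 degrees of freedom
r2_from_f <- function(f_stat, df1, df2) {
  (f_stat * df1) / (f_stat * df1 + df2)
}

# Rounded F-statistics copied from the model summaries above
r2_varsity <- r2_from_f(71.3, 1, 214)  # mod2: as.numeric(Frequency) ~ VarsityAthlete
r2_club    <- r2_from_f(8.6, 1, 214)   # mod3: as.numeric(Frequency) ~ Club

print(round(c(varsity = r2_varsity, club = r2_club), 3))
```

Both values match the Multiple R-squared lines in the output: varsity status accounts for about 25% of the variance in the attendance score, against about 4% for club status.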
We hypothesized that the more involved in athletics you are, the more sporting events you will attend. This means that if you are a varsity or club/intramural athlete, you should go to more games.
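This hypothesis was supported by our data: both varsity athletes and club athletes reported attending varsity events more often than respondents outside those groups.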
We also hypothesized that varsity athletes have more school spirit. This hypothesis was supported by the data.
The most significant weakness in our methodology was the way we obtained survey responses. To encourage people to take our survey, we posted it to our own Facebook pages and to the Macalester Class of 2015 and Class of 2016 pages. If we wanted our survey to be representative of the Macalester student body, we should have randomly selected people from the entire student body. Instead, our respondents were largely part of our own social networks. This might not have been much of a problem if each of us spent time in different circles, but since our group members had prior connections, our networks largely overlap. By posting the survey on the Class of 2015 and 2016 Facebook pages, we also ensured that mostly these classes would answer it: 112 second years and 83 first years took the survey, while only 10 juniors and 11 seniors participated.
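The class-year imbalance can be quantified from the counts just given. The sketch below computes each class's share of the sample and runs a chi-squared goodness-of-fit test against a student body split evenly across the four classes. That even split is our assumption for illustration only, and real enrollment figures should replace it if available.

```r
# Respondent counts by class year, as reported in the text
year_counts <- c(first = 83, second = 112, junior = 10, senior = 11)

# Share of the sample in each class year
year_share <- year_counts / sum(year_counts)
print(round(year_share, 3))

# ASSUMPTION: the four classes are the same size in the student body.
# Replace this vector with real enrollment shares if they are known.
assumed_population_share <- rep(0.25, 4)

# Chi-squared goodness-of-fit test of the sample against that assumption
gof <- chisq.test(year_counts, p = assumed_population_share)
print(gof)
```

First and second years together make up about 90% of the sample, so under the equal-size assumption the test rejects the idea that respondents were drawn evenly from all four classes.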
Our sample is also less diverse than it should be because of the demographic composition of our group. Each of us is a science/math major, so it is likely that the respondents were disproportionately science/math majors. Because of the way we set up our “major” question, we were not able to establish a relationship between major and the other variables. Our study would benefit from testing for a link between major and both athletic participation and game attendance, so in a future survey we would ask people to pick only a “primary major” to make such relationships easier to find.
There are also issues with the proportion of athletes who took our survey. Since none of our group members are Macalester athletes, non-athletes may be overrepresented among the respondents. However, since the subject of the survey is “Athletic Spirit,” there may also be self-selection bias in the other direction, because athletes may be more interested in taking the survey.
Again, all of these problems could be addressed with a better distribution system for our survey that reaches a good mix of classes, majors, and athletes and non-athletes. Even if that meant a sample smaller than our 216 respondents, it would be more reflective of the general student body, which was ultimately our goal.