This project explores and visualizes health trends among students across their college careers, with a focus on their relationships with food and physical activity. The data used to explore the larger community in this narrative come from a combination of surveys collected from students at Mercyhurst University who agreed to participate. The combined dataset includes information on food choices, nutrition, dietary preferences, academic performance, and other details about the students. In total, 61 variables were collected, the majority of which were posed to participants in the form of scales whose corresponding string values were provided in the data’s documentation. For example, for the variable “cook”, participants responded with a number from 1 to 5, with 1 signifying that the participant cooked for themselves every day, 5 signifying that the participant never cooked for themselves, and the numbers in between corresponding to degrees of cooking frequency between those extremes. Throughout this exploration, the numeric values provided by the participants are therefore replaced by their corresponding string values for clarity.
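As a minimal sketch of this recoding step, the snippet below maps the numeric “cook” codes to string labels. The endpoint labels (1 and 5) come from the description above; the wording of the intermediate labels is assumed from the categories discussed later in this report, and the function name and sample responses are hypothetical.

```python
# Label mapping for the "cook" scale. Codes 1 and 5 follow the survey
# description; the labels for 2-4 are assumptions based on the categories
# named later in this report, not the documentation's exact wording.
COOK_LABELS = {
    1: "every day",
    2: "most days",
    3: "not often",
    4: "only holidays",
    5: "never",
}


def recode(values, labels):
    """Replace numeric scale codes with string labels.

    Codes that are missing or not in the mapping become None.
    """
    return [labels.get(value) for value in values]


# Hypothetical responses, including one missing answer.
responses = [1, 5, 3, None]
print(recode(responses, COOK_LABELS))
```

Keeping the mapping in one dictionary per variable means the same helper could be reused for the other scale-based survey questions.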
To capture my own experience, I recorded my responses to the same surveys using health and financial data I’ve kept while at UVA. Unlike the Mercyhurst students, whose responses act as a snapshot of their college career, I provided responses for each semester I’ve been at college. This makes it possible to compare the narrative of my own college experience over time to the trends present in the larger community of college students.
Finally, the following visualizations seek to demonstrate the relationships that a variety of factors have with the food choices, dietary practices, and weight of a college student. Particular consideration will be given to how these relationships vary over years and/or semesters of college with the goal of visualizing the intuitive as well as counterintuitive trends of these relationships.
The first plot, shown below, uses two continuous variables, weight and grade point average (GPA), to portray the relationship between a measure of health and academic performance, and how a student’s level of physical activity affects this relationship. The plot shows that for those who work out more frequently, i.e. every day or two to three times a week, there is little correlation between weight and GPA, as the weight of these students varies only marginally regardless of GPA. However, an interesting relationship appears among students who exercise less frequently. Among students who reportedly exercise once a week, there is a positive correlation between weight and GPA, meaning that within this subset, students with higher GPAs also tend to have higher weights in pounds. Yet this trend reverses among students who never exercise during the week, with lighter students tending to have higher GPAs within this subset.
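One way to put a number on the trends read off this plot is to compute a Pearson correlation between weight and GPA separately for each exercise group. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the rows are made-up values chosen to mimic the directions described above, not records from the survey.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no spread in one variable, so correlation is undefined
    return sxy / sqrt(sxx * syy)


def correlation_by_group(rows):
    """Group (exercise, weight, gpa) rows by exercise level and correlate."""
    groups = defaultdict(lambda: ([], []))
    for exercise, weight, gpa in rows:
        groups[exercise][0].append(weight)
        groups[exercise][1].append(gpa)
    return {name: pearson(w, g) for name, (w, g) in groups.items()}


# Hypothetical rows: (exercise frequency, weight in pounds, GPA).
rows = [
    ("once a week", 130, 3.0),
    ("once a week", 150, 3.4),
    ("once a week", 170, 3.8),
    ("never", 130, 3.8),
    ("never", 150, 3.4),
    ("never", 170, 3.0),
    ("every day", 150, 3.0),
    ("every day", 150, 3.8),
]
for group, r in correlation_by_group(rows).items():
    print(group, r)
```

A group whose weights do not vary at all returns None rather than zero, which matches the idea that a lack of spread leaves the relationship undetermined rather than absent.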
Continuing with physical activity’s relationship to measures of health, the following visualization depicts the distribution of weight in pounds for each year of college, bifurcated by whether or not a student participated in sports that year. The first important insight is that across all years, the median weight of students who participated in sports is higher than that of students who did not. This perhaps runs counter to the idea that increased physical activity results in decreased weight, but makes more sense when one considers that athletes generally have more muscle and eat a greater number of calories on average (Purcell). Another important insight is the progression of the medians within each class of sports participation. Among students who participated in sports, there is a clear, roughly linear increase in median weight with each year, while for students who did not participate in sports the medians follow no clear trend and the distributions are slightly wider. This suggests a slightly positive relationship between weight and year, but only among students who participate in sports.
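The medians compared in this plot can be tabulated by grouping weights on year and sports participation. The following is a minimal sketch of that grouping; the function name is hypothetical and the rows are invented placeholder values rather than survey data.

```python
from collections import defaultdict
from statistics import median


def median_weight_by_group(rows):
    """Return the median weight for each (year, plays_sports) pair."""
    groups = defaultdict(list)
    for year, plays_sports, weight in rows:
        groups[(year, plays_sports)].append(weight)
    # Sort the keys so the output reads year by year.
    return {key: median(weights) for key, weights in sorted(groups.items())}


# Hypothetical rows: (year of college, plays sports, weight in pounds).
rows = [
    (1, True, 150), (1, True, 160), (1, True, 170),
    (1, False, 140), (1, False, 155),
    (2, True, 165), (2, True, 175),
    (2, False, 150),
]
for (year, plays_sports), value in median_weight_by_group(rows).items():
    print(year, plays_sports, value)
```

Comparing the values for the sports and non-sports keys within the same year gives the per-year gap that the box plots show visually.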
Perhaps more telling than weight, however, is how frequently a student eats out, which can also be highly indicative of their health. According to Healthline, eating out typically consists of “high calorie, low nutrient meals” and can contribute to a litany of adverse health effects (Pietrangelo); as such, it is a crucial food choice in terms of student health. The visualization below portrays the relationship between three key variables in terms of food choices: year of college, housing locale, and frequency of eating out, with the housing variable separated into off campus and on campus, and the frequency of eating out recorded on a per week basis. Students living off campus are significantly more likely to eat out at least once a week and, in turn, very unlikely to never eat out. Additionally, seniors who live off campus are slightly more likely to eat out more frequently than others living off campus. Students living on campus are also more likely to eat out at least once a week, with the majority of responses clustered around eating out 1-2 and 2-3 times a week. There also seems to be a gradual decline in frequency of eating out from lower to upperclassmen. Finally, no matter the living situation, the visual demonstrates that eating out 1-2 times a week is the most common practice.
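The comparison described here amounts to a cross-tabulation: counting responses for each eating-out frequency within each housing group and converting the counts to shares. The sketch below illustrates the idea with a hypothetical function name and invented rows; it is not the code or data behind the figure.

```python
from collections import Counter, defaultdict


def share_within_group(rows):
    """Return, for each housing group, the share of each eating-out frequency."""
    counts = defaultdict(Counter)
    for housing, frequency in rows:
        counts[housing][frequency] += 1
    shares = {}
    for housing, counter in counts.items():
        total = sum(counter.values())
        shares[housing] = {freq: n / total for freq, n in counter.items()}
    return shares


# Hypothetical rows: (housing, times eating out per week).
rows = [
    ("off campus", "1-2"),
    ("off campus", "1-2"),
    ("off campus", "never"),
    ("on campus", "2-3"),
]
for housing, shares in share_within_group(rows).items():
    print(housing, shares)
```

Working with shares rather than raw counts keeps the on campus and off campus groups comparable even when they contain different numbers of students.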
Cooking for oneself is as important a dietary practice as eating out when it comes to healthy eating. According to Harvard Health, studies have shown “that the more people cook at home, the healthier their diet, the fewer calories they consume” (Tello). In this way, the frequency at which students cook for themselves is a key indicator of health habits. However, unlike previous measures, it can be argued that a student’s parents and home life affect this frequency more than any factors on a college campus. As such, the visualization below depicts how cooking frequency relates to the education level of a student’s parents. For the highest degrees of cooking frequency, which include every day, most days, and not often, there is little correlation, as each level of frequency shows a fairly even dispersion of education levels. However, for the lowest degrees of cooking frequency, i.e. only holidays and never, there is significantly more clustering around the higher ends of both mother and father education level, which suggests that students who cook very infrequently are more likely to have more highly educated parents.
As demonstrated in the first plot, there is little correlation between weight and GPA for students who exercise every day. However, as a student who typically exercises at least a little every day, I found my own experience to be inconclusive with respect to this trend. While the first plot demonstrated a lack of variation in weight despite a change in GPA, the plot below illustrates that neither my weight nor my GPA has varied significantly throughout my college career as of yet. One could argue that this joint lack of variation constitutes a correlation, but given the lack of spread I believe it is safer to say that, unlike the first plot, my experience is inconclusive as to whether a relationship exists between weight and GPA for students who exercise every day.
taken in regards to this plot as it went through many iterations. First, it began as a simple scatter plot displaying the relationship between cooking frequency and mother’s education level. Then it evolved into a 3D scatter plot with the addition of father’s education level. Then color based on cooking frequency was added to make the z levels more distinguishable. Next, GPA was added as a size variable to make the depth more vi
Moving on to the relationship between weight and year, separated by sports participation: for every year I’ve been at UVA, I’ve only played sports in the spring, as the only intramural sports I've been involved in happen in the spring. Thus the following visualization is a bar chart of my weights, bifurcated by semester and colored by sports participation. Overlaid on this bar chart is the visualization from plot 2 so that a direct comparison may be made between my records and those of the larger college student community. Additionally, only my first three years of college and the first three years from plot 2 are included in this plot, as I would have no pair of semester weights to juxtapose for my fourth year.
Unlike in plot 5, my experience visualized here is significantly different from that of the larger community. Where sports participation consistently resulted in higher median weight among the larger community, my weight went down during the spring semester of my junior year. Furthermore, while median weight consistently increased for each year among students who participated in sports, my weight during sports participation was nearly uniform across all three years. Finally, whereas the median weight for students who did not participate in sports lacked any clear trend in the larger community, my weight clearly increased with each successive semester in which I did not participate in sports. Consequently, this visualization again demonstrates that my experience strays from that of the larger community of college students.
Pivoting back to food choices and practices, the relationship between eating out and health habits is one I’ve been aware of for a while now and one that changed significantly when I moved off campus in my second year. Similarly to plot 3, the figure below visualizes how my frequency of eating out changed as my college career progressed and my living situation changed, with each bar representing the average number of times I ate out a semester for each year. As illustrated, there is a significant increase in frequency between my first and second year, when I moved off campus, as well as smaller increases each successive year. My experience does follow one trend of the larger community in that as a senior I am slightly more likely to eat out more frequently than in my previous years living off campus. My experience differs, however, in that although my frequency was in the 1-2 times a week range for my first year, once off campus it fell outside the common practice of the larger community, as I ate out 3-5 times a week for each of those years. Furthermore, my eating out frequency gradually increased from lower to upperclassman years, another difference in trends between myself and the larger community of college students.
Finally, I was unable to verify whether eating at a dining hall constituted eating out in the surveys, so I did not record dining hall trips. I acknowledge that this could be a source of discrepancy.
As visualized in plot 4, there is evidence of a relationship between parents’ education level and the cooking frequency of a student, especially for students who cook infrequently. The plot below further illustrates this relationship by plotting only one layer of the three-dimensional visualization from plot 4, specifically the category of students who never cook, and adding my own experience as a student who also never cooks for himself. Surprisingly, unlike the previous visualizations, my experience aligns very well with the trend present among this class of students, as my parents both have relatively high levels of education. Thus my experience provides further evidence for the direct relationship between parents’ education level and students who never cook for themselves, a potentially inauspicious health habit.
Last, but not least, checking nutrition labels is another important dietary practice and useful indicator of health habits. The interactive visualization below, along with its corresponding table of values, depicts how the distribution of weight changes based on how frequently a student checks nutrition labels on products. For three degrees of frequency, never, only certain products, and most products, the distributions were fairly right skewed and unimodal, with many of the weights clustered around a single peak. However, for the degrees of rarely and every product, the distributions were much more uniform and relatively bimodal. This demonstrates that a specific value or range of weight can be associated with students who never check labels or who check only certain products or most products, while for students who check rarely or check every product it is much more difficult to assign a likely range of weights because they are so evenly distributed. Thus, although certain degrees of frequency of checking nutrition labels can help predict weight among college students, other degrees fall short and provide little insight.
BoraPajo. “Food Choices.” Kaggle, 23 Apr. 2017, www.kaggle.com/datasets/borapajo/food-choices?select=food_coded.csv.
Purcell, Laura K, and Canadian Paediatric Society, Paediatric Sports and Exercise Medicine Section. “Sport Nutrition for Young Athletes.” Paediatrics & Child Health, U.S. National Library of Medicine, Apr. 2013, www.ncbi.nlm.nih.gov/pmc/articles/PMC3805623/.
Kassambara. “Visualizing Multivariate Categorical Data.” STHDA, 17 Nov. 2017, www.sthda.com/english/articles/32-r-graphics-essentials/129-visualizing-multivariate-categorical-data/#mosaic-plot.
Tello, Monique, and Rani Polak. “Home Cooking: Good for Your Health.” Harvard Health, 6 Aug. 2018, www.health.harvard.edu/blog/home-cooking-good-for-your-health-2018081514449.
“Fast Food’s Effects on 8 Areas of the Body.” Healthline, Healthline Media, www.healthline.com/health/fast-food-effects-on-body#fast-food-popularity. Accessed 28 Nov. 2023.
ChatGPT Prompt (for Figure 9): “using diamonds dataset write code for density plot of carat by cut and output its filtered table using shiny app in r”