A total of 2,104 individuals provided responses and consented to participate. The dataset included 28 variables, of which 6 were selected for analysis in this report (see Table 1).
| Variable Name | Description | Classification |
| --- | --- | --- |
| gender | Male, Female, Non-binary / Third gender, Prefer not to say | Qualitative Nominal |
| standard_drinks | Any number. Avg. per week. | Quantitative Continuous |
| student_type | “Domestic”, “International” | Qualitative Nominal |
| stress_level | Whole number from 0 (No Stress) to 10 (Worst Stress Imaginable). | Quantitative Discrete |
| friend_count | Any positive whole number. | Quantitative Discrete |
| rent | Any positive whole number. | Quantitative Continuous |

Table 1: Variable names and classifications.
As a cross-sectional study, this report cannot account for changes in responses over time. This may affect the analysis of variables such as alcohol consumption, which can vary with factors like price changes. The data also represent a sample rather than a population, so findings cannot be applied to individuals outside the survey group without generalisation. This may limit any recommendations made.
It was assumed that all self-reported data were honest and unaffected by bias. Weekly alcohol consumption was treated as a reflection of consistent behaviour rather than a one-off occurrence. It was also assumed that any unrealistic values resulted from respondents entering their answers incorrectly or misinterpreting the survey question.
Responses containing values deemed unrealistic were removed from the dataset using the parameters in the code block above. This removed 11.84% (n = 254) of responses, which was believed to greatly increase the validity of the results.
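As a rough illustration of this kind of range-based cleaning, the sketch below filters a toy data frame and reports the share of rows removed. The column values and every cut-off are hypothetical assumptions chosen for the example; they are not the parameters used in this report.

```r
# Toy data: values are invented for illustration only
survey <- data.frame(
  standard_drinks = c(0, 2, 5, 300, 1, -4),
  stress          = c(3, 7, 11, 5, 0, 6),
  rent            = c(250, 400, 0, 320, 100000, 280)
)

# Hypothetical plausible ranges (assumptions, not the report's cut-offs)
keep <- with(
  survey,
  standard_drinks >= 0 & standard_drinks <= 100 &
    stress >= 0 & stress <= 10 &
    rent >= 0 & rent <= 5000
)

# Keep only the rows that pass every range check
survey_clean <- survey[keep, ]

# Count and percentage of responses removed by the filter
n_removed   <- nrow(survey) - nrow(survey_clean)
pct_removed <- round(100 * n_removed / nrow(survey), 2)
```

Reporting both the count and the percentage removed, as done above, makes it easier for a reader to judge how much the cleaning rules shaped the final sample.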
Research Question 1
Are there significant differences in self-reported weekly alcohol consumption between genders, and how do these differences interact with student type?
    # Shorten the non-binary label and drop "Prefer not to say" responses
    data_cleanQ1 = data_clean |>
      mutate(gender = recode(gender, "Non-binary / third gender" = "third gender")) |>
      filter(gender != "Prefer not to say")

    # Mean weekly standard drinks for each student type and gender combination
    df_summary = data_cleanQ1 |>
      group_by(student_type, gender) |>
      summarise(AvgDrinks = mean(standard_drinks, na.rm = TRUE)) |>
      ungroup() |>
      mutate(Category = paste(student_type, gender))

    # Fix the order of the bars on the x-axis
    desired_order = c("Domestic Male", "Domestic Female",
                      "International Male", "International Female",
                      "Domestic third gender", "International third gender")
    df_summary$Category = factor(df_summary$Category, levels = desired_order)

    ggplot(df_summary, aes(x = Category, y = AvgDrinks)) +
      geom_bar(stat = "identity", fill = "#2f5182", width = 0.5) +
      geom_text(aes(label = round(AvgDrinks, 1), fontface = "bold"),
                vjust = -0.5, hjust = 1.2) +
      labs(title = "Standard Drinks per Week by Gender and Student Type",
           x = "Gender and Student Type",
           y = "Avg Standard Drinks per Week") +
      scale_y_continuous(limits = c(0, 3.7)) +
      theme_minimal() +
      theme(plot.title = element_text(hjust = 0.5, size = 14, margin = margin(b = 20),
                                      family = 'Georgia',
                                      colour = lighten('#825d0c', amount = 0.1),
                                      face = "bold"),
            axis.text.x = element_text(angle = 30, hjust = 1, size = 9, face = "bold"),
            plot.margin = margin(t = 20, r = 0, b = 0, l = 20),
            plot.background = element_rect(fill = "#d1d1c9", color = NA),
            panel.grid = element_line(color = lighten('#825d0c', amount = 0.5)))
Figure 2.1
    # Row-level data with a combined student type and gender category
    df_summary2 = data_cleanQ1 |>
      mutate(Category = paste(student_type, gender))

    desired_order2 = c("Domestic Male", "Domestic Female",
                       "International Male", "International Female",
                       "Domestic third gender", "International third gender")
    df_summary2$Category = factor(df_summary2$Category, levels = desired_order2)

    # theme_minimal() comes before theme() so it does not overwrite the custom settings
    d = ggplot(df_summary2, aes(x = Category, y = standard_drinks)) +
      geom_boxplot(fill = lighten('#2f5182', amount = 0.2),
                   color = darken('#2f5182', amount = 0.5),
                   outlier.fill = lighten('#2f5182', amount = 0.2),
                   outlier.size = 1.5,
                   outlier.color = lighten('#2f5182', amount = 0.2)) +
      labs(title = "Distribution of Standard Drinks by Category",
           x = "Gender and Student Type",
           y = "Standard Drinks") +
      theme_minimal() +
      theme(plot.background = element_rect(fill = "#d1d1c9", color = NA),
            plot.title = element_text(hjust = 0.5, size = 14, margin = margin(b = 20),
                                      family = 'Georgia',
                                      colour = lighten('#825d0c', amount = 0.1),
                                      face = "bold"),
            axis.text.x = element_text(angle = 30, hjust = 1, size = 9, face = "bold"))

    # Convert to an interactive plot and restyle the background, title and x-axis
    ggplotly(d) |>
      layout(plot_bgcolor = "#d1d1c9",
             paper_bgcolor = "#d1d1c9",
             title = list(x = 0.5,
                          font = list(size = 18,
                                      family = "Georgia",
                                      color = lighten('#825d0c', amount = 0.1))),
             xaxis = list(tickangle = -30,
                          tickfont = list(size = 13, family = "Georgia Bold")))
Figure 2.2
Domestic and international male students reported the highest average alcohol consumption (3.5 and 3.4 standard drinks per week, respectively; see Figure 2.1). Female students consumed less, and third gender/non-binary students reported the lowest levels, particularly international students (0.7 standard drinks). Domestic students generally drank more than their international peers across all gender groups. However, results for third gender students should be interpreted with caution due to the small sample size.
These findings agree with literature reporting that male students consume more alcohol on average than female students (Papier et al., 2015). In this sample, male students consumed an average of one additional standard drink per week. Domestic students are also at higher risk of excess drinking (Sanci et al., 2022). Here, domestic females consumed 0.6 more drinks per week than international females, a much larger gap than the 0.1 difference observed between domestic and international males.
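The comparison of gaps can be written out as simple arithmetic. The sketch below uses only the figures quoted in the text (the male group means and the reported female gap); the variable names are illustrative, and the result is a descriptive contrast, not a significance test.

```r
# Male group means as reported in the text (standard drinks per week)
male_domestic      <- 3.5
male_international <- 3.4

# Domestic minus international gap within each gender
gap_male   <- male_domestic - male_international  # about 0.1
gap_female <- 0.6                                  # gap reported directly in the text

# If student type mattered equally for both genders, this would be near zero
gap_difference <- gap_female - gap_male
```

A non-zero difference between the two gaps is what an interaction between gender and student type looks like descriptively; whether it is statistically significant would need a formal test on the row-level data.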
Outliers, visible in the boxplots (see Figure 2.2), can skew the mean and lead to misinterpretation. However, the bar chart of means (Figure 2.1) remains a useful indicator of relative alcohol consumption when paired with boxplots showing the distribution.
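The effect of a single outlier on the mean can be seen in a small example. The values below are invented for illustration and do not come from the survey.

```r
# Invented weekly drink counts for six respondents
drinks <- c(0, 1, 2, 2, 3, 4)

# The same group with one extreme response added
drinks_with_outlier <- c(drinks, 40)

# The mean shifts substantially, while the median does not move
mean_before   <- mean(drinks)                  # 2
mean_after    <- mean(drinks_with_outlier)     # 52 / 7, roughly 7.43
median_before <- median(drinks)                # 2
median_after  <- median(drinks_with_outlier)   # 2
```

This is why the boxplots, which are built around the median and quartiles, are a useful companion to a chart of group means.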
Research Question 2 (Linear Model)
Is there a significant correlation between alcohol consumption among male students and stress level, number of university friends and weekly rent?
    # Male respondents only, with friend count stored as a number
    data_male = data_clean |>
      filter(gender == "Male") |>
      mutate(friends_count = as.numeric(friends_count))

    # Pearson correlation between stress level and weekly standard drinks
    cor_coef = cor(data_male$stress, data_male$standard_drinks)

    p1 = ggplot(data_male, aes(x = stress, y = standard_drinks)) +
      geom_point(color = "#2f5182", size = 2) +
      geom_smooth(method = "lm", se = FALSE, color = darken('#2f5182', amount = 0.5)) +
      labs(title = "Stress vs Standard Drinks for Males",
           x = "Stress Level",
           y = "Standard Drinks per Week") +
      theme_minimal() +
      theme(plot.title = element_text(size = 14, family = 'Georgia',
                                      colour = lighten('#825d0c', amount = 0.1),
                                      face = "bold"),
            axis.text.x = element_text(face = "bold"),
            plot.background = element_rect(fill = "#d1d1c9", color = NA),
            panel.grid = element_line(color = lighten('#825d0c', amount = 0.5)))

    # Interactive version with the correlation coefficient annotated on the plot
    ggplotly(p1) |>
      layout(annotations = list(xref = "paper", yref = "paper",
                                x = 0.90, y = 0.90,
                                showarrow = FALSE,
                                text = paste("r =", round(cor_coef, 4)),
                                font = list(size = 20, color = "black")),
             plot_bgcolor = "#d1d1c9")
Figure 3.1
    # Pearson correlation between friend count and weekly standard drinks
    cor_coef2 = cor(data_male$friends_count, data_male$standard_drinks)

    p2 = ggplot(data_male, aes(x = friends_count, y = standard_drinks)) +
      geom_point(color = "#2f5182", size = 2) +
      geom_smooth(method = "lm", se = FALSE, color = darken('#2f5182', amount = 0.5)) +
      labs(title = "Friend count vs Standard Drinks for Males",
           x = "Friend count",
           y = "Standard Drinks per Week") +
      theme_minimal() +
      theme(plot.title = element_text(size = 14, family = 'Georgia',
                                      colour = lighten('#825d0c', amount = 0.1),
                                      face = "bold"),
            axis.text.x = element_text(face = "bold"),
            plot.background = element_rect(fill = "#d1d1c9", color = NA),
            panel.grid = element_line(color = lighten('#825d0c', amount = 0.5)))

    # Interactive version with the correlation coefficient annotated on the plot
    ggplotly(p2) |>
      layout(annotations = list(xref = "paper", yref = "paper",
                                x = 0.90, y = 0.90,
                                showarrow = FALSE,
                                text = paste("r =", round(cor_coef2, 4)),
                                font = list(size = 20, color = "black")),
             plot_bgcolor = "#d1d1c9")
Figure 3.2
    # Pearson correlation between weekly rent and weekly standard drinks
    cor_coef3 = cor(data_male$rent, data_male$standard_drinks)

    p3 = ggplot(data_male, aes(x = rent, y = standard_drinks)) +
      geom_point(color = "#2f5182", size = 2) +
      geom_smooth(method = "lm", se = FALSE, color = darken('#2f5182', amount = 0.5)) +
      labs(title = "Rent vs Standard Drinks for Males",
           x = "Rent (AUD)",
           y = "Standard Drinks per Week") +
      theme_minimal() +
      theme(plot.title = element_text(size = 14, family = 'Georgia',
                                      colour = lighten('#825d0c', amount = 0.1),
                                      face = "bold"),
            axis.text.x = element_text(face = "bold"),
            plot.background = element_rect(fill = "#d1d1c9", color = NA),
            panel.grid = element_line(color = lighten('#825d0c', amount = 0.5)))

    # Interactive version with the correlation coefficient annotated on the plot
    ggplotly(p3) |>
      layout(annotations = list(xref = "paper", yref = "paper",
                                x = 0.90, y = 0.90,
                                showarrow = FALSE,
                                text = paste("r =", round(cor_coef3, 4)),
                                font = list(size = 20, color = "black")),
             plot_bgcolor = "#d1d1c9")
Figure 3.3
Previous research suggests that people experiencing higher stress consume more alcohol (de Wit et al., 2003), and we expected a similar trend among survey respondents. However, the linear regression produced a negligible correlation (r = -0.0177; see Figure 3.1), indicating no linear relationship between stress level and alcohol consumption. Thus, stress level could not adequately explain high alcohol consumption in male students.
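Since the research question asks whether the correlation is significant, one way to check this in R is `cor.test()`, which returns the Pearson coefficient together with a p-value and confidence interval. The sketch below runs it on simulated data, not the survey data; the sample size, distributions and variable names are assumptions made for the example.

```r
# Simulated data: drinks are generated independently of stress (assumption)
set.seed(1)
n <- 200
stress          <- sample(0:10, n, replace = TRUE)
standard_drinks <- rpois(n, lambda = 3)

# Pearson correlation coefficient, as used for the figures in this section
r <- cor(stress, standard_drinks)

# Significance test of the null hypothesis that the true correlation is zero
test <- cor.test(stress, standard_drinks, method = "pearson")

r_from_test <- unname(test$estimate)  # same value as r
p_value     <- test$p.value           # probability of an r this extreme under the null
ci          <- test$conf.int          # 95% confidence interval for the correlation
```

A confidence interval that spans zero, together with a large p-value, would support the reading that a coefficient is negligible rather than merely small.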
Individuals with friends who often drink are likely to drink similarly (Jones & Magee, 2014). A larger social network may increase exposure to heavy-drinking peers, raising the likelihood of adopting similar habits. Linear regression resulted in an r-value of 0.1480, representing a very weak positive correlation (see Figure 3.2). Because the correlation is so weak, friend count could not explain high alcohol consumption in male students.
Students living away from home are known to drink more than those who live at home (Harford et al., 2002). Weekly rent was thought to be a possible indicator of living away from home, and thus of alcohol consumption. Linear regression resulted in a correlation coefficient of 0.0051, representing no correlation (see Figure 3.3).
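One way to put the three coefficients on a common scale is to square them: in a simple linear regression with one predictor, r squared is the proportion of variance in the outcome explained by that predictor. The sketch below applies this to the r-values reported above; the vector names are illustrative.

```r
# Correlation coefficients reported in this section
r_values <- c(stress = -0.0177, friend_count = 0.1480, rent = 0.0051)

# Percentage of variance in weekly standard drinks explained by each predictor
pct_explained <- round(100 * r_values^2, 2)
# stress: 0.03, friend_count: 2.19, rent: 0.00
```

Even the strongest of the three, friend count, accounts for only about 2% of the variation in weekly standard drinks.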
Stress level, friend count, and weekly rent could not account for high alcohol consumption in male students. We therefore cannot make any direct recommendations; further research is required to determine the factors influencing alcohol consumption. Because friend count showed the strongest correlation of the three (see Figure 3.2), we recommend a follow-up survey targeting social and interpersonal factors surrounding alcohol consumption.
Declaration of Professional Ethics
Shared Values
In maintaining transparency about statistical methodologies used, and by making these methodologies public in this report, the shared professional value of truthfulness and integrity has been adhered to.
Ethical Principles
Through accurate description of results and explanatory power of the data utilised, as well as the recognition of the data’s limits, the ethical principle of maintaining confidence in statistics has been adhered to.
Articles
Papier, K., Ahmed, F., Lee, P., & Wiseman, J. (2015). Stress and dietary behaviour among first-year university students in Australia: Sex differences. Nutrition, 31(2), 324–330. https://doi.org/10.1016/j.nut.2014.08.004
Sanci, L., Williams, I., Russell, M., Chondros, P., Duncan, A.-M., Tarzia, L., Peter, D., Lim, M. S., Tomyn, A., & Minas, H. (2022). Towards a health promoting university: Descriptive findings on health, wellbeing and academic performance amongst university students in Australia. BMC Public Health, 22(1). https://doi.org/10.1186/s12889-022-14690-9
de Wit, H., Söderpalm, A. H. V., Nikolayev, L., & Young, E. (2003). Effects of acute social stress on alcohol consumption in healthy subjects. Alcoholism: Clinical & Experimental Research, 27(8), 1270–1277. https://doi.org/10.1097/01.alc.0000081617.37539.d6
Jones, S. C., & Magee, C. A. (2014). The role of family, friends and peers in Australian adolescent’s alcohol consumption. Drug and Alcohol Review, 33(3), 304–313. https://doi.org/10.1111/dar.12111
Harford, T. C., Wechsler, H., & Muthén, B. O. (2002). The impact of current residence and high school drinking on alcohol problems among college students. Journal of Studies on Alcohol, 63(3), 271–279. https://doi.org/10.15288/jsa.2002.63.271
Acknowledgements
Group meetings
26th March, 12:35pm - Raz, Dan, Steven
Contribution of Group Members
Raz: Creating theme, formatting and graphs
Dan: Writing, editing, finding trend that led to RQ1, template of report, and researching
Mox: Writing, editing report, made presentation
Steven: Research, finding research papers
Geoffrey: Writing, editing report, made presentation
AI tools, including ChatGPT-4o, ChatGPT-4o Research, and ChatGPT-o3-mini-high, were used between April 4th and 9th to explore and implement functions beyond the scope of the DATA1001 curriculum. This included implementing CSS for custom themes, adding tabsets, styling and customizing graphs, and finding the root of error messages, among other enhancements. While AI contributed to the document’s enhanced functionality and aesthetics, its outputs often required significant human intervention and customization to align with the intended design and purpose.
At this moment we are unable to share a link to the entire prompt session, as ChatGPT reports: “Sharing conversations with user uploaded images is not yet supported”.