Is there a relationship between study environment and self-rated productivity among university students?
We surveyed 72 university students across three living arrangements:
Residence Type: Number of Students
- Hostel: 32
- Renting: 27
- Home: 13
Variable: Description (Values)
- residence_type: Type of accommodation (Hostel, Renting, Home)
- num_roommates: People sharing bedroom (0, 1, 2, 3, 4)
- noise_level_rating: Perceived noise level (1 = Very quiet to 5 = Very noisy)
- access_to_electricity: Power reliability (Reliable, Limited)
- study_hours_per_day: Daily study hours (2-8 hours)
- self_rated_productivity_5point: Primary outcome (1 = Very low to 5 = Very high)
- sleep_hours_per_night: Daily sleep hours (4-8 hours)
- weekly_caffeine_intake: Caffeinated drinks per week (1-8 drinks)
- commute_minutes: Travel time to campus (4-60 minutes)
- monthly_cost_UGX: Monthly housing cost in UGX ’000 (0-600)
- family_support_rating: Family support level (1-10 scale)
library(knitr)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE, fig.align = "center")
hostel_data <- read.csv("hostel_data.csv")
# Check the data
head(hostel_data, 10)
# Check the structure
str(hostel_data)
## 'data.frame': 72 obs. of 14 variables:
## $ student_id : int 1 2 3 4 5 6 7 8 9 10 ...
## $ residence_type : chr "Renting" "Hostel" "Home" "Hostel" ...
## $ num_roommates : int 1 0 3 0 4 0 2 4 1 0 ...
## $ noise_level_rating : int 3 2 5 1 5 2 4 5 3 2 ...
## $ access_to_electricity : chr "Reliable" "Reliable" "Reliable" "Reliable" ...
## $ study_hours_per_day : int 4 7 2 6 3 8 5 2 6 4 ...
## $ self_rated_productivity_5point: int 3 5 2 4 1 5 3 2 4 3 ...
## $ sleep_hours_per_night : int 6 8 4 7 5 8 6 4 7 6 ...
## $ weekly_caffeine_intake : int 4 2 7 3 8 1 5 7 3 4 ...
## $ years_in_current_residence : int 2 3 19 1 17 3 2 20 2 3 ...
## $ commute_minutes : int 15 8 50 6 58 5 22 55 18 10 ...
## $ study_space_quality : int 5 9 2 8 3 10 5 2 7 5 ...
## $ monthly_cost_UGX : int 500 480 0 450 0 500 550 0 520 500 ...
## $ family_support_rating : int 4 7 8 6 6 8 3 7 5 6 ...
hostel_data$residence_type <- factor(hostel_data$residence_type)
hostel_data$access_to_electricity <- factor(hostel_data$access_to_electricity)
colSums(is.na(hostel_data))
## student_id residence_type
## 0 0
## num_roommates noise_level_rating
## 0 0
## access_to_electricity study_hours_per_day
## 0 0
## self_rated_productivity_5point sleep_hours_per_night
## 0 0
## weekly_caffeine_intake years_in_current_residence
## 0 0
## commute_minutes study_space_quality
## 0 0
## monthly_cost_UGX family_support_rating
## 0 0
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 3.000 3.500 3.472 4.000 5.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 3.000 5.000 4.833 6.000 8.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.00 9.75 18.00 25.00 44.25 60.00
Key Observation: Hostel students have the highest productivity (3.9) and the shortest commute (7 min). Home students have the lowest productivity (2.8) and the longest commute (46 min).
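Group averages like these can be computed directly rather than read off a plot. The following is a minimal sketch, assuming a data frame with the same column names as `hostel_data`; the six rows of `toy` are invented for illustration and are not survey data.

```r
# Toy data with the same column names as hostel_data (values are invented)
toy <- data.frame(
  residence_type = c("Hostel", "Hostel", "Renting", "Renting", "Home", "Home"),
  self_rated_productivity_5point = c(4, 5, 3, 4, 2, 3),
  commute_minutes = c(5, 9, 20, 30, 45, 55)
)

# Mean productivity and mean commute for each residence type
group_means <- aggregate(
  cbind(self_rated_productivity_5point, commute_minutes) ~ residence_type,
  data = toy, FUN = mean
)
print(group_means)
```

Replacing `toy` with `hostel_data` should give the per-group figures for the real survey.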
boxplot(self_rated_productivity_5point ~ residence_type,
data = hostel_data,
main = "Productivity by Residence Type",
xlab = "Residence Type",
ylab = "Productivity (1-5)",
col = c("lightblue", "pink", "orange"))
Interpretation: Hostel students show the highest productivity. Home students show the lowest productivity.
# Calculate average productivity for each noise level
noise_means <- aggregate(self_rated_productivity_5point ~ noise_level_rating,
data = hostel_data, mean)
barplot(noise_means$self_rated_productivity_5point,
names.arg = noise_means$noise_level_rating,
main = "Productivity Decreases as Noise Increases",
xlab = "Noise Level (1=Quiet, 5=Noisy)",
ylab = "Average Productivity (1-5)",
col = c("red","orange","yellow","green","cyan"))
Interpretation: Students in very quiet environments average 4.6/5. Students in very noisy environments average 1.8/5.
plot(hostel_data$commute_minutes, hostel_data$self_rated_productivity_5point,
main = "Longer Commutes = Lower Productivity",
xlab = "Commute Minutes (one way)",
ylab = "Productivity (1-5)",
col = "forestgreen",
pch = 16)
# Add regression line
abline(lm(self_rated_productivity_5point ~ commute_minutes, data = hostel_data),
col = "red", lwd = 2)
Interpretation: Clear downward trend. Students with shorter commutes have higher productivity.
boxplot(self_rated_productivity_5point ~ access_to_electricity,
data = hostel_data,
main = "Productivity by Electricity Access",
xlab = "Electricity Access",
ylab = "Productivity (1-5)",
col = c("cyan", "maroon"))
Interpretation: Students with reliable electricity average 3.3/5. Students with limited electricity average 3.8/5, so reliable electricity is not associated with higher productivity in this sample.
roommate_means <- aggregate(self_rated_productivity_5point ~ num_roommates,
data = hostel_data, mean)
plot(roommate_means$num_roommates, roommate_means$self_rated_productivity_5point,
main = "Each Additional Roommate Reduces Productivity",
xlab = "Number of Roommates",
ylab = "Average Productivity (1-5)",
type = "b",
col = "navy",
pch = 16,
lwd = 2)
Interpretation: Students with no roommates average 3.9/5, while students who share a bedroom average 3.2/5. Average productivity tends to fall as the number of roommates rises.
hostel_only <- subset(hostel_data, residence_type == "Hostel")
home_only <- subset(hostel_data, residence_type == "Home")
t.test(hostel_only$self_rated_productivity_5point,
home_only$self_rated_productivity_5point)
##
## Welch Two Sample t-test
##
## data: hostel_only$self_rated_productivity_5point and home_only$self_rated_productivity_5point
## t = 2.9443, df = 33.213, p-value = 0.00587
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.3268397 1.7874460
## sample estimates:
## mean of x mean of y
## 3.857143 2.800000
Hostel students have significantly higher productivity than home students (mean 3.86 vs 2.80, p = 0.006).
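Reading the statistics straight from the test object avoids transcription slips when reporting a result. The sketch below is illustrative only: `group_a` and `group_b` are invented ratings, and `report` is a hypothetical helper string, not part of the original analysis.

```r
# Invented ratings for two groups (not the survey data)
group_a <- c(4, 5, 4, 3, 5, 4, 4)
group_b <- c(3, 2, 4, 3, 2)

# Welch two-sample t-test (the default when t.test gets two samples)
result <- t.test(group_a, group_b)

# Build the reporting sentence from the stored values
report <- sprintf(
  "t = %.2f, df = %.1f, p = %s",
  unname(result$statistic),
  unname(result$parameter),
  format.pval(result$p.value, digits = 2, eps = 0.001)
)
print(report)
```

With `eps = 0.001`, `format.pval()` prints "<0.001" only when the p-value really is below that threshold.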
reliable <- subset(hostel_data, access_to_electricity == "Reliable")
limited <- subset(hostel_data, access_to_electricity == "Limited")
t.test(reliable$self_rated_productivity_5point,
limited$self_rated_productivity_5point)
##
## Welch Two Sample t-test
##
## data: reliable$self_rated_productivity_5point and limited$self_rated_productivity_5point
## t = -1.6788, df = 43.419, p-value = 0.1004
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.99887114 0.09117883
## sample estimates:
## mean of x mean of y
## 3.346154 3.800000
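Students with reliable electricity do not have significantly higher productivity. Their mean rating (3.35) is lower than that of students with limited electricity (3.80), and the difference is not statistically significant (p = 0.10).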
hostel_data$single_room <- ifelse(hostel_data$num_roommates == 0, "Single", "Shared")
single <- subset(hostel_data, single_room == "Single")
shared <- subset(hostel_data, single_room == "Shared")
t.test(single$self_rated_productivity_5point,
shared$self_rated_productivity_5point)
##
## Welch Two Sample t-test
##
## data: single$self_rated_productivity_5point and shared$self_rated_productivity_5point
## t = 2.6189, df = 56.099, p-value = 0.01132
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1612691 1.2106458
## sample estimates:
## mean of x mean of y
## 3.920000 3.234043
Students in single rooms have significantly higher productivity than students in shared rooms (mean 3.92 vs 3.23, p = 0.011).
## [1] -0.4261685
Moderate negative correlation (r = -0.43). Higher noise is associated with lower productivity.
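The call that produced this correlation is not shown in the report. A minimal sketch of how such a coefficient might be computed follows, using invented vectors `noise` and `productivity` rather than the survey data. The Spearman rank correlation is added here as an alternative that is often preferred for ordinal 1-5 ratings; it is not something the original analysis is known to have used.

```r
# Invented 1-5 ratings (not the survey data)
noise        <- c(1, 2, 2, 3, 3, 4, 4, 5, 5, 1)
productivity <- c(5, 4, 5, 3, 4, 3, 2, 1, 2, 4)

# Pearson correlation coefficient
pearson <- cor(noise, productivity)

# Spearman rank correlation with a significance test;
# exact = FALSE because tied ranks rule out an exact p-value
spearman <- cor.test(noise, productivity, method = "spearman", exact = FALSE)

print(pearson)
print(spearman$estimate)
print(spearman$p.value)
```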
model <- lm(self_rated_productivity_5point ~ num_roommates + noise_level_rating +
commute_minutes + study_hours_per_day,
data = hostel_data)
summary(model)
##
## Call:
## lm(formula = self_rated_productivity_5point ~ num_roommates +
## noise_level_rating + commute_minutes + study_hours_per_day,
## data = hostel_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.59429 -0.47434 -0.04067 0.49333 2.52852
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.457930 0.406037 6.053 7.16e-08 ***
## num_roommates -0.078302 0.108354 -0.723 0.4724
## noise_level_rating -0.147786 0.088191 -1.676 0.0984 .
## commute_minutes -0.002508 0.008605 -0.291 0.7716
## study_hours_per_day 0.342505 0.054398 6.296 2.69e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8429 on 67 degrees of freedom
## Multiple R-squared: 0.4934, Adjusted R-squared: 0.4631
## F-statistic: 16.31 on 4 and 67 DF, p-value: 2.246e-09
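Interpretation of coefficients (holding the other predictors constant):
· Each roommate: -0.08 points (not significant, p = 0.47)
· Each noise level: -0.15 points (p = 0.098, not significant at the 5% level)
· Each minute of commute: -0.003 points (not significant, p = 0.77)
· Each study hour: +0.34 points (p < 0.001)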
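Estimates, p-values, and confidence intervals can be collected into one table taken from the fitted model, which keeps the written interpretation in step with the output. The sketch below fits a model to simulated data; `toy`, `fit`, and `coef_table` are hypothetical names, and the simulated effect sizes are arbitrary.

```r
# Simulated data with arbitrary effects (not the survey data)
set.seed(1)
n <- 40
toy <- data.frame(
  study_hours_per_day = sample(2:8, n, replace = TRUE),
  noise_level_rating  = sample(1:5, n, replace = TRUE)
)
toy$productivity <- 1 + 0.4 * toy$study_hours_per_day -
  0.1 * toy$noise_level_rating + rnorm(n, sd = 0.5)

fit <- lm(productivity ~ study_hours_per_day + noise_level_rating, data = toy)

# Estimate and p-value from the summary, plus 95% confidence intervals
coef_table <- cbind(
  coef(summary(fit))[, c("Estimate", "Pr(>|t|)")],
  confint(fit)
)
print(round(coef_table, 3))
```

Applied to `model` from the report, the same two calls would give the figures to quote for each predictor.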
Residence Type: Average Productivity (1-5)
- Hostel: 3.9
- Renting: 3.1
- Home: 2.8
Difference between Hostel and Home = 1.1 points (p = 0.006)
Noise Level: Average Productivity
- Very quiet (1): 4.6
- Quiet (2): 4.1
- Moderate (3): 3.3
- Noisy (4): 2.5
- Very noisy (5): 1.8
Correlation: r = -0.43
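Commute Category: Average Productivity
- Short (≤15 min): 3.9
- Medium (16-30 min): 3.2
- Long (31-45 min): 2.7
- Very Long (>45 min): 2.1
In the regression model, each additional 10 minutes of commute is associated with about 0.03 points lower productivity, which is not statistically significant (p = 0.77).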
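Room Type: Average Productivity
- Single (0 roommates): 3.9
- Shared (1-4 roommates): 3.2
Difference = 0.7 points (p = 0.011). In the regression model, each additional roommate is associated with about 0.08 points lower productivity, which is not statistically significant (p = 0.47).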
Electricity Access: Average Productivity
- Reliable: 3.3
- Limited: 3.8
Difference = -0.5 points (p = 0.10, not statistically significant)
Variable: Effect on Productivity (P-value)
- Number of roommates: -0.08 (0.47)
- Noise level: -0.15 (0.098)
- Commute minutes: -0.003 (0.77)
- Study hours: +0.34 (<0.001)
Model R-squared = 0.49 (the model explains 49% of productivity variation)
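Where a student lives is associated with how productive they feel. Hostel residents and students in single rooms rated their productivity significantly higher, and daily study hours were the strongest predictor in the regression model. Reducing commute time, noise, and crowding may still be worthwhile investments in student success, but none of these three was individually significant once study hours were included in the model. Because this is an observational survey, the results show association, not causation: a poor environment is associated with lower productivity but has not been shown to cause it.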