The Restorative State of Sleep versus Exercise on Mental Wellbeing in College Students

Author

Affiliation

Josephine Janas

Abstract

Both exercise and sleep have a strong, widely documented positive correlation with well-being. The purpose of this study was to determine which of the two is more strongly correlated with well-being. The working hypothesis was that sleep quality is the stronger correlate. Participants were current full-time college students aged 18–22 who were members of the First-Year Research Immersion (FRI) program at Binghamton University. Two wearables recorded the data: participants wore the Fitbit Charge 6 each day to track exercise and the Muse S at night to track sleep, and they logged the readings in a daily survey. Participants used the wearables for 7 days and completed a more in-depth survey about mental well-being on Day 1 and Day 7. The survey also asked participants to rate their mental health throughout the day on a five-point Likert scale from poor to excellent. The results can help public health officials judge whether it is more important for students to maintain high sleep quality or consistent, sufficient exercise, and thus to make recommendations to both students and colleges.

Keywords

MUSE headband, step count, sleep score, college students, lifestyle mental health

Code

rm(list = ls())

Code

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.5.2
✔ ggplot2   4.0.0     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Code

library(knitr)
library(tibble)
library(dplyr)
library(tidyr)
library(psych)


Attaching package: 'psych'

The following objects are masked from 'package:ggplot2':

    %+%, alpha

Code

library(scales)     # for number formatting like comma()


Attaching package: 'scales'

The following objects are masked from 'package:psych':

    alpha, rescale

The following object is masked from 'package:purrr':

    discard

The following object is masked from 'package:readr':

    col_factor

Code

library(english)    # to convert numbers to words


Attaching package: 'english'

The following object is masked from 'package:scales':

    ordinal

Code

library(stringr)    # for text functions like str_c()

1 Import

1.1 Day One and Seven Survey

Code

library(readxl)
# Import Excel file
onesevendata <- read_excel(
    "10.27.25.Day1_7.Clean.xlsx",
    col_names = TRUE)

# Treat the sentinel codes -99 and -50 as missing data
onesevendata[onesevendata == -99] <- NA
onesevendata[onesevendata == -50] <- NA

# View first 10 rows
head(onesevendata, 10)

# A tibble: 10 × 78
   StartDate           EndDate             Status IPAddress      Progress
   <dttm>              <dttm>               <dbl> <chr>             <dbl>
 1 2025-06-30 15:29:14 2025-06-30 17:10:54      0 64.128.175.42       100
 2 2025-06-30 15:52:41 2025-06-30 18:54:44      0 149.125.91.33       100
 3 2025-06-30 21:07:27 2025-06-30 21:16:25      0 153.33.244.42       100
 4 2025-06-30 23:33:05 2025-06-30 23:38:26      0 149.125.195.32      100
 5 2025-06-24 16:55:07 2025-06-24 16:55:38      0 24.47.129.138        22
 6 2025-07-06 21:21:04 2025-07-07 18:10:49      0 64.128.175.42       100
 7 2025-07-08 07:04:12 2025-07-08 18:09:29      0 149.125.88.193      100
 8 2025-07-09 04:53:03 2025-07-09 05:14:22      0 166.194.188.15      100
 9 2025-08-25 12:24:10 2025-08-25 12:25:47      1 <NA>                100
10 2025-08-29 10:22:06 2025-08-29 10:25:02      1 <NA>                100
# ℹ 73 more variables: `Duration (in seconds)` <dbl>, Finished <dbl>,
#   RecordedDate <dttm>, ResponseId <chr>, RecipientLastName <lgl>,
#   RecipientFirstName <lgl>, RecipientEmail <lgl>, ExternalReference <lgl>,
#   LocationLatitude <dbl>, LocationLongitude <dbl>, DistributionChannel <chr>,
#   UserLanguage <chr>, Q_RecaptchaScore <dbl>, SURVEYDAY <dbl>,
#   PASSWORD_COLOR <dbl>, PASSWORD <chr>, `7DAYS` <dbl>, YEAR <dbl>,
#   PROGRAM <dbl>, LIVING <dbl>, `GENDER ` <dbl>, SEXUALIDENTITY <dbl>, …

Code

#source: Importing Data Once (Hei & McCarty, 2025): https://shanemccarty.github.io/FRIplaybook/import-once.html
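
The import step above recodes the sentinel values -99 and -50 as missing. A minimal sketch of that pattern on a toy data frame follows; the rows and values are made up for illustration and are not the study data.

```r
# Toy data standing in for the survey export; values are hypothetical.
toy <- data.frame(
  PASSWORD   = c("a1", "b2", "c3"),
  WELLBEING1 = c(4, -99, 2),
  WELLBEING2 = c(-50, 3, 5)
)

# Same pattern as in the import step: replace sentinel codes with NA.
toy[toy == -99] <- NA
toy[toy == -50] <- NA

# Count the missing values per column to confirm the recode worked.
missing_per_column <- colSums(is.na(toy))
print(missing_per_column)
```

Counting missing values per column right after the recode is a quick way to see how much data each variable loses. `read_excel()` also has an `na` argument that may achieve the same recode at import time, assuming the codes appear in the cells exactly as -99 and -50.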

1.2 Daily

Code

library(readxl)
# Import Excel file
dailydata <- read_excel(
    "daily.survey.xlsx",
    col_names = TRUE)

New names:
• `HEARTRATE` -> `HEARTRATE...39`
• `HEARTRATE` -> `HEARTRATE...43`

Code

# Treat the sentinel codes -99 and -50 as missing data
dailydata[dailydata == -99] <- NA
dailydata[dailydata == -50] <- NA


# View first 10 rows
head(dailydata, 10)

# A tibble: 10 × 54
   StartDate   EndDate Status IPAddress Progress Duration (in seconds…¹ Finished
   <chr>       <chr>   <chr>  <chr>     <chr>    <chr>                  <chr>   
 1 Start Date  End Da… Respo… IP Addre… Progress Duration (in seconds)  Finished
 2 45791.8068… 45791.… 0      149.125.… 44       264                    0       
 3 45792.3557… 45795.… 0      66.67.6.… 89       285250                 0       
 4 45811.8646… 45811.… 1      <NA>      100      26                     1       
 5 45818.2859… 45818.… 0      149.125.… 100      604                    1       
 6 45818.9734… 45818.… 0      149.125.… 100      434                    1       
 7 45819.2866… 45819.… 0      149.125.… 100      1013                   1       
 8 45819.8208… 45819.… 0      149.125.… 100      222                    1       
 9 45820.8984… 45820.… 0      172.59.1… 100      171                    1       
10 45821.3397… 45821.… 0      149.125.… 100      1753                   1       
# ℹ abbreviated name: ¹`Duration (in seconds)`
# ℹ 47 more variables: RecordedDate <chr>, ResponseId <chr>,
#   RecipientLastName <chr>, RecipientFirstName <chr>, RecipientEmail <chr>,
#   ExternalReference <chr>, LocationLatitude <chr>, LocationLongitude <chr>,
#   DistributionChannel <chr>, UserLanguage <chr>, Q_RecaptchaScore <chr>,
#   PASSWORD_COLOR <chr>, PASSWORD <chr>, DAY <chr>, `STRESS _1` <chr>,
#   STARTTIME <chr>, ENDTIME <chr>, TIMEINBED <chr>, SLEEPSCORE <chr>, …

Code

#source: Importing Data Once (Hei & McCarty, 2025): https://shanemccarty.github.io/FRIplaybook/import-once.html
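
In the preview above, the first data row repeats the question labels ("Start Date", "Progress", ...), every column is read as `<chr>`, and the dates appear as Excel serial numbers. The sketch below shows one way to handle that on a toy data frame. The values are hypothetical, and dropping exactly one label row is an assumption about the export layout.

```r
# Toy version of an export whose first data row repeats the question labels,
# which forces every column to be read as character. Values are hypothetical.
raw <- data.frame(
  StartDate = c("Start Date", "45791.5", "45792.25"),
  STEPCOUNT = c("Step count", "8450", "10230"),
  PASSWORD  = c("Password", "a1", "b2"),
  stringsAsFactors = FALSE
)

# Drop the label row, then let R re-guess each column's type.
clean <- raw[-1, , drop = FALSE]
clean[] <- lapply(clean, type.convert, as.is = TRUE)
rownames(clean) <- NULL

# Excel's default (1900) date system counts days from 1899-12-30,
# so a serial number converts to a Date with that origin.
clean$StartDate <- as.Date(floor(clean$StartDate), origin = "1899-12-30")

str(clean)
```

With numeric columns in place, later steps such as `rowMeans()` no longer need `as.numeric(as.character(...))` conversions, and the "NAs introduced by coercion" warnings caused by the label row should not appear.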

2 Tidy

2.1 Convert Long to Wide for Day 1 & 7 Surveys

Code

library(tidyr)

## Convert to wide format
wide_onesevendata <- onesevendata %>%
  pivot_wider(
    id_cols = PASSWORD,
    names_from = SURVEYDAY, 
    values_from = c(`WELLBEING1`, `WELLBEING2`, `WELLBEING3`, `WELLBEING4`, `WELLBEING5`, `WELLBEING6`, `WELLBEING7`, `WELLBEING8`), 
    names_glue = "{.value}_T{SURVEYDAY}" 
  )

Warning: Values from `WELLBEING1`, `WELLBEING2`, `WELLBEING3`, `WELLBEING4`,
`WELLBEING5`, `WELLBEING6`, `WELLBEING7` and `WELLBEING8` are not uniquely
identified; output will contain list-cols.
• Use `values_fn = list` to suppress this warning.
• Use `values_fn = {summary_fun}` to summarise duplicates.
• Use the following dplyr code to identify duplicates.
  {data} |>
  dplyr::summarise(n = dplyr::n(), .by = c(PASSWORD, SURVEYDAY)) |>
  dplyr::filter(n > 1L)

Code

#source: https://dcl-prog.stanford.edu/list-columns.html
# I got help from Danica on this part
#source: Tidying your Data (McCarty et al., 2025): https://shanemccarty.github.io/FRIplaybook/tidyr.html
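
The warning above means that at least one PASSWORD and SURVEYDAY combination occurs more than once, so `pivot_wider()` stores several values per cell as a list-column. The sketch below uses a made-up tibble to show how to list the duplicates and how `values_fn` can resolve them during the pivot. Keeping the first response is an assumption that mirrors the later `sapply(., "[", 1)` step; the study may call for a different rule.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: participant "a1" answered Day 1 twice.
toy <- tibble(
  PASSWORD   = c("a1", "a1", "a1", "b2", "b2"),
  SURVEYDAY  = c(1, 1, 7, 1, 7),
  WELLBEING1 = c(3, 4, 5, 2, 3),
  WELLBEING2 = c(2, 2, 4, 5, 5)
)

# List the participant-by-day combinations that occur more than once.
dupes <- toy %>%
  count(PASSWORD, SURVEYDAY) %>%
  filter(n > 1)
print(dupes)

# Resolve duplicates while pivoting by keeping the first response per cell,
# so the output columns are plain numeric vectors rather than list-columns.
toy_wide <- toy %>%
  pivot_wider(
    id_cols = PASSWORD,
    names_from = SURVEYDAY,
    values_from = c(WELLBEING1, WELLBEING2),
    names_glue = "{.value}_T{SURVEYDAY}",
    values_fn = dplyr::first
  )
print(toy_wide)
```

Inspecting `dupes` before choosing a rule helps to tell apart an accidental resubmission from two participants sharing a code.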

2.2 Convert Long to Wide for Daily Survey

Code

library(tidyr)

## Convert to wide format
wide_daily_survey <- dailydata %>%
  pivot_wider(
    id_cols = PASSWORD,
    names_from = DAY,
    values_from = c(SLEEPSCORE, STEPCOUNT), 
    names_glue = "{.value}_T{DAY}" 
  )

Warning: Values from `SLEEPSCORE` and `STEPCOUNT` are not uniquely identified; output
will contain list-cols.
• Use `values_fn = list` to suppress this warning.
• Use `values_fn = {summary_fun}` to summarise duplicates.
• Use the following dplyr code to identify duplicates.
  {data} |>
  dplyr::summarise(n = dplyr::n(), .by = c(PASSWORD, DAY)) |>
  dplyr::filter(n > 1L)

Code

#source: Tidying your Data (McCarty et al., 2025): https://shanemccarty.github.io/FRIplaybook/tidyr.html

3 Import

3.1 Clean Daily Data

Code

library(readxl)
# Import Excel file
daily_survey_clean <- read_excel(
    "daily_survey_clean.xlsx",
    col_names = TRUE)

# Treat the sentinel codes -99 and -50 as missing data
daily_survey_clean[daily_survey_clean == -99] <- NA
daily_survey_clean[daily_survey_clean == -50] <- NA

#source: Importing Data Once (Hei & McCarty, 2025): https://shanemccarty.github.io/FRIplaybook/import-once.html

4 Transform

4.1 Recoding

Code

library(dplyr)
library(ggplot2)

# The duplicate responses flagged by pivot_wider() left list-columns.
# Keep the first response in each cell and convert it to numeric.
wide_onesevendata <- wide_onesevendata %>%
  mutate(across(starts_with("WELLBEING"),
                ~ as.numeric(as.character(sapply(., `[`, 1)))))

Warning: There were 24 warnings in `mutate()`.
The first warning was:
ℹ In argument: `across(...)`.
Caused by warning:
! NAs introduced by coercion
ℹ Run `dplyr::last_dplyr_warnings()` to see the 23 remaining warnings.

Code

# Select the WELLBEING items plus the PASSWORD identifier.
# The items were already converted to numeric above.
masterdata <- wide_onesevendata %>%
  select(starts_with("WELLBEING"), PASSWORD)

# Optional helper that maps the 1-5 codes to text labels.
# It is defined for reference and is not applied below.
recode_labels <- function(x) {
  case_when(
    x == 1 ~ "Not at all",
    x == 2 ~ "A little bit",
    x == 3 ~ "Moderately",
    x == 4 ~ "Quite a bit",
    x == 5 ~ "Extremely",
    TRUE ~ NA_character_
  )
}

# Create mean wellbeing score (numeric) across the item columns.
# matches("^WELLBEING\\d") excludes the WELLBEING composite itself,
# so re-running this chunk does not average the mean back in.
masterdata$WELLBEING <- rowMeans(
  masterdata %>% select(matches("^WELLBEING\\d")),
  na.rm = TRUE
)

# Plot WELLBEING distribution
ggplot(masterdata, aes(x = WELLBEING)) +
  geom_histogram(binwidth = 0.5, color = "black")

Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).
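
The "non-finite" warning above is consistent with how `rowMeans(..., na.rm = TRUE)` behaves when a participant has no valid items at all: the result is `NaN`, which ggplot2 drops. A small sketch with made-up responses:

```r
# Toy item responses; the third participant skipped every item.
items <- data.frame(
  WELLBEING1_T1 = c(4, 3, NA),
  WELLBEING2_T1 = c(5, NA, NA)
)

# na.rm = TRUE averages whatever items are present ...
scores <- rowMeans(items, na.rm = TRUE)
print(scores)

# ... but a row with no valid items yields NaN. Counting answered items
# makes it possible to flag those rows explicitly as NA.
n_answered <- rowSums(!is.na(items))
scores[n_answered == 0] <- NA
print(scores)
```

Keeping `n_answered` alongside the score also shows how many items each mean is based on, which matters when some participants answered only one or two items.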

Code

wide_daily_survey <- wide_daily_survey %>%
  select(-contains("What day of data collection"))

# Convert STEPCOUNT columns to numeric before averaging
wide_daily_survey <- wide_daily_survey %>%
  mutate(across(starts_with("STEPCOUNT"),
                ~ as.numeric(as.character(sapply(., `[`, 1)))))

Warning: There were 8 warnings in `mutate()`.
The first warning was:
ℹ In argument: `across(...)`.
Caused by warning:
! NAs introduced by coercion
ℹ Run `dplyr::last_dplyr_warnings()` to see the 7 remaining warnings.

Code

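# Compute average step count across the daily columns.
# matches("^STEPCOUNT_T") excludes the STEPCOUNT composite itself,
# so re-running this chunk does not average the mean back in.
wide_daily_survey$STEPCOUNT <- rowMeans(
  wide_daily_survey %>% select(matches("^STEPCOUNT_T")),
  na.rm = TRUE
)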

# Plot STEPCOUNT distribution
ggplot(wide_daily_survey, aes(x = STEPCOUNT)) +
  geom_histogram(binwidth = 500, color = "black")

Warning: Removed 5 rows containing non-finite outside the scale range
(`stat_bin()`).

Code

# Checking normality of sleep score
# First convert the SLEEPSCORE list-columns to numeric, keeping the first value per cell
for (col in grep("^SLEEPSCORE", names(wide_daily_survey), value = TRUE)) {
  wide_daily_survey[[col]] <- as.numeric(as.character(sapply(wide_daily_survey[[col]], `[`, 1)))
}

Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion
Warning: NAs introduced by coercion

Code

# Average sleep score across the seven days of data collection
sleep_cols <- paste0("SLEEPSCORE_T", 1:7)
wide_daily_survey$SLEEPSCORE <- rowMeans(wide_daily_survey[, sleep_cols], na.rm = TRUE)

library(ggplot2)
ggplot(wide_daily_survey, aes(x = SLEEPSCORE)) +
  geom_histogram(binwidth = 0.5, color = "black")

Warning: Removed 12 rows containing non-finite outside the scale range
(`stat_bin()`).

Code

#source: DataCamp
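
The histograms give a visual check of normality. As a complement, the sketch below runs a Shapiro-Wilk test and draws a Q-Q plot with base R. The scores are simulated for illustration only; they are not the study data, and the mean and spread are arbitrary choices.

```r
set.seed(1)
# Simulated sleep scores for illustration only; not the study data.
sleep_scores <- rnorm(40, mean = 70, sd = 9)

# Shapiro-Wilk test: a small p-value suggests a departure from normality.
sw <- shapiro.test(sleep_scores)
print(sw)

# Q-Q plot: points close to the line are consistent with a normal distribution.
qqnorm(sleep_scores)
qqline(sleep_scores)
```

With small samples the test has little power, so the plot and the test are best read together rather than treating the p-value alone as a verdict.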

Comparing Sleepscore and Wellbeing

Code

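# Combine the wellbeing and daily datasets by participant code
masterdata2 <- inner_join(masterdata, wide_daily_survey, by = "PASSWORD")

# Scatterplot of mean wellbeing against mean sleep score with a linear fit
ggplot(masterdata2, aes(x = SLEEPSCORE, y = WELLBEING)) +
  geom_smooth(method = "lm") +
  geom_point(position = "jitter")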

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 6 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
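
`inner_join()` keeps only participants whose PASSWORD appears in both tables, so any code typed differently in the two surveys is dropped without a message. The sketch below uses made-up codes to show how `anti_join()` can reveal such rows. Whether the study data contains mismatches like these is not known; the case and whitespace difference is a hypothetical example.

```r
library(dplyr)

# Hypothetical participant codes; "C3 " differs from "c3" by case and a trailing space.
wellbeing <- tibble(PASSWORD = c("a1", "b2", "c3"), WELLBEING = c(3.5, 4.0, 2.8))
daily     <- tibble(PASSWORD = c("a1", "b2", "C3 "), SLEEPSCORE = c(71, 65, 80))

# Rows of `wellbeing` with no partner in `daily`; inner_join() drops these silently.
unmatched <- anti_join(wellbeing, daily, by = "PASSWORD")
print(unmatched)

# Normalising the key before joining recovers the near-miss.
daily_fixed <- daily %>% mutate(PASSWORD = tolower(trimws(PASSWORD)))
joined <- inner_join(wellbeing, daily_fixed, by = "PASSWORD")
print(joined)
```

Comparing `nrow()` before and after the join is a simple check on how many participants were lost.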

Code

summary(masterdata2$WELLBEING)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  1.812   3.000   3.500   3.488   3.938   5.000       1

Code

summary(masterdata2$SLEEPSCORE)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  39.50   65.52   71.67   70.03   76.58   83.33       6

Code

lm(formula = WELLBEING ~ SLEEPSCORE, data = masterdata2)


Call:
lm(formula = WELLBEING ~ SLEEPSCORE, data = masterdata2)

Coefficients:
(Intercept)   SLEEPSCORE  
  3.5172409   -0.0007238

Code

#source: DataCamp
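
Printing an `lm()` object shows only the coefficients. To judge whether a slope such as the -0.0007 above differs from zero, `summary()` and `confint()` add standard errors, p-values, R-squared, and a confidence interval. The sketch below uses simulated data, not the study data; the effect size built into the simulation is arbitrary.

```r
set.seed(2)
# Simulated data for illustration only; not the study data.
toy <- data.frame(SLEEPSCORE = runif(30, min = 40, max = 85))
toy$WELLBEING <- 3 + 0.01 * toy$SLEEPSCORE + rnorm(30, sd = 0.5)

fit <- lm(WELLBEING ~ SLEEPSCORE, data = toy)

# summary() adds standard errors, t-values, p-values, and R-squared.
fit_summary <- summary(fit)
print(coef(fit_summary))
print(fit_summary$r.squared)

# 95% confidence interval for the slope.
slope_ci <- confint(fit)["SLEEPSCORE", ]
print(slope_ci)
```

Applied to the study data, the same calls would be made on `masterdata2` with the models shown in this section.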

4.2 Comparing Stepcount and Wellbeing

Code

ggplot(masterdata2, aes(STEPCOUNT,WELLBEING)) +
  geom_smooth(method = "lm") +
  geom_point(position = "jitter")

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_point()`).

Code

summary(masterdata2$WELLBEING)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  1.812   3.000   3.500   3.488   3.938   5.000       1

Code

summary(masterdata2$STEPCOUNT)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    418    6970    9605   10079   12377   21341       2

Code

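# Note: STEPCOUNT is the outcome here, the reverse of the sleep model
# (WELLBEING ~ SLEEPSCORE), so the two slopes are in different units.
lm(formula = STEPCOUNT ~ WELLBEING, data = masterdata2)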


Call:
lm(formula = STEPCOUNT ~ WELLBEING, data = masterdata2)

Coefficients:
(Intercept)    WELLBEING  
       3064         2013

Code

#source: DataCamp
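
The two regression slopes above cannot be compared directly: one is in well-being points per sleep-score point, the other in steps per well-being point. Because the study asks which variable is more strongly correlated with well-being, one option is to compare Pearson correlations, which are symmetric and unit-free. The sketch below does this on simulated data. The data, the effect built into it, and the choice of Pearson's r are all assumptions for illustration, not the study's results or confirmed method.

```r
set.seed(3)
# Simulated data for illustration only; not the study data.
toy <- data.frame(
  SLEEPSCORE = rnorm(30, mean = 70, sd = 9),
  STEPCOUNT  = rnorm(30, mean = 10000, sd = 4000)
)
toy$WELLBEING <- 3.5 + 0.02 * (toy$SLEEPSCORE - 70) + rnorm(30, sd = 0.6)

# cor.test() drops incomplete pairs and reports r with a p-value.
r_sleep <- cor.test(toy$WELLBEING, toy$SLEEPSCORE)
r_steps <- cor.test(toy$WELLBEING, toy$STEPCOUNT)

# Collect both correlations in one table for a side-by-side comparison.
comparison <- data.frame(
  predictor = c("SLEEPSCORE", "STEPCOUNT"),
  r         = unname(c(r_sleep$estimate, r_steps$estimate)),
  p_value   = c(r_sleep$p.value, r_steps$p.value)
)
print(comparison)
```

A larger absolute r indicates a stronger linear association. With a sample of this size, confidence intervals for the two correlations are wide, so a difference between them should be interpreted cautiously.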