Project 2. RMD

In this project, I’m exploring the relationship between social media usage, specifically TikTok, and academic performance. To do this, I analyze a dataset that contains information about Chinese teenagers’ TikTok usage, study habits, and academic performance. I chose this topic because, as a first-year college student, I struggle with using social media instead of using that time to study.

I examine variables such as tiktok_use_hours_mon, study_hours_mon, and sleep_quality. I am also interested in how digital habits affect my own study efficiency. My analysis involves cleaning the data, and I used dplyr to subset my observations to ensure a focused look at student habits.

To what extent do daily TikTok usage hours, sleep quality, and self-study time affect the academic performance (GPA) of Chinese teenagers?

Loading the libraries and fixing my variables

I came across an issue with one of my variables, gender (gender_1-4): there were too many categories, so I narrowed it down to the two I wanted, “Male” and “Female”, so that the extra categories no longer show up.

Another thing I did in this code is fix the hours. I noticed that the TikTok usage variable showed values such as 120 hours for Monday, which is not possible, so I limited it to the 24 hours in a day and converted it to minutes so that it made more sense.

library(readr)

# 1. Read the file into the variable 'df'
df <- read_csv("/Users/sadiyasow/Downloads/TikTok_Cleaned.csv")

## Rows: 362 Columns: 72
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr   (6): UserLanguage, media_use_freq_6_TEXT, phone_4_TEXT, education_part...
## dbl  (63): Status, Progress, Duration__in_seconds_, Finished, Q_RecaptchaSco...
## dttm  (3): StartDate, EndDate, RecordedDate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# 2. Preview the first rows of 'df'
head(df)

## # A tibble: 6 × 72
##   StartDate           EndDate             Status Progress Duration__in_seconds_
##   <dttm>              <dttm>               <dbl>    <dbl>                 <dbl>
## 1 2021-01-19 04:36:52 2021-01-19 04:37:16      0      100                    24
## 2 2021-01-19 04:36:26 2021-01-19 04:37:20      0      100                    54
## 3 2021-01-19 04:31:28 2021-01-19 04:39:45      0      100                   496
## 4 2021-01-19 04:31:11 2021-01-19 04:40:25      0      100                   553
## 5 2021-01-19 04:33:11 2021-01-19 04:44:57      0      100                   705
## 6 2021-01-19 04:38:35 2021-01-19 04:45:00      0      100                   385
## # ℹ 67 more variables: Finished <dbl>, RecordedDate <dttm>, UserLanguage <chr>,
## #   Q_RecaptchaScore <dbl>, consent_parents <dbl>, consent_children <dbl>,
## #   media_use_freq_1 <dbl>, media_use_freq_2 <dbl>, media_use_freq_3 <dbl>,
## #   media_use_freq_4 <dbl>, media_use_freq_5 <dbl>, media_use_freq_6 <dbl>,
## #   media_use_freq_6_TEXT <chr>, media_use_rank_1 <dbl>,
## #   media_use_rank_2 <dbl>, media_use_rank_3 <dbl>, media_use_rank_4 <dbl>,
## #   media_use_rank_5 <dbl>, media_use_rank_6 <dbl>, …

# Load the necessary libraries
library(readr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(highcharter)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

# Set the working directory to where your Rmd file is
knitr::opts_knit$set(root.dir = getwd())

file_path <- "/Users/sadiyasow/Downloads/TikTok_Cleaned.csv"


if (file.exists(file_path)) {
 
  message("File loaded successfully!")
} else {
  stop(paste("ERROR: R cannot find the file at this location:", file_path, 
             "\nCheck if the filename in your Downloads folder matches exactly."))
}

## File loaded successfully!

library(readr)
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

getwd()

## [1] "/Users/sadiyasow/Downloads"

Load file with read_csv():

head(df)

## # A tibble: 6 × 72
##   StartDate           EndDate             Status Progress Duration__in_seconds_
##   <dttm>              <dttm>               <dbl>    <dbl>                 <dbl>
## 1 2021-01-19 04:36:52 2021-01-19 04:37:16      0      100                    24
## 2 2021-01-19 04:36:26 2021-01-19 04:37:20      0      100                    54
## 3 2021-01-19 04:31:28 2021-01-19 04:39:45      0      100                   496
## 4 2021-01-19 04:31:11 2021-01-19 04:40:25      0      100                   553
## 5 2021-01-19 04:33:11 2021-01-19 04:44:57      0      100                   705
## 6 2021-01-19 04:38:35 2021-01-19 04:45:00      0      100                   385
## # ℹ 67 more variables: Finished <dbl>, RecordedDate <dttm>, UserLanguage <chr>,
## #   Q_RecaptchaScore <dbl>, consent_parents <dbl>, consent_children <dbl>,
## #   media_use_freq_1 <dbl>, media_use_freq_2 <dbl>, media_use_freq_3 <dbl>,
## #   media_use_freq_4 <dbl>, media_use_freq_5 <dbl>, media_use_freq_6 <dbl>,
## #   media_use_freq_6_TEXT <chr>, media_use_rank_1 <dbl>,
## #   media_use_rank_2 <dbl>, media_use_rank_3 <dbl>, media_use_rank_4 <dbl>,
## #   media_use_rank_5 <dbl>, media_use_rank_6 <dbl>, …

Data Cleaning:

Data Cleaning (dplyr): I used the filter() and mutate() commands from the dplyr library. The filter() command was essential for removing missing values (NA) in key columns, ensuring that the visualization only reflects complete observations. The mutate() command was used to transform the gender variable into a factor, which allows the plotting engine to treat these values as distinct groups (colors) rather than a continuous numeric range.
One issue I had to fix was that the hours for TikTok usage did not add up. Some values showed up as “120 hours” for a single day, so I filtered the data to at most 24 hours and converted the hours to minutes.
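As a small illustration of the cleaning steps described above, the sketch below applies filter() and mutate() to a tiny made-up data frame. The values in toy are invented for the example and are not rows from TikTok_Cleaned.csv; only the column names follow the ones used in this project, and the Male/Female labels assume that gender codes 1 and 2 mean what the later chart code says they mean.

```r
library(dplyr)

# Made-up example rows (NOT from the real dataset); column names mirror this project
toy <- data.frame(
  gender               = c(1, 2, 3, 2, NA),
  tiktok_use_hours_mon = c(2, 120, 1.5, 0.5, 3)
)

toy_clean <- toy %>%
  # Drop rows with missing values in the columns we need
  filter(!is.na(gender), !is.na(tiktok_use_hours_mon)) %>%
  # Keep only the two gender codes used in the charts
  filter(gender %in% c(1, 2)) %>%
  # A single day cannot have more than 24 hours of usage
  filter(tiktok_use_hours_mon <= 24) %>%
  mutate(
    # Turn the numeric code into a labeled factor so plots treat it as groups
    gender_label = factor(gender, levels = c(1, 2), labels = c("Male", "Female")),
    # Convert hours to minutes (1 hour = 60 minutes)
    tiktok_use_min_mon = tiktok_use_hours_mon * 60
  )

toy_clean
```

In this toy example only two rows remain: the 120-hour row, the row with gender code 3, and the row with a missing gender are all removed.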

# Average Monday TikTok hours for each sleep quality rating
df_summary <- df %>%
  filter(!is.na(sleep_quality), !is.na(tiktok_use_hours_mon)) %>%
  group_by(sleep_quality) %>%
  summarize(mean_usage = mean(tiktok_use_hours_mon, na.rm = TRUE)) %>%
  # Treat the rating as a category so each bar gets its own color
  mutate(sleep_quality = factor(sleep_quality))

# Bar chart of mean usage by sleep quality
ggplot(df_summary, aes(x = sleep_quality, y = mean_usage, fill = sleep_quality)) +
  geom_col() +
  labs(title = "Average TikTok Usage by Sleep Quality",
       x = "Sleep Quality Rating (1=Poor, 4=Good)",
       y = "Mean TikTok Usage (Hours/Monday)") +
  theme_minimal() +
  theme(legend.position = "none")

I wanted to see if more TikTok time meant worse sleep, so I made this bar chart to look at the average usage for each sleep quality rating. Surprisingly, the data shows that students who reported the best sleep (level 4) also have the highest average TikTok usage, which I wasn’t expecting. This points to a positive association between the two, and it suggests that for these students TikTok isn’t necessarily ruining their sleep quality. Another factor could be playing into it, such as using TikTok differently or managing their time better.
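One way to back up what the bar chart suggests with a number is a Spearman rank correlation, which fits an ordered rating like sleep_quality. The sketch below is only an illustration: the values in toy are invented and are not taken from TikTok_Cleaned.csv, so the result says nothing about the real students.

```r
# Made-up values for illustration only (NOT from TikTok_Cleaned.csv)
toy <- data.frame(
  sleep_quality        = c(1, 1, 2, 2, 3, 3, 4, 4),
  tiktok_use_hours_mon = c(0.5, 1.0, 1.0, 1.5, 1.5, 2.0, 2.5, 3.0)
)

# Spearman works on ranks, so it does not assume the 1-4 ratings are evenly spaced
rho <- cor(toy$sleep_quality, toy$tiktok_use_hours_mon, method = "spearman")

# A value above 0 means higher ratings go with more usage in this toy data
rho
```

On the real data, the same call could be run on df$sleep_quality and df$tiktok_use_hours_mon with use = "complete.obs" added so that rows with missing values are skipped.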

df_plot <- df %>%
  # Remove missing data
  filter(!is.na(tiktok_use_hours_mon), !is.na(scores), !is.na(gender)) %>%
  # Explicitly keep only gender 1 and 2
  filter(gender %in% c(1, 2)) %>%

  mutate(gender_label = factor(gender, levels = c(1, 2), labels = c("Male", "Female")))


hchart(df_plot, "scatter", hcaes(x = tiktok_use_hours_mon, 
                                 y = scores, 
                                 group = gender_label)) %>%
  # Blue for Male (#377EB8), Pink for Female (#F781BF)
  hc_colors(c("#377EB8", "#F781BF")) %>% 
  # This adds the score numbers on the dots
  hc_plotOptions(scatter = list(dataLabels = list(enabled = TRUE, format = "{point.y}"))) %>%
  hc_title(text = "TikTok Usage vs. Academic Score by Gender") %>%
  hc_xAxis(title = list(text = "TikTok Hours (Monday)")) %>%
  hc_yAxis(title = list(text = "Academic Score (GPA Proxy)")) %>%
  hc_add_theme(hc_theme_elementary()) %>%
  hc_caption(text = "Source: TikTok_Cleaned.csv")

This scatter plot shows how daily TikTok usage relates to academic performance (the score is used as a proxy for GPA), with the data broken down by gender to see if there are any differences.

To get the data ready, I had to do a bit of cleaning. I removed entries with missing values to keep the results accurate and filtered the dataset to focus on two genders, Male and Female, since there were more than two gender categories in the data. I also converted the time data into minutes and capped it at 24 hours (1,440 minutes). This was a necessary step because the raw data included some unrealistic values, like 120-hour days, which aren’t possible.

Looking at the plot, I was checking for a trend to see if higher TikTok usage correlates with lower academic scores. While it’s hard to draw a definitive conclusion, the scatter plot shows where the individual students fall. It suggests that academic success doesn’t come down to just one factor.
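Because a trend can be hard to judge by eye, a simple linear regression is one possible way to check its direction. The sketch below is a minimal example on invented numbers: the values in toy are not from TikTok_Cleaned.csv, and only the column names (tiktok_use_hours_mon and scores) follow the ones used in the chart above.

```r
# Made-up values for illustration only (NOT from TikTok_Cleaned.csv)
toy <- data.frame(
  tiktok_use_hours_mon = c(0.5, 1, 1.5, 2, 3, 4, 5, 6),
  scores               = c(88, 92, 80, 85, 78, 83, 70, 75)
)

# Fit a straight line: scores = intercept + slope * TikTok hours
fit <- lm(scores ~ tiktok_use_hours_mon, data = toy)

# Slope: change in score for each extra hour of TikTok
slope <- unname(coef(fit)["tiktok_use_hours_mon"])
slope

# R-squared: share of the variation in scores that the line explains
r2 <- summary(fit)$r.squared
r2
```

A negative slope would mean scores tend to fall as usage rises, and a small R-squared would support the idea that usage alone explains little of the difference between students.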

library(highcharter)
library(dplyr)

# 1. Clean the data: Filter for valid 0-24 hour range, then convert to minutes
df_plot <- df %>%
  filter(!is.na(tiktok_use_hours_mon), !is.na(study_hours_mon), !is.na(gender)) %>%
  # Filter to remove anything greater than 24 hours
  filter(tiktok_use_hours_mon <= 24 & study_hours_mon <= 24) %>%
  mutate(
    gender_factor = as.factor(gender),
    # Convert to minutes
    tiktok_use_min_mon = tiktok_use_hours_mon * 60,
    study_min_mon = study_hours_mon * 60
  )

hchart(df_plot, "scatter", hcaes(x = tiktok_use_min_mon, 
                                 y = study_min_mon, 
                                 size = sleep_quality, 
                                 group = gender_factor)) %>%
  # Colors for the gender groups
  hc_colors(c("#377EB8", "#F781BF", "#984EA3")) %>% 
  # Label each point with its study time in minutes (the y value)
  hc_plotOptions(scatter = list(dataLabels = list(enabled = TRUE, format = "{point.y}"))) %>%
  # Both axes run from 0 to 1,440 minutes (24 hours)
  hc_xAxis(title = list(text = "TikTok Usage (Minutes/Monday)"),
           min = 0, max = 1440,
           labels = list(format = "{value}")) %>%
  hc_yAxis(title = list(text = "Study Time (Minutes)"),
           min = 0, max = 1440,
           labels = list(format = "{value}")) %>%
  hc_title(text = "Study Time vs. TikTok Usage (Minutes, Limited to 24hr)") %>%
  hc_add_theme(hc_theme_elementary()) %>%
  hc_caption(text = "Source: TikTok_Cleaned.csv")

For this final visualization, I wanted to see whether study time and TikTok usage trade off against each other, while also factoring in sleep quality. I used a bubble chart to map TikTok usage on the x-axis and study time on the y-axis. The cool part is that the size of each bubble represents the student’s reported sleep quality, so this graph includes a third variable.

I had to do some more data cleaning, which again meant limiting the hours to 24 per day. I then multiplied the hourly data by 60 to convert everything to minutes so it was easier to read, even though the plot is still very clustered. I used AI to help with the minutes conversion. For the grouping I used gender_factor so that the colors were grouped by gender, and data labels to show each point’s specific study time.

Linear regression model:

model <- lm(study_hours_mon ~ tiktok_use_hours_mon + sleep_quality, data = df)

# View the results
summary(model)

## 
## Call:
## lm(formula = study_hours_mon ~ tiktok_use_hours_mon + sleep_quality, 
##     data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -108950  -39969  -39425  -39413 9896482 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)          167608.4   103745.1   1.616    0.107
## tiktok_use_hours_mon    271.7     4830.5   0.056    0.955
## sleep_quality        -64090.4    47651.4  -1.345    0.180
## 
## Residual standard error: 605400 on 270 degrees of freedom
##   (89 observations deleted due to missingness)
## Multiple R-squared:  0.006713,   Adjusted R-squared:  -0.000645 
## F-statistic: 0.9123 on 2 and 270 DF,  p-value: 0.4028
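
The residuals above reach 9,896,482, which suggests the model was fit on rows that still contain impossible hour values, since `data = df` does not go through the 24-hour filter used for the plots. Below is a minimal sketch of fitting the same formula before and after that filter, using simulated data (`sim` and all of its values are made up, so results on the real survey may differ).

```r
set.seed(42)

# Simulated stand-in for the survey; NOT the real data.
n <- 100
sim <- data.frame(
  tiktok_use_hours_mon = runif(n, 0, 6),
  sleep_quality        = sample(1:4, n, replace = TRUE)
)
sim$study_hours_mon <- 5 - 0.3 * sim$tiktok_use_hours_mon +
  0.2 * sim$sleep_quality + rnorm(n, sd = 1)

# Inject two impossible values, similar in spirit to the raw file
sim$study_hours_mon[1:2] <- c(1e7, 120)

# Apply the same 24-hour rule used for the plots before modelling
sim_clean <- subset(sim, tiktok_use_hours_mon <= 24 & study_hours_mon <= 24)

model_raw   <- lm(study_hours_mon ~ tiktok_use_hours_mon + sleep_quality, data = sim)
model_clean <- lm(study_hours_mon ~ tiktok_use_hours_mon + sleep_quality, data = sim_clean)

# Compare residual standard errors: the outliers dominate the unfiltered fit
c(raw = sigma(model_raw), clean = sigma(model_clean))
```

In this simulation a couple of extreme rows are enough to inflate the residual standard error by orders of magnitude, so it may be worth refitting the real model on the filtered data before interpreting the coefficients.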

Equation:

lm(study_hours_mon ~ tiktok_use_hours_mon + sleep_quality, data = df)

$$\text{study\_hours\_mon} = \beta_0 + \beta_1 \cdot \text{tiktok\_use\_hours\_mon} + \beta_2 \cdot \text{sleep\_quality} + \varepsilon$$

$\beta_0$ (Intercept): The predicted study hours when TikTok usage and sleep quality are zero.
$\beta_1$: The expected change in study hours for every 1-hour increase in TikTok usage (holding sleep quality constant).
$\beta_2$: The expected change in study hours for every 1-unit increase in sleep quality (holding TikTok usage constant).
$\varepsilon$: The error term (residual).
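
To make the interpretation concrete, the estimates printed in the summary output above can be plugged into the equation by hand. The sketch below copies the three coefficients and computes a predicted value for a hypothetical student (3 TikTok hours and sleep quality 2, both assumed inputs). The size of the result reflects the scale problem in the unfiltered data rather than a realistic number of study hours.

```r
# Coefficients copied from the summary(model) output
b0 <- 167608.4   # (Intercept)
b1 <- 271.7      # tiktok_use_hours_mon
b2 <- -64090.4   # sleep_quality

# Hypothetical student (assumed values for illustration)
tiktok_hours  <- 3
sleep_quality <- 2

# Prediction = intercept + slope * TikTok hours + slope * sleep quality
predicted <- b0 + b1 * tiktok_hours + b2 * sleep_quality
predicted  # about 40242.7
```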

Conclusion

This project has provided a practical look at how student survey data can be transformed into actionable insights. By cleaning the “TikTok_Cleaned.csv” dataset and applying visualization, I was able to observe how social media engagement correlates with academic and lifestyle variables.

While the data did not reveal a simple, one-size-fits-all rule, it highlights the diversity of student experiences. Some students keep their study hours relatively high regardless of TikTok usage, while others show a more inverse relationship. This project reinforced how crucial data cleaning is: removing the “NA” values and properly characterizing variables helps prevent errors and improves insights. To conclude, this project serves as a step toward being more mindful of our time management.

What I would do differently is maybe pick a better dataset, since this one needed a lot of cleaning, processing, and filtering, as I said above. Overall, though, I liked the project, and it made me think about the decisions I should be making about my study and sleep time.

Citation:

from (https://www.sciencedirect.com/science/article/pii/S0001691824004438)

TikTok_Cleaned.csv

AI was used to change my file to CSV, to try to figure out why the RMD file wasn’t rendering, and lastly to help me figure out how to convert my hours into minutes for my visualizations. :)

Project 2. RMD

Sadiya Sow

2026-04-23
