# if you haven't run this code before, you'll need to install the packages below first
# you should see a prompt near the top of the page (in a yellow bar within the RStudio window)
# you can also use the packages tab to the right
library(naniar) # for the gg_miss_upset() command
Data Prep
Load Libraries
Import Data
# # for the HW, you'll import the CSV file of your chosen dataset
# df <- read.csv(file = "eammi2_data_final.csv")
Viewing Data
# # these are commands useful for viewing a dataframe
# # you can also click the object in the environment tab to view it in a new window
# names(df)
# head(df)
# str(df)
Subsetting Data
# # use the codebook you created in the codebook activity to get the names of your variables (first column)
# # enter this list of names in the select=c() argument to subset those columns from the dataframe
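A minimal sketch of how the select argument behaves, using a made-up data frame rather than the homework dataset. The names `toy_df`, `toy_d`, `id`, and `extra` are hypothetical and exist only for illustration.

```r
# Toy data frame standing in for the imported dataset (all values are made up)
toy_df <- data.frame(
  id     = 1:4,
  age    = c(19, 22, NA, 20),
  stress = c(3.1, NA, 2.4, 4.0),
  extra  = c("a", "b", "c", "d")
)

# Keep only the columns listed in select = c(); column names are written unquoted
toy_d <- subset(toy_df, select = c(age, stress))

# Check which columns were kept and that no rows were removed
names(toy_d)
nrow(toy_d)
```

Subsetting by column leaves the number of rows unchanged; rows are only removed later, when missing data are dropped.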
# d <- subset(df, select = c(age, gender, stress, swb, moa_maturity))
Missing Data
# # use the gg_miss_upset() command for a visualization of your missing data
# gg_miss_upset(d, nsets = 6)
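As a numeric complement to the plot, missing values can also be counted with base R. This is a sketch on a made-up data frame (`toy_d` and its values are hypothetical), not part of the original template.

```r
# Toy data frame with a few missing values (all values are made up)
toy_d <- data.frame(
  age    = c(19, 22, NA, 20, 21),
  stress = c(3.1, NA, 2.4, 4.0, NA),
  swb    = c(5.0, 4.2, 3.8, 4.4, 4.9)
)

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy_d))
missing_per_column

# Number of rows (participants) with at least one missing value
rows_with_missing <- sum(!complete.cases(toy_d))
rows_with_missing
```

The second count is the number of participants that na.omit() would drop from this toy data frame.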
#
# # use the na.omit() command to create a new dataframe in which any participants with missing data are dropped from the dataframe
# d2 <- na.omit(d)
#
# # use a bit of math to see what percentage of participants had missing data
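One way to do this math is to compare the number of rows before and after na.omit(). The sketch below uses a made-up data frame; `toy_d`, `toy_d2`, `n_total`, `n_final`, and `pct_dropped` are hypothetical names, and with the real data you would use your own d and d2 instead.

```r
# Toy data frame with missing values (all values are made up)
toy_d <- data.frame(
  age    = c(19, 22, NA, 20, 21),
  stress = c(3.1, NA, 2.4, 4.0, 3.3)
)

# Drop every row that has at least one missing value
toy_d2 <- na.omit(toy_d)

n_total <- nrow(toy_d)   # participants before dropping
n_final <- nrow(toy_d2)  # participants after dropping

# Percentage of participants dropped because of missing data
pct_dropped <- 100 * (n_total - n_final) / n_total
round(pct_dropped, 2)
```

Here two of the five toy rows contain an NA, so 40% are dropped and three remain. The rounded percentage and `n_final` are the two numbers the write-up asks for.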
# # math will go here
Exporting Data
# # last step is to export the data after you've dropped NAs
# write.csv(d2, file = "../Data/labdata.csv", row.names = FALSE)
Write-Up
In this section, you should create a write-up of what you did, using language that would be suitable for a manuscript. Make sure you include: selecting six variables to focus on, dropping participants with missing data, the percentage of participants who were dropped, and your final number of participants. I have given you a template below that you can follow. Delete the other text in this section and include only your write-up.
Remember: this should be writing appropriate for a manuscript! We use abbreviations when referring to variables in R, but these labels don’t make sense in a manuscript. Use more descriptive names for your variables in your write-up.
We selected six variables from the [placeholder] dataset to focus on in our analysis: [placeholder]. Participants with missing data on any of these six variables ([placeholder]%) were dropped from our analysis, leaving a final sample of n = [placeholder].