Basic Statistics

Load Libraries

# if you haven't run this code before, you'll need to download the below packages first
# instructions on how to do this are included in the video
# but as a reminder, you use the packages tab to the right

library(psych) # for the describe() command
library(expss) # for the cross_cases() command
## Loading required package: maditr
## 
## Use magrittr pipe '%>%' to chain several operations:
##              mtcars %>%
##                  let(mpg_hp = mpg/hp) %>%
##                  take(mean(mpg_hp), by = am)
## 
## 
## Attaching package: 'maditr'
## The following object is masked from 'package:base':
## 
##     sort_by
## 
## Use 'expss_output_rnotebook()' to display tables inside R Notebooks.
##  To return to the console output, use 'expss_output_default()'.
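
The library() calls above only work once the packages have been downloaded. As an alternative to the Packages tab, the check can be sketched from the console as below. The names `needed` and `to_install` are illustrative, and the `interactive()` guard is an assumption added here so that nothing gets installed while a document is being knitted.

```r
# packages this lab loads with library()
needed <- c("psych", "expss")

# requireNamespace() returns FALSE when a package is not installed
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
to_install <- needed[!installed]

# only install when running interactively (assumption: avoids installs during knitting)
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

If `to_install` is empty, both packages are already available and nothing else is needed.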

Import Data

# import our data for the lab
# for the homework, you will import the mydata.csv that we created in the Data Prep Lab

d2 <- read.csv(file="Data/mydata.csv", header = TRUE) # TRUE is safer than T, which can be overwritten
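
After importing, it can help to confirm that the file loaded as expected. The sketch below uses a tiny made-up file (toy values, not the lab data) so that it runs on its own; with the real data the same three checks would be applied to d2.

```r
# write a tiny example file so the sketch is self-contained (toy values only)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(swb = c(4.5, 3.0, 6.2), support = c(5.75, 6.0, 4.25)),
          tmp, row.names = FALSE)

toy <- read.csv(file = tmp, header = TRUE)

str(toy)   # variable names and types
head(toy)  # first few rows
nrow(toy)  # number of participants (rows)
```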

Univariate Plots: Histograms & Tables

table(d2$age) #the table command shows us what the levels of this variable are, and how many participants are in each level
## 
## 1 between 18 and 25 2 between 26 and 35 3 between 36 and 45           4 over 45 
##                1947                 115                  37                  16
table(d2$marriage5)
## 
##             are currently divorced from one another 
##                                                 509 
##                are currently married to one another 
##                                                1401 
##       never married each other and are not together 
##                                                 166 
## never married each other but are currently together 
##                                                  39
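
Raw counts are easier to compare as percentages. The sketch below re-enters the age counts from the table(d2$age) output above and converts them with prop.table(); the names `age_counts` and `age_pct` are illustrative.

```r
# counts copied from the table(d2$age) output above
age_counts <- c("1 between 18 and 25" = 1947,
                "2 between 26 and 35" = 115,
                "3 between 36 and 45" = 37,
                "4 over 45" = 16)

# prop.table() divides each count by the total
age_pct <- 100 * prop.table(age_counts)
round(age_pct, 1)
```

With the data loaded, the same percentages should come from prop.table(table(d2$age)) multiplied by 100.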
hist(d2$moa_safety)

#the hist command creates a histogram of the variable
hist(d2$swb)

hist(d2$support)

hist(d2$socmeduse)
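
The default histograms can be made easier to read with a title, an axis label, and a suggested number of bins. This is a minimal sketch on simulated scores (not the lab data); the title and label text are placeholders.

```r
set.seed(1)                             # make the simulated values reproducible
x <- rnorm(200, mean = 4.4, sd = 1.3)   # simulated scores, not the lab data

h <- hist(x,
          main = "Histogram of simulated scores",
          xlab = "Score",
          breaks = 10)                  # a suggestion; R may pick a nearby number of bins

sum(h$counts)   # every observation falls into exactly one bin
```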

Univariate Normality

We analyzed the skew and kurtosis of our continuous variables, and all were within the accepted range (-2/+2). (This is true for the lab; it may not be true for the homework.)

We analyzed the skew and kurtosis of our variables, and most were within the accepted range (-2/+2). However, some variables (age: skew = 4.41, kurtosis = 21.16) were outside the accepted range. For this analysis we will use them anyway, but outside of this class this is bad practice.

describe(d2) # we use this to check univariate normality: skew and kurtosis should fall within -2/+2
##            vars    n  mean   sd median trimmed  mad min max range  skew
## age*          1 2115  1.11 0.42   1.00    1.00 0.00   1   4     3  4.41
## marriage5*    2 2115  1.87 0.61   2.00    1.82 0.00   1   4     3  0.56
## moa_safety    3 2115  3.21 0.65   3.25    3.28 0.74   1   4     3 -0.71
## swb           4 2115  4.44 1.33   4.50    4.50 1.48   1   7     6 -0.36
## support       5 2115  5.54 1.13   5.75    5.66 0.99   0   7     7 -1.09
## socmeduse     6 2115 34.23 8.59  35.00   34.50 7.41  11  55    44 -0.31
##            kurtosis   se
## age*          21.16 0.01
## marriage5*     1.52 0.01
## moa_safety    -0.06 0.01
## swb           -0.48 0.03
## support        1.34 0.02
## socmeduse      0.18 0.19
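
Rather than scanning the output by eye, the -2/+2 check can be written out. The sketch below re-enters the skew and kurtosis values from the describe(d2) output above and lists the variables that fall outside the range; the names `normality`, `cutoff`, and `flagged` are illustrative.

```r
# skew and kurtosis copied from the describe(d2) output above
normality <- data.frame(
  variable = c("age", "marriage5", "moa_safety", "swb", "support", "socmeduse"),
  skew     = c(4.41, 0.56, -0.71, -0.36, -1.09, -0.31),
  kurtosis = c(21.16, 1.52, -0.06, -0.48, 1.34, 0.18),
  stringsAsFactors = FALSE
)

cutoff <- 2  # accepted range is -2 to +2

# TRUE where either statistic is outside the accepted range
outside <- abs(normality$skew) > cutoff | abs(normality$kurtosis) > cutoff
flagged <- normality$variable[outside]
flagged
```

Only age is flagged. Note that describe() marks age and marriage5 with an asterisk because they are categorical variables that were converted to numbers, so their skew and kurtosis should be read with caution.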

Bivariate Plots

Crosstabs

cross_cases(d2, age, marriage5) # for the homework, replace age and marriage5 with your own categorical variable names
| age / marriage5 | are currently divorced from one another | are currently married to one another | never married each other and are not together | never married each other but are currently together |
| 1 between 18 and 25 | 453 | 1311 | 147 | 36 |
| 2 between 26 and 35 | 38 | 62 | 13 | 2 |
| 3 between 36 and 45 | 15 | 16 | 5 | 1 |
| 4 over 45 | 3 | 12 | 1 | |
| #Total cases | 509 | 1401 | 166 | 39 |
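
Because the age groups differ so much in size, row percentages make the crosstab easier to compare. The sketch below re-enters the counts from the output above in base R. The blank cell (over 45, never married but currently together) is entered as 0, which is an inference from the column total of 39; the object names are illustrative.

```r
age_levels <- c("1 between 18 and 25", "2 between 26 and 35",
                "3 between 36 and 45", "4 over 45")
marriage_levels <- c("divorced", "married",
                     "never married, not together",
                     "never married, together")   # shortened labels

# counts copied from the cross_cases() output, one row per age group
counts <- matrix(c(453, 1311, 147, 36,
                    38,   62,  13,  2,
                    15,   16,   5,  1,
                     3,   12,   1,  0),   # blank cell entered as 0 (inferred)
                 nrow = 4, byrow = TRUE,
                 dimnames = list(age = age_levels, marriage5 = marriage_levels))

# margin = 1 makes each row sum to 100%
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
row_pct
```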

Scatterplots

plot(d2$support, d2$socmeduse,
     main="Scatterplot of support and socmeduse",
     xlab = "support",
     ylab = "socmeduse")

plot(d2$swb, d2$moa_safety,
     main="Scatterplot of swb and moa_safety",
     xlab = "swb",
     ylab = "moa_safety")
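
A scatterplot is often easier to read with a trend line and a correlation next to it. This is a minimal sketch on simulated data (not the lab variables); the variable names and the built-in relationship between x and y are assumptions made only for illustration.

```r
set.seed(2)                  # make the simulated values reproducible
x <- rnorm(100)              # simulated predictor
y <- 0.5 * x + rnorm(100)    # simulated outcome, related to x by construction

plot(x, y,
     main = "Scatterplot of simulated x and y",
     xlab = "x",
     ylab = "y")
abline(lm(y ~ x))            # add the least-squares line to the existing plot

r <- cor(x, y)               # Pearson correlation
r
```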

Boxplots

# boxplots use one categorical and one continuous variable
# make sure that you enter them in the right order
# the continuous variable goes before the tilde (~)
# the categorical variable goes after the tilde

boxplot(socmeduse ~ age, data = d2,
        main = "Boxplot of Social Media Use by Age",
        xlab = "Age",               # grouping (categorical) variable is on the x-axis
        ylab = "Social Media Use")  # continuous variable is on the y-axis

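# the original call grouped by age; marriage5 is used here so the plot matches its labels
boxplot(support ~ marriage5, data = d2,
        main = "Boxplot of Support by Parents' Marital Status",
        xlab = "Parents' Marital Status",
        ylab = "Level of Support")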
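
The formula order can be checked on a small example. In the sketch below (toy data with hypothetical names `group` and `score`), the continuous variable sits before the tilde and the categorical variable after it, and the object that boxplot() returns confirms which variable was used for grouping.

```r
# toy data: two groups of five scores each (made-up values)
toy <- data.frame(
  group = rep(c("a", "b"), each = 5),
  score = c(1, 2, 3, 4, 5, 3, 4, 5, 6, 7)
)

# continuous ~ categorical
b <- boxplot(score ~ group, data = toy,
             main = "Boxplot of Score by Group",
             xlab = "Group",
             ylab = "Score")

b$names  # the group labels drawn on the x-axis
b$n      # number of observations in each box
```

If the two variables are swapped, R draws one box per distinct score value, which is a quick sign that the order is wrong.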