Basic Statistics Lab

Load Libraries

# remember, you might need to install packages

library(psych) # for the describe() command
library(expss) # for the cro() command (short form of cross_cases())
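
The reminder about installing packages can be sketched as a small check that runs before the library() calls. This is only an illustration: the helper name missing_packages is hypothetical, and the interactive() guard is an assumption made so that nothing is installed while the document is being knitted.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The two packages this lab loads with library()
to_install <- missing_packages(c("psych", "expss"))

# Install only in an interactive session, and only if something is missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Run this once in the console; after that the library() calls above should load without errors.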

Load Data

d <- read.csv(file = "Data/mydata.csv", header = TRUE)
names(d)
[1] "age"    "gender" "rse"    "pss"    "phq"    "edeq12"

Univariate Plots: Histograms & Tables

 table(d$age)

         1 under 18 2 between 18 and 25 3 between 26 and 35 4 between 36 and 45 
                619                  55                   6                  84 
          5 over 45 
                175 
table(d$gender)

            female I use another term               male  Prefer not to say 
               764                 21                143                 11 
 hist(d$rse)

 hist(d$pss)

 hist(d$phq)

 hist(d$edeq12)

Univariate Normality

Check skew and kurtosis.

psych::describe(d)
        vars   n mean   sd median trimmed  mad min max range  skew kurtosis
age*       1 939 2.09 1.64   1.00    1.86 0.00   1   5     4  1.00    -0.86
gender*    2 939 1.36 0.78   1.00    1.19 0.00   1   4     3  1.80     1.57
rse        3 939 2.65 0.71   2.70    2.67 0.74   1   4     3 -0.20    -0.75
pss        4 939 2.93 0.96   3.00    2.91 1.11   1   5     4  0.11    -0.80
phq        5 939 2.06 0.86   1.89    1.98 0.99   1   4     3  0.65    -0.65
edeq12     6 939 1.88 0.72   1.75    1.81 0.74   1   4     3  0.69    -0.50
          se
age*    0.05
gender* 0.03
rse     0.02
pss     0.03
phq     0.03
edeq12  0.02
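
One way to review the output above is to flag any variable whose skew or kurtosis falls outside the -2 to +2 range used in the write-up templates. The sketch below copies the skew and kurtosis values for the four continuous variables from the describe() output; age and gender are left out because the asterisk marks them as categorical. The names stats and cutoff are illustrative only.

```r
# Skew and kurtosis copied from the describe() output for the continuous variables
stats <- data.frame(
  variable = c("rse", "pss", "phq", "edeq12"),
  skew     = c(-0.20, 0.11, 0.65, 0.69),
  kurtosis = c(-0.75, -0.80, -0.65, -0.50)
)

# Cutoff taken from the write-up template (accepted range of -2 to +2)
cutoff <- 2

# TRUE when either statistic falls outside the accepted range
stats$flagged <- abs(stats$skew) > cutoff | abs(stats$kurtosis) > cutoff

print(stats)
```

With the live data, the same comparison could be applied directly to the skew and kurtosis columns of the object returned by psych::describe(d).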

Bivariate Plots

Crosstabs

expss::cro(d$age, d$gender)
                       d$gender
 d$age                 female   I use another term   male   Prefer not to say
 1 under 18               479                   19    112                   9
 2 between 18 and 25       50                    1      4
 3 between 26 and 35        6
 4 between 36 and 45       76                    1      7
 5 over 45                153                          20                   2
 #Total cases             764                   21    143                  11
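
As a consistency check, the crosstab can be rebuilt as a matrix and compared with the univariate tables above. In this sketch the empty cells are entered as 0, and the column of each count is inferred from the row totals in table(d$age) and the column totals in table(d$gender). The object name counts is illustrative.

```r
# Cell counts from the crosstab; empty cells are entered as 0
counts <- matrix(
  c(479, 19, 112, 9,
     50,  1,   4, 0,
      6,  0,   0, 0,
     76,  1,   7, 0,
    153,  0,  20, 2),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    age = c("1 under 18", "2 between 18 and 25", "3 between 26 and 35",
            "4 between 36 and 45", "5 over 45"),
    gender = c("female", "I use another term", "male", "Prefer not to say")
  )
)

# Counts with row and column totals added
addmargins(counts)

# Row percentages: gender breakdown within each age group
round(prop.table(counts, margin = 1) * 100, 1)
```

The row totals should match table(d$age) and the column totals should match table(d$gender), with 939 cases overall.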

Scatterplots

plot(d$rse, d$pss,
     main="Scatterplot of Self-Esteem and Stress",
     xlab = "Self-esteem",
     ylab = "Stress")

plot(d$rse, d$phq,
     main="Scatterplot of Self-Esteem and Patient Health Questionnaire",
     xlab = "Self-esteem",
     ylab = "Patient Health Questionnaire")

plot(d$rse, d$edeq12,
     main="Scatterplot of Self-Esteem and Eating Disorder Examination Questionnaire",
     xlab = "Self-esteem",
     ylab = "Eating Disorder Examination Questionnaire")

plot(d$pss, d$phq,
     main="Scatterplot of Stress and Patient Health Questionnaire",
     xlab = "Stress",
     ylab = "Patient Health Questionnaire")

plot(d$pss, d$edeq12,
     main="Scatterplot of Stress and Eating Disorder Examination Questionnaire",
     xlab = "Stress",
     ylab = "Eating Disorder Examination Questionnaire")

plot(d$edeq12, d$pss,
     main="Scatterplot of Eating Disorder Examination Questionnaire and Stress",
     xlab = "Eating Disorder Examination Questionnaire",
     ylab = "Stress")
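
With four continuous variables there are six distinct pairs. A minimal sketch for listing them, so that no pair is missed or repeated, is shown below. The function name plot_all_pairs is hypothetical, and it uses the raw variable names as axis labels rather than the full scale names.

```r
# The four continuous variables in the data set
vars <- c("rse", "pss", "phq", "edeq12")

# Every distinct pair of variables: a 2-row matrix with one column per pair
pairs_to_plot <- combn(vars, 2)
print(pairs_to_plot)

# Hypothetical helper: draw one scatterplot per column of pair_matrix
plot_all_pairs <- function(data, pair_matrix) {
  for (i in seq_len(ncol(pair_matrix))) {
    x <- pair_matrix[1, i]
    y <- pair_matrix[2, i]
    plot(data[[x]], data[[y]],
         main = paste("Scatterplot of", x, "and", y),
         xlab = x,
         ylab = y)
  }
}

# With the lab data loaded as d:
# plot_all_pairs(d, pairs_to_plot)
```

Base R's pairs(d[, vars]) is an alternative that draws all of the pairwise scatterplots in a single grid.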

Boxplots

boxplot(pss ~ age, data = d,
        main = "Boxplot of Stress by Age",
        xlab = "Age",
        ylab = "Stress")

boxplot(pss ~ gender, data = d,
        main = "Boxplot of Stress by Gender",
        xlab = "Gender",
        ylab = "Stress")

Write-Up

You need to create a write-up reviewing the most important things you did here, suitable for inclusion in a manuscript. Make sure you include your review of skewness and kurtosis. Two potential templates are given below; follow whichever fits your results, delete the other text in this section, and include only your write-up.

If skew and kurtosis are good: We reviewed plots and descriptive statistics for our six chosen variables. All four of our continuous variables had skew and kurtosis within the accepted range (-2 to +2).

If skew and kurtosis have issues: We reviewed plots and descriptive statistics for our six chosen variables. [Placeholder] variables had issues with skew and/or kurtosis: worry scores were negatively skewed (-3.15) and self-esteem scores were kurtotic (2.50). The other [placeholder] variables had skew and kurtosis within the accepted range (-2 to +2).