Basic Statistics Lab

Load Libraries

# remember, you might need to install packages

library(psych) # for the describe() command
library(expss) # for the cross_cases() command
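
If one of the library() calls above fails because a package is not installed, a check along these lines can help. This is only a sketch: missing_pkgs() is a made-up helper name (it is not part of any package), and the install line is left commented out so nothing gets installed by accident.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_pkgs(c("psych", "expss"))
print(to_install)  # character(0) means both packages are already available

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```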

Load Data

d <- read.csv(file="Data/mydata.csv", header=TRUE)
names(d)
[1] "gender"    "age"       "big5_open" "big5_ext"  "big5_agr"  "pswq"     

Univariate Plots: Histograms & Tables

table(d$gender)  # Update for hw!!

            female I use another term               male  Prefer not to say 
              1021                 28                198                 17 
table(d$age)

         1 under 18 2 between 18 and 25 3 between 26 and 35 4 between 36 and 45 
                832                  75                  12                 119 
          5 over 45 
                226 
hist(d$big5_open)

hist(d$big5_ext)

hist(d$big5_agr)

hist(d$pswq)

Univariate Normality

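Check skew and kurtosis. The cutoffs are -2 and +2; if the skew or kurtosis of any variable falls outside this range (below -2 or above +2), I need to mention it in my write-up! Variables marked with an asterisk in the describe() output (gender*, age*) are categorical, so their skew and kurtosis values are not meaningful.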

describe(d)
          vars    n  mean   sd median trimmed  mad   min  max range  skew
gender*      1 1264  1.38 0.79   1.00    1.20 0.00  1.00 4.00  3.00  1.76
age*         2 1264  2.08 1.63   1.00    1.85 0.00  1.00 5.00  4.00  1.00
big5_open    3 1264  5.21 1.13   5.33    5.29 0.99  1.00 7.00  6.00 -0.73
big5_ext     4 1264  4.37 1.45   4.33    4.41 1.48  1.00 7.00  6.00 -0.24
big5_agr     5 1264  4.99 1.13   5.00    5.04 0.99  1.00 7.00  6.00 -0.44
pswq         6 1264 -0.02 1.00   0.01   -0.02 1.18 -2.25 2.38  4.63 -0.08
          kurtosis   se
gender*       1.42 0.02
age*         -0.83 0.05
big5_open     0.48 0.03
big5_ext     -0.78 0.04
big5_agr      0.03 0.03
pswq         -0.92 0.03
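
The cutoff check described above can also be done in code rather than by eye. The sketch below is an illustration under assumptions, not part of the lab: skew_stat(), kurt_stat(), and flag_nonnormal() are hypothetical helper names, the toy data are simulated, and the moment-based formulas are meant to approximate what describe() reports by default.

```r
# Hypothetical helpers: moment-based skew and excess kurtosis,
# using the sample sd (n - 1 denominator).
skew_stat <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / sd(x)^3
}
kurt_stat <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^4) / sd(x)^4 - 3
}

# Flag every numeric column whose skew or kurtosis is outside +/- cutoff.
flag_nonnormal <- function(df, cutoff = 2) {
  num <- df[sapply(df, is.numeric)]  # drop categorical columns
  out <- data.frame(
    variable = names(num),
    skew     = sapply(num, skew_stat),
    kurtosis = sapply(num, kurt_stat),
    row.names = NULL
  )
  out$flag <- abs(out$skew) > cutoff | abs(out$kurtosis) > cutoff
  out
}

# Toy data (simulated for illustration; NOT the lab data set)
set.seed(1)
toy <- data.frame(
  symmetric = rnorm(500),
  skewed    = rexp(500)^2,
  group     = sample(c("a", "b"), 500, replace = TRUE)
)
res <- flag_nonnormal(toy)
print(res)
```

With the real data, the same idea can be applied directly to the describe() result, for example by saving it as desc <- describe(d) and looking at the rows where abs(desc$skew) > 2 or abs(desc$kurtosis) > 2.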

Bivariate Plots

Crosstabs

cross_cases(d, gender, age)
                        age
 gender                 1 under 18   2 between 18 and 25   3 between 26 and 35   4 between 36 and 45   5 over 45
   I use another term           24                     3                                           1
   Prefer not to say            14                                                                 1           2
   female                      641                    66                    10                   108         196
   male                        153                     6                     2                     9          28
   #Total cases                832                    75                    12                   119         226
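
To see where the "#Total cases" row comes from, the same kind of table can be built with base R. This is a rough sketch on made-up toy data (not the lab data set), using table() and addmargins() instead of cross_cases(). One difference to keep in mind: base R prints 0 for empty combinations, whereas cells left blank in the cross_cases() output presumably correspond to zero counts.

```r
# Toy data (invented for illustration; NOT the lab data set)
toy <- data.frame(
  gender = c("female", "female", "male", "female", "male"),
  age    = c("under 18", "over 45", "under 18", "under 18", "under 18")
)

# Cross-tabulate gender (rows) by age (columns)
tab <- table(toy$gender, toy$age)

# Add a "Sum" row holding the column totals
with_totals <- addmargins(tab, margin = 1)
print(with_totals)
```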

Scatterplots

plot(d$big5_open, d$big5_ext,
     main="Scatterplot of Openness and Extraversion",
     xlab = "Openness",
     ylab = "Extraversion")

plot(d$big5_open, d$big5_agr,
     main="Scatterplot of Openness and Agreeableness",
     xlab = "Openness",
     ylab = "Agreeableness")

plot(d$big5_open, d$pswq,
     main="Scatterplot of Openness and Worry",
     xlab = "Openness",
     ylab = "Worry")

plot(d$big5_ext, d$big5_agr,
     main="Scatterplot of Extraversion and Agreeableness",
     xlab = "Extraversion",
     ylab = "Agreeableness")

plot(d$big5_ext, d$pswq,
     main="Scatterplot of Extraversion and Worry",
     xlab = "Extraversion",
     ylab = "Worry")

plot(d$big5_agr, d$pswq,
     main="Scatterplot of Agreeableness and Worry",
     xlab = "Agreeableness",
     ylab = "Worry")

Boxplots

boxplot(data=d, big5_open~gender,
        main="Boxplot of Openness and Gender Identification",
        xlab = "Gender Identification",
        ylab = "Openness")

boxplot(data=d, big5_open~age,
        main="Boxplot of Openness and Age",
        xlab = "Age",
        ylab = "Openness")

Write-Up

Once again, you need to create a write-up reviewing the most important things you did here, and it should be suitable for inclusion in a manuscript. Make sure you include your review of skewness and kurtosis. I have given you two potential templates you can follow below, depending upon your needs; delete the other text in this section and include only your write-up.

If skew and kurtosis are good: We reviewed plots and descriptive statistics for our six chosen variables. All four of our continuous variables had skew and kurtosis within the accepted range (-2/+2).