Replace “Your Name” with your actual name.

Instructions

Please complete this exam on your own. Include your R code, interpretations, and answers within this document.

Part 1: Types of Data and Measurement Errors

Question 1: Data Types in Psychological Research

Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:

  1. Describe the key differences between nominal, ordinal, interval, and ratio data. Provide one example of each from psychological research.

Nominal data are categories with no inherent order or numeric value, such as diagnostic group. Ordinal data are categories with a meaningful order, but the intervals between categories are not necessarily equal, such as a ranking of symptom severity (mild, moderate, severe). Interval data are numeric with equal intervals between values but no true zero point, such as IQ scores. Ratio data are numeric with equal intervals and a true zero point, which makes ratios meaningful, such as reaction time in milliseconds.

  1. For each of the following variables, identify the appropriate level of measurement (nominal, ordinal, interval, or ratio) and explain your reasoning:
    • Scores on a depression inventory (0-63)
    • Response time in milliseconds
    • Likert scale ratings of agreement (1-7)
    • Diagnostic categories (e.g., ADHD, anxiety disorder, no diagnosis)
    • Age in years

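  • Scores on a depression inventory (0-63): ordinal. Higher scores indicate more severe symptoms, but equal score differences do not necessarily reflect equal differences in depression (in practice such sum scores are often treated as interval).
  • Response time in milliseconds: ratio. The intervals are equal and 0 ms is a true zero, so 400 ms is twice as long as 200 ms.
  • Likert scale ratings of agreement (1-7): ordinal. The response options are ordered, but the distance between adjacent options is not guaranteed to be equal.
  • Diagnostic categories (e.g., ADHD, anxiety disorder, no diagnosis): nominal. The categories are labels with no inherent order.
  • Age in years: ratio. The intervals are equal and zero is a true zero point.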
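As a rough illustration of how these levels of measurement can be represented in R, the sketch below uses unordered factors for nominal data, ordered factors for ordinal data, and numeric vectors for interval and ratio data. All values and variable names are made up for the example and are not taken from the exam dataset.

```r
# Hypothetical example values; none of these come from the exam dataset.
diagnosis   <- factor(c("ADHD", "Anxiety disorder", "No diagnosis", "ADHD"))  # nominal
agreement   <- factor(c(2, 7, 4, 5), levels = 1:7, ordered = TRUE)            # ordinal (Likert 1-7)
iq_score    <- c(98, 112, 105, 87)    # conventionally treated as interval (no true zero)
response_ms <- c(312, 287, 455, 390)  # ratio (0 ms is a true zero)

# Nominal data support counting categories, but not ordering them
table(diagnosis)

# Ordinal data support order comparisons and medians
agreement[1] < agreement[2]
median(as.integer(as.character(agreement)))

# Interval data support differences; ratio data also support ratios
iq_score[2] - iq_score[1]
response_ms[3] / response_ms[1]
```

Storing Likert ratings as an ordered factor is one way to keep R from silently averaging them as if they were interval data.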

Question 2: Measurement Error

Referring to Chapter 3 (Measurement Errors in Psychological Research):

  1. Explain the difference between random and systematic error, providing an example of each in the context of a memory experiment.

Random error is unpredictable variation that differs from one measurement to the next. It reduces the precision of the data, but because it tends to average out across participants it usually does not invalidate a study. For example, in a memory experiment, an unexpected distraction such as someone tapping a finger or a noise outside could affect one participant's recall. Systematic error, by contrast, pushes every measurement in the same direction, so it biases the results and can seriously damage the validity of a study. Unlike random error, it is predictable and often avoidable. In the same memory experiment, a broken air conditioner in the testing room could produce a constant, distracting buzz heard by all participants. More careful preparation of the room could have prevented this, which is not true of a random noise outside.
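The contrast between the two kinds of error can be sketched with a small simulation. The true recall score, the noise level, and the size of the bias below are hypothetical values chosen only for illustration.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility

n_trials <- 10000
true_recall <- 12  # hypothetical true number of words recalled

# Random error: unpredictable noise centred on zero
random_only <- true_recall + rnorm(n_trials, mean = 0, sd = 2)

# Systematic error: the same kind of noise plus a constant bias
# (e.g. a persistent distraction that costs about 3 words)
bias <- -3
with_bias <- true_recall + bias + rnorm(n_trials, mean = 0, sd = 2)

mean(random_only)  # close to the true score: random error averages out
mean(with_bias)    # shifted by roughly the bias: averaging does not remove it
```

Collecting more observations shrinks the effect of random error on the mean, but it leaves the systematic shift untouched.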

  1. How might measurement error affect the validity of a study examining the relationship between stress and academic performance? What steps could researchers take to minimize these errors?

Measurement error can distort the data and lead to incorrect conclusions about the relationship between stress and academic performance. Random error is hard to avoid in this study: for example, some individuals' performance suffers under stress while others experience heightened focus, and there is little researchers can do about this. Systematic error is more preventable. For instance, researchers could screen participants for their current stress levels beforehand so that people who already have very high stress do not skew the data.
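One way to see how random measurement error threatens validity here is that it tends to weaken (attenuate) the observed correlation between two variables. The simulation below is a minimal sketch under assumed values: a true correlation of about -0.5 between stress and performance, and measurement noise as large as the true-score variation.

```r
set.seed(42)  # arbitrary seed, used only for reproducibility
n <- 5000

# Hypothetical true scores: higher stress goes with lower performance
true_stress <- rnorm(n)
true_performance <- -0.5 * true_stress + rnorm(n, sd = sqrt(1 - 0.5^2))

# Observed scores = true scores + random measurement error
observed_stress <- true_stress + rnorm(n, sd = 1)
observed_performance <- true_performance + rnorm(n, sd = 1)

r_true <- cor(true_stress, true_performance)
r_observed <- cor(observed_stress, observed_performance)
round(c(true = r_true, observed = r_observed), 2)
```

The observed correlation is noticeably closer to zero than the true one, which is why more reliable measures of stress and performance make the relationship easier to detect.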


Part 2: Descriptive Statistics and Basic Probability

Question 3: Descriptive Analysis

The code below creates a simulated dataset for a psychological experiment. Run the code chunk below without making any changes:

# Create a simulated dataset
set.seed(123)  # For reproducibility

# Number of participants
n <- 50

# Create the data frame
data <- data.frame(
  participant_id = 1:n,
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy = rnorm(n, mean = 85, sd = 10),
  gender = sample(c("Male", "Female"), n, replace = TRUE),
  condition = sample(c("Control", "Experimental"), n, replace = TRUE),
  anxiety_pre = rnorm(n, mean = 25, sd = 8),
  anxiety_post = NA  # We'll fill this in based on condition
)

# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
  data$condition == "Experimental",
  data$anxiety_pre - rnorm(n, mean = 8, sd = 3),  # Larger reduction
  data$anxiety_pre - rnorm(n, mean = 3, sd = 2)   # Smaller reduction
)

# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)

# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA

# View the first few rows of the dataset
head(data)
##   participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1              1      271.9762  87.53319 Female      Control    31.30191
## 2              2      288.4911  84.71453 Female Experimental    31.15234
## 3              3      377.9354  84.57130 Female Experimental    27.65762
## 4              4      303.5254  98.68602   Male      Control    16.93299
## 5              5      306.4644  82.74229 Female      Control    24.04438
## 6              6      385.7532 100.16471 Female      Control    22.75684
##   anxiety_post
## 1     29.05312
## 2     19.21510
## 3     20.45306
## 4     13.75199
## 5     17.84736
## 6     19.93397

Now, perform the following computations:

  1. Calculate the mean, median, standard deviation, minimum, and maximum for reaction time and accuracy, grouped by condition (hint: use the psych package).
# Your code here
library(dplyr)  # provides %>%, group_by(), and summarize()
library(psych)  # provides describeBy()

# Use bare column names inside summarize() so that each statistic is computed
# within its condition; data$reaction_time would ignore the grouping and
# return the same overall value for both groups
summary_stats <- data %>%
  group_by(condition) %>%
  summarize(
    mean_reaction_time = mean(reaction_time, na.rm = TRUE),
    median_reaction_time = median(reaction_time, na.rm = TRUE),
    sd_reaction_time = sd(reaction_time, na.rm = TRUE),
    min_reaction_time = min(reaction_time, na.rm = TRUE),
    max_reaction_time = max(reaction_time, na.rm = TRUE),

    mean_accuracy = mean(accuracy, na.rm = TRUE),
    median_accuracy = median(accuracy, na.rm = TRUE),
    sd_accuracy = sd(accuracy, na.rm = TRUE),
    min_accuracy = min(accuracy, na.rm = TRUE),
    max_accuracy = max(accuracy, na.rm = TRUE)
  )

print(summary_stats)
describeBy(data$reaction_time, group = data$condition)
## 
##  Descriptive statistics by group 
## group: Control
##    vars  n  mean    sd median trimmed   mad    min    max  range skew kurtosis
## X1    1 30 301.4 48.54 299.68  300.42 55.38 201.67 408.45 206.78 0.14    -0.66
##      se
## X1 8.86
## ------------------------------------------------------------ 
## group: Experimental
##    vars  n   mean    sd median trimmed   mad    min    max  range skew kurtosis
## X1    1 17 295.75 38.37 288.49  295.61 43.74 215.67 377.94 162.27    0    -0.27
##      se
## X1 9.31
describeBy(data$accuracy, group = data$condition)
  1. Using dplyr and piping, create a new variable anxiety_change that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.
# Your code here

mean_anxiety_change <- data %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post) %>%  # pre minus post
  group_by(condition) %>%
  summarise(mean_anxiety_change = mean(anxiety_change, na.rm = TRUE))

print(mean_anxiety_change)
## # A tibble: 2 × 2
##   condition    mean_anxiety_change
##   <chr>                      <dbl>
## 1 Control                     3.79
## 2 Experimental                8.64

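Mean anxiety change for the Control condition = 3.794972
Mean anxiety change for the Experimental condition = 8.642833
Anxiety therefore dropped more, on average, in the Experimental condition than in the Control condition.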
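The logic of the grouped mean can be cross-checked in base R with tapply(). The small data frame below is hypothetical and is used only so that the result can be verified by hand.

```r
# Hypothetical toy data, not the exam dataset
toy <- data.frame(
  condition    = c("Control", "Control", "Experimental", "Experimental"),
  anxiety_pre  = c(20, 30, 25, 35),
  anxiety_post = c(18, 26, 15, 29)
)

# Pre minus post, so positive values mean anxiety went down
toy$anxiety_change <- toy$anxiety_pre - toy$anxiety_post

# Mean change within each condition
change_by_condition <- tapply(toy$anxiety_change, toy$condition, mean)
change_by_condition
```

Here the Control changes are 2 and 4 (mean 3) and the Experimental changes are 10 and 6 (mean 8), which matches what the grouped dplyr pipeline would return for the same data.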

Question 4: Probability Calculations

Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):

  1. If reaction times in a cognitive task are normally distributed with a mean of 350ms and a standard deviation of 75ms:
    1. What is the probability that a randomly selected participant will have a reaction time greater than 450ms?
    2. What is the probability that a participant will have a reaction time between 300ms and 400ms?
# Your code here
prob_greater_than_450 <- 1 - pnorm(450, mean = 350, sd = 75)
print(prob_greater_than_450)
## [1] 0.09121122
prob_between_300_and_400 <- pnorm(400, mean = 350, sd=75) - pnorm(300, mean =350, sd =75)
print(prob_between_300_and_400)
## [1] 0.4950149

a. P(reaction time > 450 ms) = 0.09121122, or about 9.1%.
b. P(300 ms < reaction time < 400 ms) = 0.4950149, or about 49.5%.
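The same probabilities can be reached by first converting each cutoff to a z-score, z = (x - mean) / sd, and then using the standard normal distribution. The sketch below shows that route; the variable names are illustrative only.

```r
mu <- 350    # mean reaction time in ms
sigma <- 75  # standard deviation in ms

# Part a: P(X > 450), via the z-score of 450
z_450 <- (450 - mu) / sigma
p_above_450 <- pnorm(z_450, lower.tail = FALSE)

# Part b: P(300 < X < 400), via the z-scores of both bounds
z_300 <- (300 - mu) / sigma
z_400 <- (400 - mu) / sigma
p_between <- pnorm(z_400) - pnorm(z_300)

round(c(z_450 = z_450, p_above_450 = p_above_450, p_between = p_between), 4)
```

Using lower.tail = FALSE is equivalent to computing 1 - pnorm(...), and the z-scores (about 1.33 for part a and about -0.67 and 0.67 for part b) make it easy to check the answers against a standard normal table.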


Part 3: Data Cleaning and Manipulation

Question 5: Data Cleaning with dplyr

Using the dataset created in Part 2, perform the following data cleaning and manipulation tasks:

  1. Remove all rows with missing values and create a new dataset called clean_data.
clean_data <- na.omit(data)

print(clean_data)
##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post
## 1     29.053117
## 2     19.215099
## 3     20.453056
## 4     13.751994
## 5     17.847362
## 6     19.933968
## 7     24.342317
## 8     17.758982
## 10    22.069157
## 12     7.875522
## 13     3.221330
## 14    45.327922
## 15    16.642661
## 17    23.416047
## 18    21.642810
## 19    26.912456
## 20    24.773302
## 21    18.586930
## 22    20.597288
## 23    20.358843
## 24    31.904850
## 25    14.370025
## 26     8.052780
## 27    21.952702
## 28    24.334744
## 29    24.635854
## 31     2.627509
## 32    27.376440
## 33    18.430744
## 34    15.607200
## 35    19.873474
## 36    19.373641
## 37    26.428138
## 38    16.420951
## 39    28.470531
## 40    15.350273
## 41    21.378795
## 42    17.294151
## 44    15.992029
## 45     7.508622
## 46    27.270622
## 47    22.108595
## 48    11.069351
## 49    17.068705
## 50    10.016330
  1. Create a new variable performance_category that categorizes participants based on their accuracy:
    • “High” if accuracy is greater than or equal to 90
    • “Medium” if accuracy is between 70 and 90
    • “Low” if accuracy is less than 70
# Your code here

clean_data <- clean_data %>%
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",
    accuracy >= 70 & accuracy < 90 ~ "Medium",
    accuracy < 70 ~ "Low"
  ))

print(clean_data)
##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post performance_category
## 1     29.053117               Medium
## 2     19.215099               Medium
## 3     20.453056               Medium
## 4     13.751994                 High
## 5     17.847362               Medium
## 6     19.933968                 High
## 7     24.342317                  Low
## 8     17.758982                 High
## 10    22.069157               Medium
## 12     7.875522               Medium
## 13     3.221330               Medium
## 14    45.327922               Medium
## 15    16.642661               Medium
## 17    23.416047               Medium
## 18    21.642810               Medium
## 19    26.912456                 High
## 20    24.773302                 High
## 21    18.586930               Medium
## 22    20.597288                  Low
## 23    20.358843                 High
## 24    31.904850               Medium
## 25    14.370025               Medium
## 26     8.052780                 High
## 27    21.952702               Medium
## 28    24.334744               Medium
## 29    24.635854               Medium
## 31     2.627509               Medium
## 32    27.376440               Medium
## 33    18.430744               Medium
## 34    15.607200                 High
## 35    19.873474               Medium
## 36    19.373641               Medium
## 37    26.428138                 High
## 38    16.420951               Medium
## 39    28.470531               Medium
## 40    15.350273                 High
## 41    21.378795                 High
## 42    17.294151                 High
## 44    15.992029               Medium
## 45     7.508622                 High
## 46    27.270622               Medium
## 47    22.108595                 High
## 48    11.069351                 High
## 49    17.068705               Medium
## 50    10.016330               Medium
  1. Filter the dataset to include only participants in the Experimental condition with reaction times faster than the overall mean reaction time.
# Your code here
overall_mean_reaction_time <- mean(clean_data$reaction_time, na.rm = TRUE)

filtered_data <- clean_data %>%
  filter(condition == "Experimental" & reaction_time < overall_mean_reaction_time)

print(filtered_data)
##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               2      288.4911  84.71453 Female Experimental    31.15234
## 2              15      272.2079  74.28209 Female Experimental    21.66514
## 3              24      263.5554  77.90799   Male Experimental    42.02762
## 4              26      215.6653  95.25571 Female Experimental    16.23203
## 5              32      285.2464  88.85280 Female Experimental    35.10548
## 6              38      296.9044  89.35181 Female Experimental    25.67790
## 7              40      280.9764  96.48808   Male Experimental    21.00566
## 8              46      243.8446  78.99740   Male Experimental    40.97771
## 9              47      279.8558 106.87333   Male Experimental    29.80567
## 10             48      276.6672 100.32611 Female Experimental    14.98983
##    anxiety_post performance_category
## 1      19.21510               Medium
## 2      16.64266               Medium
## 3      31.90485               Medium
## 4       8.05278                 High
## 5      27.37644               Medium
## 6      16.42095               Medium
## 7      15.35027                 High
## 8      27.27062               Medium
## 9      22.10860                 High
## 10     11.06935                 High

Write your answer(s) here describing your data cleaning process. First, I used the na.omit() function to remove rows with missing values from the original dataset. I then created the performance_category variable with case_when(), supplying the conditions for the High, Medium, and Low categories. Next, I calculated the overall mean reaction time of the cleaned data. Finally, I used filter() to keep only participants in the Experimental condition whose reaction times were below (that is, faster than) the overall mean reaction time.
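The cleaning steps described above can be sketched on a small toy dataset. This is only an illustration: the values in toy_data and the accuracy cutoffs of 90 and 70 are assumptions made for the example, not the exam's data or the thresholds actually used.

```r
library(dplyr)

# Hypothetical toy data; values are illustrative only
toy_data <- data.frame(
  participant_id = 1:6,
  reaction_time  = c(250, 310, NA, 280, 400, 260),
  accuracy       = c(95, 80, 88, NA, 65, 91),
  condition      = c("Experimental", "Control", "Experimental",
                     "Control", "Experimental", "Experimental")
)

# Step 1: drop every row that contains a missing value
toy_clean <- na.omit(toy_data)

# Step 2: label performance (cutoffs of 90 and 70 are assumed for this sketch)
toy_clean <- toy_clean %>%
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",
    accuracy >= 70 ~ "Medium",
    TRUE           ~ "Low"
  ))

# Step 3: overall mean reaction time of the cleaned data
toy_mean_rt <- mean(toy_clean$reaction_time)

# Step 4: keep Experimental participants who were faster than the mean
# ("faster" means a smaller reaction time, hence the < comparison)
toy_filtered <- toy_clean %>%
  filter(condition == "Experimental", reaction_time < toy_mean_rt)

print(toy_filtered)
```

Because a faster response is a smaller number of milliseconds, the filter uses `<` rather than `>`; checking that every remaining reaction time is below the mean is a quick way to confirm the direction.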


Part 4: Visualization and Correlation Analysis

Question 6: Correlation Analysis with the psych Package

Using the psych package, create a correlation plot for the simulated dataset created in Part 2. Include the following steps:

  1. Select the numeric variables from the dataset (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
  2. Use the psych package’s corPlot() function to create a correlation plot.
  3. Interpret the resulting plot by addressing:
    • Which variables appear to be strongly correlated?
    • Are there any surprising relationships?
    • How might these correlations inform further research in psychology?
# Your code here. Hint: first, with dplyr create a new dataset that selects only the numeric variables (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).

numeric_data <- data %>%
  select(reaction_time, accuracy, anxiety_pre, anxiety_post)

print(numeric_data)
##    reaction_time  accuracy anxiety_pre anxiety_post
## 1       271.9762  87.53319    31.30191    29.053117
## 2       288.4911  84.71453    31.15234    19.215099
## 3       377.9354  84.57130    27.65762    20.453056
## 4       303.5254  98.68602    16.93299    13.751994
## 5       306.4644  82.74229    24.04438    17.847362
## 6       385.7532 100.16471    22.75684    19.933968
## 7       323.0458  69.51247    29.50392    24.342317
## 8       236.7469  90.84614    22.02049    17.758982
## 9             NA  86.23854    32.81579    19.863065
## 10      277.7169  87.15942    22.00335    22.069157
## 11            NA  88.79639    33.42169    25.063956
## 12      317.9907  79.97677    16.60658     7.875522
## 13      320.0386  81.66793    14.91876     3.221330
## 14      305.5341  74.81425    50.92832    45.327922
## 15      272.2079  74.28209    21.66514    16.642661
## 16            NA  88.03529    27.38582    21.290659
## 17      324.8925  89.48210    30.09256    23.416047
## 18      201.6691  85.53004    21.12975    21.642810
## 19      335.0678  94.22267    29.13490    26.912456
## 20      276.3604 105.50085    27.95172    24.773302
## 21      246.6088  80.08969    23.27696    18.586930
## 22      289.1013  61.90831    25.52234    20.597288
## 23      248.6998  95.05739    24.72746    20.358843
## 24      263.5554  77.90799    42.02762    31.904850
## 25      268.7480  78.11991    19.06931    14.370025
## 26      215.6653  95.25571    16.23203     8.052780
## 27      341.8894  82.15227    25.30231    21.952702
## 28      307.6687  72.79282    27.48385    24.334744
## 29      243.0932  86.81303    28.49219    24.635854
## 30      362.6907        NA    21.33308    18.283727
## 31      321.3232  85.05764    16.49339     2.627509
## 32      285.2464  88.85280    35.10548    27.376440
## 33      344.7563  81.29340    22.20280    18.430744
## 34      343.9067  91.44377    18.07590    15.607200
## 35      341.0791  82.79513    23.10976    19.873474
## 36      334.4320  88.31782    23.42259    19.373641
## 37      327.6959  95.96839    33.87936    26.428138
## 38      296.9044  89.35181    25.67790    16.420951
## 39      284.7019  81.74068    31.03243    28.470531
## 40      280.9764  96.48808    21.00566    15.350273
## 41      265.2647  94.93504    26.71556    21.378795
## 42      289.6041  90.48397    22.40251    17.294151
## 43      236.7302        NA    25.75667    20.466142
## 44      408.4478  78.72094    17.83709    15.992029
## 45      360.3981  98.60652    14.51359     7.508622
## 46      243.8446  78.99740    40.97771    27.270622
## 47      279.8558 106.87333    29.80567    22.108595
## 48      276.6672 100.32611    14.98983    11.069351
## 49      338.9983  82.64300    20.11067    17.068705
## 50      295.8315  74.73579    15.51616    10.016330
corPlot(numeric_data)
## Error in plot.new(): figure margins too large
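
The "figure margins too large" error generally means the plotting device was too small for the plot, not that the data are wrong. Enlarging the plot pane, or setting the knitr chunk options fig.width and fig.height, are common workarounds, though the cause in this run is an assumption. Because numeric_data contains NA values, it can also help to compute the correlation matrix explicitly and inspect the numbers even when the plot fails. The sketch below does this for the first ten rows printed above; numeric_subset and cor_matrix are names introduced here for illustration.

```r
# Subset of the numeric_data rows printed above (rows 1 to 10)
numeric_subset <- data.frame(
  reaction_time = c(271.9762, 288.4911, 377.9354, 303.5254, 306.4644,
                    385.7532, 323.0458, 236.7469, NA, 277.7169),
  accuracy      = c(87.53319, 84.71453, 84.57130, 98.68602, 82.74229,
                    100.16471, 69.51247, 90.84614, 86.23854, 87.15942),
  anxiety_pre   = c(31.30191, 31.15234, 27.65762, 16.93299, 24.04438,
                    22.75684, 29.50392, 22.02049, 32.81579, 22.00335),
  anxiety_post  = c(29.053117, 19.215099, 20.453056, 13.751994, 17.847362,
                    19.933968, 24.342317, 17.758982, 19.863065, 22.069157)
)

# Pairwise deletion uses every available pair of values, so one NA
# does not remove the whole row from the other correlations
cor_matrix <- cor(numeric_subset, use = "pairwise.complete.obs")
print(round(cor_matrix, 2))

# Plot only if psych is installed; corPlot() accepts a correlation matrix
if (requireNamespace("psych", quietly = TRUE)) {
  psych::corPlot(cor_matrix)
}
```

With the full dataset, the same two calls would be applied to numeric_data instead of the ten-row subset.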

Write your answer(s) here anxiety_pre and anxiety_post are quite strongly correlated with each other. I was surprised to see that anxiety_pre and reaction_time were just as negatively correlated as anxiety_pre and accuracy, although none of the negative correlations turned out to be very strong. The relationship between anxiety_pre and anxiety_post could be examined further to see how anxiety beforehand affects educational aptitude as well as anxiety afterward.

Part 5: Reflection and Application

Question 7: Reflection

Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:

  1. Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?

  2. How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?

Write your answer(s) here 1. I am interested in how lack of sleep affects someone's academic performance. I know the two are most likely related, but I am curious how severe the effects of lack of sleep are. I would collect data on age, hours of sleep, and study time, along with other pieces of information. I would need to record how many hours each student studies as well as their grades, because some students stay up late studying, which causes their lack of sleep, while others stay up late doing activities that do not benefit academic performance. I would also like to recruit students who are at around the same level in order to limit outliers.
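
One analysis that could fit this design is a multiple regression of grades on hours of sleep while holding study time constant. The sketch below uses simulated data only; the variable names, sample size, and effect sizes are all hypothetical choices made for illustration, not results.

```r
set.seed(42)  # make the hypothetical data reproducible

# Simulated (not real) student data; all parameters are assumed
n <- 200
sleep_hours <- rnorm(n, mean = 7, sd = 1)
study_hours <- rnorm(n, mean = 10, sd = 3)
gpa <- 1.0 + 0.25 * sleep_hours + 0.05 * study_hours + rnorm(n, sd = 0.3)

sleep_study <- data.frame(sleep_hours, study_hours, gpa)

# Regress GPA on sleep while controlling for study time
sleep_model <- lm(gpa ~ sleep_hours + study_hours, data = sleep_study)
print(summary(sleep_model))
```

Including study_hours as a predictor is one way to separate students who lose sleep to studying from those who lose it to other activities.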

2. R has shown me that organizing the data is not the difficult part of conducting a study; the difficult part is keeping the number of errors as low as humanly possible. There are things you simply cannot control when conducting a psychological study, and those are the hardest to manage. I think R is definitely quite advanced in how it sorts and organizes data; however, it can be very intimidating and frustrating to use for someone who does not like coding.

Submission Instructions:

Be sure to knit your document to HTML format, checking that all content is correctly displayed before submission. Publish your assignment to RPubs and submit the URL to Canvas.