Take-Home Midterm Exam: Introductory Psychological Statistics

Instructions

Please complete this exam on your own. Include your R code, interpretations, and answers within this document.

Part 1: Types of Data and Measurement Errors

Question 1: Data Types in Psychological Research

Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:

Describe the key differences between nominal, ordinal, interval, and ratio data. Provide one example of each from psychological research.

Nominal data sorts observations into distinct groups with no inherent order, such as introverts, extroverts, and ambiverts, where no group is “greater” than another. The categories are mutually exclusive, and there is no meaningful numerical difference between them. Ordinal data places categories in a meaningful order but without consistent intervals between values. For example, a scale might use “low,” “moderate,” and “high,” and the difference between “low” and “moderate” might not equal the difference between “moderate” and “high.” Interval data is numeric data with equal intervals between values but no true zero point, so ratios cannot be meaningfully calculated. An example from psychological research is an IQ test, where the difference between 100 and 110 is the same as the difference between 110 and 120, but a score of 0 does not mean “no intelligence.” Finally, ratio data has ordered values with equal intervals and a true zero point. An example is reaction time, where zero milliseconds means no elapsed time, and ratios are meaningful: a reaction time of 400 milliseconds is twice as fast as (half as long as) a reaction time of 800 milliseconds.

For each of the following variables, identify the appropriate level of measurement (nominal, ordinal, interval, or ratio) and explain your reasoning:
- Scores on a depression inventory (0-63)
- Response time in milliseconds
- Likert scale ratings of agreement (1-7)
- Diagnostic categories (e.g., ADHD, anxiety disorder, no diagnosis)
- Age in years

Scores on a depression inventory (0-63) would be an example of interval data because there are equal intervals between the data points but no true zero: a score of zero does not indicate an absolute absence of depression, just the lowest measurable level. Response time in milliseconds is an example of ratio data, as mentioned above, because there are equal intervals, meaningful ratios, and a true zero. Likert scale ratings of agreement are an example of ordinal data because they are ordered, but the intervals are not necessarily equal: the difference between 1 and 2 may not be perceived the same as the difference between 5 and 6. Diagnostic categories are a clear example of nominal data because there is no inherent ranking among the categories. Finally, age in years is another example of ratio data because there are equal intervals between the years and there is a true zero.
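As a rough illustration of how these levels of measurement might be represented in R, the sketch below uses made-up values (none of them come from the exam dataset). It assumes nominal variables are stored as unordered factors, ordinal variables as ordered factors, and ratio variables as plain numeric vectors.

```r
# Hypothetical example values, not taken from the exam dataset
diagnosis   <- factor(c("ADHD", "anxiety disorder", "no diagnosis", "ADHD"))  # nominal
agreement   <- factor(c(2, 7, 5, 5), levels = 1:7, ordered = TRUE)            # ordinal
response_ms <- c(400, 800, 650, 520)                                          # ratio

# Nominal: only counts and equality checks are meaningful
table(diagnosis)

# Ordinal: order comparisons are allowed for ordered factors
agreement[1] < agreement[2]

# Ratio: ratios are meaningful because zero is a true zero
response_ms[2] / response_ms[1]
```

Arithmetic on a factor (for example, `mean(diagnosis)`) returns NA with a warning, which is a useful reminder that averages are not defined for nominal categories.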

Question 2: Measurement Error

Referring to Chapter 3 (Measurement Errors in Psychological Research):

Explain the difference between random and systematic error, providing an example of each in the context of a memory experiment.

Random errors are unpredictable variations in measurements that occur because of chance factors like fluctuations in attention or environmental distractions. In a memory experiment, a participant might lose focus because of a loud noise outside and fail to recall something they would otherwise have remembered. Systematic errors, on the other hand, are consistent and predictable errors in measurement that stem from the study’s design or procedure and push results in one direction. In a memory experiment, a faded picture could be less memorable, so that participants with worse eyesight are at a consistent disadvantage.
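The distinction can be sketched with a small simulation. Everything here is hypothetical: the true recall score of 12 words, the noise level, and the 3-word bias are assumptions chosen only to make the contrast visible.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

true_recall <- rep(12, 1000)  # hypothetical true score: 12 words recalled

# Random error: zero-mean noise (e.g., momentary distraction)
random_obs <- true_recall + rnorm(1000, mean = 0, sd = 2)

# Systematic error: a constant bias (e.g., faded stimuli costing about 3 words)
# on top of the same random noise
systematic_obs <- true_recall - 3 + rnorm(1000, mean = 0, sd = 2)

mean(random_obs)      # close to 12: random error tends to average out
mean(systematic_obs)  # close to 9: the bias does not average out
```

Collecting more observations shrinks the effect of random error on the mean, but it leaves the systematic shift untouched.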

How might measurement error affect the validity of a study examining the relationship between stress and academic performance? What steps could researchers take to minimize these errors?

Measurement error could affect the validity of a study examining the relationship between stress and academic performance in two ways. Random error (for example, inconsistent responding on a stress questionnaire) lowers reliability and tends to weaken the observed correlation, while systematic error (for example, a poorly worded question that biases responses in one direction) can lead to incorrect conclusions about the strength of the correlation. To minimize these errors, researchers should control for external factors, cross-validate data, reduce participant bias, and improve measurement instruments.
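A minimal simulation can show how random measurement error weakens an observed correlation. The variable names, the true correlation of about -0.5, and the error sizes are all assumptions made for illustration, not values from any real study.

```r
set.seed(2)  # arbitrary seed so the sketch is reproducible
n_sim <- 5000

# Hypothetical true scores with a correlation of about -0.5
stress <- rnorm(n_sim)
performance <- -0.5 * stress + rnorm(n_sim, sd = sqrt(0.75))

# Observed scores: true scores plus random measurement error
stress_obs <- stress + rnorm(n_sim, sd = 1)
performance_obs <- performance + rnorm(n_sim, sd = 1)

cor(stress, performance)          # correlation between true scores
cor(stress_obs, performance_obs)  # weaker correlation between noisy measures
```

The second correlation is closer to zero than the first, which is why unreliable instruments can make a real relationship look smaller than it is.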

Part 2: Descriptive Statistics and Basic Probability

Question 3: Descriptive Analysis

The code below creates a simulated dataset for a psychological experiment. Run the code chunk below without making any changes:

# Create a simulated dataset
set.seed(123)  # For reproducibility

# Number of participants
n <- 50

# Create the data frame
data <- data.frame(
  participant_id = 1:n,
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy = rnorm(n, mean = 85, sd = 10),
  gender = sample(c("Male", "Female"), n, replace = TRUE),
  condition = sample(c("Control", "Experimental"), n, replace = TRUE),
  anxiety_pre = rnorm(n, mean = 25, sd = 8),
  anxiety_post = NA  # We'll fill this in based on condition
)

# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
  data$condition == "Experimental",
  data$anxiety_pre - rnorm(n, mean = 8, sd = 3),  # Larger reduction
  data$anxiety_pre - rnorm(n, mean = 3, sd = 2)   # Smaller reduction
)

# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)

# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA

# View the first few rows of the dataset
head(data)

##   participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1              1      271.9762  87.53319 Female      Control    31.30191
## 2              2      288.4911  84.71453 Female Experimental    31.15234
## 3              3      377.9354  84.57130 Female Experimental    27.65762
## 4              4      303.5254  98.68602   Male      Control    16.93299
## 5              5      306.4644  82.74229 Female      Control    24.04438
## 6              6      385.7532 100.16471 Female      Control    22.75684
##   anxiety_post
## 1     29.05312
## 2     19.21510
## 3     20.45306
## 4     13.75199
## 5     17.84736
## 6     19.93397

Now, perform the following computations:

Calculate the mean, median, standard deviation, minimum, and maximum for reaction time and accuracy, grouped by condition (hint: use the psych package).

# Your code here

library(psych)
describe(data$accuracy)

##    vars  n mean   sd median trimmed  mad   min    max range  skew kurtosis   se
## X1    1 48 86.5 9.23  86.53   86.52 8.65 61.91 106.87 44.97 -0.05    -0.06 1.33

describe(data$reaction_time)

##    vars  n   mean    sd median trimmed   mad    min    max  range skew kurtosis
## X1    1 47 299.36 44.78 295.83  298.57 43.09 201.67 408.45 206.78 0.15    -0.36
##      se
## X1 6.53
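The describe() calls above summarize the whole sample. To get the same statistics grouped by condition, one option is psych's describeBy(). The sketch below runs on a small made-up data frame called `toy` (a hypothetical stand-in for `data`), so the numbers it prints are not the exam results.

```r
library(psych)

# Hypothetical toy data standing in for the exam's `data` frame
toy <- data.frame(
  condition = rep(c("Control", "Experimental"), each = 4),
  reaction_time = c(280, 310, NA, 295, 295, 330, 305, 290),
  accuracy = c(88, 79, 91, 86, 84, NA, 95, 90)
)

# One describe() table per condition; missing values are dropped by default
by_condition <- describeBy(
  toy[, c("reaction_time", "accuracy")],
  group = toy$condition
)
by_condition
```

With the real dataset, the same call would presumably be `describeBy(data[, c("reaction_time", "accuracy")], group = data$condition)`, and the mean, sd, median, min, and max columns of each table answer the question.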

Using dplyr and piping, create a new variable anxiety_change that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.

# Your code here
library(dplyr)   
head(data)

##   participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1              1      271.9762  87.53319 Female      Control    31.30191
## 2              2      288.4911  84.71453 Female Experimental    31.15234
## 3              3      377.9354  84.57130 Female Experimental    27.65762
## 4              4      303.5254  98.68602   Male      Control    16.93299
## 5              5      306.4644  82.74229 Female      Control    24.04438
## 6              6      385.7532 100.16471 Female      Control    22.75684
##   anxiety_post
## 1     29.05312
## 2     19.21510
## 3     20.45306
## 4     13.75199
## 5     17.84736
## 6     19.93397

data %>% 
  mutate(anxiety_change = anxiety_pre - anxiety_post)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 9               9            NA  86.23854 Female Experimental    32.81579
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 11             11            NA  88.79639 Female Experimental    33.42169
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 16             16            NA  88.03529 Female      Control    27.38582
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 30             30      362.6907        NA   Male      Control    21.33308
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 43             43      236.7302        NA   Male      Control    25.75667
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change
## 1     29.053117     2.24879426
## 2     19.215099    11.93723893
## 3     20.453056     7.20456483
## 4     13.751994     3.18099329
## 5     17.847362     6.19701754
## 6     19.933968     2.82286978
## 7     24.342317     5.16159899
## 8     17.758982     4.26150823
## 9     19.863065    12.95272240
## 10    22.069157    -0.06580401
## 11    25.063956     8.35773571
## 12     7.875522     8.73106229
## 13     3.221330    11.69742764
## 14    45.327922     5.60039736
## 15    16.642661     5.02247855
## 16    21.290659     6.09516212
## 17    23.416047     6.67651035
## 18    21.642810    -0.51305479
## 19    26.912456     2.22244027
## 20    24.773302     3.17841445
## 21    18.586930     4.69002601
## 22    20.597288     4.92505594
## 23    20.358843     4.36861886
## 24    31.904850    10.12276506
## 25    14.370025     4.69928609
## 26     8.052780     8.17924981
## 27    21.952702     3.34960540
## 28    24.334744     3.14910235
## 29    24.635854     3.85633353
## 30    18.283727     3.04934997
## 31     2.627509    13.86588190
## 32    27.376440     7.72904122
## 33    18.430744     3.77205314
## 34    15.607200     2.46869675
## 35    19.873474     3.23628902
## 36    19.373641     4.04895160
## 37    26.428138     7.45122383
## 38    16.420951     9.25694721
## 39    28.470531     2.56189924
## 40    15.350273     5.65539054
## 41    21.378795     5.33676775
## 42    17.294151     5.10836205
## 43    20.466142     5.29052622
## 44    15.992029     1.84506400
## 45     7.508622     7.00496546
## 46    27.270622    13.70708547
## 47    22.108595     7.69707534
## 48    11.069351     3.92047789
## 49    17.068705     3.04196717
## 50    10.016330     5.49982914

data <- data %>% 
   mutate(anxiety_change = anxiety_pre - anxiety_post) 

describe(data$anxiety_change)

##    vars  n mean  sd median trimmed  mad   min   max range skew kurtosis   se
## X1    1 50 5.64 3.3   5.07     5.3 2.86 -0.51 13.87 14.38 0.79     0.19 0.47

It looks like the overall mean anxiety change (anxiety_pre minus anxiety_post) is 5.64 points across all 50 participants. Note that describe() reports this mean for the whole sample rather than separately for each condition.
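To get the mean anxiety change for each condition, the mutate() step can be followed by group_by() and summarise(). This is a sketch on a made-up data frame (`toy` is hypothetical), so the printed means are not the exam results.

```r
library(dplyr)

# Hypothetical toy data standing in for the exam's `data` frame
toy <- data.frame(
  condition = c("Control", "Control", "Experimental", "Experimental"),
  anxiety_pre = c(30, 24, 31, 28),
  anxiety_post = c(27, 22, 21, 20)
)

change_by_condition <- toy %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post) %>%  # pre minus post
  group_by(condition) %>%                                  # one group per condition
  summarise(mean_change = mean(anxiety_change, na.rm = TRUE))

change_by_condition
```

Replacing `toy` with `data` should give one row per condition with its mean anxiety change.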

Question 4: Probability Calculations

Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):

If reaction times in a cognitive task are normally distributed with a mean of 350ms and a standard deviation of 75ms:
1. What is the probability that a randomly selected participant will have a reaction time greater than 450ms?
2. What is the probability that a participant will have a reaction time between 300ms and 400ms?

1 - pnorm(450, mean = 350, sd = 75)

## [1] 0.09121122

pnorm(400, mean = 350, sd = 75) - pnorm(300, mean = 350, sd = 75)

## [1] 0.4950149

The probability that a randomly selected participant will have a reaction time greater than 450ms is roughly 9.12% and the probability that a randomly selected participant will have a reaction time between 300ms and 400ms is roughly 49.50%.
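The same probabilities can be checked by converting each cutoff to a z-score and using the standard normal distribution. This sketch only restates the calculation above in standardized form; the variable names are illustrative.

```r
mu <- 350     # mean reaction time in ms
sigma <- 75   # standard deviation in ms

# z-scores for the cutoffs
z_450 <- (450 - mu) / sigma   # about 1.33
z_300 <- (300 - mu) / sigma   # about -0.67
z_400 <- (400 - mu) / sigma   # about 0.67

# P(X > 450): upper tail of the standard normal
p_above_450 <- pnorm(z_450, lower.tail = FALSE)

# P(300 < X < 400): area between the two z-scores
p_between <- pnorm(z_400) - pnorm(z_300)

p_above_450
p_between
```

Both values match the pnorm() results computed directly from the mean and standard deviation.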

Part 3: Data Cleaning and Manipulation

Question 5: Data Cleaning with dplyr

Using the data set created in Part 2, perform the following data cleaning and manipulation tasks:

Remove all rows with missing values and create a new data set called clean_data.

clean_data <- data %>% 
  na.omit() 
print(clean_data)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change
## 1     29.053117     2.24879426
## 2     19.215099    11.93723893
## 3     20.453056     7.20456483
## 4     13.751994     3.18099329
## 5     17.847362     6.19701754
## 6     19.933968     2.82286978
## 7     24.342317     5.16159899
## 8     17.758982     4.26150823
## 10    22.069157    -0.06580401
## 12     7.875522     8.73106229
## 13     3.221330    11.69742764
## 14    45.327922     5.60039736
## 15    16.642661     5.02247855
## 17    23.416047     6.67651035
## 18    21.642810    -0.51305479
## 19    26.912456     2.22244027
## 20    24.773302     3.17841445
## 21    18.586930     4.69002601
## 22    20.597288     4.92505594
## 23    20.358843     4.36861886
## 24    31.904850    10.12276506
## 25    14.370025     4.69928609
## 26     8.052780     8.17924981
## 27    21.952702     3.34960540
## 28    24.334744     3.14910235
## 29    24.635854     3.85633353
## 31     2.627509    13.86588190
## 32    27.376440     7.72904122
## 33    18.430744     3.77205314
## 34    15.607200     2.46869675
## 35    19.873474     3.23628902
## 36    19.373641     4.04895160
## 37    26.428138     7.45122383
## 38    16.420951     9.25694721
## 39    28.470531     2.56189924
## 40    15.350273     5.65539054
## 41    21.378795     5.33676775
## 42    17.294151     5.10836205
## 44    15.992029     1.84506400
## 45     7.508622     7.00496546
## 46    27.270622    13.70708547
## 47    22.108595     7.69707534
## 48    11.069351     3.92047789
## 49    17.068705     3.04196717
## 50    10.016330     5.49982914

Create a new variable performance_category that categorizes participants based on their accuracy:
- “High” if accuracy is greater than or equal to 90
- “Medium” if accuracy is between 70 and 90
- “Low” if accuracy is less than 70

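# Your code here
# Note: this pipeline starts from `data` rather than `clean_data`, so it
# overwrites clean_data and the rows with missing values reappear
# (their performance_category is NA).
clean_data <- data %>% 
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",                    # 90 and above
    accuracy >= 70 & accuracy < 90 ~ "Medium",  # 70 up to (not including) 90
    accuracy < 70 ~ "Low"                       # below 70
  ))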

print(clean_data)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 9               9            NA  86.23854 Female Experimental    32.81579
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 11             11            NA  88.79639 Female Experimental    33.42169
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 16             16            NA  88.03529 Female      Control    27.38582
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 30             30      362.6907        NA   Male      Control    21.33308
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 43             43      236.7302        NA   Male      Control    25.75667
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change performance_category
## 1     29.053117     2.24879426               Medium
## 2     19.215099    11.93723893               Medium
## 3     20.453056     7.20456483               Medium
## 4     13.751994     3.18099329                 High
## 5     17.847362     6.19701754               Medium
## 6     19.933968     2.82286978                 High
## 7     24.342317     5.16159899                  Low
## 8     17.758982     4.26150823                 High
## 9     19.863065    12.95272240               Medium
## 10    22.069157    -0.06580401               Medium
## 11    25.063956     8.35773571               Medium
## 12     7.875522     8.73106229               Medium
## 13     3.221330    11.69742764               Medium
## 14    45.327922     5.60039736               Medium
## 15    16.642661     5.02247855               Medium
## 16    21.290659     6.09516212               Medium
## 17    23.416047     6.67651035               Medium
## 18    21.642810    -0.51305479               Medium
## 19    26.912456     2.22244027                 High
## 20    24.773302     3.17841445                 High
## 21    18.586930     4.69002601               Medium
## 22    20.597288     4.92505594                  Low
## 23    20.358843     4.36861886                 High
## 24    31.904850    10.12276506               Medium
## 25    14.370025     4.69928609               Medium
## 26     8.052780     8.17924981                 High
## 27    21.952702     3.34960540               Medium
## 28    24.334744     3.14910235               Medium
## 29    24.635854     3.85633353               Medium
## 30    18.283727     3.04934997                 <NA>
## 31     2.627509    13.86588190               Medium
## 32    27.376440     7.72904122               Medium
## 33    18.430744     3.77205314               Medium
## 34    15.607200     2.46869675                 High
## 35    19.873474     3.23628902               Medium
## 36    19.373641     4.04895160               Medium
## 37    26.428138     7.45122383                 High
## 38    16.420951     9.25694721               Medium
## 39    28.470531     2.56189924               Medium
## 40    15.350273     5.65539054                 High
## 41    21.378795     5.33676775                 High
## 42    17.294151     5.10836205                 High
## 43    20.466142     5.29052622                 <NA>
## 44    15.992029     1.84506400               Medium
## 45     7.508622     7.00496546                 High
## 46    27.270622    13.70708547               Medium
## 47    22.108595     7.69707534                 High
## 48    11.069351     3.92047789                 High
## 49    17.068705     3.04196717               Medium
## 50    10.016330     5.49982914               Medium
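The output above still contains rows with missing values because the pipeline began with `data`. A version that keeps the cleaned rows would start from the data after na.omit(). The sketch below uses a made-up data frame (`toy` is hypothetical) to show the idea.

```r
library(dplyr)

# Hypothetical toy data standing in for the exam's `data` frame
toy <- data.frame(
  participant_id = 1:5,
  accuracy = c(95, 90, 82, 69.5, NA)
)

toy_clean <- toy %>%
  na.omit() %>%                       # drop rows with missing values first
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",
    accuracy >= 70 ~ "Medium",        # reached only when accuracy < 90
    TRUE ~ "Low"                      # everything else is below 70
  ))

toy_clean
```

Because case_when() checks its conditions in order, the second condition does not need to repeat `accuracy < 90`.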

Filter the data set to include only participants in the Experimental condition with reaction times faster than the overall mean reaction time.

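# Keeps Experimental participants whose reaction_time is above a
# hard-coded threshold of 311.75 ms
filtered_data <- data %>% 
  filter(reaction_time > 311.75 & condition == "Experimental")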

print(filtered_data)

##   participant_id reaction_time accuracy gender    condition anxiety_pre
## 1              3      377.9354 84.57130 Female Experimental    27.65762
## 2             12      317.9907 79.97677   Male Experimental    16.60658
## 3             13      320.0386 81.66793   Male Experimental    14.91876
## 4             17      324.8925 89.48210 Female Experimental    30.09256
## 5             31      321.3232 85.05764   Male Experimental    16.49339
## 6             36      334.4320 88.31782 Female Experimental    23.42259
## 7             37      327.6959 95.96839 Female Experimental    33.87936
##   anxiety_post anxiety_change
## 1    20.453056       7.204565
## 2     7.875522       8.731062
## 3     3.221330      11.697428
## 4    23.416047       6.676510
## 5     2.627509      13.865882
## 6    19.373641       4.048952
## 7    26.428138       7.451224

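I piped the data into filter() to keep participants in the Experimental condition whose reaction times were greater than 311.75 ms. Two caveats apply. First, the threshold is hard-coded and does not match the overall mean reaction time reported by describe() in Question 3 (299.36 ms). Second, “faster than the mean” corresponds to reaction times below the mean, so the comparison should use < rather than >.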
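One way to avoid a hard-coded threshold is to compute the overall mean first and then filter for reaction times below it. The sketch below uses a made-up data frame (`toy` is hypothetical), so its result is not the exam answer.

```r
library(dplyr)

# Hypothetical toy data standing in for the exam's `data` frame
toy <- data.frame(
  participant_id = 1:6,
  condition = c("Control", "Experimental", "Experimental",
                "Control", "Experimental", "Experimental"),
  reaction_time = c(250, 280, 340, 330, NA, 290)
)

# Overall mean reaction time, ignoring missing values
overall_mean <- mean(toy$reaction_time, na.rm = TRUE)

# "Faster than the mean" means a smaller reaction time
fast_experimental <- toy %>%
  filter(condition == "Experimental", reaction_time < overall_mean)

fast_experimental
```

filter() drops rows where the comparison evaluates to NA, so participants with a missing reaction time are excluded automatically.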

Part 4: Visualization and Correlation Analysis

Question 6: Correlation Analysis with the psych Package

Using the psych package, create a correlation plot for the simulated data set created in Part 2. Include the following steps:

Select the numeric variables from the data set (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
Use the psych package’s corPlot() function to create a correlation plot.
Interpret the resulting plot by addressing:
- Which variables appear to be strongly correlated?
- Are there any surprising relationships?
- How might these correlations inform further research in psychology?

# Select the numeric variables first, then pass them to corPlot()
numeric_data <- data %>% 
  select(reaction_time, accuracy, anxiety_pre, anxiety_post, anxiety_change)

corPlot(numeric_data, upper = FALSE)

## Error in plot.new(): figure margins too large

# Your code here. Hint: first, with dplyr create a new dataset that selects only the numeric variable (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).

The variables anxiety_pre and anxiety_post seem to be strongly positively correlated, with a correlation of .901. This appears to be the only strong correlation, which was a bit surprising to me. It was also surprising that almost every other pair was slightly negatively correlated. These correlations might inform further research in psychology by inspiring others to investigate particular interventions. For example, the pattern that faster reaction times go with lower accuracy might prompt research into the speed-accuracy trade-off.
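When the plot does not render, the correlations can still be inspected as a numeric matrix with base R's cor(). The sketch below uses a made-up data frame (`toy` is hypothetical), so its values are not the exam results. It assumes pairwise deletion is an acceptable way to handle the missing values.

```r
# Hypothetical toy data standing in for the exam's numeric variables
toy <- data.frame(
  reaction_time = c(280, 310, NA, 295, 330, 305),
  accuracy = c(88, 79, 91, 84, 76, 95),
  anxiety_pre = c(30, 24, 31, 28, 22, 26),
  anxiety_post = c(27, 22, 21, 20, 19, 24)
)
toy$anxiety_change <- toy$anxiety_pre - toy$anxiety_post

# Pairwise deletion: each correlation uses all rows where both variables are present
cor_matrix <- cor(toy, use = "pairwise.complete.obs")
round(cor_matrix, 2)
```

Note that anxiety_change is computed from anxiety_pre and anxiety_post, so its correlations with those two variables partly reflect how it was constructed.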

Part 5: Reflection and Application

Question 7: Reflection

Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:

Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?
How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?

1. I’m interested in how a student’s living environment in college could affect their satisfaction. For example, what is the relationship between happiness and living at home versus in a dorm, in a co-op, or in a single apartment? I would collect data on the independent variable (living environment) and the dependent variable (happiness/satisfaction, measured with Likert scales). Descriptive statistics and correlation analysis would be appropriate statistical analyses. Self-report bias and confounding variables are potential sources of measurement error, which I would ideally address before collecting data.

2. R has changed my understanding of psychological statistics by deepening my understanding of data manipulation, such as filtering, selecting, and cleaning data. I didn’t realize that statistics required writing so much code, but I do enjoy it to a certain extent. R doesn’t seem very intuitive for beginners, so there is a fairly steep learning curve. However, it is very accessible and very good at visualization, such as creating histograms and plots.

Submission Instructions:

Be sure to knit your document to HTML format, checking that all content is correctly displayed before submission. Publish your assignment to RPubs and submit the URL to Canvas.