Take-Home Midterm Exam: Introductory Psychological Statistics

Replace “Your Name” with your actual name.

Instructions

Please complete this exam on your own. Include your R code, interpretations, and answers within this document.

Part 1: Types of Data and Measurement Errors

Question 1: Data Types in Psychological Research

Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:

Describe the key differences between nominal, ordinal, interval, and ratio data. Provide one example of each from psychological research.

Nominal data sort observations into labeled categories that have no inherent order. For example, a study that sorts participants into personality categories such as introverted, extroverted, or neurotic produces nominal data, because the labels are distinct categories that cannot be ranked in any logical order. Ordinal data, on the other hand, classify observations into categories that can be ranked. For example, survey responses on a scale from “strongly disagree” to “strongly agree” are ordered, but the differences between adjacent ranks are not necessarily equal. Interval data are measured on a numerical scale with equal intervals between values but no true zero point. For example, in a study measuring intelligence with IQ scores, the difference between scores of 100 and 110 is the same as the difference between 110 and 120, but a score of 0 would not mean that a person has no intelligence. Ratio data, like interval data, are measured on a numerical scale with equal distances between values, but the scale also has a true zero point that represents the absence of the quantity, so negative values are not possible and ratios are meaningful. For example, reaction time in milliseconds is ratio data: a response of 400 ms takes twice as long as a response of 200 ms.

For each of the following variables, identify the appropriate level of measurement (nominal, ordinal, interval, or ratio) and explain your reasoning:
- Scores on a depression inventory (0-63)
- Response time in milliseconds
- Likert scale ratings of agreement (1-7)
- Diagnostic categories (e.g., ADHD, anxiety disorder, no diagnosis)
- Age in years

Scores on a depression inventory (0-63): interval. The scores are numerical and are usually treated as having consistent intervals, but the scale has no true zero, because a score of 0 does not necessarily indicate a complete absence of depression. Response time in milliseconds: ratio. Response times are measured in equal units on a scale with a true zero point (0 ms means no elapsed time). Likert scale ratings of agreement (1-7): ordinal. The ratings have a natural order, but the intervals between adjacent ratings may not be equal. Diagnostic categories (e.g., ADHD, anxiety disorder, no diagnosis): nominal. The categories are labels without any order or quantitative value. Age in years: ratio. Age is measured in equal units on a scale with a true zero point (birth).
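The levels of measurement discussed above also shape how a variable is usually stored and summarized in R. The following is a minimal sketch with invented values (the variable names and numbers are hypothetical, not taken from any dataset in this exam): an unordered factor for nominal data, an ordered factor for ordinal data, and a plain numeric vector for ratio data.

```r
# Hypothetical values, invented for illustration only
diagnosis   <- factor(c("ADHD", "anxiety disorder", "no diagnosis", "ADHD"))  # nominal
agreement   <- factor(c(2, 7, 5, 5), levels = 1:7, ordered = TRUE)            # ordinal
response_ms <- c(412.5, 388.0, 530.2, 295.7)                                  # ratio

table(diagnosis)                 # counts are the natural summary for nominal data
agreement[1] < agreement[2]      # order comparisons are defined for ordered factors
median(as.integer(agreement))    # a median respects rank order
mean(response_ms)                # means are meaningful for interval and ratio data
response_ms[3] / response_ms[4]  # ratios are meaningful only with a true zero
```

Storing the variable with the right type makes R refuse or warn about operations that the measurement level does not support, such as taking the mean of a factor.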

Question 2: Measurement Error

Referring to Chapter 3 (Measurement Errors in Psychological Research):

Explain the difference between random and systematic error, providing an example of each in the context of a memory experiment.

Systematic errors are consistent biases that push measurements in the same direction, leading to inaccurate results. For example, if the instructions for a memory task are unclear, all participants might complete the task incorrectly, producing scores that are systematically off. Random errors are unpredictable fluctuations that vary from one measurement to the next and reduce the reliability of measurements. For example, momentary changes in a participant's attention, fatigue, or health could make a recall score a little higher or lower than that participant's true ability on any given trial.
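The distinction can be illustrated with a small simulation. This is a sketch under stated assumptions: the true score of 12 recalled words, the noise level, and the constant bias of 3 words are all hypothetical values chosen for illustration.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

n <- 1000
true_recall <- rep(12, n)  # hypothetical true score: 12 of 20 words recalled

# Random error: zero-mean noise that differs on every measurement
random_obs <- true_recall + rnorm(n, mean = 0, sd = 2)

# Systematic error: a constant bias, e.g. a scorer who always misses 3 correct words
systematic_obs <- true_recall - 3

mean(random_obs)      # close to 12: the noise averages out
sd(random_obs)        # about 2: individual scores are less reliable
mean(systematic_obs)  # 9: every score is shifted away from the true value
sd(systematic_obs)    # 0: perfectly consistent, yet wrong
```

Averaging over many observations reduces the influence of random error but does nothing to remove a systematic bias.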

How might measurement error affect the validity of a study examining the relationship between stress and academic performance? What steps could researchers take to minimize these errors?

Measurement error can threaten the validity of a study by introducing inaccuracies that distort the true relationship between variables. For example, if stress is measured unreliably, the observed relationship between stress and academic performance may appear weaker than it really is. To minimize these errors, researchers can collect data with multiple methods, such as combining self-report measures of stress with objective performance indicators, so that the weaknesses of any single measure have less influence on the conclusions.

Part 2: Descriptive Statistics and Basic Probability

Question 3: Descriptive Analysis

The code below creates a simulated dataset for a psychological experiment. Run the code chunk below without making any changes:

# Create a simulated dataset
set.seed(123)  # For reproducibility

# Number of participants
n <- 50

# Create the data frame
data <- data.frame(
  participant_id = 1:n,
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy = rnorm(n, mean = 85, sd = 10),
  gender = sample(c("Male", "Female"), n, replace = TRUE),
  condition = sample(c("Control", "Experimental"), n, replace = TRUE),
  anxiety_pre = rnorm(n, mean = 25, sd = 8),
  anxiety_post = NA  # We'll fill this in based on condition
)

# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
  data$condition == "Experimental",
  data$anxiety_pre - rnorm(n, mean = 8, sd = 3),  # Larger reduction
  data$anxiety_pre - rnorm(n, mean = 3, sd = 2)   # Smaller reduction
)

# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)

# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA

# View the first few rows of the dataset
head(data)

##   participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1              1      271.9762  87.53319 Female      Control    31.30191
## 2              2      288.4911  84.71453 Female Experimental    31.15234
## 3              3      377.9354  84.57130 Female Experimental    27.65762
## 4              4      303.5254  98.68602   Male      Control    16.93299
## 5              5      306.4644  82.74229 Female      Control    24.04438
## 6              6      385.7532 100.16471 Female      Control    22.75684
##   anxiety_post
## 1     29.05312
## 2     19.21510
## 3     20.45306
## 4     13.75199
## 5     17.84736
## 6     19.93397

Now, perform the following computations:

Calculate the mean, median, standard deviation, minimum, and maximum for reaction time and accuracy, grouped by condition (hint: use the psych package).

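# Your code here
# Reaction times from the six rows shown by head(data), truncated to whole
# milliseconds (a six-value subset, not grouped by condition)
reaction_times <- c(271, 288, 377, 303, 306, 385)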

# Calculate mean
mean(reaction_times)

## [1] 321.6667

# Calculate median
median(reaction_times)

## [1] 304.5

# Calculate variance
var(reaction_times)

## [1] 2273.467

# Calculate standard deviation
sd(reaction_times)

## [1] 47.68088
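The task asks for these statistics on the full dataset, split by condition and with missing values skipped. Below is a minimal base R sketch of that idea on a small made-up data frame (the `toy` data and the helper `summ()` are hypothetical). With the psych package loaded, `describeBy(x, group = ...)` is the route the hint points to.

```r
# Hypothetical toy data (not the exam dataset), with one missing value
toy <- data.frame(
  condition     = c("Control", "Control", "Control",
                    "Experimental", "Experimental", "Experimental"),
  reaction_time = c(280, 310, NA, 295, 305, 330),
  accuracy      = c(80, 90, 85, 70, 95, 90)
)

# Summary statistics that skip missing values
summ <- function(x) {
  c(mean = mean(x, na.rm = TRUE), median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE), min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE))
}

# One row of statistics per condition, for each variable
rt_by_condition  <- t(sapply(split(toy$reaction_time, toy$condition), summ))
acc_by_condition <- t(sapply(split(toy$accuracy, toy$condition), summ))
rt_by_condition
acc_by_condition
```

Without `na.rm = TRUE`, any group containing a missing value would return `NA` for every statistic.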

Using dplyr and piping, create a new variable anxiety_change that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.

# Your code here
library(dplyr)
head(data)

##   participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1              1      271.9762  87.53319 Female      Control    31.30191
## 2              2      288.4911  84.71453 Female Experimental    31.15234
## 3              3      377.9354  84.57130 Female Experimental    27.65762
## 4              4      303.5254  98.68602   Male      Control    16.93299
## 5              5      306.4644  82.74229 Female      Control    24.04438
## 6              6      385.7532 100.16471 Female      Control    22.75684
##   anxiety_post
## 1     29.05312
## 2     19.21510
## 3     20.45306
## 4     13.75199
## 5     17.84736
## 6     19.93397

data %>% 
  mutate(anxiety_change = anxiety_pre - anxiety_post)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 9               9            NA  86.23854 Female Experimental    32.81579
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 11             11            NA  88.79639 Female Experimental    33.42169
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 16             16            NA  88.03529 Female      Control    27.38582
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 30             30      362.6907        NA   Male      Control    21.33308
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 43             43      236.7302        NA   Male      Control    25.75667
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change
## 1     29.053117     2.24879426
## 2     19.215099    11.93723893
## 3     20.453056     7.20456483
## 4     13.751994     3.18099329
## 5     17.847362     6.19701754
## 6     19.933968     2.82286978
## 7     24.342317     5.16159899
## 8     17.758982     4.26150823
## 9     19.863065    12.95272240
## 10    22.069157    -0.06580401
## 11    25.063956     8.35773571
## 12     7.875522     8.73106229
## 13     3.221330    11.69742764
## 14    45.327922     5.60039736
## 15    16.642661     5.02247855
## 16    21.290659     6.09516212
## 17    23.416047     6.67651035
## 18    21.642810    -0.51305479
## 19    26.912456     2.22244027
## 20    24.773302     3.17841445
## 21    18.586930     4.69002601
## 22    20.597288     4.92505594
## 23    20.358843     4.36861886
## 24    31.904850    10.12276506
## 25    14.370025     4.69928609
## 26     8.052780     8.17924981
## 27    21.952702     3.34960540
## 28    24.334744     3.14910235
## 29    24.635854     3.85633353
## 30    18.283727     3.04934997
## 31     2.627509    13.86588190
## 32    27.376440     7.72904122
## 33    18.430744     3.77205314
## 34    15.607200     2.46869675
## 35    19.873474     3.23628902
## 36    19.373641     4.04895160
## 37    26.428138     7.45122383
## 38    16.420951     9.25694721
## 39    28.470531     2.56189924
## 40    15.350273     5.65539054
## 41    21.378795     5.33676775
## 42    17.294151     5.10836205
## 43    20.466142     5.29052622
## 44    15.992029     1.84506400
## 45     7.508622     7.00496546
## 46    27.270622    13.70708547
## 47    22.108595     7.69707534
## 48    11.069351     3.92047789
## 49    17.068705     3.04196717
## 50    10.016330     5.49982914

library(psych)  # provides describe()

data <- data %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post)
describe(data$anxiety_change)

##    vars  n mean  sd median trimmed  mad   min   max range skew kurtosis   se
## X1    1 50 5.64 3.3   5.07     5.3 2.86 -0.51 13.87 14.38 0.79     0.19 0.47

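Across all 50 participants, the mean anxiety change (anxiety_pre minus anxiety_post) is 5.64, so anxiety decreased on average. This is the overall mean rather than the mean for each condition. Averaging the anxiety_change values printed above separately by condition gives a mean reduction of about 8.64 for the Experimental condition (n = 19) and about 3.79 for the Control condition (n = 31).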
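A per-condition mean can be obtained by adding `group_by()` and `summarise()` to the dplyr pipeline. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical) so that the result is easy to verify by hand.

```r
library(dplyr)

# Hypothetical toy data (not the exam dataset)
toy <- data.frame(
  condition    = c("Control", "Control", "Experimental", "Experimental"),
  anxiety_pre  = c(20, 30, 25, 35),
  anxiety_post = c(18, 26, 15, 29)
)

change_by_condition <- toy %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post) %>%  # pre minus post
  group_by(condition) %>%
  summarise(mean_change = mean(anxiety_change), n = n())

change_by_condition
```

Here the Control changes are 2 and 4 and the Experimental changes are 10 and 6, so the two group means are 3 and 8.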

Question 4: Probability Calculations

Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):

If reaction times in a cognitive task are normally distributed with a mean of 350ms and a standard deviation of 75ms:
1. What is the probability that a randomly selected participant will have a reaction time greater than 450ms?
2. What is the probability that a participant will have a reaction time between 300ms and 400ms?

# Your code here
# 1. P(X > 450): pnorm() returns the lower tail, so request the upper tail
round(pnorm(450, mean = 350, sd = 75, lower.tail = FALSE), 4)

## [1] 0.0912

# 2. P(300 < X < 400): difference between two cumulative probabilities
round(pnorm(400, mean = 350, sd = 75) - pnorm(300, mean = 350, sd = 75), 4)

## [1] 0.495

The probability that a randomly chosen participant will have a reaction time greater than 450ms is roughly 9.12%, and the probability that a randomly chosen participant will have a reaction time between 300ms and 400ms is roughly 49.50%.
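Normal probabilities of this kind can be cross-checked by converting the cutoffs to z-scores and using the standard normal distribution. The following is a minimal sketch; the variable names are chosen for illustration.

```r
mu    <- 350  # mean reaction time in ms
sigma <- 75   # standard deviation in ms

# Convert each cutoff to a z-score: z = (x - mu) / sigma
z_300 <- (300 - mu) / sigma
z_400 <- (400 - mu) / sigma
z_450 <- (450 - mu) / sigma

# Upper-tail probability P(X > 450) is the complement of the lower tail
p_above_450 <- 1 - pnorm(z_450)

# Interval probability P(300 < X < 400) is a difference of lower tails
p_300_to_400 <- pnorm(z_400) - pnorm(z_300)

# Both agree with pnorm() called directly on the original scale
all.equal(p_above_450, pnorm(450, mean = mu, sd = sigma, lower.tail = FALSE))
all.equal(p_300_to_400, pnorm(400, mu, sigma) - pnorm(300, mu, sigma))

round(c(p_above_450, p_300_to_400), 4)
```

Because `pnorm()` returns the probability below a value, a "greater than" question always needs either the complement or `lower.tail = FALSE`.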

Part 3: Data Cleaning and Manipulation

Question 5: Data Cleaning with dplyr

Using the dataset created in Part 2, perform the following data cleaning and manipulation tasks:

Remove all rows with missing values and create a new dataset called clean_data.

# Your code here
clean_data <- data %>%
  na.omit()
print(clean_data)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change
## 1     29.053117     2.24879426
## 2     19.215099    11.93723893
## 3     20.453056     7.20456483
## 4     13.751994     3.18099329
## 5     17.847362     6.19701754
## 6     19.933968     2.82286978
## 7     24.342317     5.16159899
## 8     17.758982     4.26150823
## 10    22.069157    -0.06580401
## 12     7.875522     8.73106229
## 13     3.221330    11.69742764
## 14    45.327922     5.60039736
## 15    16.642661     5.02247855
## 17    23.416047     6.67651035
## 18    21.642810    -0.51305479
## 19    26.912456     2.22244027
## 20    24.773302     3.17841445
## 21    18.586930     4.69002601
## 22    20.597288     4.92505594
## 23    20.358843     4.36861886
## 24    31.904850    10.12276506
## 25    14.370025     4.69928609
## 26     8.052780     8.17924981
## 27    21.952702     3.34960540
## 28    24.334744     3.14910235
## 29    24.635854     3.85633353
## 31     2.627509    13.86588190
## 32    27.376440     7.72904122
## 33    18.430744     3.77205314
## 34    15.607200     2.46869675
## 35    19.873474     3.23628902
## 36    19.373641     4.04895160
## 37    26.428138     7.45122383
## 38    16.420951     9.25694721
## 39    28.470531     2.56189924
## 40    15.350273     5.65539054
## 41    21.378795     5.33676775
## 42    17.294151     5.10836205
## 44    15.992029     1.84506400
## 45     7.508622     7.00496546
## 46    27.270622    13.70708547
## 47    22.108595     7.69707534
## 48    11.069351     3.92047789
## 49    17.068705     3.04196717
## 50    10.016330     5.49982914
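In the output above, participants 9, 11, 16, 30, and 43 no longer appear, so 45 of the 50 rows remain. Before dropping rows it can help to count how much data is missing and where. The sketch below shows one way to do that on a small made-up data frame (`toy` is hypothetical).

```r
# Hypothetical toy data (not the exam dataset)
toy <- data.frame(
  reaction_time = c(280, NA, 310, 295),
  accuracy      = c(85, 90, NA, 70)
)

colSums(is.na(toy))          # missing values per column
sum(!complete.cases(toy))    # rows that na.omit() will drop

toy_clean <- na.omit(toy)
nrow(toy) - nrow(toy_clean)  # rows actually removed
```

Note that `na.omit()` removes a row if any column is missing, so it can discard usable values in the other columns.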

Create a new variable performance_category that categorizes participants based on their accuracy:
- “High” if accuracy is greater than or equal to 90
- “Medium” if accuracy is between 70 and 90
- “Low” if accuracy is less than 70

# Your code here
categorized_data <- data %>%  # new name, so the NA-free clean_data is not overwritten
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",
    accuracy >= 70 & accuracy < 90 ~ "Medium",
    accuracy < 70 ~ "Low"
  ))
print(categorized_data)

##    participant_id reaction_time  accuracy gender    condition anxiety_pre
## 1               1      271.9762  87.53319 Female      Control    31.30191
## 2               2      288.4911  84.71453 Female Experimental    31.15234
## 3               3      377.9354  84.57130 Female Experimental    27.65762
## 4               4      303.5254  98.68602   Male      Control    16.93299
## 5               5      306.4644  82.74229 Female      Control    24.04438
## 6               6      385.7532 100.16471 Female      Control    22.75684
## 7               7      323.0458  69.51247 Female      Control    29.50392
## 8               8      236.7469  90.84614   Male      Control    22.02049
## 9               9            NA  86.23854 Female Experimental    32.81579
## 10             10      277.7169  87.15942 Female      Control    22.00335
## 11             11            NA  88.79639 Female Experimental    33.42169
## 12             12      317.9907  79.97677   Male Experimental    16.60658
## 13             13      320.0386  81.66793   Male Experimental    14.91876
## 14             14      305.5341  74.81425 Female      Control    50.92832
## 15             15      272.2079  74.28209 Female Experimental    21.66514
## 16             16            NA  88.03529 Female      Control    27.38582
## 17             17      324.8925  89.48210 Female Experimental    30.09256
## 18             18      201.6691  85.53004   Male      Control    21.12975
## 19             19      335.0678  94.22267 Female      Control    29.13490
## 20             20      276.3604 105.50085   Male      Control    27.95172
## 21             21      246.6088  80.08969 Female      Control    23.27696
## 22             22      289.1013  61.90831   Male      Control    25.52234
## 23             23      248.6998  95.05739   Male      Control    24.72746
## 24             24      263.5554  77.90799   Male Experimental    42.02762
## 25             25      268.7480  78.11991 Female      Control    19.06931
## 26             26      215.6653  95.25571 Female Experimental    16.23203
## 27             27      341.8894  82.15227   Male      Control    25.30231
## 28             28      307.6687  72.79282   Male      Control    27.48385
## 29             29      243.0932  86.81303 Female      Control    28.49219
## 30             30      362.6907        NA   Male      Control    21.33308
## 31             31      321.3232  85.05764   Male Experimental    16.49339
## 32             32      285.2464  88.85280 Female Experimental    35.10548
## 33             33      344.7563  81.29340 Female      Control    22.20280
## 34             34      343.9067  91.44377   Male      Control    18.07590
## 35             35      341.0791  82.79513 Female      Control    23.10976
## 36             36      334.4320  88.31782 Female Experimental    23.42259
## 37             37      327.6959  95.96839 Female Experimental    33.87936
## 38             38      296.9044  89.35181 Female Experimental    25.67790
## 39             39      284.7019  81.74068 Female      Control    31.03243
## 40             40      280.9764  96.48808   Male Experimental    21.00566
## 41             41      265.2647  94.93504   Male      Control    26.71556
## 42             42      289.6041  90.48397 Female      Control    22.40251
## 43             43      236.7302        NA   Male      Control    25.75667
## 44             44      408.4478  78.72094 Female      Control    17.83709
## 45             45      360.3981  98.60652   Male      Control    14.51359
## 46             46      243.8446  78.99740   Male Experimental    40.97771
## 47             47      279.8558 106.87333   Male Experimental    29.80567
## 48             48      276.6672 100.32611 Female Experimental    14.98983
## 49             49      338.9983  82.64300 Female      Control    20.11067
## 50             50      295.8315  74.73579 Female      Control    15.51616
##    anxiety_post anxiety_change performance_category
## 1     29.053117     2.24879426               Medium
## 2     19.215099    11.93723893               Medium
## 3     20.453056     7.20456483               Medium
## 4     13.751994     3.18099329                 High
## 5     17.847362     6.19701754               Medium
## 6     19.933968     2.82286978                 High
## 7     24.342317     5.16159899                  Low
## 8     17.758982     4.26150823                 High
## 9     19.863065    12.95272240               Medium
## 10    22.069157    -0.06580401               Medium
## 11    25.063956     8.35773571               Medium
## 12     7.875522     8.73106229               Medium
## 13     3.221330    11.69742764               Medium
## 14    45.327922     5.60039736               Medium
## 15    16.642661     5.02247855               Medium
## 16    21.290659     6.09516212               Medium
## 17    23.416047     6.67651035               Medium
## 18    21.642810    -0.51305479               Medium
## 19    26.912456     2.22244027                 High
## 20    24.773302     3.17841445                 High
## 21    18.586930     4.69002601               Medium
## 22    20.597288     4.92505594                  Low
## 23    20.358843     4.36861886                 High
## 24    31.904850    10.12276506               Medium
## 25    14.370025     4.69928609               Medium
## 26     8.052780     8.17924981                 High
## 27    21.952702     3.34960540               Medium
## 28    24.334744     3.14910235               Medium
## 29    24.635854     3.85633353               Medium
## 30    18.283727     3.04934997                 <NA>
## 31     2.627509    13.86588190               Medium
## 32    27.376440     7.72904122               Medium
## 33    18.430744     3.77205314               Medium
## 34    15.607200     2.46869675                 High
## 35    19.873474     3.23628902               Medium
## 36    19.373641     4.04895160               Medium
## 37    26.428138     7.45122383                 High
## 38    16.420951     9.25694721               Medium
## 39    28.470531     2.56189924               Medium
## 40    15.350273     5.65539054                 High
## 41    21.378795     5.33676775                 High
## 42    17.294151     5.10836205                 High
## 43    20.466142     5.29052622                 <NA>
## 44    15.992029     1.84506400               Medium
## 45     7.508622     7.00496546                 High
## 46    27.270622    13.70708547               Medium
## 47    22.108595     7.69707534                 High
## 48    11.069351     3.92047789                 High
## 49    17.068705     3.04196717               Medium
## 50    10.016330     5.49982914               Medium
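The same three-way categorization can also be written with base R's `cut()`, which makes the boundary handling explicit. This is a sketch on made-up accuracy values (hypothetical, chosen to sit on the boundaries), not a replacement for the dplyr approach the question asks for.

```r
accuracy <- c(65, 70, 89.9, 90, 100, NA)  # hypothetical values

performance_category <- cut(
  accuracy,
  breaks = c(-Inf, 70, 90, Inf),
  labels = c("Low", "Medium", "High"),
  right  = FALSE  # intervals are [lower, upper), so 70 is Medium and 90 is High
)

data.frame(accuracy, performance_category)
```

As with `case_when()`, a missing accuracy value stays `NA` rather than being assigned to a category.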

Filter the dataset to include only participants in the Experimental condition with reaction times faster than the overall mean reaction time.

# Your code here
# "Faster" means a reaction time below the overall mean (NAs ignored when averaging)
filtered_data <- data %>%
  filter(reaction_time < mean(reaction_time, na.rm = TRUE) & condition == "Experimental")
filtered_data %>%
  select(participant_id, reaction_time, condition)

##    participant_id reaction_time    condition
## 1               2      288.4911 Experimental
## 2              15      272.2079 Experimental
## 3              24      263.5554 Experimental
## 4              26      215.6653 Experimental
## 5              32      285.2464 Experimental
## 6              38      296.9044 Experimental
## 7              40      280.9764 Experimental
## 8              46      243.8446 Experimental
## 9              47      279.8558 Experimental
## 10             48      276.6672 Experimental

Write your answer(s) here describing your data cleaning process.

Part 4: Visualization and Correlation Analysis

Question 6: Correlation Analysis with the psych Package

Using the psych package, create a correlation plot for the simulated dataset created in Part 2. Include the following steps:

Select the numeric variables from the dataset (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
Use the psych package’s corPlot() function to create a correlation plot.
Interpret the resulting plot by addressing:
- Which variables appear to be strongly correlated?
- Are there any surprising relationships?
- How might these correlations inform further research in psychology?

# Your code here. Hint: first, with dplyr create a new dataset that selects only the numeric variable (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
library(psych)  # provides corPlot()
numeric_data <- data %>%
  select(reaction_time, accuracy, anxiety_pre, anxiety_post, anxiety_change)
corPlot(numeric_data, upper = FALSE)

The variables anxiety_pre and anxiety_post are strongly correlated (r = .901); this appears to be the only strong correlation in the plot. This relationship is not surprising, because anxiety_post was computed by subtracting a reduction from anxiety_pre. Correlations like these can inform further research in psychology by prompting researchers to ask why variables are related, and they can help identify potential risk factors as well.
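The exact coefficients behind a correlation plot can be inspected as a matrix. Below is a minimal sketch using base R's `cor()` with pairwise deletion of missing values, run on a small made-up data frame (`toy` and its values are hypothetical).

```r
# Hypothetical toy data (not the exam dataset), with missing values
toy <- data.frame(
  reaction_time = c(280, NA, 310, 295, 330, 300),
  accuracy      = c(85, 90, NA, 70, 95, 80),
  anxiety_pre   = c(20, 30, 25, 35, 15, 28)
)

# Pairwise deletion: each correlation uses every row where both variables are present
r <- cor(toy, use = "pairwise.complete.obs")
round(r, 2)
```

With pairwise deletion, different cells of the matrix can be based on different numbers of participants, which is worth reporting alongside the coefficients.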

Part 5: Reflection and Application

Question 7: Reflection

Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:

Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?
How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?

1. Is there a significant difference in the accuracy of eyewitness testimony between children and adults when recalling details of an event? To study this, I would recruit a group of children ages 8-13 and a group of adults ages 18-25. Participants would watch a short video or an interaction between two people; then, after a delay (e.g., 24 hours or 2 days), I would survey them and ask them to recall what happened. One challenge in this research is that children are more susceptible to suggestion than adults, so it would be important to avoid leading or suggestive questions when interviewing participants.

2. I am still learning R, but I have gained an understanding of data manipulation and statistical testing. R offers a wide range of functions and packages, which I see as a big advantage because analyses can be customized to specific research needs. This can also be a disadvantage, because R relies heavily on packages for specific analyses, and managing their installation can be complex.

Submission Instructions:

Be sure to knit your document to HTML format and check that all content is displayed correctly before submission. Publish your assignment to RPubs and submit the URL to Canvas.