Replace “Your Name” with your actual name.
Please complete this exam on your own. Include your R code, interpretations, and answers within this document.
Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:
Nominal data are categorical and consist of labels or names with no inherent order. An example would be a survey that asks participants for their preferred type of therapy, with possible responses of “CBT” or “DBT”. Ordinal data have ordered categories, but the distances between categories are not necessarily equal. An example would be a questionnaire that assesses the level of agreement with a statement on a scale of strongly disagree, disagree, neutral, agree, and strongly agree. Interval data are numerical with equal spacing between values but no true zero. An example would be temperature in degrees Celsius, where 0 does not mean an absence of temperature. Ratio data are numerical with equal spacing and a true zero. Examples would be reaction time, where zero means no elapsed time, or the number of errors an individual makes on a standard times table test.
A Likert scale rating of agreement is an example of ordinal measurement because it is a scale of ordered categories. Diagnostic categories are an example of nominal measurement because they are names of disorders with no inherent order. Response time in milliseconds is an example of ratio measurement because there is a true zero. Scores on a depression inventory (0-63) are an example of interval measurement because a score of zero does not mean a complete absence of depression. Age in years is an example of ratio measurement because zero is a true starting point: a newborn has not yet aged.
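As a small illustration of how these measurement levels can be represented in R, the sketch below stores a nominal variable as an unordered factor, an ordinal variable as an ordered factor, and a ratio variable as a plain numeric vector. The variable names and values are made up for illustration and are not part of the exam dataset.

```r
# Nominal: unordered factor (labels only, no ranking)
therapy <- factor(c("CBT", "DBT", "CBT"))

# Ordinal: ordered factor, so comparisons between levels are meaningful
agreement_levels <- c("strongly disagree", "disagree", "neutral",
                      "agree", "strongly agree")
agreement <- factor(c("agree", "neutral", "strongly agree"),
                    levels = agreement_levels, ordered = TRUE)

# Ratio: numeric with a true zero, so ratios are meaningful
errors <- c(0, 3, 6)

print(is.ordered(therapy))           # FALSE: nominal categories have no order
print(agreement[1] > agreement[2])   # TRUE: "agree" ranks above "neutral"
print(errors[3] / errors[2])         # 2: six errors is twice as many as three
```

Ordered factors allow rank comparisons but not arithmetic, which mirrors the limits of ordinal measurement.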
Referring to Chapter 3 (Measurement Errors in Psychological Research):
Random errors are unpredictable and inconsistent. An example would be a participant in a word-recall task whose recall is worse on one trial because of a distraction in the testing room. A systematic error is consistent and biases scores in a predictable direction. An example would be a memorization period that is too short, so participants consistently fail to recall words.
Measurement error threatens the validity of a study by introducing inaccuracies, which can lead to incorrect conclusions and reduced credibility. For example, if a study examining the relationship between stress and academic performance uses consistently poorly worded questions, the data may show an inaccurate relationship between the two.
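The difference between the two error types can be illustrated with a short simulation. This is only a sketch under assumed values: a hypothetical true recall score of 20 words, random noise with a standard deviation of 3, and a hypothetical systematic bias of 4 fewer words recalled. None of these numbers come from the exam.

```r
set.seed(42)  # for reproducibility

true_score <- 20    # hypothetical true number of words recalled
n_trials   <- 1000  # hypothetical number of measurements
bias       <- -4    # hypothetical systematic error (e.g., study period too short)

# Random error only: noise is centered on zero, so it averages out
random_only <- true_score + rnorm(n_trials, mean = 0, sd = 3)

# Systematic plus random error: every score is shifted in the same direction
biased <- true_score + bias + rnorm(n_trials, mean = 0, sd = 3)

print(c(random_only = mean(random_only), biased = mean(biased)))
```

With many trials, the mean of the random-error scores should land near the true score, while the biased scores stay shifted by roughly the size of the bias no matter how many trials are collected.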
The code below creates a simulated dataset for a psychological experiment. Run the code chunk below without making any changes:
# Create a simulated dataset
set.seed(123) # For reproducibility
# Number of participants
n <- 50
# Create the data frame
data <- data.frame(
  participant_id = 1:n,
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy = rnorm(n, mean = 85, sd = 10),
  gender = sample(c("Male", "Female"), n, replace = TRUE),
  condition = sample(c("Control", "Experimental"), n, replace = TRUE),
  anxiety_pre = rnorm(n, mean = 25, sd = 8),
  anxiety_post = NA # We'll fill this in based on condition
)
# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
  data$condition == "Experimental",
  data$anxiety_pre - rnorm(n, mean = 8, sd = 3), # Larger reduction
  data$anxiety_pre - rnorm(n, mean = 3, sd = 2) # Smaller reduction
)
# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)
# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA
# View the first few rows of the dataset
head(data)
## participant_id reaction_time accuracy gender condition anxiety_pre
## 1 1 271.9762 87.53319 Female Control 31.30191
## 2 2 288.4911 84.71453 Female Experimental 31.15234
## 3 3 377.9354 84.57130 Female Experimental 27.65762
## 4 4 303.5254 98.68602 Male Control 16.93299
## 5 5 306.4644 82.74229 Female Control 24.04438
## 6 6 385.7532 100.16471 Female Control 22.75684
## anxiety_post
## 1 29.05312
## 2 19.21510
## 3 20.45306
## 4 13.75199
## 5 17.84736
## 6 19.93397
Now, perform the following computations:
reaction_times <- c(271.9762, 288.4911, 377.9354, 303.5254, 306.4644, 385.7532)
mean(reaction_times)
## [1] 322.3576
## [1] 304.9949
## [1] 47.75011
## [1] 271.9762
## [1] 385.7532
## [1] 89.73534
## [1] 86.12386
## [1] 7.674822
## [1] 82.74229
## [1] 100.1647
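The summary values above match what is obtained from only the six rows printed by head(data). A minimal sketch of how the same statistics could be computed on a full column, skipping missing values with na.rm = TRUE, is shown below. The data frame here is a freshly simulated stand-in with the same column names, and describe_var is a hypothetical helper name, so the numbers it prints will differ from the exam's dataset.

```r
# Stand-in for the exam's `data` (same column names, newly simulated values)
set.seed(123)
n <- 50
data <- data.frame(
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy      = rnorm(n, mean = 85, sd = 10)
)
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA

# Hypothetical helper: five descriptive statistics, ignoring NA values
describe_var <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE))
}

rt_summary  <- describe_var(data$reaction_time)
acc_summary <- describe_var(data$accuracy)
print(round(rbind(reaction_time = rt_summary, accuracy = acc_summary), 2))
```

Without na.rm = TRUE, each of these functions returns NA as soon as the column contains a missing value.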
Create a new variable anxiety_change that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.
anxiety_pre <- c(31.30191, 31.15234, 27.65762, 16.93299, 24.04438, 22.75684)
anxiety_post <- c(29.05312, 19.21510, 20.45306, 13.75199, 17.84736, 19.93397)
anxiety_change <- (anxiety_pre - anxiety_post)
mean(anxiety_change)
## [1] 5.59858
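The mean anxiety change (pre minus post) for the six participants shown by head(data) is 5.59858. This is an overall mean rather than a separate mean for each condition.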
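The task asks for the mean anxiety change within each condition. One possible way to do this in base R is with tapply(), sketched below on a stand-in data frame that follows the same simulation recipe as the exam code. Because the random draws are not in the same order as in the exam, the printed means will not match the exam's dataset.

```r
# Stand-in for the exam's `data` (same recipe, different random draws)
set.seed(123)
n <- 50
data <- data.frame(
  condition   = sample(c("Control", "Experimental"), n, replace = TRUE),
  anxiety_pre = rnorm(n, mean = 25, sd = 8)
)
reduction <- ifelse(
  data$condition == "Experimental",
  rnorm(n, mean = 8, sd = 3),  # larger reduction
  rnorm(n, mean = 3, sd = 2)   # smaller reduction
)
data$anxiety_post <- pmax(data$anxiety_pre - reduction, 0)

# Change score: pre minus post, so positive values mean anxiety went down
data$anxiety_change <- data$anxiety_pre - data$anxiety_post

# Mean change within each condition
change_by_condition <- tapply(data$anxiety_change, data$condition, mean)
print(change_by_condition)
```

The same grouping could be written with dplyr's group_by() and summarise() if that style is preferred.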
Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):
## [1] 0.9087888
## [1] 0.4950149
The value 0.9087888 is a cumulative (lower-tail) probability, that is, the probability of a reaction time below 450 ms. The probability that a randomly selected participant will have a reaction time greater than 450 ms is therefore 1 - 0.9087888 = 0.0912112. The probability that a randomly selected participant will have a reaction time between 300 and 400 ms is 0.4950149.
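The two outputs can be reproduced with pnorm() if reaction times are assumed to be normally distributed with a mean of 350 ms and a standard deviation of 75 ms. These parameters are an assumption inferred from the printed outputs, since the question text is not shown here. The sketch shows the lower tail, the upper tail, and the interval probability side by side.

```r
mu    <- 350  # assumed mean in ms (inferred from the outputs, not stated here)
sigma <- 75   # assumed standard deviation in ms (same caveat)

# pnorm() returns the lower tail by default: P(X < 450)
p_below_450 <- pnorm(450, mean = mu, sd = sigma)

# Upper tail: P(X > 450)
p_above_450 <- pnorm(450, mean = mu, sd = sigma, lower.tail = FALSE)

# Interval: P(300 < X < 400) = P(X < 400) - P(X < 300)
p_300_to_400 <- pnorm(400, mean = mu, sd = sigma) -
  pnorm(300, mean = mu, sd = sigma)

print(c(below_450 = p_below_450,
        above_450 = p_above_450,
        between_300_400 = p_300_to_400))
```

Under these assumed parameters, the lower tail is about 0.909 and the upper tail is about 0.091, so a "greater than" question needs lower.tail = FALSE or 1 minus the default result.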
Using the dataset created in Part 2, perform the following data cleaning and manipulation tasks:
Remove all rows with missing values and create a new dataset called clean_data.
Create a new variable performance_category that categorizes participants based on their accuracy:
Write your answer(s) here describing your data cleaning process.
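A minimal sketch of the two cleaning steps is given below, using na.omit() to drop incomplete rows and cut() to build the category variable. The accuracy cut points (below 70 = "Low", 70 up to 90 = "Medium", 90 and above = "High") are hypothetical, since the category definitions are not listed here, and the data frame is a freshly simulated stand-in for the exam's dataset.

```r
# Stand-in for the exam's `data` (same column names, newly simulated values)
set.seed(123)
n <- 50
data <- data.frame(
  participant_id = 1:n,
  reaction_time  = rnorm(n, mean = 300, sd = 50),
  accuracy       = rnorm(n, mean = 85, sd = 10)
)
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA

# Step 1: keep only rows with no missing values in any column
clean_data <- na.omit(data)

# Step 2: bin accuracy into categories (HYPOTHETICAL cut points)
clean_data$performance_category <- cut(
  clean_data$accuracy,
  breaks = c(-Inf, 70, 90, Inf),
  labels = c("Low", "Medium", "High"),
  right  = FALSE  # intervals are [lower, upper), so exactly 90 counts as "High"
)

print(nrow(clean_data))
print(table(clean_data$performance_category))
```

Replacing the breaks and labels with the thresholds given in the assignment is the only change needed to adapt the sketch.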
Using the psych package, create a correlation plot for the simulated dataset created in Part 2. Include the following steps:
Use the corPlot() function to create a correlation plot.
# Your code here. Hint: first, with dplyr, create a new dataset that selects only the numeric variables (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
Write your answer(s) here
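The hint can be sketched as follows, assuming the dplyr and psych packages are installed. The data frame is a simplified, freshly simulated stand-in with the same numeric column names as the exam's dataset, so the correlations it produces are illustrative only.

```r
library(dplyr)
library(psych)

# Stand-in for the exam's `data` (simplified recipe, newly simulated values)
set.seed(123)
n <- 50
data <- data.frame(
  gender        = sample(c("Male", "Female"), n, replace = TRUE),
  reaction_time = rnorm(n, mean = 300, sd = 50),
  accuracy      = rnorm(n, mean = 85, sd = 10),
  anxiety_pre   = rnorm(n, mean = 25, sd = 8)
)
data$anxiety_post   <- pmax(data$anxiety_pre - rnorm(n, mean = 5, sd = 3), 0)
data$anxiety_change <- data$anxiety_pre - data$anxiety_post

# Keep only the numeric variables named in the hint
numeric_data <- data %>%
  dplyr::select(reaction_time, accuracy, anxiety_pre,
                anxiety_post, anxiety_change)

# Pairwise-complete correlations tolerate missing values in individual columns
cor_matrix <- cor(numeric_data, use = "pairwise.complete.obs")

# Draw the correlation plot with the coefficients printed in each cell
corPlot(cor_matrix, numbers = TRUE,
        main = "Correlations among numeric variables")
```

Passing a precomputed correlation matrix makes the handling of missing values explicit before plotting.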
Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:
Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?
How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?
Write your answer(s) here
Be sure to knit your document to HTML format, checking that all content is displayed correctly before submission. Publish your assignment to RPubs and submit the URL to Canvas.