Replace “Your Name” with your actual name.
Please complete this exam on your own. Include your R code, interpretations, and answers within this document.
Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:
Nominal data sorts observations into mutually exclusive categories with no inherent order (example: gender). Ordinal data sorts observations into categories that have a meaningful order or ranking, although the distances between ranks are not necessarily equal (example: educational level). Interval data is ordered with equal intervals between values but has no true zero point (example: temperature in Celsius or Fahrenheit). Ratio data has equal intervals between values, just like interval data, plus a true zero point, so ratios between values are meaningful (examples: height, weight).
Scores on a depression inventory are interval: a score of 0 does not represent a complete absence of depression, so there is no true zero point. Response time in milliseconds is ratio: it has a true zero point and meaningful ratios. Likert scale ratings of agreement are ordinal: the intervals between categories may not be equal. Diagnostic categories are nominal: they have no inherent order. Age in years is ratio: it has a true zero point and meaningful ratios, so a 20-year-old is twice the age of a 10-year-old.
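As a minimal sketch of how these four scales behave in R, the example below uses made-up values (none of them come from the exam dataset). The variable names `diagnosis`, `agreement`, `temp_c`, and `rt_ms` are hypothetical. It shows that ordered factors allow rank comparisons, that ratios on an interval scale change when the unit changes, and that ratios on a ratio scale do not.

```r
# Hypothetical values for illustration only.
diagnosis <- factor(c("GAD", "MDD", "GAD", "PTSD"))      # nominal: labels, no order
agreement <- factor(c("Agree", "Neutral", "Disagree", "Agree"),
                    levels = c("Disagree", "Neutral", "Agree"),
                    ordered = TRUE)                       # ordinal: ranked categories
temp_c <- c(10, 20, 30)                                   # interval: no true zero
rt_ms  <- c(250, 500, 750)                                # ratio: true zero

# Nominal: counting category membership is the meaningful summary.
table(diagnosis)

# Ordinal: rank comparisons work because the factor is ordered.
agree_outranks_neutral <- agreement[1] > agreement[2]

# Interval: the ratio 20 C / 10 C is not preserved after converting to Fahrenheit.
temp_f  <- temp_c * 9 / 5 + 32
ratio_c <- temp_c[2] / temp_c[1]   # 2
ratio_f <- temp_f[2] / temp_f[1]   # 68 / 50 = 1.36

# Ratio: the ratio survives a change of unit (milliseconds to seconds).
ratio_ms <- rt_ms[2] / rt_ms[1]
ratio_s  <- (rt_ms[2] / 1000) / (rt_ms[1] / 1000)
```

The Celsius and Fahrenheit ratios disagree, which is why "twice as hot" is not a meaningful statement on an interval scale.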
Referring to Chapter 3 (Measurement Errors in Psychological Research):
Random errors are unpredictable variations in measurement that occur by chance. An example of random error is misreading a weighing scale slightly differently on each reading. Systematic errors are consistent, predictable deviations from the true value. An example of systematic error is a thermometer that consistently reads 1 or 2 degrees higher than the actual temperature.
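The difference between the two error types can be sketched with a small simulation. The true value, the noise level, and the +1.5 degree bias below are assumptions chosen for illustration, not values from the course material.

```r
set.seed(42)          # arbitrary seed, for reproducibility
true_temp  <- 37      # hypothetical true temperature
n_readings <- 10000

# Random error: unpredictable noise centered on zero.
random_only <- true_temp + rnorm(n_readings, mean = 0, sd = 0.5)

# Systematic error: the same kind of noise plus a constant +1.5 degree bias.
biased <- true_temp + 1.5 + rnorm(n_readings, mean = 0, sd = 0.5)

random_gap <- mean(random_only) - true_temp   # close to 0: random error averages out
biased_gap <- mean(biased) - true_temp        # close to 1.5: bias does not average out
c(random_gap = random_gap, biased_gap = biased_gap)
```

Averaging many readings shrinks random error, but it leaves systematic error untouched; that kind of error has to be removed by calibrating the instrument.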
Validity refers to the degree to which a test measures what it claims to measure. Measurement error affects the validity of a study of the relationship between stress and academic performance because scores that contain error do not accurately reflect the constructs being studied, so the observed relationship may be distorted and the conclusions inaccurate. To minimize these errors, we could use reliable, validated instruments and take repeated measurements.
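One way to see this effect is a simulation in which stress is measured with random noise. The sketch below is illustrative only: the effect size, the noise levels, and every variable name are assumptions. It covers random error only, and it shows that averaging repeated measurements recovers part of the lost correlation.

```r
set.seed(1)   # arbitrary seed
n <- 5000

# Hypothetical "true" scores: higher stress goes with lower performance.
stress_true <- rnorm(n, mean = 50, sd = 10)
performance <- 80 - 0.5 * stress_true + rnorm(n, sd = 5)

# A single unreliable stress measurement (true score plus random error).
stress_noisy <- stress_true + rnorm(n, sd = 10)

# The average of three equally noisy measurements of the same person.
noise_matrix <- matrix(rnorm(n * 3, sd = 10), ncol = 3)
stress_avg3  <- stress_true + rowMeans(noise_matrix)

r_true  <- cor(stress_true,  performance)
r_noisy <- cor(stress_noisy, performance)
r_avg3  <- cor(stress_avg3,  performance)
round(c(true = r_true, one_measure = r_noisy, mean_of_three = r_avg3), 2)
```

The correlation based on one noisy measurement is weaker than the true correlation, and the three-measurement average falls in between.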
The code below creates a simulated dataset for a psychological experiment. Run the below code chunk without making any changes:
# Create a simulated dataset
set.seed(123) # For reproducibility
# Number of participants
n <- 50
# Create the data frame
data <- data.frame(
participant_id = 1:n,
reaction_time = rnorm(n, mean = 300, sd = 50),
accuracy = rnorm(n, mean = 85, sd = 10),
gender = sample(c("Male", "Female"), n, replace = TRUE),
condition = sample(c("Control", "Experimental"), n, replace = TRUE),
anxiety_pre = rnorm(n, mean = 25, sd = 8),
anxiety_post = NA # We'll fill this in based on condition
)
# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
data$condition == "Experimental",
data$anxiety_pre - rnorm(n, mean = 8, sd = 3), # Larger reduction
data$anxiety_pre - rnorm(n, mean = 3, sd = 2) # Smaller reduction
)
# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)
# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA
# View the first few rows of the dataset
head(data)
## participant_id reaction_time accuracy gender condition anxiety_pre
## 1 1 271.9762 87.53319 Female Control 31.30191
## 2 2 288.4911 84.71453 Female Experimental 31.15234
## 3 3 377.9354 84.57130 Female Experimental 27.65762
## 4 4 303.5254 98.68602 Male Control 16.93299
## 5 5 306.4644 82.74229 Female Control 24.04438
## 6 6 385.7532 100.16471 Female Control 22.75684
## anxiety_post
## 1 29.05312
## 2 19.21510
## 3 20.45306
## 4 13.75199
## 5 17.84736
## 6 19.93397
Now, perform the following computations:
## item group1 vars n mean sd median trimmed mad min max range
## X11 1 Control 1 29 85.49 9.86 85.53 85.68 8.77 61.91 105.50 43.59
## X12 2 Experimental 1 19 88.06 8.20 88.32 87.76 9.86 74.28 106.87 32.59
## skew kurtosis se
## X11 -0.15 -0.35 1.83
## X12 0.45 -0.45 1.88
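The code that produced the table above is not shown. Its column layout (item, group1, vars, n, mean, sd, median, trimmed, mad, and so on) matches what `psych::describeBy()` returns with `mat = TRUE`, but that is an inference. The group sizes of 29 and 19 add up to 48, which fits the two accuracy values set to missing earlier. As a rough base R sketch of the same kind of grouped summary, using a made-up data frame `toy` and a hypothetical helper `summarise_group()`:

```r
# Hypothetical mini dataset; the values are made up for illustration.
toy <- data.frame(
  condition = c("Control", "Control", "Control",
                "Experimental", "Experimental", "Experimental"),
  accuracy  = c(80, 90, NA, 85, 95, 90)
)

# Descriptive statistics for one group, ignoring missing values.
summarise_group <- function(x) {
  x <- x[!is.na(x)]
  c(n = length(x), mean = mean(x), sd = sd(x), median = median(x),
    min = min(x), max = max(x), se = sd(x) / sqrt(length(x)))
}

# Split accuracy by condition, summarise each group, one row per group.
group_stats <- t(sapply(split(toy$accuracy, toy$condition), summarise_group))
round(group_stats, 2)
```

This sketch leaves out the trimmed mean, mad, skew, and kurtosis columns that appear in the table above.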
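Create a new variable, anxiety_change, that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.
library(dplyr) # provides %>%, mutate(), group_by(), summarise()
# Positive values mean anxiety went down from pre to post
data <- data %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post)
# Mean change within each condition
data %>%
  group_by(condition) %>%
  summarise(mean_anxiety_change = mean(anxiety_change, na.rm = TRUE))
## # A tibble: 2 × 2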
## condition mean_anxiety_change
## <chr> <dbl>
## 1 Control 3.79
## 2 Experimental 8.64
The mean anxiety change (pre minus post) for the control group is 3.79. The mean anxiety change for the experimental group is 8.64, so anxiety dropped more in the experimental condition.
Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):
mean_rt <- 350
sd_rt <- 75
# (a) Probability of reaction time > 450ms
p_greater_450 <- 1 - pnorm(450, mean = mean_rt, sd = sd_rt)
# (b) Probability of reaction time between 300ms and 400ms
p_between_300_400 <- pnorm(400, mean = mean_rt, sd = sd_rt) - pnorm(300, mean = mean_rt, sd = sd_rt)
p_greater_450
## [1] 0.09121122
The probability that a randomly selected student will have a reaction time greater than 450 ms is about 0.09. The probability that a participant will have a reaction time between 300 ms and 400 ms is about 0.50 (0.495 before rounding).
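The same probabilities can be checked by converting each cutoff to a z-score and using the standard normal distribution. This is a minimal sketch that reuses the mean of 350 and standard deviation of 75 from the code above; the `z_*` names are introduced here for illustration.

```r
mean_rt <- 350
sd_rt   <- 75

# Standardize each cutoff: z = (x - mean) / sd
z_450 <- (450 - mean_rt) / sd_rt   # about  1.33
z_300 <- (300 - mean_rt) / sd_rt   # about -0.67
z_400 <- (400 - mean_rt) / sd_rt   # about  0.67

# Upper-tail area beyond z_450, and the area between z_300 and z_400.
p_greater_450     <- pnorm(z_450, lower.tail = FALSE)
p_between_300_400 <- pnorm(z_400) - pnorm(z_300)

round(c(p_greater_450, p_between_300_400), 3)   # about 0.091 and 0.495
```

Because 300 and 400 sit the same distance on either side of the mean, the second probability can also be written as `2 * pnorm(z_400) - 1`.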
Using the dataset created in Part 2, perform the following data cleaning and manipulation tasks:
Remove all rows with missing values to create a new dataset, clean_data.
Create a new variable, performance_category, that categorizes participants based on their accuracy:
# Mean reaction time across the cleaned dataset
mean_reaction_time <- mean(clean_data$reaction_time, na.rm = TRUE)
# Keep experimental participants who were faster than that mean
filtered_data <- clean_data %>%
  filter(condition == "Experimental" & reaction_time < mean_reaction_time)
I started by removing rows that had missing values, which created a new dataset (clean_data). Next, I created a new variable (performance_category) that classifies participants based on their accuracy: 90 or above is high, between 70 and 90 is medium, and below 70 is low. Finally, I filtered the dataset to include only participants in the experimental condition whose reaction time was faster than the mean.
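The code for the first two steps is not shown above, so the following is only a sketch of how the three steps described could fit together. It uses base R so that it runs without extra packages, and a made-up data frame `toy` in place of the exam dataset. The boundary rule is an assumption: exactly 90 counts as high and exactly 70 counts as medium.

```r
# Hypothetical toy data standing in for the simulated exam dataset.
toy <- data.frame(
  participant_id = 1:6,
  reaction_time  = c(250, 320, NA, 280, 400, 310),
  accuracy       = c(95, 65, 80, 90, NA, 72),
  condition      = c("Experimental", "Control", "Experimental",
                     "Experimental", "Control", "Experimental")
)

# Step 1: drop every row that contains a missing value.
clean_data <- na.omit(toy)

# Step 2: bin accuracy into Low / Medium / High.
# right = FALSE closes intervals on the left: [70, 90) is Medium, [90, Inf) is High.
clean_data$performance_category <- cut(
  clean_data$accuracy,
  breaks = c(-Inf, 70, 90, Inf),
  labels = c("Low", "Medium", "High"),
  right  = FALSE
)

# Step 3: keep experimental participants faster than the overall mean.
mean_reaction_time <- mean(clean_data$reaction_time)
filtered_data <- subset(
  clean_data,
  condition == "Experimental" & reaction_time < mean_reaction_time
)
filtered_data
```

With dplyr, the same steps would typically use `mutate()` with `case_when()` for the categories and `filter()` for the final subset.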
Using the psych package, create a correlation plot for the simulated dataset created in Part 2. Include the following steps:
Use the corPlot() function to create a correlation plot.
library(psych) # provides corPlot()
# Keep only the numeric variables of interest
numeric_data <- clean_data %>%
  select(reaction_time, accuracy, anxiety_pre, anxiety_post, anxiety_change)
corPlot(cor(numeric_data, use = "pairwise.complete.obs"),
        numbers = TRUE, # Display correlation values
        upper = FALSE, # Show only lower triangle
        main = "Correlation Plot of Key Variables")
## Error in plot.new(): figure margins too large
There is a strong correlation between anxiety_pre and anxiety_post. Another strong correlation is between anxiety_change and anxiety_pre. The relationship between reaction time and accuracy also stands out. These correlations can guide further research in psychology; for example, studies could explore ways to reduce anxiety in high-pressure situations.
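When the plot cannot be drawn, the correlations can still be read from the matrix itself. The sketch below simulates data in the same spirit as the exam dataset, with post anxiety generated by subtracting a reduction from pre anxiety. The seed, sample size, and reduction values are assumptions, and this is not the exam dataset. It then prints the lower triangle of the correlation matrix.

```r
set.seed(123)   # arbitrary seed; these are NOT the exam data
n <- 200

anxiety_pre  <- rnorm(n, mean = 25, sd = 8)
# Post score = pre score minus a random reduction, floored at 0.
anxiety_post <- pmax(anxiety_pre - rnorm(n, mean = 5, sd = 3), 0)

sim <- data.frame(
  anxiety_pre,
  anxiety_post,
  anxiety_change = anxiety_pre - anxiety_post
)

# Pairwise correlations, with the upper triangle blanked out for readability.
cor_matrix <- cor(sim, use = "pairwise.complete.obs")
cor_matrix[upper.tri(cor_matrix)] <- NA
print(round(cor_matrix, 2), na.print = "")
```

Because anxiety_post is built from anxiety_pre, a strong correlation between the two is expected by construction. Since anxiety_change is computed from the other two variables, its correlations with them are partly built in as well.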
Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:
Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?
How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?
1. How does sleep schedule affect anxiety levels in teenagers? To research this, I would collect data on daily sleep time as well as self-reported anxiety levels using Google Forms. I would use correlation tests to see whether fewer hours of sleep are linked with higher anxiety. I could also use regression analysis. A potential measurement error that would need to be addressed is inaccuracy in the self-reported data from the Google Forms: people may underestimate or overestimate their anxiety levels or their hours of sleep.
2. Learning R for data analysis changed my view of psychological statistics because it showed me how useful R is for this kind of work. The advantages of R are that it is free and widely accessible, and that it has tools for working with collected data. The main challenge is learning to write the code, especially for people who are new to the program.
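The analysis described in the first answer could be sketched as follows. Everything here is simulated: the sample size, the assumed negative relationship, and the amount of self-report error are hypothetical choices for illustration, not findings.

```r
set.seed(2024)   # arbitrary seed
n <- 100

# Hypothetical true sleep hours and anxiety scores with a negative relationship.
sleep_hours <- rnorm(n, mean = 7, sd = 1.2)
anxiety     <- 40 - 2 * sleep_hours + rnorm(n, sd = 4)

# Self-reported sleep = true sleep plus reporting error (over- or underestimates).
sleep_reported <- sleep_hours + rnorm(n, sd = 0.75)

# Correlation test: is less reported sleep linked with higher anxiety?
cor_result <- cor.test(sleep_reported, anxiety)

# Simple linear regression: expected change in anxiety per extra hour of sleep.
fit <- lm(anxiety ~ sleep_reported)

cor_result$estimate
coef(fit)
```

In this simulation the slope estimated from reported sleep tends to be smaller in magnitude than the slope of -2 used to generate the data, which is one way self-report error can weaken an observed relationship.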
Be sure to knit your document to HTML format and check that all content is displayed correctly before submission. Publish your assignment to RPubs and submit the URL to Canvas.