Replace “Your Name” with your actual name.
Please complete this exam on your own. Include your R code, interpretations, and answers within this document.
Read Chapter 2 (Types of Data Psychologists Collect) and answer the following:
Nominal data are categories or labels with no order or ranking. Ordinal data have a meaningful order or ranking, but the intervals between values may not be equal. Interval data can be categorized and ranked, and the intervals between values are equal, but there is no true zero. Ratio data can be categorized and ranked, have equal intervals, and have a true zero point, meaning that zero represents the absence of the measured quantity. An example of nominal data is gender (female or male) or type of mental health disorder. An example of ordinal data is a rating on a depression questionnaire. An example of interval data is a score on an examination. An example of ratio data is a self-report of how frequently certain behaviors occur.
Nominal data are labels or categories with no inherent order or ranking because they are purely descriptive; they have no quantitative or numeric value. Ordinal data are categorized in a specific order or ranking, but the intervals are not equal because the distances between categories are uneven or unknown. Interval data have equal distances between values but no true zero, because zero is an arbitrary point and not a complete absence of the variable. Ratio data have equal intervals and a true zero, because zero means that none of the quantity being measured is present.
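The four scales can also be illustrated in R. The sketch below is only an illustration: every variable name and value in it is hypothetical and none of it comes from the exam dataset. It shows which operations are meaningful at each level of measurement.

```r
# Hypothetical example values; none of these come from the exam dataset.
diagnosis <- factor(c("Anxiety", "Depression", "Anxiety"))   # nominal
severity <- factor(c("Mild", "Severe", "Moderate"),
                   levels = c("Mild", "Moderate", "Severe"),
                   ordered = TRUE)                            # ordinal
temp_celsius <- c(20, 10, 30)                                 # interval
reaction_ms <- c(250, 500, 400)                               # ratio

# Nominal: only counting category membership is meaningful
table(diagnosis)

# Ordinal: order comparisons are meaningful
severity[2] > severity[1]

# Interval: differences are meaningful, ratios are not
# (20 degrees C is not "twice as hot" as 10 degrees C)
temp_celsius[1] - temp_celsius[2]

# Ratio: a true zero makes ratios meaningful
reaction_ms[2] / reaction_ms[1]
```

Storing ordinal data as an ordered factor, rather than as plain numbers, keeps R from treating the gaps between categories as equal.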
Referring to Chapter 3 (Measurement Errors in Psychological Research):
Random error arises from chance, while systematic error follows a consistent pattern. Random error mainly affects precision, that is, how closely repeated measurements taken under equivalent circumstances agree with one another. Data affected by systematic error are biased, and this type of error cannot be reduced or eliminated by taking repeated measures. An example of systematic error is a miscalibrated thermometer that consistently reads too high; it can be eliminated by recalibrating the instrument. An example of random error is like a coin toss: each result could be heads or tails, and there is no way of knowing in advance which will be obtained.
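A short simulation can make the contrast concrete. This is a minimal sketch under stated assumptions: the true value, the noise level, the size of the bias, and the seed are all hypothetical choices made for illustration.

```r
set.seed(42)           # arbitrary seed, for reproducibility only
true_temp <- 37        # hypothetical true value being measured
n_readings <- 10000

# Random error: noise centered on zero, so readings scatter around the truth
random_readings <- true_temp + rnorm(n_readings, mean = 0, sd = 0.5)

# Systematic error: a miscalibrated thermometer that reads 1.5 units too high,
# on top of the same kind of random noise
bias <- 1.5
biased_readings <- true_temp + bias + rnorm(n_readings, mean = 0, sd = 0.5)

# Averaging many readings shrinks the random error toward zero ...
mean(random_readings) - true_temp

# ... but leaves the systematic offset close to `bias`
mean(biased_readings) - true_temp
```

Taking more readings narrows the first difference further, while the second stays near the bias until the instrument itself is corrected.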
The code below creates a simulated dataset for a psychological experiment. Run the code chunk below without making any changes:
# Create a simulated dataset
set.seed(123) # For reproducibility
# Number of participants
n <- 50
# Create the data frame
data <- data.frame(
participant_id = 1:n,
reaction_time = rnorm(n, mean = 300, sd = 50),
accuracy = rnorm(n, mean = 85, sd = 10),
gender = sample(c("Male", "Female"), n, replace = TRUE),
condition = sample(c("Control", "Experimental"), n, replace = TRUE),
anxiety_pre = rnorm(n, mean = 25, sd = 8),
anxiety_post = NA # We'll fill this in based on condition
)
# Make the experimental condition reduce anxiety more than control
data$anxiety_post <- ifelse(
data$condition == "Experimental",
data$anxiety_pre - rnorm(n, mean = 8, sd = 3), # Larger reduction
data$anxiety_pre - rnorm(n, mean = 3, sd = 2) # Smaller reduction
)
# Ensure anxiety doesn't go below 0
data$anxiety_post <- pmax(data$anxiety_post, 0)
# Add some missing values for realism
data$reaction_time[sample(1:n, 3)] <- NA
data$accuracy[sample(1:n, 2)] <- NA
# View the first few rows of the dataset
head(data)
## participant_id reaction_time accuracy gender condition anxiety_pre
## 1 1 271.9762 87.53319 Female Control 31.30191
## 2 2 288.4911 84.71453 Female Experimental 31.15234
## 3 3 377.9354 84.57130 Female Experimental 27.65762
## 4 4 303.5254 98.68602 Male Control 16.93299
## 5 5 306.4644 82.74229 Female Control 24.04438
## 6 6 385.7532 100.16471 Female Control 22.75684
## anxiety_post
## 1 29.05312
## 2 19.21510
## 3 20.45306
## 4 13.75199
## 5 17.84736
## 6 19.93397
Now, perform the following computations:
# Reaction times copied from the head(data) output above
reaction_times <- c(271.9762, 288.4911, 377.9354, 303.5254, 385.7532)
mean(reaction_times)
median(reaction_times)
sd(reaction_times)
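Because the simulated dataset contains missing values, summary functions applied to a whole column return NA unless they are told to skip them. The sketch below uses a short hypothetical vector, `rt`, built from a few reaction times in the `head(data)` output with one value replaced by NA for illustration; it stands in for `data$reaction_time`.

```r
# Hypothetical stand-in for data$reaction_time, with one missing value
rt <- c(271.9762, 288.4911, NA, 303.5254, 306.4644)

mean(rt)                   # NA, because one value is missing
mean(rt, na.rm = TRUE)     # mean of the four observed values
median(rt, na.rm = TRUE)   # median of the four observed values
sd(rt, na.rm = TRUE)       # standard deviation of the four observed values
```

The same `na.rm = TRUE` argument works for the accuracy column, which also has missing values.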
Create a new variable called anxiety_change that represents the difference between pre and post anxiety scores (pre minus post). Then calculate the mean anxiety change for each condition.
The values 311.75 and 302.5 indicate no outliers.
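One way to carry out this step with dplyr is sketched below. The small data frame `demo` is a hypothetical stand-in that only borrows the column names of the simulated dataset; its values are made up for illustration.

```r
library(dplyr)

# Small hypothetical stand-in with the same column names as the simulated data
demo <- data.frame(
  condition    = c("Control", "Control", "Experimental", "Experimental"),
  anxiety_pre  = c(30, 20, 28, 24),
  anxiety_post = c(27, 18, 20, 15)
)

# New column: pre minus post, so positive values mean anxiety went down
demo <- demo %>%
  mutate(anxiety_change = anxiety_pre - anxiety_post)

# Mean change within each condition
change_by_condition <- demo %>%
  group_by(condition) %>%
  summarise(mean_change = mean(anxiety_change, na.rm = TRUE))

change_by_condition
```

Replacing `demo` with the exam's `data` object should apply the same steps to the full simulated dataset.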
Using the concepts from Chapter 4 (Descriptive Statistics and Basic Probability in Psychological Research):
pt(75,df)
pt(1,df)-pt(-1,df)
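In R, `df` on its own is the built-in F density function, so `pt()` needs a numeric degrees-of-freedom value supplied. The sketch below assumes a hypothetical value of 49 (n - 1 for a sample of 50), which is not given in the original. It also shows `pnorm()` for comparison, with a placeholder mean and standard deviation, in case the question concerns normally distributed raw scores.

```r
df_example <- 49   # hypothetical: n - 1 for a sample of 50; not given in the original

# P(T <= 75) for a t distribution: effectively 1, since 75 is far in the upper tail
p_below_75 <- pt(75, df = df_example)

# P(-1 < T < 1) for the t distribution
within_one_t <- pt(1, df = df_example) - pt(-1, df = df_example)

# The same interval under the standard normal, for comparison (about 0.683)
within_one_z <- pnorm(1) - pnorm(-1)

# If 75 is a raw score rather than a t statistic, pnorm() with an assumed
# mean and sd may be what is wanted; 70 and 10 are placeholders only
p_above_75 <- 1 - pnorm(75, mean = 70, sd = 10)
```

The t distribution has heavier tails than the normal, so slightly less of its probability falls within one unit of the center.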
---
## Part 3: Data Cleaning and Manipulation
### Question 5: Data Cleaning with dplyr
Using the dataset created in Part 2, perform the following data cleaning and manipulation tasks:
1. Remove all rows with missing values and create a new dataset called `clean_data`.
Create a new variable called performance_category that categorizes participants based on their accuracy:
## Error in parse(text = input): <text>:1:11: unexpected ','
## 1: medium (y),
## ^
Write your answer(s) here describing your data cleaning process.
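The cleaning steps in this question could be sketched as follows. The data frame `demo` is a hypothetical stand-in that reuses the dataset's column names. The cutoffs (90 and 70) and the "High" and "Low" labels are assumptions made for illustration, because the exam's own thresholds are not shown in this document.

```r
library(dplyr)

# Hypothetical stand-in with the same column names as the simulated data
demo <- data.frame(
  participant_id = 1:5,
  reaction_time  = c(272, NA, 378, 304, 306),
  accuracy       = c(95, 85, NA, 72, 64)
)

# Step 1: drop every row that has at least one missing value
clean_data <- na.omit(demo)

# Step 2: bin accuracy into categories; the cutoffs 90 and 70 are
# placeholders, not the thresholds set by the exam
clean_data <- clean_data %>%
  mutate(performance_category = case_when(
    accuracy >= 90 ~ "High",
    accuracy >= 70 ~ "Medium",
    TRUE           ~ "Low"
  ))

clean_data
```

`case_when()` checks its conditions in order, so each participant receives the first category whose condition is met.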
Using the psych package, create a correlation plot for the simulated dataset created in Part 2. Include the following steps:
Use the corPlot() function to create a correlation plot.
# Your code here. Hint: first, with dplyr create a new dataset that selects only the numeric variables (reaction_time, accuracy, anxiety_pre, anxiety_post, and anxiety_change if you created it).
Write your answer(s) here
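A minimal sketch of the steps in the hint is shown below, assuming the dplyr and psych packages are installed. The data frame `demo` is a hypothetical stand-in generated only for this illustration; its seed, sample size, and parameters are arbitrary.

```r
library(dplyr)
library(psych)

set.seed(1)   # arbitrary seed for this illustration
# Hypothetical stand-in for the simulated dataset
demo <- data.frame(
  gender        = sample(c("Male", "Female"), 30, replace = TRUE),
  reaction_time = rnorm(30, mean = 300, sd = 50),
  accuracy      = rnorm(30, mean = 85, sd = 10),
  anxiety_pre   = rnorm(30, mean = 25, sd = 8)
)
demo$anxiety_post <- demo$anxiety_pre - rnorm(30, mean = 5, sd = 3)

# Keep only the numeric columns
numeric_data <- demo %>%
  select(reaction_time, accuracy, anxiety_pre, anxiety_post)

# Pairwise correlations; missing values are dropped pair by pair
cor_matrix <- cor(numeric_data, use = "pairwise.complete.obs")

# Heat-map style correlation plot with the coefficients printed in each cell
corPlot(cor_matrix, numbers = TRUE, main = "Correlations among numeric variables")
```

Passing `use = "pairwise.complete.obs"` matters for the exam dataset, since reaction_time and accuracy contain missing values.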
Reflect on how the statistical concepts and R techniques covered in this course apply to psychological research:
Describe a specific research question in psychology that interests you. What type of data would you collect, what statistical analyses would be appropriate, and what potential measurement errors might you need to address?
How has learning R for data analysis changed your understanding of psychological statistics? What do you see as the biggest advantages and challenges of using R compared to other statistical software?
How does social media use affect adolescents' mental health over a three-month period? The data collected could include type of mental health diagnosis, gender, and age, which would allow researchers to analyze correlations among the variables. Systematic procedures for observing, describing, predicting, and explaining would help keep data collection objective and reliable. Using R has shown me how to organize code and data and how to write code to carry out calculations.
Be sure to knit your document to HTML format and check that all content is displayed correctly before submission. Publish your assignment to RPubs and submit the URL to Canvas.