Replace “Your Name” with your actual name.
In this lab, you will apply data transformation techniques, including mean-centering, calculating Z-scores, and performing non-linear transformations on various datasets. Please complete the exercises by filling in the code chunks and answering the interpretation questions. Once completed, knit this document to HTML and submit it as instructed.
study_hours <- c(15, 22, 18, 25, 20, 28, 24, 19, 23, 26)
mean_study <- mean(study_hours)
mean_centered <- study_hours - mean_study
# Plot original study hours
plot(study_hours, type = "b", col = "blue", main = "Original Study Hours", ylab = "Hours")
abline(h = mean_study, col = "red", lty = 2)
# Plot mean-centered study hours
plot(mean_centered, type = "b", col = "green", main = "Mean-Centered Study Hours", ylab = "Centered Hours")
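abline(h = 0, col = "red", lty = 2)
Interpretation: Mean-centering subtracts the mean from every value, so the average of the centered data is zero while the spread and the shape of the plot stay the same. Each centered value shows how far that observation lies above or below the mean, still in the original units (hours).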
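As a quick sanity check on the interpretation above, the following sketch confirms numerically that the centered values average to zero and that centering leaves the standard deviation unchanged. The names `centered_mean` and `sd_unchanged` are illustrative and not part of the lab template.

```r
# Same data as in the exercise
study_hours <- c(15, 22, 18, 25, 20, 28, 24, 19, 23, 26)
mean_centered <- study_hours - mean(study_hours)

# The centered values should average to zero (up to floating-point error)
centered_mean <- mean(mean_centered)

# Centering shifts the data but should leave the spread unchanged
sd_unchanged <- isTRUE(all.equal(sd(study_hours), sd(mean_centered)))

print(centered_mean)
print(sd_unchanged)
```

`all.equal()` is used instead of `==` because floating-point arithmetic can introduce tiny rounding differences.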
reaction_times <- c(350, 420, 310, 390, 370, 450, 380, 340, 400, 360)
mean_reaction <- mean(reaction_times)
sd_reaction <- sd(reaction_times)
z_scores <- (reaction_times - mean_reaction) / sd_reaction
# Plot Z-scores
plot(z_scores, type = "b", col = "purple", main = "Z-Scores of Reaction Times", ylab = "Z-Score")
abline(h = 0, col = "red", lty = 2)
Interpretation: Z-scores indicate how many standard deviations each reaction time is from the mean. A positive Z-score means a value above the mean, while a negative Z-score means a value below the mean.
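One way to double-check a manual Z-score calculation is to compare it with base R's `scale()` function, which by default centers a vector and divides by its sample standard deviation. The sketch below assumes the same `reaction_times` data; `z_manual`, `z_builtin`, and `methods_agree` are illustrative names.

```r
reaction_times <- c(350, 420, 310, 390, 370, 450, 380, 340, 400, 360)

# Manual Z-scores, as in the exercise
z_manual <- (reaction_times - mean(reaction_times)) / sd(reaction_times)

# scale() centers and divides by the sample standard deviation by default;
# it returns a one-column matrix, so convert it back to a plain vector
z_builtin <- as.numeric(scale(reaction_times))

methods_agree <- isTRUE(all.equal(z_manual, z_builtin))
print(methods_agree)

# Z-scores should have mean 0 and standard deviation 1
print(round(c(mean = mean(z_manual), sd = sd(z_manual)), 10))
```

If the two approaches agree, the Z-scores have mean 0 and standard deviation 1, which is what makes them comparable across variables measured in different units.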
sales <- c(200, 450, 700, 1200, 300, 800, 1100, 900, 400, 1500)
log_sales <- log(sales)
sqrt_sales <- sqrt(sales)
# Plot histograms
par(mfrow=c(1,3))
hist(sales, main = "Original Sales", col = "blue")
hist(log_sales, main = "Log-Transformed Sales", col = "green")
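hist(sqrt_sales, main = "Square Root Transformed Sales", col = "purple")
Interpretation: Log and square root transformations compress large values more than small ones, which pulls in a long right tail and reduces positive skew. The log is the stronger of the two. Whether the result looks closer to a normal distribution depends on the data, so compare the three histograms rather than assuming it.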
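Histograms of only ten values can be hard to read, so it may help to put a number on the skewness. Base R has no built-in skewness function; the sketch below defines a hypothetical helper, `skewness_moment()`, that uses the simple moment-based formula (third central moment divided by the 1.5 power of the second). Other packages use slightly different corrections, so treat the values as a rough guide.

```r
sales <- c(200, 450, 700, 1200, 300, 800, 1100, 900, 400, 1500)

# Hypothetical helper (not part of base R): moment-based sample skewness.
# Positive values indicate a longer right tail, negative a longer left tail.
skewness_moment <- function(x) {
  deviations <- x - mean(x)
  mean(deviations^3) / mean(deviations^2)^1.5
}

skew <- c(
  original = skewness_moment(sales),
  sqrt     = skewness_moment(sqrt(sales)),
  log      = skewness_moment(log(sales))
)
print(round(skew, 3))
```

A value near zero suggests a roughly symmetric distribution. A stronger transformation can overshoot and produce left skew, which is one reason to check the numbers as well as the plots.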
step_counts <- c(8000, 10500, 9200, 11500, 10000, 12500, 11000, 9500, 10200, 12000)
mean_step <- mean(step_counts)
sd_step <- sd(step_counts)
mean_centered_steps <- step_counts - mean_step
z_scores_steps <- (step_counts - mean_step) / sd_step
# Plot original, mean-centered, and Z-scores
par(mfrow=c(1,3))
plot(step_counts, type = "b", col = "blue", main = "Original Step Counts", ylab = "Steps")
plot(mean_centered_steps, type = "b", col = "green", main = "Mean-Centered Steps", ylab = "Centered Steps")
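plot(z_scores_steps, type = "b", col = "purple", main = "Z-Scores of Steps", ylab = "Z-Score")
Interpretation: The three plots have the same shape because mean-centering and Z-scoring are linear transformations; only the y-axis changes. The centered values show how many steps each day is above or below the average, and the Z-scores express the same deviations in standard-deviation units, which makes each day's relative standing easier to compare.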
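The relationship between the three versions of the data can also be checked numerically. This sketch, which reuses the `step_counts` data, assumes that a correlation of 1 with the original is a reasonable way to show that the pattern is preserved; `cor_centered`, `cor_z`, and `recovered` are illustrative names.

```r
step_counts <- c(8000, 10500, 9200, 11500, 10000, 12500, 11000, 9500, 10200, 12000)
mean_centered_steps <- step_counts - mean(step_counts)
z_scores_steps <- mean_centered_steps / sd(step_counts)

# Linear rescalings preserve the ordering and relative spacing of the points,
# so each version should correlate perfectly with the original
cor_centered <- cor(step_counts, mean_centered_steps)
cor_z <- cor(step_counts, z_scores_steps)

# Multiplying the Z-scores by the SD should recover the centered values
recovered <- isTRUE(all.equal(z_scores_steps * sd(step_counts), mean_centered_steps))

print(c(cor_centered = cor_centered, cor_z = cor_z))
print(recovered)
```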
Submission Instructions:
Knit your document to HTML and check that all content displays correctly before submitting. Submit the RPubs link to Canvas Assignments.