Lab #6: Exploring t-tests and Interaction Plots

library(readxl)
employeenumeric <- read_excel("C:/Users/rmase/OneDrive - University of North Carolina at Chapel Hill/Fall 2025/GEOG391/Lab 6/employeenumeric.xls")

After loading the data…

Question #1: The dimensions of the data set are 474 rows x 5 columns.

Question #2: The variables are gender, current salary, years of education, minority classification, and date of birth.

names(employeenumeric)
## [1] "Gender"                  "Current Salary"         
## [3] "Years of Education"      "Minority Classification"
## [5] "Date of Birth"

Question #3: After filtering to employees with 15 years of education, the dimensions of the data set are 116 rows x 5 columns.

Question #4: The null hypothesis is that mean salary does not differ by sex in the population; that is, the true difference in mean salary between women and men is 0.
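In symbols, the two-sided test can be stated as follows. The notation $\mu_f$ and $\mu_m$ for the population mean salaries of women and men is an assumption introduced here for illustration; the alternative matches the one reported by t.test() in the next section.

```latex
H_0:\ \mu_f - \mu_m = 0
\qquad \text{versus} \qquad
H_A:\ \mu_f - \mu_m \neq 0
```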

Filtering Data for Education and First T-test

new_education_data <- subset(employeenumeric, `Years of Education` == 15)
attach(new_education_data)
t.test(`Current Salary` ~ Gender, data = new_education_data)
## 
##  Welch Two Sample t-test
## 
## data:  Current Salary by Gender
## t = -5.0443, df = 102.38, p-value = 1.977e-06
## alternative hypothesis: true difference in means between group f and group m is not equal to 0
## 95 percent confidence interval:
##  -9024.884 -3930.779
## sample estimates:
## mean in group f mean in group m 
##        27050.00        33527.83

Question #5: For most people, 15 years of education likely means K-12 schooling plus a two-year associate's degree or technical program. Restricting the sample to employees with 15 years of education holds education constant, so the comparison is between people likely to have had reasonably equal opportunity in the job market.

Question #6: The t-statistic is -5.0443.
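To show where a Welch t-statistic comes from, here is a minimal sketch that computes it by hand and compares it with t.test(). The two salary vectors are made up for illustration and are not the lab data.

```r
# Made-up salaries for two hypothetical groups (NOT the lab data)
f <- c(25000, 27000, 26500, 28000, 29500)
m <- c(31000, 33500, 30000, 36000, 34500, 32000)

# Squared standard error of each group mean
se2_f <- var(f) / length(f)
se2_m <- var(m) / length(m)

# Welch t-statistic: difference in means divided by its standard error
t_manual <- (mean(f) - mean(m)) / sqrt(se2_f + se2_m)

# Welch-Satterthwaite approximation for the degrees of freedom
df_manual <- (se2_f + se2_m)^2 /
  (se2_f^2 / (length(f) - 1) + se2_m^2 / (length(m) - 1))

# t.test() uses the Welch test by default (var.equal = FALSE)
res <- t.test(f, m)
all.equal(t_manual, unname(res$statistic))
all.equal(df_manual, unname(res$parameter))
```

The statistic is negative whenever the first group's mean is lower than the second's, which is why the test above reports a negative t for group f minus group m.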

Question #7: The p-value is 1.977e-06 (about 0.000002).

Question #8: The limits of the 95% confidence interval are -9024.884 and -3930.779.

Question #9: The 95% confidence interval does not contain the value 0.

Question #10: The mean salary for women is 27050 and for men is 33527.83.

Question #11: Because the interval does not contain 0, there is a statistically significant difference in current salary by gender among employees with 15 years of education. We reject the null hypothesis at the 5% level and conclude that mean salary differs by gender in this group.
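The values used in Questions #6 through #11 can also be read directly from the object that t.test() returns, rather than copied from the printout. The sketch below uses simulated stand-in data; the data frame `toy` and its column names are hypothetical and are not the lab data.

```r
# Simulated stand-in data (hypothetical names and values, NOT the lab data)
set.seed(1)
toy <- data.frame(
  salary = c(rnorm(20, mean = 27000, sd = 3000),
             rnorm(20, mean = 33000, sd = 5000)),
  gender = rep(c("f", "m"), each = 20)
)

res <- t.test(salary ~ gender, data = toy)

res$statistic  # t-statistic
res$parameter  # Welch degrees of freedom
res$p.value    # two-sided p-value
res$conf.int   # 95% confidence limits for mean(f) - mean(m)
res$estimate   # the two group means

# For a two-sided test, the 95% interval excludes 0 exactly when p < 0.05
excludes_zero <- res$conf.int[1] > 0 || res$conf.int[2] < 0
excludes_zero == (res$p.value < 0.05)
```

This is why checking whether the interval contains 0 and comparing the p-value with 0.05 lead to the same decision.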

Second T-test

t.test(`Current Salary` ~ `Minority Classification`, data = new_education_data)
## 
##  Welch Two Sample t-test
## 
## data:  Current Salary by Minority Classification
## t = 2.4432, df = 59.458, p-value = 0.01755
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##   664.4916 6673.2519
## sample estimates:
## mean in group 0 mean in group 1 
##        32507.33        28838.46

Question #12: The t-statistic is 2.4432.

Question #13: The p-value is 0.01755.

Question #14: The limits of the 95% confidence interval are 664.4916 and 6673.2519.

Question #15: This interval does not contain 0.

Question #16: The mean salaries for non-minorities and minorities are 32507.33 and 28838.46, respectively.
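As a quick consistency check on the reported numbers, the difference between the two group means should sit at the centre of the confidence interval. A short sketch using only values copied from the output above:

```r
# Values copied from the second t.test() output
mean_group0 <- 32507.33           # non-minority
mean_group1 <- 28838.46           # minority
ci <- c(664.4916, 6673.2519)      # 95% confidence limits

diff_means  <- mean_group0 - mean_group1  # point estimate of the difference
ci_midpoint <- mean(ci)                   # centre of the interval
margin      <- diff(ci) / 2               # half-width (margin of error)

diff_means
ci_midpoint
margin
```

The point estimate and the midpoint agree up to rounding in the printed output, and the interval is the estimate plus or minus the margin.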

Question #17: We can conclude that there is a statistically significant difference in mean salary by minority classification.

Separating the Men

men_data <- subset(new_education_data, Gender == "m")
t.test(`Current Salary` ~ `Minority Classification`, data = men_data)
## 
##  Welch Two Sample t-test
## 
## data:  Current Salary by Minority Classification
## t = 2.4005, df = 40.643, p-value = 0.02104
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##   702.7293 8164.9289
## sample estimates:
## mean in group 0 mean in group 1 
##        34489.38        30055.56

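Question #18: Among men, there is a significant difference in mean salary by minority classification at 95% confidence (p = 0.02104; the interval does not contain 0).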

Separating the Women

women_data <- subset(new_education_data, Gender == "f")
t.test(`Current Salary` ~ `Minority Classification`, data = women_data)
## 
##  Welch Two Sample t-test
## 
## data:  Current Salary by Minority Classification
## t = 0.62398, df = 11.646, p-value = 0.5447
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -3139.546  5647.546
## sample estimates:
## mean in group 0 mean in group 1 
##           27354           26100

Question #19: Among women, there is no significant difference in mean salary by minority classification at 95% confidence (p = 0.5447; the interval contains 0).

Table of Salaries

Mean Salaries    Male        Female
Non-Minority     34489.38    27354
Minority         30055.56    26100
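A table like this can also be built in one step with tapply() instead of being assembled from four separate t-test outputs. This is a minimal sketch on toy data; the data frame `toy`, its column names, and its salaries are made up for illustration.

```r
# Toy data with hypothetical column names and made-up salaries (NOT the lab data)
toy <- data.frame(
  salary   = c(30000, 32000, 28000, 29000, 26000, 27000, 25000, 25500),
  gender   = c("m", "m", "m", "m", "f", "f", "f", "f"),
  minority = c(0, 0, 1, 1, 0, 0, 1, 1)
)

# Mean salary in each cell: rows = minority classification, columns = gender
cell_means <- with(
  toy,
  tapply(salary, list(minority = minority, gender = gender), mean)
)
cell_means
```

With the lab data, the same call would take the salary, minority, and gender columns of the filtered data frame.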

Interaction Plot

with(new_education_data, interaction.plot(Gender, `Minority Classification`, `Current Salary`))

Question #23: The graph shows that mean salary varies with both sex and minority status. Women earn less than men in both minority groups, and within each sex non-minority employees earn more than minority employees. The lines are not parallel: the gap by minority status is larger for men (34489.38 vs. 30055.56) than for women (27354 vs. 26100), which suggests an interaction between sex and minority status. The gap among women, however, was not statistically significant in the t-test for women above.
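The size of the interaction can be put into numbers from the four subgroup means reported in the t-test outputs above. The sketch below is only arithmetic on those means; it is not a significance test of the interaction.

```r
# Cell means copied from the subgroup t-test outputs above
means <- matrix(
  c(34489.38, 27354,
    30055.56, 26100),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Non-Minority", "Minority"), c("Male", "Female"))
)

# Salary gap by minority status within each sex
gap_by_minority <- means["Non-Minority", ] - means["Minority", ]
gap_by_minority

# Interaction contrast: how much larger the gap is for men than for women
interaction_contrast <- unname(gap_by_minority["Male"] - gap_by_minority["Female"])
interaction_contrast
```

If the two lines in the plot were parallel, the two gaps would be equal and the contrast would be 0.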