library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
library(janitor)
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
employeenumeric <- read_excel("employeenumeric.xls")
1. What are the dimensions of this dataset?
There are five columns and 474 rows in this dataset.
2. What variables are included in the dataset?
Gender, Current Salary, Years of Education, Minority Classification and Date of Birth.
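As a minimal sketch of how answers 1 and 2 can be read off in R, the block below uses a small made-up data frame named `toy` (hypothetical, not the real spreadsheet) so that it runs without the Excel file. `dim()` returns the number of rows first and the number of columns second, and `names()` lists the variables.

```r
# Hypothetical stand-in for the spreadsheet; all values are made up.
toy <- data.frame(
  Gender               = c("m", "f", "f"),
  `Current Salary`     = c(30000, 25000, 27000),
  `Years of Education` = c(15, 12, 15),
  check.names = FALSE   # keep the spaces in the column names
)

dim(toy)     # number of rows, then number of columns
nrow(toy)    # rows only
ncol(toy)    # columns only
names(toy)   # variable names
```

Running the same three calls on `employeenumeric` should give the counts and variable names reported above.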
fifteenyears <- employeenumeric %>%
clean_names() %>%
filter(years_of_education==15)
3. What are the dimensions of the new dataframe you created?
There are five columns and 116 rows.
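The row count after filtering can be checked the same way. This is only a sketch on made-up values: `toy` and `fifteen_toy` are hypothetical names, and base R's `subset()` is used here in place of `dplyr::filter()` so the block has no package dependencies.

```r
# Hypothetical data; the salaries and education values are made up.
toy <- data.frame(
  years_of_education = c(15, 12, 15, 16, 15),
  current_salary     = c(30000, 25000, 27000, 40000, 31000)
)

# Keep only the rows with exactly 15 years of education.
fifteen_toy <- subset(toy, years_of_education == 15)

dim(fifteen_toy)   # filtering drops rows but keeps every column
```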
attach(fifteenyears)
fifteenttest <- t.test(current_salary[gender=="m"], current_salary[gender=="f"])
fifteenttest
##
## Welch Two Sample t-test
##
## data: current_salary[gender == "m"] and current_salary[gender == "f"]
## t = 5.0443, df = 102.38, p-value = 1.977e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3930.779 9024.884
## sample estimates:
## mean of x mean of y
## 33527.83 27050.00
4. What is the null hypothesis?
The null hypothesis is that the true difference in mean salary between men and women with 15 years of education is 0, meaning the two population means are equal. The alternative hypothesis is that the true difference in means is not equal to 0.
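Written out in symbols, the two hypotheses and the test statistic look like this. The notation is my own assumption: the subscripts m and f stand for men and women, μ is a population mean, x̄ a sample mean, s² a sample variance, and n a group size.

```latex
\[
\begin{aligned}
H_0 &: \mu_m - \mu_f = 0 \\
H_1 &: \mu_m - \mu_f \neq 0 \\
t &= \frac{\bar{x}_m - \bar{x}_f}{\sqrt{s_m^2 / n_m + s_f^2 / n_f}}
\end{aligned}
\]
```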
5. Why do you think I asked you to include only samples with 15 years of education?
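Restricting the sample to employees with 15 years of education holds education constant, so any salary gap between men and women cannot be explained by differences in schooling. Fifteen years roughly corresponds to a three-year bachelor’s program, or to a longer degree that was not completed. More advanced degrees, such as a master’s, can lead to higher-paying jobs depending on the field, so comparing people at the same education level puts the two groups on an even footing.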
6. What is the t-statistic for the difference in salaries between men and women with 15 years of education?
5.0443
7. What is the p-value?
1.977e-06
8. What are the limits of the 95% CI?
3930.779 and 9024.884
9. Does the 95% CI contain the value 0?
No. Both limits are positive, so the interval does not include 0.
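The numbers in answers 6 through 9 do not have to be copied from the printout, because `t.test()` returns a list with named elements. The sketch below shows this on two made-up salary vectors (`men_toy` and `women_toy` are hypothetical), including a direct check of whether the interval contains 0.

```r
# Hypothetical salaries; these are not the values from the dataset.
men_toy   <- c(31000, 33500, 36000, 34000, 32500)
women_toy <- c(26000, 27500, 28000, 26500, 27000)

res <- t.test(men_toy, women_toy)   # Welch test is the default

res$statistic   # t-statistic
res$p.value     # two-sided p-value
res$conf.int    # lower and upper limits of the 95% CI
res$estimate    # the two group means

# The interval contains 0 only if the lower limit is at or below 0
# and the upper limit is at or above 0.
ci <- res$conf.int
contains_zero <- ci[1] <= 0 && ci[2] >= 0
contains_zero
```

The same element names should work on `fifteenttest`, for example `fifteenttest$conf.int`.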
10. What are the mean salaries for men and women with 15 years of education?
Men: $33,527.83
Women: $27,050.00
11. Referring back to your null and alternative hypotheses, what do you conclude from the results of this test?
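I conclude that I should reject the null hypothesis. The p-value of 1.977e-06 is far below 0.05 and the 95% CI does not contain 0, so the difference in mean salary between men and women with 15 years of education is statistically significant. A gap this large would be very unlikely if the true means were equal.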
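As a side note, the same test can be run without `attach()`, either by wrapping the call in `with()` or by using the formula interface of `t.test()`. The sketch below compares the two on a made-up data frame (`toy` is hypothetical).

```r
# Hypothetical data; the salaries are made up.
toy <- data.frame(
  gender         = rep(c("m", "f"), each = 5),
  current_salary = c(31000, 33500, 36000, 34000, 32500,
                     26000, 27500, 28000, 26500, 27000)
)

# Option 1: with() evaluates the column names inside the data frame.
by_vectors <- with(toy, t.test(current_salary[gender == "m"],
                               current_salary[gender == "f"]))

# Option 2: the formula interface splits the salaries by gender.
by_formula <- t.test(current_salary ~ gender, data = toy)

by_vectors$statistic
by_formula$statistic
```

The formula interface orders the groups alphabetically, so it computes women minus men and the t-statistic and confidence limits change sign. The p-value and the conclusion stay the same.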
attach(fifteenyears)
## The following objects are masked from fifteenyears (pos = 3):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
minorityttest <- t.test(current_salary[minority_classification=="0"], current_salary[minority_classification=="1"])
minorityttest
##
## Welch Two Sample t-test
##
## data: current_salary[minority_classification == "0"] and current_salary[minority_classification == "1"]
## t = 2.4432, df = 59.458, p-value = 0.01755
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 664.4916 6673.2519
## sample estimates:
## mean of x mean of y
## 32507.33 28838.46
12. What is the t-statistic for the difference in salaries between minority and non-minority respondents with 15 years of education?
The t-statistic of this test is 2.4432.
13. What is the p-value?
The p-value is 0.01755.
14. What are the limits of the 95% CI?
664.4916 and 6673.2519
15. Does the 95% CI contain the value 0?
No. Both limits are positive, so the interval does not include 0.
16. What are the mean salaries for minorities and non-minorities with 15 years of education?
Non-minority: $32,507.33
Minority: $28,838.46
17. What do you conclude from the results of the t-test?
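I conclude that I should reject the null hypothesis that the true difference in mean salary between non-minorities and minorities with 15 years of education is 0. The p-value of 0.01755 is below 0.05 and the 95% CI does not contain 0, so the difference is statistically significant.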
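To see where the t-statistic, degrees of freedom, p-value and confidence limits come from, the Welch test can be reproduced by hand and compared with `t.test()`. This is a sketch on made-up vectors (`non_minority_toy` and `minority_toy` are hypothetical), not a recomputation of the results above.

```r
# Hypothetical salaries; these are not the values from the dataset.
non_minority_toy <- c(33000, 31500, 34000, 32000, 30500, 33500)
minority_toy     <- c(29000, 27500, 30000, 28500)

# Variance of each sample mean.
v_x <- var(non_minority_toy) / length(non_minority_toy)
v_y <- var(minority_toy) / length(minority_toy)

diff_means <- mean(non_minority_toy) - mean(minority_toy)
se         <- sqrt(v_x + v_y)
t_manual   <- diff_means / se

# Welch-Satterthwaite approximation for the degrees of freedom.
df_manual <- (v_x + v_y)^2 /
  (v_x^2 / (length(non_minority_toy) - 1) +
   v_y^2 / (length(minority_toy) - 1))

p_manual  <- 2 * pt(-abs(t_manual), df_manual)               # two-sided
ci_manual <- diff_means + c(-1, 1) * qt(0.975, df_manual) * se

res <- t.test(non_minority_toy, minority_toy)
all.equal(t_manual, unname(res$statistic))
all.equal(ci_manual, as.numeric(res$conf.int))
```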
18. Compare, using a t-test for the difference in means, minority vs. non-minority men. Is there a significant difference at 95%?
men <- employeenumeric %>%
clean_names() %>%
filter(gender=="m")
attach(men)
## The following objects are masked from fifteenyears (pos = 3):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from fifteenyears (pos = 4):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
menttest <- t.test(current_salary[minority_classification=="0"], current_salary[minority_classification=="1"])
menttest
##
## Welch Two Sample t-test
##
## data: current_salary[minority_classification == "0"] and current_salary[minority_classification == "1"]
## t = 5.5845, df = 168.79, p-value = 9.209e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 7906.222 16552.415
## sample estimates:
## mean of x mean of y
## 44475.41 32246.09
Yes, there is a significant difference between minority and non-minority men at the 95% confidence level. The p-value is 9.209e-08 and the 95% CI does not contain 0, which means a difference in means this large would be extremely unlikely if the true means were equal.
19. Compare, using a t-test for the difference in means, minority vs. non-minority women. Is there a significant difference at 95%?
women <- employeenumeric %>%
clean_names() %>%
filter(gender=="f")
attach(women)
## The following objects are masked from men:
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from fifteenyears (pos = 4):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from fifteenyears (pos = 5):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
womenttest <- t.test(current_salary[minority_classification=="0"], current_salary[minority_classification=="1"])
womenttest
##
## Welch Two Sample t-test
##
## data: current_salary[minority_classification == "0"] and current_salary[minority_classification == "1"]
## t = 4.1825, df = 121.34, p-value = 5.478e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1919.316 5369.264
## sample estimates:
## mean of x mean of y
## 26706.79 23062.50
Yes, there is a significant difference between minority and non-minority women at the 95% confidence level. The p-value is 5.478e-05 and the 95% CI does not contain 0, which means a difference in means this large would be extremely unlikely if the true means were equal.
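Questions 18 and 19 run the same test on two subsets, so one possible shortcut is to split the data by gender and apply the test to each piece. The sketch below does this on a made-up data frame (`toy` is hypothetical) and avoids attaching several data frames with identical column names.

```r
# Hypothetical data; the salaries are made up.
toy <- data.frame(
  gender                  = rep(c("m", "f"), each = 6),
  minority_classification = rep(c(0, 0, 0, 1, 1, 1), times = 2),
  current_salary          = c(44000, 46000, 43000, 32000, 33000, 31000,
                              27000, 26000, 28000, 23000, 24000, 22000)
)

# One Welch test per gender: non-minority (0) versus minority (1).
results <- lapply(split(toy, toy$gender), function(d) {
  t.test(current_salary ~ minority_classification, data = d)
})

# Collect the p-value from each test.
sapply(results, function(r) r$p.value)
```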
20.
| Mean Salaries | Male | Female |
| --- | --- | --- |
| Non-Minority | $44,475.41 | $26,706.79 |
| Minority | $32,246.09 | $23,062.50 |
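A table like this can also be produced in one step instead of copying the means from four test outputs. The sketch below uses `tapply()` on a made-up data frame (`toy` is hypothetical), with minority status in the rows and gender in the columns.

```r
# Hypothetical data; the salaries are made up.
toy <- data.frame(
  gender                  = rep(c("m", "f"), each = 6),
  minority_classification = rep(c(0, 0, 0, 1, 1, 1), times = 2),
  current_salary          = c(44000, 46000, 43000, 32000, 33000, 31000,
                              27000, 26000, 28000, 23000, 24000, 22000)
)

# Mean salary in each minority-by-gender cell.
means <- with(toy, tapply(current_salary,
                          list(minority = minority_classification,
                               gender   = gender),
                          mean))
round(means, 2)
```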
attach(fifteenyears)
## The following objects are masked from women:
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from men:
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from fifteenyears (pos = 5):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
## The following objects are masked from fifteenyears (pos = 6):
##
## current_salary, date_of_birth, gender, minority_classification,
## years_of_education
interaction.plot(gender, minority_classification, current_salary)
21. Include the plot and describe what it tells you about how differences in salary relate to the interaction between sex and minority status for people with 15 years of education.
The dotted line is higher than the solid line for both women and men, which means that non-minority employees with 15 years of education earn more on average than minority employees with the same education. Both lines slope upward from women to men, which indicates that men with 15 years of education earn more on average than women with 15 years of education. Because the lines do not cross, the direction of the minority gap is the same for both sexes. How close the lines are to parallel shows whether the size of that gap differs between men and women.
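The plot draws one point per cell mean, so the interaction can also be described with numbers. As a rough sketch on made-up data (`toy` is hypothetical), the block below computes the four cell means and compares the minority gap for men with the gap for women. A difference of 0 between the two gaps would correspond to perfectly parallel lines.

```r
# Hypothetical 15-years-of-education subset; the salaries are made up.
toy <- data.frame(
  gender                  = rep(c("m", "f"), each = 4),
  minority_classification = rep(c(0, 0, 1, 1), times = 2),
  current_salary          = c(34000, 36000, 30000, 32000,
                              27000, 29000, 25000, 27000)
)

# Rows are gender, columns are minority status (0 or 1).
cell_means <- with(toy, tapply(current_salary,
                               list(gender, minority_classification),
                               mean))

# Non-minority minus minority, within each gender.
gap_men   <- cell_means["m", "0"] - cell_means["m", "1"]
gap_women <- cell_means["f", "0"] - cell_means["f", "1"]

gap_men - gap_women   # 0 would mean the two lines are parallel
```

The plot itself can be drawn without `attach()` by wrapping the call, as in `with(fifteenyears, interaction.plot(gender, minority_classification, current_salary))`.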