library(ggplot2)
library(readxl)
library(dplyr)  # provides %>% and filter(), both used below
employeenumeric <- read_excel("C:/Users/Lindsey Raye Hyde/Downloads/employeenumeric.xls")
View(employeenumeric)
The dataset has 474 rows and 5 columns.
Variables included in the dataset are gender, current salary, years of education, minority classification, and date of birth.
The new data frame, restricted to respondents with 15 years of education, has 116 rows and 5 columns.
employee15 <- employeenumeric %>%
filter(`Years of Education` == 15)
employeenumeric[employeenumeric[,3]==15,]
## # A tibble: 116 × 5
## Gender `Current Salary` `Years of Education` `Minority Classification`
## <chr> <dbl> <dbl> <dbl>
## 1 m 57000 15 0
## 2 m 45000 15 0
## 3 m 32100 15 0
## 4 m 36000 15 0
## 5 f 27900 15 0
## 6 m 27750 15 1
## 7 f 35100 15 1
## 8 m 46000 15 0
## 9 f 24000 15 1
## 10 f 21150 15 1
## # ℹ 106 more rows
## # ℹ 1 more variable: `Date of Birth` <dbl>
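The row and column counts quoted above can be checked with dim(), which returns rows first and then columns. The following is a minimal sketch on a small made-up data frame (the names toy and years_of_education and all of its values are hypothetical, not taken from employeenumeric); it uses the same logical-subsetting idea as the bracket expression above.

```r
# Hypothetical toy data, for illustration only (not the employee dataset)
toy <- data.frame(
  id = 1:6,
  years_of_education = c(12, 15, 15, 16, 15, 8)
)

# Keep only the rows where years of education equals 15
toy15 <- toy[toy$years_of_education == 15, ]

# dim() reports c(number of rows, number of columns)
print(dim(toy))
print(dim(toy15))
```

Running dim(employeenumeric) and dim(employee15) in the same way should give the row and column counts for the real data.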
The null hypothesis is that the true difference in mean salary between men and women with 15 years of education is 0.
attach(employee15)
t.test(`Current Salary`[Gender == "m"], `Current Salary`[Gender == "f"])
##
## Welch Two Sample t-test
##
## data: `Current Salary`[Gender == "m"] and `Current Salary`[Gender == "f"]
## t = 5.0443, df = 102.38, p-value = 1.977e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3930.779 9024.884
## sample estimates:
## mean of x mean of y
## 33527.83 27050.00
detach(employee15)
I think you asked us to include only respondents with 15 years of education because that group has 116 observations, which is a reasonably large sample, and because by 15 years of education salary levels may have leveled off or become more even between genders.
The t-statistic for the difference in salaries between men and women with 15 years of education is 5.0443.
The p-value is 0.000001977 (1.977e-06).
The limits of the 95% confidence interval are 3930.779 and 9024.884.
This 95% confidence interval does not include 0 within its bounds.
The mean salaries for men and women with 15 years of education are 33527.83 and 27050.00, respectively.
I would reject the null hypothesis and conclude that there is evidence that mean salary differs by sex among employees with 15 years of education.
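As a check on how these numbers fit together, the confidence interval and p-value can be approximately rebuilt from the reported means, t-statistic, and degrees of freedom. This is a sketch that uses only the rounded values printed in the output above, so small rounding differences from the printed interval are expected; the variable names are my own.

```r
# Values copied from the Welch t-test output for salary by gender
mean_m   <- 33527.83
mean_f   <- 27050.00
t_stat   <- 5.0443
df_welch <- 102.38

diff_means <- mean_m - mean_f                 # point estimate of the difference
se_diff    <- diff_means / t_stat             # since t = difference / standard error
t_crit     <- qt(0.975, df_welch)             # two-sided 95% critical value
ci         <- diff_means + c(-1, 1) * t_crit * se_diff
p_value    <- 2 * pt(-abs(t_stat), df_welch)  # two-sided p-value

print(round(ci, 1))
print(signif(p_value, 3))
```

The interval is the difference in means plus or minus the critical value times the standard error, which is why an interval that excludes 0 goes together with a p-value below 0.05.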
attach(employee15)
t.test(`Current Salary`[`Minority Classification` == "0"], `Current Salary`[`Minority Classification` == "1"])
##
## Welch Two Sample t-test
##
## data: `Current Salary`[`Minority Classification` == "0"] and `Current Salary`[`Minority Classification` == "1"]
## t = 2.4432, df = 59.458, p-value = 0.01755
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 664.4916 6673.2519
## sample estimates:
## mean of x mean of y
## 32507.33 28838.46
detach(employee15)
The t-statistic for the difference in salaries between minority and non-minority respondents with 15 years of education is 2.4432.
The p-value is 0.01755.
The limits of the 95% confidence interval are 664.4916 and 6673.2519.
This 95% confidence interval does not contain the value of 0.
The mean salaries for non-minorities and minorities with 15 years of education are 32507.33 and 28838.46, respectively.
With a p-value of less than 0.05 and a 95% CI that does not contain zero, I would reject the null hypothesis and conclude that mean salary differs between minority and non-minority respondents with 15 years of education.
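The tests in this report pull out each group's salaries after attach(). An alternative that avoids attach() and detach() is the formula interface of t.test(). The sketch below runs both forms on a small made-up data frame (toy, salary, and gender are hypothetical names and values, not the employee data) to show that they give the same test with the sign of the difference flipped, because the formula interface orders the groups alphabetically ("f" before "m").

```r
# Hypothetical toy data, for illustration only (not the employee dataset)
toy <- data.frame(
  salary = c(30000, 32000, 35000, 31000, 26000, 27000, 25000, 28000),
  gender = rep(c("m", "f"), each = 4)
)

# Vector interface, as used in this report: men first, then women
by_vectors <- t.test(toy$salary[toy$gender == "m"],
                     toy$salary[toy$gender == "f"])

# Formula interface: no attach() needed; groups are ordered f, then m
by_formula <- t.test(salary ~ gender, data = toy)

print(by_vectors$statistic)
print(by_formula$statistic)
print(all.equal(by_vectors$p.value, by_formula$p.value))
```

On the real data, the same idea would presumably be written with the column names in backticks, for example a formula with `Current Salary` on the left and Gender on the right and data = employee15.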
attach(employee15)
t.test(`Current Salary`[`Minority Classification` == "0" & Gender == "m"], `Current Salary`[`Minority Classification` == "1" & Gender == "m"])
##
## Welch Two Sample t-test
##
## data: `Current Salary`[`Minority Classification` == "0" & Gender == "m"] and `Current Salary`[`Minority Classification` == "1" & Gender == "m"]
## t = 2.4005, df = 40.643, p-value = 0.02104
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 702.7293 8164.9289
## sample estimates:
## mean of x mean of y
## 34489.38 30055.56
detach(employee15)
The mean salaries for non-minority and minority men with 15 years of education are 34489.38 and 30055.56, respectively. This is a difference of about $4,400, which is statistically significant at the 0.05 level (p = 0.02104), and the value of 0 is not within the bounds of the 95% confidence interval.
attach(employee15)
t.test(`Current Salary`[`Minority Classification` == "0" & Gender == "f"], `Current Salary`[`Minority Classification` == "1" & Gender == "f"])
##
## Welch Two Sample t-test
##
## data: `Current Salary`[`Minority Classification` == "0" & Gender == "f"] and `Current Salary`[`Minority Classification` == "1" & Gender == "f"]
## t = 0.62398, df = 11.646, p-value = 0.5447
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3139.546 5647.546
## sample estimates:
## mean of x mean of y
## 27354 26100
detach(employee15)
The mean salaries for non-minority and minority women with 15 years of education are 27354 and 26100, respectively. This is a difference of a little over $1,000, which is not statistically significant (p = 0.5447), unlike our earlier findings. Also, the value of 0 is within the bounds of the 95% confidence interval.
Mean salaries (male, female):
Non-minority = 34,489 and 27,354
Minority = 30,056 and 26,100
attach(employee15)
interaction.plot(Gender, `Minority Classification`, `Current Salary`)
detach(employee15)
Females in this sample make less money overall, and minority females have the lowest mean salary of the four subgroups. Non-minority males have the highest mean salary of the four subgroups. The minority and non-minority lines in the plot do not cross, which tells us that minorities have a lower mean salary than non-minorities among both men and women. The lines are not parallel, however: the gap between non-minority and minority salaries is much larger for men (about $4,400) than for women (about $1,250).
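The gaps described above can be worked out directly from the four reported group means. This is a small sketch that only does arithmetic on the means printed in the t-test output; the object names are my own, and it describes the pattern in the plot without testing whether the interaction itself is statistically significant.

```r
# Group means copied from the t-test output (rows: gender, columns: minority code)
means <- matrix(
  c(34489.38, 27354,    # minority = 0 (non-minority): men, women
    30055.56, 26100),   # minority = 1 (minority): men, women
  nrow = 2,
  dimnames = list(Gender = c("m", "f"), Minority = c("0", "1"))
)

# Non-minority minus minority salary, within each gender
minority_gap <- means[, "0"] - means[, "1"]

# Male minus female salary, within each minority group
gender_gap <- means["m", ] - means["f", ]

# How much larger the minority gap is for men than for women
gap_difference <- unname(minority_gap["m"] - minority_gap["f"])

print(minority_gap)
print(gender_gap)
print(gap_difference)
```

If the two lines in the interaction plot were parallel, gap_difference would be close to 0; a large value is what the non-parallel lines show.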