7/29/2015

Introduction

Our Data

The original data set used for this project, CPS85, comes from the 1985 Current Population Survey (CPS), which is used to collect data between census years. CPS85 contains hourly wages (in US dollars) for both men and women, along with variables such as years of education, years of work experience, race, sex, marriage status, age, sector of work, region of residence, and union membership. We used a subset of this data set, CPS2, which excludes one outlier from the original data.

Research Questions

We have four questions:

  • Do women, on average, make less money than men because they choose jobs in lower-paying sectors?

  • Do women, on average, make less money than men because they have a lower level of education?

  • Do women, on average, make less money than men because they have less experience?

  • Do women, on average, make less money than men because of their marriage status?

Methods

Confounding Factor One: Education

Variable Analysis

Sex: Factor Variable

Education: Numerical Variable

Wage: Numerical Variable

Methods for Education

Methods to correct for confounding factor:

  • Numerical:
    • xtabs()
    • predict()
    • chisqtestGC()
  • Graphical:
    • Histogram
    • Density plot

Results

xtabs()

We used xtabs() to see whether there was any association between sex and level of education. Note that this table uses the full CPS85 data, so it still includes the outlier.

xtabs(~sex+educ, data=CPS85)
##    educ
## sex   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
##   F   0   0   0   0   2   1   9   5   5  11 110  16  23   6  34  14   9
##   M   1   1   1   1   1   4   6   7  12  16 109  21  33   7  37  10  22

We crossed "sex" with "level of education" and saw that more men than women in the sample have between two and seven years of education (nine men versus three women).
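The table can also be summarized numerically. The following is a minimal sketch in base R that re-enters the counts from the xtabs() output above by hand (so it needs neither the data set nor any package) and computes the mean years of education and the share with at least 12 years for each sex. The object names (`counts`, `mean_educ`, `share_12_plus`) are ours and are not part of the original analysis.

```r
# Counts copied by hand from the xtabs(~sex+educ, data=CPS85) output above.
# Columns are years of education (2 through 18).
educ_years <- 2:18
counts <- rbind(
  F = c(0, 0, 0, 0, 2, 1, 9, 5, 5, 11, 110, 16, 23, 6, 34, 14, 9),
  M = c(1, 1, 1, 1, 1, 4, 6, 7, 12, 16, 109, 21, 33, 7, 37, 10, 22)
)
colnames(counts) <- educ_years

# Weighted mean of years of education for each sex.
mean_educ <- as.vector(counts %*% educ_years) / rowSums(counts)
names(mean_educ) <- rownames(counts)

# Share of each sex with at least 12 years of education.
share_12_plus <- rowSums(counts[, educ_years >= 12]) / rowSums(counts)

print(round(mean_educ, 2))
print(round(share_12_plus, 3))
```

Both means come out at roughly 13 years, which suggests that a difference in average education is unlikely to account for the wage gap on its own.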

predict()

Next, we used predict() to compare men's and women's wages at the same level of education.

predict(modEducMale, x = 14)
## Predict wage is about 10.67,
## give or take 4.954 or so for chance variation.
predict(modEducFemale, x = 14)
## Predict wage is about 8.543,
## give or take 3.585 or so for chance variation.

We used the models "modEducMale" and "modEducFemale" to predict men's and women's wages, which yielded some interesting results. The predicted wage for men was consistently at least $2 higher than for women; at 14 years of education, the gap is about $2.13 ($10.67 versus $8.54). However, the chance variation for men's wages was considerably higher than for women's ($4.95 versus $3.59).

histogram()

Then we used two histograms, one to graph the relationship between gender and education…

… and the other to graph wage vs. sex.

histogram(~wage | sex, data=CPS2)

The second histogram shows that more women than men make a wage of $5 to $6 per hour. A considerable number of men make more than $15 per hour, while few women do.

densityplot()

Finally, we used a density plot to show the distribution of wages by sex and level of education.

densityplot(~wage | educ + sex, data=CPS2)

This density plot displays three variables: sex, wage, and education. Our results suggested that the women in the sample are not less educated than the men: in the xtabs() table above, about 87% of women versus 83% of men have at least 12 years of education.

Methods

Confounding Factor Two: Experience

Variable Analysis

Sex: Factor Variable

Experience: Numerical Variable

Wage: Numerical Variable

Methods for Experience

Methods to correct for confounding factor:

  • Numerical:
    • favstats()
  • Graphical:
    • Scatterplot
    • Density plot

Results

xyplot()

My first R chunk examined the relationship between wage and experience using an xyplot. The graph shows that, in this sample, having more experience does not mean a person receives a higher wage, so wage and experience do not appear to be strongly related.

favstats()

Next I used favstats() to summarize experience by sex, and then wage by sex. On average, females had almost two more years of experience than males (1.94 years, to be specific).

favstats(exper~sex,data=CPS2)
##   sex min Q1 median Q3 max     mean       sd   n missing
## 1   F   0  9     16 28  49 18.90574 12.58663 244       0
## 2   M   0  8     14 23  55 16.96540 12.13461 289       0

favstats()

For the relationship between sex and wage, I compared the mean wages of males and females. The means differ by $2.27 per hour: males have a higher wage overall than females, even though females had, on average, more experience.

favstats(wage~sex,data=CPS2)
##   sex  min     Q1 median      Q3   max     mean       sd   n missing
## 1   F 1.75 4.7175  6.735  9.8125 24.98 7.728770 4.102386 244       0
## 2   M 1.00 6.0000  8.930 13.0000 26.29 9.994913 5.285854 289       0
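To gauge whether a gap of this size could plausibly be chance variation, the summary statistics above are enough for a rough check. The sketch below is ours, not part of the original analysis: it re-enters the means, standard deviations, and sample sizes from the favstats() output by hand and computes the difference in means, its standard error (unequal variances), and an approximate 95% interval. It assumes the two groups are independent samples and uses a normal approximation.

```r
# Summary statistics copied by hand from the favstats(wage~sex) output above.
mean_wage <- c(F = 7.728770, M = 9.994913)
sd_wage   <- c(F = 4.102386, M = 5.285854)
n         <- c(F = 244, M = 289)

# Difference in mean hourly wage (men minus women).
gap <- unname(mean_wage["M"] - mean_wage["F"])

# Standard error of the difference, allowing unequal variances.
se_gap <- sqrt(sum(sd_wage^2 / n))

# Test statistic and approximate 95% interval (normal approximation).
t_stat <- gap / se_gap
ci_95  <- gap + c(-1, 1) * qnorm(0.975) * se_gap

print(round(c(gap = gap, se = se_gap, t = t_stat), 3))
print(round(ci_95, 2))
```

The interval stays well above zero, so the gap is unlikely to be chance variation alone. This says nothing yet about whether experience, or any other variable, explains it.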

densityplot()

This density plot shows that, on average, females have more work experience than males. The difference is easiest to see in the range of 20 to 60 years of experience, where the density for females is almost double that for males.

xyplot()

This xyplot shows the relationship between wage, experience, and sex. Males make more money on average than females regardless of how much experience they have, and there is a higher concentration of males making more money with less experience than females.

Methods

Confounding Factor Three: Marriage Status

Variable Analysis

Sex: Factor Variable

Marriage Status: Factor Variable

Wage: Numerical Variable

Methods for Marriage Status

Methods to correct for confounding factor:

  • Numerical:
    • xtabs()
    • favstats()
    • tapply()
  • Graphical:
    • Bar chart
    • Box and Whisker Plot

Results

Table for Marriage Status Men v. Women:

Counts:

    Married Single
F       162     82
M       188    101

Row percents:

    Married Single Total
F     66.39  33.61   100
M     65.05  34.95   100

barchartGC()

Bar charts for Marriage Status of Men v. Women:

Both the table and the bar chart show that roughly equal percentages of men and women in the study are married (about 65% of men and 66% of women).
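The "about equal" reading can be checked with a chi-square test of independence. The sketch below uses base R's chisq.test() on the counts from the table above, entered by hand. It stands in for whatever test the original analysis may have run, and the object names are ours.

```r
# Counts copied by hand from the marriage status table above.
marriage <- matrix(
  c(162,  82,
    188, 101),
  nrow = 2, byrow = TRUE,
  dimnames = list(sex = c("F", "M"), married = c("Married", "Single"))
)

# Row percentages: share of each sex that is married or single.
row_pct <- round(100 * prop.table(marriage, margin = 1), 2)
print(row_pct)

# Chi-square test of independence between sex and marriage status.
# For a 2x2 table, chisq.test() applies a continuity correction by default.
test <- chisq.test(marriage)
print(test$p.value)
```

A large p-value here is consistent with sex and marriage status being unrelated in this sample, so marriage rates alone would not separate the two groups.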

Data Summary for Wage & Marriage Status

favstats(wage~married,data=CPS2)
##   married  min    Q1 median Q3   max     mean       sd   n missing
## 1 Married 1.00 5.620  8.595 12 26.29 9.398486 4.925121 350       0
## 2  Single 2.01 4.585  6.500 10 25.00 8.114098 4.776275 183       0

Now including sex

favstats(wage~sex+married,data=CPS2)
##   sex.married  min     Q1 median     Q3   max      mean       sd   n missing
## 1   F.Married 1.75 4.8750  6.880 10.000 23.25  7.683765 3.725468 162       0
## 2   M.Married 1.00 6.8250  9.845 13.545 26.29 10.876064 5.345956 188       0
## 3    F.Single 3.35 4.5125  6.450  9.245 24.98  7.817683 4.784327  82       0
## 4    M.Single 2.01 5.0000  6.670 10.670 25.00  8.354752 4.779962 101       0

The first code chunk shows a difference in wages between married and single people. The second, which breaks the data down further by sex, shows that this difference is not the same for men and women. According to the data, men make higher mean wages than women regardless of marriage status, but the gap is much larger among married people (about $3.19 per hour) than among single people (about $0.54 per hour).

Mean Wages of Single v. Married Men and Women

with(CPS2,tapply(wage,INDEX=list(married,sex),FUN=mean))
##                F         M
## Married 7.683765 10.876064
## Single  7.817683  8.354752
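The gap within each marriage status can be read straight off this table. The sketch below is ours: it re-enters the means above and the group sizes from the favstats(wage~sex+married) output by hand, computes the male-minus-female gap for each status, and checks that the size-weighted group means reproduce the overall mean wage for each sex.

```r
# Mean wages copied by hand from the tapply() output above.
mean_wage <- matrix(
  c(7.683765, 10.876064,
    7.817683,  8.354752),
  nrow = 2, byrow = TRUE,
  dimnames = list(married = c("Married", "Single"), sex = c("F", "M"))
)

# Group sizes copied by hand from the favstats(wage~sex+married) output.
n <- matrix(
  c(162, 188,
     82, 101),
  nrow = 2, byrow = TRUE,
  dimnames = dimnames(mean_wage)
)

# Male-minus-female gap in mean wage within each marriage status.
gap_by_status <- mean_wage[, "M"] - mean_wage[, "F"]
print(round(gap_by_status, 2))

# Sanity check: weighting the group means by group size should
# reproduce the overall mean wage for each sex.
overall_by_sex <- colSums(mean_wage * n) / colSums(n)
print(round(overall_by_sex, 3))
```

Because the gap differs so much between the two groups, marriage status looks less like a simple confounder and more like a factor that interacts with sex.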

Box & Whiskers Plot:

The box plot shows that single men and women had very similar median wages but different spreads (men had a larger IQR and a wider range between the minimum and maximum values). Among married people, however, men's median wage is much higher than women's, and the men's range is much wider as well.

Methods

Confounding Factor Four: Job Sector

Variable Analysis

Sex: Factor Variable

Job Sector: Factor Variable

Wage: Numerical Variable

Methods for Job Sector

Methods to correct for confounding factor:

  • Numerical:
    • xtabs()
    • favstats()
    • lm()
  • Graphical:
    • Density plot

Results

Sex and Sector

The number of males and females in each sector, and the row percentages of males and females across sectors:

sexSector<-xtabs(~sex+sector,data=CPS2)
sexSector
##    sector
## sex clerical const manag manuf other prof sales service
##   F       76     0    20    24     6   52    17      49
##   M       21    20    34    44    62   53    21      34
rowPerc(sexSector)
##    sector
## sex clerical  const  manag  manuf  other   prof  sales service  Total
##   F    31.15   0.00   8.20   9.84   2.46  21.31   6.97   20.08 100.00
##   M     7.27   6.92  11.76  15.22  21.45  18.34   7.27   11.76 100.00

favstats()

We can use favstats() to look at the mean wage for each sector.

favstats(wage~sector,data=CPS2)
##     sector  min     Q1 median      Q3   max      mean       sd   n missing
## 1 clerical 3.00 5.2000  7.500  9.5000 15.03  7.422577 2.699018  97       0
## 2    const 3.75 7.2250  9.750 11.6275 15.00  9.502000 3.343877  20       0
## 3    manag 1.00 7.1250 10.620 15.8550 26.29 12.115185 6.244713  54       0
## 4    manuf 3.00 4.9250  6.750  9.8725 22.20  8.036029 4.117607  68       0
## 5    other 2.85 5.0000  6.940 10.8150 26.00  8.500588 4.601049  68       0
## 6     prof 4.35 7.5000 10.610 15.3800 24.98 11.947429 5.523833 105       0
## 7    sales 3.35 4.3125  5.725 10.8325 19.98  7.592632 4.232272  38       0
## 8  service 1.75 3.9650  5.500  8.0000 25.00  6.537470 3.673278  83       0

densityplot()

Here is a density plot showing wages for the different sectors.

Sex and Sector

From the row percents and the density plots, it appears that women often do work in different sectors than men, and we can see that some sectors have higher wages than others. Our research question asks whether this is why women have lower wages. To check, we have to compare the wages of men and women within the same sector.
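One way to estimate how much of the gap the sector mix could account for is to give everyone the mean wage of their sector and see what gap remains. The sketch below is ours, not part of the original analysis: it re-enters the sector counts and the sector mean wages from the output above by hand and computes the wage each sex would average if pay depended only on sector. The observed gap uses the overall means reported by favstats(wage~sex,data=CPS2) earlier in the deck.

```r
sectors <- c("clerical", "const", "manag", "manuf",
             "other", "prof", "sales", "service")

# Counts copied by hand from the xtabs(~sex+sector) output above.
counts <- rbind(
  F = c(76,  0, 20, 24,  6, 52, 17, 49),
  M = c(21, 20, 34, 44, 62, 53, 21, 34)
)
colnames(counts) <- sectors

# Mean wage per sector, copied by hand from favstats(wage~sector).
sector_mean <- c(7.422577, 9.502000, 12.115185, 8.036029,
                 8.500588, 11.947429, 7.592632, 6.537470)
names(sector_mean) <- sectors

# Share of each sex working in each sector (each row sums to 1).
sector_share <- prop.table(counts, margin = 1)

# Mean wage each sex would have if everyone earned their sector's mean.
expected_wage <- as.vector(sector_share %*% sector_mean)
names(expected_wage) <- rownames(counts)

# Gap attributable to sector mix, compared with the observed gap.
gap_from_sector_mix <- unname(expected_wage["M"] - expected_wage["F"])
observed_gap <- 9.994913 - 7.728770
share_explained <- gap_from_sector_mix / observed_gap

print(round(expected_wage, 2))
print(round(c(sector_gap = gap_from_sector_mix,
              share = share_explained), 2))
```

Under this simplification, sector mix accounts for roughly $0.49 of the $2.27 gap, or about a fifth. The rest must come from differences within sectors, which is what the within-sector comparison examines.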

densityplot()

Density plot comparing the wages of each sex for the different sectors:

Looking at the density plot, you can see that males tend to make more than females in every sector where both are present (there are no women in construction in this sample). The plots for females have most of the data clustered in one place, which means that most females make around the same wage. For the males, the data is more spread out and stretches farther, showing that males make a wider variety of wages, and usually more than the females.

Conclusion

In conclusion…

  • Men's wages > Women's wages
  • Women receive less for same sector of work
  • Women have more work experience, but receive less
  • Women receive lower wages for equal education
  • Married men make more on average than single men, while married women make slightly less than single women

Conclusion

Possible reasons why?

  • Society
    • Social status and worth
  • Men = "Provider" for their family
  • Men's work is seen as more important than women's
    • Even if it's the same job in the same sector of work

It might be interesting to study the other variables in the CPS85 data set, or to compare and contrast these results with a similar study conducted in another country.