Bank Wages

This homework uses the BankWages dataset from the AER package, which records the job category, education, gender, and minority status of bank employees.

A. Explore dataset

data("BankWages", package = "AER")  # load the dataset from the AER package
str(BankWages)
## 'data.frame':    474 obs. of  4 variables:
##  $ job      : Ord.factor w/ 3 levels "custodial"<"admin"<..: 3 2 2 2 2 2 2 2 2 2 ...
##  $ education: int  15 16 12 8 15 15 15 12 15 12 ...
##  $ gender   : Factor w/ 2 levels "male","female": 1 1 2 2 1 1 1 2 2 2 ...
##  $ minority : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 1 1 ...

The data frame contains 474 observations of 4 variables. job is an ordered factor with three levels (custodial < admin < manage) indicating job category. education is an integer variable giving years of education. gender and minority are two-level factors.

B. How does education level affect job position? (Table)

xtabs(~ education + job, data = BankWages)
##          job
## education custodial admin manage
##        8         13    40      0
##        12        13   176      1
##        14         0     6      0
##        15         1   111      4
##        16         0    24     35
##        17         0     3      8
##        18         0     2      7
##        19         0     1     26
##        20         0     0      2
##        21         0     0      1
eduCat <- factor(BankWages$education)
levels(eduCat)[3:10] <- rep(c("14-15", "16-18", "19-21"), c(2, 3, 3)) #merge some education levels
tab <- xtabs(~ eduCat + job, data = BankWages)
# Row proportions: share of each job category within an education group
prop.table(tab, 1)
##        job
## eduCat    custodial       admin      manage
##   8     0.245283019 0.754716981 0.000000000
##   12    0.068421053 0.926315789 0.005263158
##   14-15 0.008196721 0.959016393 0.032786885
##   16-18 0.000000000 0.367088608 0.632911392
##   19-21 0.000000000 0.033333333 0.966666667

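The table shows that employees with more education are concentrated in managerial positions: about 63% of those with 16-18 years of education and about 97% of those with 19-21 years are managers. By contrast, more than 90% of employees with 12-15 years of education hold administrative positions, and custodial staff are found almost entirely among those with 8 or 12 years.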
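As a cross-check on the proportions above, the following sketch rebuilds the merged table directly from the counts printed by xtabs(), without needing the AER package. The object names counts, groups, and merged are illustrative and not part of the original analysis, and rowsum() is used here as an alternative to relabelling factor levels.

```r
# Counts copied from the xtabs() output above
# (rows: years of education, columns: job category).
counts <- matrix(
  c(13,  40,  0,
    13, 176,  1,
     0,   6,  0,
     1, 111,  4,
     0,  24, 35,
     0,   3,  8,
     0,   2,  7,
     0,   1, 26,
     0,   0,  2,
     0,   0,  1),
  ncol = 3, byrow = TRUE,
  dimnames = list(as.character(c(8, 12, 14:21)),
                  c("custodial", "admin", "manage"))
)

# Same grouping as in the analysis: 8, 12, 14-15, 16-18, 19-21.
groups <- rep(c("8", "12", "14-15", "16-18", "19-21"), c(1, 1, 2, 3, 3))

# Sum the rows within each group, keeping the groups in their original order.
merged <- rowsum(counts, group = groups, reorder = FALSE)

# Row proportions: each row sums to 1.
round(prop.table(merged, 1), 3)
```

Dividing by row totals (margin 1) answers "given an education group, how are jobs distributed?". Using margin 2 instead would answer the reverse question, namely how education is distributed within each job category.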

C. Relationship between job position and education level (Plot)

#Plot
plot(job ~ eduCat, data = BankWages, 
     main = "Job category by education",
     xlab = "Years of education",
     ylab = "Job category",
     cex.main = 2,
     cex.lab = 1.5,
     font = 2,
     col = c("#2CA25F", "#99D8C9" , "#E5F5F9")
     )
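When both variables in the formula are factors, plot() draws a spine plot: bar widths reflect the size of each education group and bar segments show the job shares within it. As a minimal sketch, the same kind of figure can be drawn from a two-way table with spineplot(). The merged counts below are summed from the xtabs() output in part B, and the object name merged is illustrative only.

```r
# Merged counts (rows: education group, columns: job category),
# obtained by summing the xtabs() counts from part B.
merged <- as.table(matrix(
  c(13,  40,  0,
    13, 176,  1,
     1, 117,  4,
     0,  29, 50,
     0,   1, 29),
  ncol = 3, byrow = TRUE,
  dimnames = list(eduCat = c("8", "12", "14-15", "16-18", "19-21"),
                  job = c("custodial", "admin", "manage"))
))

# Spine plot from the table: rows go on the x-axis, columns are stacked.
spineplot(merged,
          main = "Job category by education",
          xlab = "Years of education",
          ylab = "Job category",
          col = c("#2CA25F", "#99D8C9", "#E5F5F9"))
```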