Determinants of Wages from 1985 Current Population Survey

The Current Population Survey (CPS) is used to supplement census information between census years. These data consist of a random sample of 29 persons from the CPS, with information on wages and other characteristics of the workers, including sex, number of years of education, years of work experience, occupational status, region of residence and union membership. We wish to determine (i) whether wages are related to these characteristics and (ii) whether there is a gender gap in wages.

Variable Names

EDUCATION: Number of years of education.
SOUTH: Indicator variable for Southern Region (1=Person lives in South, 0=Person lives elsewhere).
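SEX: Indicator variable for sex (1=Female, 0=Male). In the spreadsheet loaded below the column is stored as text, apparently with P (perempuan, female) and L (laki-laki, male).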
EXPERIENCE: Number of years of work experience.
UNION: Indicator variable for union membership (1=Union member, 0=Not union member).
WAGE: Wage (dollars per hour).
AGE: Age (years).
RACE: Race (1=Other, 2=Hispanic, 3=White).
OCCUPATION: Occupational category (1=Management, 2=Sales, 3=Clerical, 4=Service, 5=Professional, 6=Other).
SECTOR: Sector (0=Other, 1=Manufacturing, 2=Construction).
MARR: Marital Status (0=Unmarried, 1=Married).
library("openxlsx")
data <- read.xlsx(file.choose())
data
##    Education South Sex Experience Union  Wage Age Sector Married
## 1          8     0   P         21     0   5.1  35      1       1
## 2          9     0   P         42     0  4.95  57      1       1
## 3         12     0   L          1     0  6.67  19      1       0
## 4         12     0   L          4     0     4  22      0       0
## 5         12     0   L         17     0   7.5  35      0       1
## 6         13     0   L          9     1 13.07  28      0       0
## 7         12     0   L          9     0 19.47  27      0       0
## 8         16     0   L         11     0 13.28  33      1       1
## 9         12     0   L          9     0  8.75  27      0       0
## 10        12     0   L         17     1 11.35  35      0       1
## 11        12     0   L         19     1  11.5  37      1       0
## 12        12     0   L         37     0   7.3  55      1       1
## 13        12     0   L         26     1  22.2  44      1       1
## 14        11     0   L         16     0  3.65  33      0       0
## 15        12     0   L         33     0 20.55  51      0       1
## 16        12     0   P         16     1  5.71  34      1       1
## 17         7     0   L         42     1     7  55      1       1
## 18        12     0   L          9     0  3.75  27      0       0
## 19        12     0   L         23     0  9.56  41      0       1
## 20        12     0   L          8     0  9.36  26      1       1
## 21        10     0   L         30     0   6.5  46      0       1
## 22        12     0   P          8     0  3.35  26      1       1
## 23        10     1   L         27     0  4.45  43      0       0
## 24         8     1   L         27     0   6.5  41      0       1
## 25         9     1   L         30     1  6.25  45      0       0
## 26         9     1   L         29     0 19.98  44      0       1
## 27         7     1   L         44     0     8  57      0       1
## 28        11     1   L         14     0   4.5  31      0       1
## 29         6     1   L         45     0  5.75  57      1       1
head(data)
##   Education South Sex Experience Union  Wage Age Sector Married
## 1         8     0   P         21     0   5.1  35      1       1
## 2         9     0   P         42     0  4.95  57      1       1
## 3        12     0   L          1     0  6.67  19      1       0
## 4        12     0   L          4     0     4  22      0       0
## 5        12     0   L         17     0   7.5  35      0       1
## 6        13     0   L          9     1 13.07  28      0       0
str(data)
## 'data.frame':    29 obs. of  9 variables:
##  $ Education : num  8 9 12 12 12 13 12 16 12 12 ...
##  $ South     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Sex       : chr  "P" "P" "L" "L" ...
##  $ Experience: num  21 42 1 4 17 9 9 11 9 17 ...
##  $ Union     : num  0 0 0 0 0 1 0 0 0 1 ...
##  $ Wage      : chr  "5.1" "4.95" "6.67" "4" ...
##  $ Age       : num  35 57 19 22 35 28 27 33 27 35 ...
##  $ Sector    : num  1 1 1 0 0 0 0 1 0 0 ...
##  $ Married   : num  1 1 0 0 1 0 0 1 0 1 ...
data[, c(1, 4, 6)] <- lapply(data[, c(1, 4, 6)], as.numeric)
x1_p <- data[,1];x1_p
##  [1]  8  9 12 12 12 13 12 16 12 12 12 12 12 11 12 12  7 12 12 12 10 12 10  8  9
## [26]  9  7 11  6
x2_p <- data[,4];x2_p
##  [1] 21 42  1  4 17  9  9 11  9 17 19 37 26 16 33 16 42  9 23  8 30  8 27 27 30
## [26] 29 44 14 45
x3_p <- data[,6];x3_p
##  [1]  5.10  4.95  6.67  4.00  7.50 13.07 19.47 13.28  8.75 11.35 11.50  7.30
## [13] 22.20  3.65 20.55  5.71  7.00  3.75  9.56  9.36  6.50  3.35  4.45  6.50
## [25]  6.25 19.98  8.00  4.50  5.75
data$Sector<-as.integer(data$Sector)
data$Married<-as.integer(data$Married)
dataf<-data[1:29,c(1,4,6)];dataf
##    Education Experience  Wage
## 1          8         21  5.10
## 2          9         42  4.95
## 3         12          1  6.67
## 4         12          4  4.00
## 5         12         17  7.50
## 6         13          9 13.07
## 7         12          9 19.47
## 8         16         11 13.28
## 9         12          9  8.75
## 10        12         17 11.35
## 11        12         19 11.50
## 12        12         37  7.30
## 13        12         26 22.20
## 14        11         16  3.65
## 15        12         33 20.55
## 16        12         16  5.71
## 17         7         42  7.00
## 18        12          9  3.75
## 19        12         23  9.56
## 20        12          8  9.36
## 21        10         30  6.50
## 22        12          8  3.35
## 23        10         27  4.45
## 24         8         27  6.50
## 25         9         30  6.25
## 26         9         29 19.98
## 27         7         44  8.00
## 28        11         14  4.50
## 29         6         45  5.75
str(data)
## 'data.frame':    29 obs. of  9 variables:
##  $ Education : num  8 9 12 12 12 13 12 16 12 12 ...
##  $ South     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Sex       : chr  "P" "P" "L" "L" ...
##  $ Experience: num  21 42 1 4 17 9 9 11 9 17 ...
##  $ Union     : num  0 0 0 0 0 1 0 0 0 1 ...
##  $ Wage      : num  5.1 4.95 6.67 4 7.5 ...
##  $ Age       : num  35 57 19 22 35 28 27 33 27 35 ...
##  $ Sector    : int  1 1 1 0 0 0 0 1 0 0 ...
##  $ Married   : int  1 1 0 0 1 0 0 1 0 1 ...

Multivariate Normality Test

Hypotheses

H0 : The data follow a multivariate normal distribution

H1 : The data do not follow a multivariate normal distribution

Significance Level

alpha = 5% = 0.05

library(MVN)
## Warning: package 'MVN' was built under R version 4.4.1
test = mvn(data[1:29,c(1,4,6)], mvnTest = "mardia", univariateTest = "SW", multivariatePlot = "qq")

test
## $multivariateNormality
##              Test          Statistic           p value Result
## 1 Mardia Skewness   13.2822084762488 0.208318454443323    YES
## 2 Mardia Kurtosis -0.214329504809508 0.830290111083336    YES
## 3             MVN               <NA>              <NA>    YES
## 
## $univariateNormality
##           Test   Variable Statistic   p value Normality
## 1 Shapiro-Wilk Education     0.8590    0.0012    NO    
## 2 Shapiro-Wilk Experience    0.9451    0.1365    YES   
## 3 Shapiro-Wilk    Wage       0.8277    0.0003    NO    
## 
## $Descriptives
##             n      Mean   Std.Dev Median  Min  Max 25th  75th       Skew
## Education  29 10.827586  2.172375     12 6.00 16.0  9.0 12.00 -0.3918675
## Experience 29 21.482759 12.740881     19 1.00 45.0  9.0 30.00  0.3390871
## Wage       29  8.965517  5.436784      7 3.35 22.2  5.1 11.35  1.1815078
##              Kurtosis
## Education  -0.0565370
## Experience -1.0901924
## Wage        0.2018184

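P-Values of the Multivariate Normality Test

Mardia Skewness : 0.208318454

Mardia Kurtosis : 0.830290111

Test Criterion

For Mardia's skewness test, reject H0 if p-value < alpha

For Mardia's kurtosis test, reject H0 if p-value < alpha

Conclusion

Both p-values exceed alpha (0.2083 > 0.05 for skewness and 0.8303 > 0.05 for kurtosis), so H0 is not rejected: Education, Experience and Wage are jointly consistent with a multivariate normal distribution. Note that the univariate Shapiro-Wilk tests in the same output reject normality for Education (p = 0.0012) and Wage (p = 0.0003) individually.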
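
As a cross-check, Mardia's two statistics can be computed directly in base R. The following is a minimal sketch rather than the MVN implementation: it assumes the large-sample form of the test (chi-square for skewness with p(p+1)(p+2)/6 degrees of freedom, standard normal for kurtosis) and a covariance matrix with divisor n. The three vectors are copied from the listing above; names such as `Xc`, `b1` and `b2` are illustrative only.

```r
# Education, Experience and Wage for the 29 workers (copied from the listing above)
education <- c(8, 9, 12, 12, 12, 13, 12, 16, 12, 12, 12, 12, 12, 11, 12,
               12, 7, 12, 12, 12, 10, 12, 10, 8, 9, 9, 7, 11, 6)
experience <- c(21, 42, 1, 4, 17, 9, 9, 11, 9, 17, 19, 37, 26, 16, 33,
                16, 42, 9, 23, 8, 30, 8, 27, 27, 30, 29, 44, 14, 45)
wage <- c(5.10, 4.95, 6.67, 4.00, 7.50, 13.07, 19.47, 13.28, 8.75, 11.35,
          11.50, 7.30, 22.20, 3.65, 20.55, 5.71, 7.00, 3.75, 9.56, 9.36,
          6.50, 3.35, 4.45, 6.50, 6.25, 19.98, 8.00, 4.50, 5.75)
X <- cbind(education, experience, wage)

n <- nrow(X)
p <- ncol(X)

# Centre the data and use the covariance estimate with divisor n (assumption)
Xc <- scale(X, center = TRUE, scale = FALSE)
S <- crossprod(Xc) / n

# Matrix of generalized inner products (x_i - xbar)' S^{-1} (x_j - xbar)
D <- Xc %*% solve(S) %*% t(Xc)

# Mardia's multivariate skewness and kurtosis coefficients
b1 <- sum(D^3) / n^2
b2 <- sum(diag(D)^2) / n

# Skewness: n * b1 / 6 is compared with a chi-square distribution
skew_stat <- n * b1 / 6
skew_df <- p * (p + 1) * (p + 2) / 6
skew_p <- pchisq(skew_stat, df = skew_df, lower.tail = FALSE)

# Kurtosis: standardized b2 is compared with a standard normal distribution
kurt_stat <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)
kurt_p <- 2 * pnorm(abs(kurt_stat), lower.tail = FALSE)

print(c(skew_stat = skew_stat, skew_df = skew_df, skew_p = skew_p))
print(c(kurt_stat = kurt_stat, kurt_p = kurt_p))
```

If these assumptions match the ones used by `mvn()`, the printed statistics should be close to the skewness and kurtosis values in the output above.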

Homogeneity Test

Because there are two grouping factors, sector (0=other, 1=manufacturing) and married (0=unmarried, 1=married), two homogeneity tests are run. Both use the same hypotheses, significance level and test, namely Box's M at the 5% level.

Hypotheses

H0 : The group covariance matrices are equal

H1 : The group covariance matrices are not equal

Significance Level

alpha = 5% = 0.05

# Sector
library(biotools)
## Warning: package 'biotools' was built under R version 4.4.1
## Loading required package: MASS
## ---
## biotools version 4.2
head(data[,8])
## [1] 1 1 1 0 0 0
boxM(data = dataf, grouping = data[,8])
## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dataf
## Chi-Sq (approx.) = 6.7137, df = 6, p-value = 0.3481
# Married
head(data[,9])
## [1] 1 1 0 0 1 0
boxM(data = dataf, grouping = data[,9])
## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  dataf
## Chi-Sq (approx.) = 10.367, df = 6, p-value = 0.11
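
The Box's M statistic can also be worked out by hand, which shows what the test compares: the log-determinant of the pooled covariance matrix against the log-determinants of the per-group covariance matrices. The sketch below assumes the usual chi-square approximation with p(p+1)(k-1)/2 degrees of freedom; `box_m` is a hypothetical helper written for this illustration, not a function from biotools, and the data vectors are copied from the listing above.

```r
# Responses and grouping factors for the 29 workers (copied from the listing above)
education <- c(8, 9, 12, 12, 12, 13, 12, 16, 12, 12, 12, 12, 12, 11, 12,
               12, 7, 12, 12, 12, 10, 12, 10, 8, 9, 9, 7, 11, 6)
experience <- c(21, 42, 1, 4, 17, 9, 9, 11, 9, 17, 19, 37, 26, 16, 33,
                16, 42, 9, 23, 8, 30, 8, 27, 27, 30, 29, 44, 14, 45)
wage <- c(5.10, 4.95, 6.67, 4.00, 7.50, 13.07, 19.47, 13.28, 8.75, 11.35,
          11.50, 7.30, 22.20, 3.65, 20.55, 5.71, 7.00, 3.75, 9.56, 9.36,
          6.50, 3.35, 4.45, 6.50, 6.25, 19.98, 8.00, 4.50, 5.75)
sector <- c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0,
            1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1)
married <- c(1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1,
             1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1)
X <- cbind(education, experience, wage)

# Hypothetical helper: Box's M with the chi-square approximation (illustration only)
box_m <- function(X, g) {
  g <- as.factor(g)
  n_i <- as.vector(table(g))   # group sizes, in the order of levels(g)
  k <- nlevels(g)
  p <- ncol(X)
  N <- nrow(X)

  # Per-group covariance matrices and their pooled (df-weighted) average
  covs <- lapply(levels(g), function(l) cov(X[g == l, , drop = FALSE]))
  pooled <- Reduce(`+`, Map(function(S, n) (n - 1) * S, covs, n_i)) / (N - k)

  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)

  # M grows as the group covariance matrices move away from the pooled one
  M <- (N - k) * logdet(pooled) - sum((n_i - 1) * sapply(covs, logdet))

  # Small-sample correction factor for the chi-square approximation
  corr <- (sum(1 / (n_i - 1)) - 1 / (N - k)) *
    (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))

  stat <- (1 - corr) * M
  df <- p * (p + 1) * (k - 1) / 2
  c(statistic = stat, df = df, p.value = pchisq(stat, df, lower.tail = FALSE))
}

res_sector <- box_m(X, sector)
res_married <- box_m(X, married)
print(res_sector)
print(res_married)
```

Each result holds the approximate chi-square statistic, its degrees of freedom and the p-value, so it can be compared line by line with the `boxM()` output above.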

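Test Criterion

Reject H0 if p-value < alpha; otherwise do not reject H0

Decision

Sector: p-value (0.3481) > alpha (0.05), so H0 is not rejected. Married: p-value (0.11) > alpha (0.05), so H0 is not rejected.

Conclusion

At the 5% level there is no evidence that the covariance matrices of Education, Experience and Wage differ between the levels of sector or between the levels of married, so the homogeneity assumption is treated as satisfied.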

Two Way MANOVA

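The responses in the model are Education, Experience and Wage taken jointly, so each hypothesis refers to the mean vector of these three variables, of which wage is the one of main interest.

H0 : Sector has no effect on the response vector (education, experience, wage)

H1 : Sector has an effect on at least one of the responses

H0 : Married has no effect on the response vector (education, experience, wage)

H1 : Married has an effect on at least one of the responses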

sector <- as.factor(data$Sector[1:29])
married <- as.factor(data$Married[1:29])
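# Fit the two-way MANOVA; the result is named manova_fit so it does not mask stats::manova()
manova_fit <- manova(cbind(x1_p, x2_p, x3_p) ~ sector * married, data = data)
# Display the results
summary(manova_fit)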
##                Df   Pillai approx F num Df den Df  Pr(>F)  
## sector          1 0.044373  0.35599      3     23 0.78526  
## married         1 0.243258  2.46448      3     23 0.08792 .
## sector:married  1 0.026406  0.20794      3     23 0.88985  
## Residuals      25                                          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

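Conclusion

  1. For the sector factor, p-value (0.78526) > alpha (0.05), so H0 is not rejected: there is no evidence that sector affects the response vector, and hence wage.
  2. For the married factor, p-value (0.08792) > alpha (0.05), so H0 is not rejected: there is no evidence at the 5% level that marital status affects the response vector, and hence wage.
  3. For the sector:married interaction, p-value (0.88985) > alpha (0.05), so there is no evidence of an interaction between the two factors.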
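
Because the question of interest is about wage, it can help to look at each response separately after the multivariate test. The sketch below refits the same model from a self-contained data frame and then prints the univariate ANOVA tables with `summary.aov()`. The names `cps`, `fit` and `pillai` are illustrative, and the values are copied from the listing above.

```r
# Self-contained copy of the variables used in the MANOVA (from the listing above)
cps <- data.frame(
  education = c(8, 9, 12, 12, 12, 13, 12, 16, 12, 12, 12, 12, 12, 11, 12,
                12, 7, 12, 12, 12, 10, 12, 10, 8, 9, 9, 7, 11, 6),
  experience = c(21, 42, 1, 4, 17, 9, 9, 11, 9, 17, 19, 37, 26, 16, 33,
                 16, 42, 9, 23, 8, 30, 8, 27, 27, 30, 29, 44, 14, 45),
  wage = c(5.10, 4.95, 6.67, 4.00, 7.50, 13.07, 19.47, 13.28, 8.75, 11.35,
           11.50, 7.30, 22.20, 3.65, 20.55, 5.71, 7.00, 3.75, 9.56, 9.36,
           6.50, 3.35, 4.45, 6.50, 6.25, 19.98, 8.00, 4.50, 5.75),
  sector = factor(c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0,
                    1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1)),
  married = factor(c(1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1,
                     1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1))
)

# Two-way MANOVA with interaction, same formula structure as above
fit <- manova(cbind(education, experience, wage) ~ sector * married, data = cps)

# Multivariate table (Pillai's trace) as a numeric matrix
pillai <- summary(fit, test = "Pillai")$stats
print(pillai)

# Univariate follow-up: one ANOVA table per response, including wage
print(summary.aov(fit))
```

The univariate tables are a follow-up only; they are not adjusted for testing three responses, so they should be read alongside the multivariate result rather than in place of it.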