Cross-section data originating from the May 1985 Current Population Survey by the US Census Bureau
Loading the dataset
wagedata <- read.csv("CPS1985.csv", header = TRUE, sep = ",")
dim(wagedata)
## [1] 534 11
head(wagedata)
## wage education experience age ethnicity region gender occupation
## 1 5.10 8 21 35 hispanic other female worker
## 2 4.95 9 42 57 cauc other female worker
## 3 6.67 12 1 19 cauc other male worker
## 4 4.00 12 4 22 cauc other male worker
## 5 7.50 12 17 35 cauc other male worker
## 6 13.07 13 9 28 cauc other male worker
## sector union married
## 1 manufacturing no yes
## 2 manufacturing no yes
## 3 manufacturing no no
## 4 other no no
## 5 other no yes
## 6 other yes no
str(wagedata)
## 'data.frame': 534 obs. of 11 variables:
## $ wage : num 5.1 4.95 6.67 4 7.5 ...
## $ education : int 8 9 12 12 12 13 10 12 16 12 ...
## $ experience: int 21 42 1 4 17 9 27 9 11 9 ...
## $ age : int 35 57 19 22 35 28 43 27 33 27 ...
## $ ethnicity : chr "hispanic" "cauc" "cauc" "cauc" ...
## $ region : chr "other" "other" "other" "other" ...
## $ gender : chr "female" "female" "male" "male" ...
## $ occupation: chr "worker" "worker" "worker" "worker" ...
## $ sector : chr "manufacturing" "manufacturing" "manufacturing" "other" ...
## $ union : chr "no" "no" "no" "no" ...
## $ married : chr "yes" "yes" "no" "no" ...
The dataset has 11 variables and 534 observations. There are 4 numeric variables and 7 categorical variables.
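The split between numeric and categorical variables can also be checked programmatically instead of being read off the `str()` output. The following is a minimal sketch: `count_var_types` is a hypothetical helper, not part of the original analysis, and `toy` is a three-row excerpt of the `head()` output used only so that the example runs without the CSV file.

```r
# Hypothetical helper: count numeric vs. non-numeric columns of a data frame.
count_var_types <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  c(numeric = sum(is_num), categorical = sum(!is_num))
}

# Small stand-in for wagedata (first three rows, four of the eleven columns).
toy <- data.frame(
  wage = c(5.10, 4.95, 6.67),
  education = c(8L, 9L, 12L),
  gender = c("female", "female", "male"),
  union = c("no", "no", "no")
)

type_counts <- count_var_types(toy)
print(type_counts)
```

On the full dataset the same call would be `count_var_types(wagedata)`.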
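# Summary statistics (mean, median, min, max) for one column of wagedata,
# selected by its column index; missing values are ignored.
stats <- function(columnN) {
  Mean <- mean(wagedata[, columnN], na.rm = TRUE)
  Median <- median(wagedata[, columnN], na.rm = TRUE)
  Min <- min(wagedata[, columnN], na.rm = TRUE)
  Max <- max(wagedata[, columnN], na.rm = TRUE)
  output <- data.frame(Mean, Median, Min, Max)
  return(output)
}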
wage <- stats(1)
Variable <- "Wage"
Var1 <- cbind(Variable, wage)
Educ <- stats(2)
Variable <- "Education"
Var2 <- cbind(Variable, Educ)
Exp <- stats(3)
Variable <- "Experience"
Var3 <- cbind(Variable, Exp)
Age <- stats(4)
Variable <- "Age"
Var4 <- cbind(Variable, Age)
rbind(Var1, Var2, Var3, Var4)
## Variable Mean Median Min Max
## 1 Wage 9.024064 7.78 1 44.5
## 2 Education 13.018727 12.00 2 18.0
## 3 Experience 17.822097 15.00 0 55.0
## 4 Age 36.833333 35.00 18 64.0
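The same kind of summary table can be built in one pass over all numeric columns, which avoids repeating the `stats()` and `cbind()` steps for each variable. This is a sketch under stated assumptions: `summarise_numeric` is a hypothetical helper, it expects at least one numeric column, and `toy` holds only the six rows shown by `head(wagedata)`, so its values differ from the full-data table.

```r
# Hypothetical helper: mean, median, min and max for every numeric column.
summarise_numeric <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  out <- t(sapply(num, function(x) {
    c(Mean = mean(x, na.rm = TRUE),
      Median = median(x, na.rm = TRUE),
      Min = min(x, na.rm = TRUE),
      Max = max(x, na.rm = TRUE))
  }))
  # One row per variable, with the variable name as an ordinary column.
  data.frame(Variable = rownames(out), out, row.names = NULL)
}

# Stand-in for wagedata: the six rows printed by head(), numeric columns plus gender.
toy <- data.frame(
  wage = c(5.10, 4.95, 6.67, 4.00, 7.50, 13.07),
  education = c(8L, 9L, 12L, 12L, 12L, 13L),
  experience = c(21L, 42L, 1L, 4L, 17L, 9L),
  age = c(35L, 57L, 19L, 22L, 35L, 28L),
  gender = c("female", "female", "male", "male", "male", "male")
)

res <- summarise_numeric(toy)
print(res)
```

Selecting columns with `is.numeric` rather than by position keeps the helper working if the column order of the CSV changes.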
# Cross-tabulate sector by ethnicity (named to avoid masking base::table)
sector_ethnicity <- ftable(ethnicity ~ sector, wagedata)
sector_ethnicity
## ethnicity cauc hispanic other
## sector
## construction 21 0 3
## manufacturing 81 4 14
## other 338 23 50
From the above table we have two interesting observations:
plot(wagedata$experience, wagedata$wage, main="Relationship between wage rate and experience",
xlab="Experience", ylab="Wage rate ($)", pch=19)
From the above scatter plot, there is no clear relationship between workers' experience and their wage rate. The plot alone suggests, but cannot establish, that experience has little effect on wages.
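A visual impression from a scatter plot can be complemented with a numerical check. The sketch below computes the Pearson correlation and the slope of a simple linear regression of wage on experience. It is an illustration only: `wage_experience_check` is a hypothetical helper, and `toy` contains just the six rows shown by `head(wagedata)`, far too few to say anything about the full sample.

```r
# Hypothetical helper: correlation and regression slope for wage vs. experience.
# Assumes the data frame has numeric columns named `wage` and `experience`.
wage_experience_check <- function(df) {
  fit <- lm(wage ~ experience, data = df)
  list(
    correlation = cor(df$experience, df$wage, use = "complete.obs"),
    slope = unname(coef(fit)["experience"])
  )
}

# Stand-in for wagedata: the six rows printed by head().
toy <- data.frame(
  wage = c(5.10, 4.95, 6.67, 4.00, 7.50, 13.07),
  experience = c(21, 42, 1, 4, 17, 9)
)

check <- wage_experience_check(toy)
print(check)
```

On the full data, `wage_experience_check(wagedata)` followed by `summary(lm(wage ~ experience, data = wagedata))` would show whether the slope differs noticeably from zero.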