Robert W. Walker
September 7, 2016
The two discrimination cases.
In Excel, we deploy the Pivot Table. In R, we have varieties of summary:
Lab3.xls
The file was exported from R exactly like this.
library(WriteXLS)
WriteXLS(c("Discrimination","DiscriminationUCB","Bonds"), ExcelFileName = "Lab3.xls")
Load the data: three data sets
load("Lab3.RData")
ls() # List the contents
[1] "Bonds" "Discrimination" "DiscriminationUCB"
library(RcmdrMisc)
numSummary(Discrimination[,"Expenditures"], groups=Discrimination$Ethnicity,
statistics=c("mean", "sd", "IQR"))
mean sd IQR data:n
American Indian 36438.250 25693.912 34085.25 4
Asian 18392.372 19209.225 30892.00 129
Black 20884.593 20549.274 37987.00 59
Hispanic 11065.569 15629.847 7961.25 376
Multi Race 4456.731 7332.135 2059.75 26
Native Hawaiian 42782.333 6576.462 6331.00 3
Other 3316.500 1836.356 1298.50 2
White not Hispanic 24697.549 20604.376 39157.00 401
Let's look at the same Expenditures but, this time, by Age.Cohort.
We could also look at the scatterplot using the Age.
Drag Age.Cohort to Columns
with(Discrimination, tapply(Expenditures, list(Age.Cohort, Ethnicity), mean,
na.rm=TRUE))
American Indian Asian Black Hispanic Multi Race
0-5 NA 1502.250 1083.000 1393.205 1735.286
13-17 3726 3509.100 4521.750 3955.282 3972.286
18-21 NA 9598.171 8847.444 9959.846 9309.000
22-50 28205 39580.517 40573.059 40924.116 38619.000
6-12 NA 2164.556 2476.455 2312.187 2076.111
51+ 56911 54622.615 54010.143 55585.000 NA
Native Hawaiian Other White not Hispanic
0-5 NA NA 1366.900
13-17 NA 3316.5 3904.358
18-21 NA NA 10133.058
22-50 39103 NA 40187.624
6-12 NA NA 2052.261
51+ 50141 NA 52670.424
Once we look at the data accounting for age, things look a bit different.
local({
.Table <- xtabs(~M.F+Admit, data=DiscriminationUCB)
cat("\nFrequency table:\n")
print(.Table)
cat("\nRow percentages:\n")
print(rowPercents(.Table))
})
Frequency table:
Admit
M.F No Yes
Female 1278 557
Male 1493 1198
Row percentages:
Admit
M.F No Yes Total Count
Female 69.6 30.4 100 1835
Male 55.5 44.5 100 2691
Row percentages:
, , Dept = A
Admit
M.F No Yes Total Count
Female 17.6 82.4 100 108
Male 37.9 62.1 100 825
, , Dept = B
Admit
M.F No Yes Total Count
Female 32 68 100 25
Male 37 63 100 560
, , Dept = C
Admit
M.F No Yes Total Count
Female 65.9 34.1 100 593
Male 63.1 36.9 100 325
, , Dept = D
Admit
M.F No Yes Total Count
Female 65.1 34.9 100 375
Male 66.9 33.1 100 417
, , Dept = E
Admit
M.F No Yes Total Count
Female 76.1 23.9 100 393
Male 72.3 27.7 100 191
, , Dept = F
Admit
M.F No Yes Total Count
Female 93.0 7.0 100 341
Male 94.1 5.9 100 373
DiscriminationUCB$Admit.Num <- with(DiscriminationUCB, as.numeric(Admit)-1)
with(DiscriminationUCB, tapply(Admit.Num, list(Dept, M.F), mean,
na.rm=TRUE))
Female Male
A 0.82407407 0.62060606
B 0.68000000 0.63035714
C 0.34064081 0.36923077
D 0.34933333 0.33093525
E 0.23918575 0.27748691
F 0.07038123 0.05898123