Scraping the MA school expenditures data, and a bit of data cleaning:

library(XML, quietly=TRUE)    # readHTMLTable()
library(ineq, quietly=TRUE)   # Lc()
suppressPackageStartupMessages(library(reldist, quietly=TRUE))  # gini()
url <- "http://profiles.doe.mass.edu/state_report/ppx.aspx"
tables <- readHTMLTable(url)
schools <- tables[[2]]
# Drop the statewide summary row
schools <- schools[schools$V1 != "MASSACHUSETTS TOTAL",]
# Strip "$" and "," so that V7 and V3 can be converted to numbers
schools$V7 <- as.numeric(gsub("[$,]", "", as.character(schools$V7)))
schools$V3 <- as.numeric(gsub(",", "", as.character(schools$V3)))
# Keep only rows with no missing values
schools <- schools[complete.cases(schools),]
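Since the same dollar-and-comma stripping is needed for every column and every year, it can be wrapped in a small helper. This is only a sketch: `parse_amount` is a hypothetical name that does not appear in the original analysis, and the example strings are made up for illustration.

```r
# Hypothetical helper (not part of the original analysis): convert strings
# such as "$13,636" or "1,234.5" to numbers by removing "$" and ","
parse_amount <- function(x) {
  as.numeric(gsub("[$,]", "", as.character(x)))
}

# Made-up example values; factors work too because of as.character()
parse_amount(c("$13,636", "1,234.5", "987"))
parse_amount(factor(c("$1,000", "$2,500")))
```

With this in place, a cleaning step could be written as `schools$V7 <- parse_amount(schools$V7)`.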

We can measure inequality in a number of ways. Graphically, we can look at the Lorenz curve:

Lc.educ <- Lc(schools$V7, schools$V3)
plot(Lc.educ, main="Lorenz Curve for Educational Expenditures, 2011-12")
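To see what a weighted Lorenz curve is made of, here is a minimal base-R sketch on toy data. The function name `lorenz_points` and the numbers are made up for illustration, and the construction reflects my understanding of what `Lc()` computes rather than its actual source: sort by the amount, then accumulate the weight share (`p`) and the share of the weighted total (`L`).

```r
# Hypothetical helper: Lorenz curve points for amounts x with weights w
lorenz_points <- function(x, w = rep(1, length(x))) {
  ord <- order(x)                      # sort units from lowest to highest amount
  x <- x[ord]
  w <- w[ord]
  data.frame(
    p = c(0, cumsum(w) / sum(w)),          # cumulative share of the weights
    L = c(0, cumsum(w * x) / sum(w * x))   # cumulative share of the weighted total
  )
}

# Toy data (made up): four units with equal weights
lp <- lorenz_points(c(40, 10, 30, 20))
print(lp)
```

Plotting `lp$L` against `lp$p` gives the curve; the further it sags below the 45-degree line, the more unequal the distribution.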


Or just look at the density or histogram:

hist(schools$V7)
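The histogram above treats every row equally. If we wanted a density that respects the same weights used for the Lorenz curve, base R's `density()` accepts a `weights` argument. The sketch below uses made-up toy numbers rather than the scraped data, and passes the bandwidth explicitly, since the default bandwidth rule does not take weights into account.

```r
# Toy data (made up for illustration): amounts and the weights attached to them
x <- c(11000, 12500, 13000, 14000, 18000)
w <- c(500, 3000, 1200, 800, 250)

# density() expects weights that sum to 1; the bandwidth comes from the
# unweighted rule of thumb bw.nrd0(), which is a simplifying assumption
d <- density(x, weights = w / sum(w), bw = bw.nrd0(x))
plot(d, main = "Weighted density (toy data)")
```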


We can also use scalar inequality indices, like the Gini index:

gini(schools$V7, schools$V3)
## [1] 0.09265
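The Gini index is one minus twice the area under the Lorenz curve, so it can be reproduced by hand with the trapezoid rule. The sketch below is a base-R illustration on toy data; `weighted_gini` is a hypothetical name, and I am assuming (not asserting) that it agrees with `gini()` from reldist on real data.

```r
# Hypothetical helper: weighted Gini index from the area under the Lorenz curve
weighted_gini <- function(x, w = rep(1, length(x))) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  p <- c(0, cumsum(w) / sum(w))            # cumulative weight share
  L <- c(0, cumsum(w * x) / sum(w * x))    # cumulative share of the total
  # Trapezoid rule: segment width times the average of its two heights
  area <- sum(diff(p) * (head(L, -1) + tail(L, -1)) / 2)
  1 - 2 * area
}

weighted_gini(c(5, 5, 5, 5))   # perfect equality
weighted_gini(c(0, 0, 0, 1))   # one unit holds everything
```

With four equally weighted units, the second call gives 0.75, the largest value possible for a sample of that size.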

For context, the Gini index of income inequality in Massachusetts in 2011 was about 0.48.

How has this changed over time? Let's examine the first year of data, 2005 (downloaded by hand, since the page's JavaScript doesn't play well with the XML library).

schools2005 <- read.csv("~/Documents/ma_schools2005.csv", header=FALSE)
schools2005$V7 <- as.numeric(gsub("[$,]", "", as.character(schools2005$V7)))
schools2005$V3 <- as.numeric(gsub(",", "", as.character(schools2005$V3)))
# Drop incomplete rows, using the 2005 data frame's own completeness mask
schools2005 <- schools2005[complete.cases(schools2005),]
# Restrict to entries that also appear in the 2011-12 data
schools2005 <- schools2005[schools2005$V1 %in% schools$V1,]
gini(schools2005$V7, schools2005$V3)
## [1] 0.1068

What about Lorenz dominance, that is, does one year's Lorenz curve lie on or above the other's everywhere?

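# Restrict the 2011-12 data to entries also present in 2005, so both curves cover the same set
schools <- schools[schools$V1 %in% schools2005$V1,]

Lc.educ <- Lc(schools$V7, schools$V3)
Lc2005.educ <- Lc(schools2005$V7, schools2005$V3)
plot(Lc.educ, main="Lorenz Curves for Educational Expenditures, 2011-12 (black) and 2005 (red)")
lines(Lc2005.educ, col="red")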


# Most negative gap between the 2011-12 and 2005 Lorenz ordinates
min(Lc.educ$L - Lc2005.educ$L)
## [1] -0.01161
plot(Lc.educ$L - Lc2005.educ$L, type="l", main="Difference in Lorenz Curve Ordinates, 2011-12 vs. 2005")


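The minimum difference is negative, so the 2011-12 curve does not lie on or above the 2005 curve everywhere, even though its Gini index is lower. So we can't say anything for sure about Lorenz dominance.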
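One caveat with subtracting the ordinates element by element: when the weights differ between years, the i-th points of the two curves may not sit at the same cumulative share `p`. A more careful check interpolates both curves onto a common grid first. The sketch below is a hypothetical helper, not part of the original analysis. It assumes each curve is a list with components `p` and `L` (the component names used by `Lc()` objects above), and the toy curves are made up.

```r
# Hypothetical helper: compare two Lorenz curves on a common grid of shares
lorenz_compare <- function(lc_a, lc_b, grid = seq(0, 1, by = 0.01), tol = 1e-8) {
  # Linear interpolation of each curve at the same cumulative shares
  La <- approx(lc_a$p, lc_a$L, xout = grid, rule = 2)$y
  Lb <- approx(lc_b$p, lc_b$L, xout = grid, rule = 2)$y
  d <- La - Lb
  if (all(abs(d) <= tol)) {
    "identical"
  } else if (all(d >= -tol)) {
    "a dominates b"      # a is never below b
  } else if (all(d <= tol)) {
    "b dominates a"      # b is never below a
  } else {
    "curves cross"       # the difference changes sign: no dominance
  }
}

# Toy curves (made up for illustration)
a  <- list(p = c(0, 0.5, 1),  L = c(0, 0.4, 1))
b  <- list(p = c(0, 0.5, 1),  L = c(0, 0.2, 1))
cc <- list(p = c(0, 0.25, 1), L = c(0, 0.15, 1))

lorenz_compare(a, b)    # a lies above b throughout
lorenz_compare(a, cc)   # a is above cc at p = 0.25 but below it at p = 0.5
```

On the real data, something like `lorenz_compare(Lc.educ, Lc2005.educ)` would report whether the curves cross on the chosen grid, up to the resolution of that grid.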