library(tidyverse)
library(readxl)
library(ggalt)
library(scales)
library(ineq)
The salary data comes from the Texas Tribune Government Salary Explorer website. I extracted the police-department-specific data from each city’s data set and read it into R, keeping all employees of these police departments, not just uniformed officers. The non-uniformed, or civilian, employees include administration staff, human resources, accounting, etc. I feel these employees are a vital part of their respective police departments and should be represented in this analysis.
SA_police.data = read_excel("/Users/uax742/Documents/San Antonio Police.xlsx")
Houston_police.data = read_excel("/Users/uax742/Documents/Houston Police.xlsx")
Austin_police.data = read_excel("/Users/uax742/Documents/Austin Police.xlsx")
Dallas_police.data = read_excel("/Users/uax742/Documents/Dallas Police.xlsx")
FW_police.data = read_excel("/Users/uax742/Documents/Fort Worth Police.xlsx")
The data sets were then cut down to just salaries and job titles and combined into one data set, with a third column added to identify the city.
# Keep only salary and job title, renaming each city's columns to common names
SA_salary.only <- SA_police.data %>%
  select(salary = `FY16 ANNUAL SALARY2`, jobtitle = `JOB TITLE`)
Houston_salary.only <- Houston_police.data %>%
  select(salary = `Annual Salary`, jobtitle = `Title`)
Austin_salary.only <- Austin_police.data %>%
  select(salary = `Annual Salary`, jobtitle = `Title`)
Dallas_salary.only <- Dallas_police.data %>%
  select(salary = `Annual Salary`, jobtitle = `Job Code Description`)
FW_salary.only <- FW_police.data %>%
  select(salary = `Annual Rt`, jobtitle = `Job`)
# Tag each data set with its city before stacking them
SA_salary.only <- mutate(SA_salary.only, city="San Antonio")
Houston_salary.only <- mutate(Houston_salary.only, city="Houston")
Austin_salary.only <- mutate(Austin_salary.only, city="Austin")
Dallas_salary.only <- mutate(Dallas_salary.only, city="Dallas")
FW_salary.only <- mutate(FW_salary.only, city="Fort Worth")
Cities_salary <- bind_rows(SA_salary.only, Houston_salary.only, Austin_salary.only, Dallas_salary.only, FW_salary.only)
To summarize the data, I looked at the average annual salary and the salary distribution of Texas police departments by city. I calculated the Gini coefficient for each city’s police department and visualized the underlying salary distributions using Lorenz curves. My aim is to compare the San Antonio police department to the other four largest departments in the state of Texas.
Cities_salary %>%
group_by(., city) %>%
summarise(., jobtitle = n(), MeanAnnual=mean(salary, na.rm=TRUE)) %>%
print.data.frame(., digits=3)
## city jobtitle MeanAnnual
## 1 Austin 2508 77588
## 2 Dallas 3778 62287
## 3 Fort Worth 2345 63364
## 4 Houston 7371 62966
## 5 San Antonio 3013 59681
The average annual salary for the San Antonio police department is in line with those of the other large Texas cities’ police departments, with the exception of Austin, whose average is considerably higher than the rest. The two police departments with the fewest employees, Austin and Fort Worth, have the two highest average annual salaries, although Fort Worth’s average is much closer to those of the other three departments than it is to Austin’s.
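To put a number on how far apart the averages are, the means printed above can be compared directly. This is a minimal sketch that retypes the rounded means from the summary table; the names `mean_salary` and `pct_above_sa` are my own and not part of the original analysis.

```r
# Mean annual salaries copied from the summary table above (rounded values)
mean_salary <- c(Austin = 77588, Dallas = 62287, `Fort Worth` = 63364,
                 Houston = 62966, `San Antonio` = 59681)

# Percent by which each department's mean exceeds San Antonio's
pct_above_sa <- 100 * (mean_salary / mean_salary["San Antonio"] - 1)
round(sort(pct_above_sa, decreasing = TRUE), 1)
```

On these figures, Austin’s mean is about 30 percent above San Antonio’s, while the other three departments are within roughly 4 to 6 percent.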
Cities_salary %>%
ggplot(., aes(city, salary)) +
geom_boxplot() + scale_y_continuous(labels = comma) +
coord_flip() +
labs(title ="Texas Police Department Salary Distributions")
The San Antonio police department’s salary distribution is similar to those of the Houston and Dallas police departments. The Austin and Fort Worth distributions are more spread out, indicating a larger variability in salaries. The largest outliers for the San Antonio and Fort Worth police departments are their respective police chiefs. The Houston police department’s data lists two police chiefs, most likely because it has over 7,300 employees, about twice as many as Dallas, the next largest department, and more than twice as many as each of the other three. The Austin and Dallas police departments’ data sets did not list a police chief. The highest paid employee of the Austin PD is a lab director, and the highest paid employee of the Dallas PD is an assistant police chief.
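The points drawn beyond the whiskers are flagged by a rule of thumb, not by job title. As far as I understand the ggplot2 defaults, `geom_boxplot()` marks values that lie more than 1.5 times the interquartile range outside the first and third quartiles. Below is a minimal sketch of that rule on made-up salaries; the function name `flag_outliers` and the toy numbers are hypothetical and are not taken from the data.

```r
# Hypothetical helper: return the values a 1.5 * IQR rule would flag
flag_outliers <- function(x, coef = 1.5) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # first and third quartiles
  fence <- coef * (q[2] - q[1])                   # 1.5 times the IQR
  x[x < q[1] - fence | x > q[2] + fence]
}

# Made-up salaries with one very high earner
toy_salaries <- c(40000, 50000, 55000, 60000, 65000, 70000, 250000)
flag_outliers(toy_salaries)
```

Only the 250,000 value falls outside the fences here, which is the same way a single chief’s salary ends up as an isolated point on the boxplot.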
City_gini = matrix(c(0.1507, 0.1486, 0.1509, 0.1779, 0.2310), ncol = 5, byrow = TRUE)
colnames(City_gini) = c("San Antonio ", "Dallas ", "Houston ", "Austin ", "Fort Worth")
rownames(City_gini) = c("Gini")
City_gini = as.table(City_gini)
City_gini
## San Antonio Dallas Houston Austin Fort Worth
## Gini 0.1507 0.1486 0.1509 0.1779 0.2310
The Gini coefficient is a measure of statistical dispersion intended to represent income distribution, and is the most commonly used measurement of inequality. It ranges from 0 (everyone earns the same) to 1 (one person earns everything). The San Antonio police department’s Gini coefficient is the second lowest among the five Texas departments and indicates a high level of income equality within the department. The Dallas and Houston coefficients are nearly identical to San Antonio’s, and Austin’s is only somewhat higher, so these departments also show fairly equal pay. The Fort Worth police department’s coefficient is noticeably higher than those of the other four departments, but is still relatively low in terms of Gini coefficients.
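To make the coefficient less of a black box, here is a minimal base R sketch of one common way to compute it from sorted values. The helper `gini_manual` and the toy vectors are hypothetical and are not part of the analysis above; I believe this matches the uncorrected Gini that `ineq()` reports, but treat that as an assumption rather than a guarantee.

```r
# Hypothetical helper: Gini coefficient from the rank-weighted sum of sorted values
gini_manual <- function(x) {
  x <- sort(x[!is.na(x)])        # drop missing values, sort ascending
  n <- length(x)
  ranks <- seq_len(n)
  sum((2 * ranks - n - 1) * x) / (n * sum(x))
}

# Made-up examples: identical salaries versus one person earning everything
equal_pay  <- rep(50000, 4)
one_earner <- c(0, 0, 0, 100000)

gini_manual(equal_pay)
gini_manual(one_earner)
```

Identical salaries give a coefficient of 0, while the four-person group with a single earner gives 0.75, the largest value possible for a group of four under this formula.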
Cities_salary %>%
group_by(., city) %>%
summarise(., Gini=ineq(salary, type="Gini")) %>%
ggplot(., aes(reorder(city, Gini), Gini)) +
geom_lollipop(point.colour="blue", point.size=3) +
coord_flip() +
labs(title="Texas Police Department Salary Inequality")
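A Lorenz curve is a graphical representation of the distribution of income: it plots the cumulative share of total salary against the cumulative share of employees, ordered from lowest to highest paid. The diagonal line represents perfect equality, a Gini coefficient of 0. The further the curve bows below the diagonal, the higher the Gini coefficient.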
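The points on a Lorenz curve are simple to build by hand, which also shows how the curve relates to the Gini coefficient. The following is a minimal base R sketch on made-up salaries; the helper `lorenz_points`, the column names, and the toy numbers are hypothetical, and the area is approximated with trapezoids between the points.

```r
# Hypothetical helper: cumulative employee share versus cumulative salary share
lorenz_points <- function(x) {
  x <- sort(x[!is.na(x)])        # lowest paid first
  data.frame(
    pop_share    = c(0, seq_along(x) / length(x)),
    salary_share = c(0, cumsum(x) / sum(x))
  )
}

# Made-up salaries for four employees
toy_salaries <- c(20000, 30000, 50000, 100000)
pts <- lorenz_points(toy_salaries)
pts

# Area under the curve by the trapezoid rule, then Gini = 1 - 2 * area
area <- sum(diff(pts$pop_share) *
              (head(pts$salary_share, -1) + tail(pts$salary_share, -1)) / 2)
gini_from_curve <- 1 - 2 * area
gini_from_curve
```

In this toy case the lowest-paid half of the employees receives a quarter of the total salary, and the coefficient works out to 0.325.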
plot(Lc(SA_salary.only$salary), col = "blue", lwd = 2, sub = "San Antonio")
par(mfrow = c(2,2))
plot(Lc(Houston_salary.only$salary), col = "blue", lwd = 2, sub = "Houston")
plot(Lc(Austin_salary.only$salary), col = "blue", lwd = 2, sub = "Austin")
plot(Lc(Dallas_salary.only$salary), col = "blue", lwd = 2, sub = "Dallas")
plot(Lc(FW_salary.only$salary), col = "blue", lwd = 2, sub = "Fort Worth")
Overall, the San Antonio police department compares favorably with the other four large police departments in the state of Texas. The average annual salary for the San Antonio police department is the lowest of the five departments, but not by much. The salary distribution for the San Antonio PD is similar to those of Houston and Dallas; together, these are the three largest departments in the state. The San Antonio police department’s Gini coefficient is the second lowest among the departments, indicating a high level of equality in terms of salary across the department.