2023-02-24

R Code Information

Here are the libraries and setup for the R script. This project was made entirely using R, Rmarkdown, as well as data visualization libraries, ggplot and plotly.

library(ggplot2)
library(plotly)
## "path" is the file path for the Maricopa Transaction Records.csv file
df <- read.csv(path, header=TRUE,
   sep=",")

Maricopa’s Public Finances

Maricopa County, the 4th largest county in the US by population, is the largest county in Arizona and the one of the largest growing counties in the nation. Maricopa shows their openness as a government body by allowing access to their financial records every year in a few different ways. One source of data selected for this research is a spreadsheet of all transaction records from the County of Maricopa per fiscal year (they can be seen here). For the year 2020, there are 913434 transactions and 40 variables for each transactions. In 2019, prior to the 2020 fiscal year, a document was released by the county of Maricopa called, “The 2020 Citizens’ Budget Report” which is meant to explain to citizens how Maricopa will handle the county finances for the 2020 fiscal year (that document can be viewed here).

COVID-19

The current year is 2023, but the whole world is still feeling the effects of the blind-siding, global pandemic that was COVID-19. The 2020 Citizens’ Budget Report states that the budget is for the 2020 fiscal year which begins in July of 2019 and goes to the end of June 2020 and was a plan for the upcoming year while the County Transaction Report was finalized after the conclusion of the 2020 fiscal year, meaning that the two documents were published before and after the initial rise of COVID-19, respectively. The quick and complete change in society that occurred during the year was bound to have effects economically, and therefore, requiring some on-the-spot budget changes and theoretically, inconsistencies between these two documents.

Claims from 2020 CBR


Claims from 2020 CBR (cont.)

An important thing to remember as we analyze the discrepancies between these two documents is that the nature of the CBR is to simplify and generalize a lot of data for the sake of public understanding and the data extracted from that document isn’t in the most usable form, but can still be evaluated. For example, a few of the lists shown in the previous slide seem to end in a comma, indicating that they are incomplete, or missing some values. Because these lists aren’t complete and I don’t feel qualified to assign departments to their correct groupings myself, all evaluations will be done using only the departments given in the document and the results will be interpreted accordingly. In addition to the evaluation of the groups, the CBR states the largest expenditure is Public Safety, and the largest sources of revenue are property taxes and state-shared sales taxes.

Data from Transaction Report

For every transaction recorded, these are 40 different metrics associated with that specific transaction. Most of them are redundant, null, or not important to this evaluation so we will condense the data as much as possible by eliminating those columns. One phenomena that occurred during parsing the raw data was that certain transactions that seem to be duplicates that were written in parenthesis were parsed as negative, so we will remove those as well.

## Isolate the three columns we need 
## in a new Data Frame called RE_Data
RE_Data = data.frame(df[8],df[18],df[38])

## Eliminating duplicate by only using positive data;
RE_Data = RE_Data[which(RE_Data$POSTING.AMOUNT>=0),]

Visualizing Raw Data

The new data has 816202 transactions and 3 metrics to evaluate.

We can create one comprehensive list for all posting amounts by making all “RV” transactions positive and all “EX” transactions negative and then we can plot a histogram to look at the distribution of data.

## we will create a vector PosRev which looks for every 
## revenue transaction and takes their posting amount as a positive
PosRev = abs(RE_Data[which(RE_Data[,2] == "RV"),3])

## Vector NegExp will look for all expense transactions 
## and takes their posting amount as a negative
NegExp = -1*abs(RE_Data[which(RE_Data[,2] == "EX"),3])

Signed Histogram

Signed Histogram (cont.)

With 816202 transactions, the scale of the previous histogram was a little off, but with a tighter x-axis, it can be seen that the data actually looks normally distributed.

Transformations

After looking at the raw data and the claims from the Budget Report, the data will need to cleaned and transformed further to be oriented around the County Departments. To do this, we can isolate the column, “DEPARTMENT.NAME” and remove any duplicates.

## DEPT will be a vector of department names
DEPT = RE_Data[!duplicated(RE_Data$DEPARTMENT.NAME),1]
head(DEPT, 9)
## [1] "SHERIFF"               "LEGAL DEFENDER"        "JUVENILE PROBATION"   
## [4] "SUPERIOR COURT"        "TRANSPORTATION"        "CORRECTIONAL HEALTH"  
## [7] "ADULT PROBATION"       "ENTERPRISE TECHNOLOGY" "NON DEPARTMENTAL"

The total number of departments in Maricopa County is 61 departments.

New Data

Now, a new Data Frame can be made by iterating through the list of departments, separating the revenue and expense transactions, summing their posting amounts, and computing the difference between those two. The first 10 entries in the new data frame are:

##                    Dept.     Total RV   Total EX        Net
## 1                SHERIFF 1.386720e+08  506306942 -367634959
## 2         LEGAL DEFENDER 4.117144e+04   15134822  -15093651
## 3     JUVENILE PROBATION 6.950355e+06   72692302  -65741947
## 4         SUPERIOR COURT 2.148457e+07  140445231 -118960657
## 5         TRANSPORTATION 2.534099e+08  288782180  -35372286
## 6    CORRECTIONAL HEALTH 1.616734e+07   92978902  -76811559
## 7        ADULT PROBATION 1.398710e+07  135381411 -121394308
## 8  ENTERPRISE TECHNOLOGY 3.393221e+07  113209949  -79277738
## 9       NON DEPARTMENTAL 2.702350e+09 1046832137 1655518034
## 10         PUBLIC HEALTH 6.283399e+07   85331699  -22497709

Plotting the Data

Each pair of red and blue bar represents one department’s expenses and revenue, respectively

Maricopa Total Revenue vs Expense

Maricopa counties total recorded income based on the transaction records is $4.115 billion and the total recorded spending by the county is $3.571 billion.

Total Revenue by Department

Net Income by Department

By calculating the difference between revenue and expense, we get the net income. Profitable departments are in blue and departments that lose money are in red.

Net Gains

Net Gains (cont.)

Maricopa says that the largest sources of revenue are state-shared sales taxes and property taxes and they can be assumed to be placed in the “NON DEPARTMENTAL”

The three departments listed with the highest revenue are true to the statements from the Citizens’ Budget report.

Net Losses

Net Losses (cont.)

There are many more departments operating at a loss than profiting. The range of values for each department is much more evenly spread across all departments.

Addressing the Claims

The claims on the 2020 Citizens’ Budget Report gave percents of budget breakdown for different groups of departments such as “Highways & Streets”, “Education, Culture, and Recreation”, “Elected Officials and Essential Support”, “Health, Welfare, and Sanitation”, and “Public Safety”. The stated percentages for each of these groups were 8.16%, 1.69%, 14.82%, 22.56%, and 52.77%, respectively. Each grouping also gives a few departments that are included in that group. Because only about half the departments were included in total, we will use the true departments explicitly stated in each group. An assumption to be made is that departments will use their entire allotted budget and any revenue is invested back into the department somehow. Therefore, we can use the net income for our calculations of budget.

Calculated Groups of Departments

Results

There is clearly some inconsistencies between the two documents, as seen by the % error between these groupings, but again, this is only the data from the departments that were explicitly shown in each group so about 30 other departments are not shown. The average percent error between these 5 groups is 49.046%.

Specific Departments

Specifically in Public Safety, the CBR gave number for specific departments and we can look at those as well.

For these Specific Departments, the percent errors are extremely small, in comparison to the less-trustworthy group data, with an average of 1.968%.

Analysis

It is clear with the data for the specific departments that the Citizens’ Budget Report was extremely accurate. The department grouping data did not have such conclusive evidence, but the fact that the groups of departments were not able to completed accurately means that those results can be taken with a grain of salt. This also shows that COVID-19 didn’t have an immediate financial impact on Maricopa County in 2020. As Maricopa really began feeling the effects of COVID-19 around March and the budget was intended til the end of June, it could be that the government waited until the new budget implementation to make any drastic changes.

Conclusion

In conclusion, the 2020 Citizens’ Budget Report seems to be very consistent with the transaction reports from 2020. The numbers aren’t exact, but the general trends of the highest spending and earning departments are extremely similar between the two sources, as seen by the 5 departments that were explicitly stated in the CBR. This data not only helps us understand more about handling a pandemic as a nation, but also is great insight into the inner-workings of a smaller form of American government.