" Montgomery is a city in Hamilton County, Ohio, United States, settled in 1796. The town was a coach stop on the Cincinnati-Zanesville Road, later known as the Montgomery Pike, with an inn, two taverns, a grist mill and a carding mill to process its agricultural products.
The colorful landscape, tree-lined streets and miles of brick paver sidewalks in Montgomery lend themselves to the charming “Americana” feel of this well-preserved, historic community. It is home to wonderfully quaint old buildings that are a reminder of a quieter, slower time in the community. The treasured Montgomery Heritage District, located in and around Montgomery Road, is a diversified and unique destination for shopping, dining, and entertainment with storefront shops, nationally-recognized and fine ethnic restaurants, galleries, services and performance venues. " - “http://www.montgomeryohio.org/
This project is to answer a few questions basis the spend data of the government. The dataset includes County spending data for Montgomery County government.
The idea is to analyse the data, see the trends in the spend over the years, the mode of payemnts being used the most, the departments which spend the most, ROC and CV of the transactions made by all departments and other interesting facts about the governemnet spend data.
After spending a lot of time understanding, validating and cleaning the data; data is analysed to answer different questions on the spending of all the departments of the government of Montgomery county.
The data used is the County spending data for Montgomery County government. It does not include agency spending. The data is cleaned to pick the columns that are required to make it meaningful and relevant for the analysis as all the data fields are not required for the answers that the analysis demand.
library(readr)
library(dplyr)
library(tibble)
library(ggplot2)
library(knitr)
library(DT)
library(scales)
readr - The goal of readr is to provide a fast and friendly way to read tabular data into R.
dplyr- It is used for data manipulation. It provides some great, easy-to-use functions that are very handy when performing exploratory data analysis and manipulation.
tibble - Tibbles are a modern take on data frames. They keep the features that have stood the test of time, and drop the features that used to be convenient but are now frustrating (i.e. converting character vectors to factors).
ggplot 2 - This packageoffers a powerful graphics language for creating elegant and complex plots. Grouping can be represented by color, symbol, size, and transparency. It is a wonderfulfor tool for visualizing data.
knitr - Used to format the table using kable function
DT - It creates an HTML widget to display R data objects with DataTables.
scales - Used to scale axis in the plots
url <- "http://data.montgomerycountymd.gov/api/views/vpf9-6irq/rows.csv?accessType=DOWNLOAD"
county_spending <- read.csv(url,stringsAsFactors = TRUE)
county_spending <- county_spending %>% as_tibble()
| Column Name | Description | Data Type |
|---|---|---|
| Fiscal Year Period | Our fiscal year runs July 1st through June 30th. A period is equal to one month (Example: period 1 is July, period 2 is August, period 3 is September, etc.) | Number |
| Fiscal Year | The County’s fiscal year (such as 2015) represents a 12 month period used for annual financial reporting beginning July 1st through June 30th | Number |
| Service | High Organization level grouping, such as Public Safety and Community Development | Text |
| Department | The name of the Department | Text |
| Program | Subdivision of Departments made up of a primary service, function, or set of activities which address a specific responsibility or goal within an agency’s mission. (Example: Operations, Budget, Aquatics) | Text |
| Fund | Accounting device that the County uses to keep track of specific sources of funding and spending for particular purposes (Example: General Fund, Grant Fund; CIP Fund) | Text |
| Category | Represents how data is presented on the County’s financial statements (expense, revenue, liability, asset, owner’s equity or fund balance) | Text |
| Expense Category | The highest level grouping of account codes that the County has spent funds on. For example, the “Travel” expense | Text |
| Account Code | A unique account identification number (related to Account Name field) | Text |
| Invoice Description | Lowest level of description regarding how the County expends monies (Example: local travel) | Text |
| Vendor | Supplier name | Text |
| Vendor Number | A unique supplier identification number | Text |
| Zip | The zip code for the supplier’s address to which the supplier’s payment is sent | Text |
| Contract Number | A unique number used to identify a County contract | Text |
| PO Number | A unique number used to identify a purchase order issued to a vendor | Text |
| PO Line | The line number for a specific item or service that is being purchased on the Purchase Order (PO). (For example, if five different items are purchased, then there will be five PO lines.) | Text |
| Invoice Number | A number assigned to an invoice, used to uniquely identify a supplier’s invoice | Text |
| Invoice Line | A line number from a supplier invoice, for the County’s purchase of a good or a service | Text |
| Invoice Distribution Line | The line number indicating which Department(s) or Program(s) the invoice is charged to. (For example, if five different departments are being charged for an item, then there will be five distribution lines) | Text |
| Amount | The dollar amount associated with an invoice distribution line | Number |
| Invoice Date | The Invoice date is the date listed in ERP as the invoice date and may contain a null value | Date & Time |
| Payment Method | Type of payment, such as check, EFT, credit card (MCG JPM SUA) or wire transactions | Text |
| Payment Date | The date the check was written or the payment was made | Date & Time |
| Payment Number | The identifying number on the paper check, electronic payment, or credit card transactions | Number |
| Payment Status | Status of a payment through the banking system, such as Negotiable, Void, or Reconciled | Text |
The data is last updated on October 25, 2016 and is updated on a quarterly basis. It has 25 rows and 837606 rows.
The data is owned by MCG ESB Service and updated by Department of Finance.
The data has 588654 blank values.
county_spending[county_spending == ''] <- NA
na_count <-sapply(county_spending, function(y) sum(length(which(is.na(y)))))
na_count <- data.frame(na_count)
kable(na_count)
| na_count | |
|---|---|
| Fiscal.Year.Period | 0 |
| Fiscal.Year | 0 |
| Service | 0 |
| Department | 0 |
| Program | 0 |
| Fund | 0 |
| Category | 0 |
| Expense.Category | 0 |
| Account.Code | 0 |
| Invoice.Description | 0 |
| Vendor | 0 |
| Vendor.Number | 0 |
| Zip | 82157 |
| Contract.Number | 708046 |
| PO.Number | 482044 |
| PO.Line | 588654 |
| Invoice.Number | 0 |
| Invoice.Line | 0 |
| Invoice.Distribution.Line | 0 |
| Amount | 0 |
| Invoice.Date | 1609 |
| Payment.Method | 0 |
| Payment.Date | 0 |
| Payment.Number | 0 |
| Payment.Status | 0 |
While importing the data the data was converted to factor values by keeping stringasfactor option TRUE so it needs to be checked if all the data fields are in the right format.
kable(str(county_spending))
## Classes 'tbl_df', 'tbl' and 'data.frame': 837606 obs. of 25 variables:
## $ Fiscal.Year.Period : int 6 11 8 12 6 6 10 7 6 5 ...
## $ Fiscal.Year : int 2014 2014 2015 2014 2016 2016 2015 2014 2014 2014 ...
## $ Service : Factor w/ 11 levels "CIP","Culture and Recreation",..: 6 6 6 6 10 10 6 2 6 6 ...
## $ Department : Factor w/ 47 levels "Board of Appeals",..: 23 23 28 23 18 18 28 45 39 28 ...
## $ Program : Factor w/ 314 levels "24-Hours Crisis Center",..: 126 126 300 126 115 268 300 71 227 300 ...
## $ Fund : Factor w/ 33 levels "Bethesda Parking",..: 9 9 15 9 11 11 15 3 12 15 ...
## $ Category : Factor w/ 2 levels "Not Defined",..: 2 2 1 2 2 2 1 2 2 1 ...
## $ Expense.Category : Factor w/ 58 levels "Advertising",..: 22 22 25 50 36 40 25 9 9 25 ...
## $ Account.Code : int 65624 65628 66010 64100 62016 63634 66010 60537 60076 66010 ...
## $ Invoice.Description : Factor w/ 681 levels "800 Line Charges",..: 4 461 56 325 130 416 56 144 515 56 ...
## $ Vendor : Factor w/ 16627 levels "1000822890","1001219301",..: 8529 15662 5128 4136 4889 8304 10298 7890 540 5128 ...
## $ Vendor.Number : num 10759 54288 5645 40341 34954 ...
## $ Zip : Factor w/ 3331 levels "","00000","01040",..: 1337 120 44 1191 1052 1337 1753 1242 25 44 ...
## $ Contract.Number : Factor w/ 3821 levels "","0363200064AA",..: NA NA NA NA NA NA NA NA 3368 NA ...
## $ PO.Number : Factor w/ 54690 levels "","1000015","1000052",..: 8759 11719 NA NA 13102 NA NA 9272 54525 NA ...
## $ PO.Line : num 1 3 NA NA 1 NA NA 1 NA NA ...
## $ Invoice.Number : Factor w/ 656765 levels "_0000025034_010114-123114",..: 81365 200522 518897 481074 653925 559184 560850 566535 389722 518030 ...
## $ Invoice.Line : num 1 3 1 2 1 1 3 1 1 1 ...
## $ Invoice.Distribution.Line: num 1 1 1 1 1 1 1 1 1 1 ...
## $ Amount : num 1576281 25616 27394 195 148 ...
## $ Invoice.Date : Factor w/ 1463 levels "","01/01/2013",..: 1403 552 158 659 1376 1402 348 51 1306 1268 ...
## $ Payment.Method : Factor w/ 4 levels "CHECK","EFT",..: 2 2 2 1 3 1 1 1 1 2 ...
## $ Payment.Date : Factor w/ 831 levels "01/01/2014","01/02/2014",..: 809 292 96 344 13 812 216 48 771 741 ...
## $ Payment.Number : num 368883 380202 403394 2053265 6009472 ...
## $ Payment.Status : Factor w/ 2 levels "NEGOTIABLE","RECONCILED": 2 2 2 2 1 2 2 2 2 2 ...
We notice that we need to change the format of date type of columns.
| Column Name | Description | Data Type |
|---|---|---|
| Fiscal Year Period | Our fiscal year runs July 1st through June 30th. A period is equal to one month (Example: period 1 is July, period 2 is August, period 3 is September, etc.) | Number |
| Fiscal Year | The County’s fiscal year (such as 2015) represents a 12 month period used for annual financial reporting beginning July 1st through June 30th | Number |
| Service | High Organization level grouping, such as Public Safety and Community Development | Text |
| Department | The name of the Department | Text |
| Program | Subdivision of Departments made up of a primary service, function, or set of activities which address a specific responsibility or goal within an agency’s mission. (Example: Operations, Budget, Aquatics) | Text |
| Fund | Accounting device that the County uses to keep track of specific sources of funding and spending for particular purposes (Example: General Fund, Grant Fund; CIP Fund) | Text |
| Expense Category | The highest level grouping of account codes that the County has spent funds on. For example, the “Travel” expense | Text |
| Amount | The dollar amount associated with an invoice distribution line | Number |
| Invoice Date | The Invoice date is the date listed in ERP as the invoice date and may contain a null value | Date & Time |
| Payment Method | Type of payment, such as check, EFT, credit card (MCG JPM SUA) or wire transactions | Text |
The maximum spend in the data $ 2.6110^{8} in a single transaction. The minimum spend in the data $ -6.235908110^{5} in a single transaction. The Average spend per transaction is 1.480987110^{4}.
In this section, the analysis on the CV(Charge Volume) is done to answer various questions and to provide insights through the observations.
Insight
Department of Countrywide Generic and Debt Services are > the ones which have the highest spend over all the years.
Now we look at the top 10 departments which have the hight spend in the latest Fiscal Year.
The data for the Fiscal Year 2017 is excluded as it is the rolling year and the data in this year is incomplete.
Insight
The spedning trends of all the Departments have been almost uniform and according to the budget allocated. There are no sudden drop or rise for almost all the top spending departments over the years we have tested the data for.
Insight
The major spend is made on the contract and services followed by Insurance , Debt service and Liquor Purchases. The trend is similar over the years and thus we can conclude that the spending of the Montgomary government is budgeted, well planned and fixed.
This section, aims for thehe analysis on the ROC(number of transactions) made by the departments and to see the if they follow the similar treand as the CV.
Insight
The top 10 departments making the maximum transactions differ from the ones with the maximum spend. The trend in the number of transactions over the years also almost constant for almost all departments except Housing and Community Affairs where the number of transactions are increasing over the years.
We now compare all the methods of payments used in this section.
Insight
The check is the most widely used mode of payment in the year 2014 and 2015 whereas 2016 and 2017 show a change of trend with Electronic Fund Transfer being used more than checks
The data very clearly shows the smooth working of Montgomery County Government from the year 2014 to till date. The CV and ROC trends are constant over the years and the same departments carry the highest rank in spending and transacting from the past 4 years which very clearly indicates the on budget and planned nature of the governmnet expenses.
The methodology was simply to compare the highest spending and transacting departments over the years and check the uniformity in the data. The trends of the top 10 departments ( ROC and CV) were constant and thus gave conformity on the smooth functioning of the government.