Taxes are not a topic many people are keen to talk about, especially now, when they seem to heighten tension within communities. People often complain that a large portion of each paycheck is siphoned off by the state and federal governments. As a result, many hold tightly to their money and save what they can without contributing to the communities around them. That leaves the flow of money through Maryland's economy stagnant, because without continued spending there is little room for revitalization.
That is where the Community Investment Tax Credits (CITC) program comes in. It offers tax incentives, in the form of tax credits, to stimulate charitable giving: donors support projects in their community and in return reduce their taxes by a certain percentage. The program aims to encourage people to care for their communities while stimulating Maryland's overall economy, promoting further community development in infrastructure and the opening of new programs to help other communities in need.
Using the Community Investment Tax Credits dataset from the Maryland Department of Housing and Community Development, I plan to answer the following questions. Which projects (categorical), in which counties (categorical), produce the highest tax credits (quantitative)? And what award amounts (quantitative) are given in each county (categorical) for each project type (categorical)?
I chose this data set and topic because my godfather, despite being very wealthy, is keen on finding better ways to boost the economy of the community he lives in, and he researches which areas aid will be most effective in. Investigating which areas have more ongoing projects with higher tax incentives and award amounts gives a picture of how effective the program is in each area, both in Baltimore City, where he currently lives, and outside it. This inspired me to conduct the research to help interested contributors by pinpointing which areas are the best candidates for added aid, maximizing effectiveness and benefit on both ends.
# import needed libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.1 ✔ readr 2.2.0
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.3 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
library(ggplot2)
library(treemapify)
Warning: package 'treemapify' was built under R version 4.6.1
library(alluvial)
library(ggalluvial)
# read in the data set
taxdata <- read_csv("Community_Investment_Tax_Credits_20260627.csv")
Rows: 129 Columns: 18
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (11): ProjNum, Awardee, ProjName, County, Region, ApplicantType, ProjTyp...
dbl (4): ProjID, FY, Y_Coord, X_Coord
num (3): AwardAmount, TaxCredited, TaxCredBal
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# view data
head(taxdata)
# A tibble: 6 × 18
ProjID ProjNum FY Awardee ProjName County Region ApplicantType ProjType
<dbl> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 85603 CITC-2025-… 2025 Anacos… Award Y… Princ… Washi… Arts, Cultur… Capital
2 93250 CITC-2026-… 2026 Arts E… Award Y… Balti… Balti… Education an… Operati…
3 85801 CITC-2025-… 2025 Baltim… Award Y… Balti… Balti… Arts, Cultur… Operati…
4 85914 CITC-2025-… 2025 Baltim… Award Y… Balti… Balti… Education an… Operati…
5 92865 CITC-2026-… 2026 Baltim… Award Y… Balti… Balti… Education an… Operati…
6 85579 CITC-2025-… 2025 BioTec… Award Y… Balti… Balti… Job and Self… Operati…
# ℹ 9 more variables: ProjectDesc <chr>, AwardAmount <dbl>, TaxCredited <dbl>,
# TaxCredBal <dbl>, DonationLink <chr>, WebSite <chr>, Y_Coord <dbl>,
# X_Coord <dbl>, Location <chr>
summary(taxdata)
ProjID ProjNum FY Awardee ProjName
Min. :85179 Length :129 Min. :2025 Length :129 Length :129
1st Qu.:85631 N.unique :129 1st Qu.:2025 N.unique :105 N.unique :129
Median :92275 N.blank : 0 Median :2026 N.blank : 0 N.blank : 0
Mean :89293 Min.nchar: 19 Mean :2026 Min.nchar: 10 Min.nchar: 30
3rd Qu.:92645 Max.nchar: 29 3rd Qu.:2026 Max.nchar: 85 Max.nchar:124
Max. :93284 Max. :2026
County Region ApplicantType ProjType
Length :129 Length :129 Length :129 Length :129
N.unique : 22 N.unique : 7 N.unique : 8 N.unique : 4
N.blank : 0 N.blank : 0 N.blank : 0 N.blank : 0
Min.nchar: 4 Min.nchar: 14 Min.nchar: 28 Min.nchar: 7
Max.nchar: 15 Max.nchar: 19 Max.nchar: 46 Max.nchar: 21
ProjectDesc AwardAmount TaxCredited TaxCredBal
Length :129 Min. : 5000 Min. : 0 Min. : 750
N.unique :128 1st Qu.:20000 1st Qu.: 0 1st Qu.: 8837
N.blank : 0 Median :25000 Median : 0 Median :12500
Min.nchar:101 Mean :26132 Mean : 6187 Mean :14728
Max.nchar:255 3rd Qu.:30000 3rd Qu.:10250 3rd Qu.:23750
Max. :60000 Max. :49250 Max. :49000
NAs :66
DonationLink WebSite Y_Coord X_Coord
Length :129 Length :129 Min. :38.21 Min. :-78.77
N.unique :117 N.unique :105 1st Qu.:38.98 1st Qu.:-76.91
N.blank : 0 N.blank : 0 Median :39.29 Median :-76.61
Min.nchar: 18 Min.nchar: 17 Mean :39.18 Mean :-75.48
Max.nchar:145 Max.nchar:145 3rd Qu.:39.32 3rd Qu.:-76.58
Max. :39.65 Max. : 76.64
Location
Length :129
N.unique :123
N.blank : 0
Min.nchar: 18
Max.nchar: 37
NAs : 1
# view data types
str(taxdata)
spc_tbl_ [129 × 18] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ ProjID : num [1:129] 85603 93250 85801 85914 92865 ...
$ ProjNum : chr [1:129] "CITC-2025-ATHAI-00078" "CITC-2026-ArtEvryDay-00180" "CITC-2025-BORRM-00112" "CITC-2025-BmrSqsh-00122" ...
$ FY : num [1:129] 2025 2026 2025 2025 2026 ...
$ Awardee : chr [1:129] "Anacostia Trails Heritage Area, Inc." "Arts Every Day, Inc." "Baltimore and Ohio Railroad Museum, Inc." "Baltimore SquashWise" ...
$ ProjName : chr [1:129] "Award Years 2025 - 2026: Commemoration of America 250 in Prince George's County" "Award Years 2026 - 2027: Phase 2 Baltimore Arts Leadership Project" "Award Years 2025 - 2026: Saving the American Freedom Train" "Award Years 2025 - 2026: SquashWise School Squash Program" ...
$ County : chr [1:129] "Prince George's" "Baltimore City" "Baltimore City" "Baltimore City" ...
$ Region : chr [1:129] "Washington Suburban" "Baltimore City" "Baltimore City" "Baltimore City" ...
$ ApplicantType: chr [1:129] "Arts, Culture and Historic Preservation" "Education and Youth Services" "Arts, Culture and Historic Preservation" "Education and Youth Services" ...
$ ProjType : chr [1:129] "Capital" "Operating" "Operating" "Operating" ...
$ ProjectDesc : chr [1:129] "Operating support to provide county-wide coordination and inclusive programming on history, civics, and service"| __truncated__ "Operating support for Phase 2 Baltimore Arts Leadership Project providing arts-focused civic engagement opportu"| __truncated__ "Operating support to restore the 20th Century American Freedom Train Number 1 for the U.S. 250th anniversary of"| __truncated__ "Operating support to increase the number of PE squash clinics accessible through our youth development program,"| __truncated__ ...
$ AwardAmount : num [1:129] 12500 10000 20000 21000 15000 10000 50000 20000 30000 20000 ...
$ TaxCredited : num [1:129] 575 0 12250 13053 0 ...
$ TaxCredBal : num [1:129] 11925 NA 7750 7947 NA ...
$ DonationLink : chr [1:129] "https://www.pgc250.org/" "https://artseveryday.org/donate" "https://www.borail.org/" "https://baltimoresquashwise.org/donate/" ...
$ WebSite : chr [1:129] "https://www.anacostiatrails.org/" "http://www.artseveryday.org" "https://www.borail.org/" "https://baltimoresquashwise.org/" ...
$ Y_Coord : num [1:129] 39 39.3 39.3 39.3 39.3 ...
$ X_Coord : num [1:129] -76.9 -76.6 -76.6 -76.6 -76.6 ...
$ Location : chr [1:129] "(38.952711, -76.940924)" "(39.30580916, -76.62806312)" "(39.285686, -76.6327)" NA ...
- attr(*, "spec")=
.. cols(
.. ProjID = col_double(),
.. ProjNum = col_character(),
.. FY = col_double(),
.. Awardee = col_character(),
.. ProjName = col_character(),
.. County = col_character(),
.. Region = col_character(),
.. ApplicantType = col_character(),
.. ProjType = col_character(),
.. ProjectDesc = col_character(),
.. AwardAmount = col_number(),
.. TaxCredited = col_number(),
.. TaxCredBal = col_number(),
.. DonationLink = col_character(),
.. WebSite = col_character(),
.. Y_Coord = col_double(),
.. X_Coord = col_double(),
.. Location = col_character()
.. )
- attr(*, "problems")=<pointer: 0x00000283aecf9ce0>
2. Clean the Data
# show only the rows with no NA values
# (the result is printed, not assigned, so taxdata itself is unchanged)
taxdata %>%
  filter(if_all(everything(), ~ !is.na(.)))
# A tibble: 62 × 18
ProjID ProjNum FY Awardee ProjName County Region ApplicantType ProjType
<dbl> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 85603 CITC-2025… 2025 Anacos… Award Y… Princ… Washi… Arts, Cultur… Capital
2 85801 CITC-2025… 2025 Baltim… Award Y… Balti… Balti… Arts, Cultur… Operati…
3 85579 CITC-2025… 2025 BioTec… Award Y… Balti… Balti… Job and Self… Operati…
4 85716 CITC-2025… 2025 Bluebi… Award Y… Balti… Balti… Education an… Capital
5 85261 CITC-2025… 2025 Charti… Award Y… Anne … Centr… Services for… Operati…
6 86266 CITC-2025… 2025 Cherry… Award Y… Balti… Balti… Housing and … Operati…
7 85205 CITC-2025… 2025 City L… Award Y… Balti… Balti… Housing and … Operati…
8 85509 CITC-2025… 2025 Commun… Award Y… Howard Centr… Services for… Capital…
9 85938 CITC-2025… 2025 Commun… Award Y… Balti… Centr… Housing and … Operati…
10 85745 CITC-2025… 2025 Commun… Award Y… Princ… Washi… Housing and … Operati…
# ℹ 52 more rows
# ℹ 9 more variables: ProjectDesc <chr>, AwardAmount <dbl>, TaxCredited <dbl>,
# TaxCredBal <dbl>, DonationLink <chr>, WebSite <chr>, Y_Coord <dbl>,
# X_Coord <dbl>, Location <chr>
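The summary output earlier reports 66 missing values in TaxCredBal and 1 in Location, which is consistent with the complete-case filter keeping only 62 of the 129 rows. Below is a minimal sketch of how one might count missing values per column before deciding whether to drop rows. The data frame `toy` is hypothetical and only mimics a few column names, and base R is used so the snippet runs without extra packages.

```r
# Hypothetical toy data that mimics four columns of the CITC dataset
toy <- data.frame(
  AwardAmount = c(12500, 10000, 20000, 21000),
  TaxCredited = c(575, 0, 12250, 13053),
  TaxCredBal  = c(11925, NA, 7750, 7947),
  Location    = c("(38.95, -76.94)", "(39.31, -76.63)", "(39.29, -76.63)", NA),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows with no missing value in any column (same idea as the if_all() filter)
toy_complete <- toy[complete.cases(toy), ]
print(nrow(toy_complete))
```

Counting first makes it visible when a single column is responsible for most of the dropped rows, in which case dropping that column, or filtering on only the columns an analysis needs, may keep more data.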
# first filter: Baltimore City, FY 2025, housing and community development, operating projects
taxdata_baltimore_city <- taxdata |>
  filter(County == "Baltimore City") |>
  filter(FY == 2025) |>
  filter(ApplicantType == "Housing and Community Development") |>
  filter(ProjType == "Operating")
head(taxdata_baltimore_city)
# A tibble: 6 × 18
ProjID ProjNum FY Awardee ProjName County Region ApplicantType ProjType
<dbl> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 86266 CITC-2025-… 2025 Cherry… Award Y… Balti… Balti… Housing and … Operati…
2 85205 CITC-2025-… 2025 City L… Award Y… Balti… Balti… Housing and … Operati…
3 86113 CITC-2025-… 2025 Health… Award Y… Balti… Balti… Housing and … Operati…
4 86153 CITC-2025-… 2025 Marian… Award Y… Balti… Balti… Housing and … Operati…
5 85988 CITC-2025-… 2025 Rebuil… Award Y… Balti… Balti… Housing and … Operati…
6 86072 CITC-2025-… 2025 St. Fr… Award Y… Balti… Balti… Housing and … Operati…
# ℹ 9 more variables: ProjectDesc <chr>, AwardAmount <dbl>, TaxCredited <dbl>,
# TaxCredBal <dbl>, DonationLink <chr>, WebSite <chr>, Y_Coord <dbl>,
# X_Coord <dbl>, Location <chr>
# second filter: FY 2025 operating projects outside Baltimore City,
# excluding housing and community development applicants
taxdata_not_baltimore_city <- taxdata |>
  filter(County != "Baltimore City") |>
  filter(FY == 2025) |>
  filter(ApplicantType != "Housing and Community Development") |>
  filter(ProjType == "Operating")
head(taxdata_not_baltimore_city)
# treemap: one tile per project, sized by tax credited and colored by awardee
baltimore_tax_treemap <- ggplot(
  taxdata_baltimore_city,
  aes(area = TaxCredited, fill = Awardee, label = ProjName)
) +
  geom_treemap() +
  geom_treemap_text(colour = "white", place = "centre", reflow = TRUE) +
  labs(
    title = "Baltimore City Community Investment Tax Credits Received by Nonprofit Organizations",
    caption = "From the Maryland Department of Housing and Community Development & Community Investment Tax Credit Program (CITC)"
  )
baltimore_tax_treemap
This treemap visualizes where tax credit investment has been flowing among Baltimore City nonprofit organizations, based on the fiscal year 2025 awards. Each tile is one project, labeled with its ProjName and colored by Awardee, and its size reflects the amount of tax credited (TaxCredited) for that project, so the larger tiles show which nonprofit organizations are attracting the most investment. Housing and infrastructure development is far more prevalent than other kinds of organizations, such as food drives or services that assist families. This suggests that people in Baltimore City are focusing on the overall development of their community and its economy, including added support for workforce development. From this I conclude that Baltimore City prioritizes improving its infrastructure, job market, and economy, so tax credit investments are mostly going into those areas.
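The ranking that the treemap shows by tile size can also be checked numerically. The following is a rough sketch, assuming a data frame with Awardee and TaxCredited columns like the filtered Baltimore City subset. The rows in `toy` are made up for illustration and are not taken from the dataset.

```r
# Hypothetical toy rows; awardee names and amounts are made up for illustration
toy <- data.frame(
  Awardee     = c("Org A", "Org B", "Org A", "Org C"),
  TaxCredited = c(5000, 12000, 2500, 0),
  stringsAsFactors = FALSE
)

# Sum the credited amount per awardee
totals <- aggregate(TaxCredited ~ Awardee, data = toy, FUN = sum)

# Largest totals first, matching the largest areas in the treemap
totals <- totals[order(-totals$TaxCredited), ]
print(totals)
```

A table like this complements the treemap because it gives exact totals, which are hard to read off tile areas when several tiles are similar in size.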
# alluvial chart: awardee -> applicant type -> award amount, colored by county
taxdata_not_baltimore_city_alluvial <- ggplot(
  taxdata_not_baltimore_city,
  aes(axis1 = Awardee, axis2 = ApplicantType, axis3 = AwardAmount)
) +
  geom_alluvium(aes(fill = County), alpha = 0.7) +
  geom_stratum() +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 2) +
  scale_x_discrete(limits = c("Awardee", "Applicant Type", "Award Amount Received")) +
  labs(
    title = "Maryland Community Investment Tax Credit Flow for the year 2025 by County",
    y = "Number of Projects Developed",
    x = "Project Attributes",
    caption = "From the Maryland Department of Housing and Community Development & Community Investment Tax Credit Program (CITC)"
  )
ggplotly(taxdata_not_baltimore_city_alluvial)
This chart goes into more depth on the numerical side of the tax incentive investments. Making the alluvial chart interactive lets the viewer pick out a specific county in Maryland and see which organizations operate there, what applicant type they applied under, and what award amount they received, by following how each flow widens or narrows along its path. Using the Awardee, ApplicantType, and AwardAmount variables with the filtered data set I made (taxdata_not_baltimore_city), it became clearer that counties outside Baltimore City tend to focus on tourist attractions and visual draws, such as museums or art galleries, that bring people to the area and increase foot traffic and economic flow. The interactive feature also makes it possible to focus on one county without mixing it up with the others. From that I determined that the number of projects a nonprofit creates does not matter much: a higher investment award depends on the applicant type the nonprofit applies under, since the applicant type determines whether charitable donors put money toward the project.
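The claim that applicant type matters more than project count could be examined with a small summary table. Here is a minimal sketch, assuming columns named ApplicantType and AwardAmount as in the dataset. The county and applicant type labels are borrowed from the output above, but the rows and amounts in `toy` are hypothetical.

```r
# Hypothetical toy rows; the award amounts are made up for illustration
toy <- data.frame(
  County        = c("Harford", "Harford", "Kent", "Talbot", "Montgomery"),
  ApplicantType = c("Services for At-Risk Populations",
                    "Job and Self-Sufficiency Training",
                    "Economic Development and Tourism Promotion",
                    "Services for At-Risk Populations",
                    "Services for At-Risk Populations"),
  AwardAmount   = c(20000, 10000, 30000, 25000, 15000),
  stringsAsFactors = FALSE
)

# Average award and number of projects for each applicant type
mean_award <- tapply(toy$AwardAmount, toy$ApplicantType, mean)
n_projects <- tapply(toy$AwardAmount, toy$ApplicantType, length)

# Side-by-side view: many projects does not have to mean a high average award
print(data.frame(mean_award, n_projects))
```

Putting the count next to the mean makes it easy to see whether a type with few projects still receives large awards, which is the pattern the alluvial chart is read as showing.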
4. Linear Regression Modeling
# taxdata_not_baltimore_city is already restricted to FY 2025 operating projects
# outside Baltimore City and outside housing and community development,
# so only the grouping needs to be added here
grouped_taxdata <- taxdata_not_baltimore_city |>
  group_by(County, ApplicantType)
grouped_taxdata
# A tibble: 22 × 18
# Groups: County, ApplicantType [15]
ProjID ProjNum FY Awardee ProjName County Region ApplicantType ProjType
<dbl> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 85261 CITC-2025… 2025 Charti… Award Y… Anne … Centr… Services for… Operati…
2 85408 CITC-2025… 2025 Commun… Award Y… Montg… Washi… Services for… Operati…
3 85367 CITC-2025… 2025 Downto… Award Y… Frede… Washi… Economic Dev… Operati…
4 85469 CITC-2025… 2025 For Al… Award Y… Talbot Upper… Services for… Operati…
5 85495 CITC-2025… 2025 Friend… Award Y… Carol… Upper… Education an… Operati…
6 85196 CITC-2025… 2025 Harfor… Award Y… Harfo… Centr… Services for… Operati…
7 85514 CITC-2025… 2025 Inner … Award Y… Harfo… Centr… Services for… Operati…
8 85889 CITC-2025… 2025 Interf… Award Y… Montg… Washi… Services for… Operati…
9 85483 CITC-2025… 2025 Linkin… Award Y… Harfo… Centr… Job and Self… Operati…
10 86364 CITC-2025… 2025 Main S… Award Y… Kent Upper… Economic Dev… Operati…
# ℹ 12 more rows
# ℹ 9 more variables: ProjectDesc <chr>, AwardAmount <dbl>, TaxCredited <dbl>,
# TaxCredBal <dbl>, DonationLink <chr>, WebSite <chr>, Y_Coord <dbl>,
# X_Coord <dbl>, Location <chr>
# linear model of tax credited on county and applicant type
model <- lm(TaxCredited ~ County + ApplicantType, data = grouped_taxdata)
summary(model)
Call:
lm(formula = TaxCredited ~ County + ApplicantType, data = grouped_taxdata)
Residuals:
Min 1Q Median 3Q Max
-12500.0 0.0 0.0 200.5 12500.0
Coefficients: (2 not defined because of singularities)
Estimate Std. Error
(Intercept) 1.736e+03 1.523e+04
CountyAnne Arundel 7.607e+03 1.436e+04
CountyBaltimore 3.051e+04 2.213e+04
CountyCaroline -1.078e-11 1.436e+04
CountyCarroll 5.140e+02 1.830e+04
CountyFrederick 3.251e+04 1.830e+04
CountyHarford 1.076e+04 1.684e+04
CountyKent -1.736e+03 1.830e+04
CountyMontgomery 7.647e+03 1.632e+04
CountyPrince George's 9.727e+03 1.684e+04
CountyQueen Anne's -2.352e+04 1.436e+04
CountyTalbot 1.126e+04 1.830e+04
ApplicantTypeEconomic Development and Tourism Promotion NA NA
ApplicantTypeEducation and Youth Services 2.326e+04 1.135e+04
ApplicantTypeJob and Self-Sufficiency Training -2.250e+03 1.243e+04
ApplicantTypeServices for At-Risk Populations NA NA
ApplicantTypeTechnical Assistance and Capacity Building -9.438e+03 1.243e+04
t value Pr(>|t|)
(Intercept) 0.114 0.9124
CountyAnne Arundel 0.530 0.6126
CountyBaltimore 1.379 0.2103
CountyCaroline 0.000 1.0000
CountyCarroll 0.028 0.9784
CountyFrederick 1.777 0.1189
CountyHarford 0.639 0.5429
CountyKent -0.095 0.9271
CountyMontgomery 0.469 0.6536
CountyPrince George's 0.578 0.5815
CountyQueen Anne's -1.639 0.1453
CountyTalbot 0.615 0.5577
ApplicantTypeEconomic Development and Tourism Promotion NA NA
ApplicantTypeEducation and Youth Services 2.050 0.0796 .
ApplicantTypeJob and Self-Sufficiency Training -0.181 0.8615
ApplicantTypeServices for At-Risk Populations NA NA
ApplicantTypeTechnical Assistance and Capacity Building -0.759 0.4726
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 10150 on 7 degrees of freedom
Multiple R-squared: 0.7402, Adjusted R-squared: 0.2207
F-statistic: 1.425 on 14 and 7 DF, p-value: 0.329
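Two things stand out in this summary. The note "2 not defined because of singularities" explains the NA rows, and the F-statistic on 14 and 7 degrees of freedom means 14 predictors plus an intercept were estimated from only 22 projects, with an overall p-value of 0.329. The sketch below illustrates, on hypothetical toy data, how NA coefficients can arise when one factor's levels line up exactly with another's. It is an illustration of the mechanism, not a diagnosis of this particular model.

```r
# Hypothetical toy data: each applicant type occurs in exactly one county,
# so the applicant type dummies duplicate the county dummies
toy <- data.frame(
  County        = c("A", "A", "B", "B", "C", "C"),
  ApplicantType = c("X", "X", "Y", "Y", "Z", "Z"),
  TaxCredited   = c(1000, 1200, 5000, 5400, 300, 500)
)

fit <- lm(TaxCredited ~ County + ApplicantType, data = toy)

# Coefficients that cannot be estimated separately come back as NA
print(coef(fit))
n_aliased <- sum(is.na(coef(fit)))
print(n_aliased)

# Cross-tabulating the two factors shows which combinations were observed
print(table(toy$County, toy$ApplicantType))
```

Running the same cross-tabulation on County and ApplicantType in the real subset would show how sparsely the combinations are covered, which helps in judging how much weight the individual coefficients can bear.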
# linear model plot: tax credited by county, one line per county
linear_tax_plot <- ggplot(
  grouped_taxdata,
  aes(x = TaxCredited, y = County, color = County, group = County)
) +
  geom_line(linewidth = 0.9, alpha = 0.7) +
  geom_point(size = 1, alpha = 0.7) +
  labs(
    title = "Community Investment Tax Credits Regression Model",
    x = "Tax Credited",
    y = "Counties",
    color = "County",
    caption = "From the Maryland Department of Housing and Community Development & Community Investment Tax Credit Program (CITC)"
  ) +
  theme_minimal()
ggplotly(linear_tax_plot)
# boxplot of award amounts received from tax incentive investments, by generalized region of Maryland
taxdata_boxplot <- ggplot(grouped_taxdata, aes(x = AwardAmount, y = Region)) +
  geom_boxplot(fill = "lightgreen", alpha = 0.7) +
  labs(
    title = "Maryland Region Tax Award Regression Boxplot",
    x = "Amount Awarded",
    y = "Region in Maryland",
    caption = "From the Maryland Department of Housing and Community Development & Community Investment Tax Credit Program (CITC)"
  ) +
  theme_minimal()
taxdata_boxplot