Abstract

For cities, land is a limited resource, and land use regulations can impact the economic productivity of available land, measured by the taxable value per acre. Therefore, it is necessary to conduct research into the relationship between land use and its associated economic productivity to inform policy decisions regarding land use regulation. Ultimately, this research answers: how does a parcel’s land use type impact the taxable value per acre within Johnson City? This question is answered using 2023 Washington County land use data from the Tennessee IMPACT CAMA system. The dataset includes property value assessments, land use classifications, and acreage for every parcel in the county. For this research, the data is filtered to only include parcels within a mile of the Johnson City boundary. The data is prepared and analyzed in R using ANOVA models and rank-sum tests to assess variations between land use categories. The results indicate that the differences in the taxable value per acre for each land type are statistically significant. Post hoc testing also confirms that multi-family residential and commercial uses produce the most taxable value per acre, while agriculture, low-density single family residential, and mobile homes produce the least per acre. This information could inform land use regulation changes that use land more efficiently while increasing overall property tax revenue.

Introduction

Most cities get the majority of their tax revenue via property taxes, which utilize property value assessments to determine how much should be collected. Property assessments identify the land and building value — sometimes called the improved value — for every parcel within the jurisdiction. Aside from tax collection purposes, property assessment data is also frequently studied by economic development and urban planning authorities to identify ways to improve a jurisdiction’s tax base, estimate a new project’s return on investment, conduct cost-of-service analysis, et cetera. Often, planners use the total assessed value for these purposes. However, geo-analytics firm Urban3 has proposed and demonstrated that normalizing the assessed value by its acreage provides greater insight into the productive value of a jurisdiction’s land. Because land is a finite resource and increasing urban land area increases infrastructure costs, the taxable value per acre approach allows for planning and policy decisions that result in more economically efficient land use (Nielsen, 2018).

Furthermore, as calls for zoning and land use regulation reforms increase, it is important to understand the relationship between different types of land use and the taxable value per acre. Extensive research has studied the effects of varying land uses and regulations on home prices using spatial modeling. For example, Song and Knaap (2004) and Kim and Jin (2019) studied the effects of mixed land use on housing prices, and Ihlanfeldt (2005) researched the effects of regulation restrictiveness, including zoning, on housing costs. However, because the focus was on housing costs, the existing research does not cover other land use types, such as commercial and industrial uses. Furthermore, it does not consider an efficiency measurement, i.e., taxable value per acre.

However, Wang et al. (2013) took land value research a step further by assessing the land value per square foot in Travis County, Texas, as it related to covariates such as the parcel’s slope and the Euclidean distances to the nearest central business district, freeway, major arterial roadway, and minor arterial roadway. Their research was unique because it integrated Eigenvector Spatial Filtering into its approach to control for spatial autocorrelation in the model residuals (Wang et al., 2013). However, their research did not directly address the impact of the land use category on the land value per square foot. In contrast, O’Halloran (2019) briefly examined taxable value per acre broken down by land use categories for Jacksonville, Florida, from an urban planning perspective. O’Halloran found that medium-density land uses generally produced more value per acre than low-density land uses, though this varied in certain parts of the city. However, because his research was exploratory, he did not implement statistical tests or control for other spatial factors to confirm the significance of his observations.

This paper will expand the methodologies of O’Halloran (2019) with the insight gained from Wang et al. (2013) to analyze the 2023 assessed land use and taxable value per acre in Johnson City, Tennessee. Analysis of Variance (ANOVA) modeling and rank-sum tests with spatial components will be implemented to provide a statistical measure of land use category variations while controlling for spatial relationships. Ultimately, this research will answer: how does a parcel’s land use type relate to the taxable value per acre within Johnson City?

Data

The data that was used for this research comes from the Tennessee IMPACT computer automated mass appraisal (CAMA) system managed by the state’s Comptroller of the Treasury. The IMPACT system — short for the Integrated Multi-Processing of Administrative and CAMA Technology system — integrates information from several state agencies into a relational database that provides substantial information on properties, such as value and land use, for tax and planning purposes (Tennessee Comptroller of the Treasury, n.d.).

A shapefile containing comprehensive 2023 parcel data with land use information for Washington County, Johnson City’s main county, was downloaded from the Comptroller’s website. The dataset contains over a dozen fields, but for this research, the most important fields were the city, the calculated acreage, the improved value, the land value, the appraisal, and the land use classification (Tennessee Comptroller, 2023a).

The dataset’s land use classifications were represented by 50 different number codes. For the sake of this research, many of the classifications were consolidated. For example, the multiple agriculture-related classifications were consolidated into one agriculture category. To accomplish this, a look-up table was created in Excel using PDFs of the file documentation (Tennessee Comptroller, 2018) and the land use map (Tennessee Comptroller, 2023b). The completed look-up table included the original land use category numbers, the class description from the documentation, a consolidated class based on the map legend, and the corresponding color hex code. The table excluding hex codes can be found in Appendix A. Ultimately, the 50 classifications were consolidated into 20 categories.

Furthermore, the dataset also contained over 59 thousand features, meaning that computer processing could be unwieldy and time-consuming. To reduce the number of features to only include the area immediately around Johnson City, a Tennessee City Boundary dataset was downloaded from the Tennessee STS-GIS data portal (Tennessee STS-GIS, 2023). The Johnson City boundaries were selected from the data. Because the boundaries are complex and fragmented, a mile-wide buffer was added to smooth the boundary and prevent isolated parcels before filtering out parcels outside the buffered area.

Methodology

A flow chart visualizing the methodological process for this research can be found in Figure 1. The method consists of six main stages described as follows:

Data Preparation and Filtering:
- First, the parcel, the city boundaries shapefile, and the classification look-up table were imported into R Studio where the analysis took place.
- After selecting Johnson City from the city boundaries shapefile, a mile-wide buffer was created using the sf package. The buffer smoothed Johnson City’s complex boundaries and eliminated isolated areas, which can be problematic for spatial autocorrelation analysis. Then all parcels whose centroids fell within the buffered area were selected.
- Additionally, fields unnecessary for the analysis were removed, leaving only the acres, improved value, land value, appraisal, and classification fields in the dataset.
- A new field for the Taxable Value Per Acre was calculated by dividing the appraisal by the acreage.
- Next, five concentric rings that are each 2.7 miles wide were created around the city’s townhall and given ID numbers. Then using a point identified on the surface of each parcel, the ring IDs were then joined to the parcels, based on the ring in which the point fell.
- Then, the look-up table was joined to the dataset, so that the consolidated classifications could be used in later steps.
- Also, data was filtered according to the following criteria:
  - If the appraised value did not equal the sum of the improved value and land value, the parcel was excluded. These parcels reflect either system errors or special tax exemptions like Greenbelt properties.
  - Parcels that had an improved value of zero were excluded because the absence of improvements is reflected in the appraisal. Therefore, including both parcels with improvements and parcels without improvements could skew the land use value comparison, especially if a particular land use has more unimproved lots than others.
  - Parcels with public, semi-public, vacant, water, transportation, utility, or uncategorized land use types were excluded because of the special nature of each of these categories.
Exploratory analysis was conducted to understand how the data is distributed. The analysis performed in this stage can be generally informative. However, it is also important because it can help identify problems with the data and inform the best methodologies. For example, some kinds of tests are parametric, meaning they have to meet certain assumptions to be valid, while other tests are non-parametric and do not need to meet the same assumptions. In this stage, the following steps were carried out:
- Summary statistics were calculated, and groups with very few observations were removed.
- Histograms and box plots were created to identify the data’s distribution because normality is an assumption of most parametric tests.
- Parametric models also assume that observations are independent of each other, which means that spatial autocorrelation would violate this assumption. To assess the amount of spatial autocorrelation, Moran’s Test and Local Indicators of Spatial Association (LISA) clusters were created using the sfdep and spdep packages.
ANOVA tests were performed. One-way ANOVA is a parametric test that is used on a continuous dependent variable with a categorical explanatory variable to determine if the differences in the categories are statistically meaningful or simply due to chance (Barabás, n.d.). This makes it ideal for analyzing the differences in the taxable value per acre for each land use type. Additionally, two-way ANOVA allows a second categorical explanatory variable to be used and also measures the interaction of the two explanatory variables. For this analysis, because spatial autocorrelation was present, a second category, the rings that parcels are within, was implemented.
The ANOVA assumptions were tested. Because ANOVA is a parametric test, the residuals — the difference between fitted values and actual values — need to be normally-distributed, independent, and homoscedastic, which means the variance is relatively equal. These assumptions can be tested using diagnostic plots.
Because ANOVA assumptions were not met, non-parametric tests that do not need to meet the assumptions were conducted. The Kruskal-Wallis Test is a rank-sum test equivalent to the one-way ANOVA test, while the Scheirer-Ray-Hare test (found in the rcompanion package) is equivalent to the two-way ANOVA test (Barabás, n.d.).
Then a post hoc test were performed on the Kruskal-Wallis test results to identify the specific categories that are different from each other. For ANOVA, Tukey’s honest significance test is used. However, the Dunn test (from the rstatix package) is more appropriate for the Kruskal-Wallis test, but no post hoc test is well suited for the Sheirer-Ray-Hare test (Barabás, n.d.).

Figure 1: This analysis takes place in six main steps. Those steps are represented in as blue squares. Following step 4, if the models are valid, the analysis will then proceed to step 6, post hoc testing. Otherwise, step 5, rank-sum testing, will take place.

library(tidyverse)
library(sf)
library(here)
library(RColorBrewer)
library(scales)
library(rgeoda)
library(viridis)
library(moments)
library(gt)
library(flextable)
library(sfdep)
library(spdep)
library(rcompanion)
library(FSA)

options(scipen = 0)

# Read in landuse file
landuse_filepath <- here("Data", "2023_LU_WASHINGTON", "Landuse.shp")
landuse <- st_read(landuse_filepath)

# Read in city boundaries
tncities_filepath <- here("Data", "TN_City_Boundaries", "TN_City_Boundaries.shp")
tncities <- st_read(tncities_filepath)

# Select just Johnson City from the cities data
JCboundary <- tncities %>% 
    filter(NAME == "Johnson City")

# Transform it to the same CRS as the landuse data
JCboundary <- st_transform(JCboundary, st_crs(landuse))

# Make all the polygons valid and sf objects
JCboundary <- st_make_valid(JCboundary)
JCboundary <- st_as_sf(JCboundary)

landuse <- st_make_valid(landuse)
landuse <- st_as_sf(landuse)

# Remove non-polygon or multipolygon tracts
landuse <- landuse %>% 
    filter(st_geometry_type(.) == "POLYGON" | st_geometry_type(.) == "MULTIPOLYGON")

# Buffer the boundary
JCboundary_buffer <- st_buffer(JCboundary, 1609.344)

# Assign TRUE to all the parcels that are within the Johnson City boundary
landuse$withinJC <- st_within(st_centroid(landuse, of_largest_polygon = TRUE), JCboundary_buffer, sparse = FALSE)[,1]

# Import Look-Up table
lookup_path <- here("Data", "LU_LOOKUP.csv")
lookuptable <- read_csv(lookup_path)

# Ensure the classifications are treated as characters for the join
lookuptable$LU_CLASSIF <- as.character(lookuptable$LU_CLASSIF)

# Filter out areas that are not within the Johnson City borders, select fields of interest, and calculate the taxable value per acre field (TVPA).
landuseJC <- landuse %>%
    filter(withinJC == TRUE) %>% 
    select(LU_ACRES, IMPVALUE, LANDVALUE, APPRAISAL, LU_CLASSIF) %>% 
    mutate(TVPA = APPRAISAL/LU_ACRES, PARCEL_ID = row_number())

# Join the land use data with the look-up table
landuseJC <- left_join(landuseJC, lookuptable, by = "LU_CLASSIF")

# Define a point for city hall
townhall <- st_as_sf(data.frame(Lat=-82.35175663032484,Lon=36.313899995936886),coords=c("Lat","Lon"), remove=FALSE)

# Set the coordinate system to WGS84 in degrees
st_crs(townhall) <- 4326

# Create list of buffer distances in miles with the largest ring on the outside
buffer_dist <- c(13.5, 10.8, 8.1, 5.4, 2.7)

# Create buffers from each distance
buffers <- lapply(buffer_dist, function(radius) st_buffer(townhall, dist = radius*1609.344))

# Create an empty list of the rings
rings <- c()

# Add all the rings to the list once the difference is taken
for (i in 1:length(buffers)) {
   
    if(i == 5) { # index 5 is the inner most ring
        rings[[i]] <- st_difference(buffers[[i]])$geometry
    } else {
        rings[[i]] <- st_difference(buffers[[i]], buffers[[i+1]])$geometry
    }
}

# Bind them into one list and convert them to an sf object
all_rings <- do.call(rbind, rings)
all_rings <- st_as_sf(st_as_sfc(all_rings))
st_crs(all_rings) <- 4326

# Rename the shape file and add an ID field
all_rings <- all_rings %>% 
    rename(SHAPE = 1) %>% 
    arrange(desc(row_number())) %>% 
    mutate(RING_ID = as.factor(row_number())) %>% 
    select(RING_ID, SHAPE) %>% 
    st_transform(crs = st_crs(landuseJC))

# Do an intersection based on the centroids to pass the ring id to the parcels
parcelRings <- st_join(st_point_on_surface(landuseJC), all_rings, join = st_within) %>% 
    select(RING_ID, PARCEL_ID) # Only keep the ring and parcel IDs

# Join the ID back to the parcels
landuse_rings <- st_join(landuseJC, parcelRings, by = "unique_id", left = TRUE, largest = TRUE)

Results

Data Preparation and Filtering

Figure 2 shows the all land uses for parcels within the mile-buffer of the Johnson City boundary. Figure 3 shows the taxable value per acre for every parcel using a quantile scale. However, the upper limit of the quantile with the greatest values is not shown on the legend bar because the maximum taxable value per acre of approximately $13.9 million skews the scale substantially. In total, there are 36,317 parcels represented by these maps.

# Cast to multipolygon for use with ggplotly
landuseJC <- st_cast(landuseJC, "MULTIPOLYGON")

# Map Land Use Categories
landuse_plot <- landuseJC %>% 
    ggplot(aes(fill = as.factor(CLASS_CONSOL))) +
    geom_sf(color = NA) +
    scale_fill_manual(name = "Land Use",
                      values = lookuptable$COLOR,
                      breaks = lookuptable$CLASS_CONSOL,
                      na.value = "grey") +
    labs(title = "Land Use",
         subtitle = "Johnson City, TN plus a mile-wide buffer") +
    guides(fill = guide_legend(ncol = 3)) +
    theme_dark() +
    theme(legend.position = "bottom",
          legend.title.position = "top",
          legend.title = element_text(face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

landuse_plot

Figure 2: Johnson City consists of a variety of land uses, but single family residential and agricultural uses dominate much of the area. Public and General Commercial land uses also make up a sizeable area.

# Create quantiles legend
quantiles <- quantile(landuseJC$TVPA, probs = seq(0, 1, by = .2))
labels <- as.character(dollar(quantiles))

# Map Taxable Value per Acre
TVPA_plot <- landuseJC %>% 
    ggplot(aes(fill = TVPA)) +
    geom_sf(color = NA) +
    scale_fill_viridis_c(name = "Taxable Value Per Acre",
                         option = "magma",
                         direction = -1,
                         breaks = quantiles,
                         labels = labels,
                         limits = c(0, 600000),
                         oob = squish, # ensures values outside the limit are maintained
                         na.value = "lightgrey") +
    labs(title = "Taxable Value\nPer Acre",
         subtitle = "Johnson City, TN plus a mile-wide buffer") +
    theme_dark() +
    theme(legend.position = "inside",
          legend.position.inside = c(0.175, 0.175),
          legend.background = element_rect(fill = "#7F7F7F"),
          legend.text = element_text(color = "white", face = "bold"),
          legend.title = element_text(color = "white", face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

TVPA_plot

Figure 3: The taxable value per acre of parcels varies throughout the city, though the value tends to be much higher in dense areas. The high end of the scale bar on this map is truncated to better show the majority of the data. The highest value per acre is $13,862,436.

For the sake of the analysis, some parcels were excluded according to the criteria discussed in step 1 of the methodology section. Figure 4 and Figure 5 show land use and taxable value per acre maps for the 30,090 remaining parcels once these criteria were implemented.

# Filter out undesired land uses and over and under appraised parcels 
landuseJC_filtered <- landuse_rings %>% 
    filter(APPRAISAL == LANDVALUE + IMPVALUE,
           IMPVALUE > 0,
           !CLASS_CONSOL %in% c("Public", "Semi-Public", "Vacant", "Water", "Transportation", "Utilities", NA))

# Plot the filtered landuse
landusefiltered_plot <- landuseJC_filtered %>% 
    ggplot(aes(fill = as.factor(CLASS_CONSOL))) +
    geom_sf(color = NA) +
    scale_fill_manual(name = "Land Use",
                      values = lookuptable$COLOR,
                      breaks = lookuptable$CLASS_CONSOL,
                      na.value = "grey") +
    labs(title = "Land Use",
         subtitle = "Filtering Criteria Applied") +
    guides(fill = guide_legend(ncol = 3)) +
    theme_dark() +
    theme(legend.position = "bottom",
          legend.title.position = "top",
          legend.title = element_text(face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

landusefiltered_plot

Figure 4: After applying the filtering criteria, much of the agricultural land was removed. However, single family residential and commercial uses still cover much of the area.

# Create new quantiles for the filtered data
quantiles2 <- quantile(landuseJC_filtered$TVPA, probs = seq(0, 1, by = .2))
labels2 <- as.character(dollar(quantiles2))

# Map Taxable Value per Acre filtered
TVPAfiltered_plot <- landuseJC_filtered %>% 
    ggplot(aes(fill = TVPA)) +
    geom_sf(color = NA) +
    scale_fill_viridis_c(name = "Taxable Value Per Acre",
                         option = "magma",
                         direction = -1,
                         breaks = quantiles2,
                         labels = labels2,
                         limits = c(0, 700000),
                         oob = squish,
                         na.value = "lightgrey") +
    labs(title = "Taxable Value Per Acre",
         subtitle = "Filtering Criteria Applied") +
    theme_dark() +
    theme(legend.position = "inside",
          legend.position.inside = c(0.175, 0.175),
          legend.background = element_rect(fill = "#7F7F7F"),
          legend.text = element_text(color = "white", face = "bold"),
          legend.title = element_text(color = "white", face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

TVPAfiltered_plot

Figure 5: Filtering removed some of the areas with low taxable values per acre. The maximum value is still $13,862,436.

Exploratory Analysis

Then summary statistics were generated for the entire dataset and for each land use type. Results can be found in Table 1. Because the heavy industrial category has an insubstantial sample size with only four parcels, it was filtered out before further analysis. Also, based on Table 1, the histogram of all parcels in Figure 6, and the boxplots for each category in Figure 7, it is evident that the data does not follow a normal distribution.

# Summarize entire data set
All_summary <- landuseJC_filtered %>%
    st_drop_geometry() %>% 
    summarise(type = "All Categories",
              count = n(),
              mean = mean(TVPA),
              median = median(TVPA),
              max = max(TVPA),
              min = min(TVPA),
              IQR = IQR(TVPA),
              skew = skewness(TVPA),
              kurtosis = kurtosis(TVPA))

# Summarize each group
type_summary <- landuseJC_filtered %>%
    st_drop_geometry() %>%
    group_by(CLASS_CONSOL) %>%
    summarise(count = n(),
            mean = mean(TVPA),
            median = median(TVPA),
            max = max(TVPA),
            min = min(TVPA),
            IQR = IQR(TVPA),
            skew = skewness(TVPA),
            kurtosis = kurtosis(TVPA)) %>% 
    rename(type = CLASS_CONSOL)

# Join together
summary <- rbind(All_summary, type_summary)

# Show the summary table
summary %>% 
    gt(summary, rowname_col = "type") %>% 
    tab_header(title = md("**Table 1**: Taxable Value Per Acre Summary Statistics"),
               subtitle = "Grouped by Land Use Category") %>%
    fmt_number(decimals = 2, use_seps = TRUE, drop_trailing_zeros = TRUE) %>% 
    opt_align_table_header(align = "left")

	count	mean	median	max	min	IQR	skew	kurtosis
Table 1: Taxable Value Per Acre Summary Statistics
Grouped by Land Use Category
All Categories	30,090	542,874.46	274,669.54	13,862,435.86	132.16	281,293.17	3.99	20.93
Agriculture	58	11,426.73	6,683.14	153,540.3	1,339.28	7,048.22	6.08	42.91
Duplex	392	340,636.13	300,639.13	1,047,238.7	12,832.3	237,251.93	0.93	4.05
General Commercial	1,216	446,797.28	344,300.41	7,501,784.07	690.57	402,016.65	6.25	61.37
Heavy Industrial	4	98,247.37	93,880.28	162,022.71	43,206.22	39,009.1	0.3	1.98
Light Industrial and Warehousing	229	250,090.63	178,034.33	3,158,140.13	6,616.07	218,276.64	5.79	57.66
Misc Commercial	89	818,539.7	387,084.19	5,885,410.69	8,995.57	912,641.67	2.4	9.72
Mobile Home	306	46,444.12	36,707.77	344,996.03	4,438.26	36,755.59	3.3	18.96
Mobile Home Park	90	80,299.47	48,735.4	1,336,327.99	1,340.15	43,029.64	6.91	56.23
Multi-Family	2,409	2,887,566.86	2,954,525.9	11,620,210.47	9,614.7	3,472,827.24	0.18	2.38
Office	377	827,676.11	511,011.68	13,862,435.86	1,659.54	482,192.42	5.6	47.22
Single Family Residential < 5 acres	23,834	343,645.13	267,141.56	7,023,656.26	2,692.09	221,734.95	6.48	68.02
Single Family Residential >= 5 acres	1,086	43,298.07	29,671.45	3,040,312.84	132.16	29,427.61	23.62	672.81

landuseJC_filtered <- filter(landuseJC_filtered, CLASS_CONSOL != "Heavy Industrial")

landuseJC_filtered %>% 
    ggplot(aes(x = TVPA)) +
    geom_histogram(binwidth = 100000, color = "black", fill = "lightblue") +
    scale_x_continuous(name = "Taxable Value Per Acre",
                       limits = c(0, 6000000),
                       labels = label_currency()) +
    theme_minimal() +
    labs(title = "Histogram of Taxable Value Per Acre for All Parcels")

Figure 6: The distribution of the taxable value per acre for all parcels is heavily right skewed.The right tail extends well past $6 million, even though most of the data is under $0.5 million.

landuseJC_filtered %>%
    ggplot(aes(x = reorder(CLASS_CONSOL, TVPA), y = TVPA, fill = CLASS_CONSOL)) +
    geom_boxplot(color = "black", outliers = FALSE) +
    scale_x_discrete(name = "Land Use Category") +
    scale_y_continuous(name = "Taxable Value Per Acre",
                       labels = label_currency()) +
    scale_fill_manual(name = "Land Use",
                      values = lookuptable$COLOR,
                      breaks = lookuptable$CLASS_CONSOL,
                      na.value = "grey") +
    theme_minimal() +
    coord_flip() +
    labs(title = "Box Plots",
         subtitle = "Taxable Value Per Acre by Land Use Category") +
    guides(fill = "none")

Figure 7: The box plots above provide a visualization of data distribution of each group. Multi-family residential land uses have a much higher average than other land uses, but the interquartile range (IQR) is also much wider. Also, notably the average value per acre for Misc. Commercial land is lower than offices, but the IQR extends farther.

Then, a test of spatial autocorrelation was performed on the taxable value per acre. The weighting matrix used to calculate it was a row-standardized k-nearest neighbors matrix. The k-nearest neighbors classification was used to ensure every observation had an equal number of neighbors despite there being spatial gaps from the filtering procedures. To determine spatial autocorrelation, the 15 nearest neighbors were selected and equal weight was given to each neighbor regardless of distance. The number 15 was chosen so that each neighborhood remains relatively small while including more than just the most immediate neighbors. Ultimately, the Moran test yielded a remarkably high and statistically significant I-score (I = .818, p < 2.26e-16), meaning the data is highly spatially autocorrelated. Figure 8 shows Moran Local Indicators of Spatial Association (LISA) clusters at the .1 significance level.

# Create neighbors and weights as a listw object
knn15 <- st_knn(st_centroid(landuseJC_filtered, of_largest_polygon = TRUE), 15)
knn15weights <- st_weights(knn15, style = "W", allow_zero = TRUE)
listw <- recreate_listw(knn15, knn15weights)

# Perform the Moran test
moran.test(landuseJC_filtered$TVPA, listw)


    Moran I test under randomisation

data:  landuseJC_filtered$TVPA  
weights: listw    

Moran I statistic standard deviate = 407.09, p-value <
0.00000000000000022
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
   0.817677852927   -0.000033239156    0.000004034824

## Calculate local Moran's I 
lisa <- localmoran_perm(landuseJC_filtered$TVPA, listw, nsim = 999)
    
# access saved quadrants from lisa
quadrants <- attr(lisa, which = "quadr")
    
# add a level for anything below 0.05 significance
levels(quadrants$mean) <- c(levels(quadrants$mean), "Not Significant")
quadrants$mean[lisa[,5]> 0.1] <- "Not Significant"
    
lisa_data <- cbind(landuseJC_filtered, quadrants)
    

# Create color palette
clustercolor <- c(brewer.pal(n = 4, name = "RdBu"), "#FFFFFF")
clusterlabel <- c("Low-Low", "High-Low", "Low-High", "High-High", "Not Significant")

clustercolormap <- cbind(clusterlabel, clustercolor)
clustercolormap <- as.data.frame(clustercolormap)
    
## plot LISA Clusters
lisa_map <- lisa_data %>%
    ggplot(aes(fill = mean)) +
    geom_sf(color = NA) +
    scale_fill_manual(name = "Cluster",
                      values = clustercolormap$clustercolor,
                      breaks = clustercolormap$clusterlabel) +
    theme_grey() +
    labs(title = "Taxable Value Per Acre",
         subtitle = "LISA Cluster Map") +
    theme(panel.background = element_rect(fill = "#D6D6D6"),
          legend.position = "inside",
          legend.position.inside = c(0.175, 0.175),
          legend.background = element_rect(fill = "#D6D6D6"),
          legend.title = element_text(face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

lisa_map

Figure 8: The LISA cluster map shows several small clusters of high values within the interior of the city and clustering of low values on the fringes of the city at the .1 significance level. Low-High and High-Low clusters are areas that are different than their surroundings.

The presence of spatial autocorrelation indicates that the observations are not independent of each other, and the spatial dependence will need be accounted for in the analysis. While the use of a specialized continuous variable, such as Eigenvector Spatial Filtering or a spatially autoregressive term, is better suited for this purpose, the categorical rings shown in Figure 9 will be used instead to keep things simple. By using a categorical spatial variable, rather than a continuous spatial variable, using the parametric Two-Way ANOVA Test or the non-parametric Scheirer-Ray-Hare Test is possible. Summary statistics by ring for each land use type can be found in Appendix B.

ring_map <- landuseJC_filtered %>%
    ggplot(aes(fill = as.factor(RING_ID))) +
    geom_sf(color = NA) +
    scale_fill_discrete(name = "Ring ID") +
    theme_dark() +
    labs(title = "Parcels Organized in Concentric Rings") +
    theme(legend.position = "inside",
          legend.position.inside = c(0.175, 0.175),
          legend.background = element_rect(fill = "#7F7F7F"),
          legend.title = element_text(face = "bold"),
          axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())

ring_map

Figure 9: This map consists of five concentric rings centered on the city hall. Each ring is 2.7 miles wide.

Testing

Once exploratory analysis was complete, a one-way ANOVA test was implemented to explore the relationship between land use (CLASS_CONSOL) and the taxable value per acre (TVPA). The F-value (F= 3363, p = 2.26e-16) was statistically significant, meaning the differences for each land use type do not occur by chance. However, the Moran test on the residuals indicated significant autocorrelation (I = .568, p < 2.26e-16). The diagnostic plots found in Figure 10 also indicate that the residuals are not normally distributed or homoscedastic. The points in the plot on the left, fitted vs. residual values, represent variance for each group and should be dispersed equally above and below zero on the y-axis. However, for many of the groups on the left side of the plot, the majority of points are above zero and vary immensely in magnitude, indicating that the residuals are not homoscedastic. Additionally, points in the Q-Q plot on the right side should follow the dashed line closely. However, that is not the case on the ends, indicating that the residuals are not normally distributed. Because the residuals are not independent, homoscedastic, or normally distributed, the one-way ANOVA test is not the best indicator of the data in its current state.

# AOV test
aov.model <- aov(TVPA ~ CLASS_CONSOL, landuseJC_filtered)

summary(aov.model)

                Df    Sum Sq   Mean Sq F value Pr(>F)    
CLASS_CONSOL    10 1.466e+16 1.466e+15    3364 <2e-16 ***
Residuals    30075 1.310e+16 4.357e+11                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Moran Test Residuals
moran.test(aov.model$residuals, listw = listw)


    Moran I test under randomisation

data:  aov.model$residuals  
weights: listw    

Moran I statistic standard deviate = 282.58, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     5.675254e-01     -3.323916e-05      4.034089e-06

# Create diagnostic charts
par(mfrow = c(1, 2))
plot(aov.model, which = 1, pch = 1, col = "steelblue")
plot(aov.model, which = 2, pch = 1, col = "steelblue")

Figure 10: These diagnostic plots for the one-way ANOVA test indicate that the residuals are not homoscedastic or normally distributed. The Residuals vs. Fitted values plot (left) shows an unbalanced number of points above and below zero on the y-axis, meaning the variance is not equal. The Q-Q plot (right) shows that the residuals do not conform to the dotted line as normal plots should.

To improve upon the one-way ANOVA test, a two-way ANOVA test was implemented with the ring-based location (RING_ID) as the second explanatory variable. Once again, the F-value for land use was significant (F= 3387.85, p < 2.26e-16), and the F-value for the rings was also significant (F= 54.19, p= 2.26e-16). However, once again, the Moran test showed high spatial autocorrelation (I = .564, p= 2.26e-16), and the diagnostic plots in Figure 11 exhibited a similar pattern to the diagnostics for the one-way ANOVA, meaning the residuals were neither normally distributed or homoscedastic. With that in mind, non-parametric tests will likely provide a better assessment.

landuseJC_filtered$RING_ID <- as.factor(landuseJC_filtered$RING_ID)
aov.model2 <- aov(TVPA ~ CLASS_CONSOL + RING_ID, landuseJC_filtered)

summary(aov.model2)

                Df    Sum Sq   Mean Sq F value Pr(>F)    
CLASS_CONSOL    10 1.466e+16 1.466e+15 3387.85 <2e-16 ***
RING_ID          4 9.378e+13 2.344e+13   54.19 <2e-16 ***
Residuals    30071 1.301e+16 4.326e+11                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

moran.test(aov.model2$residuals, listw = listw)


    Moran I test under randomisation

data:  aov.model2$residuals  
weights: listw    

Moran I statistic standard deviate = 280.73, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     5.638040e-01     -3.323916e-05      4.034073e-06

par(mfrow = c(1, 2))
plot(aov.model2, which = 1, pch = 1, col = "steelblue")
plot(aov.model2, which = 2, pch = 1, col = "steelblue")

Figure 11: The two-way ANOVA test residuals are also not homoscedastic and normally distributed, as shown by the Residuals vs. Fitted values plot (left) and the Q-Q plot (right).

First, the Kruskal-Wallis test was performed with the taxable value per acre and land use categories as variables. The results were statistically significant (H= 8122.9, p < 2.26e-16), meaning the differences in the median taxable value per acre of each category are likely not random. Then, a Scheirer-Ray-Hare test was performed with the addition of the ring variable. The land use variable was significant (H= 7842.7, p < 2.26e-16), the ring variable was significant (H= 389.2, p < 2.26e-16), and the interaction effect was significant (H= 228.5, p < 2.26e-16). This means that the differences in the taxable value per acre for each group is meaningful when the variables are treated separately, and when the variables are considered together, the group differences are also meaningful.

kw <- kruskal.test(formula = TVPA ~ CLASS_CONSOL, data = landuseJC_filtered)

# Kruskal-Wallis Test
kw


    Kruskal-Wallis rank sum test

data:  TVPA by CLASS_CONSOL
Kruskal-Wallis chi-squared = 8122.9, df = 10, p-value < 2.2e-16

SRH <- scheirerRayHare(formula = TVPA ~ CLASS_CONSOL + RING_ID, data = landuseJC_filtered)


DV:  TVPA 
Observations:  30086 
D:  1 
MS total:  75433124

# Scheirer-Ray-Hare Test
SRH

                        Df     Sum Sq      H p.value
CLASS_CONSOL            10 5.9160e+11 7842.7       0
RING_ID                  4 2.9360e+10  389.2       0
CLASS_CONSOL:RING_ID    37 1.7238e+10  228.5       0
Residuals            30034 1.6101e+12

Figure 12 shows the results of the post hoc Dunn test used to create pairwise comparisons of the land use types. A more complete table containing test statistics can be found in Appendix C. For each pair, significant values indicate that the difference in the taxable value per acre for the two land use types being compared is unlikely to have occurred by chance. Only 10 of the 55 pairs are not statistically significant. Most of the non-significant differences were in regard to mobile homes, mobile home parks, agriculture, and single family residencies greater than five acres.

dunn <- dunnTest(TVPA ~ CLASS_CONSOL, data = landuseJC_filtered)

# Add significance stars
dunn_neat <- dunn$res %>% 
    mutate(signif = case_when(
    P.adj < 0.001 ~ "***",
    P.adj < 0.01 ~ "**",
    P.adj < 0.05 ~ "*",
    TRUE ~ ""
    )) %>% separate_wider_delim(Comparison, names = c("Comparison1", "Comparison2"), delim = " - ") %>%
    arrange(Comparison1, Comparison2)

dunn_swapcol <- dunn_neat %>%
    select(Comparison2, Comparison1, everything()) %>%
    rename(Comparison1 = 1, Comparison2 = 2)

dunn_tiles <- rbind(dunn_neat, dunn_swapcol) %>%
    complete(Comparison1, Comparison2)
    
dunn_tiles %>% 
    ggplot(aes(x = Comparison1, y = Comparison2, fill = signif)) +
    geom_tile(stat = "identity", color = "darkgrey") +
    scale_fill_manual(values = c("white", "#92C4E8", "#3182BD"),
                      labels = c("Not Significant", "p < .01", "p < .001"),
                      na.value = "black",
                      name = "Significance") +
    labs(title = "Dunn Test Results",
         subtitle = "Pairwise Land Use Comparisons of the Taxable Value Per Acre") +
    theme_void() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1),
          axis.text.y = element_text(angle = 0, hjust = 1, vjust = 1),
          plot.title = element_text(face = "bold"))

Figure 12: This chart shows the significance levels for each pair of land use types following the Dunn test. The pairings are mirrored on each side of the black square diagonal. Most pairings are highly significant.

Reflection

A bar chart showing the median taxable value per acre for each land use type in Johnson City can be found in Figure 13. Thanks to the Kruskal-Wallis and Dunn tests, it can be reasonably concluded that the differences in the median values are not due to chance for the data overall and for most pairings. Of course, there are exceptions, such as the differences between mobile home and agricultural uses. Additionally, the Scheirer-Ray-Hare test provided evidence that both location by itself and the interaction between land use and location have an impact on the taxable value per acre. Also, while they were not statistically valid, the results from the ANOVA tests were consistent with both rank-sum tests, providing some additional reassurance about the tests.

type_summary %>% 
    dplyr::filter(type != "Heavy Industrial") %>%
    ggplot(aes(x = reorder(type, median),
               y = median,
               fill = type)) +
    geom_col(color = "gray") +
    coord_flip() +
    theme_bw() +
    scale_y_continuous(labels = label_currency()) +
    scale_fill_manual(values = lookuptable$COLOR,
                      breaks = lookuptable$CLASS_CONSOL) +
    labs(title = "Median Taxable Value Per Acre by Land Use",
       x = "Land Use Type",
       y = "Median Taxable Value Per Acre") +
    guides(fill = "none")

Figure 13: The median taxable value per acre of multi-family housing far exceeds other land use types. Various commercial uses are the next highest. Agricultural parcels produce the least taxable value per acre.

Future research could improve on this work in several ways. First, rather than applying non-parametric tests, data transformations could be implemented to try to meet the parametric model assumptions. While non-parametric tests worked for this analysis, being able to use parametric models would allow for greater flexibility and depth in the analysis. For example, a continuous covariate could be implemented with a categorical covariate via an ANCOVA model. Second, if parametric models can be used, more powerful methods of controlling for spatial autocorrelation should be implemented, such as Eigenvector Spatial Filtering or spatially autoregressive model terms. This research was unable to accomplish that because the ANOVA model failed to meet the normality and homoscedasticity assumptions. Third, the analysis could be expanded by analyzing additional cities to determine if the trends in taxable value per acre by land use type can be generalized beyond Johnson City. At the moment, the evidence is only applicable to Johnson City, but a generalizable study could have wide-scale policy implications on land use regulation.

Regardless, this research can be used to inform better land use policy within Johnson City. Allowing more commercial property and duplexes could marginally increase property tax revenue, but the greatest benefits come from multi-family property. Multi-family parcels have a median taxable value per acre of $2,954,525.90, while the next most valuable land use, offices, has a median value per acre of $511,011.68. Increasing the amount of multi-family housing within the city would increase the housing supply — potentially lowering the costs for renters and home buyers — while also increasing the city’s tax base. Ultimately, the increased revenue could go towards improving valuable services provided to the community.

References

Barabás, G. (n.d.). Part II: Statistical Inference. In Introduction to Data Analysis and Visualization in R. essay. Retrieved April 29, 2024, from https://dysordys.github.io/data-with-R/Intro_statistics.html

Chen, M., Chen, Y., Wang, X., Tan, H., & Luo, F. (2019). Spatial difference of transit-based accessibility to hospitals by regions using spatially adjusted ANOVA. International Journal of Environmental Research and Public Health, 16(11), 1923. https://doi.org/10.3390/ijerph16111923

Ihlanfeldt, K. R. (2007). The effect of land use regulation on housing and land prices. Journal of Urban Economics, 61(3), 420–435. https://doi.org/10.1016/j.jue.2006.09.003

Kim, D., & Jin, J. (2019). The effect of land use on housing price and rent: Empirical evidence of job accessibility and mixed land use. Sustainability, 11(3), 938. https://doi.org/10.3390/su11030938

Nielsen, C. (2018, June 22). We measure car value based on miles per gallon, not miles per tank. why don’t we do the same for our cities’ developments? Strong Towns. https://www.strongtowns.org/journal/2018/6/22/miles-per-gallon-tank-value-per-acre

O’Halloran, K. (2019). The City as a Generator of Public Value: An Analysis of Taxable Value per Acre and Public Infrastructure Investment in Jacksonville, Florida. Georgetown Library Repository. https://repository.library.georgetown.edu/bitstream/handle/10822/1062768/O%27Halloran%20Paper.pdf?sequence=1

Song, Y., & Knaap, G.J. (2004). Measuring the effects of mixed land uses on housing values. Regional Science and Urban Economics, 34(6), 663–680. https://doi.org/10.1016/j.regsciurbeco.2004.02.003

Tennessee Comptroller of the Treasury. (2018, July 10). Land Use Classifications. Comptroller of the Treasury. https://comptroller.tn.gov/content/dam/cot/pa/documents/landuse/COTLandUseCodes.pdf

Tennessee Comptroller of the Treasury. (2023a). 2023_LU_WASHINGTON shapefile [Data Set]. https://comptroller.tn.gov/content/dam/cot/pa/documents/landuse/maps/landuse/county/2023_LU_WASHINGTON.zip

Tennessee Comptroller of the Treasury. (2023b). Existing Land Use: Washington County, Tennessee. Comptroller of the Treasury. https://comptroller.tn.gov/content/dam/cot/pa/documents/landuse/maps/landuse/city/2023_LU_WASHINGTON.pdf

Tennessee Comptroller of the Treasury. (n.d.). What is IMPACT. Comptroller of the Treasury. https://comptroller.tn.gov/office-functions/pa/impact/what-is-impact.html

Tennessee STS-GIS. (2023). TN City boundaries. State of Tennessee Downloadable GIS Data. https://tn-tnmap.opendata.arcgis.com/datasets/tnmap::tn-city-boundaries/about

Wang, Y., Kockelman, K. M., & Wang, X. (Cara). (2013). Understanding spatial filtering for analysis of land use-transport data. Journal of Transport Geography, 31, 122–131. https://doi.org/10.1016/j.jtrangeo.2013.06.001

Appendices

Appendix A

The table below shows the land use codes, their description, and how they were consolidated for this analysis.

lookuptable %>% 
    select(!COLOR) %>% #Remove the color row
    gt() %>% 
    tab_header(title = md("Land Use Codes and Consolidation Look Up Table")) %>%
    cols_align(columns = "LU_CLASSIF", align = "center") %>% 
    opt_align_table_header(align = "left")

LU_CLASSIF	CLASS_DESC	CLASS_CONSOL
Land Use Codes and Consolidation Look Up Table
0	Residential	Residential
1	Single Family Residential < 5 acres	Single Family Residential < 5 acres
2	Single Family Residential >= 5 acres	Single Family Residential >= 5 acres
3	Duplex	Duplex
4	Multi-Family	Multi-Family
5	Mobile Home	Mobile Home
6	Mobile Home Park	Mobile Home Park
7	Resort Residential	Resort Residential
10	Commercial	Misc Commercial
11	General Commercial	General Commercial
12	Office	Office
13	Motel and Hotel	Misc Commercial
14	General Commercial-Residential Split	Misc Commercial
15	Golf Course	Misc Commercial
16	Nursing Home	Misc Commercial
20	Industrial	Industrial
21	Light Industrial and Warehousing	Light Industrial and Warehousing
22	Heavy Industrial	Heavy Industrial
30	Public and Semi-Public	Public and Semi-Public
31	Public	Public
32	Semi-Public	Semi-Public
40	Utilities	Utilities
41	Utilities	Utilities
50	Vacant	Vacant
51	Vacant Lot < 5 acres	Vacant
52	Vacant Tract >= 5 acres	Vacant
53	Vacant Resort Lot	Vacant
60	Agricultural	Agriculture
61	Agricultural Tract Unimproved	Agriculture
62	Agricultural Tract with SFR	Agriculture
63	Agricultural Tract with Mobile Home	Agriculture
64	Agricultural Tract with SFR and Mobile Home	Agriculture
65	Agricultural Tract with Multi-Family Residential	Agriculture
70	Timber and Forest	Timber
71	Timber Tract Unimproved	Timber
72	Timber Tract with SFR	Timber
73	Timber Tract with Mobile Home	Timber
74	Timber Tract with SFR and Mobile Home	Timber
75	Timber Tract with Multi-Family Residential	Timber
80	Water	Water
81	River	Water
82	Lake or Pond	Water
90	Transportation	Transportation
91	Highway or Road Right-of-Way	Transportation
92	Railroad Right-of-Way	Transportation
93	Airport	Transportation
96	Unclassified Structure less than 30k	NA
97	Unclassified Structure equal to or greater than 30k	NA
98	CAAS Data unavailable	NA
99	Uncoded by Landuse model	NA

Appendix B

The table below shows summary statistics for each ring and land use type.

twogroup_summary <- landuseJC_filtered %>%
    st_drop_geometry() %>%
    group_by(CLASS_CONSOL, RING_ID) %>%
    summarise(count = n(),
            mean = mean(TVPA),
            median = median(TVPA),
            max = max(TVPA),
            min = min(TVPA),
            skew = skewness(TVPA),
            kurtosis = kurtosis(TVPA)) %>% 
    rename(LANDUSE = CLASS_CONSOL)

gt(twogroup_summary, rowname_col = "RING_ID") %>% 
    tab_header(title = md("Taxable Value Per Acre Summary Statistics"),
               subtitle = "Grouped by Land Use Category and Ring") %>%
    fmt_number(decimals = 2, use_seps = TRUE, drop_trailing_zeros = TRUE) %>% 
    opt_align_table_header(align = "left")

	count	mean	median	max	min	skew	kurtosis
Taxable Value Per Acre Summary Statistics
Grouped by Land Use Category and Ring
Agriculture
1	2	8,755.9	8,755.9	16,172.52	1,339.28	0	1
2	34	13,029.26	5,774.73	153,540.3	1,369.98	4.96	27.46
3	12	8,580.23	3,861.47	27,986.39	1,793.28	1.25	3.12
4	5	8,462.45	8,496.26	11,056.6	6,437.67	0.31	1.83
5	5	11,393.73	10,163.19	20,860.46	3,003.69	0.15	1.31
Duplex
1	301	373,135.04	345,975.91	1,047,238.7	74,139.56	0.92	3.99
2	47	234,693.81	203,603.49	559,106.72	12,832.3	0.7	2.91
3	22	222,767.02	263,289.25	401,949.97	21,411.84	−0.4	2.4
4	21	250,214.21	217,199.04	695,486.96	38,302.69	1.68	5.28
5	1	29,730.89	29,730.89	29,730.89	29,730.89	NaN	NaN
General Commercial
1	615	560,610.65	403,904.89	7,501,784.07	4,023.42	5.2	39.66
2	385	365,356.66	309,072.84	2,250,817.31	3,981.75	1.88	10.64
3	92	300,784.6	271,888.95	945,557.35	5,923.3	0.69	2.77
4	122	246,168.28	194,929.56	797,341.2	690.57	0.92	3.43
5	2	81,455.4	81,455.4	102,392.07	60,518.72	0	1
Light Industrial and Warehousing
1	123	275,670.23	186,339.12	3,158,140.13	19,630.32	5.49	45.23
2	75	211,754.47	159,039.66	666,071	6,616.07	1.01	3.31
3	14	268,157.53	236,487.06	574,770.56	99,236.64	1.01	3.12
4	14	255,167.59	167,791.58	851,741.58	26,565.76	1.52	3.99
5	3	51,726.51	38,754.08	83,494.5	32,930.94	0.67	1.5
Misc Commercial
1	34	765,241.16	459,309.7	2,990,211.29	17,623.51	1.27	3.87
2	33	939,179.72	314,814.13	5,885,410.69	16,535.11	2.26	7.12
3	8	571,094.51	326,211.93	1,974,481.57	9,027.36	1.18	3.3
4	14	805,010.52	133,842.42	2,387,122.34	8,995.57	0.63	1.46
Mobile Home
1	48	54,175.36	47,642.34	127,770.02	12,837.73	0.81	3.13
2	139	35,113.88	28,567.55	198,249.64	4,438.26	2.14	11.82
3	62	56,664.64	39,086.94	344,996.03	6,954.76	2.87	12.47
4	43	64,951.61	52,463.29	305,672.43	5,932.12	2.41	9.52
5	14	30,323.39	31,160.26	47,823.73	14,830.97	0.15	1.86
Mobile Home Park
1	36	117,121.91	65,250.96	1,336,327.99	1,340.15	4.76	25.72
2	27	46,179.61	38,382.33	186,311.03	3,139.25	2.38	10.2
3	4	39,017.97	43,080.06	62,528.91	7,382.86	−0.55	1.99
4	19	68,708.71	48,260.71	352,036.55	13,868.44	2.93	11.54
5	4	75,544.28	43,331.11	185,875.57	29,639.34	1.11	2.29
Multi-Family
1	955	2,589,924.26	1,989,414.9	11,620,210.47	24,145.31	0.43	2.22
2	558	2,867,554.76	3,042,167.56	7,662,640.18	9,614.7	−0.04	1.93
3	356	3,397,568.5	3,851,662.19	6,893,143.39	72,001.86	−0.54	1.99
4	540	3,098,409.2	3,091,696.34	8,146,735.28	18,008.75	0.89	5.19
Office
1	179	846,495.81	620,346.04	5,959,520.94	17,229.23	3.15	15.8
2	137	987,506.55	498,555.47	13,862,435.86	23,888.82	4.6	28.94
3	30	410,909.18	458,906.45	903,764.16	130,527.41	0.38	2.74
4	31	415,982.91	385,854.55	907,115.18	1,659.54	0.19	2.42
Single Family Residential < 5 acres
1	9,418	309,238.09	273,001.5	3,824,076.75	8,112.84	5.76	61.8
2	7,429	354,623.97	263,085.7	6,529,189.84	7,532.39	6.6	70.04
3	3,175	446,093.7	311,961.28	7,023,656.26	2,692.09	5.23	42.81
4	3,436	340,242.26	247,956.66	5,411,590.31	6,514.24	4.73	29.73
5	376	154,554.17	148,437.04	512,953.49	7,610.05	0.84	4.24
Single Family Residential >= 5 acres
1	92	48,123.75	32,859.1	462,524.77	132.16	4.17	23.07
2	473	45,563.39	25,792.61	3,040,312.84	1,208.22	17.7	349.88
3	280	44,029.94	34,181.65	343,185.02	312.86	3.13	18.45
4	194	38,137.62	30,967.21	166,456.03	5,087.91	1.77	7.07
5	47	27,994.74	23,372.95	70,680.22	6,081.45	0.81	2.69

Appendix C

The table below shows the pairwise test statistics from the Dunn Test.

# Put in a neat table
gt(dunn_neat) %>% 
    tab_header(title = md("Dunn Test Results"),
               subtitle = "Pairwise Comparisons of Land Use Types") %>% 
    fmt_scientific(n_sigfig = 3, drop_trailing_zeros = TRUE, drop_trailing_dec_mark = TRUE) %>% 
    tab_footnote("Signif. codes: 0 <= '***' < 0.001 < '**' < 0.01 < '*' < 0.05", placement = "right") %>% 
    opt_align_table_header(align = "left")

Comparison1	Comparison2	Z	P.unadj	P.adj	signif
Dunn Test Results
Pairwise Comparisons of Land Use Types
Agriculture	Duplex	−1.31 × 10¹	2.84 × 10⁻³⁹	8.51 × 10⁻³⁸	***
Agriculture	General Commercial	−1.39 × 10¹	8.16 × 10⁻⁴⁴	2.53 × 10⁻⁴²	***
Agriculture	Light Industrial and Warehousing	−8.40	4.47 × 10⁻¹⁷	8.05 × 10⁻¹⁶	***
Agriculture	Misc Commercial	−1.12 × 10¹	5.19 × 10⁻²⁹	1.30 × 10⁻²⁷	***
Agriculture	Mobile Home	−1.23	2.18 × 10⁻¹	1.00
Agriculture	Mobile Home Park	−1.87	6.19 × 10⁻²	5.57 × 10⁻¹
Agriculture	Multi-Family	−2.26 × 10¹	9.95 × 10⁻¹¹³	4.18 × 10⁻¹¹¹	***
Agriculture	Office	−1.75 × 10¹	2.18 × 10⁻⁶⁸	8.27 × 10⁻⁶⁷	***
Agriculture	Single Family Residential < 5 acres	−1.25 × 10¹	6.06 × 10⁻³⁶	1.64 × 10⁻³⁴	***
Agriculture	Single Family Residential >= 5 acres	−1.07	2.83 × 10⁻¹	1.00
Duplex	General Commercial	−3.64 × 10⁻¹	7.16 × 10⁻¹	1.00
Duplex	Light Industrial and Warehousing	7.33	2.27 × 10⁻¹³	3.40 × 10⁻¹²	***
Duplex	Misc Commercial	−3.56 × 10⁻¹	7.22 × 10⁻¹	1.00
Duplex	Mobile Home	2.19 × 10¹	5.16 × 10⁻¹⁰⁶	2.12 × 10⁻¹⁰⁴	***
Duplex	Mobile Home Park	1.31 × 10¹	3.67 × 10⁻³⁹	1.06 × 10⁻³⁷	***
Duplex	Multi-Family	−2.12 × 10¹	1.42 × 10⁻⁹⁹	5.69 × 10⁻⁹⁸	***
Duplex	Office	−8.60	7.98 × 10⁻¹⁸	1.52 × 10⁻¹⁶	***
Duplex	Single Family Residential < 5 acres	3.91	9.26 × 10⁻⁵	1.02 × 10⁻³	**
Duplex	Single Family Residential >= 5 acres	2.89 × 10¹	5.00 × 10⁻¹⁸³	2.35 × 10⁻¹⁸¹	***
General Commercial	Light Industrial and Warehousing	8.76	1.96 × 10⁻¹⁸	3.92 × 10⁻¹⁷	***
General Commercial	Misc Commercial	−1.88 × 10⁻¹	8.51 × 10⁻¹	8.51 × 10⁻¹
General Commercial	Mobile Home	2.64 × 10¹	9.33 × 10⁻¹⁵⁴	4.29 × 10⁻¹⁵²	***
General Commercial	Mobile Home Park	1.42 × 10¹	8.95 × 10⁻⁴⁶	2.95 × 10⁻⁴⁴	***
General Commercial	Multi-Family	−3.22 × 10¹	2.36 × 10⁻²²⁷	1.15 × 10⁻²²⁵	***
General Commercial	Office	−1.02 × 10¹	2.85 × 10⁻²⁴	6.27 × 10⁻²³	***
General Commercial	Single Family Residential < 5 acres	7.49	6.83 × 10⁻¹⁴	1.16 × 10⁻¹²	***
General Commercial	Single Family Residential >= 5 acres	4.12 × 10¹	0.00	0.00	***
Light Industrial and Warehousing	Misc Commercial	−5.22	1.82 × 10⁻⁷	2.36 × 10⁻⁶	***
Light Industrial and Warehousing	Mobile Home	1.21 × 10¹	9.07 × 10⁻³⁴	2.36 × 10⁻³²	***
Light Industrial and Warehousing	Mobile Home Park	7.40	1.38 × 10⁻¹³	2.21 × 10⁻¹²	***
Light Industrial and Warehousing	Multi-Family	−2.55 × 10¹	1.93 × 10⁻¹⁴³	8.51 × 10⁻¹⁴²	***
Light Industrial and Warehousing	Office	−1.47 × 10¹	8.24 × 10⁻⁴⁹	2.80 × 10⁻⁴⁷	***
Light Industrial and Warehousing	Single Family Residential < 5 acres	−6.19	6.15 × 10⁻¹⁰	8.62 × 10⁻⁹	***
Light Industrial and Warehousing	Single Family Residential >= 5 acres	1.50 × 10¹	8.39 × 10⁻⁵¹	2.94 × 10⁻⁴⁹	***
Misc Commercial	Mobile Home	1.42 × 10¹	9.29 × 10⁻⁴⁶	2.97 × 10⁻⁴⁴	***
Misc Commercial	Mobile Home Park	1.05 × 10¹	7.28 × 10⁻²⁶	1.75 × 10⁻²⁴	***
Misc Commercial	Multi-Family	−1.03 × 10¹	7.07 × 10⁻²⁵	1.63 × 10⁻²³	***
Misc Commercial	Office	−4.91	9.15 × 10⁻⁷	1.10 × 10⁻⁵	***
Misc Commercial	Single Family Residential < 5 acres	2.27	2.33 × 10⁻²	2.33 × 10⁻¹
Misc Commercial	Single Family Residential >= 5 acres	1.58 × 10¹	3.25 × 10⁻⁵⁶	1.20 × 10⁻⁵⁴	***
Mobile Home	Mobile Home Park	−1.15	2.50 × 10⁻¹	1.00
Mobile Home	Multi-Family	−4.65 × 10¹	0.00	0.00	***
Mobile Home	Office	−2.97 × 10¹	2.15 × 10⁻¹⁹⁴	1.03 × 10⁻¹⁹²	***
Mobile Home	Single Family Residential < 5 acres	−2.55 × 10¹	7.89 × 10⁻¹⁴⁴	3.55 × 10⁻¹⁴²	***
Mobile Home	Single Family Residential >= 5 acres	4.90 × 10⁻¹	6.24 × 10⁻¹	1.00
Mobile Home Park	Multi-Family	−2.50 × 10¹	6.41 × 10⁻¹³⁸	2.76 × 10⁻¹³⁶	***
Mobile Home Park	Office	−1.83 × 10¹	4.67 × 10⁻⁷⁵	1.82 × 10⁻⁷³	***
Mobile Home Park	Single Family Residential < 5 acres	−1.26 × 10¹	1.99 × 10⁻³⁶	5.57 × 10⁻³⁵	***
Mobile Home Park	Single Family Residential >= 5 acres	1.55	1.22 × 10⁻¹	9.75 × 10⁻¹
Multi-Family	Office	9.63	6.13 × 10⁻²²	1.29 × 10⁻²⁰	***
Multi-Family	Single Family Residential < 5 acres	6.33 × 10¹	0.00	0.00	***
Multi-Family	Single Family Residential >= 5 acres	7.81 × 10¹	0.00	0.00	***
Office	Single Family Residential < 5 acres	1.58 × 10¹	3.88 × 10⁻⁵⁶	1.40 × 10⁻⁵⁴	***
Office	Single Family Residential >= 5 acres	3.88 × 10¹	0.00	0.00	***
Single Family Residential < 5 acres	Single Family Residential >= 5 acres	4.84 × 10¹	0.00	0.00	***
Signif. codes: 0 <= '*' < 0.001 < '' < 0.01 < '*' < 0.05

Exploring Taxable Value per Acre by Land Use Type in Johnson City, Tennessee 2023

GEOG588 Term Project

Benjamin Browning

2024-05-01