Notebook Summary

This notebook looks at population density metrics for a set of selected countries. For some further detail on data sources and definitions, see the end of this notebook.

Countries currently in the data:
  • Australia
  • United States of America

1. Initialise and Read in the Data

Data is sourced from the relevant national statistical agencies and gathered using scripts in the data-raw directory with results stored as csv files in the inst/extdata directory and as rda files in the data directory.

library(geo.popn.density)
InitCSS()

all_data <- GetAllData()



2. Data Review

2.1 Data Coverage

Total Population, Area and Avg. Density by Country
ShowTable(TableCountryTotals(all_data), params$table_css)


Population Summary Statistics by Statistical Area
ShowTable(TableSummStats(all_data, stat_col = params$ppn_est), params$table_css)


And by Metropolitan Area
all_data_exc_rural <- ExcRural(all_data, msa_name = params$msa_name, msa_code = params$msa_code)
ShowTable(TableSummStats(all_data_exc_rural, group_by = params$msa_name, stat_col = params$ppn_est), params$table_css)

Note, includes Metropolitan and Micropolitan areas in the US data.

Area Summary Statistics by Statistical Area
ShowTable(TableSummStats(all_data, stat_col = params$area, dflt_dp = 2), params$table_css)


And by Metropolitan Area
ShowTable(TableSummStats(all_data_exc_rural, group_by = params$msa_name, stat_col = params$area, dflt_dp = 2), params$table_css)

Note, includes Metropolitan and Micropolitan areas in the US data.

Discussion 2022-06-21

The “Statistical Area Level 2” is the lowest level statistical area data generally available for Australian population and is driving a much lower fidelity than that available for the US. This leads to inevitable issues of just how comparable these statistic actually across these data sets.
Further, the metropolitan statistical area definition looks to typically generate a larger geographical area in the US than Australia; again leading to issues of comparability.


3 Density Comparisons

3.1 Density by Country

densities <- CalcDensities(all_data, ccode = params$ccode, ppn_est = params$ppn_est,
                           area = params$area, density = params$density, group_by = params$ccode)
PlotDensities(densities, density = params$density, ccode = params$ccode)



Weighted Arithmetic: \(\displaystyle \frac{\sum_{i=1}^{n}(density_i * population_i)}{\sum_{i=1}^{n}{population_i}}\)

Weighted Geometric: \(\displaystyle {\exp(\frac{\sum_{i=1}^{n}(\log(density_i) * population_i)}{\sum_{i=1}^{n}{population_i}})}\)

Weighted Median: \(\displaystyle {median(density_i)}\)

Simple: \(\displaystyle \frac{\sum_{i=1}^{n}{population_i}}{\sum_{i=1}^{n}{area_i}}\)

Where: \(\displaystyle density_i = \frac{population_i}{area_i}\)

Discussion 2022-06-21

So how dense is Australia vs. USA? Well depends on how you calculate it. The arithmetic method is weighted by areas in the US with very high density e.g. in New York and Chicago. The median result is interesting as it show the 50th percentile or median person in Australia experiences a higher density than the US.

3.2 Statistical Areas by Density

all_by_density <- CalcCumPercent(all_data, cum_pct = params$cum_pct, metric = params$ppn_est, sort_by = params$density, group_by = params$ccode)
mdn_sa <- FindMedian(all_by_density, cum_pct = params$cum_pct)
PlotAllDataByDensity(all_by_density, density = params$density, ccode = params$ccode, cum_pct = params$cum_pct, mdn_sa = mdn_sa)


The Median Statistical Area by Density
# Sorting the data by descending data and calculating the cumulative percentage population to find the median
show_cols <- c(params$ccode, params$sabbr, params$msa_name, params$sa_code, params$ppn_est, params$area, params$density, params$cum_pct)
ShowTable(TableMedian(mdn_sa, show_cols), params$table_css)


3.3 Metropolitan Areas by Population

# Group the data into its metropolitan statistical areas
msas <- GroupData(all_data, ppn_est = params$ppn_est, area = params$area, ppn_pct = params$ppn_pct, density = params$density, group_by = c(params$ccode, "MSA_Code", params$msa_name, "Metro_Micro"))

# Sort by population descending
msas_by_popn <- SortMSAsbyPopn(msas, cum_pct = params$cum_pct, ppn_est = params$ppn_est, ppn_pct = params$ppn_pct, msa_name = params$msa_name, group_by = params$ccode)
mdn_msa_popn <- FindMedian(msas_by_popn, cum_pct = params$cum_pct)
show_cols <- c(params$ccode, params$msa_name, params$ppn_est, params$area, params$density, params$cum_pct)

PlotMSAsByPopn(msas_by_popn, ppn_est = params$ppn_est, ccode = params$ccode, cum_pct = params$cum_pct)



The Median Metroplitan Areas by Population

ShowTable(TableMedian(mdn_msa_popn, show_cols), params$table_css)


3.4 Metropolitan Areas by Simple Density

msas_by_density <- SortMSAsbyDensity(msas, cum_pct = params$cum_pct, density = params$density, ppn_pct = params$ppn_pct, group_by = params$ccode)
PlotMSAsByDensity(msas_by_density, density = params$density, ccode = params$ccode, cum_pct = params$cum_pct)


The Median Metroplitan Areas by Simple Density

mdn_msa_density <- FindMedian(msas_by_density, cum_pct = params$cum_pct)
show_cols <- c(params$ccode, params$msa_name, params$ppn_est, params$area, params$density, params$cum_pct)
ShowTable(TableMedian(mdn_msa_density, show_cols), params$table_css)




Source Data

Australia

Data sourced from the Australian Bureau of Statistics. Population data sourced at the Statistical Area 2 level. See references below for further detail on Australian statistical geographic standards. Density is given by the ABS but is recalculated here as the population-area-density numbers do not line up as given. Data is output to au.csv in the data directory.

United States

Data sourced from the US census bureau via the tidycensus package. Population data sourced at the census tract level. Data is output to us.csv in the data directory.

Source Code

The repo for this code is here on GitHub