This notebook looks at population density metrics for a set of
selected countries. For some further detail on data sources and
definitions, see the end of this notebook.
Data is sourced from the relevant national statistical agencies and
gathered using scripts in the data-raw directory with
results stored as csv files in the
inst/extdata directory and as rda files in the
data directory.
library(geo.popn.density)
InitCSS()
all_data <- GetAllData()
ShowTable(TableCountryTotals(all_data), params$table_css)
ShowTable(TableSummStats(all_data, stat_col = params$ppn_est), params$table_css)
all_data_exc_rural <- ExcRural(all_data, msa_name = params$msa_name, msa_code = params$msa_code)
ShowTable(TableSummStats(all_data_exc_rural, group_by = params$msa_name, stat_col = params$ppn_est), params$table_css)
Note, includes Metropolitan and Micropolitan areas
in the US data.
ShowTable(TableSummStats(all_data, stat_col = params$area, dflt_dp = 2), params$table_css)
ShowTable(TableSummStats(all_data_exc_rural, group_by = params$msa_name, stat_col = params$area, dflt_dp = 2), params$table_css)
Note, includes Metropolitan and Micropolitan areas
in the US data.
The “Statistical Area Level 2” is the lowest level
statistical area data generally available for Australian population and
is driving a much lower fidelity than that available for the US. This
leads to inevitable issues of just how comparable these statistic
actually across these data sets.
Further, the metropolitan statistical area definition looks to typically
generate a larger geographical area in the US than Australia; again
leading to issues of comparability.
densities <- CalcDensities(all_data, ccode = params$ccode, ppn_est = params$ppn_est,
area = params$area, density = params$density, group_by = params$ccode)
PlotDensities(densities, density = params$density, ccode = params$ccode)
Weighted Arithmetic: \(\displaystyle
\frac{\sum_{i=1}^{n}(density_i *
population_i)}{\sum_{i=1}^{n}{population_i}}\)
Weighted Geometric: \(\displaystyle
{\exp(\frac{\sum_{i=1}^{n}(\log(density_i) *
population_i)}{\sum_{i=1}^{n}{population_i}})}\)
Weighted Median: \(\displaystyle
{median(density_i)}\)
Simple: \(\displaystyle
\frac{\sum_{i=1}^{n}{population_i}}{\sum_{i=1}^{n}{area_i}}\)
Where: \(\displaystyle density_i =
\frac{population_i}{area_i}\)
So how dense is Australia vs. USA? Well depends on how you calculate
it. The arithmetic method is weighted by areas in the US with very high
density e.g. in New York and Chicago. The median result is interesting
as it show the 50th percentile or median person in Australia experiences
a higher density than the US.
all_by_density <- CalcCumPercent(all_data, cum_pct = params$cum_pct, metric = params$ppn_est, sort_by = params$density, group_by = params$ccode)
mdn_sa <- FindMedian(all_by_density, cum_pct = params$cum_pct)
PlotAllDataByDensity(all_by_density, density = params$density, ccode = params$ccode, cum_pct = params$cum_pct, mdn_sa = mdn_sa)
# Sorting the data by descending data and calculating the cumulative percentage population to find the median
show_cols <- c(params$ccode, params$sabbr, params$msa_name, params$sa_code, params$ppn_est, params$area, params$density, params$cum_pct)
ShowTable(TableMedian(mdn_sa, show_cols), params$table_css)
# Group the data into its metropolitan statistical areas
msas <- GroupData(all_data, ppn_est = params$ppn_est, area = params$area, ppn_pct = params$ppn_pct, density = params$density, group_by = c(params$ccode, "MSA_Code", params$msa_name, "Metro_Micro"))
# Sort by population descending
msas_by_popn <- SortMSAsbyPopn(msas, cum_pct = params$cum_pct, ppn_est = params$ppn_est, ppn_pct = params$ppn_pct, msa_name = params$msa_name, group_by = params$ccode)
mdn_msa_popn <- FindMedian(msas_by_popn, cum_pct = params$cum_pct)
show_cols <- c(params$ccode, params$msa_name, params$ppn_est, params$area, params$density, params$cum_pct)
PlotMSAsByPopn(msas_by_popn, ppn_est = params$ppn_est, ccode = params$ccode, cum_pct = params$cum_pct)
ShowTable(TableMedian(mdn_msa_popn, show_cols), params$table_css)
msas_by_density <- SortMSAsbyDensity(msas, cum_pct = params$cum_pct, density = params$density, ppn_pct = params$ppn_pct, group_by = params$ccode)
PlotMSAsByDensity(msas_by_density, density = params$density, ccode = params$ccode, cum_pct = params$cum_pct)
mdn_msa_density <- FindMedian(msas_by_density, cum_pct = params$cum_pct)
show_cols <- c(params$ccode, params$msa_name, params$ppn_est, params$area, params$density, params$cum_pct)
ShowTable(TableMedian(mdn_msa_density, show_cols), params$table_css)
Data sourced from the Australian Bureau of Statistics. Population
data sourced at the Statistical Area 2 level. See references below for
further detail on Australian statistical geographic standards. Density
is given by the ABS but is recalculated here as the
population-area-density numbers do not line up as given. Data is output
to au.csv in the data directory.
Data sourced from the US census bureau via the
tidycensus package. Population data sourced at the census
tract level. Data is output to us.csv in the
data directory.
The repo for this code is here on GitHub