Using open demographic data downloaded from the Province of New Brunswick (PNB) website (https://www2.gnb.ca/content/dam/gnb/Departments/fin/pdf/esi/demographic-demographique/ComponentsofGrowth-ComposantesDeLaCroissance.xlsx), this exercise aims to determine which component has the largest effect on population growth in New Brunswick: Natural Balance (births minus deaths), Interprovincial Migration, or Immigration.
The 48-year source dataset was cleaned up in Microsoft Excel before being saved as a CSV file for analysis in R. The cleanup involved collapsing all multi-row column headings to a single row, keeping English-language text only, shortening some of the column headings, and deleting the last row of the dataset, which contained only the starting population for the period 2019-20. No data key for the variables was included with the document or found anywhere on the PNB website.
Data Frame in R: NB_pop
names(NB_pop)
## [1] "Period" "Population_begin_period"
## [3] "Births" "Deaths"
## [5] "Interprov_migration_In" "Interprov_migration_Out"
## [7] "Interprov_migration_Net" "Immigrants"
## [9] "Emigrants" "Net_NPR"
## [11] "Residual_deviation" "Total_growth"
The column headings/variables in the dataset are mostly self-explanatory: Period (July 1 to June 30), Population_begin_period, Births, Deaths, Interprov_migration_In, Interprov_migration_Out, Interprov_migration_Net, Immigrants, Emigrants.
However, a few variable names needed further online research on population change measures. Net_NPR is the net change in the number of Non-Permanent Residents (Permanent Residents are already included in the population). Residual_deviation is “obtained by distributing the error of closure linearly throughout the intercensal period. The error of closure is defined as the difference between the postcensal population estimates on Census Day and the population enumerated in that census adjusted for census net undercoverage and incompletely enumerated Indian reserves” (from Statistics Canada https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1710000801 footnote 15).
Total Growth (per period) is the sum of: Births minus Deaths, Interprov_migration_Net, Immigrants minus Emigrants, Net_NPR and Residual_deviation.
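The accounting identity above can be written as a small R helper. This is only a sketch: the function name `growth_from_components` and the one-row `toy` data frame are hypothetical, and the toy values are made up for illustration, not taken from the PNB data. The column names are the ones listed by `names(NB_pop)`.

```r
# Rebuild total growth from its components for every row of a data frame.
# Column names follow names(NB_pop); the function name is hypothetical.
growth_from_components <- function(df) {
  with(df,
       (Births - Deaths) +          # natural balance
       Interprov_migration_Net +    # net interprovincial migration
       (Immigrants - Emigrants) +   # net international migration
       Net_NPR +                    # net non-permanent residents
       Residual_deviation)          # residual deviation
}

# One made-up row, for illustration only (NOT real PNB figures)
toy <- data.frame(
  Births = 7000, Deaths = 6500,
  Interprov_migration_Net = -800,
  Immigrants = 1500, Emigrants = 400,
  Net_NPR = 200, Residual_deviation = 100
)

toy_growth <- growth_from_components(toy)
print(toy_growth)
```

On the real data, comparing `growth_from_components(NB_pop)` with `NB_pop$Total_growth` (for example with `all.equal()`) would be one way to confirm that the cleaned CSV still satisfies the identity.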
Variables that will need to be added to the dataset are Natural_Balance (births minus deaths) and Net_Immigration (immigrants minus emigrants). This idea is adapted from a Brandon University document (https://www.brandonu.ca/rdi/files/2014/09/Components-of-Population-Change1.pdf).
We can see in the above plot that the population of New Brunswick has decreased several times on an annual basis since 1997.
In the dataset, four variables are considered independent variables – Net Interprovincial Migration, Natural Balance, Net Immigration and Net Non-Permanent Residents. The dependent variable is the Total Annual Population Change. One variable, Residual Deviation, is not being included in the analysis as it is more or less a “fudge factor” used to make the population accounting “balance”.
We note in the above plot that in most years more New Brunswickers move out of the province than other Canadians move in.
We note in the above plot that immigration to New Brunswick jumped dramatically starting in 2006-07.
We note above that the Natural Balance in New Brunswick turned negative during the period 2014-15. In other words, since then more people have died each year in New Brunswick than have been born.
The number of Non-Permanent Residents took a huge jump in the last year of the dataset, which might be attributed to the arrival of Syrian refugees in New Brunswick.
summary(NB_pop)
## Period Population_begin_period Births Deaths
## Length:48 Min. :642471 Min. : 6550 Min. :5000
## Class :character 1st Qu.:712996 1st Qu.: 7121 1st Qu.:5286
## Mode :character Median :747467 Median : 8534 Median :5886
## Mean :731113 Mean : 8797 Mean :5907
## 3rd Qu.:750687 3rd Qu.:10406 3rd Qu.:6316
## Max. :770921 Max. :12047 Max. :7822
## Interprov_migration_In Interprov_migration_Out Interprov_migration_Net
## Min. : 8517 Min. : 9702 Min. :-4989.0
## 1st Qu.:10704 1st Qu.:11662 1st Qu.:-1907.5
## Median :11674 Median :12632 Median : -875.5
## Mean :12844 Mean :13508 Mean : -663.9
## 3rd Qu.:13856 3rd Qu.:15079 3rd Qu.: 222.8
## Max. :24072 Max. :19806 Max. : 6037.0
## Immigrants Emigrants Net_NPR Residual_deviation
## Min. : 558.0 Min. :183.0 Min. :-249.0 Min. :-1222.0
## 1st Qu.: 686.2 1st Qu.:313.2 1st Qu.: -6.5 1st Qu.: 0.0
## Median : 878.5 Median :478.0 Median : 100.0 Median : 891.5
## Mean :1429.6 Mean :473.2 Mean : 209.4 Mean : 593.4
## 3rd Qu.:1942.5 3rd Qu.:588.5 3rd Qu.: 362.0 3rd Qu.: 1145.0
## Max. :5076.0 Max. :830.0 Max. :1752.0 Max. : 1821.0
## Total_growth
## Min. :-2436.0
## 1st Qu.: 205.8
## Median : 2625.5
## Mean : 2799.1
## 3rd Qu.: 4576.0
## Max. :12486.0
Additional Variables:
Net_Immigration = Immigrants - Emigrants
Natural_Balance = Births - Deaths
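A minimal sketch of how the two derived variables might be added and the four-variable model fitted. The model formula is the one shown in the `Call:` line of the `summary(lm.pop)` output. The helper name `add_components`, the `demo` data frame and the `lm.demo` object are hypothetical, and `demo` is filled with random numbers as a stand-in for `NB_pop`, so its coefficients mean nothing.

```r
# Add the two derived columns to a data frame that has the NB_pop column names
add_components <- function(df) {
  df$Net_Immigration <- df$Immigrants - df$Emigrants
  df$Natural_Balance <- df$Births - df$Deaths
  df
}

# Synthetic stand-in for NB_pop: random numbers, NOT the real data
set.seed(1)
n <- 48
demo <- data.frame(
  Births = round(runif(n, 6500, 12000)),
  Deaths = round(runif(n, 5000, 7800)),
  Immigrants = round(runif(n, 550, 5000)),
  Emigrants = round(runif(n, 180, 830)),
  Interprov_migration_Net = round(runif(n, -5000, 6000)),
  Net_NPR = round(runif(n, -250, 1750))
)
demo <- add_components(demo)

# Synthetic response: sum of the four components plus noise standing in
# for the omitted Residual_deviation term
demo$Total_growth <- with(demo, Natural_Balance + Net_Immigration +
                                Interprov_migration_Net + Net_NPR) +
  round(rnorm(n, mean = 0, sd = 500))

# Same formula as in the Call: line of the summary(lm.pop) output
lm.demo <- lm(Total_growth ~ Natural_Balance + Net_Immigration +
                Interprov_migration_Net + Net_NPR, data = demo)
print(coef(lm.demo))
```

With the real data, the same two steps would be `NB_pop <- add_components(NB_pop)` followed by the `lm()` call with `data = NB_pop`.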
summary(lm.pop)
##
## Call:
## lm(formula = Total_growth ~ Natural_Balance + Net_Immigration +
## Interprov_migration_Net + Net_NPR, data = NB_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -705.65 -413.61 -159.15 83.75 1573.02
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -374.93624 259.42935 -1.445 0.1556
## Natural_Balance 0.83366 0.05239 15.912 < 2e-16 ***
## Net_Immigration 1.27104 0.11136 11.414 1.34e-14 ***
## Interprov_migration_Net 0.87846 0.04948 17.752 < 2e-16 ***
## Net_NPR 0.62960 0.35040 1.797 0.0794 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 620.1 on 43 degrees of freedom
## Multiple R-squared: 0.9674, Adjusted R-squared: 0.9644
## F-statistic: 319.1 on 4 and 43 DF, p-value: < 2.2e-16
summary(lm.pop1)
##
## Call:
## lm(formula = Total_growth ~ Natural_Balance, data = NB_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4556 -1733 -1065 1540 6818
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 577.9260 611.8060 0.945 0.35
## Natural_Balance 0.7684 0.1622 4.737 2.11e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2723 on 46 degrees of freedom
## Multiple R-squared: 0.3279, Adjusted R-squared: 0.3132
## F-statistic: 22.44 on 1 and 46 DF, p-value: 2.114e-05
summary(lm.pop2)
##
## Call:
## lm(formula = Total_growth ~ Net_Immigration, data = NB_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4941.2 -2221.7 -754.4 1408.5 9195.1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2097.0088 603.3565 3.476 0.00112 **
## Net_Immigration 0.7341 0.4045 1.815 0.07605 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3208 on 46 degrees of freedom
## Multiple R-squared: 0.06683, Adjusted R-squared: 0.04654
## F-statistic: 3.294 on 1 and 46 DF, p-value: 0.07605
summary(lm.pop3)
##
## Call:
## lm(formula = Total_growth ~ Interprov_migration_Net, data = NB_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2694.1 -1422.7 -138.5 1408.7 3133.2
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3676.0565 253.3528 14.51 < 2e-16 ***
## Interprov_migration_Net 1.3209 0.1138 11.61 2.89e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1675 on 46 degrees of freedom
## Multiple R-squared: 0.7455, Adjusted R-squared: 0.7399
## F-statistic: 134.7 on 1 and 46 DF, p-value: 2.889e-15
summary(lm.pop4)
##
## Call:
## lm(formula = Total_growth ~ Net_NPR, data = NB_pop)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5232.0 -2591.5 -176.8 1757.1 9694.5
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.792e+03 5.558e+02 5.023 8.13e-06 ***
## Net_NPR 3.353e-02 1.344e+00 0.025 0.98
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3321 on 46 degrees of freedom
## Multiple R-squared: 1.353e-05, Adjusted R-squared: -0.02173
## F-statistic: 0.0006225 on 1 and 46 DF, p-value: 0.9802
NB_pop10 <- NB_pop[39:48,]
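The subset above hard-codes rows 39 to 48, which are the last ten rows only because the data frame has 48 rows. As a sketch of an alternative that does not depend on the row count, base R's `tail()` selects the same rows. The `demo` data frame here is a hypothetical stand-in that holds only a row counter.

```r
# Stand-in data frame with 48 rows, one per period (row counter only, no real data)
demo <- data.frame(row_id = 1:48)

by_index <- demo[39:48, , drop = FALSE]  # hard-coded row positions, as in the text
by_tail  <- tail(demo, 10)               # last ten rows, whatever the row count

print(by_tail$row_id)
```

If another period were appended to the CSV, `tail(NB_pop, 10)` would still return the most recent ten periods, whereas `NB_pop[39:48,]` would not.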
Additional Variables:
Net_Immigration = Immigrants - Emigrants
Natural_Balance = Births - Deaths
summary(lm.pop10)
##
## Call:
## lm(formula = Total_growth ~ Natural_Balance + Net_Immigration +
## Interprov_migration_Net + Net_NPR, data = NB_pop10)
##
## Residuals:
## 1 2 3 4 5 6 7 8 9 10
## -172.04 -21.39 402.62 -114.56 -175.00 70.53 -20.51 103.85 -24.02 -49.47
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.110e+03 5.208e+02 -2.130 0.086376 .
## Natural_Balance 1.326e+00 1.914e-01 6.929 0.000961 ***
## Net_Immigration 1.533e+00 2.326e-01 6.589 0.001209 **
## Interprov_migration_Net 5.892e-01 5.466e-02 10.778 0.000119 ***
## Net_NPR 5.762e-01 3.283e-01 1.755 0.139581
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 225.9 on 5 degrees of freedom
## Multiple R-squared: 0.9929, Adjusted R-squared: 0.9872
## F-statistic: 174.8 on 4 and 5 DF, p-value: 1.48e-05
summary(lm.pop10_1)
##
## Call:
## lm(formula = Total_growth ~ Natural_Balance, data = NB_pop10)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3405.7 -1329.1 597.2 1365.7 2044.3
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2627.8105 609.6689 4.310 0.00258 **
## Natural_Balance -0.9816 0.7489 -1.311 0.22631
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1923 on 8 degrees of freedom
## Multiple R-squared: 0.1768, Adjusted R-squared: 0.0739
## F-statistic: 1.718 on 1 and 8 DF, p-value: 0.2263
summary(lm.pop10_2)
##
## Call:
## lm(formula = Total_growth ~ Net_Immigration, data = NB_pop10)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2231.29 -913.78 78.73 869.10 1854.99
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1048.9848 1156.5059 -0.907 0.39089
## Net_Immigration 1.3603 0.3919 3.471 0.00843 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1339 on 8 degrees of freedom
## Multiple R-squared: 0.601, Adjusted R-squared: 0.5511
## F-statistic: 12.05 on 1 and 8 DF, p-value: 0.008426
summary(lm.pop10_3)
##
## Call:
## lm(formula = Total_growth ~ Interprov_migration_Net, data = NB_pop10)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1229.4 -874.7 -180.0 595.5 1875.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3740.1800 439.9795 8.501 2.81e-05 ***
## Interprov_migration_Net 0.9952 0.2299 4.328 0.00252 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1159 on 8 degrees of freedom
## Multiple R-squared: 0.7007, Adjusted R-squared: 0.6633
## F-statistic: 18.73 on 1 and 8 DF, p-value: 0.002518
summary(lm.pop10_4)
##
## Call:
## lm(formula = Total_growth ~ Net_NPR, data = NB_pop10)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1695.5 -615.8 241.6 423.3 1806.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1140.3653 465.4316 2.450 0.03993 *
## Net_NPR 3.0750 0.6357 4.838 0.00129 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1070 on 8 degrees of freedom
## Multiple R-squared: 0.7452, Adjusted R-squared: 0.7134
## F-statistic: 23.4 on 1 and 8 DF, p-value: 0.001292
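In the period 1971-2019, Interprovincial Migration appears to be the most influential factor in the yearly change in population in New Brunswick. Of the four variables investigated, the single-variable model for Interprov_migration_Net explains the most variance, with an R-squared of 0.7455, and has the lowest residual standard error, 1675; largest F-statistic, 134.7; largest t-value, 11.61; and smallest p-value, 2.89e-15. Interprov_migration_Net also has the largest t-value, 17.752, in the four-variable model. Its single-variable model has the largest intercept as well, 3676.0565, but the intercept only estimates growth when net interprovincial migration is zero and does not measure the variable's influence.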
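If we limit the period examined to the most recent ten (10) years of data, 2009-10 to 2018-19, the picture is less clear-cut. In the four-variable model, Interprov_migration_Net still has the largest t-value, 10.778, and the smallest p-value, 0.000119, so Interprovincial Migration would still appear to be the most influential factor in the yearly change in population in New Brunswick. Among the single-variable models, however, it ranks second to Net_NPR on every measure: R-squared, 0.7007 versus 0.7452; F-statistic, 18.73 versus 23.4; t-value, 4.328 versus 4.838; and p-value, 0.002518 versus 0.001292. With only ten observations, these rankings should be treated with caution.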
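The comparison above reads statistics off several separate `summary()` printouts. A minimal sketch of a helper that gathers the same statistics from single-predictor fits into one table is given below. The name `fit_stats` and the synthetic `demo` data (random numbers with one strong and one weak predictor) are hypothetical; only `summary()`, `coef()` and the `r.squared` and `fstatistic` components of a summarised `lm` fit are standard R.

```r
# Collect comparable statistics from a single-predictor lm() fit
fit_stats <- function(fit) {
  s <- summary(fit)
  slope <- coef(s)[2, ]  # row 1 is the intercept, row 2 the predictor
  data.frame(
    predictor = rownames(coef(s))[2],
    slope     = unname(slope["Estimate"]),
    slope_se  = unname(slope["Std. Error"]),
    t_value   = unname(slope["t value"]),
    p_value   = unname(slope["Pr(>|t|)"]),
    r_squared = s$r.squared,
    f_stat    = unname(s$fstatistic["value"])
  )
}

# Synthetic data for illustration only (NOT the NB_pop data)
set.seed(42)
n <- 48
demo <- data.frame(x_strong = rnorm(n), x_weak = rnorm(n))
demo$y <- 3 * demo$x_strong + 0.1 * demo$x_weak + rnorm(n)

fits <- list(lm(y ~ x_strong, data = demo),
             lm(y ~ x_weak, data = demo))

# One row per model, sorted so the best-fitting model comes first
comparison <- do.call(rbind, lapply(fits, fit_stats))
comparison <- comparison[order(-comparison$r_squared), ]
print(comparison)
```

Applied to `list(lm.pop1, lm.pop2, lm.pop3, lm.pop4)`, or to the four `lm.pop10_*` fits, the table reports the slope's standard error, t-value and p-value rather than the intercept's, which keeps the comparison on the quantities that describe each variable's relationship with Total_growth.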