Original document on RPubs

Migratory bird populations (Part 1)

Question 1a. Complete the summary table

Regression tables for all three models

Simple Multiple AIC-based
(Intercept) -0.0359 2.61     3.03     
(-0.368 - 0.296) (-2.72 - 7.93)    (-1.47 - 7.53)    
mig_date -1.41 * -0.801    -0.889    
(-2.67 - -0.149) (-2.09 - 0.491)   (-2.02 - 0.24)    
log_bodsize       0.84 **  0.741 ***
      (0.279 - 1.4)     (0.419 - 1.06)    
farmland       -0.909 *  -0.897 *  
      (-1.75 - -0.0642)  (-1.67 - -0.128)   
bred_lat       -0.0547   -0.0612   
      (-0.121 - 0.0116)  (-0.126 - 0.0032)  
log_popsize       0.219            
      (-0.231 - 0.669)           
broods       -0.142            
      (-0.533 - 0.249)           
log_migdist       -0.172            
      (-0.806 - 0.462)           
dichrom       -0.153            
      (-0.698 - 0.392)           
R squared 0.0499 0.3      0.286    
Adj. R squared 0.0398 0.236    0.254    
F statistic 4.93   4.67     9.09     
P value 0.0288 8.97e-05 3.18e-06 
*** p < 0.001; ** p < 0.01; * p < 0.05.

Question 1b. Which regression model performed best? Why?

The AIC-based stepwise multiple regression model performed best, with an adjusted \(R^2\) of \(0.254\).

In this case, the multiple regression models perform better than the line regresson model with a single predictor (pop70_90 ~ mig_date), because predictor variables other than mig_date improve the ability of the model to predict the dependent variable pop70_90.

Between the two multiple regression models examined, the AIC-based model (pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date) successfully found predictor variables which, in combination together, provide good prediction of the dependent variable without paying a penalty (in adjusted \(R^2\) terms) for having increased the number of predictor variables.

Question 1d. Which are the significant predictor variables for pop70_90, and are they different from pop90_00?

The significant predictor variables for pop70_90, at the \(p<0.05\) significance level, are log_bodsize and farmland. mig_date is also a predictor variable (\(p=0.122\)) in the AIC-based model for pop70_90.

This is different to the significant predictor variables for pop90_00, at the \(p<0.05\) significance level, which are just mig_date. log_bodsize is also a predictor variable (\(p=0.064\)) in the AIC-based model for pop90_00.

Question 1e. (i) Is there much of a signal of climate change?

Knowing that global temperatures rose in the 1990-2000 period compared to the 1970-1990 period, and that climate change resulting from increasing global temperatures could potentially affect breeding success of birds in a way dependent on time of migration, then the increased importance of mig_date in predicting pop during the 1990-2000 period (in both the increased magnitude of the co-efficient, and range of the 95% confidence interval) compared to the 1970-1990 period is a potential signal of climate change affecting migratory bird populations.

Question 1e. (ii) What is a different predictor variable during this earlier period, and what do you infer about one cause of population increase or decline?

farmland, an indicator of species which prefer to breed in farmland habitats, was significant (at the \(p<0.05\) level, and part of the AIC-based model) in the 1970-1990 period, but not the 1990-2000 period.

It appears that the major factors affecting migratory bird populations in 1990-2000 were not associated with changes in at least localized human-influenced factors such as farming practices, or changes which just affected human-associated crops.

Question 1e. (iii) Does this strengthen the argument that climate change is having effects on animal populations?

This strengthens the argument that climate change, which is not localized geographically to human farming practices but could also affect ‘wilderness’ and other non-farming areas, is having an effect on animal populations.

Murder Rates by State (Part 2)

Question 2a. Complete the summary table

Regression table for all three models.

All regressors Intermediate AIC AIC-based
(Intercept) 122 ***           122 ***           120 ***          
(86.1 - 158)  (86.3 - 158)  (85.5 - 155) 
Population 0.000188 ** 0.000182 ** 0.000178 **
(5.74e-05 - 0.000319)   (6.03e-05 - 0.000304)   (5.85e-05 - 0.000297)  
Income -0.000159                         
(-0.00131 - 0.000996)                        
Illiteracy 1.37        1.4         1.17       
(-0.306 - 3.05)       (-0.252 - 3.05)       (-0.198 - 2.54)      
Life.Exp -1.65 ***    -1.66 ***    -1.61 ***   
(-2.17 - -1.14)       (-2.17 - -1.15)       (-2.08 - -1.14)      
HS.Grad 0.0323      0.0269                
(-0.0832 - 0.148)      (-0.0805 - 0.134)                
Frost -0.0129      -0.0129      -0.0137     
(-0.0278 - 0.00203)    (-0.0277 - 0.0018)     (-0.028 - 0.000538)  
Area 5.97e-06    5.71e-06    6.8e-06 *  
(-1.7e-06 - 1.36e-05)   (-1.65e-06 - 1.31e-05)   (9.22e-07 - 1.27e-05)  
R squared 0.808       0.808       0.807      
Adj. R squared 0.776       0.781       0.785      
F statistic 25.3         30.1         36.7        
P value 3.87e-13    6.94e-14    1.22e-14   
*** p < 0.001; ** p < 0.01; * p < 0.05.

Question 2b. (i) Consider the variables in the original model. Which ones weren’t included in the final model recommendation?

The predictor variables Income and HS.Grad were included in the original model (which included all possible predictor variables/regressors), but not in the final AIC-based model.

Question 2b. (ii) Why?

The predictor variables Income and HS.Grad were determined by the AIC step-wise algorithm to not be good predictors of the Murder dependent variable.

There might be no correlation between Income or HS.Grad and Murder, or the other variables in the prediction model, which might be correlated with Income or HS.Grad, are better predictors of the value of Murder so that Income/HS.Grad do not provide much extra useful information for prediction.

Question 2c. (i) Considering the significant predictors in the final model, which two might be addressed by public policy?

(1)
(Intercept) 120 ***          
(85.5 - 155) 
Population 0.000178 **
(5.85e-05 - 0.000297)  
Illiteracy 1.17       
(-0.198 - 2.54)      
Life.Exp -1.61 ***   
(-2.08 - -1.14)      
Frost -0.0137     
(-0.028 - 0.000538)  
Area 6.8e-06 *  
(9.22e-07 - 1.27e-05)  
R squared 0.807      
Adj. R squared 0.785      
F statistic 36.7        
P value 1.22e-14   
*** p < 0.001; ** p < 0.01; * p < 0.05.


Two predictors of Murder which might be addressed by public policy are Life.exp (negative correlation with Murder at the 5% significance level, \(p<0.001\)) and Illiteracy (positive correlation with Murder at the 10% significance level, \(p = 0.092\)).

On the other hand, high Murder rates may be a causative factor behind low Life.Exp.

Question 2c. (ii) Briefly, and just using the regression results, what should the objectives of public policy be to reduce murder rates?

Life.Exp is perhaps associated with healthcare and other social determinants of health. Improving healthcare, or other social determinants of health, could be an objective of public policy to reduce murder rates.

Lowering Illiteracy through improved education could be an objective of public policy to reduce murder rates.

Question 2d. Frost is included in the final model. Do you really think the # of days below freezing in the state’s capital impacts murder rate? If not, why is this variable included?

No, the number of frost days (as represented by Frost) is unlikely to impact murder rate.

However, the AIC-based method says the predictor variable Frost improves the Murder prediction model. It is quite possible that a combination of ‘true’ causative factors, which might not be in our list of predictors, are correlated with the Frost variable.

These ‘true’ causative factors might be related to the weather, such as time spent indoors, or be relatively unrelated to weather, such as historical differences in social welfare programs among different states which just happens to be correlated with the Frost variable.

Appendix - R code and detailed results

Libraries required for later processing.

knitr::opts_chunk$set(echo = TRUE)

library(dplyr)     # manipulation of data tables
library(magrittr)  # piping
library(MASS)      # provides Akaike Information Criterion (AIC) to provide step-wise selection of predictors for multiple regression
library(DT)        # pretty tables
library(huxtable)  # pretty regression tables

Question 1 - Bird migration

Many species respond to climate change by changing some aspects of their biology: their ranges expand northwards or contract to higher-elevation areas, or their timing of reproduction changes. Such changes have been especially well studied in avian species, and in the following exercise, we will examine data used in this paper. The data set was used to examine factors affecting changes in the mean migration date and population size of different bird European bird species throughout the period 1960-2006. The authors have included data on facets of the birds’ biology as diverse as changes in the area of farmland (the breeding habitat of most species) to their sexual coloration and brain mass, in an attempt to explain changes in these two important population variables.

But which predictor variables are significant in predicting these response variables?

Bird migration data

Trend.in.mean.migration.date=c(-0.032, -0.157, -0.06, -0.116, -0.291, -0.099, -0.461, 0.136, -0.346, -0.312, -0.13, -0.049, -0.05, -0.056, -0.154, 0.034, 0.116, -0.368, -0.286, 0.238, 0.006, -0.193, -0.44, -0.954, -0.36, -0.215, -0.816, 0.266, -0.474, -0.325, 0.026, -0.205, 0.063, -0.249, -0.302, -0.118, -0.14, -0.156, 0.009, 0.251, -0.033, -0.033, -0.174, 0.141, -0.01, -0.502, -0.333, -0.492, 0.129, -0.018, -0.084, -0.045, 0.028, -0.532, -0.13, -0.208, -0.075, -0.014, -0.056, 0.021, -0.069, -0.252, -0.313, 0.019, -0.377, -0.068, -0.18, 0.272, -0.048, -0.722, -0.098, -0.06, -0.153, -0.303, -0.156, 0.015, -0.03, 0.622, 0.085, -0.167, 0.016, -0.102, -0.119, -0.118, 0, 0.134, -0.194, -0.024, -0.228, -0.208, -0.314, -0.192, -0.05, -0.061, -0.415, -0.061)
Population.trend.1970.1990=c(3, 0, 0, 0, -3, -3, 0, 3, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, -2, 0, 0, 3, 0, 3, 0, 0, 3, 2, 0, 0, -3, 0, 0, 0, 0, 0, 0, -3, 3, 0, -2, -2, -2, 3, 3, 0, 3, 0, 0, 0, 0, 0, 2, 0, 0, 0, -2, 0, 0, 0, 3, 0, 0, -3, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 2, -2, 0, 0, -2, 0, 0, 0, 0, 0, 0, 0, 0)
Population.trend.1990.2000=c(1, 0, 0, 0, -1, -2, -1, 3, -1, -1, -1, -1, 0, -2, 1, 0, -2, 1, 0, 0, 0, -1, 0, 2, 1, 0, 0, -1, 3, 3, -2, -1, -1, -1, 1, -1, 0, 0, -2, -3, -2, -1, -1, -2, -1, 2, 3, 3, -2, 0, 0, 0, -2, 0, -1, -1, 0, -1, -1, -2, 0, 0, 3, -2, 1, 0, 0, -2, -1, 0, -2, -1, 0, 0, 0, -1, 1, 0, -2, 1, 0, 1, 0, 0, 0, -2, 0, -2, 1, 0, 1, 0, 0, 0, 0, -3)
No..of.broods=c(1, 1, 2, 1, 4, 1, 1, 1, 3, 2, 3, 1, 1, 1, 1, 1, 3, 3, 2, 1, 1, 3, 1, 4, 3, 1, 1, 1, 1, 1, 2, 3, 1, 2, 3, 1, 2, 2, 1, 1, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 3, 2, 2, 2, 2, 2, 1, 1, 3, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 4, 4, 2, 2, 3, 1)
Natal.dispersal=c(15.6, NA, 40.4, 47, 5.5, NA, 19.9, NA, 0.9, NA, NA, 36.8, NA, NA, NA, NA, 4.4, 11.1, 4.2, NA, NA, NA, NA, 10.4, 10.7, 9.9, 8.6, NA, NA, 34.3, 10.4, 8.4, NA, 5.4, 6, 20.6, 3.6, NA, NA, NA, NA, NA, 14.1, NA, NA, NA, 28.2, NA, 47, NA, NA, NA, NA, NA, NA, NA, 16.1, 12.5, 12.8, 18.9, 5.3, 5.3, NA, NA, NA, 12.2, NA, 20, 20.8, NA, NA, NA, 2.1, 4.6, NA, 0.5, NA, NA, 9.5, 41.2, NA, 14.4, 32.3, NA, NA, NA, NA, NA, 8.9, NA, 3.3, 7, NA, NA, 8.3, NA)
Farmland.breeding.habitat=c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1)
Habitat.specialization=c(NA, NA, NA, NA, 1.32, NA, NA, NA, 0.69, NA, 0.46, NA, NA, NA, NA, NA, 0.63, 0.56, 0.52, NA, NA, NA, NA, 0.68, 0.19, 0.32, 0.55, 0.27, NA, NA, 0.89, 0.54, NA, NA, 0.25, NA, 0.51, NA, NA, NA, NA, NA, 0.88, NA, 0.67, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.47, NA, NA, NA, 0.44, 0.36, NA, NA, 0.89, 0.78, 0.41, 2.09, 0.65, NA, NA, NA, 0.38, 0.71, 1.09, 0.79, NA, NA, 0.4, 0.32, 0.34, 0.54, 0.51, NA, NA, NA, NA, NA, 0.41, NA, 0.21, 0.4, NA, NA, 0.44, NA)
Thermal.maximum=c(NA, NA, NA, NA, 20.08, NA, NA, NA, 16.97, NA, 19.55, NA, NA, NA, NA, NA, 20.54, 20.59, 20.51, NA, NA, NA, NA, 19.24, 20.26, 20.46, 20.4, 20.46, NA, NA, 20.55, 19.35, NA, NA, 20.42, NA, 20.57, NA, NA, NA, NA, NA, 20.52, NA, 20.2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 20.24, NA, NA, NA, 20.57, 20.47, NA, NA, 19.93, 19.7, 19.87, 18.92, 17.7, NA, NA, NA, 19.05, 18.87, 18.86, 19.44, NA, NA, 19.96, 20.49, 19.39, 20.35, 19.71, NA, NA, NA, NA, NA, 20.4, NA, 20.52, 19.69, NA, NA, 20.09, NA)
Migration.distance=c(12.79, 66.84, 62.1, 44.6, 13.01, 26.72, 8.12, 12.28, 15.64, 21.27, 47.07, 59.39, 63.88, 17.08, 9.66, 43.12, 4.11, 1.16, 1.34, 24.22, 0.7, 2.56, 15.78, 3.45, 2.03, 5.71, 0.29, 49.38, 9.29, 1.08, 44.25, 4.72, 36.48, 10.52, 5, 43, 5.54, 13.75, 7.05, 10.57, 21.39, 71.34, 42.34, 35.2, 64.72, 22.14, 34.34, 7.28, 23.5, 59.92, 63.1, 25.5, 2.55, 16.1, 4.89, 11.05, 18.12, 40.98, 64.4, 38.17, 0, 0, 2.63, 52.2, 15.83, 33.93, 22.55, 52.74, 68.09, 20.49, 4.4, 7.87, 9.23, 0, 0, 34.84, 3.34, 52.75, 2.63, 19.63, 63.25, 53.05, 27.79, 9.35, 62.89, 44.39, 49.16, 35.28, 1.34, 10.77, 3.98, 14.65, 10.77, 14.13, 4.36, 12.08)
Wintering.in.Africa=c(0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
Winter.habitat=c(1, 3, 3, 3, 2, 4, 3, 3, 2, 3, 1, 5, 4, 3, 3, 3, 2, 2, 5, 1, 4, 4, 4, 1, 1, 2, 2, 1, 3, 3, 5, 2, 2, 3, 5, 1, 2, 1, 3, 4, 4, 2, 5, 5, 2, 4, 4, 4, 4, 4, 1, 3, 4, 4, 3, 4, 2, 2, 1, 2, 1, 1, 3, 3, 5, 1, 1, 1, 1, 2, 4, 4, 2, 1, 1, 2, 4, 4, 2, 1, 1, 2, 2, 4, 3, 3, 3, 3, 1, 1, 5, 1, 2, 2, 2, 2)
Northernmost.breeding.latitude=c(70, 63.93, 70.63, 64.64, 71.16, 70.63, 71.09, 71.11, 71.17, 75, 70.5, 70, 83.33, 70.16, 70.31, 71.25, 66, 63.57, 70.31, 67.67, 81.75, 83.33, 82.22, 65, 67.33, 71.17, 66.36, 70.47, 70, 62.41, 70.33, 70.44, 67.5, 71.18, 70, 70.67, 71.25, 71.25, 71.25, 73.28, 71.25, 70, 70.38, 69.5, 66.33, 71.25, 71.25, 79.31, 68, 70.31, 65, 71.17, 70.63, 74.53, 71.25, 73.2, 71.17, 70.67, 70.5, 71.17, 67.33, 70.67, 71.27, 71.25, 60.36, 70.5, 70.29, 68.33, 71.18, 83, 66, 67.67, 70.67, 70.31, 70.29, 70, 80.83, 71.25, 71.25, 70.1, 70.31, 69.33, 69.67, 66.17, 70.78, 71.25, 70.31, 71.09, 69.33, 71.17, 71.17, 70.33, 71.17, 71.17, 69, 70.16)
Sexual.dichromatism=c(1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1)
Brain.mass..g.=c(2.92, 0.52, NA, 0.58, 0.97, NA, 5.82, 11.91, 0.36, 0.58, 0.68, 0.69, NA, 4.36, NA, NA, 0.67, 0.59, 0.89, NA, NA, NA, NA, 2.27, 2.38, 8.14, 4.69, 2.24, NA, NA, 0.59, 0.82, NA, 0.68, 0.66, 0.45, 0.77, 0.78, 1.35, NA, NA, 0.54, 0.58, NA, 1, 6.43, NA, NA, 2.88, NA, NA, NA, NA, NA, NA, NA, 0.58, 0.47, 0.53, 0.72, 0.65, 0.85, NA, NA, 0.61, 0.54, 0.38, NA, 0.31, NA, NA, NA, 0.71, 0.89, 0.38, 0.67, 7.92, NA, 1.7, 0.67, 0.62, 0.56, 0.53, 4.74, NA, NA, NA, 1.42, 0.5, 1.45, 1.92, 1.59, 1.85, 1.87, 2.21, 2.16)
Body.mass..g.=c(204, 12, 11.9, 11.8, 36.4, 917, 1119, 3464.5, 19.2, 21.5, 23.4, 39.6, 107.5, 656.5, 840.3, 26.1, 18.9, 15.6, 27.6, 23.4, 394.5, 63.2, 722.5, 314.5, 494.5, 544.5, 249, 120.5, 11375, 10750, 19.5, 26.8, 20.6, 18.8, 16.4, 14.3, 24.2, 22.6, 106.5, 2804.5, 531, 13.3, 19.1, 37.3, 30.7, 895, 817.5, 1599.5, 280.5, 301, 25, 18.2, 1587.5, 1306.5, 1641.5, 1090.5, 20.8, 17.4, 15.5, 24, 11.8, 18.5, 2254, 140.5, 16, 15.9, 7.7, 9.1, 9.3, 37.4, 875, 829.8, 19, 31, 5.8, 16.6, 2066.5, 125, 80.5, 18.9, 19, 14.5, 12.4, 1152, 67.5, 47.8, 173.5, 112, 8.9, 62.8, 95.8, 70.5, 92.1, 117, 117.8, 218.5)
European.breeding.population..pairs.x.10.5.=c(3.95, 50, 59, 38.5, 600, 3.4, 42, 1.55, 115, 15.2, 345, 119.5, 0.58, 8.05, 5.4, 2.53, 190, 205, 230, 45.5, 2.15, 1.7, 7.2, 6.25, 130, 120, 101, 64, 0.19, 1.03, 169.5, 245, 106, 68, 630, 160, 1850, 175, 14.15, 0.72, 3.75, 53, 260, 9.4, 96.5, 15.25, 3.25, 1.45, 18.5, 0.04, 53, 61.5, 0.93, 1.15, 0.61, 0.97, 195, 109.5, 180, 88, 320, 685, 3.4, 3.55, 64, 114, 454.45, 180, 780, 11.9, 3.75, 0.44, 190, 106.5, 270, 77, 10.2, 4.2, 395, 370, 240, 195, 63, 0.54, 7.75, 11.6, 1.18, 4.45, 315, 185, 610, 280, 190, 4.9, 52, 22.5)
birds=cbind(Trend.in.mean.migration.date, Population.trend.1970.1990, Population.trend.1990.2000, No..of.broods, Natal.dispersal, Farmland.breeding.habitat, Habitat.specialization, Thermal.maximum, Migration.distance, Wintering.in.Africa, Winter.habitat, Northernmost.breeding.latitude, Sexual.dichromatism, Brain.mass..g., Body.mass..g., European.breeding.population..pairs.x.10.5.)
birds=as.data.frame(birds)

Rename and transform some columns

birds <- birds %>% mutate(log_migdist = log10(Migration.distance+1),
                          log_popsize = log10(European.breeding.population..pairs.x.10.5.),
                          log_bodsize = log10(Body.mass..g.),
                          farmland = Farmland.breeding.habitat,
                          dichrom = Sexual.dichromatism,
                          broods = No..of.broods,
                          mig_date = Trend.in.mean.migration.date,
                          pop70_90 = Population.trend.1970.1990,
                          pop90_00 = Population.trend.1990.2000,
                          habitat = Habitat.specialization,
                          winter = Winter.habitat,
                          bred_lat = Northernmost.breeding.latitude)

Simple model

model1a <- lm(pop70_90 ~ mig_date, data = birds)
summary(model1a)
## 
## Call:
## lm(formula = pop70_90 ~ mig_date, data = birds)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.3736 -0.4039 -0.1175  0.0774  3.2273 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  -0.0359     0.1673  -0.215   0.8305  
## mig_date     -1.4072     0.6336  -2.221   0.0288 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.403 on 94 degrees of freedom
## Multiple R-squared:  0.04986,    Adjusted R-squared:  0.03975 
## F-statistic: 4.933 on 1 and 94 DF,  p-value: 0.02875
# the regression formula
model1a_formula <- formula(model1a)
model1a_formula_string <- paste(model1a_formula[2], model1a_formula[1], model1a_formula[3])

model1a_coeff <- model1a$coefficients
model1a_coeff_sign <- sign(model1a_coeff)
model1a_coeff_prefix <- case_when(model1a_coeff_sign == -1 ~ " - ",
                                  model1a_coeff_sign == 1 ~ " + ",
                                  model1a_coeff_sign == 0 ~ " + ")
# create equation string, including co-efficients
model1a_eqn <- paste("pop70_90 =", paste(if_else(model1a_coeff[1]<0, "- ", ""),
                                         abs(round(model1a_coeff[1],3)),
                                         paste(model1a_coeff_prefix[-1],
                                               abs(round(model1a_coeff[-1],3)),
                                               " * ",
                                               names(model1a_coeff[-1]),
                                               sep = "",
                                               collapse = ""),
                                         sep=""))
# based on keith p jolley's code at
# https://stats.stackexchange.com/questions/63600/how-to-translate-the-results-from-lm-to-an-equation/

birdmigration_summary <- data.frame(Model = model1a_formula_string,
                                    Equation = model1a_eqn,
                                    AdjustedR2 = round(summary(model1a)$adj.r.squared, 3),
                                    p.value = signif(pf(summary(model1a)$fstatistic[1],
                                                        summary(model1a)$fstatistic[2],
                                                        summary(model1a)$fstatistic[3],
                                                        lower.tail = F), 3))

Multiple regression model

model1b <- lm(pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date +
                log_popsize + broods + log_migdist + dichrom, data = birds)
summary(model1b)
## 
## Call:
## lm(formula = pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date + 
##     log_popsize + broods + log_migdist + dichrom, data = birds)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9308 -0.4317  0.1438  0.4943  2.6227 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  2.60526    2.67757   0.973  0.33325   
## log_bodsize  0.83954    0.28217   2.975  0.00379 **
## farmland    -0.90948    0.42528  -2.139  0.03527 * 
## bred_lat    -0.05475    0.03336  -1.641  0.10440   
## mig_date    -0.80107    0.65010  -1.232  0.22119   
## log_popsize  0.21900    0.22663   0.966  0.33656   
## broods      -0.14191    0.19671  -0.721  0.47259   
## log_migdist -0.17191    0.31879  -0.539  0.59109   
## dichrom     -0.15298    0.27439  -0.558  0.57858   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.251 on 87 degrees of freedom
## Multiple R-squared:  0.3003, Adjusted R-squared:  0.2359 
## F-statistic: 4.666 on 8 and 87 DF,  p-value: 8.966e-05
model1b_formula <- formula(model1b)
model1b_formula_string <- paste(model1b_formula[2], model1b_formula[1], model1b_formula[3])

model1b_coeff <- model1b$coefficients
model1b_coeff_sign <- sign(model1b_coeff)
model1b_coeff_prefix <- case_when(model1b_coeff_sign == -1 ~ " - ",
                                  model1b_coeff_sign == 1 ~ " + ",
                                  model1b_coeff_sign == 0 ~ " + ")
model1b_eqn <- paste("pop70_90 =", paste(if_else(model1b_coeff[1]<0, "- ", ""),
                                         abs(round(model1b_coeff[1],3)),
                                         paste(model1b_coeff_prefix[-1],
                                               abs(round(model1b_coeff[-1],3)),
                                               " * ",
                                               names(model1b_coeff[-1]),
                                               sep = "",
                                               collapse = ""),
                                         sep=""))

birdmigration_summary <- birdmigration_summary %>%
  add_row(Model = model1b_formula_string,
          Equation = model1b_eqn,
          AdjustedR2 = round(summary(model1b)$adj.r.squared, 3),
          p.value = signif(pf(summary(model1b)$fstatistic[1],
                              summary(model1b)$fstatistic[2],
                              summary(model1b)$fstatistic[3],
                              lower.tail = F), 3))

AIC-based model

step1c <-stepAIC(model1b) 
## Start:  AIC=51.58
## pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date + log_popsize + 
##     broods + log_migdist + dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - log_migdist  1    0.4553 136.66 49.905
## - dichrom      1    0.4867 136.69 49.927
## - broods       1    0.8148 137.02 50.157
## - log_popsize  1    1.4620 137.67 50.610
## - mig_date     1    2.3772 138.59 51.246
## <none>                     136.21 51.585
## - bred_lat     1    4.2162 140.43 52.511
## - farmland     1    7.1602 143.37 54.503
## - log_bodsize  1   13.8596 150.07 58.887
## 
## Step:  AIC=49.91
## pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date + log_popsize + 
##     broods + dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - dichrom      1    0.2870 136.95 48.106
## - broods       1    0.5497 137.21 48.290
## - log_popsize  1    2.1066 138.77 49.374
## <none>                     136.66 49.905
## - mig_date     1    2.9152 139.58 49.931
## - bred_lat     1    4.2283 140.89 50.830
## - farmland     1    8.1749 144.84 53.482
## - log_bodsize  1   20.0954 156.76 61.075
## 
## Step:  AIC=48.11
## pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date + log_popsize + 
##     broods
## 
##               Df Sum of Sq    RSS    AIC
## - broods       1    0.4807 137.43 46.443
## - log_popsize  1    2.0059 138.96 47.502
## - mig_date     1    2.7998 139.75 48.049
## <none>                     136.95 48.106
## - bred_lat     1    4.3739 141.32 49.125
## - farmland     1    9.1922 146.14 52.343
## - log_bodsize  1   20.0839 157.03 59.244
## 
## Step:  AIC=46.44
## pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date + log_popsize
## 
##               Df Sum of Sq    RSS    AIC
## - log_popsize  1    1.6490 139.08 45.588
## - mig_date     1    2.3330 139.76 46.059
## <none>                     137.43 46.443
## - bred_lat     1    4.3159 141.75 47.411
## - farmland     1    9.6467 147.08 50.955
## - log_bodsize  1   22.0094 159.44 58.703
## 
## Step:  AIC=45.59
## pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date
## 
##               Df Sum of Sq    RSS    AIC
## <none>                     139.08 45.588
## - mig_date     1     3.740 142.82 46.135
## - bred_lat     1     5.445 144.53 47.274
## - farmland     1     8.207 147.29 49.092
## - log_bodsize  1    31.972 171.05 63.452
model1c <- lm(pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date, data = birds)
summary(model1c)
## 
## Call:
## lm(formula = pop70_90 ~ log_bodsize + farmland + bred_lat + mig_date, 
##     data = birds)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9935 -0.5237  0.1698  0.5605  2.5108 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.03228    2.26677   1.338   0.1843    
## log_bodsize  0.74052    0.16191   4.574 1.51e-05 ***
## farmland    -0.89663    0.38694  -2.317   0.0227 *  
## bred_lat    -0.06117    0.03241  -1.887   0.0623 .  
## mig_date    -0.88863    0.56805  -1.564   0.1212    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.236 on 91 degrees of freedom
## Multiple R-squared:  0.2855, Adjusted R-squared:  0.2541 
## F-statistic: 9.091 on 4 and 91 DF,  p-value: 3.183e-06
model1c_formula <- formula(model1c)
model1c_formula_string <- paste(model1c_formula[2], model1c_formula[1], model1c_formula[3])

model1c_coeff <- model1c$coefficients
model1c_coeff_sign <- sign(model1c_coeff)
model1c_coeff_prefix <- case_when(model1c_coeff_sign == -1 ~ " - ",
                                  model1c_coeff_sign == 1 ~ " + ",
                                  model1c_coeff_sign == 0 ~ " + ")
model1c_eqn <- paste("pop70_90 =", paste(if_else(model1c_coeff[1]<0, "- ", ""),
                                         abs(round(model1c_coeff[1],3)),
                                         paste(model1c_coeff_prefix[-1],
                                               abs(round(model1c_coeff[-1],3)),
                                               " * ",
                                               names(model1c_coeff[-1]),
                                               sep = "",
                                               collapse = ""),
                                         sep=""))

birdmigration_summary <- birdmigration_summary %>%
  add_row(Model = model1c_formula_string,
          Equation = model1c_eqn,
          AdjustedR2 = round(summary(model1c)$adj.r.squared, 3),
          p.value = signif(pf(summary(model1c)$fstatistic[1],
                              summary(model1c)$fstatistic[2],
                              summary(model1c)$fstatistic[3],
                              lower.tail = F), 3))

Summary table

datatable(birdmigration_summary,
          colnames = c("Adjusted R2" = "AdjustedR2",
                       "p-value" = "p.value"),
          options = list(info = FALSE, dom = "", buttons = c("")))

Regression table for all three models

huxreg("Simple" = model1a, "Multiple" = model1b, "AIC-based" = model1c,
       ci_level = .95, error_format = '({conf.low} - {conf.high})',
       number_format = '%.3g',
       statistics = c('R squared' = 'r.squared',
                      'Adj. R squared' = 'adj.r.squared',
                      'F statistic' = 'statistic',
                      'P value' = 'p.value'))
Simple Multiple AIC-based
(Intercept) -0.0359 2.61     3.03     
(-0.368 - 0.296) (-2.72 - 7.93)    (-1.47 - 7.53)    
mig_date -1.41 * -0.801    -0.889    
(-2.67 - -0.149) (-2.09 - 0.491)   (-2.02 - 0.24)    
log_bodsize       0.84 **  0.741 ***
      (0.279 - 1.4)     (0.419 - 1.06)    
farmland       -0.909 *  -0.897 *  
      (-1.75 - -0.0642)  (-1.67 - -0.128)   
bred_lat       -0.0547   -0.0612   
      (-0.121 - 0.0116)  (-0.126 - 0.0032)  
log_popsize       0.219            
      (-0.231 - 0.669)           
broods       -0.142            
      (-0.533 - 0.249)           
log_migdist       -0.172            
      (-0.806 - 0.462)           
dichrom       -0.153            
      (-0.698 - 0.392)           
R squared 0.0499 0.3      0.286    
Adj. R squared 0.0398 0.236    0.254    
F statistic 4.93   4.67     9.09     
P value 0.0288 8.97e-05 3.18e-06 
*** p < 0.001; ** p < 0.01; * p < 0.05.

AIC-based model for 1990-2000 data

model2b <- lm(pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date +
                log_popsize + broods + log_migdist + dichrom, data = birds)
step2c <-stepAIC(model2b)
## Start:  AIC=41.74
## pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date + log_popsize + 
##     broods + log_migdist + dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - log_popsize  1    0.1698 123.10 39.870
## - log_migdist  1    0.1973 123.13 39.891
## - broods       1    0.2375 123.17 39.923
## - dichrom      1    0.7163 123.65 40.295
## - bred_lat     1    0.8672 123.80 40.412
## - farmland     1    1.2717 124.20 40.726
## - log_bodsize  1    1.6735 124.60 41.036
## <none>                     122.93 41.737
## - mig_date     1   24.1471 147.08 56.954
## 
## Step:  AIC=39.87
## pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date + broods + 
##     log_migdist + dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - broods       1    0.1835 123.28 38.013
## - log_migdist  1    0.3367 123.44 38.132
## - dichrom      1    0.7213 123.82 38.431
## - bred_lat     1    1.0175 124.12 38.660
## - farmland     1    1.1170 124.22 38.737
## - log_bodsize  1    1.9737 125.07 39.397
## <none>                     123.10 39.870
## - mig_date     1   24.6615 147.76 55.400
## 
## Step:  AIC=38.01
## pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date + log_migdist + 
##     dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - log_migdist  1    0.2124 123.50 36.178
## - dichrom      1    0.6217 123.91 36.496
## - bred_lat     1    0.9859 124.27 36.778
## - farmland     1    1.3674 124.65 37.072
## <none>                     123.28 38.013
## - log_bodsize  1    3.9037 127.19 39.006
## - mig_date     1   25.6377 148.92 54.150
## 
## Step:  AIC=36.18
## pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date + dichrom
## 
##               Df Sum of Sq    RSS    AIC
## - dichrom      1     0.489 123.98 34.557
## - bred_lat     1     1.054 124.55 34.994
## - farmland     1     1.444 124.94 35.294
## <none>                     123.50 36.178
## - log_bodsize  1     4.531 128.03 37.637
## - mig_date     1    31.671 155.17 56.094
## 
## Step:  AIC=34.56
## pop90_00 ~ log_bodsize + farmland + bred_lat + mig_date
## 
##               Df Sum of Sq    RSS    AIC
## - bred_lat     1    1.1338 125.12 33.431
## - farmland     1    1.9188 125.90 34.032
## <none>                     123.98 34.557
## - log_bodsize  1    4.6115 128.59 36.063
## - mig_date     1   31.3309 155.31 54.186
## 
## Step:  AIC=33.43
## pop90_00 ~ log_bodsize + farmland + mig_date
## 
##               Df Sum of Sq    RSS    AIC
## - farmland     1    1.7238 126.84 32.745
## <none>                     125.12 33.431
## - log_bodsize  1    4.0028 129.12 34.454
## - mig_date     1   31.2300 156.35 52.823
## 
## Step:  AIC=32.74
## pop90_00 ~ log_bodsize + mig_date
## 
##               Df Sum of Sq    RSS    AIC
## <none>                     126.84 32.745
## - log_bodsize  1     4.811 131.65 34.318
## - mig_date     1    32.060 158.90 52.378
model2c <- lm(pop90_00 ~ log_bodsize + mig_date, data = birds)
summary(model2c)
## 
## Call:
## lm(formula = pop90_00 ~ log_bodsize + mig_date, data = birds)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6350 -0.9066  0.0234  0.6752  3.5397 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -1.1807     0.3102  -3.807 0.000252 ***
## log_bodsize   0.2809     0.1496   1.878 0.063503 .  
## mig_date     -2.5983     0.5359  -4.848 4.97e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.168 on 93 degrees of freedom
## Multiple R-squared:  0.2513, Adjusted R-squared:  0.2352 
## F-statistic:  15.6 on 2 and 93 DF,  p-value: 1.434e-06
model2c_coeff <- model2c$coefficients
model2c_coeff_sign <- sign(model2c_coeff)
model2c_coeff_prefix <- case_when(model2c_coeff_sign == -1 ~ " - ",
                                  model2c_coeff_sign == 1 ~ " + ",
                                  model2c_coeff_sign == 0 ~ " + ")
model2c_eqn <- paste("pop90_00 =", paste(if_else(model2c_coeff[1]<0, "- ", ""),
                                         abs(round(model2c_coeff[1],3)),
                                         paste(model2c_coeff_prefix[-1],
                                               abs(round(model2c_coeff[-1],3)),
                                               " * ",
                                               names(model2c_coeff[-1]),
                                               sep = "",
                                               collapse = ""),
                                         sep=""))

Question 2 - Murder data state-by-state, United States

What factors might predict differences in murder rates experienced by different states in the US? Perhaps determining these can advise public policy.

Load state.x77 dataset

library(car)
state.x77=as.data.frame(state.x77)
names(state.x77)=c("Population","Income","Illiteracy","Life.Exp","Murder","HS.Grad","Frost","Area")

Description of state.x77 (from ‘?state.x77’)

The ‘state’ dataset provides data sets related to the 50 states of the United States of America.

‘state.x77’ is a matrix with 50 rows and 8 columns giving the following statistics in the respective columns.

  • Population population estimate as of July 1, 1975
  • Income per capita income (1974)
  • Illiteracy illiteracy (1970, percent of population)
  • Life Exp life expectancy in years (1969–71)
  • Murder murder and non-negligent manslaughter rate per 100,000 population (1976)
  • HS Grad percent high-school graduates (1970)
  • Frost mean number of days with minimum temperature below freezing (1931–1960) in capital or large city
  • Area land area in square miles

Source

U.S. Department of Commerce, Bureau of the Census (1977) Statistical Abstract of the United States.

Multiple regression model, all possible regressors

model3a <- lm(Murder ~ .,
              data = state.x77)
model3c <- stepAIC(model3a)
## Start:  AIC=63.01
## Murder ~ Population + Income + Illiteracy + Life.Exp + HS.Grad + 
##     Frost + Area
## 
##              Df Sum of Sq    RSS    AIC
## - Income      1     0.236 128.27 61.105
## - HS.Grad     1     0.973 129.01 61.392
## <none>                    128.03 63.013
## - Area        1     7.514 135.55 63.865
## - Illiteracy  1     8.299 136.33 64.154
## - Frost       1     9.260 137.29 64.505
## - Population  1    25.719 153.75 70.166
## - Life.Exp    1   127.175 255.21 95.503
## 
## Step:  AIC=61.11
## Murder ~ Population + Illiteracy + Life.Exp + HS.Grad + Frost + 
##     Area
## 
##              Df Sum of Sq    RSS    AIC
## - HS.Grad     1     0.763 129.03 59.402
## <none>                    128.27 61.105
## - Area        1     7.310 135.58 61.877
## - Illiteracy  1     8.715 136.98 62.392
## - Frost       1     9.345 137.61 62.621
## - Population  1    27.142 155.41 68.702
## - Life.Exp    1   127.500 255.77 93.613
## 
## Step:  AIC=59.4
## Murder ~ Population + Illiteracy + Life.Exp + Frost + Area
## 
##              Df Sum of Sq    RSS    AIC
## <none>                    129.03 59.402
## - Illiteracy  1     8.723 137.75 60.672
## - Frost       1    11.030 140.06 61.503
## - Area        1    15.937 144.97 63.225
## - Population  1    26.415 155.45 66.714
## - Life.Exp    1   140.391 269.42 94.213
model3a_coeff <- model3a$coefficients
model3a_coeff_sign <- sign(model3a_coeff)
model3a_coeff_prefix <- case_when(model3a_coeff_sign == -1 ~ " - ",
                                  model3a_coeff_sign == 1 ~ " + ",
                                  model3a_coeff_sign == 0 ~ " + ")
model3a_eqn <- paste("Murder =", paste(if_else(model3a_coeff[1]<0, "- ", ""),
                                       abs(signif(model3a_coeff[1],3)),
                                       paste(model3a_coeff_prefix[-1],
                                             abs(signif(model3a_coeff[-1],3)),
                                             " * ",
                                             names(model3a_coeff[-1]),
                                             sep = "",
                                             collapse = ""),
                                       sep=""))

model3a_formula <- formula(model3a)
model3a_formula_string <- paste(model3a_formula[2], model3a_formula[1], model3a_formula[3])

murder_summary <- data.frame(Model = model3a_formula_string,
                             Equation = model3a_eqn,
                             AdjustedR2 = round(summary(model3a)$adj.r.squared, 3),
                             p.value = signif(pf(summary(model3a)$fstatistic[1],
                                                 summary(model3a)$fstatistic[2],
                                                 summary(model3a)$fstatistic[3],
                                                 lower.tail = F), 3))

Intermediate AIC-based model

model3b <- lm(Murder ~ Population + Illiteracy + Life.Exp + HS.Grad + Frost + Area,
              data = state.x77)

model3b_coeff <- model3b$coefficients
model3b_coeff_sign <- sign(model3b_coeff)
model3b_coeff_prefix <- case_when(model3b_coeff_sign == -1 ~ " - ",
                                  model3b_coeff_sign == 1 ~ " + ",
                                  model3b_coeff_sign == 0 ~ " + ")
model3b_eqn <- paste("Murder =", paste(if_else(model3b_coeff[1]<0, "- ", ""),
                                       abs(signif(model3b_coeff[1],3)),
                                       paste(model3b_coeff_prefix[-1],
                                             abs(signif(model3b_coeff[-1],3)),
                                             " * ",
                                             names(model3b_coeff[-1]),
                                             sep = "",
                                             collapse = ""),
                                       sep=""))

model3b_formula <- formula(model3b)
model3b_formula_string <- paste(model3b_formula[2], model3b_formula[1], model3b_formula[3])

murder_summary <- murder_summary %>%
  add_row(Model = model3b_formula_string,
          Equation = model3b_eqn,
          AdjustedR2 = round(summary(model3b)$adj.r.squared, 3),
          p.value = signif(pf(summary(model3b)$fstatistic[1],
                              summary(model3b)$fstatistic[2],
                              summary(model3b)$fstatistic[3],
                              lower.tail = F), 3))

Final AIC-based model

model3c_coeff <- model3c$coefficients
model3c_coeff_sign <- sign(model3c_coeff)
model3c_coeff_prefix <- case_when(model3c_coeff_sign == -1 ~ " - ",
                                  model3c_coeff_sign == 1 ~ " + ",
                                  model3c_coeff_sign == 0 ~ " + ")
model3c_eqn <- paste("Murder =", paste(if_else(model3c_coeff[1]<0, "- ", ""),
                                       abs(signif(model3c_coeff[1],3)),
                                       paste(model3c_coeff_prefix[-1],
                                             abs(signif(model3c_coeff[-1],3)),
                                             " * ",
                                             names(model3c_coeff[-1]),
                                             sep = "",
                                             collapse = ""),
                                       sep=""))

model3c_formula <- formula(model3c)
model3c_formula_string <- paste(model3c_formula[2], model3c_formula[1], model3c_formula[3])

murder_summary <- murder_summary %>%
  add_row(Model = model3c_formula_string,
          Equation = model3c_eqn,
          AdjustedR2 = round(summary(model3c)$adj.r.squared, 3),
          p.value = signif(pf(summary(model3c)$fstatistic[1],
                              summary(model3c)$fstatistic[2],
                              summary(model3c)$fstatistic[3],
                              lower.tail = F), 3))

Summary table

datatable(murder_summary,
          colnames = c("Adjusted R2" = "AdjustedR2",
                       "p-value" = "p.value"),
          options = list(info = FALSE, dom = "", buttons = c("")))

Regression table

huxreg("All regressors" = model3a, "Intermediate AIC" = model3b, "AIC-based" = model3c,
       ci_level = .95, error_format = '({conf.low} - {conf.high})',
       number_format = '%.3g',
       statistics = c('R squared' = 'r.squared',
                      'Adj. R squared' = 'adj.r.squared',
                      'F statistic' = 'statistic',
                      'P value' = 'p.value'))
All regressors Intermediate AIC AIC-based
(Intercept) 122 ***           122 ***           120 ***          
(86.1 - 158)  (86.3 - 158)  (85.5 - 155) 
Population 0.000188 ** 0.000182 ** 0.000178 **
(5.74e-05 - 0.000319)   (6.03e-05 - 0.000304)   (5.85e-05 - 0.000297)  
Income -0.000159                         
(-0.00131 - 0.000996)                        
Illiteracy 1.37        1.4         1.17       
(-0.306 - 3.05)       (-0.252 - 3.05)       (-0.198 - 2.54)      
Life.Exp -1.65 ***    -1.66 ***    -1.61 ***   
(-2.17 - -1.14)       (-2.17 - -1.15)       (-2.08 - -1.14)      
HS.Grad 0.0323      0.0269                
(-0.0832 - 0.148)      (-0.0805 - 0.134)                
Frost -0.0129      -0.0129      -0.0137     
(-0.0278 - 0.00203)    (-0.0277 - 0.0018)     (-0.028 - 0.000538)  
Area 5.97e-06    5.71e-06    6.8e-06 *  
(-1.7e-06 - 1.36e-05)   (-1.65e-06 - 1.31e-05)   (9.22e-07 - 1.27e-05)  
R squared 0.808       0.808       0.807      
Adj. R squared 0.776       0.781       0.785      
F statistic 25.3         30.1         36.7        
P value 3.87e-13    6.94e-14    1.22e-14   
*** p < 0.001; ** p < 0.01; * p < 0.05.

Final AIC-based model (stargazer output)

huxreg(model3c,
       ci_level = .95, error_format = '({conf.low} - {conf.high})',
       number_format = '%.3g',
       statistics = c('R squared' = 'r.squared',
                      'Adj. R squared' = 'adj.r.squared',
                      'F statistic' = 'statistic',
                      'P value' = 'p.value'))
(1)
(Intercept) 120 ***          
(85.5 - 155) 
Population 0.000178 **
(5.85e-05 - 0.000297)  
Illiteracy 1.17       
(-0.198 - 2.54)      
Life.Exp -1.61 ***   
(-2.08 - -1.14)      
Frost -0.0137     
(-0.028 - 0.000538)  
Area 6.8e-06 *  
(9.22e-07 - 1.27e-05)  
R squared 0.807      
Adj. R squared 0.785      
F statistic 36.7        
P value 1.22e-14   
*** p < 0.001; ** p < 0.01; * p < 0.05.