Suppose that your first job out of college is with the Lakes Region Planning Commission in New Hampshire. The director asks you to analyze recent demographic and economic trends in the region.

The data set is posted on Moodle. On the course homepage, scroll to the bottom to find “termProject: census.csv” and click the link to open it. A description of the data is posted in the Announcement at the top of the course homepage.

Now that you have the data, answer the following questions:

Question 1

How many towns are in the NH Lakes Region Planning Commission region?

Provide your answer here

Question 2

What time period does the data represent?

Provide your answer here

Question 3

What are recent economic and demographic trends? There is no right or wrong answer to this question. In a sentence or two, explain what approach you would take to identify demographic and economic trends. In addition, list at least two challenges you might face in your approach.

Provide your answer here

My comments

See below for three simple functions (head(), str(), summary()) that let you examine some basic features of the data you are about to analyze.

From head() you can learn:

From str() you can learn:

From summary() you can learn:

In addition, I provided a chart at the bottom to demonstrate how a well-designed chart can get your points across to your audience. A picture is worth a thousand words! Note that you can easily replace medianAge with another variable in the data to perform the same analysis.

Import data

# Import data
data <- read.csv("/resources/rstudio/Bus Statistics/data/census2.csv")

# first six rows of data
# Note that each row represents a town and each column represents a characteristic of that town
head(data)
##   X       Town     County popTotal medianAge popNative_bornUSA popNative
## 1 1 Alexandria    Grafton     1836      39.2              1783      1790
## 2 2      Alton    Belknap     5214      44.6              5023      5040
## 3 3    Andover  Merrimack     2422      47.3              2339      2357
## 4 4    Ashland    Grafton     1507      39.6              1445      1445
## 5 5  Barnstead    Belknap     4564      38.4              4472      4492
## 6 6    Belmont    Belknap     7350      41.1              7141      7170
##   popNaitve_bornNH popMoved_otherState popMoved_abroad popCommute_car
## 1             1047                  27               0            749
## 2             2160                  26              34           2312
## 3             1359                  63              39           1146
## 4              782                  20              29            611
## 5             2247                  42               0           2304
## 6             4334                  26               0           3359
##   popCommute_publicT popCommute_bicycle popCommute_foot popCommute_other
## 1                  0                  0              10               10
## 2                  0                  0              80                0
## 3                  0                  0              86               95
## 4                  0                  0              22                0
## 5                  0                  0              24               25
## 6                  0                  0              88               24
##   popCommute_home popBA popPov medianIncome incomeLabor
## 1              31   217    147        56667    30747600
## 2             352  1119    288        60045   126791100
## 3              79   473    180        67900    57250300
## 4               6   218    187        38821    23228600
## 5              89   890    134        65221   100084000
## 6             123   994    502        58561   155553900
##   incomeLabor_WageSalary incomeLabor_SelfEmpl incomeInvest incomeTotal
## 1               27640100              3107600      1733400    40645300
## 2              112401800             14389400      5898900   155566200
## 3               50351700              6898600      1834700    71473100
## 4               19907000              3321700       611700    32032600
## 5               90297300              9786700      3353800   119881800
## 6              141222300             14331700      4644100   187212600
##     LF LF_Civilian LF_Civilian_Unemployed LF_Not housingTotal
## 1  919         919                     85    464          945
## 2 3007        3007                    163   1237         4219
## 3 1502        1494                     49    531         1124
## 4  691         691                     52    544         1261
## 5 2692        2692                    108    931         2344
## 6 3782        3782                    165   2027         3640
##   housingVacant_rent housingVacant_seasonal medianHomeValue
## 1                  0                    240          206500
## 2                 31                   2016          263000
## 3                 18                    122          228500
## 4                 65                    395          167900
## 5                 31                    557          205500
## 6                 30                    317          184900
##   medianGrossRent unemplRate LFparticipationRate
## 1             918   9.249184            66.44975
## 2             822   5.420685            70.85297
## 3             937   3.279786            73.88096
## 4             552   7.525326            55.95142
## 5            1133   4.011887            74.30306
## 6             922   4.362771            65.10587
##   housingVacant_seasonal_percent housingVacant_rent_percent incomeNonLabor
## 1                      25.396825                  0.0000000        9897700
## 2                      47.783835                  0.7347713       28775100
## 3                      10.854093                  1.6014235       14222800
## 4                      31.324346                  5.1546392        8804000
## 5                      23.762799                  1.3225256       19797800
## 6                       8.708791                  0.8241758       31658700
##   incomeTransferPayment incomeNonLabor_percent incomeInvest_percent
## 1               8164300               24.35140             4.264700
## 2              22876200               18.49701             3.791891
## 3              12388100               19.89951             2.566980
## 4               8192300               27.48450             1.909617
## 5              16444000               16.51443             2.797589
## 6              27014600               16.91056             2.480656
##   incomeTransferPayment_percent incomeLabor_percent popPov_percent
## 1                      20.08670            75.64860       8.006536
## 2                      14.70512            81.50299       5.523590
## 3                      17.33253            80.10049       7.431874
## 4                      25.57488            72.51550      12.408759
## 5                      13.71684            83.48557       2.936021
## 6                      14.42990            83.08944       6.829932
##   popBA_percent popMoved_otherState_percent popMoved_abroad_percent
## 1      11.81917                   1.4705882               0.0000000
## 2      21.46145                   0.4986575               0.6520905
## 3      19.52931                   2.6011561               1.6102395
## 4      14.46583                   1.3271400               1.9243530
## 5      19.50044                   0.9202454               0.0000000
## 6      13.52381                   0.3537415               0.0000000
##   popMoved_percent popCommute_car_percent popCommute_publicT_percent
## 1        1.4705882               40.79521                          0
## 2        1.1507480               44.34216                          0
## 3        4.2113955               47.31627                          0
## 4        3.2514930               40.54413                          0
## 5        0.9202454               50.48203                          0
## 6        0.3537415               45.70068                          0
##   popCommute_bicycle_percent popCommute_foot_percent
## 1                          0               0.5446623
## 2                          0               1.5343306
## 3                          0               3.5507845
## 4                          0               1.4598540
## 5                          0               0.5258545
## 6                          0               1.1972789
##   popCommute_other_percent popCommute_home_percent Year benchM
## 1                0.5446623                1.688453 2011     NA
## 2                0.0000000                6.751055 2011     NA
## 3                3.9223782                3.261767 2011     NA
## 4                0.0000000                0.398142 2011     NA
## 5                0.5477651                1.950044 2011     NA
## 6                0.3265306                1.673469 2011     NA

# structure of data
# Note that there are 64 observations (rows).
# There are 30 towns in the region plus New Hampshire and the U.S. as benchmarks. That makes 32, not 64?
# Well, there are two years: 2011 and 2016. So each place shows up twice in the data.
str(data)
## 'data.frame':    64 obs. of  56 variables:
##  $ X                             : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Town                          : Factor w/ 32 levels "Alexandria","Alton",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ County                        : Factor w/ 6 levels " Belknap"," Carroll",..: 3 1 4 3 1 1 3 3 1 4 ...
##  $ popTotal                      : int  1836 5214 2422 1507 4564 7350 1179 3063 960 1120 ...
##  $ medianAge                     : num  39.2 44.6 47.3 39.6 38.4 41.1 47.7 47.1 49.2 40.5 ...
##  $ popNative_bornUSA             : int  1783 5023 2339 1445 4472 7141 1143 2961 920 1106 ...
##  $ popNative                     : int  1790 5040 2357 1445 4492 7170 1153 2992 924 1120 ...
##  $ popNaitve_bornNH              : int  1047 2160 1359 782 2247 4334 545 1538 453 548 ...
##  $ popMoved_otherState           : int  27 26 63 20 42 26 44 45 5 16 ...
##  $ popMoved_abroad               : int  0 34 39 29 0 0 0 8 0 0 ...
##  $ popCommute_car                : int  749 2312 1146 611 2304 3359 575 1337 453 556 ...
##  $ popCommute_publicT            : int  0 0 0 0 0 0 5 0 0 0 ...
##  $ popCommute_bicycle            : int  0 0 0 0 0 0 0 10 0 0 ...
##  $ popCommute_foot               : int  10 80 86 22 24 88 10 58 29 7 ...
##  $ popCommute_other              : int  10 0 95 0 25 24 21 0 0 5 ...
##  $ popCommute_home               : int  31 352 79 6 89 123 52 47 6 22 ...
##  $ popBA                         : int  217 1119 473 218 890 994 272 518 319 123 ...
##  $ popPov                        : int  147 288 180 187 134 502 79 437 95 208 ...
##  $ medianIncome                  : int  56667 60045 67900 38821 65221 58561 55208 43242 58571 46845 ...
##  $ incomeLabor                   : num  3.07e+07 1.27e+08 5.73e+07 2.32e+07 1.00e+08 ...
##  $ incomeLabor_WageSalary        : num  2.76e+07 1.12e+08 5.04e+07 1.99e+07 9.03e+07 ...
##  $ incomeLabor_SelfEmpl          : num  3107600 14389400 6898600 3321700 9786700 ...
##  $ incomeInvest                  : num  1733400 5898900 1834700 611700 3353800 ...
##  $ incomeTotal                   : num  4.06e+07 1.56e+08 7.15e+07 3.20e+07 1.20e+08 ...
##  $ LF                            : int  919 3007 1502 691 2692 3782 723 1512 501 663 ...
##  $ LF_Civilian                   : int  919 3007 1494 691 2692 3782 723 1512 501 663 ...
##  $ LF_Civilian_Unemployed        : int  85 163 49 52 108 165 36 36 13 40 ...
##  $ LF_Not                        : int  464 1237 531 544 931 2027 324 997 318 210 ...
##  $ housingTotal                  : int  945 4219 1124 1261 2344 3640 968 2481 730 658 ...
##  $ housingVacant_rent            : int  0 31 18 65 31 30 12 87 0 0 ...
##  $ housingVacant_seasonal        : int  240 2016 122 395 557 317 427 1039 306 150 ...
##  $ medianHomeValue               : int  206500 263000 228500 167900 205500 184900 257400 186700 329200 203500 ...
##  $ medianGrossRent               : int  918 822 937 552 1133 922 760 704 786 945 ...
##  $ unemplRate                    : num  9.25 5.42 3.28 7.53 4.01 ...
##  $ LFparticipationRate           : num  66.4 70.9 73.9 56 74.3 ...
##  $ housingVacant_seasonal_percent: num  25.4 47.8 10.9 31.3 23.8 ...
##  $ housingVacant_rent_percent    : num  0 0.735 1.601 5.155 1.323 ...
##  $ incomeNonLabor                : num  9897700 28775100 14222800 8804000 19797800 ...
##  $ incomeTransferPayment         : num  8164300 22876200 12388100 8192300 16444000 ...
##  $ incomeNonLabor_percent        : num  24.4 18.5 19.9 27.5 16.5 ...
##  $ incomeInvest_percent          : num  4.26 3.79 2.57 1.91 2.8 ...
##  $ incomeTransferPayment_percent : num  20.1 14.7 17.3 25.6 13.7 ...
##  $ incomeLabor_percent           : num  75.6 81.5 80.1 72.5 83.5 ...
##  $ popPov_percent                : num  8.01 5.52 7.43 12.41 2.94 ...
##  $ popBA_percent                 : num  11.8 21.5 19.5 14.5 19.5 ...
##  $ popMoved_otherState_percent   : num  1.471 0.499 2.601 1.327 0.92 ...
##  $ popMoved_abroad_percent       : num  0 0.652 1.61 1.924 0 ...
##  $ popMoved_percent              : num  1.47 1.15 4.21 3.25 0.92 ...
##  $ popCommute_car_percent        : num  40.8 44.3 47.3 40.5 50.5 ...
##  $ popCommute_publicT_percent    : num  0 0 0 0 0 ...
##  $ popCommute_bicycle_percent    : num  0 0 0 0 0 ...
##  $ popCommute_foot_percent       : num  0.545 1.534 3.551 1.46 0.526 ...
##  $ popCommute_other_percent      : num  0.545 0 3.922 0 0.548 ...
##  $ popCommute_home_percent       : num  1.688 6.751 3.262 0.398 1.95 ...
##  $ Year                          : int  2011 2011 2011 2011 2011 2011 2011 2011 2011 2011 ...
##  $ benchM                        : int  NA NA NA NA NA NA NA NA NA NA ...
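
The row arithmetic in the comment above (places times years) can be checked directly. The following is a minimal sketch on a small stand-in data frame rather than the real file; it assumes, based on the output shown here, that benchM is NA for towns and non-missing for the benchmark rows. Which benchmark is coded 1 and which is coded 2 is a guess made only for illustration.

```r
# Toy stand-in for the census data: same column names (Town, Year, benchM),
# but only three towns plus the two benchmark rows.
# Assumption: benchM is NA for towns; the codes 1 and 2 are illustrative.
toy <- data.frame(
  Town   = rep(c("Alexandria", "Alton", "Andover",
                 "New Hampshire", "United States"), times = 2),
  Year   = rep(c(2011, 2016), each = 5),
  benchM = rep(c(NA, NA, NA, 1, 2), times = 2)
)

# Number of rows, distinct places, and distinct years
n_rows   <- nrow(toy)
n_places <- length(unique(toy$Town))
n_years  <- length(unique(toy$Year))

# Towns are the places that are not benchmarks (benchM is missing)
n_towns <- length(unique(toy$Town[is.na(toy$benchM)]))

c(rows = n_rows, places = n_places, years = n_years, towns = n_towns)
```

Running the same four lines on the real data (with toy replaced by data) should reproduce the 64 = 32 x 2 reasoning, provided the benchM assumption holds.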


# summary of data
# You can learn a lot about data from the summary. For example, you can see that each town shows up twice in the data; that Belknap County has more towns than any other county in the Lakes Region (22 rows, or 11 towns, because each town appears twice); and that the smallest town has a population of 541 people while the largest "town" has a population of more than 300 million people. What? Of course, you are right. That is the United States.
summary(data)
##        X                 Town              County      popTotal        
##  Min.   : 1.00   Alexandria: 2    Belknap     :22   Min.   :      541  
##  1st Qu.:16.75   Alton     : 2    Carroll     :16   1st Qu.:     1496  
##  Median :32.50   Andover   : 2    Grafton     :12   Median :     3013  
##  Mean   :32.50   Ashland   : 2    Merrimack   :10   Mean   :  9812983  
##  3rd Qu.:48.25   Barnstead : 2   New Hampshire: 2   3rd Qu.:     5516  
##  Max.   :64.00   Belmont   : 2   United States: 2   Max.   :318558162  
##                  (Other)   :52                                         
##    medianAge     popNative_bornUSA     popNative        
##  Min.   :37.00   Min.   :      481   Min.   :      487  
##  1st Qu.:42.27   1st Qu.:     1433   1st Qu.:     1433  
##  Median :46.30   Median :     2942   Median :     2966  
##  Mean   :46.82   Mean   :  8398579   Mean   :  8537742  
##  3rd Qu.:50.25   3rd Qu.:     5246   3rd Qu.:     5272  
##  Max.   :59.50   Max.   :271639606   Max.   :276363808  
##                                                         
##  popNaitve_bornNH    popMoved_otherState popMoved_abroad    
##  Min.   :      145   Min.   :      5     Min.   :      0.0  
##  1st Qu.:      664   1st Qu.:     27     1st Qu.:      0.0  
##  Median :     1388   Median :     47     Median :      0.0  
##  Mean   :  5746994   Mean   : 188730     Mean   :  50896.0  
##  3rd Qu.:     2608   3rd Qu.:    103     3rd Qu.:     13.8  
##  Max.   :186708691   Max.   :6111964     Max.   :1695894.0  
##                                                             
##  popCommute_car      popCommute_publicT popCommute_bicycle
##  Min.   :      173   Min.   :      0    Min.   :     0.0  
##  1st Qu.:      675   1st Qu.:      0    1st Qu.:     0.0  
##  Median :     1418   Median :      0    Median :     0.0  
##  Mean   :  3854261   Mean   : 225044    Mean   : 25412.4  
##  3rd Qu.:     2323   3rd Qu.:      4    3rd Qu.:     0.2  
##  Max.   :125037241   Max.   :7476312    Max.   :877995.0  
##                                                           
##  popCommute_foot   popCommute_other    popCommute_home  
##  Min.   :      0   Min.   :      0.0   Min.   :      6  
##  1st Qu.:     10   1st Qu.:      4.8   1st Qu.:     39  
##  Median :     24   Median :     13.5   Median :     70  
##  Mean   : 125354   Mean   :  54174.8   Mean   : 197434  
##  3rd Qu.:     74   3rd Qu.:     49.0   3rd Qu.:    145  
##  Max.   :4030730   Max.   :1777051.0   Max.   :6661892  
##                                                         
##      popBA              popPov          medianIncome    incomeLabor       
##  Min.   :     123   Min.   :      30   Min.   :38821   Min.   :9.157e+06  
##  1st Qu.:     360   1st Qu.:     141   1st Qu.:51477   1st Qu.:3.024e+07  
##  Median :     628   Median :     242   Median :58464   Median :7.037e+07  
##  Mean   : 1912784   Mean   : 1404789   Mean   :57864   Mean   :2.198e+11  
##  3rd Qu.:    1025   3rd Qu.:     551   3rd Qu.:63981   3rd Qu.:1.311e+08  
##  Max.   :64767787   Max.   :46932225   Max.   :76676   Max.   :7.290e+12  
##                                                                           
##  incomeLabor_WageSalary incomeLabor_SelfEmpl  incomeInvest      
##  Min.   :6.118e+06      Min.   :1.200e+06    Min.   :2.723e+05  
##  1st Qu.:2.618e+07      1st Qu.:4.139e+06    1st Qu.:2.141e+06  
##  Median :6.255e+07      Median :6.469e+06    Median :3.740e+06  
##  Mean   :2.056e+11      Mean   :1.412e+10    Mean   :1.401e+10  
##  3rd Qu.:1.138e+08      3rd Qu.:1.071e+07    3rd Qu.:1.041e+07  
##  Max.   :6.845e+12      Max.   :4.534e+11    Max.   :4.644e+11  
##                                                                 
##   incomeTotal              LF             LF_Civilian       
##  Min.   :1.554e+07   Min.   :      288   Min.   :      288  
##  1st Qu.:4.338e+07   1st Qu.:      778   1st Qu.:      778  
##  Median :8.854e+07   Median :     1677   Median :     1677  
##  Mean   :2.837e+11   Mean   :  4982562   Mean   :  4948950  
##  3rd Qu.:1.850e+08   3rd Qu.:     2977   3rd Qu.:     2977  
##  Max.   :9.502e+12   Max.   :160818740   Max.   :159807099  
##                                                             
##  LF_Civilian_Unemployed     LF_Not          housingTotal      
##  Min.   :      13       Min.   :     187   Min.   :      498  
##  1st Qu.:      39       1st Qu.:     506   1st Qu.:     1097  
##  Median :      70       Median :     866   Median :     1948  
##  Mean   :  396637       Mean   : 2782613   Mean   :  4163607  
##  3rd Qu.:     153       3rd Qu.:    1987   3rd Qu.:     4143  
##  Max.   :13488016       Max.   :92504969   Max.   :134054899  
##                                                               
##  housingVacant_rent housingVacant_seasonal medianHomeValue 
##  Min.   :      0    Min.   :     27        Min.   :155600  
##  1st Qu.:      0    1st Qu.:    300        1st Qu.:184900  
##  Median :      6    Median :    543        Median :218250  
##  Mean   :  96786    Mean   : 162998        Mean   :231338  
##  3rd Qu.:     38    3rd Qu.:   1244        3rd Qu.:268075  
##  Max.   :3321254    Max.   :5368085        Max.   :360800  
##                                                            
##  medianGrossRent    unemplRate     LFparticipationRate
##  Min.   : 552.0   Min.   : 1.310   Min.   :45.93      
##  1st Qu.: 823.5   1st Qu.: 3.895   1st Qu.:59.26      
##  Median : 922.0   Median : 4.963   Median :64.54      
##  Mean   : 926.8   Mean   : 5.360   Mean   :63.78      
##  3rd Qu.:1000.5   3rd Qu.: 6.797   3rd Qu.:67.65      
##  Max.   :1315.0   Max.   :10.935   Max.   :75.95      
##  NA's   :1                                            
##  housingVacant_seasonal_percent housingVacant_rent_percent
##  Min.   : 1.477                 Min.   :0.0000            
##  1st Qu.:16.565                 1st Qu.:0.0000            
##  Median :27.925                 Median :0.1363            
##  Mean   :29.913                 Mean   :0.8662            
##  3rd Qu.:41.980                 3rd Qu.:1.2604            
##  Max.   :68.020                 Max.   :5.1546            
##                                                           
##  incomeNonLabor      incomeTransferPayment incomeNonLabor_percent
##  Min.   :4.101e+06   Min.   :3.828e+06     Min.   :13.60         
##  1st Qu.:1.312e+07   1st Qu.:1.012e+07     1st Qu.:19.80         
##  Median :2.047e+07   Median :1.686e+07     Median :25.93         
##  Mean   :6.394e+10   Mean   :4.993e+10     Mean   :27.09         
##  3rd Qu.:4.423e+07   3rd Qu.:3.312e+07     3rd Qu.:31.85         
##  Max.   :2.212e+12   Max.   :1.748e+12     Max.   :53.67         
##                                                                  
##  incomeInvest_percent incomeTransferPayment_percent incomeLabor_percent
##  Min.   : 1.031       Min.   :10.58                 Min.   :46.33      
##  1st Qu.: 2.542       1st Qu.:15.94                 1st Qu.:68.15      
##  Median : 4.928       Median :18.44                 Median :74.07      
##  Mean   : 6.657       Mean   :20.43                 Mean   :72.91      
##  3rd Qu.: 8.735       3rd Qu.:24.75                 3rd Qu.:80.20      
##  Max.   :36.817       Max.   :36.41                 Max.   :86.40      
##                                                                        
##  popPov_percent   popBA_percent   popMoved_otherState_percent
##  Min.   : 1.157   Min.   :10.58   Min.   :0.3537             
##  1st Qu.: 6.630   1st Qu.:15.79   1st Qu.:1.1786             
##  Median : 8.495   Median :21.53   Median :1.9088             
##  Mean   : 9.681   Mean   :22.60   Mean   :2.1631             
##  3rd Qu.:12.705   3rd Qu.:27.98   3rd Qu.:2.7998             
##  Max.   :21.167   Max.   :43.13   Max.   :7.7385             
##                                                              
##  popMoved_abroad_percent popMoved_percent  popCommute_car_percent
##  Min.   :0.0000          Min.   : 0.3537   Min.   :31.98         
##  1st Qu.:0.0000          1st Qu.: 1.2728   1st Qu.:40.47         
##  Median :0.0000          Median : 2.0825   Median :44.18         
##  Mean   :0.3554          Mean   : 2.5185   Mean   :43.82         
##  3rd Qu.:0.3727          3rd Qu.: 3.2062   3rd Qu.:47.22         
##  Max.   :7.7634          Max.   :11.0906   Max.   :54.78         
##                                                                  
##  popCommute_publicT_percent popCommute_bicycle_percent
##  Min.   :0.0000             Min.   :0.00000           
##  1st Qu.:0.0000             1st Qu.:0.00000           
##  Median :0.0000             Median :0.00000           
##  Mean   :0.1775             Mean   :0.08443           
##  3rd Qu.:0.1488             3rd Qu.:0.00840           
##  Max.   :2.3469             Max.   :1.08887           
##                                                       
##  popCommute_foot_percent popCommute_other_percent popCommute_home_percent
##  Min.   :0.0000          Min.   :0.0000           Min.   : 0.3981        
##  1st Qu.:0.4395          1st Qu.:0.2175           1st Qu.: 1.6705        
##  Median :0.8309          Median :0.5455           Median : 2.3043        
##  Mean   :1.2211          Mean   :0.7564           Mean   : 2.9100        
##  3rd Qu.:1.5008          3rd Qu.:0.8935           3rd Qu.: 3.6064        
##  Max.   :6.6234          Max.   :3.9224           Max.   :15.8965        
##                                                                          
##       Year          benchM   
##  Min.   :2011   Min.   :1.0  
##  1st Qu.:2011   1st Qu.:1.0  
##  Median :2014   Median :1.5  
##  Mean   :2014   Mean   :1.5  
##  3rd Qu.:2016   3rd Qu.:2.0  
##  Max.   :2016   Max.   :2.0  
##                 NA's   :60
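
Because the benchmark rows sit in the same table as the towns, they dominate the means and maxima above. One possible way to get town-only figures is sketched below on synthetic data (the town names, counties, and populations are made up for illustration and are not real figures). It assumes benchmark rows are exactly those with a non-missing benchM.

```r
# Synthetic stand-in: three towns and one benchmark, each observed in two years.
# All names and values are placeholders, not real census figures.
toy <- data.frame(
  Town     = rep(c("Town A", "Town B", "Town C", "Benchmark"), times = 2),
  County   = rep(c("County X", "County X", "County Y", "Benchmark"), times = 2),
  Year     = rep(c(2011, 2016), each = 4),
  popTotal = c(1000, 2000, 3000, 3e8, 1100, 2100, 3100, 3.1e8),
  benchM   = rep(c(NA, NA, NA, 1), times = 2)
)

# Keep towns only: benchmark rows are the ones with a non-missing benchM
towns <- toy[is.na(toy$benchM), ]

# The benchmark rows dominate the mean and maximum of the full data ...
summary(toy$popTotal)
# ... while the town-only summary describes the towns themselves
summary(towns$popTotal)

# Each town appears once per year, so divide row counts by the number of years
towns_per_county <- table(towns$County) / length(unique(towns$Year))
towns_per_county
```

The same division by the number of years turns the County counts in the summary into counts of towns.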

Simple demonstration of what you can do with R

Note that you can easily replace medianAge with another variable in the data to perform the same analysis.

# Load packages
library(ggplot2)
library(dplyr)

data %>%
  filter(Year == 2016) %>%
  ggplot(aes(x = reorder(Town, medianAge), y = medianAge, fill = benchM)) +
  geom_col(show.legend = FALSE) +
  coord_flip() +
  labs(title = "Lakes Region Towns by Median Age",
       x = NULL,
       y = NULL,
       caption = "Data source: American Community Survey, 5-year estimate, 2012-2016")
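
Swapping medianAge for another variable can also be done without editing the chart code each time. The sketch below wraps the chart in a helper, plot_by_variable(), which is a hypothetical name and not part of the original analysis. It takes the column name as a string and uses ggplot2's .data pronoun to look it up. The example data are synthetic placeholders with the same column names as the census file.

```r
library(ggplot2)

# Hypothetical helper (not part of the original analysis): rank places by any
# numeric column, given as a string, for a single year.
plot_by_variable <- function(df, var, year = 2016) {
  one_year <- df[df$Year == year, ]
  ggplot(one_year,
         aes(x = reorder(Town, .data[[var]]), y = .data[[var]], fill = benchM)) +
    geom_col(show.legend = FALSE) +
    coord_flip() +
    labs(title = paste("Lakes Region Towns by", var), x = NULL, y = NULL)
}

# Synthetic example data; names and values are placeholders, not real figures
toy <- data.frame(
  Town         = c("Town A", "Town B", "Benchmark"),
  Year         = 2016,
  medianAge    = c(40, 50, 45),
  medianIncome = c(60000, 50000, 55000),
  benchM       = c(NA, NA, 1)
)

p_age    <- plot_by_variable(toy, "medianAge")
p_income <- plot_by_variable(toy, "medianIncome")
```

With the real data loaded, a call such as plot_by_variable(data, "medianIncome") would draw the same kind of ranking for median income. Print the returned object to display the chart.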