Overview

For this code-through we will be working through creating maps with that visually describe the populations represented in those areas, using Seattle Washington for this tutorial. We will also be generating diagrams that display the characteristics of the 6 groups autogenerated by this program.

For this code-through we will be using Census data from the 2012 American Communities Survey made available through the Diversity and Disparities Project.

Project Set Up

Download Data

We will be using Census data made available through the Diversity and Disparities Project for this Code-Through. The Diversity and Disparities Project has created a dataset with 170 variables that have been harmonized from 1970 onward for analysis of changes in tracts over time, allowing for easy analyses.

DATA DICTIONARY

LABEL VARIABLE
tractid GEOID
pnhwht12 Percent white, non-Hispanic
pnhblk12 Percent black, non-Hispanic
phisp12 Percent Hispanic
pntv12 Percent Native American race
pfb12 Percent foreign born
polang12 Percent speaking other language at home, age 5 plus
phs12 Percent with high school degree or less
pcol12 Percent with 4-year college degree or more
punemp12 Percent unemployed
pflabf12 Percent female labor force participation
pprof12 Percent professional employees
pmanuf12 Percent manufacturing employees
pvet12 Percent veteran
psemp12 Percent self-employed
hinc12 Median HH income, total
incpc12 Per capita income
ppov12 Percent in poverty, total
pown12 Percent owner-occupied units
pvac12 Percent vacant units
pmulti12 Percent multi-family units
mrent12 Median rent
mhmval12 Median home value
p30old12 Percent structures more than 30 years old
p10yrs12 Percent HH in neighborhood 10 years or less
p18und12 Percent 17 and under, total
p60up12 Percent 60 and older, total
p75up12 Percent 75 and older, total
pmar12 Percent currently married, not separated
pwds12 Percent widowed, divorced and separated
pfhh12 Percent female-headed families with children

Part 1: Building a Neighbohood Dataset

Step 1: Select Your MSA

Once you have decided which area you would like to do an analyis of, you can search for a city, using ‘grep( “CITYNAME”, crosswalk$msaname, value=TRUE )’

## [1] "SEATTLE-BELLEVUE-EVERETT, WA" "SEATTLE-BELLEVUE-EVERETT, WA"
## [3] "SEATTLE-BELLEVUE-EVERETT, WA"

Step 2: Download a Shapefile with Population Data

having determined the information for your area, you can download the shape files for that area.

Simply copy/paste the MSA name that you found through your search above, into the first line below, to pull the information for those areas.

Using ‘census_api_key(“your key here”)’, and your personal key, you can access and pull your data. If you do not yet have a Cencus AIP Key, you can request on at aip.census.gov.

After connecting, you can use the msaname and ‘get_acs()’ command to pull shapefiles and information that is relevant to the creation of your maps and analysis of your popuations:

## 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |=====                                                                 |   6%
  |                                                                            
  |=====                                                                 |   8%
  |                                                                            
  |======                                                                |   8%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |===========                                                           |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |=============                                                         |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |======================                                                |  32%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |=============================                                         |  42%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |=================================                                     |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |=====================================                                 |  52%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  58%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |================================================                      |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |==================================================                    |  72%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |======================================================                |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  88%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |=================================================================     |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |===================================================================== |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%

Step 3: Add Census Data

After we have pulled our geographic data, we can then retrieve demographic data from the a cencus file, and merge the files using common information.

  • Since we will be surveying an area with a costal area, which is seen as empty by the program, we will have to also include the code ‘area[ ! st_is_empty( area ) ,’ to tell the program to ignore those areas.

Step 4: Transform the Shapefile into A Dorling Cartogram

Once we have defined the areas which we are interested in mappingm can then transform our shapefile into a cartogram and plot them. This will provide us with a map if the tracts we are examining.

## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"

We can them convert the map to display Cencus information such as population, and remove the empty tracts:

We can then color code the cencus data to display gradiants in the trends we see within the populations in the area, such as Median Household Income, also defining the size of our ’bubbles, the data displayed, and the color pallette used to display that data:

If we play around more with the maps, we will find that we are also able to customize them more with items such as titles and content location.

## $tm_layout
## $tm_layout$legend.show
## [1] TRUE
## 
## $tm_layout$style
## [1] NA
## 
## 
## attr(,"class")
## [1] "tm"

We can also switch out the basemap to display topography, and focus on our area of interest, upon loading.

## $tm_layout
## $tm_layout$legend.show
## [1] TRUE
## 
## $tm_layout$style
## [1] NA
## 
## 
## attr(,"class")
## [1] "tm"

Or use other themes such as Stamen Watercolor, to display reference points like bodies of water, roads, or parks.

## $tm_layout
## $tm_layout$legend.show
## [1] TRUE
## 
## $tm_layout$style
## [1] NA
## 
## 
## attr(,"class")
## [1] "tm"

Part 2: Creating representative groups of the populations represented in the maps

After determining the variables that are of interest to you in regards to your popilations, you can auto generate distinct groups based on those characteristics. The data is transformed into z-scores so that they are all on a similar scale.

Prepare Data for Clustering

Using the ‘keep.these’ command, you can determine which variables to keep. You can the examine the data embedded in the map, focusing on your variables of interest. Below you will see an example of the first 7 variables we are examining.

pnhwht12 pnhblk12 phisp12 pntv12 pfb12 polang12 phs12
1.24 -0.7428 -0.8731 -0.4588 -1.181 -1.323 1.056
0.2575 0.6813 -0.3043 0.7652 -1.199 -1.241 -0.1616
1.171 -0.5183 -1.048 -0.01369 -1.355 -1.193 0.111
0.1927 -0.3768 -0.4018 -0.1002 -0.4999 -0.2616 1.027
0.7317 -0.6259 -0.441 -0.08169 -0.636 -0.7571 -0.4013
-0.2968 -0.1246 0.5503 -0.3351 0.0542 -0.5033 0.3122

Perform Cluster Analysis

We will then use mclust to create significant groupings of these populations, and examine the differences between groups.

Using this Information to Serve Our Community

Once you have produced your charts, you can use them the create labels for your groups based on the the information found in the clusters determined above. You can also determine trends in various areas, based on that information (i.e. demographic make up if each area).

I found this process incredibly useful in helping me to better understand the areas we examined, by being able to better understand the people in those areas. This process could be used to determine processes for local government or non-profit organizatoions, such as where to open a new workforce center, or what areas to target in promoting a new GED preparation course. It can help us to not only better understand our communities of interest, but also to better serve them.



Code-Trough Created By: Ricky Duran For: CPP 529 - Data Analysis Practicum Created On: “November 27, 2019”