IST 718 - Advanced Information Analytics (Professor G. Krydus)

Lab 4 (optional)

# Load the required packages (all must be installed beforehand)
library(maptools)
library(ggplot2)
library(rgeos)
library(rgdal)
library(reshape)
library(gpclib)
library(dplyr)
library(data.table)
library(splitstackshape)
library(USAboundaries)
library(raster)
library(RCurl)    
library(tidyr)      
library(stringr)

Initial assignment

# - Create an animated choropleth map that renders NY State county populations by year.
#   - Animation should cycle on year.
#   - Include appropriate text as you cycle through years.
# - Review the data and clean as appropriate.
# - Consider the base data:
#   - county
#   - population
#   - multiple years (2010, 2011, 2012, 2013)
#   - source format/structure
#   - source data set (nys pop multiple years) is in xlsx & csv format, located in the Blackboard Library area
# - Build a data frame for your analysis.
#   - You'll need to select the correct choroplethr dataset for this case study.
# - Answer the following questions in your report:
#   - What is your initial assessment of the source data?
#   - What data is missing?
#   - Describe what steps are needed to get the source data in a format for choroplethr consumption.
#   - Provide some descriptive statistics of your "choroplethr ready" dataset.
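
The reshaping and descriptive-statistics steps listed above can be sketched in base R. This is only an illustration under stated assumptions: the wide layout, the column names (county, fips, pop2010, pop2011) and the population figures are hypothetical and not taken from the course file. It also assumes that choroplethr's county maps expect a data frame with a numeric `region` column of county FIPS codes and a `value` column.

```r
# Hypothetical wide-format data: one row per county, one column per year.
# Column names and population values are made up for illustration.
pop_wide <- data.frame(
  county  = c("Albany", "Bronx", "Kings"),
  fips    = c(36001, 36005, 36047),   # county FIPS codes (state 36 = New York)
  pop2010 = c(304000, 1385000, 2505000),
  pop2011 = c(305000, 1395000, 2540000),
  stringsAsFactors = FALSE
)

# Wide to long: one row per county and year.
pop_long <- reshape(
  pop_wide,
  direction = "long",
  varying   = c("pop2010", "pop2011"),
  v.names   = "value",
  timevar   = "year",
  times     = c(2010, 2011),
  idvar     = "fips"
)
rownames(pop_long) <- NULL

# Rename the FIPS column to `region` (assumed choroplethr convention).
names(pop_long)[names(pop_long) == "fips"] <- "region"

# One region/value data frame per year, ready to be mapped.
pop_2010 <- pop_long[pop_long$year == 2010, c("region", "value")]

# A simple descriptive statistic: mean county population per year.
stats_by_year <- aggregate(value ~ year, data = pop_long, FUN = mean)
print(stats_by_year)
```

Keeping the data in long form means each year can be filtered into its own region/value frame, which is convenient when one map per year is needed.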

I was unable to visualize the provided dataset, nys pop multiple years.xlsx (.csv), because data were missing. For example, several counties were absent in 2012. The data are not well suited to this assignment.
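
One way to make the missing-data claim concrete is to list every county and year combination that has no row. The sketch below uses a tiny hypothetical long-format data frame (made-up names and values, not the course file) and assumes every county should appear in every year.

```r
# Hypothetical long-format data: Bronx has no row for 2012.
pop_long <- data.frame(
  county = c("Albany", "Bronx", "Albany"),
  year   = c(2011, 2011, 2012),
  value  = c(305000, 1395000, 306000),
  stringsAsFactors = FALSE
)

# Every county/year pair that should exist.
expected <- expand.grid(
  county = unique(pop_long$county),
  year   = unique(pop_long$year),
  stringsAsFactors = FALSE
)

# Left join the expected pairs to the data; unmatched pairs get NA in `value`.
check <- merge(expected, pop_long, by = c("county", "year"), all.x = TRUE)
missing_rows <- check[is.na(check$value), c("county", "year")]
print(missing_rows)

# Number of counties present per year, as a quick overview.
print(table(pop_long$year))
```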

# Set directory
setwd("C:/DC/Advanced Information Analytics/Labs")

# Attempt to map counties with data from the source:
# http://gis.ny.gov/gisdata/inventories/details.cfm?DSID=927
# Read shapefile
nys.counties <- readOGR("Counties.shp")
str(nys.counties)
summary(nys.counties)
names(nys.counties)

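# Plot 2010 population by county as a labelled dot plot
# (a quick look at the attribute data, not yet a map)
p <- ggplot(nys.counties@data, aes(POP2010, NAME))
p + geom_point(aes(colour = POP2010, size = POP2010)) + geom_text(size = 2, aes(label = NAME))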

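# fortify() turns a map into a data frame that can more easily be plotted with ggplot2.
nys.counties_geom <- fortify(nys.counties, region = "NAME")
# Attach the county attributes, then restore the vertex order, which merge() may change.
nys.counties_geom <- merge(nys.counties_geom, nys.counties@data, by.x = "id", by.y = "NAME")
nys.counties_geom <- nys.counties_geom[order(nys.counties_geom$order), ]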

# Produce map
Map <- ggplot(nys.counties_geom, aes(long, lat, group = group, fill = POP2010)) + 
  geom_polygon() + coord_equal() +
  labs(title = "NY State population by county in 2010 (the legend was removed)",
       subtitle = "Source: www.gis.ny.gov") + theme(legend.position="none") + 
  # long and lat are continuous, so hide the axis breaks with continuous scales
  scale_x_continuous(breaks = NULL) + scale_y_continuous(breaks = NULL)
Map
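
The assignment asks for the map to cycle on year, while the map above shows 2010 only. A minimal sketch of the per-year idea follows. It builds one ggplot per year from long-format data and keeps the fill scale identical across frames. Everything in it is hypothetical: two unit squares stand in for the fortified county polygons, and the populations are made up.

```r
library(ggplot2)

# Hypothetical stand-in for fortified county polygons: two unit squares.
square <- function(id, x0) {
  data.frame(
    id    = id,
    long  = c(x0, x0 + 1, x0 + 1, x0),
    lat   = c(0, 0, 1, 1),
    order = 1:4,
    group = paste0(id, ".1"),
    stringsAsFactors = FALSE
  )
}
geom <- rbind(square("A", 0), square("B", 1))

# Made-up populations per county and year (long format).
pop <- data.frame(
  id    = rep(c("A", "B"), times = 2),
  year  = rep(c(2010, 2011), each = 2),
  value = c(100, 200, 110, 190),
  stringsAsFactors = FALSE
)

# Shared fill limits so colours are comparable from frame to frame.
fill_limits <- range(pop$value)

# One plot per year, with the year shown in the title.
frames <- lapply(sort(unique(pop$year)), function(yr) {
  d <- merge(geom, pop[pop$year == yr, ], by = "id")
  d <- d[order(d$group, d$order), ]   # merge() can reorder rows; restore vertex order
  ggplot(d, aes(long, lat, group = group, fill = value)) +
    geom_polygon() + coord_equal() +
    scale_fill_continuous(limits = fill_limits) +
    labs(title = paste("Population by county,", yr))
})
```

Each element of `frames` can be printed in turn or written to an image file with ggsave() and then combined into an animation. choroplethr also provides choroplethr_animate(), which takes a list of choropleth plots; whether it suits frames built this way is not checked here.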

# Additional useful sources:

# https://www2.census.gov/geo/docs/reference/codes/files/national_county.txt # Add this
# http://api.rpubs.com/jbrnbrg/project2_607
# http://eriqande.github.io/rep-res-web/lectures/making-maps-with-R.html#maps-package-and-ggplot
# https://rpbs.com/jfbratt/basic-mapping
# https://rpubs.com/alyssafahringer/165330