The following report analyzes tax parcel data from Syracuse, New York (USA).
View the “Data Dictionary” here: Syracuse City Tax Parcel Data
The following code imports the Syracuse, NY tax parcel data using a URL.
url <- paste0("https://raw.githubusercontent.com/DS4PS/Data",
              "-Science-Class/master/DATA/syr_parcels.csv")
# Spell out the full argument name rather than relying on partial matching
dat <- read.csv(url,
                stringsAsFactors = FALSE)
There are several exploratory functions to better understand our new dataset.
We can inspect the first 5 rows of these data with head(dat, 5). Called without a second argument, head() returns the first 6 rows.
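As a minimal sketch of how head() behaves, the example below uses a small, hypothetical data frame (`toy`) in place of the parcel data, so it runs without a download. The column names mirror the real dataset, but the values are made up for illustration.

```r
# Hypothetical toy data standing in for the Syracuse parcel data
toy <- data.frame(
  tax_id   = 1:8,
  land_use = c("Vacant Land", "Single Family", "Commercial", "Parking",
               "Two Family", "Single Family", "Apartment", "Vacant Land"),
  acres    = c(0.05, 0.15, 0.40, 0.10, 0.12, 0.18, 0.60, 0.07),
  stringsAsFactors = FALSE
)

first_default <- head(toy)         # no 'n' supplied: returns the first 6 rows
first_five    <- head(toy, n = 5)  # explicitly request the first 5 rows

nrow(first_default)
nrow(first_five)
first_five
```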
Functions names() or colnames() will print
all variable names in a dataset.
## [1] "tax_id" "neighborhood" "stnum" "stname" "zip"
## [6] "owner" "frontfeet" "depth" "sqft" "acres"
## [11] "yearbuilt" "age" "age_range" "land_use" "units"
## [16] "residential" "rental" "vacantbuil" "assessedla" "assessedva"
## [21] "tax.exempt" "countytxbl" "schooltxbl" "citytaxabl" "star"
## [26] "amtdelinqu" "taxyrsdeli" "totint" "overduewater"
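The listing above would come from a call such as names(dat). The sketch below shows the same idea on a hypothetical three-column data frame (`toy`, with made-up values) and checks that names() and colnames() agree for a data frame.

```r
# Hypothetical toy data; only the column names matter here
toy <- data.frame(
  tax_id = c(101L, 102L),
  owner  = c("OWNER A", "OWNER B"),
  acres  = c(0.10, 0.20),
  stringsAsFactors = FALSE
)

var_names <- names(toy)   # character vector of variable names
var_names

# For a data frame, names() and colnames() return the same values
identical(names(toy), colnames(toy))
```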
We can also inspect the values of a variable by extracting it with $, as in dat$owner.
The extracted variable is called a “vector”.
## [1] "CLARMIN BUILDERS ONON COR" "JOHNSTON LEE R"
## [3] "CHRISTO CRAIG S" "HAWKINS FARMS INC"
## [5] "PETERS LYNNETTE" "MITCHELL LOTAN G"
## [7] "WHALEN GIOVANNA A" "BERGH GARY D"
## [9] "CITY OF SYRACUSE TD" "DOUGHERTY ROBERT K JR"
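The output above looks like the first ten owners, which could be produced with something like head(dat$owner, 10). The sketch below illustrates extraction with $ on a hypothetical data frame (`toy`, made-up owners) and confirms that the result is an ordinary vector that can be indexed and measured.

```r
# Hypothetical toy data with placeholder owner names
toy <- data.frame(
  owner = c("OWNER A", "OWNER B", "OWNER C"),
  acres = c(0.05, 0.15, 0.40),
  stringsAsFactors = FALSE
)

owners <- toy$owner   # extract one variable as a vector

class(owners)         # type of the values stored in the vector
length(owners)        # one element per row of the data frame
owners[1:2]           # vectors can be indexed by position
```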
Function unique() helps us determine what values exist
in a variable.
## [1] "Vacant Land" "Single Family" "Commercial"
## [4] "Parking" "Two Family" "Three Family"
## [7] "Apartment" "Schools" "Parks"
## [10] "Multiple Residence" "Cemetery" "Religious"
## [13] "Recreation" "Community Services" "Utilities"
## [16] "Industrial"
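A listing like the one above would come from a call such as unique(dat$land_use). As a small illustration, the sketch below applies unique() to a short, made-up vector of land-use labels and counts the distinct values with length().

```r
# Made-up land-use labels with repeats
land_use <- c("Vacant Land", "Single Family", "Vacant Land",
              "Commercial", "Single Family")

# unique() keeps each value once, in order of first appearance
distinct_uses <- unique(land_use)
distinct_uses

# Wrapping the result in length() counts the distinct values
n_uses <- length(distinct_uses)
n_uses
```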
Function str() provides an overview of total rows and
columns (dimensions), variable classes, and a preview of values.
## 'data.frame': 41502 obs. of 29 variables:
## $ tax_id : int 1393130501 1393130500 1437100600 1425100900 1425101000 ...
## $ neighborhood: chr "South Valley" "South Valley" ...
## $ stnum : chr "2655" "2635" ...
## $ stname : chr "VALLEY DR" "VALLEY DR" ...
## $ zip : chr "13215" "13120" ...
## $ owner : chr "CLARMIN BUILDERS ONON COR" "JOHNSTON LEE R" ...
## $ frontfeet : num 67.2 104.8 ...
## $ depth : num 50 46.5 ...
## $ sqft : num 2149 6370 ...
## $ acres : num 0.0493 0.1462 ...
## $ yearbuilt : int NA 1925 1957 1958 1965 ...
## $ age : int NA 90 58 57 50 ...
## $ age_range : chr NA "81-90" ...
## $ land_use : chr "Vacant Land" "Single Family" ...
## $ units : int 0 0 0 0 0 ...
## $ residential : logi FALSE TRUE TRUE ...
## $ rental : logi FALSE FALSE FALSE ...
## $ vacantbuil : logi FALSE FALSE FALSE ...
## $ assessedla : int 475 10800 20200 18000 18000 ...
## $ assessedva : int 500 69300 88300 70500 74000 ...
## $ tax.exempt : logi TRUE FALSE FALSE ...
## $ countytxbl : int 500 69300 88300 70500 74000 ...
## $ schooltxbl : int 500 69300 88300 70500 74000 ...
## $ citytaxabl : int 500 69300 88300 70500 74000 ...
## $ star : logi NA TRUE TRUE ...
## $ amtdelinqu : num 0 0 0 0 0 ...
## $ taxyrsdeli : int 0 0 0 0 0 ...
## $ totint : num 0 0 0 0 0 ...
## $ overduewater: num 0 178 ...
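The pieces that str() reports can also be retrieved one at a time. The sketch below, using a hypothetical four-column data frame (`toy`, made-up values), pairs str() with dim() for the dimensions and sapply(..., class) for the variable classes.

```r
# Hypothetical toy data covering several variable classes
toy <- data.frame(
  tax_id = c(1L, 2L, 3L),
  owner  = c("OWNER A", "OWNER B", "OWNER C"),
  acres  = c(0.05, 0.15, 0.40),
  rental = c(FALSE, TRUE, NA),
  stringsAsFactors = FALSE
)

str(toy)                          # compact overview: dimensions, classes, preview

toy_dim     <- dim(toy)           # c(rows, columns)
col_classes <- sapply(toy, class) # class of each variable, by name

toy_dim
col_classes
```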
Instructions: Provide the code for each solution in the following “chunks”.
Remember to modify the text to show your answer in human-readable terms.
Question: How many tax parcels are in Syracuse, NY?
Answer: There are 41502 tax parcels in Syracuse, NY.
parcel <- nrow(dat)
parcel
Question: How many acres of land are in Syracuse, NY?
Answer: There are 12510.49 acres of land in Syracuse, NY.
acres <- sum(dat$acres, na.rm = TRUE)
acres
Question: How many vacant buildings are there in Syracuse, NY?
Answer: There are 1888 vacant buildings in Syracuse, NY.
vacant <- sum(dat$vacantbuil, na.rm = TRUE)
vacant
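The acreage and vacant-building answers both rely on sum() with na.rm = TRUE. The sketch below, on short made-up vectors, shows why that argument matters when values are missing and how sum() counts the TRUE values in a logical vector.

```r
# Made-up numeric values with one missing entry
acres <- c(0.05, 0.15, NA, 0.40)

sum(acres)                              # NA: a missing value propagates
total_acres <- sum(acres, na.rm = TRUE) # drop NA before summing
total_acres

# Made-up logical values: TRUE counts as 1, FALSE as 0
vacant_flag <- c(TRUE, FALSE, NA, TRUE)
n_vacant <- sum(vacant_flag, na.rm = TRUE)
n_vacant
```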
Question: What proportion of parcels are tax-exempt?
Answer: 10.7% of parcels are tax-exempt.
exempt <- mean(dat$tax.exempt, na.rm = TRUE)
percentage <- exempt * 100
percentage
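Taking the mean of a logical vector gives the proportion of TRUE values, because TRUE is treated as 1 and FALSE as 0. The sketch below demonstrates this on a made-up vector and converts the result to a rounded percentage; the variable names are illustrative only.

```r
# Made-up tax-exempt flags, including one missing value
tax_exempt <- c(TRUE, FALSE, FALSE, NA, FALSE)

# With na.rm = TRUE the NA is dropped, leaving 1 TRUE out of 4 values
prop_exempt <- mean(tax_exempt, na.rm = TRUE)
prop_exempt

# Convert the proportion to a percentage rounded to one decimal place
pct_exempt <- round(prop_exempt * 100, 1)
pct_exempt
```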
Question: Which neighborhood contains the most tax parcels?
Answer: Eastwood contains the most tax parcels.
# Pass the appropriate variable to function 'table()'
max.parcels <- table(dat$neighborhood)
# Optional: Use additional functions to narrow your results
max.neighborhood <- names(max.parcels)[which.max(max.parcels)]
max.neighborhood
Question: Which neighborhood contains the most vacant lots?
Answer: Near Westside contains the most vacant lots.
# Pass two variables to function 'table()', separated by a comma
vacantlot <- table(dat$neighborhood, dat$land_use)
# (Optional) use additional functions to narrow your results
names(which.max(vacantlot[, "Vacant Land"]))
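The last two questions both combine table() with which.max(). The sketch below walks through the one-way and two-way cases on a hypothetical data frame (`toy`); the neighborhood names and counts are invented, and only the "Vacant Land" label is taken from the dataset.

```r
# Hypothetical toy data: neighborhood names and rows are made up
toy <- data.frame(
  neighborhood = c("North", "North", "South", "South", "South", "East"),
  land_use     = c("Vacant Land", "Single Family", "Vacant Land",
                   "Vacant Land", "Commercial", "Single Family"),
  stringsAsFactors = FALSE
)

# One-way table: number of parcels per neighborhood
parcels_by_nbhd <- table(toy$neighborhood)
most_parcels <- names(which.max(parcels_by_nbhd))
most_parcels

# Two-way table: rows are neighborhoods, columns are land uses
use_by_nbhd <- table(toy$neighborhood, toy$land_use)

# Select the "Vacant Land" column, then find the largest count
vacant_counts <- use_by_nbhd[, "Vacant Land"]
most_vacant <- names(which.max(vacant_counts))
most_vacant
```

Note that which.max() returns only the first maximum, so a tie between neighborhoods would go unreported; sorting the counts with sort(vacant_counts, decreasing = TRUE) is one way to check for that.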