For a data set of your choosing, make a faceted plot using the trelliscopejs package. You may make any type of plot (scatter plot, histogram, etc.), but, as mentioned in the discussion below, you must explain why you chose this plot and what you are investigating about the variable you are graphing.
The trelliscope plot must include one cognostic measure of your own. Include a description of what it is and what information this measure gives.
# Read in data
alleg <- read.csv("allegheny_county_master_file (2).csv")
# Check structure and the unique values of MUNIDESC
#unique(alleg$MUNIDESC)
str(alleg$LOTAREA)
## int [1:584107] 8320 282835 106321 9000 1800 54582 54581 5768 9568 21388 ...
str(alleg$MUNIDESC)
## chr [1:584107] "1st Ward - PITTSBURGH" "1st Ward - PITTSBURGH" ...
# Try to make scatterplot of SALEPRICE vs LOTAREA faceted by municipality
# Change MUNIDESC to factor
alleg$MUNIDESC <- as.factor(alleg$MUNIDESC)
str(alleg$MUNIDESC)
## Factor w/ 175 levels "10th Ward - McKEESPORT",..: 17 17 17 17 17 17 17 17 17 17 ...
# Check values of usedesc
#unique(alleg$USEDESC)
# Filter to only single family homes
library(tidyr)
alleg_new <- subset(alleg, USEDESC == "SINGLE FAMILY")
# Remove all the spaces in the MUNIDESC variable
# (str_replace() would only remove the first space in each value)
library(stringr)
alleg_new$MUNIDESC <- str_replace_all(alleg_new$MUNIDESC, " ", "")
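Whether only the first space or every space is replaced matters for names such as "1st Ward - PITTSBURGH". A minimal sketch of the difference, using base R's sub() and gsub() as stand-ins for stringr's str_replace() and str_replace_all() so that it runs without extra packages (the variable names here are illustrative only):

```r
# Example value taken from the str() output above
muni <- "1st Ward - PITTSBURGH"

# sub() replaces only the first match, like stringr::str_replace()
first_only <- sub(" ", "", muni, fixed = TRUE)

# gsub() replaces every match, like stringr::str_replace_all()
all_spaces <- gsub(" ", "", muni, fixed = TRUE)

print(first_only)
print(all_spaces)
```

Names without spaces, such as "Shaler", are unaffected either way.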
# Make one scatterplot to make sure it works
# Test on Shaler
shaler_test <- subset(alleg_new, MUNIDESC == "Shaler")
# Filter where SALEPRICE < 75000 and LOTAREA < 500000
library(ggplot2)
shaler_filter <- subset(shaler_test, SALEPRICE < 75000 & LOTAREA < 500000)
# Test scatterplot for Shaler
ggplot(shaler_test, aes(x = LOTAREA, y = SALEPRICE)) +
  geom_point()
## Warning: Removed 17 rows containing missing values or values outside the scale range
## (`geom_point()`).
# New scatterplot on the filter one
ggplot(shaler_filter, aes(x = LOTAREA, y = SALEPRICE)) +
  geom_point()
# Now filter every house by those conditions
alleg_filter <- subset(alleg_new, SALEPRICE < 75000 & LOTAREA < 500000)
alleg_filter$LOTAREA <- as.numeric(alleg_filter$LOTAREA)
# Add LOTAREA SD as new label and cognostic
library(trelliscopejs)
## Warning: package 'trelliscopejs' was built under R version 4.4.3
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.3
## Warning: package 'readr' was built under R version 4.4.3
## Warning: package 'forcats' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ purrr 1.0.2
## ✔ forcats 1.0.0 ✔ readr 2.1.5
## ✔ lubridate 1.9.4 ✔ tibble 3.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
alleg_filter_cog <- alleg_filter %>%
  group_by(MUNIDESC) %>%
  mutate(LOTAREA_SD = sd(LOTAREA))
# Description to cog
alleg_filter_cog$LOTAREA_SD <- cog(alleg_filter_cog$LOTAREA_SD,
  desc = "Standard Deviation of LOTAREA", default_label = TRUE)
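To make the cognostic concrete: the grouped mutate() above stores each municipality's lot-area standard deviation on every row of that municipality, so the value is constant within a panel. A minimal sketch of the same idea in base R, on made-up data (the toy data frame and its values are hypothetical; ave() and tapply() stand in for the dplyr calls):

```r
# Toy data; municipality names and lot areas are made up for illustration
toy <- data.frame(
  MUNIDESC = c("A", "A", "A", "B", "B", "B"),
  LOTAREA  = c(1000, 2000, 3000, 5000, 5000, 5000)
)

# ave() applies sd() within each municipality and repeats the result on every row
toy$LOTAREA_SD <- ave(toy$LOTAREA, toy$MUNIDESC, FUN = sd)

# One value per municipality, which is what a panel-level measure summarizes
sd_by_muni <- tapply(toy$LOTAREA, toy$MUNIDESC, sd)
print(sd_by_muni)
```

A large value flags a municipality whose lot sizes vary widely, while a value near zero flags one whose lots are nearly uniform.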
# Create trelliscope graph
# Scatterplot of SALEPRICE vs LOTAREA for single family homes faceted by municipality
# (ggplot2 and tidyverse are already attached above)
alleg_filter_cog %>%
  ggplot(aes(x = LOTAREA, y = SALEPRICE)) +
  geom_point() +
  facet_trelliscope(~ MUNIDESC,
    name = "Single Family Houses",
    desc = "Scatterplot for Single Family Houses\nIn Allegheny County by Municipality",
    nrow = 2,
    ncol = 3,
    scales = c("free", "free"),
    path = ".",
    self_contained = TRUE)
## using data from the first layer
Description (2-3 paragraphs).
Describe the data set. Explain the variable you are graphing in your plots and why you are investigating it. Discuss your motivation for choosing the variable to facet on, and what insight or trend you are attempting to investigate. Discuss any challenges you had in making the graphs and how you dealt with them. Name at least one cognostic measure (this can be the cognostic you created or a different one) that the reader could investigate, and explain any insight they might gain from it.
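As an illustration of a further cognostic a reader could investigate (a suggestion, not part of the analysis above), the per-municipality correlation between LOTAREA and SALEPRICE summarizes how strongly the two variables move together within each panel. A minimal sketch on made-up data, in base R; the toy data frame and its values are hypothetical:

```r
# Toy data; municipality names, lot areas, and prices are made up for illustration
toy <- data.frame(
  MUNIDESC  = rep(c("A", "B"), each = 4),
  LOTAREA   = c(1000, 2000, 3000, 4000, 1000, 2000, 3000, 4000),
  SALEPRICE = c(10000, 20000, 30000, 40000, 40000, 30000, 20000, 10000)
)

# Split the rows by municipality and compute one correlation per group
cor_by_muni <- sapply(
  split(toy, toy$MUNIDESC),
  function(d) cor(d$LOTAREA, d$SALEPRICE)
)
print(cor_by_muni)
```

Sorting panels by such a measure would bring forward the municipalities where larger lots most consistently sell for more (values near 1) or where the relationship is weak or reversed (values near 0 or below).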