# Load packages
library(ggplot2)
library(dplyr)
# Import data
Census <- read.csv("/resources/rstudio/BusStatistics/Data/Census.csv")
str(Census)
summary(Census)
Suppose that you are writing a report on the economic importance of snowbirds in the Lakes Region Planning Commission’s region. The director asks you what share of total housing is made up of seasonal homes (housing_percentOfseasonal) in a typical town in the region.
Mean

## Q2 Explain your answer.
I would choose the mean, because the median is only needed when there are extreme values. Although the count shows a difference of about 200 between the mean and the median, the mean is still the better choice, because the charts indicate that there are no extreme outliers.

## Q3 What is the highest percentage of seasonal homes in any town in the region?
The highest percentage of seasonal homes is 68%.
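The reasoning in Q2 (use the mean unless there are extreme values) can be checked numerically as well as by eye. The sketch below is only an illustration: `seasonal_share` is made-up data standing in for `housing_percentOfseasonal`, `find_outliers` is a hypothetical helper, and the 1.5 × IQR cutoff is the common box plot convention, assumed here rather than taken from the assignment.

```r
# Hypothetical percentages standing in for housing_percentOfseasonal
seasonal_share <- c(5, 12, 18, 22, 25, 31, 37, 44, 52, 68)

# Hypothetical helper: return values lying more than 1.5 * IQR
# below the first quartile or above the third quartile
find_outliers <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x[x < q[1] - fence | x > q[2] + fence]
}

outliers <- find_outliers(seasonal_share)

# Use the mean when no value is flagged, otherwise fall back to the median
center <- if (length(outliers) == 0) mean(seasonal_share) else median(seasonal_share)
```

With the real data, the same helper could be applied to `Census$housing_percentOfseasonal` to back up what the histogram and box plot suggest.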
# Create a histogram of the share of seasonal homes
# (faceting by the plotted variable itself would draw one panel per distinct value)
ggplot(Census, aes(x = housing_percentOfseasonal)) +
  geom_histogram()
# Create a box plot of the share of seasonal homes
ggplot(Census, aes(x = 1, y = housing_percentOfseasonal)) +
  geom_boxplot()
# Create a density plot for the same data
ggplot(Census, aes(x = housing_percentOfseasonal)) +
  geom_density(alpha = .3)
# If the data have extreme values, summarize with the median and IQR
Census %>%
  summarize(median = median(housing_percentOfseasonal, na.rm = TRUE),
            IQR = IQR(housing_percentOfseasonal, na.rm = TRUE))
# If the data don't have extreme values, summarize with the mean and standard deviation
Census %>%
  summarize(mean = mean(housing_percentOfseasonal, na.rm = TRUE),
            sd = sd(housing_percentOfseasonal, na.rm = TRUE))
Suppose that the director suspects that the share of seasonal homes (housing_percentOfseasonal) may be associated with the educational level of residents (popBA_percent). Divide the towns into two groups: 1) educated towns (the share of the population with a bachelor’s degree or higher is at or above the average) and 2) other towns (that share is below the average).

## Q4 What is the share of seasonal homes in total in a typical educated town?
The share of seasonal homes is 37.3% in a typical educated (at or above average) town.

## Q5 What is the share of seasonal homes in total in a typical less educated town?
The share of seasonal homes is 23.7% in a typical less educated (below average) town.

## Q6 What possible explanation might you have for the significant difference, if any?
The difference of about 14 percentage points (37.3% versus 23.7%) is present because towns with more educated people are able to afford more seasonal homes.
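The size of the gap between the two groups can be worked out directly from the figures reported in Q4 and Q5. This is a small arithmetic sketch; the variable names are chosen here for illustration and are not columns of `Census`.

```r
# Group figures reported in the answers to Q4 and Q5
# (percent of housing that is seasonal)
educated_share <- 37.3
other_share <- 23.7

# Absolute gap, in percentage points
gap_points <- educated_share - other_share

# Gap relative to the less educated group, in percent
relative_gap <- gap_points / other_share * 100
```

Stating the gap in percentage points keeps it distinct from the relative difference, which is a much larger number for the same two figures.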
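# Create a new variable: is the town's share of residents with a bachelor's degree
# at or above the regional average, or below it?
UR_ave <- mean(Census$popBA_percent, na.rm = TRUE)
Census$UR_aboveAve <- ifelse(Census$popBA_percent >= UR_ave, "equal or above ave", "below ave")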
# Create box plots of the share of seasonal homes by UR_aboveAve
ggplot(Census, aes(x = UR_aboveAve, y = housing_percentOfseasonal)) +
  geom_boxplot()
# If the data have extreme values, summarize each group with the median and IQR
Census %>%
  group_by(UR_aboveAve) %>%
  summarize(median = median(housing_percentOfseasonal, na.rm = TRUE),
            IQR = IQR(housing_percentOfseasonal, na.rm = TRUE))
# If the data don't have extreme values, summarize each group with the mean and standard deviation
Census %>%
  group_by(UR_aboveAve) %>%
  summarize(mean = mean(housing_percentOfseasonal, na.rm = TRUE),
            sd = sd(housing_percentOfseasonal, na.rm = TRUE))
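For readers without access to Census.csv, the split-and-summarize steps above can be tried on a small stand-in table. Everything in this sketch is an assumption made for illustration: the `towns` data frame and its values are invented, `ba_ave` and `group` are names chosen here, and base R's `tapply` is used in place of dplyr only so the example runs without extra packages.

```r
# Hypothetical towns; column names mirror the ones used above
towns <- data.frame(
  popBA_percent = c(15, 20, 25, 30, 35, 40),
  housing_percentOfseasonal = c(10, 20, 30, 30, 40, 50)
)

# Label each town relative to the average bachelor's-degree share
ba_ave <- mean(towns$popBA_percent, na.rm = TRUE)
towns$group <- ifelse(towns$popBA_percent >= ba_ave,
                      "equal or above ave", "below ave")

# Mean seasonal-home share within each group
group_means <- tapply(towns$housing_percentOfseasonal, towns$group,
                      mean, na.rm = TRUE)
```

Swapping `mean` for `median` in the `tapply` call gives the version for data with extreme values.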