library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.6
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
setwd("C:/Documents - Copy/PERSONAL/Data 110_MC_Class")
project0 <- read_csv("2019_Rental_Facility_Occupancy_Survey_Results.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## `Community Name` = col_character(),
## `Community Address` = col_character(),
## `Bedroom Types` = col_character(),
## `Average Rent 2015` = col_double(),
## `Average Rent 2016` = col_double(),
## `Average Rent 2017` = col_double(),
## `Average Rent 2018` = col_double(),
## `Average Rent 2019` = col_double(),
## `Percent Change From Previous Year 2018-2019` = col_double()
## )
names(project0) <- tolower(names(project0))
names(project0) <- gsub(" ", "_", names(project0))
str(project0)
## spec_tbl_df [1,100 x 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ community_name : chr [1:1100] "Chateau Apartments" "The Henri" "Corrigan Square" "Dorset Apartments" ...
## $ community_address : chr [1:1100] "9727 MT PISGAH RD SILVER SPRING MD 20903" "11870 GRAND PARK AVE ROCKVILLE MD 20852" "8511 SNOUFFER SCHOOL RD GAITHERSBURG MD 20879" "4757 CHEVY CHASE DR CHEVY CHASE MD 20815" ...
## $ bedroom_types : chr [1:1100] "efficiency" "efficiency" "1 bedroom" "2 bedroom" ...
## $ average_rent_2015 : num [1:1100] 1249 NA 1032 1508 997 ...
## $ average_rent_2016 : num [1:1100] NA NA 1224 1523 1026 ...
## $ average_rent_2017 : num [1:1100] 1299 1606 1073 1508 782 ...
## $ average_rent_2018 : num [1:1100] 1323 1567 1093 1520 716 ...
## $ average_rent_2019 : num [1:1100] 1594 1845 1276 1772 834 ...
## $ percent_change_from_previous_year_2018-2019: num [1:1100] 0.205 0.177 0.167 0.166 0.165 0.163 0.158 0.146 0.14 0.139 ...
## - attr(*, "spec")=
## .. cols(
## .. `Community Name` = col_character(),
## .. `Community Address` = col_character(),
## .. `Bedroom Types` = col_character(),
## .. `Average Rent 2015` = col_double(),
## .. `Average Rent 2016` = col_double(),
## .. `Average Rent 2017` = col_double(),
## .. `Average Rent 2018` = col_double(),
## .. `Average Rent 2019` = col_double(),
## .. `Percent Change From Previous Year 2018-2019` = col_double()
## .. )
view(project0)
onebed <- project0 %>%
filter(bedroom_types == "1 bedroom")
view(onebed)
There are 404 entries for 1 bedroom.
twobed <- project0 %>%
filter(bedroom_types == "2 bedroom")
view(twobed)
There are 413 entries for 2 bedroom.
threebed <- project0 %>%
filter(bedroom_types == "3 bedroom")
view(threebed)
There are 167 entries for 3 bedroom.
fourbed <- project0 %>%
filter(bedroom_types == "4 bedroom")
view(fourbed)
There are 19 entries for 4 bedroom.
efficiency <- project0 %>%
filter(bedroom_types == "efficiency")
view(efficiency)
There are 97 entries for the “efficiency” bedroom type.
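The five counts above can also be produced in a single step rather than by filtering and viewing each subset. The following is a minimal sketch, assuming the cleaned column name `bedroom_types` from the steps above; the helper name `count_by_bedroom` is hypothetical and not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: tally rows per bedroom type in a single pass.
count_by_bedroom <- function(df) {
  df %>%
    count(bedroom_types, sort = TRUE)  # sort = TRUE lists the largest group first
}

# Usage with the data frame loaded above:
# count_by_bedroom(project0)
```

The result is a small table with one row per bedroom type and a column `n` holding the number of entries, which can be checked against the counts reported above.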
ggplot(data = project0) +
geom_bar(mapping = aes(x = bedroom_types, fill=bedroom_types))+
ggtitle("Distribution of Bedroom Types")
The bar chart shows that 1-bedroom and 2-bedroom units are the most frequent bedroom types, followed by 3-bedroom units and then efficiencies; 4-bedroom residences are the least frequent.
How does the average rent vary by bedroom type?
boxpl <- project0 %>%
ggplot() +
geom_boxplot(aes(x=bedroom_types, y=average_rent_2018, fill=bedroom_types))+
ggtitle("Average Rent 2018 by Bedroom Types")
boxpl
boxpl <- project0 %>%
ggplot() +
geom_boxplot(aes(x=bedroom_types, y=average_rent_2019, fill=bedroom_types))+
ggtitle("Average Rent 2019 by Bedroom Types")
boxpl
The rent pattern for 2019 is similar to that of 2018.
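The year-to-year comparison is easier to judge when both years share one figure. The following is a rough sketch, assuming the cleaned column names `bedroom_types` and `average_rent_<year>` from the steps above; the helpers `rent_long` and `plot_rent_by_year` are hypothetical names introduced for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical helper: stack the average_rent_<year> columns into one
# year column and one average_rent column.
rent_long <- function(df) {
  df %>%
    pivot_longer(
      cols = starts_with("average_rent_"),
      names_to = "year",
      names_prefix = "average_rent_",
      values_to = "average_rent"
    ) %>%
    mutate(year = as.integer(year))
}

# Hypothetical helper: one boxplot panel per requested year.
plot_rent_by_year <- function(df, years = c(2018, 2019)) {
  rent_long(df) %>%
    filter(year %in% years, !is.na(average_rent)) %>%
    ggplot(aes(x = bedroom_types, y = average_rent, fill = bedroom_types)) +
    geom_boxplot() +
    facet_wrap(~ year) +
    ggtitle("Average Rent by Bedroom Type and Year")
}

# Usage with the data frame loaded above:
# plot_rent_by_year(project0)
```

Because `facet_wrap()` keeps the same y-axis scale across panels by default, the boxes for 2018 and 2019 can be compared directly.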
This data comes from the Montgomery County open data portal: 2019 Rental Facility Occupancy Survey Results | Open Data Portal (montgomerycountymd.gov). It is from a survey conducted in 2019 on residential rent in Montgomery County, Maryland, covering the period 2015 to 2019. It has the following categorical variables: community name, community address, and bedroom types (1-bedroom, 2-bedroom, 3-bedroom, 4-bedroom and efficiency). The numeric variables are the average rent in 2015, 2016, 2017, 2018 and 2019, and the percent change in average rent between 2018 and 2019. The data has 1,100 entries. Cleaning the data consisted of converting the variable names to lower case and replacing the spaces in multi-word variable names with underscores.
The 1,100 residences in the survey included 1-bedroom (n=404), 2-bedroom (n=413), 3-bedroom (n=167), 4-bedroom (n=19) and efficiency (n=97) units. My main analysis question was, “How does the average rent vary by bedroom type?” I used the most recent data, 2018 and 2019, to answer this question. For each year, I drew a boxplot showing the distribution of average rent by bedroom type. The boxplots showed the same pattern in 2018 and 2019. Rent increased steadily from 1-bedroom, to 2-bedroom, to 3-bedroom, to 4-bedroom. The rent range for “efficiency” is similar to that for 1-bedroom. However, the rent for 4-bedroom has a wide range, with the lower end overlapping with all the other bedroom types. This overlap came as a surprise to me. I would not have expected that in the same county one could get a 1-bedroom for the same rent as a 4-bedroom.
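The overlap between bedroom types read off the boxplots can also be checked numerically. The following is a minimal sketch, assuming the cleaned column names from the steps above; the helper `rent_summary` and its output column names are hypothetical.

```r
library(dplyr)

# Hypothetical helper: minimum, quartiles, and maximum of one rent column
# for each bedroom type, ignoring missing rents.
rent_summary <- function(df, rent_col) {
  df %>%
    group_by(bedroom_types) %>%
    summarise(
      n_with_rent = sum(!is.na({{ rent_col }})),
      rent_min    = min({{ rent_col }}, na.rm = TRUE),
      rent_q1     = quantile({{ rent_col }}, 0.25, na.rm = TRUE, names = FALSE),
      rent_median = median({{ rent_col }}, na.rm = TRUE),
      rent_q3     = quantile({{ rent_col }}, 0.75, na.rm = TRUE, names = FALSE),
      rent_max    = max({{ rent_col }}, na.rm = TRUE),
      .groups = "drop"
    )
}

# Usage with the data frame loaded above:
# rent_summary(project0, average_rent_2019)
```

Two bedroom types overlap in rent whenever the `rent_min` of the larger type falls below the `rent_max` of the smaller one.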
It would be interesting to explore possible reasons for this rent overlap through further analysis of this data. One reason could be that rents in some ZIP codes of the county are so high that a high-end 1-bedroom there costs about as much as a lower-end 4-bedroom residence in a ZIP code with lower rents. This question could be answered by plotting the data by ZIP code. Another reason for the overlap between bedroom types could be variation by community. It is possible that some residential communities are expensive while others are moderate, so a 1-bedroom in one community could cost the same as a 4-bedroom in another. Further exploration of the data could also examine the time trend. Indeed, the dataset has a variable, “Percent Change From Previous Year 2018-2019”. Such an analysis could reveal the pattern of rent increases from 2015 to 2019, which could in turn be used to predict rents for subsequent years.
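One way to start on the geographic comparison suggested above is sketched below. It assumes that the area can be approximated by the five-digit ZIP code at the end of `community_address`, as in the sample addresses printed by `str()` earlier; this assumption has not been checked against all 1,100 rows. The helper name `rent_by_zip` and its output column names are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: extract the trailing five-digit ZIP code from the
# address, then take the median of one rent column per ZIP code and
# bedroom type. Addresses without a trailing ZIP code get zip = NA.
rent_by_zip <- function(df, rent_col) {
  df %>%
    mutate(zip = str_extract(str_trim(community_address), "\\d{5}$")) %>%
    group_by(zip, bedroom_types) %>%
    summarise(
      n_entries   = n(),
      median_rent = median({{ rent_col }}, na.rm = TRUE),
      .groups = "drop"
    )
}

# Usage with the data frame loaded above:
# rent_by_zip(project0, average_rent_2019)
```

Comparing `median_rent` for 1-bedroom units in one ZIP code with 4-bedroom units in another would show whether location accounts for the overlap.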