Presentation Instructions
Make a five-minute presentation on a topic of your choice, preferably one from the current week’s chapter reading of Data Science for Business or another topic of broad interest. Do not just summarize the topic; go a little further, for example by:
- Discussing one or more business use cases, or
- Showing a short example with R code (and perhaps relevant R package(s)), or
- Providing a curated “learning path” of online resources to build further expertise in that topic.
You may also record your presentation instead of presenting in our meetup. Making a presentation in front of a group is strongly encouraged but not required. Our primary focus is on writing R code related to getting and shaping data in preparation for downstream modeling and presentations.
library("DT")
library("tidyr")
library("dplyr")
library("stringr")
library("ggplot2")
NYC Mitchell-Lama Housing
NYC Housing
In many ways, the New York City real estate market is unlike any other in the United States. One of the biggest differences between NYC and Any-Other-Town, USA, is that apartments for sale in NYC are either condos or cooperatives. In most places condos are the rule, but not in the Big Apple. Although cooperatives outnumber condos in NYC (most estimates say about 75% are cooperatives), more condos are on the active market at any given moment. [i]
Housing Cooperative
Roughly 75 percent of the Manhattan housing inventory consists of cooperatives. Unlike a condo, a cooperative is owned by a corporation. This means that when you buy an apartment in a cooperative building, you are not actually buying real property, as you would in a condo. You are, in fact, buying shares of the corporation. These shares entitle you to a proprietary lease, which makes your relationship to the building closer to that of an investor than to that of a condo owner, who owns a specific unit outright. [ii]
Mitchell-Lama Housing Program
In 1955, New York’s Governor signed into law a bill sponsored by State Senator MacNeill Mitchell and Assemblyman Alfred Lama to encourage and facilitate the construction and continued operation of affordable rental and cooperative housing in the State of New York. The resulting law is known as the Mitchell-Lama Program. [iii] Some Mitchell-Lama developments are supervised by New York City and others by New York State. [iv]
Mitchell-Lama Admission Preference for Veterans
Section 31 of the Private Housing Finance Law was amended effective September 12, 2010, to give Mitchell-Lama admission preference to all veterans, or their surviving spouses, who served on active duty in time of war, as defined in Section 85 of the Civil Service Law, and reside in New York State. [v]
Rent and Carrying Charge Increases
The housing company prepares the requisite petition, application, or motion for an increase in the maximum rental or carrying charges per room and submits it to HPD for approval as to form and authorization for hearing procedures. [vi] Proposed rent or carrying charge increases must be sufficient for total income to equal or exceed total expenses.
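The break-even requirement can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the figures are hypothetical, the variable names are not from the source data, and it assumes that only carrying charges are raised while other income stays fixed.

```r
# Hypothetical annual figures for one development (not from the source data).
carrying_charges <- 1600000   # current income from rent / carrying charges
other_income     <- 200000    # income assumed to be unaffected by the increase
total_expenses   <- 1950000   # projected total expenses

# Smallest proportional increase in carrying charges for which
# total income equals total expenses (0 if income already covers expenses).
required_increase <- max(0, (total_expenses - other_income) / carrying_charges - 1)

# Total income after applying the increase.
new_total_income <- carrying_charges * (1 + required_increase) + other_income

required_increase * 100   # 9.375 (percent)
new_total_income          # 1950000
```

Any approved increase at or above this percentage satisfies the condition that total income equals or exceeds total expenses.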
Analysis
Income versus Expenses
# Net position and income-to-expense ratio for each development
inc_exp <- macro_data %>%
  mutate(Net = Income - Expenses,
         Ratio = round(Income / Expenses, 2)) %>%
  select(1, 10, 11, 12, 13)
display(inc_exp)
# Stacked bar chart of annual income by development, split by subcategory
ggplot(filter(micro_data, Category == "Income"),
       aes(Development, Amount, fill = Subcategory)) +
  geom_bar(stat = "identity") +
  labs(y = "Annual Income", title = "Income per Development") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(axis.text.y = element_blank())

# Stacked bar chart of annual expenses by development, split by subcategory
ggplot(filter(micro_data, Category == "Expenses"),
       aes(Development, Amount, fill = Subcategory)) +
  geom_bar(stat = "identity") +
  labs(y = "Annual Expenses", title = "Expenses per Development") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(axis.text.y = element_blank())

Annual Income Comparison
Standardized Income
# Standardize each income line across developments (row-wise z-scores),
# then reshape to long format for the heat map.
# across() requires dplyr >= 1.0.
ggplot(income_coop %>%
         mutate(Mean = apply(.[, c(2:10)], 1, mean)) %>%
         mutate(SD = apply(.[, c(2:10)], 1, sd)) %>%
         mutate(across(-c(1, 11, 12), ~ (.x - Mean) / SD)) %>%
         select(-c(11, 12)) %>%
         gather(Development, Amount, 2:10),
       aes(x = Development, y = Description)) +
  geom_tile(aes(fill = Amount)) +
  scale_fill_gradient(low = 'black', high = 'green', name = "Standing") +
  theme(axis.title.y = element_blank(),
        legend.text = element_text(colour = "white"),
        plot.title = element_text(hjust = 0.5)) +
  labs(title = "Standardized Income Comparison")
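To make the standardization step concrete, here is a minimal base R sketch of row-wise z-scores. The table, its column names, and its values are hypothetical. The point is that income lines on very different scales become comparable once each row is centered on its own mean and divided by its own standard deviation.

```r
# Toy income table: one row per income line, one column per development.
# All names and values are hypothetical.
toy <- data.frame(
  Description = c("Carrying charges", "Parking", "Laundry"),
  Dev_A = c(100, 10, 8),
  Dev_B = c(120, 20, 6),
  Dev_C = c(140, 30, 4)
)

amounts <- as.matrix(toy[, -1])

# Row-wise z-score: subtract each row's mean, divide by that row's standard deviation.
# The length-3 vectors recycle down the columns, so element [i, j] uses row i's statistics.
z <- (amounts - rowMeans(amounts)) / apply(amounts, 1, sd)
rownames(z) <- toy$Description

z
```

Each row of `z` has mean 0 and standard deviation 1, so a tile's color in the heat map reflects how a development stands relative to the others on that line, not the size of the line itself.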

Annual Expense Comparison
Standardized Expenses
# Standardize each expense line across developments (row-wise z-scores),
# then reshape to long format for the heat map.
# across() requires dplyr >= 1.0.
ggplot(expenses_coop %>%
         mutate(Mean = apply(.[, c(2:10)], 1, mean)) %>%
         mutate(SD = apply(.[, c(2:10)], 1, sd)) %>%
         mutate(across(-c(1, 11, 12), ~ (.x - Mean) / SD)) %>%
         select(-c(11, 12)) %>%
         gather(Development, Amount, 2:10),
       aes(x = Development, y = Description)) +
  geom_tile(aes(fill = Amount)) +
  scale_fill_gradient(low = 'black', high = 'red', name = "Standing") +
  theme(axis.title.y = element_blank(),
        legend.text = element_text(colour = "white"),
        plot.title = element_text(hjust = 0.5)) +
  labs(title = "Standardized Expense Comparison")

Monthly Income per Unit
# Monthly income per apartment unit, by development and subcategory
income_apt <- micro_data %>%
  filter(Category == "Income") %>%
  mutate(Amount = round(Amount / Apts / 12, 2)) %>%
  select(1, 3, 4, 5)
display(income_apt %>% spread(Development, Amount))
ggplot(income_apt, aes(Development, Amount, fill = Subcategory)) +
  geom_bar(stat = "identity") +
  labs(y = "Monthly Income", title = "Income per Apartment Unit") +
  theme(plot.title = element_text(hjust = 0.5))

# Correlation between development size (units) and total monthly income per unit
cor((micro_data %>%
       filter(Category == "Income") %>%
       mutate(Amount = round(Amount / Apts / 12, 2)) %>%
       select(1, 5, 12) %>%
       group_by(Development) %>%
       mutate(Amount = sum(Amount)) %>%
       unique())[, c(2, 3)])[[2]]
## [1] 0.06978065
The correlation is close to zero (about 0.07), so there is no evident linear relationship between the size of a coop (in units) and its per-unit income (rent charged).
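The per-unit normalization matters for this conclusion. The following sketch uses hypothetical developments, with unit counts and incomes made up for illustration. It shows how total income can track size almost perfectly while per-unit monthly income is unrelated to size.

```r
# Hypothetical developments: unit counts and total annual income.
toy <- data.frame(
  Development   = c("A", "B", "C", "D"),
  Apts          = c(100, 200, 300, 400),
  Annual_Income = c(1200000, 2640000, 3960000, 4800000)
)

# Monthly income per apartment unit: 1000, 1100, 1100, 1000.
toy$Monthly_Per_Unit <- toy$Annual_Income / toy$Apts / 12

# Pearson correlation of size with total income and with per-unit income.
r_total    <- cor(toy$Apts, toy$Annual_Income)
r_per_unit <- cor(toy$Apts, toy$Monthly_Per_Unit)

r_total      # close to 1: larger developments collect more in total
r_per_unit   # essentially 0: per-unit income does not depend on size here
```

With only a handful of developments, a single coefficient is a rough guide; `cor.test()` from base R reports a confidence interval if a more formal check is wanted.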
Monthly Expenses Per Unit
# Monthly expenses per apartment unit, by development and subcategory
expenses_apt <- micro_data %>%
  filter(Category == "Expenses") %>%
  mutate(Amount = round(Amount / Apts / 12, 2)) %>%
  select(1, 3, 4, 5)
display(expenses_apt %>% spread(Development, Amount))
ggplot(expenses_apt, aes(Development, Amount, fill = Subcategory)) +
  geom_bar(stat = "identity") +
  labs(y = "Monthly Expenses", title = "Expenses per Apartment Unit") +
  theme(plot.title = element_text(hjust = 0.5))

# Correlation between development size (units) and total monthly expenses per unit
cor((micro_data %>%
       filter(Category == "Expenses") %>%
       mutate(Amount = round(Amount / Apts / 12, 2)) %>%
       select(1, 5, 12) %>%
       group_by(Development) %>%
       mutate(Amount = sum(Amount)) %>%
       unique())[, c(2, 3)])[[2]]
## [1] -0.01303967
The correlation is close to zero (about -0.01), so there is no evident linear relationship between the size of a coop (in units) and its per-unit expenses (maintenance costs).
Maintenance Comparison
# Each development's average monthly maintenance compared with the median
maintenance <- data.frame(micro_data %>%
  group_by(Category) %>%
  mutate(Median_Maint = round(median(Apt_Maint), 2)) %>%
  ungroup() %>%
  select(1, 13, 15) %>%
  unique() %>%
  mutate(Dollar_Diff = round(Apt_Maint - Median_Maint, 2)) %>%
  mutate(Percent_Diff = round((Apt_Maint - Median_Maint) / Median_Maint, 2)))
ggplot(maintenance, aes(x = Development, y = Apt_Maint, fill = Development)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Average Monthly Maintenance") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(axis.title.x = element_blank()) +
  theme(axis.title.y = element_blank()) +
  theme(axis.text.y = element_blank())

# Show the comparison table with the percent difference formatted as text
display(maintenance %>%
  mutate(Percent_Diff = paste0(round(Percent_Diff * 100, 2), "%")))
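The comparison against the median can be checked by hand on a small example. This sketch reuses the column names `Apt_Maint`, `Dollar_Diff`, and `Percent_Diff` from the code above, but the developments and amounts are hypothetical.

```r
# Hypothetical average monthly maintenance per development.
toy <- data.frame(
  Development = c("A", "B", "C", "D", "E"),
  Apt_Maint   = c(800, 900, 1000, 1100, 1500)
)

# The median is robust to the one expensive development (E).
median_maint <- median(toy$Apt_Maint)

# Difference from the median in dollars and as a percentage of the median.
toy$Dollar_Diff  <- toy$Apt_Maint - median_maint
toy$Percent_Diff <- round(toy$Dollar_Diff / median_maint * 100, 1)

toy
```

Here the median is 1000, so development E sits 500 dollars (50 percent) above it while A sits 200 dollars (20 percent) below. Using the median instead of the mean keeps a single outlier such as E from shifting the baseline for every other development.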
Increased Maintenance Comparison
# Each development's increased monthly maintenance compared with the median
increases <- data.frame(micro_data %>%
  group_by(Category) %>%
  mutate(Median_Maint = round(median(Inc_Maint), 2)) %>%
  ungroup() %>%
  select(1, 14, 15) %>%
  unique() %>%
  mutate(Dollar_Diff = round(Inc_Maint - Median_Maint, 2)) %>%
  mutate(Percent_Diff = round((Inc_Maint - Median_Maint) / Median_Maint, 2)))
ggplot(increases, aes(x = Development, y = Inc_Maint, fill = Development)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  labs(title = "Average Monthly Increased Maintenance") +
  theme(plot.title = element_text(hjust = 0.5)) +
  theme(axis.title.x = element_blank()) +
  theme(axis.title.y = element_blank()) +
  theme(axis.text.y = element_blank())

# Show the comparison table with the percent difference formatted as text
display(increases %>%
  mutate(Percent_Diff = paste0(round(Percent_Diff * 100, 2), "%")))