Brian Grob and Ryan Stong

10/09/17

Loading the Required Packages

library(ggplot2)
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.4.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(readxl)
## Warning: package 'readxl' was built under R version 3.4.2

Reading the Data Set

tallb=read_excel("tallestbuildings.xlsx")

1. Please display how many buildings there are in each city represented in the dataset. Arranging the cities in ascending or descending order of the number of buildings is always helpful to the eye.

Create a data set that displays the number of buildings in each city

Citycounts=tallb%>%group_by(City)%>%summarize(number=n())
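The prompt also asks for the counts in sorted order. A minimal sketch on made-up data (the toy data frame and its city labels are hypothetical, not taken from tallestbuildings.xlsx) shows one way to count rows per group with n() and sort the result with arrange():

```r
library(dplyr)

# Hypothetical stand-in for tallb: one row per building
toy <- data.frame(City = c("A", "B", "A", "C", "A", "B"),
                  stringsAsFactors = FALSE)

# n() counts the rows in each group; arrange(desc()) sorts largest first
toy_counts <- toy %>%
  group_by(City) %>%
  summarize(number = n()) %>%
  arrange(desc(number))

print(toy_counts)
```

The sorted table is handy for reading off exact counts, while the bar graph gives the overall picture.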

Create a bar graph with city name on the vertical axis and building count on the horizontal axis

ggplot(Citycounts,aes(reorder(City,-number),number))+geom_bar(stat = "identity",fill="red")+coord_flip()
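How reorder() decides the bar order may be easier to see on a small example. The sketch below uses hypothetical values. reorder() sorts the factor levels by the second argument in ascending order, so negating the count puts the largest group first in the level order. With coord_flip(), the first level is drawn at the bottom of the vertical axis.

```r
# Hypothetical labels and counts
city   <- c("A", "B", "C")
number <- c(3, 1, 2)

# Negated counts: the level with the largest count comes first
desc_levels <- levels(reorder(city, -number))
print(desc_levels)  # "A" "C" "B"

# Plain counts: the level with the smallest count comes first
asc_levels <- levels(reorder(city, number))
print(asc_levels)   # "B" "C" "A"
```

Dropping the minus sign is therefore enough to move the tallest bar from the bottom of a flipped chart to the top.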

2. Please plot different cities in order of the mean height of buildings in a city.

Rename the height column to simplify the code that follows

colnames(tallb)[colnames(tallb)=="Height (ft)"] <- "Height"
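The same rename can also be written with dplyr, which is already loaded. This is only a sketch on a hypothetical one-column data frame; backticks are needed because the original column name contains a space and parentheses.

```r
library(dplyr)

# Hypothetical stand-in for tallb; check.names = FALSE keeps the awkward name
toy <- data.frame(`Height (ft)` = c(2717, 2073), check.names = FALSE)

# rename(new = old): backticks quote the non-syntactic column name
toy <- toy %>% rename(Height = `Height (ft)`)

print(names(toy))  # "Height"
```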

Create data set that displays city against the mean height of the buildings in each city

Citymean=tallb%>%group_by(City)%>%summarize(number=mean(Height))

Create a bar graph with city name on the vertical axis and the mean height of the buildings in each city on the horizontal axis

ggplot(Citymean,aes(reorder(City,-number),number))+geom_bar(stat = "identity",fill="cyan")+coord_flip()

3 and 4. Please redo 1 and 2 using the country information that is given. (Note that the country variable is present with the city variable. Perhaps a split of that variable is necessary.) You may want to check out the countrycode package to get the full names of different countries instead of relying on the cryptic country codes that are present in that dataset.

Load the countrycode package

library(countrycode)
## Warning: package 'countrycode' was built under R version 3.4.2

Create a data set that separates the city and country into two columns

newtallb=tallb %>% separate(City, c("City", "Country"), sep="[[:punct:]]")
## Warning: Too many values at 100 locations: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
## 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...
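The warning above indicates that every entry splits into more than two pieces, which is what happens when a value contains both an opening and a closing punctuation mark. The sketch below assumes entries of the form "Name (XX)"; that format and the toy values are assumptions for illustration, not confirmed by the source. Passing extra = "drop" discards the surplus pieces without a warning, and trimws() removes the trailing space left on the city name.

```r
library(dplyr)
library(tidyr)

# Hypothetical entries in an assumed "Name (XX)" format
toy <- data.frame(City = c("Dubai (AE)", "Kuala Lumpur (MY)"),
                  stringsAsFactors = FALSE)

toy_split <- toy %>%
  # Split at each punctuation mark and keep only the first two pieces
  separate(City, c("City", "Country"), sep = "[[:punct:]]", extra = "drop") %>%
  # The city piece still ends with the space that preceded "("
  mutate(City = trimws(City))

print(toy_split)
```

A city name that itself contains punctuation would be split too early by this pattern, so the result is worth checking before relying on it.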

Use the countrycode package to convert the two-letter country codes to full country names

newtallb$Country=countrycode(newtallb$Country, "iso2c", "country.name")
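Codes that countrycode() cannot match come back as NA, so a quick check after the conversion can catch rows that would otherwise be lumped together in the plots. A minimal sketch, using a hypothetical vector of codes in which "ZZ" is deliberately not a real country code:

```r
library(countrycode)

# Hypothetical codes; "ZZ" is included on purpose as an unmatched value
codes <- c("AE", "CN", "ZZ")

# Unmatched codes return NA (normally with a warning, suppressed here)
full_names <- suppressWarnings(countrycode(codes, "iso2c", "country.name"))

# TRUE marks the codes that were not converted
print(is.na(full_names))
```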

Create a data set that displays the number of buildings in each country

Countrycounts=newtallb%>%group_by(Country)%>%summarize(number=n())

Create a bar graph with country name on the vertical axis and the number of buildings in each country on the horizontal axis

ggplot(Countrycounts,aes(reorder(Country,-number),number))+geom_bar(stat = "identity",fill="red")+coord_flip()

Create a data set that displays the mean height of the buildings in each country

Countrymean=newtallb%>%group_by(Country)%>%summarize(number=mean(Height))

Create a bar graph with country name on the vertical axis and the mean height of the buildings in each country on the horizontal axis

ggplot(Countrymean,aes(reorder(Country,-number),number))+geom_bar(stat = "identity",fill="cyan")+coord_flip()

5. In 4 above, you would have plotted the countries in order of the mean height of their buildings. If you did not use a bar graph there, please create one. In this bar graph, please color each country's bar based on the number of buildings from this dataset that are in that country.

Create a data set that displays each country name with the number of buildings and the mean height of the buildings in that country

Countrymeancount=newtallb%>%group_by(Country)%>%summarize(Buildingcounts=n(), Countrymean=mean(Height))
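A single summarize() call can return several statistics per group. The sketch below reproduces the pattern on hypothetical data (countries "X" and "Y" and their heights are made up) so the shape of the result is easy to check by hand.

```r
library(dplyr)

# Hypothetical stand-in for newtallb
toy <- data.frame(Country = c("X", "X", "Y"),
                  Height  = c(1000, 2000, 1500),
                  stringsAsFactors = FALSE)

# One output row per country, with a count column and a mean column
toy_summary <- toy %>%
  group_by(Country) %>%
  summarize(Buildingcounts = n(), Countrymean = mean(Height))

print(toy_summary)
```

Here "X" has two buildings averaging 1500 and "Y" has one building at 1500, which shows why the count is useful context for the mean.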

Create a bar graph with country name on the vertical axis and the mean height of the buildings in each country on the horizontal axis. Color each bar by the building count of its country

ggplot(Countrymeancount,aes(reorder(Country,-Countrymean),Countrymean, fill=Buildingcounts))+geom_bar(stat = "identity", position="dodge")+coord_flip()+labs(title= "Countries ranked by the mean height of tall buildings", x="", y="Countrymean")+theme_classic()
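Because the fill is mapped to a numeric column, ggplot2 draws a continuous color scale. The sketch below, on hypothetical data, shows how that scale and its legend title might be set explicitly. geom_col() is ggplot2's shorthand for geom_bar(stat = "identity"); the color names and axis labels are illustrative choices, not part of the original analysis.

```r
library(ggplot2)

# Hypothetical stand-in for Countrymeancount
toy <- data.frame(Country        = c("X", "Y", "Z"),
                  Buildingcounts = c(10, 3, 1),
                  Countrymean    = c(1200, 1500, 1100))

p <- ggplot(toy, aes(reorder(Country, -Countrymean), Countrymean,
                     fill = Buildingcounts)) +
  geom_col() +                                   # bars use the values as given
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  coord_flip() +
  labs(title = "Countries ranked by the mean height of tall buildings",
       x = "", y = "Mean height (ft)", fill = "Buildings") +
  theme_classic()

print(p)
```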

6. What are the mean heights (in feet) of buildings that are used for different purposes? (Here, you will have different purposes in one column and the corresponding mean height in another column.) In computing this, it is okay to double or triple count a building if it has multiple uses.

Create a data set which displays the mean heights of the buildings based on their purpose

Usemean=newtallb%>%group_by(Use)%>%summarize(number=mean(Height))

Create a bar graph with purpose on the vertical axis and mean height of the buildings for each purpose on the horizontal axis

ggplot(Usemean,aes(reorder(Use,number),number))+geom_bar(stat = "identity",fill="gold")+coord_flip()
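Grouping on Use as it stands treats each combination of uses as its own category. If multi-use buildings should instead be counted once per use, as the prompt allows, one option is to split the column into one row per use first. The sketch below assumes uses are separated by a slash; that delimiter and the toy values are assumptions, since the actual format of the Use column is not shown here.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for newtallb with an assumed "/" delimiter
toy <- data.frame(Use    = c("office", "hotel / office", "hotel"),
                  Height = c(1000, 2000, 1500),
                  stringsAsFactors = FALSE)

toy_usemean <- toy %>%
  # One row per use, so a two-use building contributes to both groups
  separate_rows(Use, sep = "\\s*/\\s*") %>%
  group_by(Use) %>%
  summarize(number = mean(Height))

print(toy_usemean)
```

In this toy data the 2000 ft building counts toward both "hotel" (mean 1750) and "office" (mean 1500).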