Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized way. This analysis gives an overview of Airbnb homestays in Boston, MA, and the trends among them.
Problem Statement:
The goal is to analyse the vibe of Airbnb homestays in Boston, MA. This includes gauging neighbourhood sentiment, analysing listed properties by type, examining the concept of an Airbnb Superhost, and finding the words that describe the expensive listings.
Implementation:
The data was scraped, then cleaned and reshaped for the analysis. It was then reviewed graphically to determine the general vibe in the neighbourhoods.
Summary:
The analysis shows that, overall, the listings in Boston, MA give off a positive vibe. More detailed insights are summarised in the last section.
The following packages are required, listed with their use:
tidytext = converts text to and from tidy formats
DT = displays data as an interactive HTML table
tidyverse = data manipulation; works in harmony with the other packages
stringr = string operations
magrittr = pipe operator in R
leaflet = interactive Leaflet maps in R
ggplot2 = graphical representation in R
dplyr = data manipulation in R
tm = text mining
wordcloud = word cloud generator
ggmap = visualization that overlays spatial information on static maps from Google Maps
library(tidytext)
library(DT)
library(tm)
library(wordcloud)
library(tidyverse)
library(stringr)
library(magrittr)
library(leaflet)
library(ggplot2)
library(ggmap)
library(dplyr)
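Before loading the packages, it can help to confirm that they are all installed. The following is a minimal sketch in base R and is not part of the original analysis; the variable names `required_pkgs`, `is_installed` and `missing_pkgs` are illustrative, and the install step is left commented out.

```r
# Packages used in this analysis (same list as the library() calls above)
required_pkgs <- c("tidytext", "DT", "tm", "wordcloud", "tidyverse", "stringr",
                   "magrittr", "leaflet", "ggplot2", "ggmap", "dplyr")

# requireNamespace() returns TRUE if a package is installed, without attaching it
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```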
Explanation of data source: The original purpose of the data was to show people how Airbnb is really being used and how it is affecting their neighbourhoods. By analyzing publicly available information about a city’s Airbnb listings, Inside Airbnb provides filters and key metrics so people can see how Airbnb is being used to compete with the residential housing market. The data was posted on their website on 7th September 2016. The original data set has 3585 rows and 95 variables (columns). There are quite a few missing values in the dataset. They have been left blank; in other words, they have not been imputed (that is, manipulated in some way for later use) in any form. Where a value does not exist, it is either marked NA or stored as a string containing only blank spaces; the latter case is handled during data cleaning.
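As an illustration of how blank-space strings can be brought in line with NA, here is a minimal base R sketch. It is an assumption about one reasonable approach, not the cleaning code used in this analysis; the data frame `toy` and the helper `blank_to_na` are hypothetical, and the values are made up.

```r
# Toy stand-in for the listings data (made-up values, not from the real file)
toy <- data.frame(
  name  = c("Cozy loft", "   ", "Harbour view"),
  price = c("$120.00", "", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: treat empty or whitespace-only strings as missing
blank_to_na <- function(x) {
  if (is.character(x)) {
    x[!is.na(x) & trimws(x) == ""] <- NA
  }
  x
}

toy_clean <- as.data.frame(lapply(toy, blank_to_na), stringsAsFactors = FALSE)

# Number of missing values in each column after the conversion
na_counts <- colSums(is.na(toy_clean))
print(na_counts)
```

The helper only touches character columns, so it assumes the text columns were read as character rather than factor. For the empty-string case alone, passing `na.strings = c("NA", "")` to `read.csv` is another option at load time.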
Below is the data in a scrollable HTML table. Because the text in the character columns is long, the table has to be scrolled both left-right and up-down to view the data. Every row and every column is present. Specific variables can be hidden using the clickable “Column Visibility” button, and there is a search bar at the top of the table.
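# Raw CSV of the Boston listings, hosted on GitHub
myfile <- 'https://raw.githubusercontent.com/ishantnayer/Rfiles/master/listings.csv'
listing_original <- read.csv(myfile)
# Interactive table; the 'Buttons' extension adds the column-visibility (colvis) button
datatable(listing_original, extensions = 'Buttons', options = list(dom = 'Bfrtip', buttons = I('colvis')))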
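Because the table is wide, it may be more comfortable to let it scroll horizontally inside its own container and to show fewer rows per page. The sketch below shows one way to do that with the DataTables options `scrollX` and `pageLength`. It uses a small made-up data frame (`toy_listings`, hypothetical) so that it runs without downloading the full file, and it is a suggestion rather than the configuration used in this analysis.

```r
library(DT)

# Small made-up data frame standing in for listing_original
toy_listings <- data.frame(
  id            = 1:3,
  neighbourhood = c("Back Bay", "South End", "Jamaica Plain"),
  room_type     = c("Entire home/apt", "Private room", "Private room"),
  stringsAsFactors = FALSE
)

# Same Buttons/colvis setup as above, plus horizontal scrolling and a short page
toy_table <- datatable(
  toy_listings,
  extensions = 'Buttons',
  options = list(
    dom        = 'Bfrtip',
    buttons    = I('colvis'),
    scrollX    = TRUE,  # scroll wide tables left-right inside the widget
    pageLength = 5      # rows shown per page
  )
)
toy_table
```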