INTRODUCTION
As a Geographic Data Scientist for the City of Los Angeles, I used several programmatic approaches to analyze the data, including mapping socio-economic variables, querying OpenStreetMap data, and conducting a network analysis within the Compton neighborhood.
## Reading layer `neighbourhoods (1)' from data source
## `C:\Users\james\OneDrive\Desktop\GY476_Data\SummativeAssesment\neighbourhoods (1).geojson'
## using driver `GeoJSON'
## Simple feature collection with 270 features and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -118.9449 ymin: 33.29908 xmax: -117.6464 ymax: 34.82315
## Geodetic CRS: WGS 84
## Reading layer `LA_geometry' from data source
## `C:\Users\james\OneDrive\Desktop\GY476_Data\SummativeAssesment\Data_code\LA_geometry.gpkg'
## using driver `GPKG'
## Simple feature collection with 2498 features and 1 field (with 3 geometries empty)
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -118.9446 ymin: 32.80146 xmax: -117.6464 ymax: 34.8233
## Geodetic CRS: NAD83
PART A
The LA geometry GeoPackage and the American Community Survey (ACS) data share a column called ‘GEOID’, which I can use to execute a table join. In the ACS data, every column has an alphanumeric heading. According to the Census.gov website, these characters correspond to a ‘Table ID’ that denotes a particular socio-economic category. This is useful for interpreting the data, as combining these Table IDs gives me a clearer picture of the demographics the data describe.
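The table join described above can be sketched in a few lines. This is only an illustration in plain Python, not the code used for the report: the GEOID values, the `median_income` field, and the helper name `left_join_on_geoid` are all hypothetical, and geometries are stood in for by placeholder strings.

```python
# Hypothetical tract rows; real geometries would be multipolygons.
tract_geometries = [
    {"GEOID": "06037000001", "geometry": "geom_a"},
    {"GEOID": "06037000002", "geometry": "geom_b"},
    {"GEOID": "06037000003", "geometry": "geom_c"},
]

# Hypothetical ACS rows; note that one tract has no matching record.
acs_rows = [
    {"GEOID": "06037000001", "median_income": 52000},
    {"GEOID": "06037000002", "median_income": 98000},
]


def left_join_on_geoid(geometries, attributes):
    """Keep every geometry row and attach attributes where GEOID matches."""
    by_geoid = {row["GEOID"]: row for row in attributes}
    joined = []
    for geom in geometries:
        match = by_geoid.get(geom["GEOID"], {})
        # Geometry keys win on conflict; unmatched tracts keep only their own keys.
        joined.append({**match, **geom})
    return joined


joined = left_join_on_geoid(tract_geometries, acs_rows)
```

Keeping GEOID as text rather than a number matters in a join like this, because California identifiers begin with a leading zero that a numeric type would drop.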
While examining the LA_geometry GeoPackage and the neighbourhoods GeoJSON, I can see that both contain multipolygon features. These features will enable me to plot the basemap of LA, using the bounding box minimums and maximums to set the map extent. A nice feature of the GeoJSON is that it includes the neighborhood names, which come in handy for data visualization.
The Airbnb listings data is a thorough table of housing listings that includes the neighborhood of each listing, minimum stay duration, price, and latitude and longitude columns. These two coordinate columns will be crucial for transforming the .csv file into a simple feature object that I can plot. However, one limitation of this data is that some listings have a ‘null’ price, which is not helpful when plotting where the most or least expensive listings are located. Another limitation is that some prices are extremely high or low. These outliers would distort the map and give the viewer an unclear picture of what the situation is really like in LA. Because such outliers pull the mean upward, I will run the summary() function on the listings data and rely on the median price instead.
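To illustrate why the median is preferred here, the following minimal sketch drops listings with a missing price and compares the mean and median of what remains. It is written in plain Python with made-up values; the column names and the helper `load_priced_listings` are assumptions for illustration, not the actual listings schema.

```python
import csv
import io
import statistics

# Made-up sample: one listing has no price, one is an extreme outlier.
SAMPLE_CSV = """id,price,latitude,longitude
1,120,34.05,-118.25
2,,34.10,-118.30
3,95,33.90,-118.22
4,9999,34.02,-118.50
5,150,34.07,-118.40
"""


def load_priced_listings(text):
    """Parse the CSV and skip rows whose price is empty."""
    listings = []
    for row in csv.DictReader(io.StringIO(text)):
        if not row["price"].strip():
            continue  # a null price cannot be mapped or summarised
        listings.append({
            "id": row["id"],
            "price": float(row["price"]),
            "lat": float(row["latitude"]),
            "lon": float(row["longitude"]),
        })
    return listings


listings = load_priced_listings(SAMPLE_CSV)
prices = [item["price"] for item in listings]
mean_price = statistics.mean(prices)      # pulled upward by the 9999 listing
median_price = statistics.median(prices)  # barely affected by it
```

In this toy sample the single extreme listing raises the mean well above what any typical listing costs, while the median stays close to the ordinary prices.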
The Coordinate Reference System I will use is California State Plane Zone 5, or EPSG:2229. It covers southern California, which includes LA, and using this CRS will allow me to portray the geodata accurately, without major distortion or data being plotted in the wrong locations.
In Figure 1, entire homes or apartments are shown to be more expensive than a single room in a private residence or a hotel. There also appears to be a higher count of entire homes or apartments for rent than of individual rooms. This could be due to property owners trying to maximize their profits by charging a higher price for the more attractive offer of an entire home.
Geographically, the western peninsula and central areas of LA contain the most expensive properties. Wachsmuth et al. argue “that Airbnb has introduced a new potential revenue flow into housing markets which is systematic but geographically uneven…” (2018). The geographic unevenness is quite clear in Figure 1: the majority of the Airbnb properties for rent are located in the south-central areas, while properties are sparse in the northern section. The paper goes on to explain how this drives consumer preferences in Airbnb properties and how consumers gravitate towards properties in more affluent or culturally important areas (Wachsmuth et al., 2018). There is also a distinct difference in property concentration between coastal and inland areas. In line with these preferences, people will prioritize a property with a scenic view that is closer to tourist destinations over a more inland property that may lack such vistas.
I decided to exclude extreme outliers because those prices would have skewed the appearance of the data. They would have dominated the color scale and made the remaining colors hard to differentiate. Instead, I focused on the properties that a typical renter might be interested in, hence the 100 to 500 dollar range. Additionally, I thought a simple point map would be most appropriate because it shows precisely where the properties are while also showing the trend in point clustering. This also allows you to clearly see price variation within individual neighborhoods. I chose a sequential color scheme because, according to Harrower & Brewer (2003), this is best practice for a continuous data set.
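The filtering and sequential coloring described above could look roughly like the sketch below. It is a plain-Python illustration under stated assumptions: the range is treated as inclusive at both ends, the four equal-width classes and the hex colors are hypothetical, and `price_class` is a made-up helper rather than the report's actual method.

```python
# Hypothetical light-to-dark palette for a sequential scheme.
SEQUENTIAL_PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def price_class(price, low=100, high=500, n_classes=4):
    """Return a class index (0 = cheapest) or None if outside [low, high]."""
    if price < low or price > high:
        return None  # excluded as an outlier
    width = (high - low) / n_classes
    # The top of the range falls into the last class rather than a new one.
    return min(int((price - low) / width), n_classes - 1)


def price_color(price):
    """Map a price to a palette color, or None when the listing is excluded."""
    index = price_class(price)
    return None if index is None else SEQUENTIAL_PALETTE[index]


sample_prices = [99, 100, 199, 200, 350, 500, 501]
sample_classes = [price_class(p) for p in sample_prices]
```

Because the classes run from light to dark in price order, the darkest points on the map correspond to the most expensive listings within the retained range.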
Looking at Figure 2, we can see where the most expensive Airbnb properties are located within census tracts. To the west toward Malibu, in some central pockets, and towards the north, nightly rates are higher. A paper examining Airbnb consumer preferences found that renters tend to “prefer to be closer to tourist attractions and property-owner recommendations” (Sánchez-Franco et al., 2023). These expensive areas contain some of the experiences most sought after by tourists in LA, including beach access, Hollywood, and the wild nature surrounding the city. The gradient changes subtly from the coastal areas to the inland areas.
Although there are small pockets of extremely expensive Airbnb properties within the central area, there are also properties that offer a more reasonable rate. According to Islam et al., in a paper that examines the Airbnb price model, “[a]n accurate valuation model of new host listing prices is desired by both owners and renters to trade-off between owner profit and customer satisfaction” (2022). The same paper explains that “…the number of bedrooms, accommodations, property types, and the total number of reviews positively influence the listing price” (Islam et al., 2022). Additionally, “…attributes like location and size clearly matter greatly, and are taken into account by Airbnb hosts when setting prices” (Gibbs et al., 2017).
In the data classification for Figure 2, it’s important to point out that 233 census tracts show no data. I chose to portray these tracts as a separate ‘no data’ class to differentiate them from the low-priced areas.
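The distinction between "no data" and "low price" can be made explicit when aggregating listings to tracts. The sketch below is a plain-Python illustration that assumes each listing has already been assigned to a tract (the spatial join is not shown) and that the median is the per-tract summary; the identifiers, prices, and the helper `median_price_by_tract` are hypothetical.

```python
import statistics
from collections import defaultdict


def median_price_by_tract(all_geoids, listings):
    """Median listing price per tract; None marks tracts with no listings."""
    prices = defaultdict(list)
    for geoid, price in listings:
        prices[geoid].append(price)
    return {
        geoid: statistics.median(prices[geoid]) if geoid in prices else None
        for geoid in all_geoids
    }


# Hypothetical tracts and (tract, nightly price) pairs.
all_geoids = ["T1", "T2", "T3"]
listings = [("T1", 100), ("T1", 300), ("T1", 150), ("T2", 80)]

tract_medians = median_price_by_tract(all_geoids, listings)
no_data_tracts = [g for g, value in tract_medians.items() if value is None]
```

Storing a missing value instead of zero keeps empty tracts out of the lowest price class, so they can be drawn in a neutral color of their own.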
A choropleth map is appropriate for Figure 2 because I am showing continuous data aggregated to census tracts. This is useful for higher-level analysis, especially for urban and city planners. The map would also allow them to combine other demographic data, such as the race/ethnicity, income, or age of the residents within each census tract.
I chose the median income and poverty rate variables to see if there are any geographic correlations between them. Towards the coastal areas in the west, and in some northern pockets, low poverty rates appear to coincide with high median incomes. This is shown most drastically in the west: towards Malibu and Santa Monica, incomes are highest and poverty is lowest. The very south-central tip of the city appears to be the most impoverished and also has the lowest incomes.
These maps show a significant disparity between the two variables. The highest poverty is located within the central and southern areas, whereas the highest incomes occur towards the west and north. As Wachsmuth et al. put plainly, “there is fire to go with this smoke” (2018). They explain that Airbnb’s rent gap is “culturally mediated…real economic activity only exists in areas where there is strong extra-local tourism demand” (2018). This ties in exactly with what Islam et al. were explaining with consumer and tourism preferences. Higher income areas with Airbnb listings will attract more economic activity, in the form of rental tenants, than lower income areas because of their attractive surroundings. This points toward an unequal distribution of tourist attractions and general activities that Airbnb tenants might find interesting.
Regarding data quality, the ACS data is robust and has the advantage of covering all of the census tracts. It contains detailed socio-economic variables like the ones I chose. Capping the poverty rate at 50% also prevents drastic outliers from dominating the map, similar to excluding the outliers in Figure 1. For the colors, I wanted to portray as stark a contrast as possible to drive home the inequality that the map viewer would see.
This bivariate map shows the geographic relationship between household incomes and Airbnb prices. The legend in the bottom right shows four quadrants that split the data into low income and low property prices, high income and low property prices, low income and high property prices, and finally high income and high property prices.
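A four-quadrant classification of this kind can be sketched as follows. The report does not state where the low/high split is drawn, so this plain-Python illustration assumes a split at the median of each variable; the tract names, values, and the helper `bivariate_classes` are hypothetical.

```python
import statistics


def bivariate_classes(tracts):
    """Label each tract by whether income and price sit below or at/above the median."""
    income_cut = statistics.median([t["income"] for t in tracts])
    price_cut = statistics.median([t["price"] for t in tracts])
    labels = {}
    for tract in tracts:
        income_level = "high" if tract["income"] >= income_cut else "low"
        price_level = "high" if tract["price"] >= price_cut else "low"
        labels[tract["name"]] = f"{income_level} income / {price_level} price"
    return labels


# Hypothetical tracts covering all four quadrants.
tracts = [
    {"name": "A", "income": 40000, "price": 80},
    {"name": "B", "income": 90000, "price": 110},
    {"name": "C", "income": 45000, "price": 300},
    {"name": "D", "income": 120000, "price": 400},
]

quadrants = bivariate_classes(tracts)
```

Each of the four labels would then be assigned its own legend color, so that tracts sharing the same income and price combination stand out as a group.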
There is a positive correlation between high incomes and high Airbnb property prices. What I like about the bivariate map is that it shows where similar relationships occur. Because of this, it can show policy makers exactly where they need to focus their efforts in curbing potential gentrification. The same pattern appears as in the previous two figures: incomes and property prices gradually decrease as you move away from the coast and further inland. Further east of LA, higher incomes are more prominent than higher property prices. More centrally, there are clusters of higher-priced Airbnb rentals and lower incomes. Some of the largest census tracts also show the highest prices and incomes. This is because these tracts contain only one or a few Airbnb listings.
Perhaps the most interesting feature that this map reveals is the dark red areas. These locations combine cheaper Airbnb prices with extremely high incomes. According to a paper by Keren Horn and Mark Merante, “an owner…with the hope of earning extra income…may…decide to post [their room] on Airbnb rather than renting out the second bedroom, thereby impacting rental housing supply” (2017). These areas may be filled with this exact type of homeowner, one who already earns a hefty income but wants to earn more by renting out a second room. The low nightly rate may be an attempt to attract more tenants to these areas, even if they lack touristic features.