Many of you in class have expressed interest in learning R, which I highly encourage! Unfortunately, we don’t have time to teach R in this course. Instead, I’m going to start sharing short blog posts like this one that walk through examples of my own R work relevant to this class.

For the AirBnB project, our class did two rounds of round-robin presentations followed by peer feedback. Overall, evaluations improved slightly from round 1 (R1) to round 2 (R2), which is great! Unfortunately, that slight overall gain hides a split: half of the class improved and half of the class declined.

This blog post will cover reading the data, manipulating it to calculate change over time, and visualizing it using base R, dplyr, and ggplot2.

Reading the Data and Setting Up

First we need to set our working directory and load a few R packages. The working directory is where R looks first to find the data we’re using. Setting this isn’t strictly necessary, but it helps! Packages can be thought of as add-ons to base R. People who figure out new or better ways to do something in R can publish their code as a package, which makes our own analysis easier. Finally, we’ll read in the CSV file from our working directory and print the first few rows.

# Set the document to display code and outputs
knitr::opts_chunk$set(echo=TRUE)
# Specify the folder in your computer where  this data lives (this will be different for you)
setwd('/Users/joshyazman/Desktop/AirBnB Project Feedback Visualization/')
# Load up the relevant R packages
library(ggplot2)
library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union
# The read.csv() function is how we connect R to our data
reviews <- read.csv('AirBnB Project Feeback (Responses) - Anonymized Responses.csv')
# Print the top few lines of review data
head(reviews)
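
If you would rather not change your working directory, one alternative is to build the full path to the file and hand that to read.csv(). The sketch below uses a throwaway folder and a made-up two-column file (the file name and the question wording are assumptions, not the real survey export). It also shows why the column names later in this post are full of dots: read.csv() replaces spaces and punctuation in headers with periods.

```r
# Toy example: the file contents and the question wording are invented for illustration
data_dir <- tempdir()                         # stand-in for your own project folder
csv_path <- file.path(data_dir, "toy_reviews.csv")

toy <- data.frame(presenter = c("A", "B"), score = c(4, 5))
names(toy) <- c("Name of Presenter", "How effective was the presenter's argument?")
write.csv(toy, csv_path, row.names = FALSE)

# Reading with a full path works no matter what the working directory is
toy_reviews <- read.csv(csv_path)

# Spaces and punctuation in the headers have become dots
names(toy_reviews)
# "Name.of.Presenter"  "How.effective.was.the.presenter.s.argument."
```
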

Data Manipulation

Now we need to prepare the data for visualization. In our case, that means calculating an overall score for each review and then averaging that overall score for each student in each round of presentations. Here we’ll use the %>% operator, which takes whatever object is to the left of the symbol and uses it as the first input to the function that follows. For example, if a function takes a data table as its input, the call might look like this: some_function(table). But we could also write that as table %>% some_function(). The rest of this section strings together dplyr functions to create the table I’m looking for.

reviews_manipulated <- reviews%>%
  # Mutate allows us to create additional columns of data
  mutate(overall = (How.effective.was.the.presenter.s.argument.+ 
                   Please.rate.the.organization.of.the.presentation.+
                   Please.rate.the.visuals.included.in.the.presentation.)/3)%>%
  # group_by() tells dplyr which columns define the groups (like choosing the row variables in a pivot table)
  group_by(Name.of.Presenter, Presentation.Round)%>%
  # summarise() is like specifying the value in a pivot table
  summarise(mean_score = mean(overall))

We can tell if our code worked by printing out the first few rows of the resulting table.

print(head(reviews_manipulated))
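
If the pipe and the group_by()/summarise() pair still feel abstract, it can help to run them on a table small enough to check by hand. This is a minimal sketch on made-up scores (the toy data frame and its column names are assumptions, not the class data), and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Made-up scores: two presenters, two rounds, two reviews per round
toy <- data.frame(
  presenter = c("A", "A", "A", "A", "B", "B", "B", "B"),
  round     = c(1, 1, 2, 2, 1, 1, 2, 2),
  overall   = c(3, 4, 4, 5, 5, 4, 3, 4)
)

# These two lines do the same thing: the pipe passes toy in as the first argument
head(toy, 3)
toy %>% head(3)

# One row per presenter per round, holding the mean of that group's reviews
toy_means <- toy %>%
  group_by(presenter, round) %>%
  summarise(mean_score = mean(overall), .groups = "drop")

toy_means
```

Presenter A averages 3.5 in round 1 and 4.5 in round 2, while presenter B goes the other way, which is easy to confirm from the eight rows above.
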

Visualizing The Data

There are a number of ways we could display the data we’ve calculated, but I’ve decided to use a line graph with points for each round of presentations. Essentially, I need to layer a line graph on top of a scatter plot with a separate line for each student.

# The first step in ggplot is to set some characteristics that will apply to the whole graph. Then we'll add new features as we go!
# We'll start with setting the x and y axis and specifying the dataset we want to use
visual <- ggplot(data = reviews_manipulated, aes(x = paste0('Round ', Presentation.Round),
                                       y = mean_score, group = Name.of.Presenter))+
  # Add lines that connect the dots (the alpha= argument makes the lines more or less transparent)
  geom_line(alpha = .7, size = 1)+
  # Add points and set some characteristics of those points
  geom_point(size = 2, color = '#4581b2')+
  # Specify the axis labels
  labs(y = 'Average Overall Score',
       x = NULL,
       title = 'Change in Scores from Round 1 to Round 2',
       subtitle = 'Data comes from peer reviews of class project presentations in GA\'s analytics course')
# Call the object to display it
visual

I think it’s pretty cool to see how some of these scores improved from Round 1 to Round 2. Conversely, some presentations didn’t do as well in the second round. Since presentations were given to different groups of people in each round, the reviewers may have been tougher or easier on some people from round to round. Another possible explanation is that the time limit for presentations changed from 5 minutes in R1 to 3 minutes in R2, which may explain some of the change.
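
To put a number on who went up and who went down, one option is to subtract each presenter’s round 1 average from their round 2 average. This is a rough sketch on made-up averages shaped like reviews_manipulated. It assumes Presentation.Round is stored as 1 and 2 and that every presenter has exactly one row per round. The change and direction columns are names I’m introducing for illustration.

```r
library(dplyr)

# Made-up round averages in the same shape as reviews_manipulated
toy_means <- data.frame(
  Name.of.Presenter  = c("A", "A", "B", "B"),
  Presentation.Round = c(1, 2, 1, 2),
  mean_score         = c(3.5, 4.5, 4.5, 3.5)
)

toy_change <- toy_means %>%
  group_by(Name.of.Presenter) %>%
  # Round 2 average minus round 1 average for each presenter
  summarise(
    change = mean_score[Presentation.Round == 2] - mean_score[Presentation.Round == 1],
    .groups = "drop"
  ) %>%
  # Label each presenter by the sign of the change
  mutate(direction = ifelse(change > 0, "improved", "declined or flat"))

# Count how many presenters fall in each direction
toy_change %>% count(direction)
```

Swapping toy_means for reviews_manipulated should give the same kind of table for the real class, as long as those assumptions about the data hold.
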

Conclusion

Overall, presentations went really well for a first data analysis presentation. You all worked really hard on these projects and did some great work. When you read over your review scores, think about whether your score went up or down and why that happened, so you can improve for the next round. I hope you find this information useful and maybe learn a little R and a little about the data analysis process from reading this. Please feel free to reach out to Luke or me if you have any questions!
