Carvana Sentiment Analysis

Carvana Reviews Sentiment Analysis

Carvana does not have a typical review or rating for the individual seller of a car, because of the sheer volume of cars and the many different people selling them. Since anyone can sell their car to Carvana, which then resells it at a price it sets based on its own factors, customers instead leave reviews about the overall process. These reviews can be sorted several ways, including most relevant, most helpful, highest to lowest rating, lowest to highest rating, and most recent.

For this assignment, I wanted to explore three questions:

  1. Are the people who have purchased cars from Carvana speaking more positively or more negatively about the car-buying process as a whole?

  2. On what day of the week have those who purchased from Carvana been happiest with the car-buying process? On what day of the week have they been the most dissatisfied?

  3. For 3- and 5-star reviews, what are the most common words that separate people who enjoyed the process from those who did not?

Collecting the Data

To keep the data collection consistent, all three questions draw on the most recent section of the Carvana reviews, since that gives me the best opportunity to see a range of reviews from 1 to 5 stars. To also capture a date component, I analyzed the first 20 pages of reviews, which cover reviews from earlier today back to about 7 days ago. That lets me draw some conclusions about how people feel about a car-buying process that is fully online.

The number of reviews per page on Carvana is odd: the site shows about 8 reviews on the first page and about 30 on every page after that, but scraping only returns about 8 reviews per page. I thought this was because the first page only has 8 reviews, but whether or not the first page is included, each page always returns 8. To keep things consistent, I scraped the first 20 pages to get a total of 160 reviews.
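Because every page returns exactly 8 reviews, it may be worth checking that the pages are not returning the same reviews over and over. The sketch below uses a small made-up data frame (toy_scrape and its values are hypothetical, not real Carvana data) to show how the number of scraped rows can be compared with the number of distinct reviews.

```r
library(dplyr)

# Hypothetical stand-in for scraped data: the same two reviews
# show up on two different "pages"
toy_scrape <- tibble(
  Review_Title     = c("Great", "Slow title", "Great", "Slow title"),
  Review_Paragraph = c("Easy process", "Paperwork was wrong",
                       "Easy process", "Paperwork was wrong"),
  page_id          = c("pg:1", "pg:1", "pg:2", "pg:2")
)

# Total rows collected
n_rows <- nrow(toy_scrape)

# Rows left after dropping repeats of the same title and text
n_unique <-
  toy_scrape %>%
  distinct(Review_Title, Review_Paragraph) %>%
  nrow()

n_rows
n_unique
```

If the two numbers differ on the real data, some reviews were collected more than once and word counts built from them would be inflated.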

Reviews In R

The code I used to collect the reviews is shown below.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidytext)
library(gutenbergr)
library(ggwordcloud)
library(textdata)
library(rvest)

Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding
library(httr)

Attaching package: 'httr'

The following object is masked from 'package:textdata':

    cache_info
library(chromote)



set_config(user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"))

# Scrape the reviews on a single page.
# The page is read inside the function so that each URL passed in is the one
# that actually gets scraped. read_html_live() loads the page in a headless
# browser, which picks up reviews that are filled in after the page loads.
carvana_review_scrape <- function(url) {
  review_page <- read_html_live(url)

  # Elements of a review
  Stars <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>% 
    html_elements("span.bv-off-screen") %>% 
    html_text2()

  Time_stamp <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>%
    html_elements("span.bv-content-datetime-stamp") %>% 
    html_text2() 

  Review_Paragraph <-
    review_page %>% 
    html_elements("div.bv-content-summary-body-text") %>% 
    html_text2()

  Review_Title <-
    review_page %>% 
    html_elements("div.bv-content-container") %>% 
    html_elements("h3.bv-content-title") %>% 
    html_text2() 

  # One row per review
  carvana_review_df <-
    data.frame(Stars, Time_stamp, Review_Title, Review_Paragraph)

  return(carvana_review_df) 
}


# URLs for the first 20 pages of reviews
pages <-
  paste0("https://www.carvana.com/reviews?bvstate=pg:", 1:20, "/ct:r")


scrape_carvana_reviews_pages <- function(urls) {
  carvana_review_pages <- data.frame()
  
  for (i in seq_along(urls)) {
    print(paste("Collecting page", i, "of",length(urls), ":)", sep = ""))
    Sys.sleep(runif(1,5,15))
    
    carvana_review_pages <-
      carvana_review_scrape(urls[i]) %>% 
      mutate(page_id = urls[i]) %>% 
      bind_rows(carvana_review_pages)
    print(paste(urls[i], "collected", sep = " "))
    print(paste(nrow(carvana_review_pages), "total reviews collected so far!", sep = " "))
  }
  return(carvana_review_pages)
}

test_reviews <-
  scrape_carvana_reviews_pages("https://www.carvana.com/reviews?bvstate=pg:3/ct:r")
[1] "Collecting page1of1:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:3/ct:r collected"
[1] "8 total reviews collected so far!"
Carvana_reviews <- 
  scrape_carvana_reviews_pages(pages) 
[1] "Collecting page1of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:1/ct:r collected"
[1] "8 total reviews collected so far!"
[1] "Collecting page2of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:2/ct:r collected"
[1] "16 total reviews collected so far!"
[1] "Collecting page3of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:3/ct:r collected"
[1] "24 total reviews collected so far!"
[1] "Collecting page4of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:4/ct:r collected"
[1] "32 total reviews collected so far!"
[1] "Collecting page5of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:5/ct:r collected"
[1] "40 total reviews collected so far!"
[1] "Collecting page6of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:6/ct:r collected"
[1] "48 total reviews collected so far!"
[1] "Collecting page7of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:7/ct:r collected"
[1] "56 total reviews collected so far!"
[1] "Collecting page8of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:8/ct:r collected"
[1] "64 total reviews collected so far!"
[1] "Collecting page9of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:9/ct:r collected"
[1] "72 total reviews collected so far!"
[1] "Collecting page10of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:10/ct:r collected"
[1] "80 total reviews collected so far!"
[1] "Collecting page11of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:11/ct:r collected"
[1] "88 total reviews collected so far!"
[1] "Collecting page12of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:12/ct:r collected"
[1] "96 total reviews collected so far!"
[1] "Collecting page13of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:13/ct:r collected"
[1] "104 total reviews collected so far!"
[1] "Collecting page14of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:14/ct:r collected"
[1] "112 total reviews collected so far!"
[1] "Collecting page15of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:15/ct:r collected"
[1] "120 total reviews collected so far!"
[1] "Collecting page16of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:16/ct:r collected"
[1] "128 total reviews collected so far!"
[1] "Collecting page17of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:17/ct:r collected"
[1] "136 total reviews collected so far!"
[1] "Collecting page18of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:18/ct:r collected"
[1] "144 total reviews collected so far!"
[1] "Collecting page19of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:19/ct:r collected"
[1] "152 total reviews collected so far!"
[1] "Collecting page20of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:20/ct:r collected"
[1] "160 total reviews collected so far!"
view(Carvana_reviews)

Analysis

Question 1

Are the people who have purchased cars from Carvana speaking more positively or more negatively about the car-buying process as a whole?

Based on the 160 reviews I collected, which span the week before I scraped them, this visualization shows that people speak very positively of Carvana, largely because of the processes it has in place.
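One way to put a number on "more positive or more negative" is to compare the share of positive and negative words. The sketch below is only an illustration: toy_reviews and toy_lexicon are made up, and the tiny lexicon stands in for the Bing lexicon so the example runs on its own.

```r
library(dplyr)
library(tidytext)

# Hypothetical reviews and a tiny hand-made sentiment lexicon
toy_reviews <- tibble(
  review_id        = 1:3,
  Review_Paragraph = c("Easy and quick process",
                       "Wrong paperwork, slow delivery",
                       "Great and easy")
)

toy_lexicon <- tibble(
  word      = c("easy", "quick", "great", "wrong", "slow"),
  sentiment = c("positive", "positive", "positive", "negative", "negative")
)

# Split each review into words, keep only words found in the lexicon,
# then count the words in each sentiment and turn counts into shares
sentiment_share <-
  toy_reviews %>%
  unnest_tokens(word, Review_Paragraph) %>%
  inner_join(toy_lexicon, by = "word") %>%
  count(sentiment) %>%
  mutate(share = n / sum(n))

sentiment_share
```

On the real data, the same pipeline with the Bing lexicon in place of toy_lexicon would give an overall positive share to report alongside the plot.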

library(tidyverse)
library(tidytext)
library(gutenbergr)
library(ggwordcloud)
library(textdata)
library(rvest)
library(httr)

set_config(user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"))

# Scrape the reviews on a single page.
# read_html() only sees the static page, which returned no reviews, so the
# page is loaded with read_html_live() inside the function instead.
carvana_review_scrape <- function(url) {
  review_page <- read_html_live(url)

  # Elements of a review
  Stars <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>% 
    html_elements("span.bv-off-screen") %>% 
    html_text2()

  Time_stamp <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>%
    html_elements("span.bv-content-datetime-stamp") %>% 
    html_text2()

  Review_Title <-
    review_page %>% 
    html_elements("div.bv-content-container") %>% 
    html_elements("h3.bv-content-title") %>% 
    html_text2()

  Review_Paragraph <-
    review_page %>% 
    html_elements("div.bv-content-summary-body-text") %>% 
    html_text2()

  # One row per review
  carvana_review_df <-
    data.frame(Stars, Time_stamp, Review_Title, Review_Paragraph)

  return(carvana_review_df) 
}


# URLs for the first 20 pages of reviews
pages <-
  paste0("https://www.carvana.com/reviews?bvstate=pg:", 1:20, "/ct:r")


scrape_carvana_reviews_pages <- function(urls) {
  carvana_review_pages <- data.frame()
  
  for (i in seq_along(urls)) {
    print(paste("Collecting page", i, "of",length(urls), ":)", sep = ""))
    Sys.sleep(runif(1,5,15))
    
    carvana_review_pages <-
      carvana_review_scrape(urls[i]) %>% 
      mutate(page_id = urls[i]) %>% 
      bind_rows(carvana_review_pages)
    print(paste(urls[i], "collected", sep = " "))
    print(paste(nrow(carvana_review_pages), "total reviews collected so far!", sep = " "))
  }
  return(carvana_review_pages)
}


Carvana_reviews <- 
  scrape_carvana_reviews_pages(pages) 

bing <- 
  get_sentiments("bing")

# Split each review into words, keep the words that appear in the Bing
# lexicon, and count how often each one is used
Carvana_counts <-
  Carvana_reviews %>% 
  unnest_tokens(word, Review_Paragraph) %>% 
  inner_join(bing, by = "word", relationship = "many-to-many") %>% 
  count(word, sentiment)

# Negative words are plotted below zero, positive words above
Carvana_counts %>% 
  filter(n >= 5) %>% 
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>% 
  mutate(word = reorder(word, n)) %>% 
  ggplot(aes(word, n)) +
  geom_col() +
  coord_flip() +
  geom_text(aes(label = signif(n, digits = 3)), nudge_y = 8) +
  labs(title = "Positive and Negative Words for Carvana",
       subtitle = "Only words appearing at least 5 times are shown")

Question 2

On what day of the week are those who purchased from Carvana most happy with the car buying process? What day of the week have they been the most dissatisfied?

Looking at when people wrote their reviews, the visualization gives no indication that people are angrier or happier on any particular day. Sentiment is similar throughout: either Carvana's customers are generally very happy with the process, or the people who leave reviews are mostly those who were happy with buying or selling their car.
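To group reviews by day of the week, each time stamp first has to become a date. The sketch below assumes the time stamps are relative text such as "3 days ago" (an assumption about the format, not something confirmed here) and that the scrape date is known. relative_to_date(), toy_stamps, and scraped_on are hypothetical names used only for illustration.

```r
library(dplyr)
library(stringr)
library(lubridate)

# Convert relative time stamps (assumed format: "3 days ago", "a day ago",
# "5 hours ago", "2 weeks ago") into calendar dates
relative_to_date <- function(stamp, scraped_on) {
  n <- as.integer(str_extract(stamp, "\\d+"))
  n <- ifelse(is.na(n), 1L, n)        # "a day ago" has no digit, so treat as 1
  days_back <- case_when(
    str_detect(stamp, "day")  ~ n,
    str_detect(stamp, "week") ~ n * 7L,
    TRUE                      ~ 0L    # hours or minutes ago: same day
  )
  scraped_on - days_back
}

# Hypothetical scrape date and time stamps
scraped_on <- as.Date("2024-03-15")
toy_stamps <- tibble(Time_stamp = c("3 days ago", "a day ago", "5 hours ago"))

toy_dates <-
  toy_stamps %>%
  mutate(review_date = relative_to_date(Time_stamp, scraped_on),
         weekday     = wday(review_date, label = TRUE))

toy_dates
```

With a weekday column in place, sentiment counts can be grouped by weekday instead of by the raw time stamp text.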

library(tidyverse)
library(tidytext)
library(gutenbergr)
library(ggwordcloud)
library(textdata)
library(rvest)
library(httr)
library(chromote)



set_config(user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"))

# Scrape the reviews on a single page.
# The page is read inside the function so that each URL passed in is the one
# that actually gets scraped.
carvana_review_scrape <- function(url) {
  review_page <- read_html_live(url)

  # Elements of a review
  Stars <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>% 
    html_elements("span.bv-off-screen") %>% 
    html_text2()

  Time_stamp <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>%
    html_elements("span.bv-content-datetime-stamp") %>% 
    html_text2() 

  Review_Paragraph <-
    review_page %>% 
    html_elements("div.bv-content-summary-body-text") %>% 
    html_text2()

  Review_Title <-
    review_page %>% 
    html_elements("div.bv-content-container") %>% 
    html_elements("h3.bv-content-title") %>% 
    html_text2() 

  # One row per review
  carvana_review_df <-
    data.frame(Stars, Time_stamp, Review_Title, Review_Paragraph)

  return(carvana_review_df) 
}

# URLs for the first 20 pages of reviews
pages <-
  paste0("https://www.carvana.com/reviews?bvstate=pg:", 1:20, "/ct:r")


scrape_carvana_reviews_pages <- function(urls) {
  carvana_review_pages <- data.frame()
  
  for (i in seq_along(urls)) {
    print(paste("Collecting page", i, "of",length(urls), ":)", sep = ""))
    Sys.sleep(runif(1,5,15))
    
    carvana_review_pages <-
      carvana_review_scrape(urls[i]) %>% 
      mutate(page_id = urls[i]) %>% 
      bind_rows(carvana_review_pages)
    print(paste(urls[i], "collected", sep = " "))
    print(paste(nrow(carvana_review_pages), "total reviews collected so far!", sep = " "))
  }
  return(carvana_review_pages)
}

Carvana_reviews <- 
  scrape_carvana_reviews_pages(pages) 
[1] "Collecting page1of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:1/ct:r collected"
[1] "8 total reviews collected so far!"
[1] "Collecting page2of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:2/ct:r collected"
[1] "16 total reviews collected so far!"
[1] "Collecting page3of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:3/ct:r collected"
[1] "24 total reviews collected so far!"
[1] "Collecting page4of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:4/ct:r collected"
[1] "32 total reviews collected so far!"
[1] "Collecting page5of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:5/ct:r collected"
[1] "40 total reviews collected so far!"
[1] "Collecting page6of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:6/ct:r collected"
[1] "48 total reviews collected so far!"
[1] "Collecting page7of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:7/ct:r collected"
[1] "56 total reviews collected so far!"
[1] "Collecting page8of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:8/ct:r collected"
[1] "64 total reviews collected so far!"
[1] "Collecting page9of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:9/ct:r collected"
[1] "72 total reviews collected so far!"
[1] "Collecting page10of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:10/ct:r collected"
[1] "80 total reviews collected so far!"
[1] "Collecting page11of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:11/ct:r collected"
[1] "88 total reviews collected so far!"
[1] "Collecting page12of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:12/ct:r collected"
[1] "96 total reviews collected so far!"
[1] "Collecting page13of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:13/ct:r collected"
[1] "104 total reviews collected so far!"
[1] "Collecting page14of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:14/ct:r collected"
[1] "112 total reviews collected so far!"
[1] "Collecting page15of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:15/ct:r collected"
[1] "120 total reviews collected so far!"
[1] "Collecting page16of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:16/ct:r collected"
[1] "128 total reviews collected so far!"
[1] "Collecting page17of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:17/ct:r collected"
[1] "136 total reviews collected so far!"
[1] "Collecting page18of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:18/ct:r collected"
[1] "144 total reviews collected so far!"
[1] "Collecting page19of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:19/ct:r collected"
[1] "152 total reviews collected so far!"
[1] "Collecting page20of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:20/ct:r collected"
[1] "160 total reviews collected so far!"
bing <- 
  get_sentiments("bing")

# Count positive and negative words for each time stamp
Carvana_Time <-
  Carvana_reviews %>% 
  unnest_tokens(word, Review_Paragraph) %>% 
  inner_join(bing, by = "word", relationship = "many-to-many") %>% 
  count(Time_stamp, sentiment)

# Negative word counts are plotted below zero, positive counts above
Carvana_Time %>% 
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>% 
  ggplot(aes(Time_stamp, n, fill = sentiment)) +
  geom_col() +
  coord_flip() +
  labs(title = "Positive and negative words in Carvana reviews by when the review was written",
       subtitle = "Counts of Bing lexicon words, grouped by the review's time stamp")

Question 3

For 3- and 5-star reviews, what are the most common words that separate people who enjoyed the process from those who did not?

For these reviews, some of the most common words are easy, fair, quick, and timely. Even the more negative reviews echoed these same words; the complaints were mostly that some of the contractual details were incorrect, which reviewers described with words like not correct and wrong.
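Since the same words show up in both groups, raw counts alone may not separate them well. One option, sketched below on made-up data, is tf-idf from tidytext, which scores a word as zero when it appears in every group and higher when it is specific to one. This is an alternative technique shown for illustration, not the method used in this analysis, and toy_reviews is hypothetical.

```r
library(dplyr)
library(tidytext)

# Hypothetical reviews for two star ratings
toy_reviews <- tibble(
  Stars            = c(5, 5, 3, 3),
  Review_Paragraph = c("easy quick process",
                       "easy fair price",
                       "easy process wrong title",
                       "wrong paperwork")
)

# Count words within each star rating, then score how distinctive
# each word is for its rating (each rating is treated as one document)
distinctive_words <-
  toy_reviews %>%
  unnest_tokens(word, Review_Paragraph) %>%
  count(Stars, word) %>%
  bind_tf_idf(word, Stars, n) %>%
  arrange(desc(tf_idf))

distinctive_words
```

Words such as easy and process, which appear under both ratings in the toy data, score zero, while wrong only appears under 3 stars and scores above zero.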

library(tidyverse)
library(tidytext)
library(gutenbergr)
library(ggwordcloud)
library(textdata)
library(rvest)
library(httr)
library(chromote)



set_config(user_agent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"))

# Scrape the reviews on a single page.
# The page is read inside the function so that each URL passed in is the one
# that actually gets scraped.
carvana_review_scrape <- function(url) {
  review_page <- read_html_live(url)

  # Elements of a review
  Stars <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>% 
    html_elements("span.bv-off-screen") %>% 
    html_text2()

  Time_stamp <-
    review_page %>% 
    html_elements("div.bv-content-header-meta") %>%
    html_elements("span.bv-content-datetime-stamp") %>% 
    html_text2() 

  Review_Paragraph <-
    review_page %>% 
    html_elements("div.bv-content-summary-body-text") %>% 
    html_text2()

  Review_Title <-
    review_page %>% 
    html_elements("div.bv-content-container") %>% 
    html_elements("h3.bv-content-title") %>% 
    html_text2() 

  # One row per review
  carvana_review_df <-
    data.frame(Stars, Time_stamp, Review_Title, Review_Paragraph)

  return(carvana_review_df) 
}

# URLs for the first 20 pages of reviews
pages <-
  paste0("https://www.carvana.com/reviews?bvstate=pg:", 1:20, "/ct:r")


scrape_carvana_reviews_pages <- function(urls) {
  carvana_review_pages <- data.frame()
  
  for (i in seq_along(urls)) {
    print(paste("Collecting page", i, "of",length(urls), ":)", sep = ""))
    Sys.sleep(runif(1,5,15))
    
    carvana_review_pages <-
      carvana_review_scrape(urls[i]) %>% 
      mutate(page_id = urls[i]) %>% 
      bind_rows(carvana_review_pages)
    print(paste(urls[i], "collected", sep = " "))
    print(paste(nrow(carvana_review_pages), "total reviews collected so far!", sep = " "))
  }
  return(carvana_review_pages)
}


Carvana_reviews <- 
  scrape_carvana_reviews_pages(pages) 
[1] "Collecting page1of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:1/ct:r collected"
[1] "8 total reviews collected so far!"
[1] "Collecting page2of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:2/ct:r collected"
[1] "16 total reviews collected so far!"
[1] "Collecting page3of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:3/ct:r collected"
[1] "24 total reviews collected so far!"
[1] "Collecting page4of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:4/ct:r collected"
[1] "32 total reviews collected so far!"
[1] "Collecting page5of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:5/ct:r collected"
[1] "40 total reviews collected so far!"
[1] "Collecting page6of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:6/ct:r collected"
[1] "48 total reviews collected so far!"
[1] "Collecting page7of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:7/ct:r collected"
[1] "56 total reviews collected so far!"
[1] "Collecting page8of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:8/ct:r collected"
[1] "64 total reviews collected so far!"
[1] "Collecting page9of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:9/ct:r collected"
[1] "72 total reviews collected so far!"
[1] "Collecting page10of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:10/ct:r collected"
[1] "80 total reviews collected so far!"
[1] "Collecting page11of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:11/ct:r collected"
[1] "88 total reviews collected so far!"
[1] "Collecting page12of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:12/ct:r collected"
[1] "96 total reviews collected so far!"
[1] "Collecting page13of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:13/ct:r collected"
[1] "104 total reviews collected so far!"
[1] "Collecting page14of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:14/ct:r collected"
[1] "112 total reviews collected so far!"
[1] "Collecting page15of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:15/ct:r collected"
[1] "120 total reviews collected so far!"
[1] "Collecting page16of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:16/ct:r collected"
[1] "128 total reviews collected so far!"
[1] "Collecting page17of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:17/ct:r collected"
[1] "136 total reviews collected so far!"
[1] "Collecting page18of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:18/ct:r collected"
[1] "144 total reviews collected so far!"
[1] "Collecting page19of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:19/ct:r collected"
[1] "152 total reviews collected so far!"
[1] "Collecting page20of20:)"
[1] "https://www.carvana.com/reviews?bvstate=pg:20/ct:r collected"
[1] "160 total reviews collected so far!"
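bing <- 
  get_sentiments("bing")

# Stars is scraped as text; parse_number() pulls out the first number in it,
# which is assumed here to be the rating
Carvana_Stars <-
  Carvana_reviews %>% 
  mutate(Stars = parse_number(Stars)) %>% 
  filter(Stars %in% c(3, 5)) %>% 
  unnest_tokens(word, Review_Paragraph) %>% 
  inner_join(bing, by = "word", relationship = "many-to-many") %>% 
  count(Stars, word, sentiment)


# One panel per star rating; negative words are plotted below zero
Carvana_Stars %>% 
  filter(n >= 5) %>% 
  mutate(n = ifelse(sentiment == "negative", -n, n)) %>% 
  mutate(word = reorder(word, n)) %>% 
  ggplot(aes(word, n)) +
  geom_col() +
  coord_flip() +
  facet_wrap(~ Stars, scales = "free_y") +
  geom_text(aes(label = signif(n, digits = 3)), nudge_y = 8) +
  labs(title = "Positive and Negative Words for 3 and 5 star Carvana Reviews",
       subtitle = "Only words appearing at least 5 times are shown")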

Conclusion

The Carvana data gives access to a multitude of things. What I decided to focus on was people's overall rating of this new experience of purchasing a car fully online. This was important to me because it is intriguing to buy a car strictly from your house without having to deal with all the negotiating. From the reviews I scraped, you can see that the vast majority of people were very happy with this process.