Assignment 7: Acura vs. Lexus

Author

Emma Black

Introduction

I am analyzing the most popular luxury SUVs on the market (according to the car-rating website Edmunds.com) to find the best one to purchase after graduation. Two of the most reliable and highly rated SUV brands in my price range are Lexus and Acura, so I will compare reviews of one of each brand's most popular entry-level SUV models: the Acura RDX and the Lexus NX.

Note: I will refer to a positivity value throughout the analysis. This is a calculated field based on the overall positivity or negativity of the words used to write a review. A negative positivity value indicates a negative review.
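As a rough illustration of how a positivity value of this kind can be computed, the sketch below sums per-word scores from a small sentiment lexicon. The lexicon, its scores, and the function name score_review are made up for illustration; they are assumptions, not the lexicon or code behind this report.

```r
# Hypothetical mini-lexicon: word -> sentiment score (NOT the lexicon used in the report)
lexicon <- c(love = 3, great = 3, comfortable = 2, noisy = -1, terrible = -3)

# Sum the scores of every lexicon word that appears in a review.
score_review <- function(text) {
  # Lowercase, then split on anything that is not a letter or apostrophe
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  # Keep only words that have a score; unscored words contribute nothing
  scored <- lexicon[words[words %in% names(lexicon)]]
  sum(scored)
}

score_review("Love the great, comfortable seats")  # positive total
score_review("Terrible and noisy transmission")    # negative total
```

Because the scores are summed, a review with more scoreable words can reach a larger absolute value than a short one.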

Question 1: Is there a correlation between the month and the number of reviews published?

My hypothesis is that there will be more reviews around December and January, as these are generally the most popular times to purchase a car, and people seem likely to review a car while it is still relatively new to them.

Analysis

New model years are typically released in December, which, coupled with Christmas and promotions (such as the Lexus December to Remember Sales Event), makes December one of the most popular months to purchase a car. It is reasonable to infer that people are more likely to review cars they have recently bought, which would cause a spike in car reviews around December, January, and February. While this pattern generally holds, the two brands differ: Lexus has its most significant spike in reviews in December, while Acura’s largest spike happens in May. This suggests either that variables other than how recently the car was bought affect when a consumer chooses to write a review, or that Acura has particularly effective sales events in “off” months that Lexus does not have.
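A monthly comparison like this one can be sketched in base R as follows. The data frame, its column names (car_model, review_date), and the dates are all hypothetical stand-ins, not the report's data.

```r
# Hypothetical review data: one row per review (dates are made up)
reviews <- data.frame(
  car_model = c("acura", "acura", "acura", "lexus", "lexus", "lexus"),
  review_date = as.Date(c("2019-05-03", "2019-05-20", "2019-12-11",
                          "2018-12-02", "2019-12-15", "2019-01-09"))
)

# Extract the month number (1-12), ignoring the year
reviews$month <- as.integer(format(reviews$review_date, "%m"))

# Cross-tabulate: rows are car models, columns are months 1 through 12
monthly_counts <- table(reviews$car_model, factor(reviews$month, levels = 1:12))
monthly_counts

# Month with the most reviews for each car
peak_month <- apply(monthly_counts, 1, function(x) as.integer(names(which.max(x))))
peak_month
```

Fixing the factor levels to 1:12 keeps months with zero reviews in the table, so the two cars can be compared column by column.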

Question 2: Is one car reviewed more positively than the other?

I examined the distribution of the overall positivity scores of each car to understand where the majority of reviews fell and how outliers impacted the scores.

# A tibble: 2 × 2
  car_model median_positivity
  <chr>                 <dbl>
1 acura                   5  
2 lexus                   5.5

Analysis

Both cars have very similar median positivity values: 5 for the Acura and 5.5 for the Lexus. Outliers can likely be attributed to unusually long reviews, which contain more scoreable words. The similarity in positivity is unsurprising, as both cars have very similar overall ratings on the Edmunds website: 4.1 out of 5 stars for the Acura and 4.2 out of 5 stars for the Lexus.
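To show why the median is a sensible summary when a few long reviews produce outliers, here is a minimal base R sketch. The scores are invented for illustration and the column names are assumptions. The table above is tibble output, while this sketch uses aggregate() so that it runs without extra packages.

```r
# Hypothetical positivity scores (made up; not the report's data)
scores <- data.frame(
  car_model = c("acura", "acura", "acura", "lexus", "lexus", "lexus"),
  positivity = c(2, 4, 30, 3, 6, 8)
)

# Median positivity per car model
median_by_car <- aggregate(positivity ~ car_model, data = scores, FUN = median)
median_by_car

# Mean positivity per car model, for comparison
mean_by_car <- aggregate(positivity ~ car_model, data = scores, FUN = mean)
mean_by_car
```

In this toy data the single score of 30 pulls the Acura mean well above its median, while the median stays close to the typical review.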

Question 3: Is the positivity value an accurate reflection of the reviewer’s feelings about the car?

To answer this question, I compared the star rating that each reviewer assigned the car with the calculated positivity value to see whether a higher star rating corresponds to a higher positivity score.

Analysis

There appears to be a general trend: the higher the star rating a reviewer gave, the more positive the language they used in their review. This suggests that the results of the text analysis are generally reliable. However, the margins of error overlap, so a formal test such as an ANOVA is needed to determine whether the differences between star ratings are statistically significant.
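A one-way ANOVA of positivity by star rating could look like the sketch below, using aov() from base R's stats package. The data and column names (stars, positivity) are hypothetical, and this analysis was not run on the actual reviews.

```r
# Hypothetical data: star rating and positivity score per review (made up)
ratings <- data.frame(
  stars = c(1, 1, 1, 3, 3, 3, 5, 5, 5),
  positivity = c(-4, -1, 0, 2, 3, 5, 6, 9, 12)
)

# Treat the star rating as a categorical grouping variable
fit <- aov(positivity ~ factor(stars), data = ratings)
summary(fit)

# p-value for the star-rating effect
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
p_value
```

A small p-value would indicate that mean positivity differs between at least two star ratings. It would not say which ones differ, so pairwise comparisons would be a separate step.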

Question 4: What are the top words used to describe each car? Do they differ between the two cars?

Analysis

Note: the word “5” was used 391 times (not 3) in the Acura reviews, but the count was cut off.

The top 10 words used to describe the two cars were extremely similar, with “5”, “stars”, and “car” being the top 3 words for both. The shared words highlight the general qualities of a car that reviewers value, such as technology, comfort, and reliability.
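For reference, a top-words count of this kind can be sketched in base R as shown below. The review texts and the short stop-word list are invented for illustration; a real analysis would use a fuller stop-word list.

```r
# Hypothetical review texts (made up for illustration)
reviews <- c(
  "5 stars. Love the car and the technology.",
  "Comfortable car, great technology, 5 stars",
  "The car is reliable and comfortable"
)

# Tiny illustrative stop-word list (an assumption, not the list used in the report)
stop_words <- c("the", "and", "is", "a")

# Lowercase and split into words, keeping digits so that "5" survives as a word
words <- unlist(strsplit(tolower(reviews), "[^a-z0-9']+"))
words <- words[words != "" & !(words %in% stop_words)]

# Count each word and list the most frequent ones
word_counts <- sort(table(words), decreasing = TRUE)
head(word_counts, 3)
```

Keeping digits in the split pattern is what lets a token like “5” (from “5 stars”) show up among the top words.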