SUV Analysis

Author

Emma Black

Introduction

I am analyzing the most popular SUVs on the market (according to car rating website Edmunds.com) with the intention of finding the best one to purchase after graduation. I will be examining variables such as price, MPG, value, technology, owner ratings, and expert ratings to gain a comprehensive understanding of each vehicle.

Data Dictionary

Click here to access the specific data set I obtained for this analysis.

Variable Name Description
car_name year, make, and model of a car
car_price an average of the given range of MSRP values, reflecting different trims
cost_to_drive the estimated monthly fuel cost, assuming the car is driven primarily in Ohio for 15,000 miles per year, with 55% of those miles in the city and 45% on the highway
owner_stars the average star rating out of 5 given in reviews of real owners
num_owner_reviews the number of owner reviews posted for the car
total_rating the rating out of 10 that the experts at Edmunds assessed for the overall vehicle
mpg miles per gallon
tech_rate the rating out of 10 that experts at Edmunds assessed for the vehicle’s technology features
interior_rate the rating out of 10 that experts at Edmunds assessed for the vehicle’s interior quality
value_rate the rating out of 10 that experts at Edmunds assigned based on the quality and amount of features for the given vehicle price

Part 1

Summary Statistics

Transposed Summary Statistics
variable cost_to_drive owner_stars num_reviews mpg car_price
avg 198.32558 3.9688889 37.40909 22.44186 77322.71
median 197.00000 4.0000000 37.50000 22.00000 61975.00
sd 60.06137 0.5107135 22.05313 4.62036 48551.63
min 116.00000 2.9000000 2.00000 13.00000 26200.00
max 368.00000 5.0000000 98.00000 30.00000 224800.00
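
As a minimal sketch of how each column of the table above could be computed, the Python snippet below summarizes one variable. It assumes the `sd` row is the sample standard deviation; the `summarize` helper is a hypothetical name, and the input is only the six `car_price` values quoted elsewhere in this report, so the results will not match the full-data-set figures above.

```python
import statistics

def summarize(values):
    """Return the five statistics shown in the summary table for one column."""
    return {
        "avg": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1), an assumption
        "min": min(values),
        "max": max(values),
    }

# car_price for the six cars whose prices are quoted in this report
# (a small subset, not the full data set)
prices = [52000, 45250, 44788, 39690, 49250, 51572]

summary = summarize(prices)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

Running the same helper over every numeric column and then transposing the result would give a table in the layout shown above.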

Which brand has the most cars on the list?

Because the list consists of the top 3 SUVs in each sub-category (such as Small 3 Row and Midsize Luxury), the brand with the most cars on the list is likely a brand that consistently produces quality vehicles.

Analysis

Mercedes is the clear front-runner with 7 cars on the list, compared to 4 for Audi, the next highest. I found it more useful to compare the number of cars from each brand that made the list than to compare mean or median ratings, because every car on the list is considered among the best in its sub-category. As a result, the brands' mean and median Edmunds total ratings differ very little.

Is the overall rating from Edmunds experts aligned with the owner ratings?

While the experts at Edmunds likely have a lot of technical knowledge of what makes a “good” car, I myself am not a car enthusiast and likely don’t prioritize all the same features in a car that experts do. I feel that the opinions of everyday owners who drive the cars regularly would more accurately predict how I might rate a car.

Analysis

At an aggregate level, the median owner rating and the median expert rating are virtually the same (8 vs. 8.1, with owner stars doubled to match the experts’ 10-point scale), but the owner ratings vary far more. This makes sense, as everyday consumers are likely to differ more in their standards and preferences than experts do. Additionally, there are more total owner reviews than expert reviews, which leaves more opportunity for variation among owners. At the same time, as the law of large numbers (rather than the central limit theorem) suggests, a statistic computed from this larger sample is more likely to sit close to the true value for owners as a whole.

Which car has the best value and how much does it cost?

Cars with the Highest Value Rate
car_name value_rate car_price
2025 Genesis GV70 8.5 52000
2024 Hyundai Palisade 8.5 45250
2025 Kia Telluride 8.5 44788
2025 Kia Sorento 8.5 39690

Median Car Price and Value Rate
median_car_price median_value_rate
61975 7.5

Analysis

The four cars tied for the highest value all have a value rating one full point above the median (8.5 vs. 7.5) and prices well below the median of $61,975. It’s also worth noting that two of the cars tied for best value are Kias, suggesting that Kia might be a more budget-friendly alternative to Mercedes, which has the most total cars on the list.
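
The comparison against the medians can be checked with a short Python sketch. It uses only figures that appear in the tables of this report; the two Part 2 cars (Acura RDX and Lexus NX) are included solely so the filter has something to exclude, and `best_value` is a hypothetical helper name.

```python
# (car_name, value_rate, car_price) as reported in the tables of this report
cars = [
    ("2025 Genesis GV70", 8.5, 52000),
    ("2024 Hyundai Palisade", 8.5, 45250),
    ("2025 Kia Telluride", 8.5, 44788),
    ("2025 Kia Sorento", 8.5, 39690),
    ("2024 Acura RDX", 8.0, 49250),
    ("2025 Lexus NX", 8.0, 51572),
]

# Medians over the full list, taken from the table above
MEDIAN_CAR_PRICE = 61975
MEDIAN_VALUE_RATE = 7.5

def best_value(rows):
    """Keep every car tied for the highest value_rate."""
    top_rate = max(rate for _, rate, _ in rows)
    return [row for row in rows if row[1] == top_rate]

best = best_value(cars)
for name, rate, price in best:
    print(f"{name}: +{rate - MEDIAN_VALUE_RATE:.1f} value points, "
          f"${MEDIAN_CAR_PRICE - price:,} below the median price")
```

Filtering on the maximum rather than taking a fixed "top N" keeps all tied cars, which matters here because four cars share the top rating.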

Is there a correlation between MPG and price?

Analysis

Yes, there is a negative correlation between price and MPG: the more expensive SUVs on the list tend to get fewer miles per gallon. This is likely because performance vehicles (which tend to be more expensive) often prioritize power over fuel efficiency.
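
One way to put a number on this relationship is the Pearson correlation coefficient, where a value below zero indicates a negative correlation. The Python sketch below is only an illustration: `pearson_r` is a hypothetical helper, and the MPG and price values are made-up placeholders, not rows from the data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Placeholder values for illustration only (NOT taken from the data set)
mpg = [30, 26, 22, 18, 14]
car_price = [30000, 42000, 50000, 75000, 120000]

r = pearson_r(mpg, car_price)
print(f"r = {r:.2f}")  # a negative r means price rises as MPG falls
```

Because `car_price` is strongly right-skewed in this data set (the mean is well above the median), a rank-based measure such as Spearman's correlation may be a reasonable alternative.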

Is there a correlation between tech rating and price?

Analysis

Yes, it appears that the better the tech is in a car, the higher the price. However, it is worth noting that cars with a tech rating of 9 have a wide range of prices, meaning that it is possible to get a car with high-quality tech without breaking the bank.

Part 2

Two very comparable SUV brands in my price range are Lexus and Acura, so I will be comparing owner reviews sourced from Edmunds.com of two of their most popular SUV models: the Acura RDX and the Lexus NX.

Note: I will refer to a positivity value throughout the analysis. This is a calculated field based on the overall positivity or negativity of the words used to write a review. A negative positivity value indicates a negative review.
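
To make the positivity value concrete, here is a minimal Python sketch of a lexicon-based score. It assumes positivity is the sum of per-word sentiment scores; the lexicon actually used for this analysis is not stated, so `TOY_LEXICON`, its scores, and the `positivity` helper are all hypothetical.

```python
import re

# Hypothetical mini-lexicon: positive words score above 0, negative words below 0.
# The real analysis would use a full sentiment lexicon.
TOY_LEXICON = {
    "love": 3,
    "great": 3,
    "comfortable": 2,
    "problem": -2,
    "bad": -3,
    "terrible": -3,
}

def positivity(review, lexicon=TOY_LEXICON):
    """Sum the scores of every scoreable word in a review; unknown words count as 0."""
    words = re.findall(r"[a-z0-9']+", review.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(positivity("Great car, love the comfortable seats"))  # positive total
print(positivity("Terrible transmission, bad problem"))     # negative total
```

Under this assumption the score is a sum rather than an average, so a longer review with more scoreable words can reach a more extreme value than a short one with the same tone.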

Click here to access the exact data set I used.

Total Ratings for Lexus NX and Acura RDX
car_name car_price cost_to_drive owner_stars num_owner_reviews total_rating mpg tech_rate interior_rate value_rate brand owner_stars_2x
2024 Acura RDX 49250 206 4.1 39 7.9 23 8.5 7.5 8 Acura 8.2
2025 Lexus NX 51572 121 4.2 31 7.8 28 8.0 7.5 8 Lexus 8.4

Exploration: What are the top words used to describe each car? Do they differ between the two cars?

Analysis

Note: the word “5” for the Acura was used 391 times (not 3), but the count was cut off.

The top 10 words used to describe the two cars were extremely similar, with “5”, “stars”, and “car” being the top 3 words for both. The remaining common words highlight the general qualities of a car that reviewers value, such as technology, comfort, and reliability.
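
A top-words list like this one can be produced by splitting each review into words, dropping filler words, and counting what remains. The Python sketch below is illustrative only: the sample reviews are invented, and `STOP_WORDS` and `top_words` are hypothetical names rather than the ones used in the analysis.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real analysis would use a fuller one
STOP_WORDS = {"the", "a", "an", "and", "is", "i", "it", "this", "to", "of", "my"}

def top_words(reviews, n=10):
    """Return the n most frequent non-stop-words across all reviews."""
    counts = Counter()
    for review in reviews:
        for word in re.findall(r"[a-z0-9']+", review.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)

# Invented reviews for illustration (NOT from the data set)
sample_reviews = [
    "5 stars, great car",
    "Comfortable car, 5 stars",
    "Reliable car",
]

for word, count in top_words(sample_reviews, n=3):
    print(word, count)
```

Because digits are kept as tokens, a phrase like "5 stars" contributes both "5" and "stars" to the counts, which is consistent with those two words ranking so highly for both cars.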

Is there a correlation between the month and the number of reviews published?

My hypothesis is that there would be more reviews around December and January, as these are generally the most popular times to purchase a car and it seems likely that people would review the car while it is relatively new to them.

Analysis

New model years are typically released in December, which, coupled with Christmas and promotions (such as the Lexus December to Remember Sales Event), makes December one of the most popular months to purchase a car. It is reasonable to infer that people are more likely to review cars that they have recently bought, which would cause a spike in car reviews around December, January, and February. This pattern generally holds, but the two brands differ: Lexus has its most significant spike in reviews in December, while Acura’s largest spike happens in May. This suggests either that variables other than how recently the car was bought affect when a consumer chooses to post a review, or that Acura has particularly effective sales events in “off” months that Lexus does not have.

Is one car reviewed more positively than the other?

I examined the distribution of the overall positivity scores of each car to understand where the majority of reviews fell and how outliers impacted the scores.

# A tibble: 2 × 2
  car_model median_positivity
  <chr>                 <dbl>
1 acura                   5  
2 lexus                   5.5

Analysis

Both cars have very similar median positivity values, with Acura having a median value of 5 and Lexus having a median value of 5.5. Outliers can likely be attributed to excessively long reviews, which would have a higher number of scoreable words. The similarity in positivity ratings is unsurprising, as both cars have very similar overall owner ratings on the Edmunds website, with the Acura having a 4.1/5 star rating and the Lexus having a 4.2/5 star rating.

Is the positivity value an accurate reflection of the reviewer’s feelings about the car?

To answer this question, I compared the star value that the reviewer assigned the car with the calculated positivity value to understand if a higher star value correlates to a higher positivity score.

Analysis

There appears to be a general trend that the higher the star rating the reviewer gave, the more positive the language they used in their review. This suggests that the results of the text analysis are generally reliable. However, the margins of error for adjacent star ratings overlap, so a formal test such as an ANOVA is needed to determine whether the differences between star groups are statistically significant.
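
The comparison of star groups can be sketched in Python as a mean positivity per star rating with a standard error for each group. This is only one plausible way to obtain margins of error; the `positivity_by_stars` helper is hypothetical, and the (stars, positivity) pairs are invented for illustration rather than taken from the reviews.

```python
import math
import statistics
from collections import defaultdict

def positivity_by_stars(rows):
    """Map each star rating to (mean positivity, standard error of the mean)."""
    groups = defaultdict(list)
    for stars, score in rows:
        groups[stars].append(score)

    result = {}
    for stars, scores in sorted(groups.items()):
        mean = statistics.mean(scores)
        if len(scores) > 1:
            # Standard error = sample standard deviation / sqrt(group size)
            se = statistics.stdev(scores) / math.sqrt(len(scores))
        else:
            se = float("nan")  # undefined for a single review
        result[stars] = (mean, se)
    return result

# Invented (stars, positivity) pairs for illustration (NOT from the data set)
rows = [(1, -4), (1, -2), (3, 1), (3, 5), (5, 6), (5, 10)]

by_stars = positivity_by_stars(rows)
for stars, (mean, se) in by_stars.items():
    print(f"{stars} stars: mean positivity {mean:.1f} ± {se:.1f}")
```

Overlapping intervals between neighboring star ratings do not by themselves show that the groups are the same, which is why a formal test across all groups is the natural next step.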