Using R and Python, I extract usernames, ratings, and review text from CARFAX's Trustpilot reviews. The analysis surfaces key customer insights, tracks rating trends, and identifies common feedback themes:
How ratings have changed over time
Common words and recurring themes in user feedback
Sentiment trends: improving, declining, or stable
Top complaints and what users love most
By grounding these findings in review data, this report helps refine product strategy and maintain a competitive edge.
Customer ratings have dropped from 1.9 (2022) to 1.3 (2023-2024), reflecting worsening experiences with report accuracy, pricing, and customer support. However, some users still find value in the reports, particularly when they confirm expected vehicle conditions.
Many customers complain about inaccurate reports, citing false mileage data, missing service history, and incorrect accident records. These errors have led to financial loss and distrust in the service. However, some users find CARFAX reports helpful, especially when the provided information matches reality and prevents them from making bad purchases.
A large number of reviews express dissatisfaction with the cost of CARFAX reports versus the information provided. Users believe they can find similar or better information for free elsewhere.
Many describe the purchase as a waste of money due to the lack of additional useful insights.
Users report major discrepancies in recorded mileage, which can negatively impact vehicle sales.
Some reviews mention inflated mileage numbers, while others complain about missing data.
These errors damage trust in CARFAX reports and create problems for vehicle buyers and sellers.
CARFAX Italy’s biggest challenge is data accuracy—some users find reports extremely useful, while others face major discrepancies that undermine trust in the service. The pricing is widely questioned, as users expect high reliability in exchange for the cost. The mixed feedback suggests CARFAX can be valuable, but only when the provided information is accurate. Addressing report inconsistencies could help rebuild confidence in the service.
People who have negative experiences are more likely to leave reviews compared to satisfied customers, which may skew the overall sentiment.
There is no control over who leaves reviews on Trustpilot, meaning some feedback may come from competitors, one-time users, or misinformed customers.
Some negative experiences may be due to misunderstandings of how CARFAX reports work, rather than actual inaccuracies.
As part of this analysis, I also built a custom GPT that has access to the dataset used here, so you can explore the reviews further on your own:
https://chatgpt.com/g/g-67bf512ac3908191acdd5f3a08257b86-carfax-user-reviews-analyzer
IMPORTANT: You will need to specify that you would like to analyze the Italian reviews from Trustpilot.
The first step is to scrape Trustpilot reviews and structure the data for text analysis.
# Packages used below: rvest (HTML parsing), stringr (str_extract, str_remove),
# dplyr (bind_rows and the %>% pipe)
library(rvest)
library(stringr)
library(dplyr)

# Function to scrape a single Trustpilot review page
scrape_trustpilot <- function(page_num) {
  # Construct the URL dynamically for the given page number
  url <- paste0("https://www.trustpilot.com/review/www.carfax.com?languages=it&page=", page_num)
  webpage <- read_html(url) # Read the webpage once

  # Extract reviewer names
  reviewer_names <- webpage %>%
    html_nodes("span[data-consumer-name-typography='true']") %>%
    html_text()

  # Extract number of reviews per user
  reviewer_counts <- webpage %>%
    html_nodes(".styles_consumerExtraDetails__NFM0b span[data-consumer-reviews-count-typography='true']") %>%
    html_text() %>%
    str_extract("\\d+") %>%
    as.numeric()

  # Extract reviewer countries
  reviewer_countries <- webpage %>%
    html_nodes(".styles_consumerExtraDetails__NFM0b span:last-child") %>%
    html_text()

  # Extract ratings
  review_ratings <- webpage %>%
    html_nodes("div[data-service-review-rating]") %>%
    html_attr("data-service-review-rating") %>%
    as.numeric()

  # Extract review dates (ISO 8601 timestamps from the datetime attribute)
  review_dates <- webpage %>%
    html_nodes("time[data-service-review-date-time-ago]") %>%
    html_attr("datetime") %>%
    as.character()

  # Extract review titles
  review_titles <- webpage %>%
    html_nodes("h2[data-service-review-title-typography='true']") %>%
    html_text()

  # Extract review content
  review_contents <- webpage %>%
    html_nodes("p[data-service-review-text-typography='true']") %>%
    html_text()

  # Extract experience dates
  experience_dates <- webpage %>%
    html_nodes("p[data-service-review-date-of-experience-typography='true']") %>%
    html_text() %>%
    str_remove("Date of experience: ") %>%
    trimws()

  # Ensure all columns have the same length by padding shorter ones with NA.
  # Note: padding appends NA at the end, so it does not realign fields
  # if an element is missing in the middle of a page.
  max_length <- max(length(reviewer_names), length(reviewer_counts), length(reviewer_countries),
                    length(review_ratings), length(review_dates), length(review_titles),
                    length(review_contents), length(experience_dates))

  df <- data.frame(
    Reviewer = c(reviewer_names, rep(NA, max_length - length(reviewer_names))),
    ReviewCount = c(reviewer_counts, rep(NA, max_length - length(reviewer_counts))),
    Country = c(reviewer_countries, rep(NA, max_length - length(reviewer_countries))),
    Rating = c(review_ratings, rep(NA, max_length - length(review_ratings))),
    Review_Date = c(review_dates, rep(NA, max_length - length(review_dates))),
    Title = c(review_titles, rep(NA, max_length - length(review_titles))),
    Content = c(review_contents, rep(NA, max_length - length(review_contents))),
    Experience_Date = c(experience_dates, rep(NA, max_length - length(experience_dates))),
    stringsAsFactors = FALSE
  )
  return(df)
}

# Set number of pages to scrape
num_pages <- 6

# Scrape multiple pages and combine results into one data frame
reviews_all <- bind_rows(lapply(1:num_pages, scrape_trustpilot))
Now we can take a sneak peek at the data we collected:
Reviewer | ReviewCount | Country | Rating | Review_Date | Title | Content | Experience_Date |
---|---|---|---|---|---|---|---|
Dale Olmsted | 1 | US | 1 | 2025-02-12T21:36:08.000Z | Carfax report (From a certified… | Carfax report (From a certified dealership) was in… | February 05, 2025 |
Kal O | 4 | CA | 1 | 2025-02-06T04:26:02.000Z | Beware of Inaccurate Reports - Don’t Trust Them | I always believed CARFAX reports were reliable and… | January 10, 2025 |
Kristine Turk | 2 | US | 1 | 2025-01-28T06:47:30.000Z | I was told my vehicle was 100% clear of… | I was told my vehicle was 100% clear of accidents … | January 02, 2025 |
John Thomas | 1 | US | 1 | 2025-01-29T22:57:09.000Z | Carfax is a complete scam! | Carfax is a complete scam!They have a report on my… | January 29, 2025 |
Fred | 1 | US | 1 | 2025-01-23T02:13:11.000Z | No phone number to contact anyone | No phone number to contact anyone. As of now I hav… | January 22, 2025 |
Michael McCauley | 1 | US | 1 | 2025-01-09T18:21:29.000Z | I have brand new tires | I have brand new tires, carfax says I’m overdue on… | January 09, 2025 |
We successfully scraped 119 reviews, each containing the following details:
Reviewer’s Name – The name of the person who left the review.
Total Reviews by Reviewer – The total number of reviews this user has submitted on Trustpilot.
Country – The reviewer’s location.
Rating – The star rating given in the review.
Review Date – When the review was posted.
Review Title – The headline or summary of the review.
Review Content – The full text of the review.
Experience Date – When the reviewer had the experience they wrote about.
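The two date fields arrive in different formats: Review_Date is an ISO 8601 timestamp, while Experience_Date is free text. Before any time-based analysis they need to be parsed into proper dates. The following is a minimal Python sketch of that step, not the code used for this report; the function names `parse_review_date` and `parse_experience_date` are hypothetical, and the sample values are taken from the first row of the preview table.

```python
from datetime import datetime, timezone


def parse_review_date(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2025-02-12T21:36:08.000Z'."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The trailing 'Z' means UTC, so attach that timezone explicitly.
    return parsed.replace(tzinfo=timezone.utc)


def parse_experience_date(value: str) -> datetime:
    """Parse a free-text date such as 'February 05, 2025' (English month names)."""
    return datetime.strptime(value.strip(), "%B %d, %Y")


# Sample values from the first row of the preview table
review = parse_review_date("2025-02-12T21:36:08.000Z")
experience = parse_experience_date("February 05, 2025")

# Days between the experience and the moment the review was posted
days_to_review = (review.date() - experience.date()).days
print(review.year, experience.date().isoformat(), days_to_review)
```

Once both columns are real dates, grouping by year or measuring the delay between experience and review becomes straightforward.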
We will first explore the scraped data to understand what we can analyze.
Country | Count_of_reviews | Avg_rating | Min_date | Max_date |
---|---|---|---|---|
IT | 96 | 1.364583 | 2021-12-13T06:54:46.000Z | 2025-01-05T01:49:06.000Z |
US | 16 | 1.062500 | 2021-06-18T06:02:35.000Z | 2025-02-12T21:36:08.000Z |
CA | 2 | 1.000000 | 2024-11-09T16:57:50.000Z | 2025-02-06T04:26:02.000Z |
BG | 1 | 5.000000 | 2025-01-17T11:23:37.000Z | 2025-01-17T11:23:37.000Z |
DE | 1 | 1.000000 | 2023-08-15T13:13:07.000Z | 2023-08-15T13:13:07.000Z |
GB | 1 | 5.000000 | 2023-12-12T09:47:06.000Z | 2023-12-12T09:47:06.000Z |
PK | 1 | 1.000000 | 2025-02-14T22:28:22.000Z | 2025-02-14T22:28:22.000Z |
VA | 1 | 1.000000 | 2023-01-18T12:17:05.000Z | 2023-01-18T12:17:05.000Z |
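A summary like the one above is a plain group-by: count the reviews, average the ratings, and take the earliest and latest review date per country. As a rough Python sketch (the helper name `summarize_by_country` is hypothetical, and the input is only the six rows shown in the earlier preview, so the numbers will not match the full table):

```python
from collections import defaultdict

# Six rows copied from the preview table: (Country, Rating, Review_Date)
rows = [
    ("US", 1, "2025-02-12T21:36:08.000Z"),
    ("CA", 1, "2025-02-06T04:26:02.000Z"),
    ("US", 1, "2025-01-28T06:47:30.000Z"),
    ("US", 1, "2025-01-29T22:57:09.000Z"),
    ("US", 1, "2025-01-23T02:13:11.000Z"),
    ("US", 1, "2025-01-09T18:21:29.000Z"),
]


def summarize_by_country(rows):
    """Group reviews by country and compute count, mean rating, and date range."""
    groups = defaultdict(list)
    for country, rating, date in rows:
        groups[country].append((rating, date))

    summary = []
    for country, items in groups.items():
        ratings = [rating for rating, _ in items]
        # ISO 8601 strings in the same format sort chronologically as text.
        dates = [date for _, date in items]
        summary.append({
            "Country": country,
            "Count_of_reviews": len(items),
            "Avg_rating": sum(ratings) / len(ratings),
            "Min_date": min(dates),
            "Max_date": max(dates),
        })
    # Largest groups first, as in the table
    return sorted(summary, key=lambda s: s["Count_of_reviews"], reverse=True)


summary = summarize_by_country(rows)
for line in summary:
    print(line)
```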
Based on the data overview, the majority of the reviews (96 of 119) come from users in Italy. In the next steps, I will only take into account reviews from Italy.
We can also see that the average rating in the Italian market is very low, around 1.4 out of 5. The reviews span the period from the end of 2021 to the beginning of 2025.
Based on this, we will try to explain why the rating is so low. The first step is to check the average rating year over year.
Year | Count_of_reviews | Avg_rating |
---|---|---|
2021 | 1 | 1.000000 |
2022 | 10 | 1.900000 |
2023 | 31 | 1.258064 |
2024 | 53 | 1.339623 |
2025 | 1 | 1.000000 |
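As a quick consistency check, the yearly averages weighted by their review counts should reproduce the overall Italian average of 1.364583 from the country table. A small Python sketch of that arithmetic, using only the numbers from the two tables:

```python
# Year -> (Count_of_reviews, Avg_rating), copied from the yearly table
yearly = {
    2021: (1, 1.000000),
    2022: (10, 1.900000),
    2023: (31, 1.258064),
    2024: (53, 1.339623),
    2025: (1, 1.000000),
}

total_reviews = sum(count for count, _ in yearly.values())

# Weighted mean: each year's average counts as many times as it has reviews
weighted_avg = sum(count * avg for count, avg in yearly.values()) / total_reviews

# Share of reviews written in 2022-2024
core_share = sum(yearly[year][0] for year in (2022, 2023, 2024)) / total_reviews

print(total_reviews, round(weighted_avg, 4), round(core_share, 3))
```

The total comes to 96 reviews and the weighted mean agrees with the overall Italian average up to rounding, so the two tables are consistent with each other.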
Most of the reviews are from the 2022 to 2024 period. The average rating was somewhat higher in 2022, at 1.9, and then decreased in 2023 and 2024 to around 1.3. Given these averages and how they developed over the last few years, we should expect the reviews to describe mostly bad experiences.
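The word-level themes that follow rest on which terms appear most often in the Italian review text. A minimal Python sketch of such a count is shown here under stated assumptions: the stopword list is a hypothetical, hand-picked subset rather than a complete Italian list, the tokenizer is a simple regular expression, and the two texts are excerpts from reviews quoted later in this report rather than the full dataset.

```python
import re
from collections import Counter

# Hypothetical, hand-picked stopword subset (not a complete Italian list)
STOPWORDS = {
    "ed", "un", "in", "ho", "per", "di", "i", "e", "la", "è", "da",
    "che", "più", "dalla", "questo", "quelle", "senza", "altri", "stata",
}


def word_counts(texts):
    """Count lowercase word tokens, skipping stopwords and very short tokens."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-zàèéìòù]+", text.lower())
        counts.update(t for t in tokens if len(t) > 2 and t not in STOPWORDS)
    return counts


# Excerpts from two reviews quoted later in this report
texts = [
    "Servizio inutile. Fornisce dati ed informazioni facilmente verificabili, "
    "consultando altri portali gratuiti senza spendere un centesimo in ricerche "
    "dalla dubbia utilità. Ho voluto provare questo servizio, per verificare di "
    "persona i risultati e la delusione è stata totale.",
    "Nessuna notizia in più da quelle che è possibile reperire gratuitamente, "
    "soldi buttati.",
]

counts = word_counts(texts)
print(counts.most_common(5))
```

On the full dataset, the same count would feed a word cloud or a frequency chart.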
Many words indicate dissatisfaction
“soldi” (money), “buttati” (wasted), “inutile” (useless), and “rimborso” (refund) suggest that many users feel they wasted money on reports. “nulla” (nothing) and “solo” (only) hint that reports lacked expected details.
Technical & report-related issues
“informazioni” (information), “dati” (data), “targa” (license plate), and “incidente” (accident) suggest concerns about report completeness or accuracy.
Possible pricing concerns
“soldi” (money), “euro” (euros), and “rimborso” (refund) imply frustration with cost vs. value.
Website & service issues
The Sentiment Analysis graph displays polarity trends over time, with recent reviews on the left (Index 0-20) and older reviews on the right (Index 60-80). The blue line represents the average sentiment, where higher values indicate positive sentiment and lower values indicate negativity. The gray shaded area shows the confidence interval, reflecting variability in sentiment at different points.
The sentiment in the reviews followed a distinct pattern over time:
Recent Decline (Index 0-20): Sentiment dropped sharply, indicating increased user frustration in the latest reviews. This could be due to recent changes in service, pricing concerns, or missing information in reports.
Stable Positive Phase (Index 20-60): Before the decline, sentiment remained relatively stable and slightly positive, suggesting that for a period, users were satisfied with the service.
Early Sentiment Increase (Index 60-80): Older reviews started with neutral to slightly positive sentiment, gradually improving over time. This could indicate that past issues were resolved, leading to a more positive outlook before the recent decline.
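To make the idea of a polarity trend over the review index concrete, here is a small Python sketch. It is an illustration only and not the method behind the graph: the lexicon is a hypothetical toy word list, a trailing moving average stands in for the smoothed blue line, and the four texts are excerpts from reviews quoted in this report, placed in an arbitrary order.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words
LEXICON = {
    "utile": 1, "perfetto": 1, "bravi": 1, "intuitivo": 1, "chiaro": 1,
    "inutile": -1, "delusione": -1, "buttati": -1, "falso": -1,
}


def polarity(text):
    """Mean lexicon score of the matched words, or 0.0 if nothing matches."""
    tokens = re.findall(r"[a-zàèéìòù]+", text.lower())
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def rolling_mean(values, window):
    """Trailing moving average; early points use the values available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Excerpts from reviews quoted in this report (order is arbitrary here)
reviews = [
    "Servizio inutile.",
    "Nessuna notizia in più da quelle che è possibile reperire gratuitamente, soldi buttati.",
    "Servizio quasi perfetto, bravi, mi ha evitato un grosso problema",
    "Il sito è intuitivo e il report è stato facile da ottenere. Tutto chiaro e semplice.",
]

scores = [polarity(r) for r in reviews]
trend = rolling_mean(scores, window=2)
print(scores, trend)
```

With real data, the index would follow the review order used in the graph (most recent first), and a wider window would give a smoother line.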
Example
“Servizio inutile. Fornisce dati ed informazioni facilmente verificabili, consultando altri portali gratuiti senza spendere un centesimo in ricerche dalla dubbia utilità. Ho voluto provare questo servizio, per verificare di persona i risultati e la delusione è stata totale. È da sconsigliare nel modo più assoluto.”
English Translation:
“Useless service. It provides data and information that can be easily verified by checking other free portals without spending a cent on research of questionable usefulness. I wanted to try this service to check the results myself, and the disappointment was total. I strongly advise against it.”
Example
“Utile per conoscere il peso dell’auto… Non risultano anomalie nel chilometraggio, vorrei capire in che modo sarebbero risultate, c’è solo il chilometraggio del primo acquisto e quello della prima revisione dopo 4 anni. Nessuna notizia in più da quelle che è possibile reperire gratuitamente, soldi buttati.”
English Translation:
“Useful for knowing the car’s weight… No mileage anomalies were found, but I’d like to understand how they would even appear. There’s only the mileage from the first purchase and the first inspection after four years. No additional information beyond what you can find for free, money wasted.”
Example
“Sono stato danneggiato da questo sito, non sono più riuscito a vendere la mia autovettura perché hanno prodotto un report falso indicando molti più km dei km effettivi. Io ho la documentazione di tutti i tagliandi effettuati, la Carfax non so quale documentazione ha prodotto.”
English Translation:
“I have been harmed by this site. I can no longer sell my car because they produced a false report indicating many more kilometers than the actual ones. I have documentation of all the maintenance records, but I have no idea what documentation CARFAX used to make such claims.”
Example
“Ho collaudato tempo fa la mia Toyota, nel riportare i km anziché 191000, come da contachilometri, ha riportato 218000. Ora dovrò rifare il collaudo e mi troverò ancora molti km in più del contachilometri. Cosa devo fare?”
English Translation:
“I had my Toyota inspected some time ago. Instead of recording 191,000 km as shown on the odometer, they recorded 218,000 km. Now I will have to redo the inspection, and I will again end up with many more kilometers than the odometer shows. What should I do?”
Many users appreciate how CARFAX reports help them avoid buying problematic vehicles.
The reports provide key insights into a car’s past, helping users make informed decisions.
Example: “Servizio quasi perfetto, bravi, mi ha evitato un grosso problema con un’auto che sembrava perfetta ma aveva problemi nascosti.”
English Translation:
“Almost perfect service, well done, it saved me from a big problem with a car that seemed perfect but had hidden issues.”
Users value the comprehensive information provided in CARFAX reports. The reports include ownership history, service records, and accident reports, giving a full picture of a car’s past.
Example: “..nel mio caso è stato molto utile se avessi preso l’auto senza il report avrei avuto brutte sorprese.”
English Translation:
“..in my case, it was very useful. If I had bought the car without the report, I would have had unpleasant surprises.”
Some users praise CARFAX’s customer support team for their professionalism and helpfulness.
A few reviews specifically mention positive interactions with CARFAX representatives.
Example: “Vorrei complimentarmi con la dott.ssa Elena Martino per la professionalità e disponibilità con cui mi ha aiutato.”
English Translation:
“I would like to compliment Dr. Elena Martino for the professionalism and availability with which she helped me.”
CARFAX reports are seen as a trustworthy source of vehicle information.
Users feel more confident in their car purchases when they have access to a CARFAX report.
Example: “Grazie a Carfax ho potuto verificare che l’auto che volevo comprare era effettivamente in buone condizioni.”
English Translation:
“Thanks to CARFAX, I was able to verify that the car I wanted to buy was actually in good condition.”
Many users find CARFAX’s website and report system easy to use.
The interface is clear, and the process of retrieving a vehicle history report is simple.
Example: “Il sito è intuitivo e il report è stato facile da ottenere. Tutto chiaro e semplice.”
English Translation:
“The website is intuitive, and the report was easy to obtain. Everything is clear and simple.”
These insights highlight the key aspects that users appreciate about CARFAX, including helping them make informed purchases, providing detailed reports, and offering a user-friendly platform.