#1. Executive Summary
Last spring semester, I studied abroad in Europe and had the chance to stay in a wide variety of accommodations. Since I traveled to new places almost every week, I accumulated extensive experience booking hotels. I always read reviews carefully before making a reservation—sometimes the hotel turned out to be much better than expected, sometimes worse, and sometimes the reviews felt spot on.
When travelers face countless hotel options, customer reviews are among the most trusted sources of information. However, numerical ratings alone often fail to capture guests’ real experiences and emotions, and reading through thousands of reviews manually is tiring, even before the trip begins.
Based on this personal experience, this project analyzes the 515K Hotel Reviews Data in Europe, a dataset containing 515,000 customer reviews and ratings for 1,493 luxury hotels across Europe. The project aims to explore the relationship between review text and numerical ratings, examine brand-specific patterns in review content, and build a simple keyword-based hotel recommendation model for a hypothetical traveler.
The analysis centers on three core themes:
1. Identify the various emotional expressions and review elements that characterize positive versus negative feedback.
2. Analyze and compare each hotel’s key attributes.
3. Leverage these insights to recommend the most suitable hotel for a hypothetical guest.
This project should show how emotional expressions in reviews correlate with numerical ratings, reveal brand-level differences in customer feedback, and determine which aspects of review content are most useful for recommending hotels to travelers with specific preferences.
For instance, it’s unclear why "niggle", a word typically associated with negative sentiment, appears in positive reviews, or why "hutch", which seems unrelated to hotels, is found in negative ones. These cases warrant closer inspection. To better understand their context, I will examine three examples each from the positive and negative reviews that include these words. To match each term as a whole word rather than as part of a longer word, I used the regex pattern `regex("\bniggle\b")` suggested by ChatGPT.
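The whole-word lookup described above can be sketched in Python with the standard `re` module. This is only an illustration, not the code used in the project: the helper name `find_examples` and the sample review texts are invented for the example, and the real reviews would come from the dataset instead.

```python
import re

def find_examples(reviews, word, limit=3):
    """Return up to `limit` reviews that contain `word` as a whole word."""
    # \b marks a word boundary, so "niggle" matches but "niggles" does not.
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    matches = []
    for text in reviews:
        if pattern.search(text):
            matches.append(text)
            if len(matches) == limit:
                break
    return matches

# Invented sample texts (assumption): not taken from the dataset.
positive_reviews = [
    "Only niggle was the slow lift, otherwise a perfect stay",
    "Lovely staff and a great breakfast",
    "A few niggles but nothing serious",
    "One small niggle: the wifi dropped a couple of times",
]
negative_reviews = [
    "The room was like a rabbit hutch",
    "Noisy street outside",
]

niggle_examples = find_examples(positive_reviews, "niggle")
hutch_examples = find_examples(negative_reviews, "hutch")
print(niggle_examples)
print(hutch_examples)
```

Reading the matched sentences in full is what makes the context visible: in these invented samples, "niggle" marks a minor complaint inside an otherwise positive review, and "hutch" describes a cramped room.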
#🏙️ Accommodation Preferences
- Close proximity to a tram or metro station
- A room with a canal view
- A safe and quiet atmosphere, ideal for solo travelers
- A spacious, clean bathroom with strong water pressure
- Small thoughtful touches like complimentary water or tea/coffee
- Friendly front desk staff and a smooth check-in process
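
One way these preferences could feed a simple keyword-based recommendation is sketched below in Python. Everything here is an assumption made for illustration: the keyword lists, the `score_hotels` helper, the scoring rule (one point per preference mentioned in a review), and the hotel names and review texts are all hypothetical rather than the model built in this project.

```python
import re

# Hypothetical keyword lists derived from the preferences above (assumption).
PREFERENCE_KEYWORDS = {
    "transit": ["tram", "metro"],
    "view": ["canal view"],
    "atmosphere": ["safe", "quiet"],
    "bathroom": ["bathroom", "water pressure"],
    "extras": ["complimentary", "tea", "coffee"],
    "service": ["friendly", "check in"],
}

def score_hotels(reviews):
    """Rank hotels by how often their reviews mention the preferences.

    `reviews` is an iterable of (hotel_name, review_text) pairs.
    """
    scores = {}
    for hotel, text in reviews:
        scores.setdefault(hotel, 0)
        lowered = text.lower()
        for keywords in PREFERENCE_KEYWORDS.values():
            # One point per preference if any of its keywords
            # appears as a whole word or phrase in the review.
            if any(re.search(rf"\b{re.escape(k)}\b", lowered) for k in keywords):
                scores[hotel] += 1
    # Highest score first; ties are broken alphabetically by hotel name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Invented hotels and review texts (assumption): not taken from the dataset.
sample_reviews = [
    ("Hotel A", "Quiet room with a canal view, two minutes from the tram"),
    ("Hotel A", "Friendly staff and complimentary coffee"),
    ("Hotel B", "Large bathroom but far from everything"),
]

ranking = score_hotels(sample_reviews)
print(ranking)
```

Raw counts like these favor hotels with many reviews, so dividing each score by the hotel's number of reviews may give a fairer comparison.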