Final Project - Hotels

Google Maps Places API

Author

Introduction

This is my introduction as a graduating member of the Business Analytics major at Xavier University! In this post, I will be exploring the intricacies of hotels and ski resorts both domestically and internationally. I will be analyzing the key factors that set them apart, how different countries compare with one another, and how the hotels stack up in the eyes of the customer. I am excited to dive deeper into this data as I happen to be quite the skier. As I near graduation, I have decided that a ski trip will be in my near future, and I would like to use this data set to help narrow down my search.

For the second portion of this post, I will be calling the Google Maps Places API to retrieve hotel data for the Cincinnati area. I will be using this data set a bit differently, as there are no ski hotels/resorts in Cincinnati. However, I do have some extended family coming in for my graduation, and they will need a place to stay. I plan on using this data set to determine the best places for them to stay, because no way are they staying with me.

Primary Data Set

Below is a quick overview of what the data set shows. Each row is a different ski hotel/resort.

Variables	Description
Country	Country where the hotel/resort is located
Resort	Name of resort
Hotel	Name of hotel
Price (£)	Price of hotel/resort (per night)
Distance From Lift (m)	Distance from ski lift (meters)
Altitude	Altitude of hotel/resort (meters)
Total Piste (km)	Total length of marked ski runs (kilometers)
Total Lifts	How many ski lifts are at the hotel/resort
Gondolas	How many gondolas (a more expensive, fully enclosed ski lift) are at the hotel/resort
Chair Lifts	How many chair lifts (very common) are at the hotel/resort
Drag Lifts	How many drag lifts (you stay on the ground and are pulled up the slope) are at the hotel/resort
Blues	Number of blue runs (easy to intermediate)
Reds	Number of red runs (hard to advanced)
Blacks	Number of black runs (the most challenging, “Black Diamond” at a lot of places)
Total Runs	How many runs (ski trails) at the hotel/resort
Sleeps	How many rooms at the hotel/resort
Dec Snow Low 2020 (cm)	Lowest amount of snow at the hotel in December of 2020 (centimeters)
Dec Snow High 2020 (cm)	Highest amount of snow at the hotel in December of 2020 (centimeters)
Jan Snow Low 2020 (cm)	Lowest amount of snow at the hotel in January of 2020 (centimeters)
Jan Snow High 2020 (cm)	Highest amount of snow at the hotel in January of 2020 (centimeters)
Feb Snow Low 2020 (cm)	Lowest amount of snow at the hotel in February of 2020 (centimeters)
Feb Snow High 2020 (cm)	Highest amount of snow at the hotel in February of 2020 (centimeters)

Price (£) vs Altitude

As part of my analysis of ski resort data, I explored the relationship between hotel prices and altitude to better understand what factors might influence the cost of accommodation when planning a ski trip. Using a dataset of ski hotels across various European resorts, I created a scatter plot, mapping hotel prices (in £) against the altitude of each resort (in meters).

What emerged was a moderate upward trend: hotels located at higher altitudes tend to be more expensive. This makes sense, as higher elevations often come with better snow reliability, ski-in/ski-out convenience, and more picturesque views, all of which can drive up demand (and price). That said, the spread of points at each altitude was fairly wide, indicating that altitude alone doesn’t fully explain price variation. There are likely other contributing factors, such as resort popularity, proximity to lifts (though some entries had “unknown” distance), amenities, or country-level pricing differences.
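To put a number on a trend like this, a correlation coefficient is one option. The snippet below is only a minimal sketch: the six rows are made-up values for illustration, and the column names `altitude_m` and `price_gbp` are stand-ins rather than the real data set.

```r
# Toy data: made-up values for illustration only, NOT the real ski-hotel data.
toy_hotels <- data.frame(
  altitude_m = c(800, 1000, 1200, 1500, 1800, 2100),
  price_gbp  = c(520, 480, 610, 590, 720, 690)
)

# Pearson correlation: near 0 means a weak linear trend,
# near 1 means a strong upward one.
r <- cor(toy_hotels$altitude_m, toy_hotels$price_gbp, use = "complete.obs")

# Share of the price variance that a straight-line fit on altitude would explain.
r_squared <- r^2

print(round(c(r = r, r_squared = r_squared), 2))
```

A value of `r_squared` well below 1 would be consistent with the wide spread of points at each altitude: altitude matters, but it leaves a lot of the price unexplained.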

As an analyst looking to optimize for both cost and ski quality, this kind of insight is valuable. If altitude is important for snow conditions, it might be worth paying a premium. But for budget-conscious skiers (like myself), the plot shows that lower-cost options do exist even at higher altitudes, and not all low-elevation hotels are cheap. Overall, this visual helped me cut through assumptions and quantify a trade-off that many skiers make but rarely measure.

Distribution based on Price (£)

To complement my earlier analysis of ski hotel pricing trends by altitude, I also examined the overall distribution of hotel prices using a histogram. This visualization allowed me to move beyond individual relationships and instead look at the price picture as a whole. I used a bin width of £50 to group hotels into manageable price ranges and plotted the frequency of hotels within each range. The result was a clear visual of where most hotel prices cluster and how common higher-end or budget options are.
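The £50 grouping can also be tabulated directly, which is handy for checking what the bars represent. This is a rough sketch with made-up prices; the bin edges here are aligned to multiples of 50, which may not match the default bin placement a plotting library would pick.

```r
# Made-up prices for illustration only, NOT the real data.
toy_prices <- c(430, 455, 510, 525, 540, 610, 640, 655, 790, 1120)
bin_width  <- 50

# Bin edges at multiples of 50 that cover the whole price range.
breaks <- seq(
  floor(min(toy_prices) / bin_width) * bin_width,
  ceiling(max(toy_prices) / bin_width) * bin_width,
  by = bin_width
)

# Assign each price to a half-open bin [lower, upper); the last bin is closed.
bins <- cut(toy_prices, breaks = breaks, right = FALSE,
            include.lowest = TRUE, dig.lab = 4)

# Count hotels per bin and show only the non-empty bins.
bin_counts <- table(bins)
print(bin_counts[bin_counts > 0])
```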

The histogram revealed a right-skewed distribution, with the bulk of hotels priced between roughly £500 and £800. This suggests that the majority of ski hotels in the dataset fall within a mid-range price bracket. However, there are noticeable tails extending into higher price ranges, which likely reflect luxury accommodations in premium resorts. On the lower end, relatively few hotels were priced under £400, which might indicate that truly low-budget options are rare or underrepresented in the dataset (perhaps due to the inclusion criteria or data source).

Understanding this distribution is important if you’re trying to plan around a specific budget. For example, I will have cost constraints, so knowing that most hotels cluster around the £500–£800 mark gives me a realistic benchmark to set expectations. This kind of basic exploratory data visualization adds depth to my understanding and informs more strategic decision-making downstream.

Price (£) vs Count of Ski Lifts

Continuing my exploration of ski resort data, I was curious to see whether resort infrastructure, specifically the number of ski lifts, has any relationship with hotel pricing. My hypothesis was that more lifts might indicate a larger or better-equipped ski area, which could in turn drive up hotel demand and prices. To test this, I plotted hotel prices against the “Total Lifts” variable using a scatter plot, with a linear regression line overlaid to help clarify the trend.

What I found was a weak positive correlation between the number of lifts and hotel price. The red regression line shows a gentle upward slope, suggesting that resorts with more ski lifts tend to have slightly more expensive hotels on average. That said, the spread of the data is quite wide: there are reasonably priced hotels even at resorts with 40+ lifts, and some high-priced hotels exist at more modest resorts. This implies that while lift count may be a factor, it’s far from the only driver of price variation.
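The slope of a regression line like this can be read off a fitted linear model. The sketch below uses made-up numbers and stand-in column names (`total_lifts`, `price_gbp`), so it only illustrates the mechanics, not the actual result.

```r
# Toy data: made-up values for illustration only, NOT the real data.
toy_resorts <- data.frame(
  total_lifts = c(10, 18, 25, 33, 41, 52),
  price_gbp   = c(560, 610, 540, 650, 600, 680)
)

# Ordinary least squares: price as a linear function of lift count.
fit <- lm(price_gbp ~ total_lifts, data = toy_resorts)

# Slope: estimated change in price (in pounds) for each additional lift.
slope <- coef(fit)[["total_lifts"]]

# R-squared: share of price variation explained by lift count alone.
r_squared <- summary(fit)$r.squared

print(round(c(slope = slope, r_squared = r_squared), 3))
```

A small positive slope paired with a low R-squared is what a "weak positive" relationship looks like in numbers.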

Most Expensive Countries

To get a broader geographic perspective on pricing, I shifted focus to the country level, calculating the average hotel price per country and visualizing the top 10 most expensive ones. This bar chart ranks countries from highest to lowest average price, giving a quick comparative snapshot of where ski holidays tend to be most costly.

The results were pretty striking. Countries like Finland, France, and Austria emerged near the top, which aligns with expectations—these nations are known for their well-developed ski resorts and alpine luxury. What stood out was the degree of price difference between the top few and the rest. France, for instance, had an average hotel price significantly higher than many of its peers, reinforcing its reputation as a premium destination. Meanwhile, countries further down the list still had respectable infrastructure but were much more affordable, suggesting a potential value proposition for price-conscious travelers.

From a data-driven planning standpoint, this chart is useful because it narrows the search before I look at individual hotels. If I were prioritizing prestige or luxury experiences (which I am not), destinations at the top of this chart would be strong candidates. But as a traveler looking for a cost-effective ski holiday, I can steer myself toward countries that offer decent lift access or snow conditions without the premium price tag.

Chair Lifts by Country

In addition to comparing hotel prices across countries, I also wanted to explore how ski infrastructure differs by location, specifically by looking at the average number of chairlifts per resort within each country. Chairlifts are a good proxy for how well-developed and expansive a resort might be, since they typically serve key pistes and high-traffic areas. I grouped the data by country, calculated the mean number of chairlifts, and visualized the results with a bar chart.

What emerged was a clear picture of which countries invest most heavily in lift infrastructure. France stood out, averaging a high number of chairlifts per resort. This aligns with its reputation as the host of massive ski areas like Les Trois Vallées. France doesn’t just have a few mega-resorts; it seems to offer strong infrastructure across the board. On the other hand, some countries with lower average chairlifts might still have high-end resorts, but perhaps fewer of them, or more compact ski areas with a smaller lift footprint.

From an analytical standpoint, this kind of metric is useful when assessing the skiing experience quality beyond just price. If I prioritize access to terrain and lift capacity, chairlift count becomes a relevant metric. Combining this with earlier price and altitude insights gives a better picture of what each destination offers.
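One way to combine these metrics is a single per-country summary table. The following is a minimal sketch with made-up rows; the column names are simplified stand-ins, and the real data set may name them differently.

```r
library(dplyr)

# Toy data: made-up values for illustration only, NOT the real data.
toy_hotels <- data.frame(
  country    = c("France", "France", "Austria", "Austria", "Italy", "Italy"),
  price_gbp  = c(820, 760, 700, 640, 560, 600),
  altitude_m = c(1850, 2100, 1300, 1500, 1200, 1400),
  chairlifts = c(30, 26, 18, 22, 12, 10)
)

# One row per country with the three metrics side by side,
# sorted from cheapest to most expensive on average.
country_profile <- toy_hotels %>%
  group_by(country) %>%
  summarise(
    hotels         = n(),
    avg_price      = mean(price_gbp, na.rm = TRUE),
    avg_altitude   = mean(altitude_m, na.rm = TRUE),
    avg_chairlifts = mean(chairlifts, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(avg_price)

print(country_profile)
```

Reading across a row makes the trade-off visible at a glance: what a country costs on average next to what it offers in altitude and lifts.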

Secondary Source

The Google Maps Places API is a powerful tool that allows users to query for a variety of places (such as restaurants, hotels, and gas stations) within a defined area. It is particularly useful for location-based apps or services, helping users find nearby places and get up-to-date data on locations, reviews, ratings, etc.

How to Access Google Maps Places API

To use the Google Maps Places API, you’ll need an API key. Follow these steps to set it up:

1. Go to the “Google Cloud Console”
2. Create a new project or select an existing one (top of the page)
3. Enable the Google Places API (may have to create an account first)
4. Generate an API key from the ‘Credentials’ section
5. Use the key in your R script to authenticate your API calls
6. Create a data frame and convert to a CSV file to better visualize the data

Call to the API

In this example, I used the Google Maps Places API to search for hotels near a location in Cincinnati. I specified the location using latitude and longitude coordinates, set a search radius of 1000 meters, and defined the type of place I was interested in (hotels). The API responded with a JSON object containing details of the matching places, which I then parsed and stored in a data frame.

In the data frame, you will see a result of 20 hotels (each row represents one hotel) and 17 different variables (columns) describing the hotels. The Google Maps Places API provided me the name of the hotel, the address, the latitude/longitude, the status (operational or not), a rating, etc.
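The call described above can be sketched roughly as follows. This is not the exact script behind the data frame: it assumes the legacy Nearby Search endpoint, the place type `lodging` for hotels, approximate downtown Cincinnati coordinates, and a hypothetical environment variable `GOOGLE_MAPS_API_KEY` holding the key. The helper names are made up for this example.

```r
# Legacy Places "Nearby Search" endpoint (assumed; check which API version
# is enabled for your project).
nearby_url <- "https://maps.googleapis.com/maps/api/place/nearbysearch/json"

# Build the query parameters: "lat,lng" location, radius in meters,
# place type, and the API key.
build_nearby_query <- function(lat, lng, radius_m, place_type, api_key) {
  list(
    location = paste(lat, lng, sep = ","),
    radius   = radius_m,
    type     = place_type,
    key      = api_key
  )
}

# Turn the JSON response text into a data frame, one row per place.
# flatten = TRUE expands nested fields such as geometry.location.lat.
parse_places <- function(json_text) {
  jsonlite::fromJSON(json_text, flatten = TRUE)$results
}

# List columns cannot be written to CSV, so drop them before saving.
drop_list_columns <- function(places) {
  places[, !vapply(places, is.list, logical(1)), drop = FALSE]
}

# Send the request and return the parsed places (needs network and a key).
fetch_hotels <- function(lat, lng, api_key, radius_m = 1000) {
  query    <- build_nearby_query(lat, lng, radius_m, "lodging", api_key)
  response <- httr::GET(nearby_url, query = query)
  httr::stop_for_status(response)
  parse_places(httr::content(response, as = "text", encoding = "UTF-8"))
}

# Example usage (commented out because it needs a real key):
# api_key      <- Sys.getenv("GOOGLE_MAPS_API_KEY")
# cincy_hotels <- fetch_hotels(39.1031, -84.5120, api_key)
# write.csv(drop_list_columns(cincy_hotels), "cincy_hotels.csv", row.names = FALSE)
```

Keeping the key in an environment variable rather than in the script avoids publishing it along with the post.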

To better understand how hotel popularity relates to guest satisfaction, I created a bar chart that groups hotels by the number of user ratings they’ve received, then calculates the average rating within each group. This lets me analyze trends across popularity tiers: whether hotels with more reviews tend to have better, worse, or similar ratings compared to lesser-known spots. It’s a way to balance quantity and quality in a more structured view.

The chart clearly shows that hotels with moderate to high numbers of reviews tend to maintain solid average ratings, often above 4.0. Interestingly, the group with the highest volume of reviews doesn’t always have the absolute highest average rating, but it’s still strong, suggesting that popular hotels manage to satisfy guests at scale. On the other hand, some of the lower-review-count groups show slightly more fluctuation in average ratings, which could be due to less consistent service, newer establishments, or just the natural variability that comes with fewer data points.

This kind of grouped analysis is particularly valuable when making travel recommendations (for my family). A high rating alone isn’t always enough; if it’s based on just a handful of reviews, the insight is weaker. But seeing strong average ratings in well-reviewed, popular hotels gives me more confidence that these places consistently deliver. For customers who want a balance between reliability and quality, this visualization offers an evidence-based starting point to narrow down options.
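The same reasoning can be turned into a simple shortlist: require a minimum number of reviews, then rank by rating. This is a minimal sketch with made-up hotels; the threshold of 500 reviews is an arbitrary assumption, and the column names `rating` and `user_ratings_total` follow the fields described above.

```r
library(dplyr)

# Toy data: made-up hotels for illustration only, NOT the real API results.
toy_cincy <- data.frame(
  name               = c("Hotel A", "Hotel B", "Hotel C", "Hotel D", "Hotel E"),
  rating             = c(4.8, 4.4, 4.1, 3.6, NA),
  user_ratings_total = c(12, 2300, 5100, 900, 40)
)

# Hypothetical cut-off: ignore ratings backed by fewer reviews than this.
min_reviews <- 500

# Keep hotels with a rating and enough reviews, best rated first;
# ties are broken by the number of reviews.
shortlist <- toy_cincy %>%
  filter(!is.na(rating), user_ratings_total >= min_reviews) %>%
  arrange(desc(rating), desc(user_ratings_total))

print(shortlist)
```

Here the 4.8-rated hotel drops out because only 12 reviews support it, which is exactly the kind of case the grouped chart is meant to guard against.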

Final Thoughts

Bringing the analysis full circle, this project gave me the opportunity to explore two distinct but equally practical use cases for location-based hotel data. First, I used a rich dataset of ski resorts across Europe to investigate pricing patterns, infrastructure, and destination quality. This was motivated by my interest in planning a potential ski trip after graduation: a personal reward and a data-driven vacation rolled into one. By analyzing how hotel prices relate to altitude, lift infrastructure, and geographic location, I was able to surface insights that will help me make an informed choice based not just on cost, but on the skiing experience itself.

I also leveraged the Google Places API to analyze hotel options right here in Cincinnati. With my graduation on the horizon, I’ll be hosting extended family from out of town, and I wanted a clear way to identify hotels that are not only well-rated but have a proven track record based on user volume. The rating-volume visualization helped highlight properties that consistently deliver strong experiences at scale, exactly the kind of reliability I want for family members making the trip to celebrate.

Together, these two datasets showcase the versatility of data science for both long-term planning and near-term logistics. Whether I’m comparing alpine villages across France and Austria or accommodations in my own backyard, these tools allow me to move beyond guesswork and into decisions grounded in evidence.

---
title: "Final Project - Hotels" # Name of your HTML output
subtitle: "Google Maps Places API"
author: "CT" # Author name
editor: visual
toc: true # Generates an automatic table of contents.
format: # Options related to formatting.
  html: # Options related to HTML output.
    code-tools: TRUE # Allow the code tools option showing in the output.
    embed-resources: TRUE # Embeds all components into a single HTML file.
execute: # Options related to the execution of code chunks.
  warning: FALSE # FALSE: Code chunk warnings are hidden by default.
  message: FALSE # FALSE: Code chunk messages are hidden by default.
---

```{r}
#| include: false
#| label: setup

library(tidyverse)
library(jsonlite)
library(httr)
library(magrittr)
library(skimr)
```

## Introduction

This is my introduction as a graduating member of the Business Analytics major at Xavier University! In this post, I will be exploring the intricacies of hotels and ski resorts both domestically and internationally. I will be analyzing the key factors that set them apart, how different countries compare with one another, and how the hotels stack up in the eyes of the customer. I am excited to dive deeper into this data as I happen to be quite the skier. As I near graduation, I have decided that a ski trip will be in my near future, and I would like to use this data set to help narrow down my search.

For the second portion of this post, I will be calling the Google Maps Places API to retrieve hotel data for the Cincinnati area. I will be using this data set a bit differently, as there are no ski hotels/resorts in Cincinnati. However, I do have some extended family coming in for my graduation, and they will need a place to stay. I plan on using this data set to determine the best places for them to stay, because no way are they staying with me.

## Primary Data Set

```{r}
#| include: false
#| label: dataset

ski_hotels <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/temmingc_xavier_edu/ER-WoWRwyWtEsQ2oQMqnEqcBUYoGMpQiuHzFrC4Kc6DsDQ?download=1")
```

Below is a quick overview of what the data set shows. Each row is a different ski hotel/resort.

| [Variables]{.underline} | [**Description**]{.underline} |
|-------------------------|-------------------------------|
| Country | Country where the hotel/resort is located |
| Resort | Name of resort |
| Hotel | Name of hotel |
| Price (£) | Price of hotel/resort (per night) |
| Distance From Lift (m) | Distance from ski lift (meters) |
| Altitude | Altitude of hotel/resort (meters) |
| Total Piste (km) | Total length of marked ski runs (kilometers) |
| Total Lifts | How many ski lifts are at the hotel/resort |
| Gondolas | How many gondolas (a more expensive, fully enclosed ski lift) are at the hotel/resort |
| Chair Lifts | How many chair lifts (very common) are at the hotel/resort |
| Drag Lifts | How many drag lifts (you stay on the ground and are pulled up the slope) are at the hotel/resort |
| Blues | Number of blue runs (easy to intermediate) |
| Reds | Number of red runs (hard to advanced) |
| Blacks | Number of black runs (the most challenging, "Black Diamond" at a lot of places) |
| Total Runs | How many runs (ski trails) at the hotel/resort |
| Sleeps | How many rooms at the hotel/resort |
| Dec Snow Low 2020 (cm) | Lowest amount of snow at the hotel in December of 2020 (centimeters) |
| Dec Snow High 2020 (cm) | Highest amount of snow at the hotel in December of 2020 (centimeters) |
| Jan Snow Low 2020 (cm) | Lowest amount of snow at the hotel in January of 2020 (centimeters) |
| Jan Snow High 2020 (cm) | Highest amount of snow at the hotel in January of 2020 (centimeters) |
| Feb Snow Low 2020 (cm) | Lowest amount of snow at the hotel in February of 2020 (centimeters) |
| Feb Snow High 2020 (cm) | Highest amount of snow at the hotel in February of 2020 (centimeters) |

### Price (£) vs Altitude

As part of my analysis of ski resort data, I explored the relationship between **hotel prices** and **altitude** to better understand what factors might influence the cost of accommodation when planning a ski trip. Using a dataset of ski hotels across various European resorts, I created a scatter plot, mapping hotel prices (in £) against the altitude of each resort (in meters).

```{r}
#| label: price-by-altitude
#| include: TRUE
#| echo: FALSE

# Scatter plot of price against altitude with a linear trend line
ggplot(ski_hotels, aes(x = `altitude (m)`, y = `price (£)`)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = 'lm', color = 'blue') +
  labs(title = "Ski-Hotel Prices vs Altitude",
       x = 'Altitude (meters)',
       y = 'Price (£)') +
  theme_minimal()
```

What emerged was a moderate upward trend: hotels located at higher altitudes tend to be more expensive. This makes sense, as higher elevations often come with better snow reliability, ski-in/ski-out convenience, and more picturesque views, all of which can drive up demand (and price). That said, the spread of points at each altitude was fairly wide, indicating that altitude alone doesn’t fully explain price variation. There are likely other contributing factors, such as resort popularity, proximity to lifts (though some entries had "unknown" distance), amenities, or country-level pricing differences.

As an analyst looking to optimize for both cost and ski quality, this kind of insight is valuable. If altitude is important for snow conditions, it might be worth paying a premium. But for budget-conscious skiers (like myself), the plot shows that lower-cost options do exist even at higher altitudes, and not all low-elevation hotels are cheap. Overall, this visual helped me cut through assumptions and quantify a trade-off that many skiers make but rarely measure.

### Distribution based on Price (£)

To complement my earlier analysis of ski hotel pricing trends by altitude, I also examined the overall distribution of hotel prices using a histogram. This visualization allowed me to move beyond individual relationships and instead look at the price picture as a whole. I used a bin width of £50 to group hotels into manageable price ranges and plotted the frequency of hotels within each range. The result was a clear visual of where most hotel prices cluster and how common higher-end or budget options are.

```{r}
#| label: price-by-itself
#| include: TRUE
#| echo: FALSE

# Histogram of hotel prices in £50 bins
ggplot(ski_hotels, aes(x = `price (£)`)) +
  geom_histogram(binwidth = 50, fill = "steelblue", color = "white") +
  labs(title = "Distribution of Hotel Prices",
       x = "Price (£)",
       y = "Count of Ski-Hotels") +
  theme_minimal()
```

The histogram revealed a right-skewed distribution, with the bulk of hotels priced between roughly £500 and £800. This suggests that the majority of ski hotels in the dataset fall within a mid-range price bracket. However, there are noticeable tails extending into higher price ranges, which likely reflect luxury accommodations in premium resorts. On the lower end, relatively few hotels were priced under £400, which might indicate that truly low-budget options are rare or underrepresented in the dataset (perhaps due to the inclusion criteria or data source).

Understanding this distribution is important if you're trying to plan around a specific budget. For example, I will have cost constraints, so knowing that most hotels cluster around the £500–£800 mark gives me a realistic benchmark to set expectations. This kind of basic exploratory data visualization adds depth to my understanding and informs more strategic decision-making downstream.

### Price (£) vs Count of Ski Lifts

Continuing my exploration of ski resort data, I was curious to see whether resort infrastructure, specifically the number of ski lifts, has any relationship with hotel pricing. My hypothesis was that more lifts might indicate a larger or better-equipped ski area, which could in turn drive up hotel demand and prices. To test this, I plotted hotel prices against the "Total Lifts" variable using a scatter plot, with a linear regression line overlaid to help clarify the trend.

```{r}
#| label: price-by-ski-lifts
#| include: TRUE
#| echo: FALSE

# Scatter plot of price against total lifts with a linear regression line
ggplot(ski_hotels, aes(x = totalLifts, y = `price (£)`)) +
  geom_point(alpha = 0.6, color = "darkgreen") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Hotel Price vs. Total Ski Lifts",
       x = "Total Ski Lifts",
       y = "Price (£)") +
  theme_minimal()
```

What I found was a weak positive correlation between the number of lifts and hotel price. The red regression line shows a gentle upward slope, suggesting that resorts with more ski lifts tend to have slightly more expensive hotels on average. That said, the spread of the data is quite wide: there are reasonably priced hotels even at resorts with 40+ lifts, and some high-priced hotels exist at more modest resorts. This implies that while lift count may be a factor, it’s far from the only driver of price variation.

### Most Expensive Countries

To get a broader geographic perspective on pricing, I shifted focus to the country level, calculating the average hotel price per country and visualizing the top 10 most expensive ones. This bar chart ranks countries from highest to lowest average price, giving a quick comparative snapshot of where ski holidays tend to be most costly.

```{r}
#| label: price-by-country
#| include: TRUE
#| echo: FALSE

# Average price per country, keeping the ten most expensive
ski_hotels %>%
  group_by(country) %>%
  summarise(avg_price = mean(`price (£)`, na.rm = TRUE)) %>%
  arrange(desc(avg_price)) %>%
  slice_head(n = 10) %>%
  ggplot(aes(x = reorder(country, avg_price), y = avg_price)) +
  geom_col(fill = "blue") +
  coord_flip() +
  labs(title = "Top 10 Most Expensive Countries by Average Hotel Price",
       x = "Country",
       y = "Average Price (£)") +
  theme_minimal() +
  theme(legend.position = "none")
```

The results were pretty striking. Countries like **Finland, France, and Austria** emerged near the top, which aligns with expectations: these nations are known for their well-developed ski resorts and alpine luxury. What stood out was the degree of price difference between the top few and the rest. France, for instance, had an average hotel price significantly higher than many of its peers, reinforcing its reputation as a premium destination. Meanwhile, countries further down the list still had respectable infrastructure but were much more affordable, suggesting a potential value proposition for price-conscious travelers.

From a data-driven planning standpoint, this chart is useful because it narrows the search before I look at individual hotels. If I were prioritizing prestige or luxury experiences (which I am not), destinations at the top of this chart would be strong candidates. But as a traveler looking for a cost-effective ski holiday, I can steer myself toward countries that offer decent lift access or snow conditions without the premium price tag.

### Chair Lifts by Country

In addition to comparing hotel prices across countries, I also wanted to explore how ski infrastructure differs by location, specifically by looking at the average number of chairlifts per resort within each country. Chairlifts are a good proxy for how well-developed and expansive a resort might be, since they typically serve key pistes and high-traffic areas. I grouped the data by country, calculated the mean number of chairlifts, and visualized the results with a bar chart.

```{r}
#| label: chairs-by-country
#| include: TRUE
#| echo: FALSE

# Average number of chairlifts per country, highest first
ski_hotels %>%
  group_by(country) %>%
  summarise(avg_chairlifts = mean(chairlifts, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(country, -avg_chairlifts), y = avg_chairlifts)) +
  geom_col(fill = "coral") +
  labs(title = "Average Number of Chairlifts by Country",
       x = "Country",
       y = "Average Chairlifts") +
  theme_minimal() +
  theme(legend.position = "none")
```

What emerged was a clear picture of which countries invest most heavily in lift infrastructure. France stood out, averaging a high number of chairlifts per resort. This aligns with its reputation as the host of massive ski areas like Les Trois Vallées. France doesn't just have a few mega-resorts; it seems to offer strong infrastructure across the board. On the other hand, some countries with lower average chairlifts might still have high-end resorts, but perhaps fewer of them, or more compact ski areas with a smaller lift footprint.

From an analytical standpoint, this kind of metric is useful when assessing the skiing experience quality beyond just price. If I prioritize access to terrain and lift capacity, chairlift count becomes a relevant metric. Combining this with earlier price and altitude insights gives a better picture of what each destination offers.

## Secondary Source

The Google Maps Places API is a powerful tool that allows users to query for a variety of places (such as restaurants, hotels, and gas stations) within a defined area. It is particularly useful for location-based apps or services, helping users find nearby places and get up-to-date data on locations, reviews, ratings, etc.

### How to Access Google Maps Places API

To use the Google Maps Places API, you'll need an API key. Follow these steps to set it up:

1.  Go to the "Google Cloud Console"
2.  Create a new project or select an existing one (top of the page)
3.  Enable the Google Places API (may have to create an account first)
4.  Generate an API key from the 'Credentials' section
5.  Use the key in your R script to authenticate your API calls
6.  Create a data frame and convert to a CSV file to better visualize the data

### Call to the API

In this example, I used the Google Maps Places API to search for hotels near a location in Cincinnati. I specified the location using latitude and longitude coordinates, set a search radius of 1000 meters, and defined the type of place I was interested in (hotels). The API responded with a JSON object containing details of the matching places, which I then parsed and stored in a data frame.

In the data frame, you will see a result of 20 hotels (each row represents one hotel) and 17 different variables (columns) describing the hotels. The Google Maps Places API provided me the name of the hotel, the address, the latitude/longitude, the status (operational or not), a rating, etc.

```{r}
#| label: call-to-api-thru-csv
#| include: FALSE

cincy_hotels <- read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/temmingc_xavier_edu/EWbGhz8AUmZLkST9jgP3AgkBm3_eHT0f7HF_60p3D-WH7g?download=1")
```

```{r}
#| label: alter
#| include: FALSE

# Create user ratings total bins
cincy_hotels <- cincy_hotels %>%
  mutate(user_ratings_total = cut(
    user_ratings_total,
    breaks = c(0, 1500, 3000, 4500, Inf),
    labels = c("1–1500", "1501–3000", "3001–4500", "4501+"),
    right = TRUE,
    include.lowest = TRUE
  )) %>%
  filter(!is.na(rating) & !is.na(user_ratings_total))

# Summarize average rating by the newly created user_ratings_total column
ratings_summary <- cincy_hotels %>%
  group_by(user_ratings_total) %>%
  summarize(avg_rating = mean(rating, na.rm = TRUE))
```

To better understand how hotel popularity relates to guest satisfaction, I created a bar chart that groups hotels by the number of user ratings they’ve received, then calculates the average rating within each group. This lets me analyze trends across popularity tiers: whether hotels with more reviews tend to have better, worse, or similar ratings compared to lesser-known spots. It’s a way to balance quantity and quality in a more structured view.

```{r}
#| label: col
#| include: TRUE
#| echo: FALSE

# Now plot it with geom_col
ggplot(ratings_summary, aes(x = user_ratings_total, y = avg_rating)) +
  geom_col(fill = "#F39C12") +
  labs(
    title = "Average Rating by User Rating Volume Group",
    x = "User Ratings Group",
    y = "Average Rating"
  ) +
  theme_minimal()
```

The chart clearly shows that hotels with moderate to high numbers of reviews tend to maintain solid average ratings, often above 4.0. Interestingly, the group with the highest volume of reviews doesn’t always have the absolute highest average rating, but it’s still strong, suggesting that popular hotels manage to satisfy guests at scale. On the other hand, some of the lower-review-count groups show slightly more fluctuation in average ratings, which could be due to less consistent service, newer establishments, or just the natural variability that comes with fewer data points.

This kind of grouped analysis is particularly valuable when making travel recommendations (for my family). A high rating alone isn’t always enough; if it’s based on just a handful of reviews, the insight is weaker. But seeing strong average ratings in well-reviewed, popular hotels gives me more confidence that these places consistently deliver. For customers who want a balance between reliability and quality, this visualization offers an evidence-based starting point to narrow down options.

## Final Thoughts

Bringing the analysis full circle, this project gave me the opportunity to explore two distinct but equally practical use cases for location-based hotel data. First, I used a rich dataset of ski resorts across Europe to investigate pricing patterns, infrastructure, and destination quality. This was motivated by my interest in planning a potential ski trip after graduation: a personal reward and a data-driven vacation rolled into one. By analyzing how hotel prices relate to altitude, lift infrastructure, and geographic location, I was able to surface insights that will help me make an informed choice based not just on cost, but on the skiing experience itself.

I also leveraged the Google Places API to analyze hotel options right here in Cincinnati. With my graduation on the horizon, I’ll be hosting extended family from out of town, and I wanted a clear way to identify hotels that are not only well-rated but have a proven track record based on user volume. The rating-volume visualization helped highlight properties that consistently deliver strong experiences at scale, exactly the kind of reliability I want for family members making the trip to celebrate.

Together, these two datasets showcase the versatility of data science for both long-term planning and near-term logistics. Whether I’m comparing alpine villages across France and Austria or accommodations in my own backyard, these tools allow me to move beyond guesswork and into decisions grounded in evidence.