library(leaflet)
library(ggplot2)
library(GGally)
library(tidyverse)
library(plotly)
library(DT)
## read in data
lvdata <- read.csv("PrivateClubData.csv",as.is = FALSE)
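## preprocessing
## negate the longitudes so the Las Vegas courses plot in the western hemisphere
lvdata$Long <- -lvdata$Long
## label each course as Private or Public
lvdata$Type <- "Private"
lvdata$Type[lvdata$Private=="N"] <- "Public"
## ten year cost = initiation fee plus 10 years of annual dues (dining minimums excluded)
lvdata$TenYearCost <- lvdata$InitiationFee + 10*lvdata$AnnualDues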
This graphic shows information and locations for many of the private courses and some public courses that offer a yearly pass. Information about each course can be obtained by clicking on it.
## marker color by course type: green = private, red = public
getColor <- function(lvdata) {
  ifelse(lvdata$Private == "Y", "green", "red")
}
icons <- awesomeIcons(
icon = 'flag',
iconColor = 'white',
library = 'ion',
markerColor = getColor(lvdata)
)
leaflet(lvdata) %>% addTiles() %>%
addAwesomeMarkers(~Long, ~Lat, icon=icons,
popup =~paste0(lvdata$Club, "<br>", lvdata$Type, "<br>",
"Initiation Fee: $", lvdata$InitiationFee, "<br>",
"Annual Dues: $", lvdata$AnnualDues, "<br>",
"10 Year Cost: $", format(lvdata$TenYearCost, scientific = FALSE),"<br>",
"Google Rating (",lvdata$GNumRatings,") : ",lvdata$GoogleRating,"<br>",
"GolfPass Rating (",lvdata$GPNumRatings,") : ",lvdata$GolfPassRating,"<br>",
"# Holes: ", lvdata$Holes, "<br>",
"Monthly Dining Min: $", lvdata$DiningMinimum, "<br>",
lvdata$Note)) %>%
addLegend(position="bottomright",
colors=c("green","red"), labels=c("Private","Public"))
A regression analysis was performed to see which factors influence cost. Because cost consists of an initiation fee and annual dues, the cost explored was the ten year cost (initiation fee plus 10 years of dues, not including dining minimums). The biggest factor in cost, as expected, was whether or not the facility was private. All private facilities provide much more than just golf (e.g. social atmosphere, training and practice facilities, racquet sports, restaurants), so that should be factored in when considering cost. The next factor that influenced cost was Google rating. When both of these factors are considered, 77% of the variation in 10 year cost is explained. The remaining 23% of the variation is not explained by the model and may be due to other factors, like location or clientele. The regression equation for this prediction is:
\[ \underbrace{Y_i}_{\text{TenYearCost}} = -322627.33 + 92926.67 \underbrace{X_{i1}}_{\text{GoogleRating}} + 93725.33 \underbrace{X_{i2}}_{\text{private=1, public=0}} \]
The first number in this equation is the y-intercept, which means that if the Google rating is 0, the ten year cost for a public course would be $-322627 on average. Y-intercepts often do not make sense on their own, as in this case, but are needed to fit the line appropriately. The second number is the slope. This means that as Google rating goes up 1 point, the ten year cost goes up $92927 on average. The model uses the same slope for private and public courses. The third number is the change in y-intercept when going from public to private courses. This means that private courses will cost an additional $93725 over 10 years on average when compared to public courses with the same Google rating.
This equation can be used to estimate the average cost of a course. For example, a course like Bear’s Best, as it transitions from public to private, has a Google rating of 4.5. Plugging these values into the equation gives -322627.33 + 92926.67(4.5) + 93725.33(1), so the average ten year cost for a course like this would be $189268.
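The arithmetic in this example can be reproduced with a few lines of R. This is a minimal sketch that hard-codes the rounded coefficients from the regression equation; the helper name `predict_ten_year_cost` is made up for illustration and is not part of the analysis code.

```r
# Coefficients copied from the regression equation in the text (rounded to 2 decimals)
b_intercept <- -322627.33  # y-intercept (public course, Google rating of 0)
b_rating    <-   92926.67  # change in ten year cost per 1-point increase in Google rating
b_private   <-   93725.33  # additional ten year cost for a private course

# Hypothetical helper: average ten year cost for a given rating and course type
predict_ten_year_cost <- function(google_rating, private) {
  b_intercept + b_rating * google_rating + b_private * as.numeric(private)
}

# A private course with a Google rating of 4.5, as in the Bear's Best example
bears_best <- predict_ten_year_cost(4.5, private = TRUE)
round(bears_best)
```

The rounded result matches the $189268 figure quoted in the text. Calling the helper with `private = FALSE` gives the corresponding average for a public course with the same rating.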
The plot below shows the data and how the regression equation works. Each line represents the predictions (expected averages) for public (red) and private (green) courses. The plot shows which courses cost more than expected given their Google rating and which cost less. The lower green dot representing Canyon Gate GC shows that this course costs much less than expected. Hovering over the dot displays information about the course, including how far the price is from the prediction (average), which is -81661. This means that its cost is $81661 lower than expected over 10 years for a private course with a similar Google rating of 4.6.
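The "price from predicted" value is a residual: the actual ten year cost minus the model's prediction. The sketch below illustrates this on a small made-up data set (the values are hypothetical and do not come from PrivateClubData.csv). It uses `na.action = na.exclude`, which is one possible way to keep residuals lined up with the original rows when some ratings are missing; the analysis code itself aligns them by row name instead.

```r
# Toy data (hypothetical values for illustration only)
toy <- data.frame(
  GoogleRating = c(4.2, 4.5, NA, 4.8, 4.6, 4.4),
  Private      = factor(c("N", "Y", "Y", "Y", "N", "Y")),
  TenYearCost  = c(60000, 180000, 150000, 260000, 90000, 140000)
)

# na.exclude drops incomplete rows for fitting but pads results back with NA
fit <- lm(TenYearCost ~ GoogleRating + Private, data = toy, na.action = na.exclude)

toy$Predicted <- fitted(fit)     # expected average cost; NA where the rating is missing
toy$Residuals <- residuals(fit)  # actual minus predicted; negative = cheaper than expected

toy
```

A negative residual means the course costs less than the model expects for its rating and type, which is how the Canyon Gate GC value should be read.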
## run regression model
regdata <- select(lvdata, c("TenYearCost","GoogleRating","YelpRating","Private",
"GolfPassRating","Holes","Tennis","DiningMinimum"))
regmod <- lm(TenYearCost~GoogleRating + Private,
data=regdata)
#summary(regmod)
coefs <- coef(regmod)
## store the residuals
lvdata$Residuals <- NA
lvdata$Residuals[as.numeric(names(regmod$residuals))] <- round(regmod$residuals,2)
## plot
plot_ly(data=lvdata, x=~GoogleRating, y=~TenYearCost, color=~Private,
colors=c("red","green"),
text=paste0(lvdata$Club,"\n","Initiation Fee: $",lvdata$InitiationFee,"\n",
"Annual Dues: $",lvdata$AnnualDues, "\n",
"Price From Predicted: $",round(lvdata$Residuals,0), "\n",
"Ten Year Cost: $",lvdata$TenYearCost, "\n",
"Google Rating:",lvdata$GoogleRating, "\n",
"GolfPass Rating:",lvdata$GolfPassRating)) %>%
layout(title="Las Vegas Private and Public Course 10 Year Cost",
xaxis=list(title="Google Rating"),
yaxis=list(title="Total 10 Year Cost"),
shapes=list(list(type="line", x0=4,x1=5,
y0=coefs[1]+coefs[2]*4,y1=coefs[1]+coefs[2]*5,
line=list(color="red",width=1)),
list(type="line", x0=4, x1=5,
y0=coefs[1]+coefs[3]+coefs[2]*4,
y1=coefs[1]+coefs[3]+coefs[2]*5,
line=list(color="green",width=1))),
legend=list(title=list(text='<b> Private </b>')),
annotations=list(x=4.9,y=52000,showarrow=FALSE,
text=paste0("r",'<sup> 2 </sup>',"=",
round(summary(regmod)$r.squared,2)*100,"%")))
Google, Yelp, and GolfPass ratings were collected for each course. Note that you can hover over the points to see more information. The number of ratings is shown in parentheses next to each rating. This analysis compares these ratings to determine whether they are correlated, as expected.
In the first graphic, the Google and Yelp ratings are positively correlated (r=0.75). However, Chimera GC has a much lower Google rating than its Yelp rating would indicate.
The second graphic shows that the Google and GolfPass ratings are only slightly correlated (r=0.23). Chimera GC again has a much lower Google rating than its GolfPass rating would indicate. It is interesting to see that ratings are not always consistent among different sources.
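For a regression with a single predictor, the correlation r has the same magnitude as the square root of the model's r-squared, with the sign of the slope. The sketch below checks this on made-up ratings (hypothetical values, not the course data) and compares it with a direct call to `cor()`, which reports the sign as well.

```r
# Toy ratings (hypothetical values for illustration only)
toy <- data.frame(
  YelpRating   = c(3.5, 4.0, 4.5, NA, 4.0, 5.0),
  GoogleRating = c(4.1, 4.4, 4.6, 4.3, 4.2, 4.8)
)

# lm() drops the row with the missing Yelp rating by default
fit <- lm(GoogleRating ~ YelpRating, data = toy)

# r recovered from the fit: square root of r-squared, signed by the slope
r_from_fit <- unname(sign(coef(fit)[2]) * sqrt(summary(fit)$r.squared))

# r computed directly, using only rows where both ratings are present
r_direct <- cor(toy$YelpRating, toy$GoogleRating, use = "complete.obs")

all.equal(r_from_fit, r_direct)
```

Taking the square root alone always gives a non-negative number, so it only matches r when the relationship is positive, as it is in both graphics here.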
lm1 <- lm(GoogleRating ~ YelpRating, data=lvdata)
coef1 <- coef(lm1)
plot_ly(data=lvdata, x=~YelpRating, y=~GoogleRating, color=~Private,
colors=c("red","green"),
text=paste0(lvdata$Club,"\n","Initiation Fee: $",lvdata$InitiationFee,"\n",
"Annual Dues: $",lvdata$AnnualDues, "\n",
"Google Rating (",lvdata$GNumRatings,"): ",lvdata$GoogleRating, "\n",
"Yelp Rating (",lvdata$YNumRatings,"): ",lvdata$YelpRating, "\n",
"GolfPass Rating (",lvdata$GPNumRatings,"): ",lvdata$GolfPassRating)) %>%
layout(title="Rating Comparison: Yelp vs Google",
xaxis=list(title="Yelp Rating"),
yaxis=list(title="Google Rating"),
shapes=list(list(type="line", x0=3,x1=5,
y0=coef1[1] + coef1[2]*3,y1=coef1[1] + coef1[2]*5,
line=list(color="black",width=1))),
annotations=list(x=4.8,y=4.1,showarrow=FALSE,
text=paste0("r=",round(sqrt(summary(lm1)$r.squared),2))),
legend=list(title=list(text='<b> Private </b>')))
lm2 <- lm(GoogleRating ~ GolfPassRating, data=lvdata)
coef2<- coef(lm2)
plot_ly(data=lvdata, x=~GolfPassRating, y=~GoogleRating, color=~Private,
colors=c("red","green"),
text=paste0(lvdata$Club,"\n","Initiation Fee: $",lvdata$InitiationFee,"\n",
"Annual Dues: $",lvdata$AnnualDues, "\n",
"Google Rating (",lvdata$GNumRatings,"): ",lvdata$GoogleRating, "\n",
"Yelp Rating (",lvdata$YNumRatings,"): ",lvdata$YelpRating, "\n",
"GolfPass Rating (",lvdata$GPNumRatings,"): ",lvdata$GolfPassRating)) %>%
layout(title="Rating Comparison: GolfPass vs Google",
xaxis=list(title="GolfPass Rating"),
yaxis=list(title="Google Rating"),
shapes=list(list(type="line", x0=3,x1=5,
y0=coef2[1] + coef2[2]*3,y1=coef2[1] + coef2[2]*5,
line=list(color="black",width=1))),
annotations=list(x=4.8,y=4.1,showarrow=FALSE,
text=paste0("r=",round(sqrt(summary(lm2)$r.squared),2))),
legend=list(title=list(text='<b> Private </b>')))
Data was collected in January of 2025.
datatable(lvdata[,c("Club","Holes","Type","TenYearCost","InitiationFee",
"AnnualDues","DiningMinimum","Range","PracticeArea","FitnessCenter",
"Pool","Tennis","Pickleball","GoogleRating","GNumRatings","GolfPassRating",
"GPNumRatings","YelpRating","YNumRatings","Note")],
options=list(lengthMenu = c(3,10,30),pageLength = 20,
scrollY=400,scroller=TRUE,scrollX=TRUE),
extensions="Scroller")