---
title: "HW3 - Simple Linear Regression with Punta Cana Data"
subtitle: "Analyzing the Relationship Between Bedrooms and Nightly Rate"
author: "David J. Gibbens"
date: "2026-04-07"
output: ioslides_presentation
---
Given the recent volatility in the stock market, I have decided to diversify my long‑term investment strategy by opening a self‑directed Roth IRA to purchase real estate in Punta Cana, Dominican Republic. My goal is to acquire a property that can operate as a profitable Airbnb rental while also appreciating in value over time.
To make informed investment decisions, I need to understand which property features — such as the number of bedrooms — make a listing more attractive to tourists and more lucrative as a short‑term rental. This analysis uses real Punta Cana listing data to explore how bedroom count relates to nightly rental rates, forming the foundation for more advanced modeling in Project 1.
##     bedrooms      ttm_avg_rate
##  Min.   :1.000   Min.   :  18.30
##  1st Qu.:1.000   1st Qu.:  63.20
##  Median :2.000   Median :  99.25
##  Mean   :2.147   Mean   : 177.45
##  3rd Qu.:3.000   3rd Qu.: 164.88
##  Max.   :7.000   Max.   :2738.10
## `geom_smooth()` using formula = 'y ~ x'
##
## Call:
## lm(formula = ttm_avg_rate ~ bedrooms, data = pdc_small)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -384.74 -102.09  -41.81   98.40 2017.90
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -231.11      40.50  -5.707 4.41e-08 ***
## bedrooms      190.26      16.72  11.377  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 257.9 on 188 degrees of freedom
## Multiple R-squared: 0.4077, Adjusted R-squared: 0.4046
## F-statistic: 129.4 on 1 and 188 DF, p-value: < 2.2e-16
Using the coefficient estimates from the regression output:

- Intercept (β0) = -231.11
- Slope (β1) = 190.26
\[ \hat{Y} = -231.11 + 190.26 \cdot \text{Bedrooms} \]
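As a quick illustration, the fitted equation can be turned into a small prediction helper. This is a minimal sketch: the coefficients are copied from the regression output, while the function name `predict_rate` is a hypothetical name chosen for this example.

```r
# Coefficients copied from the regression output above
b0 <- -231.11  # intercept
b1 <- 190.26   # slope per bedroom

# Hypothetical helper: predicted TTM average nightly rate for a bedroom count
predict_rate <- function(bedrooms) {
  b0 + b1 * bedrooms
}

predict_rate(c(2, 3, 4))
# [1] 149.41 339.67 529.93
```

Note that `predict_rate(1)` is negative (-40.85), so predictions at the low end of the bedroom range should be read with caution.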
The estimated slope for bedrooms is 190.26.
This means:
For each additional bedroom, the predicted TTM average nightly rate increases by about $190.26.
Because the p‑value for the slope is extremely small (\(< 2 \times 10^{-16}\)), the relationship between bedrooms and nightly rate is statistically significant.
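A confidence interval gives a sense of how precisely the slope is estimated. The sketch below rebuilds an approximate 95% interval from the reported estimate, standard error, and residual degrees of freedom; the variable names are illustrative, not taken from the original analysis.

```r
# Values copied from the regression output above
slope    <- 190.26
se_slope <- 16.72
df_resid <- 188

# Two-sided 95% interval: estimate +/- t critical value * standard error
t_crit   <- qt(0.975, df = df_resid)
ci_slope <- slope + c(-1, 1) * t_crit * se_slope

round(ci_slope, 2)
```

The interval runs from roughly $157 to $223 per additional bedroom and excludes zero, which is consistent with the very small p-value. With the fitted `lm` object at hand, `confint()` should give the same interval without rounding error.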
The intercept is -231.11, which represents the predicted nightly rate when the number of bedrooms is zero.
This value is not meaningful in a real‑estate context: a nightly rate cannot be negative, and zero bedrooms lies outside the observed range of 1 to 7. It is required mathematically to anchor the regression line.
The regression model has an R-squared value of 0.4077.
This means:
Approximately 40.8% of the variation in TTM average nightly rate is explained by differences in the number of bedrooms.
This indicates a moderately strong relationship for a single‑predictor model.
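Two quick checks tie this figure to the rest of the output. With a single predictor, R-squared equals the squared sample correlation, and the adjusted R-squared follows from \(n = 190\) listings (188 residual degrees of freedom plus 2 estimated coefficients). The correlation is positive because the slope is positive:

\[ r = \sqrt{R^2} = \sqrt{0.4077} \approx 0.64 \]

\[ R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - 2} = 1 - 0.5923 \cdot \frac{189}{188} \approx 0.4046 \]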
The F-statistic is 129.4 with a p-value < 2.2e-16.
This indicates:
The overall regression model is statistically significant, meaning bedroom count is a statistically significant predictor of nightly rate.
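In a simple linear regression, the overall F-test and the t-test on the slope are the same test, which can be checked against the slope's t value in the coefficient table:

\[ F = t^2 = (11.377)^2 \approx 129.4 \]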
Based on this simple linear regression analysis, there is a clear positive relationship between bedroom count and nightly rate, and this model provides a strong first step in understanding pricing dynamics in this market.
To ensure that the simple linear regression model is appropriate for this analysis, it is important to evaluate whether the model assumptions are reasonably met.
Including diagnostic plots strengthens the credibility of the analysis and provides a more complete understanding of model performance.
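One common way to check the assumptions is with base R's built-in diagnostic plots for `lm` objects. The sketch below is only illustrative: it uses simulated stand-in data (`pdc_demo`, with made-up coefficients) so that it runs on its own, and `plot_diagnostics` is a hypothetical helper name. Substituting the real fitted model is left as an assumption about how the original analysis was set up.

```r
# Stand-in data so the sketch runs on its own; NOT the Punta Cana listings
set.seed(1)
pdc_demo <- data.frame(bedrooms = sample(1:7, 190, replace = TRUE))
pdc_demo$ttm_avg_rate <- 50 + 60 * pdc_demo$bedrooms + rnorm(190, sd = 40)

fit_demo <- lm(ttm_avg_rate ~ bedrooms, data = pdc_demo)

# Hypothetical helper: draw two standard diagnostic plots side by side
plot_diagnostics <- function(fit) {
  old_par <- par(mfrow = c(1, 2))
  on.exit(par(old_par))   # restore the plotting layout afterwards
  plot(fit, which = 1)    # residuals vs. fitted: linearity, constant variance
  plot(fit, which = 2)    # normal Q-Q: approximate normality of residuals
  invisible(fit)
}

plot_diagnostics(fit_demo)
```

In the regression output above, the residuals range from -384.74 to 2017.90 with a median of -41.81, which hints at right skew that a Q-Q plot of the real model would help confirm.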
The fitted vs. actual plot compares predicted nightly rates to the true nightly rates.
Points that fall close to the dashed 45‑degree line indicate accurate predictions.
A clear upward trend supports the conclusion that bedroom count is positively associated with nightly rate.
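A fitted vs. actual plot of this kind can be sketched in base R as follows. This is not the original plotting code: it uses simulated stand-in data (`pdc_demo`, with made-up coefficients) and base graphics purely so the example is self-contained.

```r
# Stand-in data so the sketch runs on its own; NOT the Punta Cana listings
set.seed(42)
pdc_demo <- data.frame(bedrooms = sample(1:7, 190, replace = TRUE))
pdc_demo$ttm_avg_rate <- 50 + 60 * pdc_demo$bedrooms + rnorm(190, sd = 40)

fit_demo <- lm(ttm_avg_rate ~ bedrooms, data = pdc_demo)

# Pair each observed rate with the model's fitted value
fva <- data.frame(
  actual = pdc_demo$ttm_avg_rate,
  fitted = fitted(fit_demo)
)

plot(fva$fitted, fva$actual,
     xlab = "Fitted nightly rate", ylab = "Actual nightly rate",
     main = "Fitted vs. actual (stand-in data)")
abline(a = 0, b = 1, lty = 2)  # dashed 45-degree reference line
```

Because bedroom count is the only predictor and takes integer values, the fitted values fall into one vertical band per bedroom count rather than a continuous cloud.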
Questions or comments are welcome.