### GENERALIZED LINEAR MODEL
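library(dplyr)    # filter(), mutate()
library(ggplot2)  # ggplot(), geom_point(), geom_smooth()
# Read the data (adjust the path to where bike_data.csv is stored)
bike_data <- read.csv('D:/FALL 2023/STATISTICS/datasets/bike_data.csv')
# Recode Holiday as 1 (holiday) or 0 (not a holiday)
bike_data$Holiday <- as.numeric(bike_data$Holiday == "Holiday")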
# Create the linear regression model with Temperature as the explanatory variable
model <- lm(Rented.Bike.Count ~ Temperature, data = bike_data)
# Summarize the model
summary(model)
##
## Call:
## lm(formula = Rented.Bike.Count ~ Temperature, data = bike_data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1100.60  -336.57   -49.69   233.81  2525.19
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 329.9525     8.5411   38.63   <2e-16 ***
## Temperature  29.0811     0.4862   59.82   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 543.5 on 8758 degrees of freedom
## Multiple R-squared: 0.29, Adjusted R-squared: 0.29
## F-statistic: 3578 on 1 and 8758 DF, p-value: < 2.2e-16
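To show how the fitted line turns into predictions, here is a minimal sketch that plugs the coefficients printed in the summary into the regression equation. The variable names `intercept`, `slope`, and `temps` are introduced here for illustration, and the three temperatures are arbitrary example values, not taken from the data.

```r
# Coefficients copied from the summary output (rounded as printed there)
intercept <- 329.9525
slope     <- 29.0811

# Hypothetical temperatures (degrees), chosen only for illustration
temps <- c(0, 10, 20)

# Predicted rentals on the fitted line: intercept + slope * temperature
predicted <- intercept + slope * temps

print(data.frame(Temperature = temps, Predicted.Rentals = predicted))
```

With the fitted object still in memory, `predict(model, newdata = data.frame(Temperature = temps))` should give the same values, up to the rounding of the printed coefficients.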
### EVALUATION
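# Fit the same model on holiday observations only
model1 <- lm(Rented.Bike.Count ~ Temperature,
             data = filter(bike_data, Holiday == 1))
# Take R-squared from the holiday model (model1), which matches the plotted subset
rsquared <- summary(model1)$r.squared
bike_data |>
  filter(Holiday == 1) |>
  ggplot(mapping = aes(x = Temperature,
                       y = Rented.Bike.Count)) +
  geom_point() +
  # dashed gray line: straight-line (lm) fit
  geom_smooth(method = 'lm', color = 'gray', linetype = 'dashed',
              se = FALSE) +
  # default smoother for comparison
  geom_smooth(se = FALSE) +
  labs(title = "Temperature vs. Rented Bike Count",
       subtitle = paste("Linear Fit R-Squared =", round(rsquared, 3))) +
  theme_classic()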
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
Our lambda value is λ = 0.501, which is positive and close to 0.5, so let us apply a square-root transformation to the explanatory variable.
bike_data <- bike_data |>
mutate(Temperature_sqrt = sqrt(Temperature)) # add new variable
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `Temperature_sqrt = sqrt(Temperature)`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
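The warning arises because `sqrt()` is undefined for negative numbers, and the temperature column evidently contains sub-zero readings. The later plot warnings report 294 removed rows in the holiday subset, presumably these same observations. The sketch below reproduces the behaviour on a small made-up vector and shows two possible ways of handling it; the values in `temps` and both options are illustrative assumptions, not steps taken in this analysis.

```r
# Hypothetical temperatures, including a sub-zero value (not from bike_data)
temps <- c(-5.2, 0, 4, 16)

# sqrt() returns NaN (with a warning) for negative input
plain_sqrt <- suppressWarnings(sqrt(temps))
print(plain_sqrt)

# Option 1 (assumption): keep only values where the transform is defined
defined <- temps[temps >= 0]
print(sqrt(defined))

# Option 2 (assumption): shift so the minimum becomes 0 before transforming;
# this changes the meaning of the variable and should be reported
shifted_sqrt <- sqrt(temps - min(temps))
print(shifted_sqrt)
```

Option 1 discards data, while Option 2 keeps every row but changes what the transformed variable measures, so the choice affects how any resulting coefficient should be read.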
# Refit on the holiday subset with the transformed variable; lm() drops rows
# where Temperature_sqrt is NaN (negative temperatures)
model_sqrt <- lm(Rented.Bike.Count ~ Temperature_sqrt + Temperature,
                 data = filter(bike_data, Holiday == 1))
# Note: this R-squared is for the two-term model above, not only the dashed line
rsquared <- summary(model_sqrt)$r.squared
bike_data |>
  filter(Holiday == 1) |>
  ggplot(mapping = aes(x = sqrt(Temperature),
                       y = Rented.Bike.Count)) +
  geom_point() +
  geom_smooth(method = 'lm', color = 'gray', linetype = 'dashed',
              se = FALSE) +
  geom_smooth(se = FALSE) +
  labs(title = "Rented.Bike.Count vs. sqrt(Temperature)",
       subtitle = paste("Linear Fit R-Squared =", round(rsquared, 3))) +
  theme_classic()
## Warning in sqrt(Temperature): NaNs produced
## Warning in sqrt(Temperature): NaNs produced
## Warning in sqrt(Temperature): NaNs produced
## Warning in sqrt(Temperature): NaNs produced
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 294 rows containing non-finite values (`stat_smooth()`).
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 294 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 294 rows containing missing values (`geom_point()`).
The scatter plot for the square-root transformation is similarly close to monotonic. In conclusion: "If \(\lambda \approx 1\), no transformation is needed."
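For reference, the quoted rule reads like one rung of the usual ladder of power transformations. The source does not say how \(\lambda\) was obtained, so the mapping below is an assumption about the convention in use, with \(x\) standing for the variable being transformed:

\[
\lambda = 1:\ x \ (\text{no transformation}), \qquad
\lambda = 0.5:\ \sqrt{x}, \qquad
\lambda = 0:\ \log x, \qquad
\lambda = -1:\ 1/x
\]

Under this reading, a value of 0.501 points to the square root, which is the transformation tried in this section.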
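### INTERPRETATION
For each one-degree increase in temperature, the model predicts an increase of approximately 29.08 rented bikes on average (the Temperature coefficient, 29.0811). Temperature is the only explanatory variable in this model, so the estimate does not adjust for other factors.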
This implies that as the temperature rises, more people are likely to rent bikes, which is a positive relationship between temperature and the number of rented bikes. It’s important to note that this interpretation assumes a linear relationship between temperature and bike rentals. The coefficient represents the estimated change in the response variable based on the dataset and model used.
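To make the slope interpretation concrete, the fitted line from the summary output can be written out and differenced. The symbols \(\hat{y}\) (predicted rentals) and \(T\) (temperature) are notation introduced here for illustration, and the 5-degree change is an arbitrary example:

\[
\begin{aligned}
\hat{y}(T) &= 329.9525 + 29.0811\,T \\
\hat{y}(T+1) - \hat{y}(T) &= 29.0811 \\
\hat{y}(T+5) - \hat{y}(T) &= 5 \times 29.0811 = 145.4055
\end{aligned}
\]

The intercept cancels in each difference, so under the linear model the predicted change depends only on the size of the temperature change, not on the starting temperature.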