This R Markdown file is a collection of useful R code snippets, tips, and best practices. I will be adding to it throughout the semester.
You can create a variable in R using the <- operator:
num_vec <- c(1, 2, 3, 4, 5)
char_vec <- c("apple", "banana", "cherry")
df <- data.frame(
  ID = 1:3,
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 30, 22)
)
df
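Once a data frame exists, a few base R helpers make it easy to check what it holds. The sketch below rebuilds the same small data frame under the illustrative name people (not part of the original notes) and inspects its structure and columns.

```r
# Rebuild the small example data frame; the name `people` is illustrative only
people <- data.frame(
  ID = 1:3,
  Name = c("John", "Jane", "Doe"),
  Age = c(25, 30, 22)
)

str(people)                # column types and a preview of the values
n_rows <- nrow(people)     # number of rows
ages <- people$Age         # extract one column as a vector with $

# Keep only the row(s) with the largest Age
oldest <- people[people$Age == max(people$Age), ]
print(oldest)
```

Indexing with `people[rows, ]` keeps every column, while `people$Age` returns a plain vector.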
x <- sample(-10:10, 1)
x
[1] 1
if (x > 0) {
  print("x is positive")
} else {
  print("x is non-positive")
}
[1] "x is positive"
x
[1] 1
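Because sample() draws a new value on every run, the output above will differ between sessions. A minimal sketch, assuming a fixed seed is acceptable for these notes: set.seed() makes the draw repeatable, and an else if branch separates zero from negative values. The helper name describe_sign and the seed 42 are illustrative choices, not part of the original.

```r
# Hypothetical helper: classify a single number as positive, zero, or negative
describe_sign <- function(x) {
  if (x > 0) {
    "x is positive"
  } else if (x == 0) {
    "x is zero"
  } else {
    "x is negative"
  }
}

set.seed(42)                      # arbitrary seed, chosen only for illustration
first_draw <- sample(-10:10, 1)

set.seed(42)                      # resetting the seed reproduces the same draw
second_draw <- sample(-10:10, 1)

print(describe_sign(first_draw))
print(first_draw == second_draw)  # the two draws match
```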
# ggplot practice
cars
summary(cars)
     speed           dist       
 Min.   : 4.0   Min.   :  2.00  
 1st Qu.:12.0   1st Qu.: 26.00  
 Median :15.0   Median : 36.00  
 Mean   :15.4   Mean   : 42.98  
 3rd Qu.:19.0   3rd Qu.: 56.00  
 Max.   :25.0   Max.   :120.00  
install.packages("tidyverse")
Error in install.packages : Updating loaded packages
install.packages("forecast")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/Ondrej/AppData/Local/R/win-library/4.2’
(as ‘lib’ is unspecified)
also installing the dependencies ‘xts’, ‘TTR’, ‘quadprog’, ‘quantmod’, ‘fracdiff’, ‘lmtest’, ‘timeDate’, ‘tseries’, ‘urca’, ‘zoo’, ‘RcppArmadillo’
There are binary versions available but the source versions are later:
Binaries will be installed
package ‘xts’ successfully unpacked and MD5 sums checked
package ‘TTR’ successfully unpacked and MD5 sums checked
package ‘quadprog’ successfully unpacked and MD5 sums checked
package ‘quantmod’ successfully unpacked and MD5 sums checked
package ‘fracdiff’ successfully unpacked and MD5 sums checked
package ‘lmtest’ successfully unpacked and MD5 sums checked
package ‘timeDate’ successfully unpacked and MD5 sums checked
package ‘tseries’ successfully unpacked and MD5 sums checked
package ‘urca’ successfully unpacked and MD5 sums checked
package ‘zoo’ successfully unpacked and MD5 sums checked
package ‘RcppArmadillo’ successfully unpacked and MD5 sums checked
package ‘forecast’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\Ondrej\AppData\Local\Temp\RtmpquS9kW\downloaded_packages
install.packages("tidyverse")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Warning in install.packages :
package ‘tidyverse’ is in use and will not be installed
install.packages("ggplot2")
Error in install.packages : Updating loaded packages
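The "Updating loaded packages" error and the "is in use and will not be installed" warning above typically appear when install.packages() is called for a package that is already loaded in the session. One common way to avoid this is to check what is missing first and install only that. A minimal sketch, assuming a hypothetical helper named missing_packages (not from the original notes):

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse", "forecast"))

if (length(to_install) > 0) {
  message("Not installed yet: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install, ideally in a fresh session
}
```

The install call is left commented out so that running the chunk never triggers a download by accident. In an R Markdown file, installation is usually done once in the console rather than in a chunk that runs on every knit.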
library(tidyverse)
library(dplyr)
library(forecast)
Warning: package ‘forecast’ was built under R version 4.2.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
library(ggplot2)
model <- lm(pressure ~ wind, data = storms)
model
Call:
lm(formula = pressure ~ wind, data = storms)
Coefficients:
(Intercept)         wind  
  1027.7029      -0.6827  
intercept <- coef(model)[1]
slope <- coef(model)[2]
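The two coefficients extracted above define the fitted line, so a prediction is simply intercept + slope * x. The sketch below checks this against predict(). It uses the built-in cars data set instead of storms so that it runs without extra packages; the names fit, b0, b1, and speed_new are illustrative.

```r
fit <- lm(dist ~ speed, data = cars)

b0 <- coef(fit)[["(Intercept)"]]  # intercept, extracted by name
b1 <- coef(fit)[["speed"]]        # slope for speed

speed_new <- c(10, 20)
by_hand <- b0 + b1 * speed_new
by_predict <- predict(fit, data.frame(speed = speed_new))

# The two approaches should agree up to floating-point error
print(all.equal(by_hand, unname(by_predict)))
```

Extracting coefficients by name rather than by position keeps the code correct if more predictors are added to the formula later.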
storms_filtered <- storms %>%
  arrange(desc(pressure))
ggplot(storms, aes(x = wind, y = pressure)) +
  geom_point() + # Adds points
  geom_smooth(method = "lm", col = "red", lwd = 2) + # Regression line
  # Same fitted line, drawn from the lm() coefficients extracted above
  geom_abline(intercept = intercept, slope = slope, col = "white", lty = 1, lwd = 2) +
  labs(title = "Wind Speed vs. Pressure in Storms",
       x = "Wind Speed",
       y = "Pressure")
`geom_smooth()` using formula = 'y ~ x'
# Drop rows with missing values before plotting and modelling
storms_clean <- storms %>%
  filter(!is.na(pressure) & !is.na(wind) & !is.na(year))
ggplot(storms_clean, aes(x = year, y = pressure)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  labs(title = "Pressure vs Year", x = "Year", y = "Pressure (mb)")
`geom_smooth()` using formula = 'y ~ x'
# Fit a linear regression model
model <- lm(pressure ~ wind + year, data = storms_clean)
# Display the summary of the model
summary(model)
Call:
lm(formula = pressure ~ wind + year, data = storms_clean)
Residuals:
    Min      1Q  Median      3Q     Max 
-43.302  -3.030   0.872   3.979  27.521 
Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.147e+03  7.959e+00  144.15   <2e-16 ***
wind        -6.836e-01  1.956e-03 -349.42   <2e-16 ***
year        -5.967e-02  3.973e-03  -15.02   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.886 on 19063 degrees of freedom
Multiple R-squared: 0.865, Adjusted R-squared: 0.865
F-statistic: 6.106e+04 on 2 and 19063 DF, p-value: < 2.2e-16
# Example of predicting pressure for specific wind and year values
new_data <- data.frame(wind = c(50, 100), year = c(2015, 2020))
predictions <- predict(model, new_data)
# Display the predictions
print(predictions)
       1        2 
992.8072 958.3311 
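predict() as called above returns point estimates only. For a sense of the uncertainty around them, predict() on an lm model also accepts an interval argument. A minimal sketch on the built-in cars data, used here so the example is self-contained; fit and new_speeds are illustrative names:

```r
fit <- lm(dist ~ speed, data = cars)
new_speeds <- data.frame(speed = c(10, 20))

# interval = "prediction" adds lower and upper bounds for a new observation
pred <- predict(fit, new_speeds, interval = "prediction", level = 0.95)
print(pred)  # matrix with columns fit, lwr, upr
```

Using interval = "confidence" instead gives the narrower band for the mean response rather than for a single new observation.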
# Plot diagnostic plots for the model
par(mfrow = c(2, 2))
plot(model)
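par(mfrow = c(2, 2)) stays in effect for later plots until it is changed back. A small sketch, assuming you want to restore the previous layout afterwards. It uses the built-in cars data and the illustrative names fit and old_par, and it opens a null PDF device only so that it also runs outside RStudio:

```r
fit <- lm(dist ~ speed, data = cars)

pdf(NULL)                          # null device for non-interactive runs; omit in RStudio
old_par <- par(mfrow = c(2, 2))    # par() returns the previous settings
plot(fit)                          # the four diagnostic plots fill the 2 x 2 grid
par(old_par)                       # restore the earlier layout
layout_after <- par("mfrow")
dev.off()                          # close the null device again
```

Saving the value returned by par() is safer than hard-coding par(mfrow = c(1, 1)), because it restores whatever layout was active before.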