Heteroskedastisitas (Breusch–Pagan), multikolinearitas (VIF), pengaruh (Cook’s D), normalitas (QQ-plot). Laporkan temuan & rekomendasi.
#Data
library(car)
## Loading required package: carData
data(mtcars)
head(mtcars)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
#Model regresi
fit <- lm(mpg ~ ., data = mtcars)
vif(fit)
## cyl disp hp drat wt qsec vs am
## 15.373833 21.620241 9.832037 3.374620 15.164887 7.527958 4.965873 4.648487
## gear carb
## 5.357452 7.908747
Variabel disp memiliki nilai vifnya paling besar dan lebih dari 10 maka variabel disp akan dihapus.
fit1 <- lm(mpg ~ . -disp, data = mtcars)
vif(fit1)
## cyl hp drat wt qsec vs am gear
## 14.284737 7.123361 3.329298 6.189050 6.914423 4.916053 4.645108 5.324402
## carb
## 4.310597
Variabel cyl memiliki nilai vifnya paling besar dan lebih dari 10 maka variabel cyl akan dihapus.
fit2 <- lm(mpg ~ . -disp -cyl, data = mtcars)
vif(fit2)
## hp drat wt qsec vs am gear carb
## 6.015788 3.111501 6.051127 5.918682 4.270956 4.285815 4.690187 4.290468
Seluruh nilai VIF < 10, sehingga dapat disimpulkan bahwa model terbebas dari multikolinearitas (asumsi terpenuhi).
#Shapiro Wilk Test
res <- resid(fit2)
shapiro.test(res)
##
## Shapiro-Wilk normality test
##
## data: res
## W = 0.96495, p-value = 0.3727
#QQPlot
library(tibble)
diag_df <- tibble(fitted = fitted(fit2), resid = resid(fit2),
stdres = rstandard(fit2), hat = hatvalues(fit2),
cook = cooks.distance(fit2))
library(ggplot2)
qqplot <- ggplot(diag_df, aes(sample = stdres)) +
stat_qq() + stat_qq_line() + labs(x = "Theoretical", y = "Standardized Residuals")
qqplot
Karena p-value = 0.3727 > 0.05 & titik-titik residual cenderung mengikuti garis lurus diagonal maka residual berdistribusi normal (asumsi terpenuhi).
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
bptest(fit2)
##
## studentized Breusch-Pagan test
##
## data: fit2
## BP = 15.303, df = 8, p-value = 0.05352
ggplot(diag_df, aes(fitted, resid)) +
geom_point(alpha = 0.7) +
geom_hline(yintercept = 0, linetype = 2) +
labs(x = "Fitted", y = "Residuals")
Karena p-value = 0.05352 > 0.05 & titik-titik residual menyebar diatas dan bawah garis residual = 0 maka terima H₀. Artinya model terbebas dari heteroskedastisitas (asumsi terpenuhi).
cd <- cooks.distance(fit2)
cd
## Mazda RX4 Mazda RX4 Wag Datsun 710 Hornet 4 Drive
## 0.0181201869 0.0106924900 0.0732925719 0.0015811375
## Hornet Sportabout Valiant Duster 360 Merc 240D
## 0.0068566007 0.0468088736 0.0002614461 0.0348502452
## Merc 230 Merc 280 Merc 280C Merc 450SE
## 0.3303035090 0.0011746336 0.0241765241 0.0026879608
## Merc 450SL Merc 450SLC Cadillac Fleetwood Lincoln Continental
## 0.0033443994 0.0031841709 0.0060875746 0.0009283207
## Chrysler Imperial Fiat 128 Honda Civic Toyota Corolla
## 0.1443473904 0.0704888209 0.0231030848 0.1093415658
## Toyota Corona Dodge Challenger AMC Javelin Camaro Z28
## 0.0374106436 0.0108090861 0.0187710244 0.0011774350
## Pontiac Firebird Fiat X1-9 Porsche 914-2 Lotus Europa
## 0.0260170270 0.0034756825 0.0058809372 0.1632711396
## Ford Pantera L Ferrari Dino Maserati Bora Volvo 142E
## 0.3509199153 0.0001190531 0.1078346806 0.0685191182
#Titik yang patut ditinjau (rule of thumb)
which(cd > 4/length(cd))
## Merc 230 Chrysler Imperial Lotus Europa Ford Pantera L
## 9 17 28 29
plot(cd, ylab="Cook's D"); abline(h=4/length(cd), lty=2)
Terdapat 4 observasi yang memiliki nilai Di > ambangnya (4/n). Selanjutnya, akan dilakukan pengecekan apakah 4 observasi ini perlu dihapus atau tidak.
#dataset baru tanpa 4 mobil itu
influential_obs <- which(cd > (4/length(cd)))
mtcarss <- subset(mtcars, select = -c(disp, cyl))
mtcars_new <- mtcarss[-influential_obs, ]
fit3 <- lm(mpg ~ ., data = mtcars_new)
summary(fit2)
##
## Call:
## lm(formula = mpg ~ . - disp - cyl, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8187 -1.3903 -0.3045 1.2269 4.5183
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.80810 12.88582 1.072 0.2950
## hp -0.01225 0.01649 -0.743 0.4650
## drat 0.88894 1.52061 0.585 0.5645
## wt -2.60968 1.15878 -2.252 0.0342 *
## qsec 0.63983 0.62752 1.020 0.3185
## vs 0.08786 1.88992 0.046 0.9633
## am 2.42418 1.91227 1.268 0.2176
## gear 0.69390 1.35294 0.513 0.6129
## carb -0.61286 0.59109 -1.037 0.3106
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.566 on 23 degrees of freedom
## Multiple R-squared: 0.8655, Adjusted R-squared: 0.8187
## F-statistic: 18.5 on 8 and 23 DF, p-value: 2.627e-08
summary(fit3)
##
## Call:
## lm(formula = mpg ~ ., data = mtcars_new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4398 -1.5273 -0.0027 1.1378 4.2436
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -18.25532 21.70599 -0.841 0.4108
## hp 0.01297 0.01969 0.659 0.5179
## drat 2.40851 1.65495 1.455 0.1619
## wt -3.60201 1.28552 -2.802 0.0114 *
## qsec 2.03098 1.03338 1.965 0.0642 .
## vs -3.14638 2.40718 -1.307 0.2068
## am 1.10451 1.85774 0.595 0.5592
## gear 1.74121 1.57344 1.107 0.2823
## carb -0.81296 0.69539 -1.169 0.2568
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.299 on 19 degrees of freedom
## Multiple R-squared: 0.8959, Adjusted R-squared: 0.852
## F-statistic: 20.43 on 8 and 19 DF, p-value: 8.687e-08
#Plot untuk fit2
plot_fit2 <- ggplot(data.frame(mpg = mtcarss$mpg, fitted = fitted(fit2)),
aes(x = fitted, y = mpg)) + geom_point() +
geom_line(aes(y = fitted), color = "blue", linewidth = 1) +
ggtitle("Regression with Outliers")
#Plot untuk fit3
plot_fit3 <- ggplot(data.frame(mpg = mtcars_new$mpg, fitted = fitted(fit3)),
aes(x = fitted, y = mpg)) + geom_point() +
geom_line(aes(y = fitted), color = "red", linewidth = 1) +
ggtitle("Regression without Outliers")
#plot the two scatterplots side by side
library(gridExtra)
gridExtra::grid.arrange(plot_fit2, plot_fit3, ncol=2)
Terlihat bahwa setelah dilakukan penghapusan 4 observasi tidak memberikan pengaruh yang besar terhadap hasil maka data akhir yang terpilih ialah data awal dengan model kedua.
summary(fit2)
##
## Call:
## lm(formula = mpg ~ . - disp - cyl, data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8187 -1.3903 -0.3045 1.2269 4.5183
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.80810 12.88582 1.072 0.2950
## hp -0.01225 0.01649 -0.743 0.4650
## drat 0.88894 1.52061 0.585 0.5645
## wt -2.60968 1.15878 -2.252 0.0342 *
## qsec 0.63983 0.62752 1.020 0.3185
## vs 0.08786 1.88992 0.046 0.9633
## am 2.42418 1.91227 1.268 0.2176
## gear 0.69390 1.35294 0.513 0.6129
## carb -0.61286 0.59109 -1.037 0.3106
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.566 on 23 degrees of freedom
## Multiple R-squared: 0.8655, Adjusted R-squared: 0.8187
## F-statistic: 18.5 on 8 and 23 DF, p-value: 2.627e-08
Terlihat bahwa hanya wt yang memiliki pengaruh secara signifikan terhadap mpg dengan besar pengaruh sebesar -2.60968. Artinya setiap penambahan 1 unit berat mobil, rata-rata konsumsi bahan bakar (mpg) menurun sebesar 2.61 mil per galon, dengan asumsi variabel lain tetap konstan.
Terlihat juga nilai R-squared = 0.8655, artinya sekitar 86,55% variasi mpg dapat dijelaskan oleh variabel-variabel yang ada dalam model. Adjusted R-squared = 0.8187, memperhitungkan jumlah variabel dalam model, menunjukkan model cukup baik menjelaskan variasi data.
Berat mobil (wt) adalah faktor utama yang mempengaruhi konsumsi bahan bakar, sedangkan variabel lain tidak menunjukkan pengaruh yang signifikan pada data ini. Persamaan akhir regresi yang di dapat dalam penelitian ini ialah:
\[ \hat{mpg} = 13.8081 - 2.60968 \cdot wt \]
Bandingkan standar error dan p-value.
#Data
n <- 400
x1 <- runif(n,0,10)
x2 <- rnorm(n,5,2)
#Model regresi
sigma_i <- 0.6 + 0.2*x1
err <- rnorm(n, 0, sigma_i)
y <- 2 + 1.5*x1 - 1.2*x2 + err
sim <- tibble(y, x1, x2, w = 1/sigma_i^2)
fit_ols <- lm(y ~ x1 + x2, data = sim)
fit_wls <- lm(y ~ x1 + x2, data = sim, weights = w)
#Uji Breusch-Pagan untuk heteroskedastisitas
lmtest::bptest(fit_ols)
##
## studentized Breusch-Pagan test
##
## data: fit_ols
## BP = 54.953, df = 2, p-value = 1.167e-12
library(ggplot2)
ggplot(data.frame(fitted=fitted(fit_ols), resid=resid(fit_ols)),
aes(fitted, resid)) +
geom_point(alpha=.6) +
geom_smooth(se=FALSE, method="loess") +
geom_hline(yintercept = 0, linetype=2) +
labs(x="Fitted", y="Residuals")
## `geom_smooth()` using formula = 'y ~ x'
Terlihat bahwa terdapat heteroskedastisitas maka metode Robust (HC3) dan WLS perlu dilakukan.
library(sandwich) # buat vcovHC
library(lmtest) # buat coeftest
library(tibble) # buat tibble
comp_se <- tibble(
term = names(coef(fit_ols)),
OLS_SE = coef(summary(fit_ols))[, "Std. Error"],
Robust_SE = sqrt(diag(vcovHC(fit_ols, type = "HC3"))),
WLS_SE = coef(summary(fit_wls))[,"Std. Error"],
OLS_pval = coef(summary(fit_ols))[,"Pr(>|t|)"],
Robust_pval = coeftest(fit_ols, vcov. = vcovHC(fit_ols, type="HC3"))[,4],
WLS_pval = coef(summary(fit_wls))[,"Pr(>|t|)"])
comp_se
## # A tibble: 3 × 7
## term OLS_SE Robust_SE WLS_SE OLS_pval Robust_pval WLS_pval
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.270 0.230 0.174 3.77e- 11 1.69e- 14 3.98e- 22
## 2 x1 0.0301 0.0319 0.0246 7.73e-172 3.96e-163 3.36e-201
## 3 x2 0.0427 0.0417 0.0313 1.87e- 96 3.35e- 99 1.62e-132
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:gridExtra':
##
## combine
## The following object is masked from 'package:car':
##
## recode
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
#Plot Standard Error
df_SE <- comp_se %>%
select(term, OLS_SE, Robust_SE, WLS_SE) %>%
gather(Method, SE, -term)
ggplot(df_SE, aes(x = term, y = SE, color = Method, group = Method)) +
geom_point(size = 3) +
geom_line() +
theme_minimal() +
labs(title = "Perbandingan Standard Error",
y = "Standard Error", x = "Term")
#Plot p-value (log scale)
df_pval <- comp_se %>%
select(term, OLS_pval, Robust_pval, WLS_pval) %>%
gather(Method, pval, -term)
ggplot(df_pval, aes(x = term, y = pval, color = Method, group = Method)) +
geom_point(size = 3) +
geom_line() +
scale_y_log10() +
theme_minimal() +
labs(title = "Perbandingan p-value (log scale)",
y = "p-value (log scale)", x = "Term")
Standard Error
P-Value
Semua variabel signifikan dengan terdapat perbedaan urutan p-value. SE WLS paling kecil dibanding metode lain maka WLS paling “efisien” dibandingkan OLS & Robust (HC3) jika bobot tetap.
Modelkan mpg ~ wt dan bandingkan AIC/BIC + 5-fold CV
library(splines)
fit_poly <- lm(mpg ~ poly(wt, 2, raw = TRUE), data = mtcars)
fit_spl <- lm(mpg ~ bs(wt, df = 5), data = mtcars)
library(dplyr)
AIC(fit_poly, fit_spl) %>% arrange(AIC)
## df AIC
## fit_poly 4 158.0484
## fit_spl 7 163.0402
BIC(fit_poly, fit_spl) %>% arrange(BIC)
## df BIC
## fit_poly 4 163.9113
## fit_spl 7 173.3003
Model polynomial merupakan model terbaik dengan nilai aic dan bic lebih rendah dari model spline.
set.seed(123)
library(modelr)
library(purrr)
##
## Attaching package: 'purrr'
## The following object is masked from 'package:car':
##
## some
cv_res <- crossv_kfold(mtcars, k = 5)
rmse_fun <- function(train, test, formula) {
m <- lm(formula, data = as.data.frame(train))
sqrt(mean((as.data.frame(test)$mpg - predict(m, newdata =
as.data.frame(test)))^2))}
forms <- list(
Poly = mpg ~ poly(wt, 2, raw = TRUE),
Spline = mpg ~ bs(wt, df = 5, Boundary.knots = range(mtcars$wt)))
cv_tbl <- map_dfr(names(forms), function(nm){
fml <- forms[[nm]]
tibble(model = nm, fold = seq_len(5),
rmse = map2_dbl(cv_res$train, cv_res$test, ~ rmse_fun(.x, .y, fml)))
})
cv_summary <- cv_tbl %>%
group_by(model) %>%
summarise(mean_rmse = mean(rmse), sd_rmse = sd(rmse))
cv_summary
## # A tibble: 2 × 3
## model mean_rmse sd_rmse
## <chr> <dbl> <dbl>
## 1 Poly 2.53 1.32
## 2 Spline 2.88 1.27
Model polynomial merupakan model terbaik jika dilihat dari rmse dengan nilai rmse lebih rendah dari model spline.
library(tidyr)
grid_wt <- tibble(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 200))
pred <- grid_wt %>% mutate(
Poly = predict(fit_poly, newdata = grid_wt),
Spline = predict(fit_spl, newdata = grid_wt)
) %>% pivot_longer(Poly:Spline, names_to = "model", values_to = "yhat")
gg = ggplot() +
geom_point(data = mtcars, aes(wt, mpg), alpha = 0.4) +
geom_line(data = pred, aes(wt, yhat, color = model), linewidth = 1) +
labs(x = "Vehicle Weight", y = "Fuel Efficiency (mpg)",
title = "Polynomial vs Spline Fit") + theme_minimal()
gg
Model Polynomial (merah) mengikuti pola data dengan baik. Sedangkan, model Spline (biru) cenderung berbelok-belok dan kurang pola dengan baik.
Secara keseluruhan kedua model mirip, tapi metode polynomial sedikit lebih konsisten dan “efisien” untuk dataset kecil seperti mtcars.
Gunakan kode GLS di latihan, ganti x dengan sinyal musiman (mis. sin(2pit/12)), interpretasikan ̂𝜌 ; bandingkan SE OLS vs GLS.
library(tibble)
set.seed(42)
T <- 400
t <- 1:T
x <- sin(2*pi*t/12)
rho <- 0.6
u <- rnorm(T, 0, 1)
err <- numeric(T)
for (tt in 2:T) err[tt] <- rho*err[tt-1] + u[tt]
y <- 1 + 2*x + err
dar1 <- tibble(t = t, x = x, y = y)
head(dar1)
## # A tibble: 6 × 3
## t x y
## <int> <dbl> <dbl>
## 1 1 5 e- 1 2
## 2 2 8.66e- 1 2.17
## 3 3 1 e+ 0 3.02
## 4 4 8.66e- 1 3.38
## 5 5 5 e- 1 2.79
## 6 6 1.22e-16 1.37
Persamaan Regresi yang digunakan: \[ y = 1 + 2x + e \]
#Model OLS
fit_ols <- lm(y ~ x, data = dar1)
summary(fit_ols)
##
## Call:
## lm(formula = y ~ x, data = dar1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0510 -0.7985 0.0059 0.7792 3.6370
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.96844 0.05664 17.1 <2e-16 ***
## x 1.83909 0.07995 23.0 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.133 on 398 degrees of freedom
## Multiple R-squared: 0.5707, Adjusted R-squared: 0.5697
## F-statistic: 529.2 on 1 and 398 DF, p-value: < 2.2e-16
#Model GLS dengan AR(1)
library(nlme)
##
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
##
## collapse
fit_gls <- gls(y ~ x, data = dar1, correlation = corAR1(form = ~ t))
summary(fit_gls)
## Generalized least squares fit by REML
## Model: y ~ x
## Data: dar1
## AIC BIC logLik
## 1109.051 1124.997 -550.5254
##
## Correlation Structure: AR(1)
## Formula: ~t
## Parameter estimate(s):
## Phi
## 0.5440471
##
## Coefficients:
## Value Std.Error t-value p-value
## (Intercept) 0.9724752 0.1043172 9.322293 0
## x 1.8390741 0.1132805 16.234685 0
##
## Correlation:
## (Intr)
## x -0.013
##
## Standardized residuals:
## Min Q1 Med Q3 Max
## -3.566387570 -0.705789893 0.001620549 0.681785092 3.195184639
##
## Residual standard error: 1.137024
## Degrees of freedom: 400 total; 398 residual
#Perbandingan SE
se_ols <- coef(summary(fit_ols))[, "Std. Error"]
se_gls <- summary(fit_gls)$tTable[, "Std.Error"]
cbind(SE_OLS = se_ols, SE_GLS = se_gls)
## SE_OLS SE_GLS
## (Intercept) 0.05663725 0.1043172
## x 0.07994741 0.1132805
Nilai SE pada model GLS lebih besar dibandingkan dengan model OLS. Artinya model GLS lebih valid karena model sesuai struktur error.
par(mfrow = c(1,2))
acf(resid(fit_ols), main = "ACF Residuals (OLS)")
acf(resid(fit_gls, type = "normalized"), main = "ACF Residuals (GLS)")
Terlihat bahwa:
par(mfrow = c(1,1))
lmtest::dwtest(fit_ols)
##
## Durbin-Watson test
##
## data: fit_ols
## DW = 0.92047, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
Terlihat bawa model OLS memiliki nilai DW < 2 maka model OLS terdapat indikasi autokorelasi positif. selain itu, terlihat pula p-value < 2.2e-16 maka H₀ ditolak, artinya ada autokorelasi positif secara signifikan pada residual OLS.
Dengan demikian, GLS dengan error AR(1) lebih tepat digunakan dibanding OLS dalam kasus ini karena mampu mengatasi adanya autokorelasi serial pada error.
rho_hat <- coef(fit_gls$modelStruct$corStruct, unconstrained = FALSE)
rho_hat
## Phi
## 0.5440471
#atau bisa pake summary(fit_gls). ambil yang nilai phi
Estimasi 𝜌 = 0.6072367 artinya error memiliki autokorelasi positif moderat (residual di waktu t cenderung masih berhubungan dengan residual di waktu sebelumnya (t−1)).
Jika tetap ingin menggunakan OLS Mekanisme:
Hasil akhirnya mirip GLS, hanya pendekatan yang berbeda.
#Estimasi rho dari regresi resid_t ~ resid_{t-1}
res <- resid(fit_ols)
rho_hat <- coef(lm(res[-1] ~ 0 + res[-length(res)]))[1]
#Quasi-differencing (transformasi data)
y_star <- dar1$y[-1] - rho_hat*dar1$y[-T]
x_star <- dar1$x[-1] - rho_hat*dar1$x[-T]
fit_qd <- lm(y_star ~ x_star)
summary(fit_qd)
##
## Call:
## lm(formula = y_star ~ x_star)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.93677 -0.61574 -0.02748 0.63409 2.57213
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.44702 0.04782 9.348 <2e-16 ***
## x_star 1.83858 0.11312 16.253 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9552 on 397 degrees of freedom
## Multiple R-squared: 0.3995, Adjusted R-squared: 0.398
## F-statistic: 264.2 on 1 and 397 DF, p-value: < 2.2e-16
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the
code chunk to prevent printing of the R code that generated the
plot.