This post briefly walks through the steps of a correlation and linear regression analysis.
Simple linear regression is a method for obtaining a mathematical equation from data that consist of exactly one dependent variable (usually denoted \(y\)) and one independent variable (usually denoted \(x\)). In general, linear regression aims to find the regression coefficients (\(\beta\)) of \(x\) so that \(y\) can be estimated, giving the estimate \(\hat y\), while minimizing the error \(\epsilon\).
The simple linear regression equation is as follows. \[ y = \beta_0 + \beta_1 x + \epsilon\] where \(\epsilon\) is a random error term that cannot be observed.
To obtain an estimate that is close to the actual value of \(y\), the following equation is used.
\[ \hat y = b_0 + b_1 x\] Here \(b_0\) and \(b_1\) are the estimates of \(\beta_0\) and \(\beta_1\). The estimating equation has no error term because the error is unknown and cannot be computed; the coefficients are chosen so that the error is as small as possible, and the error is then left out of the equation.
The estimating equation \(\hat y = b_0 + b_1 x\) applies when the regression uses a single independent variable. With \(k\) independent variables, the equation for \(\hat y\) is as follows.
\[ \hat y = b_0 + b_1 x_1 + b_2x_2 + \cdots + b_kx_k\]
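As a small illustration of the estimating equation, the sketch below fits a model with two independent variables on R's built-in mtcars data and computes \(\hat y\) by hand from \(b_0\), \(b_1\), and \(b_2\). The choice of disp and wt and the values for the new car are only for illustration.

```r
# Fit a regression with k = 2 independent variables (built-in mtcars data)
fit <- lm(mpg ~ disp + wt, data = mtcars)

# Extract the estimates b0, b1, b2
b <- unname(coef(fit))

# Illustrative new observation: disp = 200 cu.in., wt = 3 (1000 lbs)
x_new <- data.frame(disp = 200, wt = 3)

# y-hat = b0 + b1 * x1 + b2 * x2, computed by hand
y_hat_manual <- b[1] + b[2] * x_new$disp + b[3] * x_new$wt

# The same estimate from predict()
y_hat_predict <- unname(predict(fit, newdata = x_new))

c(manual = y_hat_manual, predict = y_hat_predict)
```

Both numbers should agree, since predict() evaluates the same linear combination.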
The equation for simple or multiple linear regression can be written in matrix notation as follows.
\[ \mathbf{\hat Y} = \mathbf{Xb}\]
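where \[ \begin{aligned} \mathbf{\hat Y} &= (\hat y_1, \hat y_2, \cdots, \hat y_n)^\top\\ \mathbf{b} &= (b_0, b_1, \cdots, b_k)^\top\\ \mathbf{X} &= \left[\begin{array} {rrrr} 1 & x_{11} & \cdots & x_{k1} \\ 1 & x_{12} & \cdots & x_{k2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & \cdots & x_{kn} \end{array}\right] \end{aligned} \] Here \(x_{ji}\) is the value of the \(j\)-th independent variable for the \(i\)-th observation, and the leading column of ones corresponds to the intercept \(b_0\).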
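The matrix form can be tried directly in R. The text does not say how \(\mathbf{b}\) is obtained, so the sketch below assumes the standard ordinary least squares solution, which solves \((\mathbf{X}^\top\mathbf{X})\mathbf{b} = \mathbf{X}^\top\mathbf{y}\). It uses the built-in mtcars data with disp and wt as an arbitrary pair of predictors.

```r
# Design matrix X: a column of ones followed by the predictors disp and wt
X <- model.matrix(mpg ~ disp + wt, data = mtcars)
y <- mtcars$mpg

# Assumed estimator (ordinary least squares): solve (X'X) b = X'y
b <- solve(crossprod(X), crossprod(X, y))

# Fitted values in matrix form: Y-hat = X b
y_hat <- X %*% b

# Compare the hand-computed coefficients with lm()
fit <- lm(mpg ~ disp + wt, data = mtcars)
cbind(matrix_form = as.numeric(b), lm = unname(coef(fit)))
```

In practice lm() uses a QR decomposition rather than solving these equations directly, which is numerically more stable; the results should match to several decimal places here.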
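The fit of the model is measured by the coefficient of determination \(R^2\), which compares the sum of squared errors with the total sum of squares.
\[ \begin{aligned} R^2 &= 1-\frac{SS Error}{SS Total} \\ &= 1-\frac{\sum_{i=1}^{n}(y_i - \hat y_i)^2}{\sum_{i=1}^{n}(y_i - \bar y)^2} \end{aligned} \]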
\[ \begin{aligned} R^2_{adj} &= 1 - \biggl[(1 - R^2)\biggl(\frac{n-1}{n - p - 1}\biggr)\biggr] \\ &= 1-\frac{n-1}{n-p-1}\biggl(\frac{SSE}{SST}\biggr)\\ &= 1 - \frac{MSE}{SST/(n-1)} \end{aligned} \]
where \(n\) is the number of observations, \(p\) is the number of independent variables, and \(MSE = SSE/(n-p-1)\).

That is enough theory.

# Data

We will use the mtcars data (see ?mtcars) with the variable mpg as the dependent variable, or target. The dataset ships with R and was taken from the 1974 Motor Trend US magazine.
head(mtcars)
mtcars2 <- within(mtcars, {
vs <- factor(vs, labels = c("V-shaped", "Straight"))
am <- factor(am, labels = c("Automatic", "Manual"))
cyl <- factor(cyl)
gear <- factor(gear)
carb <- factor(carb)
})
str(mtcars2)
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : Factor w/ 2 levels "V-shaped","Straight": 1 1 2 2 1 2 1 2 2 2 ...
$ am : Factor w/ 2 levels "Automatic","Manual": 2 2 2 1 1 1 1 1 1 1 ...
$ gear: Factor w/ 3 levels "3","4","5": 2 2 2 1 1 1 1 2 2 2 ...
$ carb: Factor w/ 6 levels "1","2","3","4",..: 4 4 1 1 2 1 4 2 2 4 ...
Let's explore the data first.
summary(mtcars2)
mpg cyl disp hp drat
Min. :10.40 4:11 Min. : 71.1 Min. : 52.0 Min. :2.760
1st Qu.:15.43 6: 7 1st Qu.:120.8 1st Qu.: 96.5 1st Qu.:3.080
Median :19.20 8:14 Median :196.3 Median :123.0 Median :3.695
Mean :20.09 Mean :230.7 Mean :146.7 Mean :3.597
3rd Qu.:22.80 3rd Qu.:326.0 3rd Qu.:180.0 3rd Qu.:3.920
Max. :33.90 Max. :472.0 Max. :335.0 Max. :4.930
wt qsec vs am gear carb
Min. :1.513 Min. :14.50 V-shaped:18 Automatic:19 3:15 1: 7
1st Qu.:2.581 1st Qu.:16.89 Straight:14 Manual :13 4:12 2:10
Median :3.325 Median :17.71 5: 5 3: 3
Mean :3.217 Mean :17.85 4:10
3rd Qu.:3.610 3rd Qu.:18.90 6: 1
Max. :5.424 Max. :22.90 8: 1
library(ggplot2)
ggplot(mtcars2, aes(x = mpg)) +
geom_histogram(bins = 10, color = "white", fill = "pink")
ggplot(mtcars2, aes(x = disp, y = mpg)) +
geom_point()
ggplot(mtcars2, aes(x = disp, y = mpg)) +
geom_point() +
geom_smooth(method = "lm")
cor(mtcars2[, c("mpg", "disp")])
mpg disp
mpg 1.0000000 -0.8475514
disp -0.8475514 1.0000000
cor.test(mtcars2$mpg, mtcars2$disp)
Pearson's product-moment correlation
data: mtcars2$mpg and mtcars2$disp
t = -8.7472, df = 30, p-value = 0.000000000938
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9233594 -0.7081376
sample estimates:
cor
-0.8475514
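The t statistic and p-value reported by cor.test() can be reproduced from the correlation coefficient alone, using the standard relation \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n-2\) degrees of freedom. The sketch below does this for mpg and disp; it uses mtcars directly, since these two columns are identical in mtcars2.

```r
# Pearson correlation between mpg and disp
r <- cor(mtcars$mpg, mtcars$disp)
n <- nrow(mtcars)

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

c(r = r, t = t_stat, df = n - 2, p_value = p_value)
```

The values should match the t = -8.7472 and df = 30 shown in the cor.test() output.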
ggplot(mtcars2, aes(x = cyl, y = mpg)) +
geom_boxplot()
ggplot(mtcars2, aes(x = hp, y = mpg)) +
geom_point() +
geom_smooth(method = "lm")
cor(mtcars2[, c("mpg", "hp")])
mpg hp
mpg 1.0000000 -0.7761684
hp -0.7761684 1.0000000
cor.test(mtcars2$mpg, mtcars2$hp)
Pearson's product-moment correlation
data: mtcars2$mpg and mtcars2$hp
t = -6.7424, df = 30, p-value = 0.0000001788
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.8852686 -0.5860994
sample estimates:
cor
-0.7761684
ggplot(mtcars2, aes(x = drat, y = mpg)) +
geom_point()
ggplot(mtcars2, aes(x = drat, y = mpg)) +
geom_point() +
geom_smooth(method = "lm")
cor(mtcars2[, c("mpg", "drat")])
mpg drat
mpg 1.0000000 0.6811719
drat 0.6811719 1.0000000
cor.test(mtcars2$mpg, mtcars2$drat)
Pearson's product-moment correlation
data: mtcars2$mpg and mtcars2$drat
t = 5.096, df = 30, p-value = 0.00001776
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.4360484 0.8322010
sample estimates:
cor
0.6811719
ggplot(mtcars2, aes(x = wt, y = mpg)) +
geom_point()
ggplot(mtcars2, aes(x = wt, y = mpg)) +
geom_point() +
geom_smooth(method = "lm")
cor(mtcars2[, c("mpg", "wt")])
mpg wt
mpg 1.0000000 -0.8676594
wt -0.8676594 1.0000000
cor.test(mtcars2$mpg, mtcars2$wt)
Pearson's product-moment correlation
data: mtcars2$mpg and mtcars2$wt
t = -9.559, df = 30, p-value = 0.0000000001294
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9338264 -0.7440872
sample estimates:
cor
-0.8676594
ggplot(mtcars2, aes(x = qsec, y = mpg)) +
geom_point()
ggplot(mtcars2, aes(x = qsec, y = mpg)) +
geom_point() +
geom_smooth(method = "lm")
cor(mtcars2[, c("mpg", "qsec")])
mpg qsec
mpg 1.000000 0.418684
qsec 0.418684 1.000000
cor.test(mtcars2$mpg, mtcars2$qsec)
Pearson's product-moment correlation
data: mtcars2$mpg and mtcars2$qsec
t = 2.5252, df = 30, p-value = 0.01708
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.08195487 0.66961864
sample estimates:
cor
0.418684
library(corrplot)
corrmatrix <- cor(mtcars2[, c(1, 3:7)])
corrplot(corrmatrix, method = "number")
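Alongside the plot, it can be handy to rank the numeric variables by the strength of their correlation with mpg. The following is a minimal sketch using the same six columns as the correlation matrix; the object names are only illustrative.

```r
# The numeric columns used in the correlation matrix (columns 1 and 3:7)
num_vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")
corrmatrix <- cor(mtcars[, num_vars])

# Correlation of each variable with mpg, excluding mpg itself
cor_with_mpg <- corrmatrix["mpg", setdiff(num_vars, "mpg")]

# Sort by absolute value, strongest first
cor_sorted <- cor_with_mpg[order(abs(cor_with_mpg), decreasing = TRUE)]
round(cor_sorted, 3)
```

Based on the pairwise results shown earlier, wt has the strongest correlation with mpg and qsec the weakest.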
lm1 <- lm(mpg ~ disp, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ disp, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-4.8922 -2.2022 -0.9631 1.6272 7.2305
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 29.599855 1.229720 24.070 < 0.0000000000000002 ***
disp -0.041215 0.004712 -8.747 0.000000000938 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.251 on 30 degrees of freedom
Multiple R-squared: 0.7183, Adjusted R-squared: 0.709
F-statistic: 76.51 on 1 and 30 DF, p-value: 0.000000000938
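The Multiple R-squared and Adjusted R-squared lines in this summary can be reproduced by hand from the definitions of \(R^2\) and \(R^2_{adj}\), taking SS Error as the sum of squared residuals. The sketch below does so for the mpg ~ disp model, with \(p = 1\) independent variable.

```r
fit <- lm(mpg ~ disp, data = mtcars)
y <- mtcars$mpg
n <- length(y)
p <- 1  # number of independent variables in the model

# Sum of squared errors (residuals) and total sum of squares
sse <- sum((y - fitted(fit))^2)
sst <- sum((y - mean(y))^2)

# R-squared and adjusted R-squared from their definitions
r2 <- 1 - sse / sst
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

round(c(R2 = r2, R2_adj = r2_adj), 4)
```

For a simple regression, \(R^2\) is also the square of the Pearson correlation, which is why it equals (-0.8475514)^2 here.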
lm1 <- lm(mpg ~ hp, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ hp, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.7121 -2.1122 -0.8854 1.5819 8.2360
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.09886 1.63392 18.421 < 0.0000000000000002 ***
hp -0.06823 0.01012 -6.742 0.000000179 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.863 on 30 degrees of freedom
Multiple R-squared: 0.6024, Adjusted R-squared: 0.5892
F-statistic: 45.46 on 1 and 30 DF, p-value: 0.0000001788
lm1 <- lm(mpg ~ drat, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ drat, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-9.0775 -2.6803 -0.2095 2.2976 9.0225
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7.525 5.477 -1.374 0.18
drat 7.678 1.507 5.096 0.0000178 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.485 on 30 degrees of freedom
Multiple R-squared: 0.464, Adjusted R-squared: 0.4461
F-statistic: 25.97 on 1 and 30 DF, p-value: 0.00001776
lm1 <- lm(mpg ~ wt, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ wt, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-4.5432 -2.3647 -0.1252 1.4096 6.8727
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.2851 1.8776 19.858 < 0.0000000000000002 ***
wt -5.3445 0.5591 -9.559 0.000000000129 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF, p-value: 0.0000000001294
lm1 <- lm(mpg ~ qsec, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ qsec, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-9.8760 -3.4539 -0.7203 2.2774 11.6491
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.1140 10.0295 -0.510 0.6139
qsec 1.4121 0.5592 2.525 0.0171 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.564 on 30 degrees of freedom
Multiple R-squared: 0.1753, Adjusted R-squared: 0.1478
F-statistic: 6.377 on 1 and 30 DF, p-value: 0.01708
At a significance level (\(\alpha\)) of 5%, each numeric variable, when used on its own, has a significant effect on mpg.
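The five simple regressions above can also be run in a loop so that the slope p-values sit side by side. This is a minimal sketch; reformulate() builds each formula from the variable name, and the object names are only illustrative.

```r
num_vars <- c("disp", "hp", "drat", "wt", "qsec")

# Fit mpg ~ x for each numeric variable and keep the p-value of the slope
p_values <- sapply(num_vars, function(v) {
  fit <- lm(reformulate(v, response = "mpg"), data = mtcars)
  summary(fit)$coefficients[v, "Pr(>|t|)"]
})

# Flag which slopes are significant at the 5% level
alpha <- 0.05
data.frame(p_value = signif(p_values, 4), significant = p_values < alpha)
```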
lm2 <- lm(mpg ~ disp + hp, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ disp + hp, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-4.7945 -2.3036 -0.8246 1.8582 6.9363
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.735904 1.331566 23.083 < 0.0000000000000002 ***
disp -0.030346 0.007405 -4.098 0.000306 ***
hp -0.024840 0.013385 -1.856 0.073679 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.127 on 29 degrees of freedom
Multiple R-squared: 0.7482, Adjusted R-squared: 0.7309
F-statistic: 43.09 on 2 and 29 DF, p-value: 0.000000002062
It turns out that when disp and hp are used together, hp is not significant at the 5% level.
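Another way to look at the same question is to compare the model with and without hp using a partial F-test via anova(). This is a sketch of an alternative check, not a step from the original analysis; when only one term is added, the F-test p-value equals the t-test p-value of that term.

```r
fit_small <- lm(mpg ~ disp, data = mtcars)
fit_large <- lm(mpg ~ disp + hp, data = mtcars)

# Partial F-test: does adding hp improve on the disp-only model?
cmp <- anova(fit_small, fit_large)
cmp

# For a single added term, the two p-values coincide
p_f <- cmp[2, "Pr(>F)"]
p_t <- summary(fit_large)$coefficients["hp", "Pr(>|t|)"]
c(F_test = p_f, t_test = p_t)
```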
lm2 <- lm(mpg ~ disp + drat, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ disp + drat, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.1265 -2.2045 -0.5835 1.4497 6.9884
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 21.844880 6.747971 3.237 0.00302 **
disp -0.035694 0.006653 -5.365 0.00000919 ***
drat 1.802027 1.542091 1.169 0.25210
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.232 on 29 degrees of freedom
Multiple R-squared: 0.731, Adjusted R-squared: 0.7125
F-statistic: 39.41 on 2 and 29 DF, p-value: 0.000000005385
The same goes for drat: it is not significant when used together with disp.
lm2 <- lm(mpg ~ disp + wt, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ disp + wt, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-3.4087 -2.3243 -0.7683 1.7721 6.3484
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.96055 2.16454 16.151 0.000000000000000491 ***
disp -0.01773 0.00919 -1.929 0.06362 .
wt -3.35082 1.16413 -2.878 0.00743 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.917 on 29 degrees of freedom
Multiple R-squared: 0.7809, Adjusted R-squared: 0.7658
F-statistic: 51.69 on 2 and 29 DF, p-value: 0.0000000002744
When disp and wt are used together to build the regression model, wt is the more significant of the two, and disp is no longer significant at the 5% level.
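The original analysis does not say why disp loses significance. One plausible explanation, offered here as an assumption, is that disp and wt are strongly correlated with each other, which inflates the standard errors of both coefficients. The sketch below checks their correlation and computes the variance inflation factor, which for a two-predictor model reduces to \(1/(1-r^2)\).

```r
# Correlation between the two predictors
r_disp_wt <- cor(mtcars$disp, mtcars$wt)

# Variance inflation factor for a model with exactly two predictors
vif_two <- 1 / (1 - r_disp_wt^2)

round(c(cor_disp_wt = r_disp_wt, VIF = vif_two), 3)
```

A high correlation between predictors means the model has trouble separating their individual contributions, even when the pair together fits well.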
lm3 <- lm(mpg ~ disp + hp + drat, data = mtcars2)
summary(lm3)
Call:
lm(formula = mpg ~ disp + hp + drat, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.1225 -1.8454 -0.4456 1.1342 6.4958
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.344293 6.370882 3.036 0.00513 **
disp -0.019232 0.009371 -2.052 0.04960 *
hp -0.031229 0.013345 -2.340 0.02663 *
drat 2.714975 1.487366 1.825 0.07863 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.008 on 28 degrees of freedom
Multiple R-squared: 0.775, Adjusted R-squared: 0.7509
F-statistic: 32.15 on 3 and 28 DF, p-value: 0.00000000328
Only drat is not significant at the 5% level.
lm1 <- lm(mpg ~ cyl, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ cyl, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.2636 -1.8357 0.0286 1.3893 7.2364
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.6636 0.9718 27.437 < 0.0000000000000002 ***
cyl6 -6.9208 1.5583 -4.441 0.000119 ***
cyl8 -11.5636 1.2986 -8.905 0.000000000857 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.223 on 29 degrees of freedom
Multiple R-squared: 0.7325, Adjusted R-squared: 0.714
F-statistic: 39.7 on 2 and 29 DF, p-value: 0.000000004979
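With a factor as predictor, the coefficients have a direct reading under R's default treatment contrasts: the intercept is the mean mpg of the baseline level (4 cylinders), and cyl6 and cyl8 are differences from that baseline. The sketch below checks this against the group means; cyl_f is an illustrative name for the factor version of cyl.

```r
# cyl as a factor, matching the coding used in mtcars2
cyl_f <- factor(mtcars$cyl)
fit <- lm(mpg ~ cyl_f, data = mtcars)

# Mean mpg within each cylinder group
group_means <- tapply(mtcars$mpg, cyl_f, mean)

# Rebuild the group means from the coefficients (baseline level: 4 cylinders)
b <- unname(coef(fit))
from_coef <- c("4" = b[1], "6" = b[1] + b[2], "8" = b[1] + b[3])

round(rbind(group_means, from_coef), 4)
```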
lm1 <- lm(mpg ~ vs, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ vs, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-6.757 -3.082 -1.267 2.828 9.383
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.617 1.080 15.390 0.000000000000000885 ***
vsStraight 7.940 1.632 4.864 0.000034159372544199 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.581 on 30 degrees of freedom
Multiple R-squared: 0.4409, Adjusted R-squared: 0.4223
F-statistic: 23.66 on 1 and 30 DF, p-value: 0.00003416
lm1 <- lm(mpg ~ am, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ am, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-9.3923 -3.0923 -0.2974 3.2439 9.5077
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.147 1.125 15.247 0.00000000000000113 ***
amManual 7.245 1.764 4.106 0.000285 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.902 on 30 degrees of freedom
Multiple R-squared: 0.3598, Adjusted R-squared: 0.3385
F-statistic: 16.86 on 1 and 30 DF, p-value: 0.000285
lm1 <- lm(mpg ~ gear, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ gear, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-6.7333 -3.2333 -0.9067 2.8483 9.3667
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.107 1.216 13.250 0.0000000000000787 ***
gear4 8.427 1.823 4.621 0.0000725738200575 ***
gear5 5.273 2.431 2.169 0.0384 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.708 on 29 degrees of freedom
Multiple R-squared: 0.4292, Adjusted R-squared: 0.3898
F-statistic: 10.9 on 2 and 29 DF, p-value: 0.0002948
lm1 <- lm(mpg ~ carb, data = mtcars2)
summary(lm1)
Call:
lm(formula = mpg ~ carb, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-7.243 -3.325 0.000 2.360 8.557
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.343 1.854 13.670 0.000000000000221 ***
carb2 -2.943 2.417 -1.218 0.23435
carb3 -9.043 3.385 -2.672 0.01285 *
carb4 -9.553 2.417 -3.952 0.00053 ***
carb6 -5.643 5.243 -1.076 0.29174
carb8 -10.343 5.243 -1.973 0.05927 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.905 on 26 degrees of freedom
Multiple R-squared: 0.4445, Adjusted R-squared: 0.3377
F-statistic: 4.161 on 5 and 26 DF, p-value: 0.006546
lm2 <- lm(mpg ~ cyl + vs, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ cyl + vs, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.201 -1.686 0.000 1.463 7.299
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 27.2901 2.0856 13.085 0.000000000000188 ***
cyl6 -7.1535 1.7235 -4.151 0.00028 ***
cyl8 -12.1901 2.2616 -5.390 0.000009559631134 ***
vsStraight -0.6891 2.0210 -0.341 0.73567
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.273 on 28 degrees of freedom
Multiple R-squared: 0.7336, Adjusted R-squared: 0.705
F-statistic: 25.7 on 3 and 28 DF, p-value: 0.00000003412
lm2 <- lm(mpg ~ cyl + am, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ cyl + am, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.9618 -1.4971 -0.2057 1.8907 6.5382
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.802 1.323 18.752 < 0.0000000000000002 ***
cyl6 -6.156 1.536 -4.009 0.000411 ***
cyl8 -10.068 1.452 -6.933 0.000000155 ***
amManual 2.560 1.298 1.973 0.058457 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.073 on 28 degrees of freedom
Multiple R-squared: 0.7651, Adjusted R-squared: 0.7399
F-statistic: 30.4 on 3 and 28 DF, p-value: 0.000000005959
lm2 <- lm(mpg ~ cyl + gear, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ cyl + gear, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.3520 -1.7633 -0.3789 1.7393 7.1480
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.428 1.881 13.517 0.000000000000155 ***
cyl6 -6.656 1.629 -4.086 0.000353 ***
cyl8 -10.542 1.958 -5.384 0.000010865640238 ***
gear4 1.324 1.928 0.687 0.498000
gear5 1.500 1.855 0.809 0.425707
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.294 on 27 degrees of freedom
Multiple R-squared: 0.7398, Adjusted R-squared: 0.7012
F-statistic: 19.19 on 4 and 27 DF, p-value: 0.0000001405
lm2 <- lm(mpg ~ cyl + carb, data = mtcars2)
summary(lm2)
Call:
lm(formula = mpg ~ cyl + carb, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-5.339 -1.354 0.000 2.068 7.061
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.8391 1.4004 19.165 0.00000000000000047 ***
cyl6 -5.2370 2.0713 -2.528 0.0184 *
cyl8 -10.2935 1.8498 -5.565 0.00001003182496434 ***
carb2 -0.3217 1.7737 -0.181 0.8576
carb3 -0.2457 2.8136 -0.087 0.9312
carb4 -2.7783 2.0787 -1.337 0.1939
carb6 -1.9022 3.8829 -0.490 0.6287
carb8 -1.5457 3.9287 -0.393 0.6975
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.358 on 24 degrees of freedom
Multiple R-squared: 0.7596, Adjusted R-squared: 0.6895
F-statistic: 10.84 on 7 and 24 DF, p-value: 0.00000418
lm3 <- lm(mpg ~ cyl + vs + am, data = mtcars2)
summary(lm3)
Call:
lm(formula = mpg ~ cyl + vs + am, data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-6.2821 -1.4402 0.0391 1.8845 6.2179
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.809 2.928 7.789 0.0000000224 ***
cyl6 -5.399 1.837 -2.938 0.00668 **
cyl8 -8.161 2.892 -2.822 0.00884 **
vsStraight 1.708 2.235 0.764 0.45135
amManual 3.165 1.528 2.071 0.04805 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.097 on 27 degrees of freedom
Multiple R-squared: 0.7701, Adjusted R-squared: 0.736
F-statistic: 22.61 on 4 and 27 DF, p-value: 0.00000002741
lm_all <- lm(mpg ~ ., data = mtcars2)
summary(lm_all)
Call:
lm(formula = mpg ~ ., data = mtcars2)
Residuals:
Min 1Q Median 3Q Max
-3.5087 -1.3584 -0.0948 0.7745 4.6251
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 23.87913 20.06582 1.190 0.2525
cyl6 -2.64870 3.04089 -0.871 0.3975
cyl8 -0.33616 7.15954 -0.047 0.9632
disp 0.03555 0.03190 1.114 0.2827
hp -0.07051 0.03943 -1.788 0.0939 .
drat 1.18283 2.48348 0.476 0.6407
wt -4.52978 2.53875 -1.784 0.0946 .
qsec 0.36784 0.93540 0.393 0.6997
vsStraight 1.93085 2.87126 0.672 0.5115
amManual 1.21212 3.21355 0.377 0.7113
gear4 1.11435 3.79952 0.293 0.7733
gear5 2.52840 3.73636 0.677 0.5089
carb2 -0.97935 2.31797 -0.423 0.6787
carb3 2.99964 4.29355 0.699 0.4955
carb4 1.09142 4.44962 0.245 0.8096
carb6 4.47757 6.38406 0.701 0.4938
carb8 7.25041 8.36057 0.867 0.3995
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.833 on 15 degrees of freedom
Multiple R-squared: 0.8931, Adjusted R-squared: 0.779
F-statistic: 7.83 on 16 and 15 DF, p-value: 0.000124
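In the full model no single coefficient is significant at the 5% level, even though the adjusted R-squared is 0.779. To compare it with some of the smaller models fitted earlier, their adjusted R-squared values can be collected in one place. This is a minimal sketch; the selection of models is only illustrative.

```r
# Recreate the factor coding used for mtcars2
mtcars2 <- within(mtcars, {
  vs <- factor(vs, labels = c("V-shaped", "Straight"))
  am <- factor(am, labels = c("Automatic", "Manual"))
  cyl <- factor(cyl)
  gear <- factor(gear)
  carb <- factor(carb)
})

# A few of the models fitted above, as named formulas
models <- list(
  "disp + wt"        = mpg ~ disp + wt,
  "disp + hp + drat" = mpg ~ disp + hp + drat,
  "cyl + am"         = mpg ~ cyl + am,
  "all variables"    = mpg ~ .
)

# Adjusted R-squared of each model, highest first
adj_r2 <- sapply(models, function(f) summary(lm(f, data = mtcars2))$adj.r.squared)
round(sort(adj_r2, decreasing = TRUE), 4)
```

The full model uses 16 coefficients to gain only a little over the two-variable model disp + wt in adjusted R-squared.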