Jednoduchá regresia

Zadanie: Urobte lineárny regresný model pre premennú dist z datasetu cars, pričom prediktorom je premenná speed. Overte predpoklady pre tento model a otestujte jeho štatistickú významnosť. Vizualizujte vzťah reálnych a predikovaných hodnôt pre target. Je medzi týmito premennými skutočne lineárny vzťah alebo sa dá nájsť iný, ktorý bude dávať lepšie výsledky?

data1 <- datasets::cars

# Lineárny regresný model
model1 <- lm(dist ~ speed, data1)
summary(model1) # F-test, R^2, p-hodnoty
## 
## Call:
## lm(formula = dist ~ speed, data = data1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12

Tak že R^2 = 0.6511 (65.1 % variability v dist je vysvetlené premennou speed). Adjustované R^2 = 0.6438. Hodnota F-testu: F = 89.57, p < 0.001 znamená, že model je celkovo významný.

Vizualizacia:

plot(data1$speed, data1$dist, main="Scatterplot speed vs dist",
     xlab="Speed", ylab="Dist", pch=19, col="blue")
abline(model1, col="red") # Regresná priamka

Sú normálne rozdelené? Shapiro-Wilk test

shapiro.test(resid(model1)) # Shapiro-Wilk test
## 
##  Shapiro-Wilk normality test
## 
## data:  resid(model1)
## W = 0.94509, p-value = 0.02152

Je stredná hodnota nula? t-test

t.test(model1$residuals, mu=0)
## 
##  One Sample t-test
## 
## data:  model1$residuals
## t = 1.0315e-16, df = 49, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -4.326  4.326
## sample estimates:
##    mean of x 
## 2.220446e-16

Stredna hodnota je nula.

Sú náhodné? Runs test a turning point test

library(randtests)
runs.test(model1$residuals)
## 
##  Runs Test
## 
## data:  model1$residuals
## statistic = -0.28577, runs = 25, n1 = 25, n2 = 25, n = 50, p-value =
## 0.7751
## alternative hypothesis: nonrandomness
turning.point.test(model1$residuals)
## 
##  Turning Point Test
## 
## data:  model1$residuals
## statistic = -1.4961, n = 49, p-value = 0.1346
## alternative hypothesis: non randomness

Rezíduá sú náhodné.

Platí homoskedasticita? Golfeld-Quandtov test:

library(lmtest)
## Warning: пакет 'lmtest' был собран под R версии 4.4.2
## Загрузка требуемого пакета: zoo
## 
## Присоединяю пакет: 'zoo'
## Следующие объекты скрыты от 'package:base':
## 
##     as.Date, as.Date.numeric
gqtest(model1)
## 
##  Goldfeld-Quandt test
## 
## data:  model1
## GQ = 1.5512, df1 = 23, df2 = 23, p-value = 0.1498
## alternative hypothesis: variance increases from segment 1 to 2

Rezíduá maju konštantne rozptyly.

Sú nekorelované? Resp. neexistuje autokorelácia? Durbin-Watsonov test:

library(car)
## Warning: пакет 'car' был собран под R версии 4.4.2
## Загрузка требуемого пакета: carData
## Warning: пакет 'carData' был собран под R версии 4.4.2
durbinWatsonTest(model1)
##  lag Autocorrelation D-W Statistic p-value
##    1       0.1604322      1.676225   0.178
##  Alternative hypothesis: rho != 0

Rezíduá sú nekorelované.

Porovnanie modelov:

# Alternatívny model (kvadratický vzťah)
model_quad <- lm(dist ~ speed + I(speed^2), data = cars)
summary(model_quad)
## 
## Call:
## lm(formula = dist ~ speed + I(speed^2), data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -28.720  -9.184  -3.188   4.628  45.152 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.47014   14.81716   0.167    0.868
## speed        0.91329    2.03422   0.449    0.656
## I(speed^2)   0.09996    0.06597   1.515    0.136
## 
## Residual standard error: 15.18 on 47 degrees of freedom
## Multiple R-squared:  0.6673, Adjusted R-squared:  0.6532 
## F-statistic: 47.14 on 2 and 47 DF,  p-value: 5.852e-12
anova(model1, model_quad)
## Analysis of Variance Table
## 
## Model 1: dist ~ speed
## Model 2: dist ~ speed + I(speed^2)
##   Res.Df   RSS Df Sum of Sq     F Pr(>F)
## 1     48 11354                          
## 2     47 10825  1    528.81 2.296 0.1364

Násobná regresia – generované dáta

Zadanie: Nájdite najvhodnejší lineárny regresný model pre nižšie uvedené náhodne generované dáta. Overte predpoklady. Zdôvodnite prečo je vybraný model najvhodnejší. Testujte štatistickú významnosť modelu a parametrov.

n <- 100  
size <- rnorm(n, mean = 2000, sd = 500) # House size  
bedrooms <- sample(2:5, n, replace = TRUE) # Number of bedrooms  
bathrooms <- sample(1:4, n, replace = TRUE) # Number of bathrooms  
age <- rpois(n, lambda = 10) # Age of the house  
price <- 50000 + size * 150 + bedrooms * 10000 - age * 2000 + rnorm(n, mean = 0, sd = 30000) # Create target variable 
house_data <- data.frame( Size = size, Bedrooms = bedrooms, Bathrooms = bathrooms, Age = age, Price = price ) # Create data frame
house_data
##          Size Bedrooms Bathrooms Age    Price
## 1   1273.7147        4         2   9 327675.2
## 2   2336.6568        3         1  18 416102.0
## 3   1778.4684        2         2  11 287858.3
## 4   2114.5043        2         4  19 285917.6
## 5   2027.8086        5         1   9 381211.2
## 6   1649.0849        3         4  14 252444.3
## 7   2036.9870        5         4  10 394121.3
## 8   2570.1199        4         1   9 455091.3
## 9   2067.3272        5         2  22 366556.7
## 10  2108.8902        4         3   8 398374.1
## 11  2707.6004        2         2  15 395148.0
## 12  1838.9219        4         2  10 360426.0
## 13  2488.6596        2         2   9 411899.4
## 14  1168.6646        5         3  10 283609.4
## 15  2438.1485        5         2  11 433519.6
## 16  1969.0318        4         3   9 416981.5
## 17  1068.2977        3         1  10 221784.6
## 18  1988.9479        5         3   6 359774.7
## 19  2445.6826        3         4  10 420865.3
## 20  2664.0058        3         1  10 446145.2
## 21  2209.4099        2         4   7 427344.2
## 22  2155.8874        2         2   4 370287.4
## 23  1002.5307        4         1  14 219887.5
## 24  2982.2998        2         3   8 551227.4
## 25  1740.2298        5         3  16 384499.4
## 26  2924.1014        3         2   9 513481.7
## 27  1889.6166        4         4   7 302329.9
## 28  2156.0895        5         4   5 393066.6
## 29  1530.3124        2         4   7 273687.0
## 30  2231.7717        2         1   6 356713.5
## 31  2887.9781        2         2  10 462223.1
## 32  1923.8451        5         4  13 395586.8
## 33  2779.5816        5         2  10 488770.3
## 34  2475.6780        3         3   4 430991.8
## 35  1923.4543        4         4  11 332395.9
## 36  2180.3688        5         2   9 430912.4
## 37  2110.6396        3         1  17 393615.1
## 38  1834.6946        5         1   9 316541.6
## 39  1777.2431        5         1   7 382514.7
## 40   995.8425        2         1  12 257880.5
## 41  2124.8690        5         2  11 411101.5
## 42  2586.6258        4         3  10 465444.5
## 43   603.7253        5         1   5 197276.9
## 44  2345.9292        5         2  12 469316.9
## 45  2170.3577        5         3  11 394001.0
## 46  1927.0995        4         4  10 390290.0
## 47  1788.6476        4         2  15 296977.6
## 48  2321.2343        4         1  14 493364.1
## 49  2762.9614        3         3  11 462967.9
## 50  2298.9738        5         2  12 417699.6
## 51  1836.6629        4         1   9 302502.2
## 52  2119.4744        5         1   9 410216.3
## 53  1697.1201        4         1   9 343345.3
## 54  2220.5075        2         3   5 379008.0
## 55  2394.8484        4         3  10 437076.6
## 56  2078.9610        5         3   3 362589.8
## 57  2798.3107        4         2  15 487507.7
## 58  1613.9529        2         2   8 320266.2
## 59  1632.9467        3         2   4 297858.1
## 60  1778.8792        3         4   8 318892.8
## 61  2448.1034        3         4  10 416516.5
## 62  1773.3328        4         4  12 313810.8
## 63  1946.3702        3         1  10 340606.8
## 64  2439.9142        2         4   9 437747.9
## 65   907.8200        2         1   9 141544.7
## 66  2112.5208        2         2  13 350527.4
## 67  3002.1371        5         1   7 566086.9
## 68  2004.6748        4         2   7 314789.6
## 69  2072.1219        3         4   9 357082.4
## 70  1943.5003        4         2  10 348741.3
## 71  2029.7569        3         3   9 340324.4
## 72  1696.8290        3         1   7 320883.5
## 73  1468.9888        5         1  10 316865.2
## 74  2485.4445        5         2   9 482376.7
## 75  1475.1658        2         2  12 271746.0
## 76  3074.9061        3         3  13 533252.2
## 77  2463.8855        4         4  15 443318.4
## 78  2787.9295        4         1  12 499302.7
## 79  2109.4463        3         1  11 375634.1
## 80  1979.2124        5         1  11 428779.1
## 81  3066.9384        2         4  14 514938.9
## 82  2180.1629        3         2  12 385172.7
## 83  1706.0471        4         3  21 336780.8
## 84  1935.9288        3         2  11 327160.0
## 85  1642.2857        5         4  13 363570.5
## 86  2227.1912        2         2  15 399554.7
## 87  2071.3910        2         4   8 395315.7
## 88  1474.7602        2         1  11 223182.5
## 89  1288.0843        2         3  10 223701.4
## 90  1933.3378        3         3   7 321412.7
## 91   933.3453        5         3  19 153916.8
## 92  2219.1687        3         4  10 393919.9
## 93  1817.0478        3         2  15 324853.7
## 94  3663.7051        3         4   9 592780.3
## 95  2303.0893        3         4  11 400012.6
## 96  1794.0573        2         1  14 351584.5
## 97  1737.0622        4         4  13 310984.9
## 98  2071.0548        5         4   8 395858.2
## 99  2126.3672        3         2  14 367972.6
## 100 2145.9714        5         3   9 377231.4
plot(Price ~ ., data = house_data)

Model:

model2 <- lm(house_data$Price ~ house_data$Size + house_data$Bedrooms + house_data$Bathrooms + house_data$Age)
print(model2)
## 
## Call:
## lm(formula = house_data$Price ~ house_data$Size + house_data$Bedrooms + 
##     house_data$Bathrooms + house_data$Age)
## 
## Coefficients:
##          (Intercept)       house_data$Size   house_data$Bedrooms  
##              23825.1                 153.6               14006.6  
## house_data$Bathrooms        house_data$Age  
##              -3162.3                -801.3
summary(model2)
## 
## Call:
## lm(formula = house_data$Price ~ house_data$Size + house_data$Bedrooms + 
##     house_data$Bathrooms + house_data$Age)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -62887 -16594   -249  15676  71293 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          23825.119  18782.041   1.269    0.208    
## house_data$Size        153.625      5.778  26.586  < 2e-16 ***
## house_data$Bedrooms  14006.635   2612.414   5.362 5.78e-07 ***
## house_data$Bathrooms -3162.302   2642.483  -1.197    0.234    
## house_data$Age        -801.333    827.140  -0.969    0.335    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 29290 on 95 degrees of freedom
## Multiple R-squared:  0.8848, Adjusted R-squared:   0.88 
## F-statistic: 182.5 on 4 and 95 DF,  p-value: < 2.2e-16

Analýza rezíduí:

model2$residuals 
##            1            2            3            4            5            6 
##  65685.50015   8873.92721 -22058.33188 -62887.43124 -13795.30961 -42873.87412 
##            7            8            9           10           11           12 
##   7993.04080  -9221.19959 -20941.18480  10441.33469 -54301.37314  12407.59655 
##           13           14           15           16           17           18 
##  -8723.15543  27715.06384 -19760.50590  51335.84540   2997.75989 -25341.17469 
##           19           20           21           22           23           24 
##    -35.66196 -17782.59803  54343.33835  -3219.67789    402.78239  57130.19489 
##           25           26           27           28           29           30 
##  45606.21433  11957.62401 -49555.88866 -15365.49383   5012.64317 -30010.95217 
##           31           32           33           34           35           36 
## -18943.57672  29244.01718 -17763.79958  -2487.49049 -21482.87278  15631.14684 
##           37           38           39           40           41           42 
##  20307.59818 -48797.68581  24398.73016  65833.94052   5949.08118   5722.18425 
##           43           44           45           46           47           48 
##  17840.12805  31005.34271 -14977.30475  35049.80979 -39310.78586  71293.36569 
##           49           50           51           52           53           54 
##  -9036.05146 -13398.43332 -49132.80662   1127.68871  13147.52858   -462.70989 
##           55           56           57           58           59           60 
##   6816.14311 -38758.41830  -3890.40337  33219.27081  -9318.76907  -1173.00935 
##           61           62           63           64           65           66 
##  -4756.37461 -16204.28035 -13074.08294  30938.44857 -39383.39352  -9105.47403 
##           67           68           69           70           71           72 
##  19796.39487 -61096.67712  -7231.57514 -15342.99116 -20643.49179   3134.41977 
##           73           74           75           76           77           78 
##   8508.87322  20228.09903   9225.61695  14928.40216   9621.10016   3933.16951 
##           79           80           81           82           83           84 
##  -2298.02216  42840.91541  15809.40681    340.34109  21152.24201 -20953.21084 
##           85           86           87           88           89           90 
##  40482.25606  23908.25668  44319.38688 -43239.28665  -8518.97667 -26345.49628 
##           91           92           93           94           95           96 
## -58614.59651   7817.17764  -1791.05981 -16040.96967   1818.91778  38514.68940 
##           97           98           99          100 
## -12656.76263   2893.58831  -6992.67780 -29603.21375

Sú normálne rozdelené? Shapiro-Wilk test

shapiro.test(model2$residuals) 
## 
##  Shapiro-Wilk normality test
## 
## data:  model2$residuals
## W = 0.9866, p-value = 0.4115

Rezíduá je normalne rozdelené.

Je stredná hodnota nula? t-test

t.test(model2$residuals, mu=0)
## 
##  One Sample t-test
## 
## data:  model2$residuals
## t = 2.9635e-16, df = 99, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -5692.61  5692.61
## sample estimates:
##    mean of x 
## 8.501999e-13

Stredna hodnota je nula.

Sú náhodné? Runs test a turning point test:

runs.test(model2$residuals)
## 
##  Runs Test
## 
## data:  model2$residuals
## statistic = 0.20102, runs = 52, n1 = 50, n2 = 50, n = 100, p-value =
## 0.8407
## alternative hypothesis: nonrandomness
turning.point.test(model2$residuals)
## 
##  Turning Point Test
## 
## data:  model2$residuals
## statistic = 0.63827, n = 100, p-value = 0.5233
## alternative hypothesis: non randomness

Rezíduá sú náhodné.

Platí homoskedasticita? Golfeld-Quandtov test

gqtest(model2)
## 
##  Goldfeld-Quandt test
## 
## data:  model2
## GQ = 0.5548, df1 = 45, df2 = 45, p-value = 0.9745
## alternative hypothesis: variance increases from segment 1 to 2

Rezíduá maju konštantne rozptyly.

Sú nekorelované? Resp. neexistuje autokorelácia? Durbin-Watsonov test

durbinWatsonTest(model2)
##  lag Autocorrelation D-W Statistic p-value
##    1     -0.04793618      2.032168   0.902
##  Alternative hypothesis: rho != 0

Rezíduá sú nekorelované.

Porovnanie modelov:

model_quad2 <- lm(house_data$Price ~ house_data$Size + I(house_data$Size^2) + house_data$Bedrooms + I(house_data$Bedrooms^2) + house_data$Bathrooms + I(house_data$Bathrooms^2) + house_data$Age + + I(house_data$Age^2))  
print(model_quad2)
## 
## Call:
## lm(formula = house_data$Price ~ house_data$Size + I(house_data$Size^2) + 
##     house_data$Bedrooms + I(house_data$Bedrooms^2) + house_data$Bathrooms + 
##     I(house_data$Bathrooms^2) + house_data$Age + +I(house_data$Age^2))
## 
## Coefficients:
##               (Intercept)            house_data$Size  
##                 3.866e+04                  1.428e+02  
##      I(house_data$Size^2)        house_data$Bedrooms  
##                 2.591e-03                 -5.279e+03  
##  I(house_data$Bedrooms^2)       house_data$Bathrooms  
##                 2.780e+03                 -1.076e+04  
## I(house_data$Bathrooms^2)             house_data$Age  
##                 1.607e+03                  5.262e+03  
##       I(house_data$Age^2)  
##                -2.537e+02
anova(model2, model_quad2)
## Analysis of Variance Table
## 
## Model 1: house_data$Price ~ house_data$Size + house_data$Bedrooms + house_data$Bathrooms + 
##     house_data$Age
## Model 2: house_data$Price ~ house_data$Size + I(house_data$Size^2) + house_data$Bedrooms + 
##     I(house_data$Bedrooms^2) + house_data$Bathrooms + I(house_data$Bathrooms^2) + 
##     house_data$Age + +I(house_data$Age^2)
##   Res.Df        RSS Df  Sum of Sq    F Pr(>F)
## 1     95 8.1485e+10                          
## 2     91 7.7923e+10  4 3562088096 1.04 0.3911

Keď porovnáme hodnoty týchto dvoch modelov, zdá sa, že rozdiely v informačných kritériách nie sú príliš veľké. Teda zvolime linearný model.

Násobná regresia - dataset

Zadanie: Urobte lineárny regresný model pre premennú mpg z datasetu mtcars, pričom prediktormi budú premenné horsepower, weight a cylinders. Overte predpoklady pre tento model a otestujte jeho štatistickú významnosť. Vizualizujte vzťah reálnych a predikovaných hodnôt pre target. Dávajú iné premenné z datasetu ako prediktory lepší model a ak áno, tak ktoré?

model4 <- lm(mpg ~ hp + wt + cyl, data = mtcars)
summary(model4)
## 
## Call:
## lm(formula = mpg ~ hp + wt + cyl, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.9290 -1.5598 -0.5311  1.1850  5.8986 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 38.75179    1.78686  21.687  < 2e-16 ***
## hp          -0.01804    0.01188  -1.519 0.140015    
## wt          -3.16697    0.74058  -4.276 0.000199 ***
## cyl         -0.94162    0.55092  -1.709 0.098480 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.512 on 28 degrees of freedom
## Multiple R-squared:  0.8431, Adjusted R-squared:  0.8263 
## F-statistic: 50.17 on 3 and 28 DF,  p-value: 2.184e-11

Vizualizacia:

predicted <- predict(model4)
plot(mtcars$mpg, predicted, main="Reálne vs Predikované hodnoty",
     xlab="Reálne hodnoty (mpg)", ylab="Predikované hodnoty", pch=19, col="blue")
abline(0, 1, col="red", lty=2) # Ideálna zhoda

Analýza rezíduí:

model4$residuals 
##           Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive 
##          -1.8204257          -1.0128476          -3.1603990           0.4639233 
##   Hornet Sportabout             Valiant          Duster 360           Merc 240D 
##           1.5322025          -2.1503588          -1.1934238           0.6356864 
##            Merc 230            Merc 280           Merc 280C          Merc 450SE 
##          -0.4957351          -0.7890124          -2.1890124           1.3175861 
##          Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental 
##           1.1408152          -0.8008361          -0.4944331           0.2370012 
##   Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla 
##           4.5573819           5.5725355           1.4673228           5.8985522 
##       Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28 
##          -3.9290355          -1.8653922          -2.4345849          -1.3383411 
##    Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa 
##           3.3148266          -0.3667124          -0.5665304           2.2446157 
##      Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E 
##          -0.6174892          -1.4729031           1.1300053          -2.8149817

Sú normálne rozdelené? Shapiro-Wilk test

shapiro.test(model4$residuals) 
## 
##  Shapiro-Wilk normality test
## 
## data:  model4$residuals
## W = 0.93455, p-value = 0.05252

Rezíduá je normalne rozdelené.

Je stredná hodnota nula? t-test

t.test(model4$residuals, mu=0)
## 
##  One Sample t-test
## 
## data:  model4$residuals
## t = 2.5078e-16, df = 31, p-value = 1
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.8605801  0.8605801
## sample estimates:
##    mean of x 
## 1.058181e-16

Stredna hodnota je nula.

Sú náhodné? Runs test a turning point test:

runs.test(model4$residuals)
## 
##  Runs Test
## 
## data:  model4$residuals
## statistic = -0.7188, runs = 15, n1 = 16, n2 = 16, n = 32, p-value =
## 0.4723
## alternative hypothesis: nonrandomness
turning.point.test(model4$residuals)
## 
##  Turning Point Test
## 
## data:  model4$residuals
## statistic = -0.43167, n = 32, p-value = 0.666
## alternative hypothesis: non randomness

Rezíduá sú náhodné.

Platí homoskedasticita? Golfeld-Quandtov test

gqtest(model4)
## 
##  Goldfeld-Quandt test
## 
## data:  model4
## GQ = 6.6633, df1 = 12, df2 = 12, p-value = 0.001265
## alternative hypothesis: variance increases from segment 1 to 2

Rezíduá maju konštantne rozptyly.

Sú nekorelované? Resp. neexistuje autokorelácia? Durbin-Watsonov test

durbinWatsonTest(model4)
##  lag Autocorrelation D-W Statistic p-value
##    1        0.146151       1.64407   0.158
##  Alternative hypothesis: rho != 0

Rezíduá sú nekorelované.

Porovnanie modelov:

model5 <- lm(mtcars$mpg ~ mtcars$disp + mtcars$drat + mtcars$qsec)
model6 <- lm(mtcars$mpg~ mtcars$am + mtcars$gear)
model7 <- lm(mtcars$mpg ~ mtcars$hp + mtcars$qsec + mtcars$vs)
model8 <- lm(mtcars$mpg ~ mtcars$disp + mtcars$qsec + mtcars$wt)
c(AIC(model4),AIC(model5),AIC(model6),AIC(model7),AIC(model8))
## [1] 155.4766 171.4928 198.4822 175.1748 158.7203
c(BIC(model4),BIC(model5),BIC(model6),BIC(model7),BIC(model8))
## [1] 162.8053 178.8215 204.3452 182.5034 166.0490

Najlepši je 8 model.