Please, don’t change the structure of the document. Submit your knitted .html file no later that 11:59 p.m. May, 15 as Home task 2 project in LMS.
You have a dataset containing data on 21 European countries (task2.csv).
smoke2 - whether a respondent smokes or not (1 - yes, 0 - no) hinctnta - household net income country - a country where each respondent lives price - the average price for a Marlboro pack
Your dependent variable is smoke2.
Calculate ICC for the empty model. Decide, whether multilevel modeling is necessary (1 point)
(For educational purposes ignore the result and proceed with MLM)
data<-read.csv("C:/Users/ASUS/Desktop/multilevel regression 2020/hometask3.csv")
library(lme4)
## Warning: package 'lme4' was built under R version 3.6.3
## Loading required package: Matrix
moodel <- lmer(smoke2~ (1 | cntry), REML = FALSE, data = data)
summary(moodel)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: smoke2 ~ (1 | cntry)
## Data: data
##
## AIC BIC logLik deviance df.resid
## 45228.3 45254.1 -22611.2 45222.3 40114
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.6731 -0.6137 -0.5381 -0.3248 2.0290
##
## Random effects:
## Groups Name Variance Std.Dev.
## cntry (Intercept) 0.001338 0.03658
## Residual 0.180499 0.42485
## Number of obs: 40117, groups: cntry, 21
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.237500 0.008272 28.71
library(ICC)
ICCbare(cntry,smoke2,data)
## NAs removed from rows:
## 125 1901 4364 5416 5889 6763 7667 8387 8832 9087 10840 10850 10940 11294 12022 12916 14089 14580 16752 22426 22811 23007 23562 23805 24440 25017 25063 25153 25156 25441 25444 25447 25449 25452 25594 25961 26860 27341 27342 27347 27348 27476 27477 27654 27701 27788 27815 28045 28129 28316 28804 29310 29583 30581 30583 30675 30921 30922 32538 33676 34283 36894 37179 37257 37510 38428 38805 39349
## Warning in ICCbare(cntry, smoke2, data):
## [1] 0.007264089
# The result shows that multelvel modeling in unecessary
Add an income variable as a fixed effect. Interpret the results (1 point)
model1<-lmer(smoke2~hinctnta+(1|cntry),REML = FALSE,data = data)
summary(model1)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: smoke2 ~ hinctnta + (1 | cntry)
## Data: data
##
## AIC BIC logLik deviance df.resid
## 36021.0 36121.4 -17998.5 35997.0 31848
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.8824 -0.6144 -0.5189 -0.2870 2.1254
##
## Random effects:
## Groups Name Variance Std.Dev.
## cntry (Intercept) 0.001481 0.03848
## Residual 0.180925 0.42535
## Number of obs: 31860, groups: cntry, 20
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.253589 0.011263 22.516
## hinctntaD - 9th decile -0.041808 0.011133 -3.755
## hinctntaF - 5th decile -0.024556 0.010242 -2.398
## hinctntaH - 10th decile -0.067957 0.011034 -6.159
## hinctntaJ - 1st decile 0.054405 0.010646 5.110
## hinctntaK - 7th decile -0.021782 0.010417 -2.091
## hinctntaM - 4th decile 0.000137 0.010181 0.013
## hinctntaP - 8th decile -0.040566 0.010514 -3.858
## hinctntaR - 2nd decile 0.015292 0.010268 1.489
## hinctntaS - 6th decile -0.006327 0.010335 -0.612
##
## Correlation of Fixed Effects:
## (Intr) hnD-9d hnF-5d hH-10d hnJ-1d hnK-7d hnM-4d hnP-8d hnR-2d
## hnctntD-9td -0.420
## hnctntF-5td -0.455 0.460
## hnctntH-10d -0.425 0.437 0.464
## hnctntJ-1sd -0.436 0.440 0.479 0.444
## hnctntK-7td -0.448 0.456 0.492 0.461 0.470
## hnctntM-4td -0.457 0.462 0.502 0.467 0.481 0.494
## hnctntP-8td -0.444 0.454 0.487 0.460 0.465 0.483 0.489
## hnctntR-2nd -0.453 0.457 0.497 0.461 0.481 0.488 0.500 0.483
## hnctntS-6td -0.451 0.458 0.496 0.463 0.474 0.489 0.498 0.485 0.492
## acording with the income, the sample size shows the low tendence to buy cigarettes
Add the variable price into your model. Interpret the result (1 point)
I strongly recommend to scale that variable
model2 <- lmer(smoke2~ (1 | cntry), REML = FALSE, data = data)
summary(model2)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: smoke2 ~ (1 | cntry)
## Data: data
##
## AIC BIC logLik deviance df.resid
## 45228.3 45254.1 -22611.2 45222.3 40114
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.6731 -0.6137 -0.5381 -0.3248 2.0290
##
## Random effects:
## Groups Name Variance Std.Dev.
## cntry (Intercept) 0.001338 0.03658
## Residual 0.180499 0.42485
## Number of obs: 40117, groups: cntry, 21
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.237500 0.008272 28.71
library(sjPlot)
## Warning: package 'sjPlot' was built under R version 3.6.3
## Install package "strengejacke" from GitHub (`devtools::install_github("strengejacke/strengejacke")`) to load all sj-packages at once!
plot_model(model2, type = "re", sort.est = "sort.all", grid = F)
## Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
## TMB was built with Matrix version 1.2.18
## Current Matrix version is 1.2.17
## Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
#Most of the scandinavian countries and portugal do not smoke in comarison with the others countries
Randomize household income. Visualize the random effect (1 point). Briefly describe the plot (1 point). Decide whether it is a good idea to randomize that variable. (1 point)
model3<-lmer(smoke2 ~ hinctnta + price + ( 1 + hinctnta | cntry), data = data)
## boundary (singular) fit: see ?isSingular
summary(model3)
## Linear mixed model fit by REML ['lmerMod']
## Formula: smoke2 ~ hinctnta + price + (1 + hinctnta | cntry)
## Data: data
##
## REML criterion at convergence: 35940.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.0156 -0.6249 -0.5094 -0.2580 2.1640
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## cntry (Intercept) 0.001008 0.03174
## hinctntaD - 9th decile 0.002807 0.05298 -0.52
## hinctntaF - 5th decile 0.001009 0.03176 -0.34 0.59
## hinctntaH - 10th decile 0.003304 0.05748 -0.44 0.82 0.83
## hinctntaJ - 1st decile 0.004462 0.06680 0.54 -0.33 0.09 -0.34
## hinctntaK - 7th decile 0.001695 0.04117 -0.69 0.84 0.67 0.81
## hinctntaM - 4th decile 0.001106 0.03326 -0.15 0.58 0.56 0.32
## hinctntaP - 8th decile 0.001333 0.03651 -0.38 0.76 0.43 0.57
## hinctntaR - 2nd decile 0.002420 0.04920 0.36 -0.23 -0.18 -0.54
## hinctntaS - 6th decile 0.001442 0.03797 -0.38 0.54 0.99 0.82
## Residual 0.179775 0.42400
##
##
##
##
##
##
## -0.38
## 0.41 0.48
## -0.44 0.73 0.58
## 0.84 -0.39 0.53 -0.24
## 0.02 0.69 0.49 0.43 -0.27
##
## Number of obs: 31860, groups: cntry, 20
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.325143 0.019699 16.505
## hinctntaD - 9th decile -0.037469 0.016399 -2.285
## hinctntaF - 5th decile -0.024834 0.012506 -1.986
## hinctntaH - 10th decile -0.064433 0.017145 -3.758
## hinctntaJ - 1st decile 0.048070 0.018520 2.596
## hinctntaK - 7th decile -0.019774 0.013963 -1.416
## hinctntaM - 4th decile -0.001489 0.012673 -0.117
## hinctntaP - 8th decile -0.040463 0.013403 -3.019
## hinctntaR - 2nd decile 0.014068 0.015122 0.930
## hinctntaS - 6th decile -0.006919 0.013431 -0.515
## price -0.009435 0.002231 -4.228
##
## Correlation of Fixed Effects:
## (Intr) hnD-9d hnF-5d hH-10d hnJ-1d hnK-7d hnM-4d hnP-8d hnR-2d
## hnctntD-9td -0.318
## hnctntF-5td -0.304 0.502
## hnctntH-10d -0.294 0.647 0.603
## hnctntJ-1sd -0.021 -0.017 0.273 -0.040
## hnctntK-7td -0.375 0.637 0.557 0.625 0.002
## hnctntM-4td -0.268 0.506 0.526 0.388 0.423 0.489
## hnctntP-8td -0.306 0.581 0.465 0.492 -0.006 0.583 0.523
## hnctntR-2nd -0.088 0.093 0.204 -0.091 0.688 0.066 0.508 0.153
## hnctntS-6td -0.309 0.488 0.676 0.625 0.226 0.576 0.496 0.461 0.136
## price -0.856 0.020 0.022 0.019 0.038 0.018 0.026 0.023 0.007
## hnS-6d
## hnctntD-9td
## hnctntF-5td
## hnctntH-10d
## hnctntJ-1sd
## hnctntK-7td
## hnctntM-4td
## hnctntP-8td
## hnctntR-2nd
## hnctntS-6td
## price 0.023
## convergence code: 0
## boundary (singular) fit: see ?isSingular
plot_model(model3,type = "re",sort.est = "sort.all", grid = T)
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
# The countie changes per country
Add the cross-level interaction effect to the model. Visualize interactive effect. Interpret the result (1 point)
Decide whether adding interaction effect statistically improves your model (1 point)
model4<-lmer(smoke2 ~ hinctnta + price + hinctnta*price + ( 1 | cntry),data)
summary(model4)
## Linear mixed model fit by REML ['lmerMod']
## Formula: smoke2 ~ hinctnta + price + hinctnta * price + (1 | cntry)
## Data: data
##
## REML criterion at convergence: 36098.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -0.8228 -0.6267 -0.5270 -0.2410 2.1641
##
## Random effects:
## Groups Name Variance Std.Dev.
## cntry (Intercept) 0.001191 0.03451
## Residual 0.180648 0.42503
## Number of obs: 31860, groups: cntry, 20
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.268236 0.029292 9.157
## hinctntaD - 9th decile 0.044526 0.032614 1.365
## hinctntaF - 5th decile 0.044552 0.028591 1.558
## hinctntaH - 10th decile 0.022499 0.031635 0.711
## hinctntaJ - 1st decile -0.019163 0.028990 -0.661
## hinctntaK - 7th decile 0.066492 0.030171 2.204
## hinctntaM - 4th decile 0.047227 0.028425 1.661
## hinctntaP - 8th decile 0.022776 0.030335 0.751
## hinctntaR - 2nd decile -0.011235 0.028315 -0.397
## hinctntaS - 6th decile 0.049393 0.029296 1.686
## price -0.002166 0.003605 -0.601
## hinctntaD - 9th decile:price -0.011459 0.004051 -2.828
## hinctntaF - 5th decile:price -0.008986 0.003469 -2.591
## hinctntaH - 10th decile:price -0.011994 0.003918 -3.062
## hinctntaJ - 1st decile:price 0.008810 0.003374 2.611
## hinctntaK - 7th decile:price -0.011632 0.003713 -3.133
## hinctntaM - 4th decile:price -0.006127 0.003456 -1.773
## hinctntaP - 8th decile:price -0.008354 0.003751 -2.227
## hinctntaR - 2nd decile:price 0.003250 0.003351 0.970
## hinctntaS - 6th decile:price -0.007308 0.003597 -2.032
##
## Correlation matrix not shown by default, as p = 20 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
plot_model(model4,type = "int", sort.est = "sort.all", grid = F)
# high price means low demand of people who consume this product
Calculate odds ratios for fixed effects and interpret the results (1 point)
plot_model(model4, type = "re", sort.est = "sort.all", grid = T)
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
exp(fixef(model4))
## (Intercept) hinctntaD - 9th decile
## 1.3076564 1.0455325
## hinctntaF - 5th decile hinctntaH - 10th decile
## 1.0455591 1.0227542
## hinctntaJ - 1st decile hinctntaK - 7th decile
## 0.9810195 1.0687524
## hinctntaM - 4th decile hinctntaP - 8th decile
## 1.0483594 1.0230375
## hinctntaR - 2nd decile hinctntaS - 6th decile
## 0.9888281 1.0506334
## price hinctntaD - 9th decile:price
## 0.9978359 0.9886064
## hinctntaF - 5th decile:price hinctntaH - 10th decile:price
## 0.9910546 0.9880773
## hinctntaJ - 1st decile:price hinctntaK - 7th decile:price
## 1.0088493 0.9884349
## hinctntaM - 4th decile:price hinctntaP - 8th decile:price
## 0.9938919 0.9916804
## hinctntaR - 2nd decile:price hinctntaS - 6th decile:price
## 1.0032553 0.9927187
## people located in the hinctntaR - 2nd decile:price 1.0032553 tends 6 times less to smoke in comparison with the rest.
Create a table combining outputs of all estimated models (1 point)
tab_model(moodel,model1,model2,model3,model4, show.ci = )
| smoke2 | smoke2 | smoke2 | smoke2 | smoke2 | |||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.24 | 0.22 – 0.25 | <0.001 | 0.25 | 0.23 – 0.28 | <0.001 | 0.24 | 0.22 – 0.25 | <0.001 | 0.33 | 0.29 – 0.36 | <0.001 | 0.27 | 0.21 – 0.33 | <0.001 |
| hinctnta [D - 9th decile] | -0.04 | -0.06 – -0.02 | <0.001 | -0.04 | -0.07 – -0.01 | 0.022 | 0.04 | -0.02 – 0.11 | 0.172 | ||||||
| hinctnta [F - 5th decile] | -0.02 | -0.04 – -0.00 | 0.017 | -0.02 | -0.05 – -0.00 | 0.047 | 0.04 | -0.01 – 0.10 | 0.119 | ||||||
|
hinctnta [H - 10th decile] |
-0.07 | -0.09 – -0.05 | <0.001 | -0.06 | -0.10 – -0.03 | <0.001 | 0.02 | -0.04 – 0.08 | 0.477 | ||||||
| hinctnta [J - 1st decile] | 0.05 | 0.03 – 0.08 | <0.001 | 0.05 | 0.01 – 0.08 | 0.009 | -0.02 | -0.08 – 0.04 | 0.509 | ||||||
| hinctnta [K - 7th decile] | -0.02 | -0.04 – -0.00 | 0.037 | -0.02 | -0.05 – 0.01 | 0.157 | 0.07 | 0.01 – 0.13 | 0.028 | ||||||
| hinctnta [M - 4th decile] | 0.00 | -0.02 – 0.02 | 0.989 | -0.00 | -0.03 – 0.02 | 0.906 | 0.05 | -0.01 – 0.10 | 0.097 | ||||||
| hinctnta [P - 8th decile] | -0.04 | -0.06 – -0.02 | <0.001 | -0.04 | -0.07 – -0.01 | 0.003 | 0.02 | -0.04 – 0.08 | 0.453 | ||||||
| hinctnta [R - 2nd decile] | 0.02 | -0.00 – 0.04 | 0.136 | 0.01 | -0.02 – 0.04 | 0.352 | -0.01 | -0.07 – 0.04 | 0.692 | ||||||
| hinctnta [S - 6th decile] | -0.01 | -0.03 – 0.01 | 0.540 | -0.01 | -0.03 – 0.02 | 0.606 | 0.05 | -0.01 – 0.11 | 0.092 | ||||||
| price | -0.01 | -0.01 – -0.01 | <0.001 | -0.00 | -0.01 – 0.00 | 0.548 | |||||||||
|
hinctnta [D - 9th decile] * price |
-0.01 | -0.02 – -0.00 | 0.005 | ||||||||||||
|
hinctnta [F - 5th decile] * price |
-0.01 | -0.02 – -0.00 | 0.010 | ||||||||||||
|
hinctnta [H - 10th decile] * price |
-0.01 | -0.02 – -0.00 | 0.002 | ||||||||||||
|
hinctnta [J - 1st decile] * price |
0.01 | 0.00 – 0.02 | 0.009 | ||||||||||||
|
hinctnta [K - 7th decile] * price |
-0.01 | -0.02 – -0.00 | 0.002 | ||||||||||||
|
hinctnta [M - 4th decile] * price |
-0.01 | -0.01 – 0.00 | 0.076 | ||||||||||||
|
hinctnta [P - 8th decile] * price |
-0.01 | -0.02 – -0.00 | 0.026 | ||||||||||||
|
hinctnta [R - 2nd decile] * price |
0.00 | -0.00 – 0.01 | 0.332 | ||||||||||||
|
hinctnta [S - 6th decile] * price |
-0.01 | -0.01 – -0.00 | 0.042 | ||||||||||||
| Random Effects | |||||||||||||||
| σ2 | 0.18 | 0.18 | 0.18 | 0.18 | 0.18 | ||||||||||
| τ00 | 0.00 cntry | 0.00 cntry | 0.00 cntry | 0.00 cntry | 0.00 cntry | ||||||||||
| τ11 | 0.00 cntry.hinctntaD - 9th decile | ||||||||||||||
| 0.00 cntry.hinctntaF - 5th decile | |||||||||||||||
| 0.00 cntry.hinctntaH - 10th decile | |||||||||||||||
| 0.00 cntry.hinctntaJ - 1st decile | |||||||||||||||
| 0.00 cntry.hinctntaK - 7th decile | |||||||||||||||
| 0.00 cntry.hinctntaM - 4th decile | |||||||||||||||
| 0.00 cntry.hinctntaP - 8th decile | |||||||||||||||
| 0.00 cntry.hinctntaR - 2nd decile | |||||||||||||||
| 0.00 cntry.hinctntaS - 6th decile | |||||||||||||||
| ρ01 | -0.52 | ||||||||||||||
| -0.34 | |||||||||||||||
| -0.44 | |||||||||||||||
| 0.54 | |||||||||||||||
| -0.69 | |||||||||||||||
| -0.15 | |||||||||||||||
| -0.38 | |||||||||||||||
| 0.36 | |||||||||||||||
| -0.38 | |||||||||||||||
| ICC | 0.01 | 0.01 | 0.01 | 0.01 | |||||||||||
| N | 21 cntry | 20 cntry | 21 cntry | 20 cntry | 20 cntry | ||||||||||
| Observations | 40117 | 31860 | 40117 | 31860 | 31860 | ||||||||||
| Marginal R2 / Conditional R2 | 0.000 / 0.007 | 0.005 / 0.013 | 0.000 / 0.007 | 0.008 / NA | 0.009 / 0.015 | ||||||||||
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.