hometask3

Please, don’t change the structure of the document. Submit your knitted .html file no later that 11:59 p.m. May, 15 as Home task 2 project in LMS.

You have a dataset containing data on 21 European countries (task2.csv).

smoke2 - whether a respondent smokes or not (1 - yes, 0 - no) hinctnta - household net income country - a country where each respondent lives price - the average price for a Marlboro pack

Your dependent variable is smoke2.

Task 1

Calculate ICC for the empty model. Decide, whether multilevel modeling is necessary (1 point)
(For educational purposes ignore the result and proceed with MLM)

data<-read.csv("C:/Users/ASUS/Desktop/multilevel regression 2020/hometask3.csv")
library(lme4)

## Warning: package 'lme4' was built under R version 3.6.3

## Loading required package: Matrix

moodel <- lmer(smoke2~ (1 | cntry), REML = FALSE, data = data)
summary(moodel)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: smoke2 ~ (1 | cntry)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##  45228.3  45254.1 -22611.2  45222.3    40114 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.6731 -0.6137 -0.5381 -0.3248  2.0290 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  cntry    (Intercept) 0.001338 0.03658 
##  Residual             0.180499 0.42485 
## Number of obs: 40117, groups:  cntry, 21
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.237500   0.008272   28.71

library(ICC)
ICCbare(cntry,smoke2,data)

## NAs removed from rows:
##  125 1901 4364 5416 5889 6763 7667 8387 8832 9087 10840 10850 10940 11294 12022 12916 14089 14580 16752 22426 22811 23007 23562 23805 24440 25017 25063 25153 25156 25441 25444 25447 25449 25452 25594 25961 26860 27341 27342 27347 27348 27476 27477 27654 27701 27788 27815 28045 28129 28316 28804 29310 29583 30581 30583 30675 30921 30922 32538 33676 34283 36894 37179 37257 37510 38428 38805 39349

## Warning in ICCbare(cntry, smoke2, data):

## [1] 0.007264089

# The result shows that multelvel modeling in unecessary

Task 2

Add an income variable as a fixed effect. Interpret the results (1 point)

model1<-lmer(smoke2~hinctnta+(1|cntry),REML = FALSE,data = data)
summary(model1)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: smoke2 ~ hinctnta + (1 | cntry)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##  36021.0  36121.4 -17998.5  35997.0    31848 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.8824 -0.6144 -0.5189 -0.2870  2.1254 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  cntry    (Intercept) 0.001481 0.03848 
##  Residual             0.180925 0.42535 
## Number of obs: 31860, groups:  cntry, 20
## 
## Fixed effects:
##                          Estimate Std. Error t value
## (Intercept)              0.253589   0.011263  22.516
## hinctntaD - 9th decile  -0.041808   0.011133  -3.755
## hinctntaF - 5th decile  -0.024556   0.010242  -2.398
## hinctntaH - 10th decile -0.067957   0.011034  -6.159
## hinctntaJ - 1st decile   0.054405   0.010646   5.110
## hinctntaK - 7th decile  -0.021782   0.010417  -2.091
## hinctntaM - 4th decile   0.000137   0.010181   0.013
## hinctntaP - 8th decile  -0.040566   0.010514  -3.858
## hinctntaR - 2nd decile   0.015292   0.010268   1.489
## hinctntaS - 6th decile  -0.006327   0.010335  -0.612
## 
## Correlation of Fixed Effects:
##             (Intr) hnD-9d hnF-5d hH-10d hnJ-1d hnK-7d hnM-4d hnP-8d hnR-2d
## hnctntD-9td -0.420                                                        
## hnctntF-5td -0.455  0.460                                                 
## hnctntH-10d -0.425  0.437  0.464                                          
## hnctntJ-1sd -0.436  0.440  0.479  0.444                                   
## hnctntK-7td -0.448  0.456  0.492  0.461  0.470                            
## hnctntM-4td -0.457  0.462  0.502  0.467  0.481  0.494                     
## hnctntP-8td -0.444  0.454  0.487  0.460  0.465  0.483  0.489              
## hnctntR-2nd -0.453  0.457  0.497  0.461  0.481  0.488  0.500  0.483       
## hnctntS-6td -0.451  0.458  0.496  0.463  0.474  0.489  0.498  0.485  0.492

## acording with the income, the sample size shows the low tendence to buy cigarettes

Task 3

Add the variable price into your model. Interpret the result (1 point)

I strongly recommend to scale that variable

model2 <- lmer(smoke2~ (1 | cntry), REML = FALSE, data = data)
summary(model2)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: smoke2 ~ (1 | cntry)
##    Data: data
## 
##      AIC      BIC   logLik deviance df.resid 
##  45228.3  45254.1 -22611.2  45222.3    40114 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.6731 -0.6137 -0.5381 -0.3248  2.0290 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  cntry    (Intercept) 0.001338 0.03658 
##  Residual             0.180499 0.42485 
## Number of obs: 40117, groups:  cntry, 21
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 0.237500   0.008272   28.71

library(sjPlot)

## Warning: package 'sjPlot' was built under R version 3.6.3

## Install package "strengejacke" from GitHub (`devtools::install_github("strengejacke/strengejacke")`) to load all sj-packages at once!

plot_model(model2, type = "re", sort.est = "sort.all", grid = F)

## Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
## TMB was built with Matrix version 1.2.18
## Current Matrix version is 1.2.17
## Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package

#Most of the scandinavian countries and portugal do not smoke in comarison with the others countries

Task 4

Randomize household income. Visualize the random effect (1 point). Briefly describe the plot (1 point). Decide whether it is a good idea to randomize that variable. (1 point)

model3<-lmer(smoke2 ~ hinctnta + price + ( 1 + hinctnta | cntry), data = data)

## boundary (singular) fit: see ?isSingular

summary(model3)

## Linear mixed model fit by REML ['lmerMod']
## Formula: smoke2 ~ hinctnta + price + (1 + hinctnta | cntry)
##    Data: data
## 
## REML criterion at convergence: 35940.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.0156 -0.6249 -0.5094 -0.2580  2.1640 
## 
## Random effects:
##  Groups   Name                    Variance Std.Dev. Corr                   
##  cntry    (Intercept)             0.001008 0.03174                         
##           hinctntaD - 9th decile  0.002807 0.05298  -0.52                  
##           hinctntaF - 5th decile  0.001009 0.03176  -0.34  0.59            
##           hinctntaH - 10th decile 0.003304 0.05748  -0.44  0.82  0.83      
##           hinctntaJ - 1st decile  0.004462 0.06680   0.54 -0.33  0.09 -0.34
##           hinctntaK - 7th decile  0.001695 0.04117  -0.69  0.84  0.67  0.81
##           hinctntaM - 4th decile  0.001106 0.03326  -0.15  0.58  0.56  0.32
##           hinctntaP - 8th decile  0.001333 0.03651  -0.38  0.76  0.43  0.57
##           hinctntaR - 2nd decile  0.002420 0.04920   0.36 -0.23 -0.18 -0.54
##           hinctntaS - 6th decile  0.001442 0.03797  -0.38  0.54  0.99  0.82
##  Residual                         0.179775 0.42400                         
##                               
##                               
##                               
##                               
##                               
##                               
##  -0.38                        
##   0.41  0.48                  
##  -0.44  0.73  0.58            
##   0.84 -0.39  0.53 -0.24      
##   0.02  0.69  0.49  0.43 -0.27
##                               
## Number of obs: 31860, groups:  cntry, 20
## 
## Fixed effects:
##                          Estimate Std. Error t value
## (Intercept)              0.325143   0.019699  16.505
## hinctntaD - 9th decile  -0.037469   0.016399  -2.285
## hinctntaF - 5th decile  -0.024834   0.012506  -1.986
## hinctntaH - 10th decile -0.064433   0.017145  -3.758
## hinctntaJ - 1st decile   0.048070   0.018520   2.596
## hinctntaK - 7th decile  -0.019774   0.013963  -1.416
## hinctntaM - 4th decile  -0.001489   0.012673  -0.117
## hinctntaP - 8th decile  -0.040463   0.013403  -3.019
## hinctntaR - 2nd decile   0.014068   0.015122   0.930
## hinctntaS - 6th decile  -0.006919   0.013431  -0.515
## price                   -0.009435   0.002231  -4.228
## 
## Correlation of Fixed Effects:
##             (Intr) hnD-9d hnF-5d hH-10d hnJ-1d hnK-7d hnM-4d hnP-8d hnR-2d
## hnctntD-9td -0.318                                                        
## hnctntF-5td -0.304  0.502                                                 
## hnctntH-10d -0.294  0.647  0.603                                          
## hnctntJ-1sd -0.021 -0.017  0.273 -0.040                                   
## hnctntK-7td -0.375  0.637  0.557  0.625  0.002                            
## hnctntM-4td -0.268  0.506  0.526  0.388  0.423  0.489                     
## hnctntP-8td -0.306  0.581  0.465  0.492 -0.006  0.583  0.523              
## hnctntR-2nd -0.088  0.093  0.204 -0.091  0.688  0.066  0.508  0.153       
## hnctntS-6td -0.309  0.488  0.676  0.625  0.226  0.576  0.496  0.461  0.136
## price       -0.856  0.020  0.022  0.019  0.038  0.018  0.026  0.023  0.007
##             hnS-6d
## hnctntD-9td       
## hnctntF-5td       
## hnctntH-10d       
## hnctntJ-1sd       
## hnctntK-7td       
## hnctntM-4td       
## hnctntP-8td       
## hnctntR-2nd       
## hnctntS-6td       
## price        0.023
## convergence code: 0
## boundary (singular) fit: see ?isSingular

plot_model(model3,type = "re",sort.est = "sort.all", grid = T)

## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.

## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.
## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.

# The countie changes per country

Task 5

Add the cross-level interaction effect to the model. Visualize interactive effect. Interpret the result (1 point)

Decide whether adding interaction effect statistically improves your model (1 point)

model4<-lmer(smoke2 ~ hinctnta + price + hinctnta*price + ( 1 | cntry),data)
summary(model4)

## Linear mixed model fit by REML ['lmerMod']
## Formula: smoke2 ~ hinctnta + price + hinctnta * price + (1 | cntry)
##    Data: data
## 
## REML criterion at convergence: 36098.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -0.8228 -0.6267 -0.5270 -0.2410  2.1641 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  cntry    (Intercept) 0.001191 0.03451 
##  Residual             0.180648 0.42503 
## Number of obs: 31860, groups:  cntry, 20
## 
## Fixed effects:
##                                Estimate Std. Error t value
## (Intercept)                    0.268236   0.029292   9.157
## hinctntaD - 9th decile         0.044526   0.032614   1.365
## hinctntaF - 5th decile         0.044552   0.028591   1.558
## hinctntaH - 10th decile        0.022499   0.031635   0.711
## hinctntaJ - 1st decile        -0.019163   0.028990  -0.661
## hinctntaK - 7th decile         0.066492   0.030171   2.204
## hinctntaM - 4th decile         0.047227   0.028425   1.661
## hinctntaP - 8th decile         0.022776   0.030335   0.751
## hinctntaR - 2nd decile        -0.011235   0.028315  -0.397
## hinctntaS - 6th decile         0.049393   0.029296   1.686
## price                         -0.002166   0.003605  -0.601
## hinctntaD - 9th decile:price  -0.011459   0.004051  -2.828
## hinctntaF - 5th decile:price  -0.008986   0.003469  -2.591
## hinctntaH - 10th decile:price -0.011994   0.003918  -3.062
## hinctntaJ - 1st decile:price   0.008810   0.003374   2.611
## hinctntaK - 7th decile:price  -0.011632   0.003713  -3.133
## hinctntaM - 4th decile:price  -0.006127   0.003456  -1.773
## hinctntaP - 8th decile:price  -0.008354   0.003751  -2.227
## hinctntaR - 2nd decile:price   0.003250   0.003351   0.970
## hinctntaS - 6th decile:price  -0.007308   0.003597  -2.032

## 
## Correlation matrix not shown by default, as p = 20 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it

plot_model(model4,type = "int", sort.est = "sort.all", grid = F)

# high price means low demand of people who consume this product

Task 6

Calculate odds ratios for fixed effects and interpret the results (1 point)

plot_model(model4, type = "re", sort.est = "sort.all", grid = T)

## Sorting each group of random effects ('sort.all') is not possible when 'facets = TRUE'.

exp(fixef(model4))

##                   (Intercept)        hinctntaD - 9th decile 
##                     1.3076564                     1.0455325 
##        hinctntaF - 5th decile       hinctntaH - 10th decile 
##                     1.0455591                     1.0227542 
##        hinctntaJ - 1st decile        hinctntaK - 7th decile 
##                     0.9810195                     1.0687524 
##        hinctntaM - 4th decile        hinctntaP - 8th decile 
##                     1.0483594                     1.0230375 
##        hinctntaR - 2nd decile        hinctntaS - 6th decile 
##                     0.9888281                     1.0506334 
##                         price  hinctntaD - 9th decile:price 
##                     0.9978359                     0.9886064 
##  hinctntaF - 5th decile:price hinctntaH - 10th decile:price 
##                     0.9910546                     0.9880773 
##  hinctntaJ - 1st decile:price  hinctntaK - 7th decile:price 
##                     1.0088493                     0.9884349 
##  hinctntaM - 4th decile:price  hinctntaP - 8th decile:price 
##                     0.9938919                     0.9916804 
##  hinctntaR - 2nd decile:price  hinctntaS - 6th decile:price 
##                     1.0032553                     0.9927187

## people located in the hinctntaR - 2nd decile:price  1.0032553 tends  6 times less to smoke in comparison with the rest.

Task 7

Create a table combining outputs of all estimated models (1 point)

tab_model(moodel,model1,model2,model3,model4, show.ci = )

	smoke2			smoke2			smoke2			smoke2			smoke2
Predictors	Estimates	CI	p	Estimates	CI	p	Estimates	CI	p	Estimates	CI	p	Estimates	CI	p
(Intercept)	0.24	0.22 – 0.25	<0.001	0.25	0.23 – 0.28	<0.001	0.24	0.22 – 0.25	<0.001	0.33	0.29 – 0.36	<0.001	0.27	0.21 – 0.33	<0.001
hinctnta [D - 9th decile]				-0.04	-0.06 – -0.02	<0.001				-0.04	-0.07 – -0.01	0.022	0.04	-0.02 – 0.11	0.172
hinctnta [F - 5th decile]				-0.02	-0.04 – -0.00	0.017				-0.02	-0.05 – -0.00	0.047	0.04	-0.01 – 0.10	0.119
hinctnta [H - 10th decile]				-0.07	-0.09 – -0.05	<0.001				-0.06	-0.10 – -0.03	<0.001	0.02	-0.04 – 0.08	0.477
hinctnta [J - 1st decile]				0.05	0.03 – 0.08	<0.001				0.05	0.01 – 0.08	0.009	-0.02	-0.08 – 0.04	0.509
hinctnta [K - 7th decile]				-0.02	-0.04 – -0.00	0.037				-0.02	-0.05 – 0.01	0.157	0.07	0.01 – 0.13	0.028
hinctnta [M - 4th decile]				0.00	-0.02 – 0.02	0.989				-0.00	-0.03 – 0.02	0.906	0.05	-0.01 – 0.10	0.097
hinctnta [P - 8th decile]				-0.04	-0.06 – -0.02	<0.001				-0.04	-0.07 – -0.01	0.003	0.02	-0.04 – 0.08	0.453
hinctnta [R - 2nd decile]				0.02	-0.00 – 0.04	0.136				0.01	-0.02 – 0.04	0.352	-0.01	-0.07 – 0.04	0.692
hinctnta [S - 6th decile]				-0.01	-0.03 – 0.01	0.540				-0.01	-0.03 – 0.02	0.606	0.05	-0.01 – 0.11	0.092
price										-0.01	-0.01 – -0.01	<0.001	-0.00	-0.01 – 0.00	0.548
hinctnta [D - 9th decile] * price													-0.01	-0.02 – -0.00	0.005
hinctnta [F - 5th decile] * price													-0.01	-0.02 – -0.00	0.010
hinctnta [H - 10th decile] * price													-0.01	-0.02 – -0.00	0.002
hinctnta [J - 1st decile] * price													0.01	0.00 – 0.02	0.009
hinctnta [K - 7th decile] * price													-0.01	-0.02 – -0.00	0.002
hinctnta [M - 4th decile] * price													-0.01	-0.01 – 0.00	0.076
hinctnta [P - 8th decile] * price													-0.01	-0.02 – -0.00	0.026
hinctnta [R - 2nd decile] * price													0.00	-0.00 – 0.01	0.332
hinctnta [S - 6th decile] * price													-0.01	-0.01 – -0.00	0.042
Random Effects
σ²	0.18			0.18			0.18			0.18			0.18
τ₀₀	0.00 _cntry			0.00 _cntry			0.00 _cntry			0.00 _cntry			0.00 _cntry
τ₁₁										0.00 _{cntry.hinctntaD - 9th decile}
										0.00 _{cntry.hinctntaF - 5th decile}
										0.00 _{cntry.hinctntaH - 10th decile}
										0.00 _{cntry.hinctntaJ - 1st decile}
										0.00 _{cntry.hinctntaK - 7th decile}
										0.00 _{cntry.hinctntaM - 4th decile}
										0.00 _{cntry.hinctntaP - 8th decile}
										0.00 _{cntry.hinctntaR - 2nd decile}
										0.00 _{cntry.hinctntaS - 6th decile}
ρ₀₁										-0.52
										-0.34
										-0.44
										0.54
										-0.69
										-0.15
										-0.38
										0.36
										-0.38
ICC	0.01			0.01			0.01						0.01
N	21 _cntry			20 _cntry			21 _cntry			20 _cntry			20 _cntry
Observations	40117			31860			40117			31860			31860
Marginal R² / Conditional R²	0.000 / 0.007			0.005 / 0.013			0.000 / 0.007			0.008 / NA			0.009 / 0.015

summary(cars)

##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.