Олеся Волченко
October 9, 2021
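Let's take a closer look at what happens when the trust variable is standardized: its mean (about 20.13) becomes zero up to floating-point error, and its standard deviation (about 5.22) becomes 1.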
## [1] 20.13244
## [1] -1.489098e-16
## [1] 5.220974
## [1] 1
Cross-tabulating the standardized values against the raw values of trust shows a one-to-one mapping: each raw value from 3 to 33 corresponds to exactly one standardized value, and every other cell of the table is 0. Only the nonzero cells are listed.
## standardized trust    trust  count
## -3.28146495740175         3     15
## -3.08992982836383         4      3
## -2.8983946993259          5      5
## -2.70685957028798         6      7
## -2.51532444125005         7     11
## -2.32378931221213         8     22
## -2.1322541831742          9     35
## -1.94071905413628        10     23
## -1.74918392509835        11     33
## -1.55764879606042        12     37
## -1.3661136670225         13     47
## -1.17457853798457        14     66
## -0.983043408946649       15     94
## -0.791508279908724       16    103
## -0.599973150870799       17    105
## -0.408438021832874       18    175
## -0.216902892794948       19    140
## -0.0253677637570231      20    163
## 0.166167365280902        21    150
## 0.357702494318827        22    212
## 0.549237623356753        23    165
## 0.740772752394678        24    179
## 0.932307881432603        25    166
## 1.12384301047053         26    108
## 1.31537813950845         27     88
## 1.50691326854638         28     44
## 1.6984483975843          29     22
## 1.88998352662223         30     12
## 2.08151865566015         31      2
## 2.27305378469808         32      4
## 2.464588913736           33     14
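The standardization shown above can be reproduced by hand. The sketch below uses a small made-up vector `x` (an assumption for illustration, not the ESS data) and checks that `scale()` gives the same result as subtracting the mean and dividing by the standard deviation.

```r
# Toy values standing in for a trust-like score (hypothetical, not the ESS data)
x <- c(3, 10, 15, 20, 20, 22, 25, 33)

# Manual z-score: subtract the mean, divide by the sample standard deviation
z_manual <- (x - mean(x)) / sd(x)

# scale() does the same but returns a one-column matrix, so flatten it
z_scale <- as.numeric(scale(x))

all.equal(z_manual, z_scale)   # TRUE
mean(z_scale)                  # zero up to floating-point error
sd(z_scale)                    # 1

# Equal raw values map to equal z-scores, which is why the cross-table
# of standardized against raw values has one nonzero cell per row
table(z = round(z_scale, 3), x = x)
```

Because the transformation is linear, it changes only the units of the variable, not the shape of its distribution.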
##
## Call:
## lm(formula = trust ~ gndr + agea + eduyrs + hinctnta + marsts,
## data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.4422 -3.1892 0.6636 3.7162 15.6552
##
## Coefficients:
##                                                                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                                                              13.65197    1.20466  11.333  < 2e-16 ***
## gndrFemale                                                                0.21721    0.34950   0.621 0.534407
## agea                                                                      0.04946    0.01280   3.865 0.000118 ***
## eduyrs                                                                    0.20883    0.04971   4.201 2.88e-05 ***
## hinctnta                                                                  0.15818    0.06588   2.401 0.016517 *
## marstsLegally divorced/civil union dissolved                             -0.60940    0.65411  -0.932 0.351730
## marstsWidowed/civil partner died                                          0.58870    0.70527   0.835 0.404072
## marstsNone of these (NEVER married or in legally registered civil union)  0.18298    0.63993   0.286 0.774978
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.438 on 1061 degrees of freedom
## (1195 observations deleted due to missingness)
## Multiple R-squared: 0.04898, Adjusted R-squared: 0.0427
## F-statistic: 7.806 on 7 and 1061 DF, p-value: 3.082e-09
model1scaled <- lm(scale(trust) ~ gndr + scale(as.numeric(agea)) + scale(as.numeric(eduyrs)) +
    scale(as.numeric(hinctnta)) + marsts, data = data1)
summary(model1scaled)
##
## Call:
## lm(formula = scale(trust) ~ gndr + scale(as.numeric(agea)) +
## scale(as.numeric(eduyrs)) + scale(as.numeric(hinctnta)) +
## marsts, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5323 -0.6108 0.1271 0.7118 2.9985
##
## Coefficients:
##                                                                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                                                              -0.05420    0.11438  -0.474 0.635684
## gndrFemale                                                                0.04160    0.06694   0.621 0.534407
## scale(as.numeric(agea))                                                   0.17419    0.04507   3.865 0.000118 ***
## scale(as.numeric(eduyrs))                                                 0.15000    0.03571   4.201 2.88e-05 ***
## scale(as.numeric(hinctnta))                                               0.09075    0.03779   2.401 0.016517 *
## marstsLegally divorced/civil union dissolved                             -0.11672    0.12528  -0.932 0.351730
## marstsWidowed/civil partner died                                          0.11276    0.13508   0.835 0.404072
## marstsNone of these (NEVER married or in legally registered civil union)  0.03505    0.12257   0.286 0.774978
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.042 on 1061 degrees of freedom
## (1195 observations deleted due to missingness)
## Multiple R-squared: 0.04898, Adjusted R-squared: 0.0427
## F-statistic: 7.806 on 7 and 1061 DF, p-value: 3.082e-09
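The two summaries above share the same t-values, p-values for the slopes, and \(R^2\); only the scale of the estimates differs. A minimal sketch of why, using the built-in `mtcars` data as a stand-in (an assumption; the helper `z()` is hypothetical and not part of the original code):

```r
# Built-in mtcars data as a stand-in for the survey data (assumption)
z <- function(v) as.numeric(scale(v))   # hypothetical helper: z-score as a plain vector

m_raw <- lm(mpg ~ wt + hp, data = mtcars)
m_std <- lm(z(mpg) ~ z(wt) + z(hp), data = mtcars)

# Slope t-values and R-squared do not change under standardization
t_raw <- summary(m_raw)$coefficients[-1, "t value"]
t_std <- summary(m_std)$coefficients[-1, "t value"]
all.equal(unname(t_raw), unname(t_std))
all.equal(summary(m_raw)$r.squared, summary(m_std)$r.squared)

# A standardized slope equals the raw slope times sd(x) / sd(y)
b_wt_std <- coef(m_raw)[["wt"]] * sd(mtcars$wt) / sd(mtcars$mpg)
all.equal(b_wt_std, coef(m_std)[["z(wt)"]])
```

Standardized slopes are expressed in standard deviations of the outcome per standard deviation of the predictor, which makes predictors measured in different units comparable.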
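library(foreign)
# Read the ESS data file; labelled variables are imported as factors
data <- read.spss("/Users/olesyavolchenko/Yandex.Disk.localized/datafiles/ESS/ESS7e02_2.sav", to.data.frame = TRUE, use.value.labels = TRUE)
# Keep respondents from the United Kingdom only
data1 <- data[which(data$cntry == "United Kingdom"), ]
# Trust index: sum of three items (if these are factors, as.numeric() returns level codes)
data1$trust <- as.numeric(data1$ppltrst) + as.numeric(data1$pplfair) + as.numeric(data1$pplhlp)
# Convert via as.character() so the recorded values, not the level codes, are used
data1$agea <- as.numeric(as.character(data1$agea))
data1$eduyrs <- as.numeric(as.character(data1$eduyrs))
data1$hinctnta <- as.numeric(data1$hinctnta)
# Keep complete cases on the variables used in the models that follow
data1 <- na.omit(data1[c("trust", "agea", "eduyrs", "hinctnta", "gndr")])
##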
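The preparation code converts `agea` and `eduyrs` through `as.character()` first, while the trust items go straight through `as.numeric()`. The difference matters when a column is a factor. A small sketch with made-up factors (hypothetical values, not the ESS data):

```r
# Hypothetical factor resembling an age column imported with value labels
age_f <- factor(c("25", "40", "40", "71"))

as.numeric(age_f)                 # level codes: 1 2 2 3
as.numeric(as.character(age_f))   # recorded values: 25 40 40 71

# Hypothetical 0-10 survey item stored as a factor
item <- factor(c(0, 5, 10), levels = 0:10)
codes <- as.numeric(item)         # 1 6 11, shifted by one relative to the labels
```

If the three trust items are factors with levels 0 to 10, summing their level codes would give an index running from 3 to 33 rather than 0 to 30, which would be consistent with the range of `trust` seen earlier. The shift is a constant, so it affects only the intercept of a regression on `trust`, not the slopes.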
## Call:
## lm(formula = trust ~ gndr, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.1719 -3.1719 0.8281 3.8281 12.8978
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20.10218 0.17540 114.606 <2e-16 ***
## gndrFemale 0.06976 0.23926 0.292 0.771
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.177 on 1881 degrees of freedom
## Multiple R-squared: 4.519e-05, Adjusted R-squared: -0.0004864
## F-statistic: 0.085 on 1 and 1881 DF, p-value: 0.7707
##
## Call:
## lm(formula = trust ~ gndr + agea, data = data1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.0089 -3.0064 0.6874 3.7257 13.4577
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.061918 0.397527 45.436 < 2e-16 ***
## gndrFemale 0.140800 0.237604 0.593 0.554
## agea 0.038273 0.006705 5.708 1.33e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.134 on 1880 degrees of freedom
## Multiple R-squared: 0.01708, Adjusted R-squared: 0.01603
## F-statistic: 16.33 on 2 and 1880 DF, p-value: 9.281e-08
## Analysis of Variance Table
##
## Model 1: trust ~ gndr
## Model 2: trust ~ gndr + agea
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 1881 50406
## 2 1880 49547 1 858.64 32.58 1.327e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The test shows that the second model fits significantly better (F = 32.58, p = 1.3e-08), so we can go on adding terms to the model.
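A comparison of nested models like the one above can be sketched as follows, using the built-in `mtcars` data as a stand-in (an assumption; the variables are not those of the ESS analysis). When exactly one coefficient is added, the F statistic from `anova()` equals the squared t-value of that coefficient, which matches the output above (5.708 squared is about 32.58).

```r
# Stand-in data: mtcars instead of the survey data (assumption)
m1 <- lm(mpg ~ am, data = mtcars)        # smaller model
m2 <- lm(mpg ~ am + wt, data = mtcars)   # same model plus one predictor

cmp <- anova(m1, m2)   # F test of the smaller model against the larger one
cmp

# With a single added coefficient, F equals the squared t-value of that coefficient
f_stat <- cmp$F[2]
t_wt   <- summary(m2)$coefficients["wt", "t value"]
all.equal(f_stat, t_wt^2)
```

The two models must be fitted to the same observations, which is one reason to drop incomplete cases before fitting the first model.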
[Figures: one plot each for gndr and agea]
1.1. Visualize the distribution of the dependent variable with a histogram.
2.1. Visualize the distributions of the predictors.
3. Estimate the first model with lm(); it may include just one predictor.
4. Add further variables one at a time, using anova() to decide whether each one improves the model.
5. For the final model, interpret the coefficients and report the value of \(R^2\).
5.1. Visualize the coefficients.
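The steps listed above can be sketched end to end. This is only an illustration under stated assumptions: the built-in `mtcars` data stands in for the survey data, `mpg` plays the role of the dependent variable, and the 0.05 threshold for keeping a variable is a conventional choice rather than something the assignment prescribes.

```r
dat <- mtcars   # stand-in data (assumption)

# Distributions of the dependent variable and of a predictor
hist(dat$mpg, main = "Dependent variable", xlab = "mpg")
hist(dat$wt,  main = "Predictor", xlab = "wt")

# First model with a single predictor
fit1 <- lm(mpg ~ wt, data = dat)

# Add a variable and keep it only if anova() indicates an improvement
fit2  <- lm(mpg ~ wt + hp, data = dat)
p_add <- anova(fit1, fit2)$`Pr(>F)`[2]
final <- if (p_add < 0.05) fit2 else fit1   # 0.05 is an assumed threshold

# Coefficients and R-squared of the final model
summary(final)$coefficients
r2 <- summary(final)$r.squared

# Coefficient plot with 95% confidence intervals (intercept dropped)
est <- coef(final)[-1]
ci  <- confint(final)[-1, , drop = FALSE]
plot(est, seq_along(est), xlim = range(ci, 0), yaxt = "n", pch = 19,
     xlab = "Estimate", ylab = "")
axis(2, at = seq_along(est), labels = names(est), las = 1)
segments(ci[, 1], seq_along(est), ci[, 2], seq_along(est))
abline(v = 0, lty = 2)   # reference line at zero
```

Intervals that cross the dashed line correspond to coefficients that are not significant at the 5% level.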
Do not confuse the dependent variable with the independent variables.
Interpret the results.
Check the units in which the variables are measured.
Remember that regression does not establish causal relationships.
Comparing models with anova() shows not whether the added coefficient is significant, but whether the added term improves the model (a coefficient can be significant while the model does not improve, because the term uses up too many degrees of freedom, as can happen when a factor with many categories is added).
Dummy variables are not interpreted the way continuous variables are. Think about whether a variable should enter the model as a dummy or as a numeric variable. A variable may be coded with numbers and still need to enter the model as a dummy.
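The last point can be illustrated with a variable that is stored as numbers but can also be read as categories. The sketch uses `cyl` from the built-in `mtcars` data (an assumption for illustration; it is not one of the ESS variables).

```r
# cyl is stored as numbers (4, 6, 8) but can also be treated as categories
m_num <- lm(mpg ~ cyl, data = mtcars)           # one slope: change per extra cylinder
m_cat <- lm(mpg ~ factor(cyl), data = mtcars)   # two dummies against the baseline (4)

length(coef(m_num))   # 2: intercept + slope
length(coef(m_cat))   # 3: intercept + two dummies

# The numeric coding is a restricted version of the dummy coding, so anova()
# can test whether the extra degree of freedom improves the fit
anova(m_num, m_cat)
```

Each dummy coefficient is a difference in the mean outcome between that category and the baseline category, whereas the numeric slope assumes the same change for every one-unit step.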