DIFFERENCE BETWEEN CORRELATION AND REGRESSION…
# Minimum distance from the values to the line: least squares. The fitted line is the one with the smallest total (squared) distance to the points.
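As a minimal sketch of the least-squares idea, the R code below computes the slope and intercept with the closed-form formulas and compares them with `lm()`. The data are simulated for illustration only: `x`, `y`, the seed and the helper `ssr()` are assumptions, not part of the class dataset.

```r
# Simulated data (an assumption for illustration; not dataWorld_q)
set.seed(1)
x <- runif(50, 0, 10)
y <- 3 + 2 * x + rnorm(50)

# Closed-form least-squares estimates for a single predictor
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# lm() minimizes the same quantity: the sum of squared residuals
fit <- lm(y ~ x)

# Hypothetical helper: sum of squared vertical distances to the line a + b*x
ssr <- function(a, b) sum((y - (a + b * x))^2)

c(intercept = intercept, slope = slope)
coef(fit)

# Shifting the line away from the least-squares fit increases the total
ssr(intercept, slope) < ssr(intercept + 0.5, slope)
```

The distances being minimized are vertical (residuals), and they are squared before summing, which is why the method is called least squares.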
p-value: the probability of obtaining a result at least as extreme as the one observed, given that H0 is true.
Why do we need it? To judge whether what the sample shows can be generalized to the population.
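To make the p-value concrete, here is a small sketch that reads the p-values out of a fitted model and recomputes one of them from its t statistic. The data are simulated, and the variable names (`x`, `noise`, `y`) are assumptions chosen for the example.

```r
# Simulated sample (assumption): y depends on x but not on noise
set.seed(2)
n <- 100
x <- rnorm(n)
noise <- rnorm(n)
y <- 1 + 0.8 * x + rnorm(n)

fit <- lm(y ~ x + noise)

# The coefficient table is the same one printed by summary(fit)
coefs <- summary(fit)$coefficients
p_values <- coefs[, "Pr(>|t|)"]
p_values

# Two-sided p-value for x recomputed from its t statistic
t_x <- coefs["x", "t value"]
p_manual <- 2 * pt(-abs(t_x), df = df.residual(fit))
p_manual

# Compare each p-value with the usual 0.05 threshold
p_values < 0.05
```

A coefficient with a p-value below 0.05 would be hard to obtain by chance if its true value were zero, which is the H0 being tested for each row of the table.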
For each one-unit (percentage point) increase in X, Y is expected to increase by 6.78: Y = 58.32 + 6.78*x
r2: the percentage of the variance of Y that is explained by X (the model).
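The definition of r2 can be checked by hand. The sketch below, on simulated data (all names and numbers are assumptions for illustration), computes it as one minus the ratio of residual to total sum of squares and compares it with the value reported by `summary()`.

```r
# Simulated data (assumption; not the class dataset)
set.seed(3)
x <- runif(80, 0, 100)
y <- 10 + 0.5 * x + rnorm(80, sd = 8)

fit <- lm(y ~ x)

# Share of the variance of y explained by the model
ss_res <- sum(residuals(fit)^2)    # variation left unexplained
ss_tot <- sum((y - mean(y))^2)     # total variation of y
r2_manual <- 1 - ss_res / ss_tot

r2_manual
summary(fit)$r.squared

# With a single predictor, R-squared equals the squared correlation
cor(x, y)^2
```

The last line also shows one link between correlation and regression: in simple regression, r2 is just the square of the correlation coefficient between x and y.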
load(url("https://www.dropbox.com/s/fyobx9uswy3qgp3/dataWorld_q.rda?dl=1"))
names(dataWorld_q)
## [1] "country" "quinq" "tfr" "yearSchF" "contracep"
## [6] "age1mar" "sanitat" "water" "birthSkill" "childMort"
## [11] "deathRate" "extPov" "famWorkFem" "femWork" "incomePp"
## [16] "income10p" "gini" "lifExpFem" "lifExpTot" "maleWork"
## [21] "materMort" "vaccMeas" "schGenEq" "doctor" "teenFert"
model1<- lm(lifExpFem~femWork,dataWorld_q)
model1
##
## Call:
## lm(formula = lifExpFem ~ femWork, data = dataWorld_q)
##
## Coefficients:
## (Intercept) femWork
## 78.5885 -0.1623
Y = 78.5885 - 0.1623*x, where Y is lifExpFem and x is femWork.
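The fitted equation can be used directly for prediction. The sketch below plugs in the coefficients printed for model1; the helper `predict_lifExpFem()` and the femWork values are hypothetical and chosen only for illustration.

```r
# Coefficients copied from the model1 output above
b0 <- 78.5885   # intercept
b1 <- -0.1623   # slope for femWork

# Hypothetical helper: predicted lifExpFem for a given femWork
predict_lifExpFem <- function(femWork) b0 + b1 * femWork

# Illustrative femWork values (assumed, not taken from the data)
predict_lifExpFem(c(30, 50, 70))

# Each additional unit of femWork changes the prediction by b1
predict_lifExpFem(51) - predict_lifExpFem(50)
```

With model1 still in the session, `predict(model1, newdata = data.frame(femWork = c(30, 50, 70)))` should give the same values up to the rounding of the printed coefficients.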
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.2 --
## v ggplot2 3.3.6 v purrr 0.3.4
## v tibble 3.1.8 v dplyr 1.0.9
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
ggplot(dataWorld_q, aes(x = femWork, y = lifExpFem)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 1642 rows containing non-finite values (stat_smooth).
## Warning: Removed 1642 rows containing missing values (geom_point).
Assumption of multiple regression: there is no (strong) relationship among the independent variables.
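One way to examine this assumption is to look at the correlations among the predictors and at their variance inflation factors (VIF). The sketch below uses base R only and simulated predictors (`x1`, `x2`, `x3` are assumptions, with `x2` built to be strongly related to `x1`); it is an illustration, not the procedure used in class.

```r
# Simulated predictors (assumption): x2 is deliberately correlated with x1
set.seed(4)
n <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)
x3 <- rnorm(n)                       # unrelated to the others
predictors <- data.frame(x1, x2, x3)

# Pairwise correlations among the independent variables
round(cor(predictors), 2)

# VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing
# predictor j on all the other predictors
vif_manual <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  r2 <- summary(lm(reformulate(others, response = v),
                   data = predictors))$r.squared
  1 / (1 - r2)
})
vif_manual
```

A VIF close to 1 suggests a predictor is nearly unrelated to the others, while large values point to multicollinearity. For the class data, a comparable first check would be `cor(dataWorld_q[, c("femWork", "sanitat", "income10p")], use = "complete.obs")`.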
model2<- lm(lifExpFem~femWork+sanitat+income10p,dataWorld_q)
summary(model2)
##
## Call:
## lm(formula = lifExpFem ~ femWork + sanitat + income10p, data = dataWorld_q)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.7446 -2.2400 0.4262 2.8351 12.4650
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 53.986169 1.603971 33.658 < 2e-16 ***
## femWork 0.039648 0.015579 2.545 0.0113 *
## sanitat 0.282769 0.008148 34.702 < 2e-16 ***
## income10p -0.125383 0.030522 -4.108 4.76e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.567 on 440 degrees of freedom
## (2272 observations deleted due to missingness)
## Multiple R-squared: 0.785, Adjusted R-squared: 0.7835
## F-statistic: 535.4 on 3 and 440 DF, p-value: < 2.2e-16
Y = 53.98617 + 0.03965*femWork + 0.28277*sanitat - 0.12538*income10p
The p-value (< 2.2e-16) is less than 0.05, so H0 (that all slope coefficients are zero) is rejected. The model as a whole is therefore valid (statistically significant).
Multiple R-squared: 0.785 -> this model explains 78.5% of the variance in lifExpFem.
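The other summary figures for model2 follow from the R-squared and the degrees of freedom. As a rough check, the sketch below recomputes the adjusted R-squared and the F statistic from the rounded values in the output, so small differences from the printed numbers are expected.

```r
# Values read from the summary(model2) output above
r2 <- 0.785        # Multiple R-squared (rounded in the output)
df_resid <- 440    # residual degrees of freedom
p <- 3             # number of predictors

# Observations actually used in the fit (after dropping missing rows)
n <- df_resid + p + 1
n

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)

# Overall F statistic and its p-value (H0: all slopes are zero)
f_stat <- (r2 / p) / ((1 - r2) / df_resid)
f_stat
pf(f_stat, p, df_resid, lower.tail = FALSE)
```

The adjusted value comes out at about 0.7835, in line with the output, and the F statistic lands close to the reported 535.4.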
model3<- lm(lifExpFem~sanitat+famWorkFem+materMort,dataWorld_q)
summary(model3)
##
## Call:
## lm(formula = lifExpFem ~ sanitat + famWorkFem + materMort, data = dataWorld_q)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.1452 -1.9861 0.3949 2.4655 11.7319
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64.441940 1.277691 50.436 <2e-16 ***
## sanitat 0.155055 0.013230 11.720 <2e-16 ***
## famWorkFem -0.003754 0.013634 -0.275 0.783
## materMort -0.016338 0.001360 -12.016 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.63 on 363 degrees of freedom
## (2349 observations deleted due to missingness)
## Multiple R-squared: 0.8275, Adjusted R-squared: 0.8261
## F-statistic: 580.5 on 3 and 363 DF, p-value: < 2.2e-16
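r2: 0.8275 -> this model explains 82.75% of the variance in lifExpFem. famWorkFem is not significant (p = 0.783).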
model4<- lm(lifExpFem~sanitat+water+materMort,dataWorld_q)
summary(model4)
##
## Call:
## lm(formula = lifExpFem ~ sanitat + water + materMort, data = dataWorld_q)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.6314 -2.2501 0.3332 2.6298 12.0004
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.372577 1.450420 43.003 < 2e-16 ***
## sanitat 0.119066 0.011640 10.229 < 2e-16 ***
## water 0.048571 0.018545 2.619 0.00907 **
## materMort -0.016279 0.001093 -14.891 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.907 on 533 degrees of freedom
## (2179 observations deleted due to missingness)
## Multiple R-squared: 0.8438, Adjusted R-squared: 0.8429
## F-statistic: 959.6 on 3 and 533 DF, p-value: < 2.2e-16
Multiple R-squared: 0.8438 -> r2: 84.4%
model5 <- lm(lifExpFem ~ sanitat + doctor + materMort, dataWorld_q)  # new name so the previous model4 is not overwritten
summary(model5)
##
## Call:
## lm(formula = lifExpFem ~ sanitat + doctor + materMort, data = dataWorld_q)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.9754 -2.2057 0.5363 2.5090 12.0869
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 65.6627469 0.8827673 74.383 < 2e-16 ***
## sanitat 0.1086868 0.0109531 9.923 < 2e-16 ***
## doctor 1.0421621 0.1665371 6.258 8.59e-10 ***
## materMort -0.0168825 0.0009975 -16.924 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.721 on 485 degrees of freedom
## (2227 observations deleted due to missingness)
## Multiple R-squared: 0.858, Adjusted R-squared: 0.8572
## F-statistic: 977.1 on 3 and 485 DF, p-value: < 2.2e-16
r2: 0.858 -> this model explains 85.8% of the variance in lifExpFem.
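When comparing the r2 values of these models, note that each summary reports a different number of observations deleted due to missingness, so the models were fitted on different subsets of rows. A minimal sketch of one way to put two models on the same footing, using simulated data with missing values (the data frame `d` and its columns are assumptions for illustration):

```r
# Simulated data with missing values (assumption; not dataWorld_q)
set.seed(5)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(n)
d$x2[sample(n, 60)] <- NA    # x2 and x3 are missing for different rows
d$x3[sample(n, 90)] <- NA

# lm() drops incomplete rows separately for each model
m_a <- lm(y ~ x1 + x2, data = d)
m_b <- lm(y ~ x1 + x3, data = d)
c(nobs(m_a), nobs(m_b))

# Refit both models on the rows that are complete for every variable
d_cc <- d[complete.cases(d), ]
m_a_cc <- lm(y ~ x1 + x2, data = d_cc)
m_b_cc <- lm(y ~ x1 + x3, data = d_cc)
c(nobs(m_a_cc), nobs(m_b_cc))

# Adjusted R-squared values now refer to the same observations
c(a = summary(m_a_cc)$adj.r.squared, b = summary(m_b_cc)$adj.r.squared)
```

Restricting to complete cases reduces the sample, so this is a trade-off rather than a rule; the point is only that r2 values are easier to compare when they describe the same set of observations.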