To create our project, go to the top menu: File > New Project
A window opens with three options:
New Directory: create a project from scratch (the recommended option).
Existing Directory: create a project from code you already have saved in a folder.
Version Control: import a project from a repository and link it to that repository.
Select New Directory and then New Project.
Choose the directory on your computer where the project should be saved (a new folder is created there).
Then give the project a name (it will also be the name of the folder).
Once the project has been created, open an R script (where we will write the code):
go to File > New File > R Script
Save the file (it has the .R extension) by clicking the floppy-disk icon.
This will be our main script (you can name it whatever you like; it is commonly called main.R to tell it apart from the rest). We will build our code from it and call other files when necessary.
Remember that programming is like writing: the cleaner and more structured it is, the easier it is to understand.
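As a minimal sketch of how a main script can call other files, the example below uses source() to load a helper script. The file name funciones.R and the function duplicar() are hypothetical, and the helper is written to a temporary folder only so that the example runs on its own; in a real project it would live in the project folder.

```r
# main.R (sketch): keep helper functions in their own file and load them with source().
# "funciones.R" and duplicar() are hypothetical names used only for illustration.

# Write a tiny helper file into a temporary folder so the example is self-contained.
ruta_auxiliar <- file.path(tempdir(), "funciones.R")
writeLines("duplicar <- function(x) 2 * x", ruta_auxiliar)

# Load the helper definitions into the current session.
source(ruta_auxiliar)

# Use the helper from the main script.
resultado <- duplicar(21)
print(resultado)
```

Inside an RStudio project the working directory is the project folder, so a relative call such as source("funciones.R") would normally be enough.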
# To view the data set, just type its name.
rosas
## # A tibble: 16 × 4
## ROSASVEN PPROSAS PPCLAVEL IPDFS
## <dbl> <dbl> <dbl> <dbl>
## 1 11484 2.26 3.49 158.
## 2 9348 2.54 2.85 173.
## 3 8429 3.07 4.06 165.
## 4 10079 2.91 3.64 173.
## 5 9240 2.73 3.21 178.
## 6 8862 2.77 3.66 199.
## 7 6216 3.59 3.76 186.
## 8 8253 3.23 3.49 189.
## 9 8038 2.6 3.13 180.
## 10 7476 2.89 3.2 183.
## 11 5911 3.77 3.65 182.
## 12 7950 3.64 3.6 185
## 13 6134 2.82 2.94 184
## 14 5868 2.96 3.12 188.
## 15 3160 4.24 3.58 176.
## 16 5872 3.69 3.53 188
# To list the variables in the data set we can use the following functions (ls() returns the names in alphabetical order; list() simply wraps the whole tibble in a one-element list)
ls(rosas)
## [1] "IPDFS" "PPCLAVEL" "PPROSAS" "ROSASVEN"
list(rosas)
## [[1]]
## # A tibble: 16 × 4
## ROSASVEN PPROSAS PPCLAVEL IPDFS
## <dbl> <dbl> <dbl> <dbl>
## 1 11484 2.26 3.49 158.
## 2 9348 2.54 2.85 173.
## 3 8429 3.07 4.06 165.
## 4 10079 2.91 3.64 173.
## 5 9240 2.73 3.21 178.
## 6 8862 2.77 3.66 199.
## 7 6216 3.59 3.76 186.
## 8 8253 3.23 3.49 189.
## 9 8038 2.6 3.13 180.
## 10 7476 2.89 3.2 183.
## 11 5911 3.77 3.65 182.
## 12 7950 3.64 3.6 185
## 13 6134 2.82 2.94 184
## 14 5868 2.96 3.12 188.
## 15 3160 4.24 3.58 176.
## 16 5872 3.69 3.53 188
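A small sketch of other ways to inspect the variables, assuming a data frame built from the first three printed rows (IPDFS is shown rounded in the printout, so those values are approximate; rosas_demo is a name introduced here). names() returns the columns in their original order, whereas ls() sorts them alphabetically, and str() adds the type of each column.

```r
# First three rows copied from the printed tibble (IPDFS appears rounded there).
rosas_demo <- data.frame(
  ROSASVEN = c(11484, 9348, 8429),
  PPROSAS  = c(2.26, 2.54, 3.07),
  PPCLAVEL = c(3.49, 2.85, 4.06),
  IPDFS    = c(158, 173, 165)
)

nombres <- names(rosas_demo)  # column names in their original order
print(nombres)

str(rosas_demo)               # type and first values of each column

print(ls(rosas_demo))         # same names, sorted alphabetically
```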
LIST OF VARIABLES IN THE ROSAS FILE
ROSASVEN: Quantity of roses sold, in dozens.
PPROSAS: Average wholesale price of roses, in dollars per dozen.
PPCLAVEL: Average wholesale price of carnations, in dollars per dozen.
IPDFS: Average weekly disposable family income, in dollars.
# Scatter-plot matrix
plot(rosas)
# Add a title and a color
plot(rosas,main="SCATTER-PLOT MATRIX",col="blue")
# attach() lets us refer to the variables of a data set directly by name.
attach(rosas)
# plot of PPROSAS against ROSASVEN
plot(PPROSAS,ROSASVEN,main = "ROSES SOLD AND AVERAGE PRICE OF ROSES"
,col="blue")
# plot of PPCLAVEL against ROSASVEN
plot(PPCLAVEL,ROSASVEN,main = "ROSES SOLD AND AVERAGE PRICE OF CARNATIONS"
,col="green")
# plot of IPDFS against ROSASVEN
plot(IPDFS,ROSASVEN,main = "ROSES SOLD AND AVERAGE WEEKLY DISPOSABLE FAMILY INCOME"
,col="red")
# Besides scatter plots, R can also draw histograms easily with the following function.
hist(ROSASVEN,main="ROSES SOLD",col="blue")
hist(PPROSAS,main="PRICE OF ROSES",col="blue")
hist(PPCLAVEL,main="PRICE OF CARNATIONS",col="blue")
hist(IPDFS,main="AVERAGE WEEKLY DISPOSABLE FAMILY INCOME",col="blue")
# We can also overlay the density curve on each histogram (freq=FALSE puts the histogram on the density scale).
hist(ROSASVEN,freq=FALSE,main="ROSES SOLD",col="red")
lines(density(ROSASVEN))
hist(PPROSAS,freq=FALSE,main="PRICE OF ROSES",col="red")
lines(density(PPROSAS))
hist(PPCLAVEL,freq=FALSE,main="PRICE OF CARNATIONS",col="red")
lines(density(PPCLAVEL))
hist(IPDFS,freq=FALSE,main="AVERAGE WEEKLY DISPOSABLE FAMILY INCOME",col="red")
lines(density(IPDFS))
# Plot panels: arrange several plots in a 2 x 2 grid
par(mfrow=c(2,2))
plot(PPROSAS,ROSASVEN,main="PRICE OF ROSES",col="blue")
plot(PPCLAVEL,ROSASVEN,main="PRICE OF CARNATIONS",col="green")
plot(IPDFS,ROSASVEN,main="AVERAGE WEEKLY DISPOSABLE FAMILY INCOME",col="red")
hist(ROSASVEN,freq=FALSE,main="ROSES SOLD",col="red")
lines(density(ROSASVEN))
# go back to showing one plot at a time
par(mfrow=c(1,1))
# correlation matrix
cor(rosas)
## ROSASVEN PPROSAS PPCLAVEL IPDFS
## ROSASVEN 1.00000000 -0.7841546 -0.02268586 -0.4130362
## PPROSAS -0.78415462 1.0000000 0.47249050 0.2892452
## PPCLAVEL -0.02268586 0.4724905 1.00000000 -0.1043957
## IPDFS -0.41303617 0.2892452 -0.10439569 1.0000000
# Individual correlations.
cor(ROSASVEN,PPROSAS,method=c("pearson"))
## [1] -0.7841546
cor(ROSASVEN,PPROSAS,method=c("spearman"))
## [1] -0.7117647
cor(ROSASVEN,PPROSAS,method=c("kendall"))
## [1] -0.5666667
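To make the difference between the methods concrete, here is a minimal sketch that computes the Pearson coefficient from its definition and the Spearman coefficient as a Pearson correlation on ranks. The two vectors are copied from the printed tibble, so they are assumed to match the original data only up to the printed precision.

```r
# Values copied from the printed tibble (assumed exact up to the printed digits).
ROSASVEN <- c(11484, 9348, 8429, 10079, 9240, 8862, 6216, 8253,
              8038, 7476, 5911, 7950, 6134, 5868, 3160, 5872)
PPROSAS  <- c(2.26, 2.54, 3.07, 2.91, 2.73, 2.77, 3.59, 3.23,
              2.60, 2.89, 3.77, 3.64, 2.82, 2.96, 4.24, 3.69)

# Pearson: covariance of the deviations divided by the product of their spreads.
dx <- ROSASVEN - mean(ROSASVEN)
dy <- PPROSAS - mean(PPROSAS)
pearson_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Spearman: the Pearson correlation applied to the ranks of each variable.
spearman_manual <- cor(rank(ROSASVEN), rank(PPROSAS))

print(c(pearson = pearson_manual, spearman = spearman_manual))
```

Because Spearman and Kendall depend only on the ordering of the observations, they are less sensitive to extreme values than Pearson.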
# building the model (LEVEL-LEVEL)
modelo1=lm(ROSASVEN~PPROSAS+PPCLAVEL+IPDFS)
modelo1
##
## Call:
## lm(formula = ROSASVEN ~ PPROSAS + PPCLAVEL + IPDFS)
##
## Coefficients:
## (Intercept) PPROSAS PPCLAVEL IPDFS
## 13354.60 -3628.19 2633.75 -19.25
summary(modelo1)
##
## Call:
## lm(formula = ROSASVEN ~ PPROSAS + PPCLAVEL + IPDFS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1340.89 -703.27 -67.34 835.27 1882.46
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13354.60 6485.42 2.059 0.0619 .
## PPROSAS -3628.19 635.63 -5.708 9.79e-05 ***
## PPCLAVEL 2633.75 1012.64 2.601 0.0232 *
## IPDFS -19.25 30.69 -0.627 0.5422
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1076 on 12 degrees of freedom
## Multiple R-squared: 0.7779, Adjusted R-squared: 0.7224
## F-statistic: 14.01 on 3 and 12 DF, p-value: 0.0003164
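As a worked check of how the summary columns relate to each other, the t value is the estimate divided by its standard error, and the adjusted R-squared penalizes R-squared for the number of regressors. Using the figures printed above for IPDFS, with n = 16 observations and k = 3 regressors:

```latex
\[
t_{\text{IPDFS}} = \frac{\hat{\beta}_{\text{IPDFS}}}{\operatorname{se}(\hat{\beta}_{\text{IPDFS}})}
                 = \frac{-19.25}{30.69} \approx -0.627
\]
\[
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
          = 1 - (1 - 0.7779)\,\frac{15}{12} \approx 0.7224
\]
```

With 12 degrees of freedom, a t value this close to zero gives the p-value of 0.5422 reported in the table, so the hypothesis that the IPDFS coefficient is zero cannot be rejected.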
# We drop IPDFS because it is not statistically significant
modelo1op=lm(ROSASVEN~PPROSAS+PPCLAVEL)
summary(modelo1op)
##
## Call:
## lm(formula = ROSASVEN ~ PPROSAS + PPCLAVEL)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1454.50 -680.52 -90.12 823.18 1848.07
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9734.2 2888.1 3.371 0.00502 **
## PPROSAS -3782.2 572.5 -6.607 1.7e-05 ***
## PPCLAVEL 2815.3 947.5 2.971 0.01082 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1051 on 13 degrees of freedom
## Multiple R-squared: 0.7706, Adjusted R-squared: 0.7354
## F-statistic: 21.84 on 2 and 13 DF, p-value: 6.971e-05
# building the model (LOG-LOG)
modelo2=lm(log(ROSASVEN)~log(PPROSAS)+log(PPCLAVEL)+log(IPDFS))
modelo2
##
## Call:
## lm(formula = log(ROSASVEN) ~ log(PPROSAS) + log(PPCLAVEL) + log(IPDFS))
##
## Coefficients:
## (Intercept) log(PPROSAS) log(PPCLAVEL) log(IPDFS)
## 6.2877 -1.8562 1.4541 0.5596
summary(modelo2)
##
## Call:
## lm(formula = log(ROSASVEN) ~ log(PPROSAS) + log(PPCLAVEL) + log(IPDFS))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.29450 -0.09833 -0.01829 0.12112 0.30782
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.2877 4.8746 1.290 0.22139
## log(PPROSAS) -1.8562 0.3438 -5.399 0.00016 ***
## log(PPCLAVEL) 1.4541 0.5724 2.540 0.02593 *
## log(IPDFS) 0.5596 0.9211 0.607 0.55484
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1759 on 12 degrees of freedom
## Multiple R-squared: 0.7373, Adjusted R-squared: 0.6716
## F-statistic: 11.22 on 3 and 12 DF, p-value: 0.0008491
# We drop IPDFS because it is not statistically significant
modelo2op=lm(log(ROSASVEN)~log(PPROSAS)+log(PPCLAVEL))
modelo2op
##
## Call:
## lm(formula = log(ROSASVEN) ~ log(PPROSAS) + log(PPCLAVEL))
##
## Coefficients:
## (Intercept) log(PPROSAS) log(PPCLAVEL)
## 9.228 -1.761 1.340
summary(modelo2op)
##
## Call:
## lm(formula = log(ROSASVEN) ~ log(PPROSAS) + log(PPCLAVEL))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.33467 -0.09749 -0.00747 0.11701 0.31182
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.2278 0.5684 16.235 5.18e-10 ***
## log(PPROSAS) -1.7607 0.2982 -5.904 5.20e-05 ***
## log(PPCLAVEL) 1.3398 0.5273 2.541 0.0246 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1715 on 13 degrees of freedom
## Multiple R-squared: 0.7292, Adjusted R-squared: 0.6875
## F-statistic: 17.5 on 2 and 13 DF, p-value: 0.0002053
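In a log-log specification the slope coefficients can be read as elasticities. A short worked reading of the estimates printed above, holding the other regressor constant:

```latex
\[
\frac{\%\Delta\,\text{ROSASVEN}}{\%\Delta\,\text{PPROSAS}} \approx \hat{\beta}_{\log(\text{PPROSAS})} = -1.7607,
\qquad
\frac{\%\Delta\,\text{ROSASVEN}}{\%\Delta\,\text{PPCLAVEL}} \approx \hat{\beta}_{\log(\text{PPCLAVEL})} = 1.3398
\]
```

That is, a 1% increase in the price of roses is associated with roughly a 1.76% fall in the quantity sold, while a 1% increase in the price of carnations is associated with roughly a 1.34% rise, which is what one would expect if the two flowers are substitutes.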
# building the model (LOG-LEVEL)
modelo3=lm(log(ROSASVEN)~PPROSAS+PPCLAVEL+IPDFS)
modelo3
##
## Call:
## lm(formula = log(ROSASVEN) ~ PPROSAS + PPCLAVEL + IPDFS)
##
## Coefficients:
## (Intercept) PPROSAS PPCLAVEL IPDFS
## 9.018779 -0.574228 0.408445 0.001472
summary(modelo3)
##
## Call:
## lm(formula = log(ROSASVEN) ~ PPROSAS + PPCLAVEL + IPDFS)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.24655 -0.12000 -0.02361 0.11205 0.30961
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.018779 1.015144 8.884 1.27e-06 ***
## PPROSAS -0.574228 0.099493 -5.772 8.86e-05 ***
## PPCLAVEL 0.408445 0.158505 2.577 0.0242 *
## IPDFS 0.001472 0.004805 0.306 0.7646
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1685 on 12 degrees of freedom
## Multiple R-squared: 0.7589, Adjusted R-squared: 0.6986
## F-statistic: 12.59 on 3 and 12 DF, p-value: 0.000513
# We drop IPDFS because it is not statistically significant
modelo3op=lm(log(ROSASVEN)~PPROSAS+PPCLAVEL)
modelo3op
##
## Call:
## lm(formula = log(ROSASVEN) ~ PPROSAS + PPCLAVEL)
##
## Coefficients:
## (Intercept) PPROSAS PPCLAVEL
## 9.2956 -0.5625 0.3946
summary(modelo3op)
##
## Call:
## lm(formula = log(ROSASVEN) ~ PPROSAS + PPCLAVEL)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.26500 -0.10197 -0.01915 0.10928 0.31224
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.29557 0.44656 20.816 2.29e-11 ***
## PPROSAS -0.56245 0.08852 -6.354 2.52e-05 ***
## PPCLAVEL 0.39457 0.14651 2.693 0.0184 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1625 on 13 degrees of freedom
## Multiple R-squared: 0.757, Adjusted R-squared: 0.7196
## F-statistic: 20.25 on 2 and 13 DF, p-value: 0.0001015
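In a log-level specification, 100 times the coefficient approximates the percentage change in the dependent variable for a one-unit change in the regressor, and 100 * (exp(b) - 1) gives the exact figure implied by the model. The sketch below applies both to the coefficients printed above (the numbers are typed in by hand from that output; the object names are introduced here).

```r
# Coefficients copied from the summary(modelo3op) output above.
b <- c(PPROSAS = -0.56245, PPCLAVEL = 0.39457)

# Approximate percentage change in ROSASVEN for a one-dollar price increase.
aprox <- 100 * b

# Exact percentage change implied by the log-level model.
exacto <- 100 * (exp(b) - 1)

print(round(cbind(aprox, exacto), 1))
```

The two columns differ noticeably because a one-dollar change is large relative to the observed prices (roughly 2 to 4 dollars per dozen); the approximation only works well for small changes.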
# building the model (LEVEL-LOG)
modelo4=lm(ROSASVEN~log(PPROSAS)+log(PPCLAVEL)+log(IPDFS))
modelo4
##
## Call:
## lm(formula = ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL) + log(IPDFS))
##
## Coefficients:
## (Intercept) log(PPROSAS) log(PPCLAVEL) log(IPDFS)
## 16637 -11864 9540 -1430
summary(modelo4)
##
## Call:
## lm(formula = ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL) + log(IPDFS))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1259.8 -780.0 -189.0 878.4 1886.0
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 16637 29893 0.557 0.588060
## log(PPROSAS) -11864 2108 -5.628 0.000111 ***
## log(PPCLAVEL) 9540 3510 2.718 0.018688 *
## log(IPDFS) -1430 5648 -0.253 0.804438
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1079 on 12 degrees of freedom
## Multiple R-squared: 0.777, Adjusted R-squared: 0.7213
## F-statistic: 13.94 on 3 and 12 DF, p-value: 0.0003242
# We drop IPDFS because it is not statistically significant
modelo4op=lm(ROSASVEN~log(PPROSAS)+log(PPCLAVEL))
modelo4op
##
## Call:
## lm(formula = ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL))
##
## Coefficients:
## (Intercept) log(PPROSAS) log(PPCLAVEL)
## 9124 -12109 9832
summary(modelo4op)
##
## Call:
## lm(formula = ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL))
##
## Residuals:
## Min 1Q Median 3Q Max
## -1303.2 -773.5 -145.1 866.6 1875.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9124 3442 2.651 0.0200 *
## log(PPROSAS) -12108 1806 -6.704 1.46e-05 ***
## log(PPCLAVEL) 9832 3194 3.079 0.0088 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1039 on 13 degrees of freedom
## Multiple R-squared: 0.7758, Adjusted R-squared: 0.7413
## F-statistic: 22.5 on 2 and 13 DF, p-value: 6.009e-05
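In a level-log specification, the coefficient divided by 100 approximates the change in the dependent variable, in its own units, for a 1% change in the regressor. Reading the estimates printed above, with the other regressor held constant:

```latex
\[
\Delta\,\text{ROSASVEN} \approx \frac{\hat{\beta}_{\log(\text{PPROSAS})}}{100}\,\%\Delta\,\text{PPROSAS}
 = \frac{-12108}{100} \approx -121 \text{ dozens per 1\% increase in PPROSAS}
\]
\[
\Delta\,\text{ROSASVEN} \approx \frac{\hat{\beta}_{\log(\text{PPCLAVEL})}}{100}\,\%\Delta\,\text{PPCLAVEL}
 = \frac{9832}{100} \approx 98 \text{ dozens per 1\% increase in PPCLAVEL}
\]
```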
library(stargazer)
stargazer(modelo1op,modelo2op,modelo3op,modelo4op,type = "text")
##
## ==============================================================================
##                                        Dependent variable:
##                               ------------------------------------------------
##                                 ROSASVEN       log(ROSASVEN)       ROSASVEN
##                                   (1)          (2)       (3)         (4)
## ------------------------------------------------------------------------------
## PPROSAS                       -3,782.196***           -0.562***
##                                 (572.455)              (0.089)
##
## PPCLAVEL                       2,815.252**             0.395**
##                                 (947.511)              (0.147)
##
## log(PPROSAS)                                -1.761***            -12,108.510***
##                                              (0.298)              (1,806.055)
##
## log(PPCLAVEL)                                1.340**              9,831.960***
##                                              (0.527)              (3,193.686)
##
## Constant                      9,734.217***  9.228***   9.296***   9,124.086**
##                                (2,888.059)   (0.568)    (0.447)   (3,442.398)
##
## ------------------------------------------------------------------------------
## Observations                       16          16         16          16
## R2                                0.771       0.729      0.757       0.776
## Adjusted R2                       0.735       0.688      0.720       0.741
## Residual Std. Error (df = 13)   1,050.883     0.172      0.162     1,038.957
## F Statistic (df = 2; 13)        21.841***   17.501***  20.250***   22.495***
## ==============================================================================
## Note:                                              *p<0.1; **p<0.05; ***p<0.01
plot(modelo4op, which=1)
library(gvlma)
gvlma.lm(modelo4op)
##
## Call:
## lm(formula = ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL))
##
## Coefficients:
## (Intercept) log(PPROSAS) log(PPCLAVEL)
## 9124 -12109 9832
##
##
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance = 0.05
##
## Call:
## gvlma.lm(lmobj = modelo4op)
##
## Value p-value Decision
## Global Stat 1.9633 0.7425 Assumptions acceptable.
## Skewness 0.5215 0.4702 Assumptions acceptable.
## Kurtosis 0.6904 0.4060 Assumptions acceptable.
## Link Function 0.5909 0.4421 Assumptions acceptable.
## Heteroscedasticity 0.1606 0.6886 Assumptions acceptable.
plot(modelo4op, which=1)
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
dwtest(modelo4op)
##
## Durbin-Watson test
##
## data: modelo4op
## DW = 2.2508, p-value = 0.6503
## alternative hypothesis: true autocorrelation is greater than 0
bgtest(modelo4op)
##
## Breusch-Godfrey test for serial correlation of order up to 1
##
## data: modelo4op
## LM test = 0.33289, df = 1, p-value = 0.564
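To show what the Durbin-Watson statistic measures, here is a minimal sketch that computes it directly from the residuals: the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The data frame rosas_impreso is an assumption built from the values printed in the tibble, so the result may differ slightly from the dwtest() output if the original data carry more digits.

```r
# Data copied from the printed tibble (only the columns used by the model).
rosas_impreso <- data.frame(
  ROSASVEN = c(11484, 9348, 8429, 10079, 9240, 8862, 6216, 8253,
               8038, 7476, 5911, 7950, 6134, 5868, 3160, 5872),
  PPROSAS  = c(2.26, 2.54, 3.07, 2.91, 2.73, 2.77, 3.59, 3.23,
               2.60, 2.89, 3.77, 3.64, 2.82, 2.96, 4.24, 3.69),
  PPCLAVEL = c(3.49, 2.85, 4.06, 3.64, 3.21, 3.66, 3.76, 3.49,
               3.13, 3.20, 3.65, 3.60, 2.94, 3.12, 3.58, 3.53)
)

# Same specification as modelo4op, but using the data argument instead of attach().
ajuste <- lm(ROSASVEN ~ log(PPROSAS) + log(PPCLAVEL), data = rosas_impreso)
e <- residuals(ajuste)

# Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation.
dw_manual <- sum(diff(e)^2) / sum(e^2)
print(dw_manual)
```

The statistic always lies between 0 and 4; values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation.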
plot(modelo4op, which=1)
bptest(modelo4op)
##
## studentized Breusch-Pagan test
##
## data: modelo4op
## BP = 0.99219, df = 2, p-value = 0.6089
plot(modelo4op, which=2)
shapiro.test(residuals(modelo4op))
##
## Shapiro-Wilk normality test
##
## data: residuals(modelo4op)
## W = 0.93479, p-value = 0.29
ks.test(residuals(modelo4op),"pnorm",mean(residuals(modelo4op)),sd(residuals(modelo4op)))
##
## Exact one-sample Kolmogorov-Smirnov test
##
## data: residuals(modelo4op)
## D = 0.13462, p-value = 0.8974
## alternative hypothesis: two-sided
library(tseries)
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
jarque.bera.test(residuals(modelo4op))
##
## Jarque Bera Test
##
## data: residuals(modelo4op)
## X-squared = 1.2119, df = 2, p-value = 0.5456
library(car)
## Warning: package 'car' was built under R version 4.3.3
## Loading required package: carData
vif(modelo4op)
## log(PPROSAS) log(PPCLAVEL)
## 1.308652 1.308652
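The variance inflation factor of regressor j is obtained from the R-squared of regressing it on the remaining regressors. With only two regressors, that auxiliary R-squared is the squared correlation between them, which is why both VIF values above are identical:

```latex
\[
\text{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
R_j^2 = 1 - \frac{1}{\text{VIF}_j} = 1 - \frac{1}{1.308652} \approx 0.236
\]
```

So only about 24% of the variation in log(PPROSAS) is explained by log(PPCLAVEL), and vice versa. A commonly used rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, and these values are far below that range.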
attach(cualitativasex)
MUJER = ifelse(EMPLEADO=="MUJER",1,0)
HOMBRE=ifelse(EMPLEADO=="HOMBRE",1,0)
modelogbh=lm(cualitativasex$`SALARIO DIARIO`~MUJER)
summary(modelogbh)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ MUJER)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.5995 -1.8495 -0.9877 1.4260 17.8805
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.0995 0.2100 33.806 < 2e-16 ***
## MUJER -2.5118 0.3034 -8.279 1.04e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.476 on 524 degrees of freedom
## Multiple R-squared: 0.1157, Adjusted R-squared: 0.114
## F-statistic: 68.54 on 1 and 524 DF, p-value: 1.042e-15
modelogbm=lm(cualitativasex$`SALARIO DIARIO`~HOMBRE)
summary(modelogbm)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ HOMBRE)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.5995 -1.8495 -0.9877 1.4260 17.8805
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.5877 0.2190 20.950 < 2e-16 ***
## HOMBRE 2.5118 0.3034 8.279 1.04e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.476 on 524 degrees of freedom
## Multiple R-squared: 0.1157, Adjusted R-squared: 0.114
## F-statistic: 68.54 on 1 and 524 DF, p-value: 1.042e-15
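The two regressions above are the same model with a different base group: in the first, the intercept 7.0995 is the mean salary of men and 7.0995 - 2.5118 = 4.5877 is the mean salary of women, which is exactly the intercept of the second. The sketch below reproduces that relationship with a tiny hypothetical data set (the values of salario and sexo are invented for illustration and are not taken from cualitativasex).

```r
# Hypothetical toy data, for illustration only.
salario <- c(8, 6, 7, 5, 4, 3)
sexo    <- c("HOMBRE", "HOMBRE", "HOMBRE", "MUJER", "MUJER", "MUJER")

# Dummy variables built the same way as in the text.
MUJER  <- ifelse(sexo == "MUJER", 1, 0)
HOMBRE <- ifelse(sexo == "HOMBRE", 1, 0)

m_base_hombre <- lm(salario ~ MUJER)   # base group: men
m_base_mujer  <- lm(salario ~ HOMBRE)  # base group: women

media_hombres <- mean(salario[sexo == "HOMBRE"])
media_mujeres <- mean(salario[sexo == "MUJER"])

# Intercept = mean of the base group; slope = difference between group means.
print(coef(m_base_hombre))
print(coef(m_base_mujer))
print(c(media_hombres = media_hombres, media_mujeres = media_mujeres))
```

Including MUJER and HOMBRE together with an intercept would make the regressors perfectly collinear (the dummy-variable trap), which is why only one of them enters each model.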
modelogbhm=lm(cualitativasex$`SALARIO DIARIO`~MUJER+EDUCACION+EXPERIENCIA+ANTIGÜEDAD)
summary(modelogbhm)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ MUJER + EDUCACION +
## EXPERIENCIA + ANTIGÜEDAD)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.7675 -1.8080 -0.4229 1.0467 14.0075
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.56794 0.72455 -2.164 0.0309 *
## MUJER -1.81085 0.26483 -6.838 2.26e-11 ***
## EDUCACION 0.57150 0.04934 11.584 < 2e-16 ***
## EXPERIENCIA 0.02540 0.01157 2.195 0.0286 *
## ANTIGÜEDAD 0.14101 0.02116 6.663 6.83e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.958 on 521 degrees of freedom
## Multiple R-squared: 0.3635, Adjusted R-squared: 0.3587
## F-statistic: 74.4 on 4 and 521 DF, p-value: < 2.2e-16
CASADO= ifelse(ESTADO=="CASADO",1,0)
modelogbhm=lm(cualitativasex$`SALARIO DIARIO`~MUJER+EDUCACION+EXPERIENCIA+ANTIGÜEDAD+CASADO)
summary(modelogbhm)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ MUJER + EDUCACION +
## EXPERIENCIA + ANTIGÜEDAD + CASADO)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.731 -1.816 -0.500 1.050 13.928
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.61815 0.72305 -2.238 0.0256 *
## MUJER -1.74140 0.26649 -6.535 1.52e-10 ***
## EDUCACION 0.55568 0.04986 11.144 < 2e-16 ***
## EXPERIENCIA 0.01874 0.01203 1.558 0.1198
## ANTIGÜEDAD 0.13878 0.02114 6.566 1.25e-10 ***
## CASADO 0.55924 0.28595 1.956 0.0510 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.95 on 520 degrees of freedom
## Multiple R-squared: 0.3682, Adjusted R-squared: 0.3621
## F-statistic: 60.61 on 5 and 520 DF, p-value: < 2.2e-16
We build an optimal model, taking into account that the coefficient on the variable EXPERIENCIA is not statistically different from zero.
modelogbhm=lm(cualitativasex$`SALARIO DIARIO`~MUJER+EDUCACION+ANTIGÜEDAD+CASADO)
summary(modelogbhm)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ MUJER + EDUCACION +
## ANTIGÜEDAD + CASADO)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.5538 -1.7647 -0.5327 1.0318 13.9883
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.13853 0.65517 -1.738 0.0828 .
## MUJER -1.71050 0.26611 -6.428 2.93e-10 ***
## EDUCACION 0.52936 0.04698 11.268 < 2e-16 ***
## ANTIGÜEDAD 0.15418 0.01871 8.241 1.40e-15 ***
## CASADO 0.68522 0.27466 2.495 0.0129 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.954 on 521 degrees of freedom
## Multiple R-squared: 0.3652, Adjusted R-squared: 0.3604
## F-statistic: 74.95 on 4 and 521 DF, p-value: < 2.2e-16
modelogbhm=lm(log(cualitativasex$`SALARIO DIARIO`)~MUJER+EDUCACION+log(EXPERIENCIA)+ANTIGÜEDAD+CASADO)
summary(modelogbhm)
##
## Call:
## lm(formula = log(cualitativasex$`SALARIO DIARIO`) ~ MUJER + EDUCACION +
## log(EXPERIENCIA) + ANTIGÜEDAD + CASADO)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.82988 -0.25114 -0.01359 0.24749 1.20123
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.316767 0.104585 3.029 0.00258 **
## MUJER -0.295142 0.036638 -8.056 5.44e-15 ***
## EDUCACION 0.087451 0.006659 13.132 < 2e-16 ***
## log(EXPERIENCIA) 0.100151 0.021132 4.739 2.77e-06 ***
## ANTIGÜEDAD 0.013936 0.002817 4.947 1.02e-06 ***
## CASADO 0.062998 0.041620 1.514 0.13073
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4052 on 520 degrees of freedom
## Multiple R-squared: 0.4245, Adjusted R-squared: 0.419
## F-statistic: 76.71 on 5 and 520 DF, p-value: < 2.2e-16
modelogbhm=lm(log(cualitativasex$`SALARIO DIARIO`)~MUJER+EDUCACION+EXPERIENCIA+ANTIGÜEDAD+CASADO)
summary(modelogbhm)
##
## Call:
## lm(formula = log(cualitativasex$`SALARIO DIARIO`) ~ MUJER + EDUCACION +
## EXPERIENCIA + ANTIGÜEDAD + CASADO)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.87254 -0.27256 -0.03779 0.25349 1.23666
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.490058 0.101108 4.847 1.66e-06 ***
## MUJER -0.285530 0.037264 -7.662 9.00e-14 ***
## EDUCACION 0.083905 0.006973 12.033 < 2e-16 ***
## EXPERIENCIA 0.003134 0.001682 1.863 0.06300 .
## ANTIGÜEDAD 0.016867 0.002955 5.707 1.93e-08 ***
## CASADO 0.125739 0.039986 3.145 0.00176 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4125 on 520 degrees of freedom
## Multiple R-squared: 0.4036, Adjusted R-squared: 0.3979
## F-statistic: 70.38 on 5 and 520 DF, p-value: < 2.2e-16
modelogbhm=lm(cualitativasex$`SALARIO DIARIO`~MUJER+EDUCACION+log(EXPERIENCIA)+ANTIGÜEDAD+CASADO)
summary(modelogbhm)
##
## Call:
## lm(formula = cualitativasex$`SALARIO DIARIO` ~ MUJER + EDUCACION +
## log(EXPERIENCIA) + ANTIGÜEDAD + CASADO)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.667 -1.706 -0.507 1.087 13.875
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.61989 0.75249 -3.482 0.000540 ***
## MUJER -1.79686 0.26361 -6.816 2.59e-11 ***
## EDUCACION 0.57581 0.04791 12.017 < 2e-16 ***
## log(EXPERIENCIA) 0.58529 0.15204 3.849 0.000133 ***
## ANTIGÜEDAD 0.12201 0.02027 6.019 3.31e-09 ***
## CASADO 0.19546 0.29946 0.653 0.514229
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.915 on 520 degrees of freedom
## Multiple R-squared: 0.3828, Adjusted R-squared: 0.3769
## F-statistic: 64.51 on 5 and 520 DF, p-value: < 2.2e-16
Base group: single men
HCASADO = ifelse(EMPLEADO=="HOMBRE" & ESTADO=="CASADO",1,0)
MCASADA = ifelse(EMPLEADO=="MUJER" & ESTADO=="CASADO",1,0)
MSOLTERA = ifelse(EMPLEADO=="MUJER" & ESTADO=="SOLTERO",1,0)
ESTADOS = cbind(HCASADO,MCASADA,MSOLTERA)
MODELO4G=lm(`SALARIO DIARIO`~HCASADO+MCASADA+MSOLTERA+EDUCACION+EXPERIENCIA+ANTIGÜEDAD)
summary(MODELO4G)
##
## Call:
## lm(formula = `SALARIO DIARIO` ~ HCASADO + MCASADA + MSOLTERA +
## EDUCACION + EXPERIENCIA + ANTIGÜEDAD)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.9810 -1.6450 -0.5047 1.0242 13.6029
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.39419 0.73011 -3.279 0.00111 **
## HCASADO 1.82189 0.39522 4.610 5.08e-06 ***
## MCASADA -0.88192 0.41374 -2.132 0.03351 *
## MSOLTERA -0.30851 0.41002 -0.752 0.45214
## EDUCACION 0.55286 0.04895 11.293 < 2e-16 ***
## EXPERIENCIA 0.01897 0.01181 1.607 0.10876
## ANTIGÜEDAD 0.12980 0.02084 6.228 9.77e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.896 on 519 degrees of freedom
## Multiple R-squared: 0.3923, Adjusted R-squared: 0.3853
## F-statistic: 55.84 on 6 and 519 DF, p-value: < 2.2e-16
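Each group coefficient above is measured against the base group of single men, so two non-base groups are compared by subtracting their coefficients: for example, married women versus single women gives -0.88192 - (-0.30851) = -0.57341. As an alternative to building the dummies by hand, the sketch below uses interaction() and relevel() so that R creates them from a single factor. The six observations are hypothetical and only illustrate that the resulting columns match the manual ifelse() dummies.

```r
# Hypothetical toy data, for illustration only.
EMPLEADO <- c("HOMBRE", "HOMBRE", "MUJER", "MUJER", "HOMBRE", "MUJER")
ESTADO   <- c("SOLTERO", "CASADO", "CASADO", "SOLTERO", "CASADO", "SOLTERO")

# Manual dummies, as in the text.
HCASADO  <- ifelse(EMPLEADO == "HOMBRE" & ESTADO == "CASADO", 1, 0)
MCASADA  <- ifelse(EMPLEADO == "MUJER" & ESTADO == "CASADO", 1, 0)
MSOLTERA <- ifelse(EMPLEADO == "MUJER" & ESTADO == "SOLTERO", 1, 0)

# One factor with the four sex-by-status groups; single men are the reference level.
grupo <- interaction(EMPLEADO, ESTADO)
grupo <- relevel(grupo, ref = "HOMBRE.SOLTERO")

# Design matrix R would use in lm(y ~ grupo): intercept plus three group dummies.
X <- model.matrix(~ grupo)
print(colnames(X))
```

With this approach, changing the base group only requires changing the ref argument instead of redefining the dummies.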