Proceso
- Cargar librerías
- Cargar datos
- Identificar variables
- Crear datos de entrenamiento y validación
- Crear modelo de Máquinas de Soporte Vectorial SVM *Deteminar anticipadamente e mejor costo del modelo de SVM Pendiente
- Analizar y/o describir el modelo
- Realizar predicciones con el conjunto de datos de validación
- Evaluar el modelo de predicción con matriz de confusión
- Interpretar el caso
- Comparar con el algoritmo de regresión logística del caso 11.
1. Cargar librerias
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(knitr)
library(caret)
## Warning: package 'caret' was built under R version 4.0.3
## Loading required package: lattice
library(readr)
library(knitr)
library(e1071)
## Warning: package 'e1071' was built under R version 4.0.3
2. Cargar datos
datos <- read.csv("https://raw.githubusercontent.com/rpizarrog/FundamentosMachineLearning/master/datos/adultos_clean.csv", encoding = "UTF-8")
# datos <- read.csv("../datos/adultos_clean.csv")
# kable(head(datos))
# kable(tail(datos))
kable(head(datos, 10), caption = "Los primeros 10 registros de datos")
Los primeros 10 registros de datos
| 1 |
25 |
Private |
Dropout |
7 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1095890 |
0.4000000 |
0.3979592 |
0 |
| 2 |
38 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.2876712 |
0.5333333 |
0.5000000 |
0 |
| 3 |
28 |
Local-gov |
Community |
12 |
Married |
White |
Male |
40 |
>50K |
0.1506849 |
0.7333333 |
0.3979592 |
1 |
| 4 |
44 |
Private |
Community |
10 |
Married |
Black |
Male |
40 |
>50K |
0.3698630 |
0.6000000 |
0.3979592 |
1 |
| 5 |
18 |
? |
Community |
10 |
Not_married |
White |
Female |
30 |
<=50K |
0.0136986 |
0.6000000 |
0.2959184 |
0 |
| 6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
| 7 |
29 |
? |
HighGrad |
9 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1643836 |
0.5333333 |
0.3979592 |
0 |
| 8 |
63 |
Self-emp-not-inc |
Master |
15 |
Married |
White |
Male |
32 |
>50K |
0.6301370 |
0.9333333 |
0.3163265 |
1 |
| 9 |
24 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0958904 |
0.6000000 |
0.3979592 |
0 |
| 10 |
55 |
Private |
Dropout |
4 |
Married |
White |
Male |
10 |
<=50K |
0.5205479 |
0.2000000 |
0.0918367 |
0 |
kable(tail(datos, 10), caption = "Los últimos 10 registros de datos")
Los últimos 10 registros de datos
| 48833 |
48833 |
32 |
Private |
Dropout |
6 |
Married |
Amer-Indian-Eskimo |
Male |
40 |
<=50K |
0.2054795 |
0.3333333 |
0.3979592 |
0 |
| 48834 |
48834 |
43 |
Private |
Community |
11 |
Married |
White |
Male |
45 |
<=50K |
0.3561644 |
0.6666667 |
0.4489796 |
0 |
| 48835 |
48835 |
32 |
Private |
Master |
14 |
Not_married |
Asian-Pac-Islander |
Male |
11 |
<=50K |
0.2054795 |
0.8666667 |
0.1020408 |
0 |
| 48836 |
48836 |
53 |
Private |
Master |
14 |
Married |
White |
Male |
40 |
>50K |
0.4931507 |
0.8666667 |
0.3979592 |
1 |
| 48837 |
48837 |
22 |
Private |
Community |
10 |
Not_married |
White |
Male |
40 |
<=50K |
0.0684932 |
0.6000000 |
0.3979592 |
0 |
| 48838 |
48838 |
27 |
Private |
Community |
12 |
Married |
White |
Female |
38 |
<=50K |
0.1369863 |
0.7333333 |
0.3775510 |
0 |
| 48839 |
48839 |
40 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.3150685 |
0.5333333 |
0.3979592 |
1 |
| 48840 |
48840 |
58 |
Private |
HighGrad |
9 |
Widow |
White |
Female |
40 |
<=50K |
0.5616438 |
0.5333333 |
0.3979592 |
0 |
| 48841 |
48841 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
20 |
<=50K |
0.0684932 |
0.5333333 |
0.1938776 |
0 |
| 48842 |
48842 |
52 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Female |
40 |
>50K |
0.4794521 |
0.5333333 |
0.3979592 |
1 |
3. Las variables
- Convertir a factor la variable dependiente income 10
datos$income10 <- factor(datos$income10)
str(datos)
## 'data.frame': 48842 obs. of 14 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ age : int 25 38 28 44 18 34 29 63 24 55 ...
## $ workclass : chr "Private" "Private" "Local-gov" "Private" ...
## $ education : chr "Dropout" "HighGrad" "Community" "Community" ...
## $ educational.num : int 7 9 12 10 10 6 9 15 10 4 ...
## $ marital.status : chr "Not_married" "Married" "Married" "Married" ...
## $ race : chr "Black" "White" "White" "Black" ...
## $ gender : chr "Male" "Male" "Male" "Male" ...
## $ hours.per.week : int 40 50 40 40 30 30 40 32 40 10 ...
## $ income : chr "<=50K" "<=50K" ">50K" ">50K" ...
## $ age.scale : num 0.1096 0.2877 0.1507 0.3699 0.0137 ...
## $ educational.num.scale: num 0.4 0.533 0.733 0.6 0.6 ...
## $ hours.per.week.scale : num 0.398 0.5 0.398 0.398 0.296 ...
## $ income10 : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 2 1 1 ...
- age la edad de la persona
- workclass es un tipo o clase de trabajo de la persona, privado, gobierno, por su cuenta,
- education indica el nivel educativo de la persona
- educational es el valor numérico de education
- marital.status es su estado civil
- race es el tipo de raza de persona
- gender es el género de la persona
- hours.per.week son las horas que trbaja por semana income son los ingresos
- age_scale la edad escalada
- hours.per.week escalada
- income10 con valores de 0 gana menos de 50 mil y 1 mas de 50 mil
4. Crear datos de entrenamiento y validación
- 70 % datos de entrenamiento
- 30 % datos de validación
set.seed(2020)
entrena <- createDataPartition(y = datos$income10, p = 0.7, list = FALSE, times = 1)
# Datos entrenamiento
datos.entrenamiento <- datos[entrena, ] # [renglones, columna]
# Datos validación
datos.validacion <- datos[-entrena, ]
kable(head(datos.entrenamiento, 10), caption = "Datos de entrenamiento (primeros diez)", row.names = 1:nrow(datos.entrenamiento))
## Warning in if (is.na(row.names)) row.names = has_rownames(x): la condición tiene
## longitud > 1 y sólo el primer elemento será usado
## Warning in if (row.names) {: la condición tiene longitud > 1 y sólo el primer
## elemento será usado
Datos de entrenamiento (primeros diez)
| 1 |
1 |
25 |
Private |
Dropout |
7 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1095890 |
0.4000000 |
0.3979592 |
0 |
| 2 |
2 |
38 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.2876712 |
0.5333333 |
0.5000000 |
0 |
| 3 |
3 |
28 |
Local-gov |
Community |
12 |
Married |
White |
Male |
40 |
>50K |
0.1506849 |
0.7333333 |
0.3979592 |
1 |
| 4 |
4 |
44 |
Private |
Community |
10 |
Married |
Black |
Male |
40 |
>50K |
0.3698630 |
0.6000000 |
0.3979592 |
1 |
| 5 |
5 |
18 |
? |
Community |
10 |
Not_married |
White |
Female |
30 |
<=50K |
0.0136986 |
0.6000000 |
0.2959184 |
0 |
| 7 |
7 |
29 |
? |
HighGrad |
9 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1643836 |
0.5333333 |
0.3979592 |
0 |
| 8 |
8 |
63 |
Self-emp-not-inc |
Master |
15 |
Married |
White |
Male |
32 |
>50K |
0.6301370 |
0.9333333 |
0.3163265 |
1 |
| 9 |
9 |
24 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0958904 |
0.6000000 |
0.3979592 |
0 |
| 10 |
10 |
55 |
Private |
Dropout |
4 |
Married |
White |
Male |
10 |
<=50K |
0.5205479 |
0.2000000 |
0.0918367 |
0 |
| 11 |
11 |
65 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.6575342 |
0.5333333 |
0.3979592 |
1 |
kable(head(datos.validacion, 10), caption = "Datos de validación (primeros diez)", row.names = 1:nrow(datos.entrenamiento))
## Warning in if (is.na(row.names)) row.names = has_rownames(x): la condición tiene
## longitud > 1 y sólo el primer elemento será usado
## Warning in if (row.names) {: la condición tiene longitud > 1 y sólo el primer
## elemento será usado
Datos de validación (primeros diez)
| 6 |
6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
| 15 |
15 |
48 |
Private |
HighGrad |
9 |
Married |
White |
Male |
48 |
>50K |
0.4246575 |
0.5333333 |
0.4795918 |
1 |
| 17 |
17 |
20 |
State-gov |
Community |
10 |
Not_married |
White |
Male |
25 |
<=50K |
0.0410959 |
0.6000000 |
0.2448980 |
0 |
| 36 |
36 |
65 |
? |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.6575342 |
0.5333333 |
0.3979592 |
0 |
| 41 |
41 |
65 |
Private |
Master |
14 |
Married |
White |
Male |
50 |
>50K |
0.6575342 |
0.8666667 |
0.5000000 |
1 |
| 49 |
49 |
52 |
Private |
Dropout |
7 |
Separated |
Black |
Female |
18 |
<=50K |
0.4794521 |
0.4000000 |
0.1734694 |
0 |
| 51 |
51 |
18 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0136986 |
0.6000000 |
0.1938776 |
0 |
| 54 |
54 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
60 |
>50K |
0.0684932 |
0.5333333 |
0.6020408 |
1 |
| 56 |
56 |
21 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
| 58 |
58 |
34 |
Local-gov |
Bachelors |
13 |
Married |
White |
Male |
50 |
>50K |
0.2328767 |
0.8000000 |
0.5000000 |
1 |
5. Crear modelo de Máquinas de Soporte Vectorial SVM
- Con el paquete e1071 se genera el modelo de SVM
- Se utiliza la función svm
- Kernel lineal
modelo1 <- svm(income10 ~ ., data = datos.entrenamiento, kernel = "linear", scale = TRUE, cost = .05)
6. Analizar y/o describir el modelo
summary(modelo1)
##
## Call:
## svm(formula = income10 ~ ., data = datos.entrenamiento, kernel = "linear",
## cost = 0.05, scale = TRUE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.05
##
## Number of Support Vectors: 246
##
## ( 143 103 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
7. Realizar predicciones con el conjunto de datos de validación
prediccion <- predict(modelo1, datos.validacion)
- Agregando columna al final de datos de validación para comparar
datos.validacion <- cbind(datos.validacion, prediccion = prediccion)
kable(head(datos.validacion, 50), caption = "Las predicciones, primeros 10 registros")
Las predicciones, primeros 10 registros
| 6 |
6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
0 |
| 15 |
15 |
48 |
Private |
HighGrad |
9 |
Married |
White |
Male |
48 |
>50K |
0.4246575 |
0.5333333 |
0.4795918 |
1 |
1 |
| 17 |
17 |
20 |
State-gov |
Community |
10 |
Not_married |
White |
Male |
25 |
<=50K |
0.0410959 |
0.6000000 |
0.2448980 |
0 |
0 |
| 36 |
36 |
65 |
? |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.6575342 |
0.5333333 |
0.3979592 |
0 |
0 |
| 41 |
41 |
65 |
Private |
Master |
14 |
Married |
White |
Male |
50 |
>50K |
0.6575342 |
0.8666667 |
0.5000000 |
1 |
1 |
| 49 |
49 |
52 |
Private |
Dropout |
7 |
Separated |
Black |
Female |
18 |
<=50K |
0.4794521 |
0.4000000 |
0.1734694 |
0 |
0 |
| 51 |
51 |
18 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0136986 |
0.6000000 |
0.1938776 |
0 |
0 |
| 54 |
54 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
60 |
>50K |
0.0684932 |
0.5333333 |
0.6020408 |
1 |
1 |
| 56 |
56 |
21 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
0 |
| 58 |
58 |
34 |
Local-gov |
Bachelors |
13 |
Married |
White |
Male |
50 |
>50K |
0.2328767 |
0.8000000 |
0.5000000 |
1 |
1 |
| 59 |
59 |
42 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Male |
50 |
>50K |
0.3424658 |
0.5333333 |
0.5000000 |
1 |
1 |
| 61 |
61 |
30 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
50 |
<=50K |
0.1780822 |
0.8000000 |
0.5000000 |
0 |
0 |
| 64 |
64 |
33 |
Private |
HighGrad |
9 |
Not_married |
White |
Female |
40 |
<=50K |
0.2191781 |
0.5333333 |
0.3979592 |
0 |
0 |
| 68 |
68 |
19 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0273973 |
0.6000000 |
0.1938776 |
0 |
0 |
| 74 |
74 |
21 |
Private |
Community |
10 |
Separated |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
0 |
| 77 |
77 |
41 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.3287671 |
0.5333333 |
0.5000000 |
0 |
0 |
| 79 |
79 |
50 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.4520548 |
0.5333333 |
0.3979592 |
0 |
0 |
| 81 |
81 |
45 |
Self-emp-inc |
Community |
10 |
Married |
White |
Male |
50 |
<=50K |
0.3835616 |
0.6000000 |
0.5000000 |
0 |
0 |
| 91 |
91 |
59 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
25 |
<=50K |
0.5753425 |
0.8000000 |
0.2448980 |
0 |
0 |
| 95 |
95 |
34 |
Private |
Master |
14 |
Not_married |
Amer-Indian-Eskimo |
Male |
40 |
<=50K |
0.2328767 |
0.8666667 |
0.3979592 |
0 |
0 |
| 96 |
96 |
20 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
40 |
<=50K |
0.0410959 |
0.5333333 |
0.3979592 |
0 |
0 |
| 97 |
97 |
25 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
40 |
<=50K |
0.1095890 |
0.8000000 |
0.3979592 |
0 |
0 |
| 102 |
102 |
33 |
Private |
Community |
10 |
Not_married |
Black |
Female |
35 |
<=50K |
0.2191781 |
0.6000000 |
0.3469388 |
0 |
0 |
| 111 |
111 |
18 |
Private |
HighGrad |
9 |
Not_married |
White |
Female |
48 |
<=50K |
0.0136986 |
0.5333333 |
0.4795918 |
0 |
0 |
| 112 |
112 |
28 |
Private |
Community |
10 |
Married |
White |
Male |
40 |
<=50K |
0.1506849 |
0.6000000 |
0.3979592 |
0 |
0 |
| 120 |
120 |
43 |
Private |
Bachelors |
13 |
Separated |
White |
Female |
40 |
>50K |
0.3561644 |
0.8000000 |
0.3979592 |
1 |
1 |
| 123 |
123 |
19 |
Private |
Community |
10 |
Not_married |
White |
Male |
30 |
<=50K |
0.0273973 |
0.6000000 |
0.2959184 |
0 |
0 |
| 129 |
129 |
27 |
Self-emp-not-inc |
HighGrad |
9 |
Married |
White |
Male |
60 |
>50K |
0.1369863 |
0.5333333 |
0.6020408 |
1 |
1 |
| 131 |
131 |
41 |
Private |
Community |
10 |
Married |
White |
Male |
40 |
<=50K |
0.3287671 |
0.6000000 |
0.3979592 |
0 |
0 |
| 135 |
135 |
57 |
Private |
HighGrad |
9 |
Married |
Black |
Male |
48 |
<=50K |
0.5479452 |
0.5333333 |
0.4795918 |
0 |
0 |
| 141 |
141 |
46 |
Private |
Master |
14 |
Married |
White |
Male |
40 |
<=50K |
0.3972603 |
0.8666667 |
0.3979592 |
0 |
0 |
| 144 |
144 |
43 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Male |
45 |
>50K |
0.3561644 |
0.5333333 |
0.4489796 |
1 |
1 |
| 145 |
145 |
34 |
Private |
Master |
14 |
Not_married |
White |
Female |
30 |
<=50K |
0.2328767 |
0.8666667 |
0.2959184 |
0 |
0 |
| 147 |
147 |
44 |
Private |
Community |
11 |
Widow |
White |
Female |
30 |
<=50K |
0.3698630 |
0.6666667 |
0.2959184 |
0 |
0 |
| 153 |
153 |
50 |
Private |
Dropout |
4 |
Married |
White |
Male |
20 |
<=50K |
0.4520548 |
0.2000000 |
0.1938776 |
0 |
0 |
| 160 |
160 |
38 |
Self-emp-inc |
Bachelors |
13 |
Separated |
White |
Male |
40 |
<=50K |
0.2876712 |
0.8000000 |
0.3979592 |
0 |
0 |
| 161 |
161 |
55 |
Private |
Dropout |
7 |
Married |
White |
Male |
30 |
<=50K |
0.5205479 |
0.4000000 |
0.2959184 |
0 |
0 |
| 163 |
163 |
22 |
Private |
HighGrad |
9 |
Married |
White |
Male |
45 |
<=50K |
0.0684932 |
0.5333333 |
0.4489796 |
0 |
0 |
| 165 |
165 |
46 |
State-gov |
Master |
14 |
Married |
White |
Male |
45 |
>50K |
0.3972603 |
0.8666667 |
0.4489796 |
1 |
1 |
| 167 |
167 |
58 |
Self-emp-not-inc |
PhD |
16 |
Married |
White |
Male |
16 |
>50K |
0.5616438 |
1.0000000 |
0.1530612 |
1 |
1 |
| 168 |
168 |
42 |
Private |
HighGrad |
9 |
Married |
White |
Male |
45 |
<=50K |
0.3424658 |
0.5333333 |
0.4489796 |
0 |
0 |
| 171 |
171 |
54 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.5068493 |
0.5333333 |
0.3979592 |
1 |
1 |
| 172 |
172 |
34 |
Private |
Master |
14 |
Not_married |
White |
Male |
40 |
<=50K |
0.2328767 |
0.8666667 |
0.3979592 |
0 |
0 |
| 173 |
173 |
26 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
40 |
<=50K |
0.1232877 |
0.8000000 |
0.3979592 |
0 |
0 |
| 175 |
175 |
48 |
Local-gov |
Master |
14 |
Separated |
Black |
Female |
40 |
<=50K |
0.4246575 |
0.8666667 |
0.3979592 |
0 |
0 |
| 176 |
176 |
36 |
Private |
Community |
10 |
Married |
White |
Male |
45 |
>50K |
0.2602740 |
0.6000000 |
0.4489796 |
1 |
1 |
| 185 |
185 |
44 |
Private |
HighGrad |
9 |
Separated |
Black |
Female |
40 |
<=50K |
0.3698630 |
0.5333333 |
0.3979592 |
0 |
0 |
| 189 |
189 |
34 |
State-gov |
Bachelors |
13 |
Not_married |
Black |
Male |
40 |
<=50K |
0.2328767 |
0.8000000 |
0.3979592 |
0 |
0 |
| 193 |
193 |
47 |
Private |
Community |
10 |
Separated |
Black |
Female |
37 |
<=50K |
0.4109589 |
0.6000000 |
0.3673469 |
0 |
0 |
| 196 |
196 |
31 |
Private |
Community |
10 |
Separated |
White |
Female |
40 |
<=50K |
0.1917808 |
0.6000000 |
0.3979592 |
0 |
0 |
8. Evaluar el modelo de predicción con matriz de confusión
Generando la matriz de confusión
mat.confusion <- table(predicho = prediccion, real = datos.validacion$income10)
mat.confusion
## real
## predicho 0 1
## 0 11146 0
## 1 0 3506
9. Interpretar caso
El objetivo es Aplicar e interpretar el algoritmo de maquinas de soporte vectorial SVM con los datos de personas e ingresos de USA y realozar una comparacion con el modelo del caso 11 de regresión loígistica el valor de exactitud. Como variable dependiente tenemos los ingresos identificados con 0 y 1 ( 0 es los que ganan por debajo o igual de 50 mil y 1 los que ganan por encima de 50 mil). Las variables independientes son age, workclass, education, marital.status, race, gender, hours.per.week, etc. La matriz de confusión se obtiene que el numero de verdaderos es de 11146 y que el caso de negativos es de 3506. El numero de vectores en los que se apoyo fueron 246 y 2 clases (0,1).
10. Comparar con el algoritmo de regresión logística del caso 11.
formula = income10 ~ age.scale + workclass + education + marital.status + race + gender + hours.per.week.scale
modelo2 <- glm(formula, data = datos.entrenamiento, family = 'binomial')
summary(modelo2)
##
## Call:
## glm(formula = formula, family = "binomial", data = datos.entrenamiento)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7337 -0.5768 -0.2588 -0.0654 3.3492
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.419e+00 2.228e-01 -10.858 < 2e-16 ***
## age.scale 2.224e+00 1.053e-01 21.121 < 2e-16 ***
## workclassFederal-gov 1.421e+00 1.237e-01 11.485 < 2e-16 ***
## workclassLocal-gov 6.942e-01 1.100e-01 6.312 2.76e-10 ***
## workclassNever-worked -8.124e+00 1.042e+02 -0.078 0.9379
## workclassPrivate 8.124e-01 9.598e-02 8.464 < 2e-16 ***
## workclassSelf-emp-inc 1.218e+00 1.186e-01 10.270 < 2e-16 ***
## workclassSelf-emp-not-inc 1.878e-01 1.071e-01 1.753 0.0797 .
## workclassState-gov 5.339e-01 1.223e-01 4.367 1.26e-05 ***
## workclassWithout-pay -3.965e-01 8.276e-01 -0.479 0.6318
## educationCommunity -9.930e-01 4.428e-02 -22.426 < 2e-16 ***
## educationDropout -2.782e+00 7.802e-02 -35.657 < 2e-16 ***
## educationHighGrad -1.611e+00 4.523e-02 -35.610 < 2e-16 ***
## educationMaster 6.250e-01 6.110e-02 10.230 < 2e-16 ***
## educationPhD 1.077e+00 1.379e-01 7.814 5.55e-15 ***
## marital.statusNot_married -2.491e+00 5.355e-02 -46.511 < 2e-16 ***
## marital.statusSeparated -2.102e+00 5.650e-02 -37.214 < 2e-16 ***
## marital.statusWidow -2.163e+00 1.287e-01 -16.809 < 2e-16 ***
## raceAsian-Pac-Islander -2.461e-02 2.074e-01 -0.119 0.9055
## raceBlack 4.784e-04 1.968e-01 0.002 0.9981
## raceOther -9.881e-02 2.817e-01 -0.351 0.7258
## raceWhite 2.155e-01 1.876e-01 1.148 0.2509
## genderMale 9.432e-02 4.455e-02 2.117 0.0342 *
## hours.per.week.scale 3.136e+00 1.398e-01 22.430 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 37626 on 34189 degrees of freedom
## Residual deviance: 25011 on 34166 degrees of freedom
## AIC: 25059
##
## Number of Fisher Scoring iterations: 11