Proceso
- Cargar librerías
- Cargar datos
- Identificar variables
- Crear datos de entrenamiento y validación
- Crear modelo de Máquinas de Soporte Vectorial SVM *Deteminar anticipadamente e mejor costo del modelo de SVM Pendiente
- Analizar y/o describir el modelo
- Realizar predicciones con el conjunto de datos de validación
- Evaluar el modelo de predicción con matriz de confusión
- Interpretar el caso
- Comparar con el algoritmo de regresión logística del caso 11.
1. Cargar librerias
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(knitr)
library(caret)
## Warning: package 'caret' was built under R version 4.0.3
## Loading required package: lattice
library(readr)
library(knitr)
library(e1071)
## Warning: package 'e1071' was built under R version 4.0.3
2. Cargar datos
datos <- read.csv("https://raw.githubusercontent.com/rpizarrog/FundamentosMachineLearning/master/datos/adultos_clean.csv", encoding = "UTF-8")
# datos <- read.csv("../datos/adultos_clean.csv")
# kable(head(datos))
# kable(tail(datos))
kable(head(datos, 10), caption = "Los primeros 10 registros de datos")
Los primeros 10 registros de datos
| 1 |
25 |
Private |
Dropout |
7 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1095890 |
0.4000000 |
0.3979592 |
0 |
| 2 |
38 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.2876712 |
0.5333333 |
0.5000000 |
0 |
| 3 |
28 |
Local-gov |
Community |
12 |
Married |
White |
Male |
40 |
>50K |
0.1506849 |
0.7333333 |
0.3979592 |
1 |
| 4 |
44 |
Private |
Community |
10 |
Married |
Black |
Male |
40 |
>50K |
0.3698630 |
0.6000000 |
0.3979592 |
1 |
| 5 |
18 |
? |
Community |
10 |
Not_married |
White |
Female |
30 |
<=50K |
0.0136986 |
0.6000000 |
0.2959184 |
0 |
| 6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
| 7 |
29 |
? |
HighGrad |
9 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1643836 |
0.5333333 |
0.3979592 |
0 |
| 8 |
63 |
Self-emp-not-inc |
Master |
15 |
Married |
White |
Male |
32 |
>50K |
0.6301370 |
0.9333333 |
0.3163265 |
1 |
| 9 |
24 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0958904 |
0.6000000 |
0.3979592 |
0 |
| 10 |
55 |
Private |
Dropout |
4 |
Married |
White |
Male |
10 |
<=50K |
0.5205479 |
0.2000000 |
0.0918367 |
0 |
kable(tail(datos, 10), caption = "Los últimos 10 registros de datos")
Los últimos 10 registros de datos
| 48833 |
48833 |
32 |
Private |
Dropout |
6 |
Married |
Amer-Indian-Eskimo |
Male |
40 |
<=50K |
0.2054795 |
0.3333333 |
0.3979592 |
0 |
| 48834 |
48834 |
43 |
Private |
Community |
11 |
Married |
White |
Male |
45 |
<=50K |
0.3561644 |
0.6666667 |
0.4489796 |
0 |
| 48835 |
48835 |
32 |
Private |
Master |
14 |
Not_married |
Asian-Pac-Islander |
Male |
11 |
<=50K |
0.2054795 |
0.8666667 |
0.1020408 |
0 |
| 48836 |
48836 |
53 |
Private |
Master |
14 |
Married |
White |
Male |
40 |
>50K |
0.4931507 |
0.8666667 |
0.3979592 |
1 |
| 48837 |
48837 |
22 |
Private |
Community |
10 |
Not_married |
White |
Male |
40 |
<=50K |
0.0684932 |
0.6000000 |
0.3979592 |
0 |
| 48838 |
48838 |
27 |
Private |
Community |
12 |
Married |
White |
Female |
38 |
<=50K |
0.1369863 |
0.7333333 |
0.3775510 |
0 |
| 48839 |
48839 |
40 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.3150685 |
0.5333333 |
0.3979592 |
1 |
| 48840 |
48840 |
58 |
Private |
HighGrad |
9 |
Widow |
White |
Female |
40 |
<=50K |
0.5616438 |
0.5333333 |
0.3979592 |
0 |
| 48841 |
48841 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
20 |
<=50K |
0.0684932 |
0.5333333 |
0.1938776 |
0 |
| 48842 |
48842 |
52 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Female |
40 |
>50K |
0.4794521 |
0.5333333 |
0.3979592 |
1 |
3. Las variables
- Convertir a factor la variable dependiente income 10
datos$income10 <- factor(datos$income10)
str(datos)
## 'data.frame': 48842 obs. of 14 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ age : int 25 38 28 44 18 34 29 63 24 55 ...
## $ workclass : chr "Private" "Private" "Local-gov" "Private" ...
## $ education : chr "Dropout" "HighGrad" "Community" "Community" ...
## $ educational.num : int 7 9 12 10 10 6 9 15 10 4 ...
## $ marital.status : chr "Not_married" "Married" "Married" "Married" ...
## $ race : chr "Black" "White" "White" "Black" ...
## $ gender : chr "Male" "Male" "Male" "Male" ...
## $ hours.per.week : int 40 50 40 40 30 30 40 32 40 10 ...
## $ income : chr "<=50K" "<=50K" ">50K" ">50K" ...
## $ age.scale : num 0.1096 0.2877 0.1507 0.3699 0.0137 ...
## $ educational.num.scale: num 0.4 0.533 0.733 0.6 0.6 ...
## $ hours.per.week.scale : num 0.398 0.5 0.398 0.398 0.296 ...
## $ income10 : Factor w/ 2 levels "0","1": 1 1 2 2 1 1 1 2 1 1 ...
- age la edad de la persona
- workclass es un tipo o clase de trabajo de la persona, privado, gobierno, por su cuenta,
- education indica el nivel educativo de la persona
- educational es el valor numérico de education
- marital.status es su estado civil
- race es el tipo de raza de persona
- gender es el género de la persona
- hours.per.week son las horas que trbaja por semana income son los ingresos
- age_scale la edad escalada
- hours.per.week escalada
- income10 con valores de 0 gana menos de 50 mil y 1 mas de 50 mil
4. Crear datos de entrenamiento y validación
- 70 % datos de entrenamiento
- 30 % datos de validación
set.seed(2020)
entrena <- createDataPartition(y = datos$income10, p = 0.7, list = FALSE, times = 1)
# Datos entrenamiento
datos.entrenamiento <- datos[entrena, ] # [renglones, columna]
# Datos validación
datos.validacion <- datos[-entrena, ]
kable(head(datos.entrenamiento, 10), caption = "Datos de entrenamiento (primeros diez)", row.names = 1:nrow(datos.entrenamiento))
## Warning in if (is.na(row.names)) row.names = has_rownames(x): la condición tiene
## longitud > 1 y sólo el primer elemento será usado
## Warning in if (row.names) {: la condición tiene longitud > 1 y sólo el primer
## elemento será usado
Datos de entrenamiento (primeros diez)
| 1 |
1 |
25 |
Private |
Dropout |
7 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1095890 |
0.4000000 |
0.3979592 |
0 |
| 2 |
2 |
38 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.2876712 |
0.5333333 |
0.5000000 |
0 |
| 3 |
3 |
28 |
Local-gov |
Community |
12 |
Married |
White |
Male |
40 |
>50K |
0.1506849 |
0.7333333 |
0.3979592 |
1 |
| 4 |
4 |
44 |
Private |
Community |
10 |
Married |
Black |
Male |
40 |
>50K |
0.3698630 |
0.6000000 |
0.3979592 |
1 |
| 5 |
5 |
18 |
? |
Community |
10 |
Not_married |
White |
Female |
30 |
<=50K |
0.0136986 |
0.6000000 |
0.2959184 |
0 |
| 7 |
7 |
29 |
? |
HighGrad |
9 |
Not_married |
Black |
Male |
40 |
<=50K |
0.1643836 |
0.5333333 |
0.3979592 |
0 |
| 8 |
8 |
63 |
Self-emp-not-inc |
Master |
15 |
Married |
White |
Male |
32 |
>50K |
0.6301370 |
0.9333333 |
0.3163265 |
1 |
| 9 |
9 |
24 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0958904 |
0.6000000 |
0.3979592 |
0 |
| 10 |
10 |
55 |
Private |
Dropout |
4 |
Married |
White |
Male |
10 |
<=50K |
0.5205479 |
0.2000000 |
0.0918367 |
0 |
| 11 |
11 |
65 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.6575342 |
0.5333333 |
0.3979592 |
1 |
kable(head(datos.validacion, 10), caption = "Datos de validación (primeros diez)", row.names = 1:nrow(datos.entrenamiento))
## Warning in if (is.na(row.names)) row.names = has_rownames(x): la condición tiene
## longitud > 1 y sólo el primer elemento será usado
## Warning in if (row.names) {: la condición tiene longitud > 1 y sólo el primer
## elemento será usado
Datos de validación (primeros diez)
| 6 |
6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
| 15 |
15 |
48 |
Private |
HighGrad |
9 |
Married |
White |
Male |
48 |
>50K |
0.4246575 |
0.5333333 |
0.4795918 |
1 |
| 17 |
17 |
20 |
State-gov |
Community |
10 |
Not_married |
White |
Male |
25 |
<=50K |
0.0410959 |
0.6000000 |
0.2448980 |
0 |
| 36 |
36 |
65 |
? |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.6575342 |
0.5333333 |
0.3979592 |
0 |
| 41 |
41 |
65 |
Private |
Master |
14 |
Married |
White |
Male |
50 |
>50K |
0.6575342 |
0.8666667 |
0.5000000 |
1 |
| 49 |
49 |
52 |
Private |
Dropout |
7 |
Separated |
Black |
Female |
18 |
<=50K |
0.4794521 |
0.4000000 |
0.1734694 |
0 |
| 51 |
51 |
18 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0136986 |
0.6000000 |
0.1938776 |
0 |
| 54 |
54 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
60 |
>50K |
0.0684932 |
0.5333333 |
0.6020408 |
1 |
| 56 |
56 |
21 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
| 58 |
58 |
34 |
Local-gov |
Bachelors |
13 |
Married |
White |
Male |
50 |
>50K |
0.2328767 |
0.8000000 |
0.5000000 |
1 |
5. Crear modelo de Máquinas de Soporte Vectorial SVM
- Con el paquete e1071 se genera el modelo de SVM
- Se utiliza la función svm
- Kernel lineal
modelo1 <- svm(income10 ~ ., data = datos.entrenamiento, kernel = "linear", scale = TRUE, cost = .05)
6. Analizar y/o describir el modelo
summary(modelo1)
##
## Call:
## svm(formula = income10 ~ ., data = datos.entrenamiento, kernel = "linear",
## cost = 0.05, scale = TRUE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.05
##
## Number of Support Vectors: 246
##
## ( 143 103 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
7. Realizar predicciones con el conjunto de datos de validación
prediccion <- predict(modelo1, datos.validacion)
- Agregando columna al final de datos de validación para comparar
datos.validacion <- cbind(datos.validacion, prediccion = prediccion)
kable(head(datos.validacion, 50), caption = "Las predicciones, primeros 10 registros")
Las predicciones, primeros 10 registros
| 6 |
6 |
34 |
Private |
Dropout |
6 |
Not_married |
White |
Male |
30 |
<=50K |
0.2328767 |
0.3333333 |
0.2959184 |
0 |
0 |
| 15 |
15 |
48 |
Private |
HighGrad |
9 |
Married |
White |
Male |
48 |
>50K |
0.4246575 |
0.5333333 |
0.4795918 |
1 |
1 |
| 17 |
17 |
20 |
State-gov |
Community |
10 |
Not_married |
White |
Male |
25 |
<=50K |
0.0410959 |
0.6000000 |
0.2448980 |
0 |
0 |
| 36 |
36 |
65 |
? |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.6575342 |
0.5333333 |
0.3979592 |
0 |
0 |
| 41 |
41 |
65 |
Private |
Master |
14 |
Married |
White |
Male |
50 |
>50K |
0.6575342 |
0.8666667 |
0.5000000 |
1 |
1 |
| 49 |
49 |
52 |
Private |
Dropout |
7 |
Separated |
Black |
Female |
18 |
<=50K |
0.4794521 |
0.4000000 |
0.1734694 |
0 |
0 |
| 51 |
51 |
18 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0136986 |
0.6000000 |
0.1938776 |
0 |
0 |
| 54 |
54 |
22 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
60 |
>50K |
0.0684932 |
0.5333333 |
0.6020408 |
1 |
1 |
| 56 |
56 |
21 |
Private |
Community |
10 |
Not_married |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
0 |
| 58 |
58 |
34 |
Local-gov |
Bachelors |
13 |
Married |
White |
Male |
50 |
>50K |
0.2328767 |
0.8000000 |
0.5000000 |
1 |
1 |
| 59 |
59 |
42 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Male |
50 |
>50K |
0.3424658 |
0.5333333 |
0.5000000 |
1 |
1 |
| 61 |
61 |
30 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
50 |
<=50K |
0.1780822 |
0.8000000 |
0.5000000 |
0 |
0 |
| 64 |
64 |
33 |
Private |
HighGrad |
9 |
Not_married |
White |
Female |
40 |
<=50K |
0.2191781 |
0.5333333 |
0.3979592 |
0 |
0 |
| 68 |
68 |
19 |
Private |
Community |
10 |
Not_married |
White |
Male |
20 |
<=50K |
0.0273973 |
0.6000000 |
0.1938776 |
0 |
0 |
| 74 |
74 |
21 |
Private |
Community |
10 |
Separated |
White |
Female |
40 |
<=50K |
0.0547945 |
0.6000000 |
0.3979592 |
0 |
0 |
| 77 |
77 |
41 |
Private |
HighGrad |
9 |
Married |
White |
Male |
50 |
<=50K |
0.3287671 |
0.5333333 |
0.5000000 |
0 |
0 |
| 79 |
79 |
50 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
<=50K |
0.4520548 |
0.5333333 |
0.3979592 |
0 |
0 |
| 81 |
81 |
45 |
Self-emp-inc |
Community |
10 |
Married |
White |
Male |
50 |
<=50K |
0.3835616 |
0.6000000 |
0.5000000 |
0 |
0 |
| 91 |
91 |
59 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
25 |
<=50K |
0.5753425 |
0.8000000 |
0.2448980 |
0 |
0 |
| 95 |
95 |
34 |
Private |
Master |
14 |
Not_married |
Amer-Indian-Eskimo |
Male |
40 |
<=50K |
0.2328767 |
0.8666667 |
0.3979592 |
0 |
0 |
| 96 |
96 |
20 |
Private |
HighGrad |
9 |
Not_married |
White |
Male |
40 |
<=50K |
0.0410959 |
0.5333333 |
0.3979592 |
0 |
0 |
| 97 |
97 |
25 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
40 |
<=50K |
0.1095890 |
0.8000000 |
0.3979592 |
0 |
0 |
| 102 |
102 |
33 |
Private |
Community |
10 |
Not_married |
Black |
Female |
35 |
<=50K |
0.2191781 |
0.6000000 |
0.3469388 |
0 |
0 |
| 111 |
111 |
18 |
Private |
HighGrad |
9 |
Not_married |
White |
Female |
48 |
<=50K |
0.0136986 |
0.5333333 |
0.4795918 |
0 |
0 |
| 112 |
112 |
28 |
Private |
Community |
10 |
Married |
White |
Male |
40 |
<=50K |
0.1506849 |
0.6000000 |
0.3979592 |
0 |
0 |
| 120 |
120 |
43 |
Private |
Bachelors |
13 |
Separated |
White |
Female |
40 |
>50K |
0.3561644 |
0.8000000 |
0.3979592 |
1 |
1 |
| 123 |
123 |
19 |
Private |
Community |
10 |
Not_married |
White |
Male |
30 |
<=50K |
0.0273973 |
0.6000000 |
0.2959184 |
0 |
0 |
| 129 |
129 |
27 |
Self-emp-not-inc |
HighGrad |
9 |
Married |
White |
Male |
60 |
>50K |
0.1369863 |
0.5333333 |
0.6020408 |
1 |
1 |
| 131 |
131 |
41 |
Private |
Community |
10 |
Married |
White |
Male |
40 |
<=50K |
0.3287671 |
0.6000000 |
0.3979592 |
0 |
0 |
| 135 |
135 |
57 |
Private |
HighGrad |
9 |
Married |
Black |
Male |
48 |
<=50K |
0.5479452 |
0.5333333 |
0.4795918 |
0 |
0 |
| 141 |
141 |
46 |
Private |
Master |
14 |
Married |
White |
Male |
40 |
<=50K |
0.3972603 |
0.8666667 |
0.3979592 |
0 |
0 |
| 144 |
144 |
43 |
Self-emp-inc |
HighGrad |
9 |
Married |
White |
Male |
45 |
>50K |
0.3561644 |
0.5333333 |
0.4489796 |
1 |
1 |
| 145 |
145 |
34 |
Private |
Master |
14 |
Not_married |
White |
Female |
30 |
<=50K |
0.2328767 |
0.8666667 |
0.2959184 |
0 |
0 |
| 147 |
147 |
44 |
Private |
Community |
11 |
Widow |
White |
Female |
30 |
<=50K |
0.3698630 |
0.6666667 |
0.2959184 |
0 |
0 |
| 153 |
153 |
50 |
Private |
Dropout |
4 |
Married |
White |
Male |
20 |
<=50K |
0.4520548 |
0.2000000 |
0.1938776 |
0 |
0 |
| 160 |
160 |
38 |
Self-emp-inc |
Bachelors |
13 |
Separated |
White |
Male |
40 |
<=50K |
0.2876712 |
0.8000000 |
0.3979592 |
0 |
0 |
| 161 |
161 |
55 |
Private |
Dropout |
7 |
Married |
White |
Male |
30 |
<=50K |
0.5205479 |
0.4000000 |
0.2959184 |
0 |
0 |
| 163 |
163 |
22 |
Private |
HighGrad |
9 |
Married |
White |
Male |
45 |
<=50K |
0.0684932 |
0.5333333 |
0.4489796 |
0 |
0 |
| 165 |
165 |
46 |
State-gov |
Master |
14 |
Married |
White |
Male |
45 |
>50K |
0.3972603 |
0.8666667 |
0.4489796 |
1 |
1 |
| 167 |
167 |
58 |
Self-emp-not-inc |
PhD |
16 |
Married |
White |
Male |
16 |
>50K |
0.5616438 |
1.0000000 |
0.1530612 |
1 |
1 |
| 168 |
168 |
42 |
Private |
HighGrad |
9 |
Married |
White |
Male |
45 |
<=50K |
0.3424658 |
0.5333333 |
0.4489796 |
0 |
0 |
| 171 |
171 |
54 |
Private |
HighGrad |
9 |
Married |
White |
Male |
40 |
>50K |
0.5068493 |
0.5333333 |
0.3979592 |
1 |
1 |
| 172 |
172 |
34 |
Private |
Master |
14 |
Not_married |
White |
Male |
40 |
<=50K |
0.2328767 |
0.8666667 |
0.3979592 |
0 |
0 |
| 173 |
173 |
26 |
Private |
Bachelors |
13 |
Not_married |
White |
Female |
40 |
<=50K |
0.1232877 |
0.8000000 |
0.3979592 |
0 |
0 |
| 175 |
175 |
48 |
Local-gov |
Master |
14 |
Separated |
Black |
Female |
40 |
<=50K |
0.4246575 |
0.8666667 |
0.3979592 |
0 |
0 |
| 176 |
176 |
36 |
Private |
Community |
10 |
Married |
White |
Male |
45 |
>50K |
0.2602740 |
0.6000000 |
0.4489796 |
1 |
1 |
| 185 |
185 |
44 |
Private |
HighGrad |
9 |
Separated |
Black |
Female |
40 |
<=50K |
0.3698630 |
0.5333333 |
0.3979592 |
0 |
0 |
| 189 |
189 |
34 |
State-gov |
Bachelors |
13 |
Not_married |
Black |
Male |
40 |
<=50K |
0.2328767 |
0.8000000 |
0.3979592 |
0 |
0 |
| 193 |
193 |
47 |
Private |
Community |
10 |
Separated |
Black |
Female |
37 |
<=50K |
0.4109589 |
0.6000000 |
0.3673469 |
0 |
0 |
| 196 |
196 |
31 |
Private |
Community |
10 |
Separated |
White |
Female |
40 |
<=50K |
0.1917808 |
0.6000000 |
0.3979592 |
0 |
0 |
8. Evaluar el modelo de predicción con matriz de confusión
Generando la matriz de confusión
mat.confusion <- table(predicho = prediccion, real = datos.validacion$income10)
mat.confusion
## real
## predicho 0 1
## 0 11146 0
## 1 0 3506
9. Interpretar caso
En este analisis de datos usamos un conjunto de datos de personas que viven en USA. El objetivo es Aplicar e interpretar el algoritmo de maquinas de soporte vectorial SVM con los datos de personas e ingresos de USA y comparar con el caso 11 de regresión loígistica el valor de exactitud. La variable dependiente son los ingresos identificados con 0 y 1 donde 0 es los que ganan por debajo o igual de 50 mil y 1 los que ganan por encima de 50 mil. Y las variables independientes son age, workclass, education, marital.status, race, gender, hours.per.week, etc. La matriz de confusión se obtiene que el numero de verdaderos es de 11146 y que el caso de negativos es de 3506. El numero de vectores en los que se apoyo fueron 246 y 2 clases (0,1).
10. Comparar con el algoritmo de regresión logística del caso 11.
formula = income10 ~ age.scale + workclass + education + marital.status + race + gender + hours.per.week.scale
modelo2 <- glm(formula, data = datos.entrenamiento, family = 'binomial')
summary(modelo2)
##
## Call:
## glm(formula = formula, family = "binomial", data = datos.entrenamiento)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7337 -0.5768 -0.2588 -0.0654 3.3492
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.419e+00 2.228e-01 -10.858 < 2e-16 ***
## age.scale 2.224e+00 1.053e-01 21.121 < 2e-16 ***
## workclassFederal-gov 1.421e+00 1.237e-01 11.485 < 2e-16 ***
## workclassLocal-gov 6.942e-01 1.100e-01 6.312 2.76e-10 ***
## workclassNever-worked -8.124e+00 1.042e+02 -0.078 0.9379
## workclassPrivate 8.124e-01 9.598e-02 8.464 < 2e-16 ***
## workclassSelf-emp-inc 1.218e+00 1.186e-01 10.270 < 2e-16 ***
## workclassSelf-emp-not-inc 1.878e-01 1.071e-01 1.753 0.0797 .
## workclassState-gov 5.339e-01 1.223e-01 4.367 1.26e-05 ***
## workclassWithout-pay -3.965e-01 8.276e-01 -0.479 0.6318
## educationCommunity -9.930e-01 4.428e-02 -22.426 < 2e-16 ***
## educationDropout -2.782e+00 7.802e-02 -35.657 < 2e-16 ***
## educationHighGrad -1.611e+00 4.523e-02 -35.610 < 2e-16 ***
## educationMaster 6.250e-01 6.110e-02 10.230 < 2e-16 ***
## educationPhD 1.077e+00 1.379e-01 7.814 5.55e-15 ***
## marital.statusNot_married -2.491e+00 5.355e-02 -46.511 < 2e-16 ***
## marital.statusSeparated -2.102e+00 5.650e-02 -37.214 < 2e-16 ***
## marital.statusWidow -2.163e+00 1.287e-01 -16.809 < 2e-16 ***
## raceAsian-Pac-Islander -2.461e-02 2.074e-01 -0.119 0.9055
## raceBlack 4.784e-04 1.968e-01 0.002 0.9981
## raceOther -9.881e-02 2.817e-01 -0.351 0.7258
## raceWhite 2.155e-01 1.876e-01 1.148 0.2509
## genderMale 9.432e-02 4.455e-02 2.117 0.0342 *
## hours.per.week.scale 3.136e+00 1.398e-01 22.430 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 37626 on 34189 degrees of freedom
## Residual deviance: 25011 on 34166 degrees of freedom
## AIC: 25059
##
## Number of Fisher Scoring iterations: 11