First deliverable: DataViz

Introduction

Data source: https://archive.ics.uci.edu/dataset/2/adult

Dataset overview

  • Also known as Census Income.

  • It was published in the UCI Machine Learning Repository by Barry Becker in 1996, based on data from the 1994 US census.

It has 48,842 instances and 14 attributes, plus the target variable Income (15 columns in total).

Attributes included

  • Age
  • Fnlwgt (final weight)
  • Education
  • Education-num
  • Capital-gain
  • Capital-loss
  • Hours-per-week
  • Workclass
  • Occupation
  • Native-country
  • Relationship
  • Marital-status
  • Race
  • Sex
  • Income

Data loading and verification

Sys.setlocale("LC_ALL", "es_ES.UTF-8")
## [1] "LC_COLLATE=es_ES.UTF-8;LC_CTYPE=es_ES.UTF-8;LC_MONETARY=es_ES.UTF-8;LC_NUMERIC=C;LC_TIME=es_ES.UTF-8"
adult1 <- read.csv("~/rstudio/adult.data", header=FALSE,na.strings=" ?")
adult2 <- read.csv("~/rstudio/adult.test", header=FALSE,na.strings=" ?")
df <- rbind(adult1,adult2)
str(df)
## 'data.frame':    48842 obs. of  15 variables:
##  $ V1 : int  39 50 38 53 28 37 49 52 31 42 ...
##  $ V2 : chr  " State-gov" " Self-emp-not-inc" " Private" " Private" ...
##  $ V3 : int  77516 83311 215646 234721 338409 284582 160187 209642 45781 159449 ...
##  $ V4 : chr  " Bachelors" " Bachelors" " HS-grad" " 11th" ...
##  $ V5 : int  13 13 9 7 13 14 5 9 14 13 ...
##  $ V6 : chr  " Never-married" " Married-civ-spouse" " Divorced" " Married-civ-spouse" ...
##  $ V7 : chr  " Adm-clerical" " Exec-managerial" " Handlers-cleaners" " Handlers-cleaners" ...
##  $ V8 : chr  " Not-in-family" " Husband" " Not-in-family" " Husband" ...
##  $ V9 : chr  " White" " White" " White" " Black" ...
##  $ V10: chr  " Male" " Male" " Male" " Male" ...
##  $ V11: int  2174 0 0 0 0 0 0 0 14084 5178 ...
##  $ V12: int  0 0 0 0 0 0 0 0 0 0 ...
##  $ V13: int  40 13 40 40 40 40 16 45 50 40 ...
##  $ V14: chr  " United-States" " United-States" " United-States" " United-States" ...
##  $ V15: chr  " <=50K" " <=50K" " <=50K" " <=50K" ...
head(df)
##   V1                V2     V3         V4 V5                  V6
## 1 39         State-gov  77516  Bachelors 13       Never-married
## 2 50  Self-emp-not-inc  83311  Bachelors 13  Married-civ-spouse
## 3 38           Private 215646    HS-grad  9            Divorced
## 4 53           Private 234721       11th  7  Married-civ-spouse
## 5 28           Private 338409  Bachelors 13  Married-civ-spouse
## 6 37           Private 284582    Masters 14  Married-civ-spouse
##                   V7             V8     V9     V10  V11 V12 V13            V14
## 1       Adm-clerical  Not-in-family  White    Male 2174   0  40  United-States
## 2    Exec-managerial        Husband  White    Male    0   0  13  United-States
## 3  Handlers-cleaners  Not-in-family  White    Male    0   0  40  United-States
## 4  Handlers-cleaners        Husband  Black    Male    0   0  40  United-States
## 5     Prof-specialty           Wife  Black  Female    0   0  40           Cuba
## 6    Exec-managerial           Wife  White  Female    0   0  40  United-States
##      V15
## 1  <=50K
## 2  <=50K
## 3  <=50K
## 4  <=50K
## 5  <=50K
## 6  <=50K

We load the dataset from its two files. Since applying machine learning techniques is not our goal, we stack the train and test records into a single data frame that holds all the information. The columns also lack meaningful names (V1 to V15), so we assign them next.

cols <- c("Age",
          "Workclass",
          "Fnlwgt",
          "Education",
          "Education-num",
          "Marital-status",
          "Occupation",
          "Relationship",
          "Race",
          "Sex",
          "Capital-gain",
          "Capital-loss",
          "Hours-per-week",
          "Native-country",
          "Income"
          )
colnames(df) <- cols
head(df)
##   Age         Workclass Fnlwgt  Education Education-num      Marital-status
## 1  39         State-gov  77516  Bachelors            13       Never-married
## 2  50  Self-emp-not-inc  83311  Bachelors            13  Married-civ-spouse
## 3  38           Private 215646    HS-grad             9            Divorced
## 4  53           Private 234721       11th             7  Married-civ-spouse
## 5  28           Private 338409  Bachelors            13  Married-civ-spouse
## 6  37           Private 284582    Masters            14  Married-civ-spouse
##           Occupation   Relationship   Race     Sex Capital-gain Capital-loss
## 1       Adm-clerical  Not-in-family  White    Male         2174            0
## 2    Exec-managerial        Husband  White    Male            0            0
## 3  Handlers-cleaners  Not-in-family  White    Male            0            0
## 4  Handlers-cleaners        Husband  Black    Male            0            0
## 5     Prof-specialty           Wife  Black  Female            0            0
## 6    Exec-managerial           Wife  White  Female            0            0
##   Hours-per-week Native-country Income
## 1             40  United-States  <=50K
## 2             13  United-States  <=50K
## 3             40  United-States  <=50K
## 4             40  United-States  <=50K
## 5             40           Cuba  <=50K
## 6             40  United-States  <=50K
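The loading steps above can also be condensed so that names, whitespace and missing markers are handled at read time. The sketch below is only an illustration on two hypothetical inline samples (`train_txt`, `test_txt`) with three columns; it assumes, as in the UCI distribution, that the labels in adult.test carry a trailing period (`<=50K.`), which would otherwise produce four Income levels after `rbind`. If the local copy has already been cleaned, the `sub()` call is harmless.

```r
# Hypothetical two-row samples that mimic the layout of adult.data and adult.test
train_txt <- "39, State-gov, <=50K\n50, ?, >50K"
test_txt  <- "25, Private, <=50K.\n38, ?, >50K."
cols_demo <- c("Age", "Workclass", "Income")  # shortened column list for the sketch

read_part <- function(txt) {
  # strip.white = TRUE removes the leading blank of every field,
  # so the missing marker can be written as "?" instead of " ?"
  read.csv(text = txt, header = FALSE, col.names = cols_demo,
           strip.white = TRUE, na.strings = "?")
}

demo <- rbind(read_part(train_txt), read_part(test_txt))

# Drop the trailing period so train and test share the same two labels
demo$Income <- factor(sub("\\.$", "", demo$Income))
levels(demo$Income)
```

Stripping whitespace at read time changes the stored values (for example "Private" instead of " Private"), so any later comparison against a literal would need to use the trimmed form.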

R infers the type of the numeric variables correctly, with one exception: Education-num is stored as an integer but is really a categorical variable. We convert it to factor together with the character predictors. Note that Income is not in the list below, so it remains a character column.

factores <- c("Workclass", "Education", "Education-num", "Marital-status", "Occupation", "Relationship", "Race", "Sex", "Native-country")
df[factores] <- lapply(df[factores], as.factor)
head(df)
##   Age         Workclass Fnlwgt  Education Education-num      Marital-status
## 1  39         State-gov  77516  Bachelors            13       Never-married
## 2  50  Self-emp-not-inc  83311  Bachelors            13  Married-civ-spouse
## 3  38           Private 215646    HS-grad             9            Divorced
## 4  53           Private 234721       11th             7  Married-civ-spouse
## 5  28           Private 338409  Bachelors            13  Married-civ-spouse
## 6  37           Private 284582    Masters            14  Married-civ-spouse
##           Occupation   Relationship   Race     Sex Capital-gain Capital-loss
## 1       Adm-clerical  Not-in-family  White    Male         2174            0
## 2    Exec-managerial        Husband  White    Male            0            0
## 3  Handlers-cleaners  Not-in-family  White    Male            0            0
## 4  Handlers-cleaners        Husband  Black    Male            0            0
## 5     Prof-specialty           Wife  Black  Female            0            0
## 6    Exec-managerial           Wife  White  Female            0            0
##   Hours-per-week Native-country Income
## 1             40  United-States  <=50K
## 2             13  United-States  <=50K
## 3             40  United-States  <=50K
## 4             40  United-States  <=50K
## 5             40           Cuba  <=50K
## 6             40  United-States  <=50K
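The first rows printed above pair each Education label with a single Education-num code (Bachelors with 13, HS-grad with 9, 11th with 7, Masters with 14), which suggests that both columns carry the same information. A minimal sketch of how that could be checked, and of how the ranking in the code could be kept with an ordered factor, follows. The toy rows are copied from the output above; the column name `Education_num` and the object names are simplifications made for this example.

```r
# Toy rows copied from the head(df) output; names are simplified for the sketch
toy <- data.frame(
  Education     = c("Bachelors", "Bachelors", "HS-grad", "11th", "Bachelors", "Masters"),
  Education_num = c(13, 13, 9, 7, 13, 14)
)

# Keep one row per distinct (label, code) combination
combos <- unique(toy)

# The mapping is one-to-one if no label and no code appears twice
one_to_one <- anyDuplicated(combos$Education) == 0 &&
  anyDuplicated(combos$Education_num) == 0
one_to_one

# An ordered factor keeps the ranking that the numeric code expresses
edu_ordered <- factor(toy$Education_num,
                      levels = sort(unique(toy$Education_num)),
                      ordered = TRUE)
edu_ordered
```

If the check also holds on the full data frame, one of the two columns could be dropped from the plots without losing information.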

Now that the data are loaded and the types validated, we can proceed with the exploratory data analysis, including an analysis of missing data.

Exploratory data analysis

We start with a summary of the data frame.

summary(df)
##       Age                    Workclass         Fnlwgt       
##  Min.   :17.00    Private         :33906   Min.   :  12285  
##  1st Qu.:28.00    Self-emp-not-inc: 3862   1st Qu.: 117551  
##  Median :37.00    Local-gov       : 3136   Median : 178145  
##  Mean   :38.64    State-gov       : 1981   Mean   : 189664  
##  3rd Qu.:48.00    Self-emp-inc    : 1695   3rd Qu.: 237642  
##  Max.   :90.00   (Other)          : 1463   Max.   :1490400  
##                  NA's             : 2799                    
##          Education     Education-num                  Marital-status 
##   HS-grad     :15784   9      :15784    Divorced             : 6633  
##   Some-college:10878   10     :10878    Married-AF-spouse    :   37  
##   Bachelors   : 8025   13     : 8025    Married-civ-spouse   :22379  
##   Masters     : 2657   14     : 2657    Married-spouse-absent:  628  
##   Assoc-voc   : 2061   11     : 2061    Never-married        :16117  
##   11th        : 1812   7      : 1812    Separated            : 1530  
##  (Other)      : 7625   (Other): 7625    Widowed              : 1518  
##             Occupation             Relationship                    Race      
##   Prof-specialty : 6172    Husband       :19716    Amer-Indian-Eskimo:  470  
##   Craft-repair   : 6112    Not-in-family :12583    Asian-Pac-Islander: 1519  
##   Exec-managerial: 6086    Other-relative: 1506    Black             : 4685  
##   Adm-clerical   : 5611    Own-child     : 7581    Other             :  406  
##   Sales          : 5504    Unmarried     : 5125    White             :41762  
##  (Other)         :16548    Wife          : 2331                              
##  NA's            : 2809                                                      
##       Sex         Capital-gain    Capital-loss    Hours-per-week 
##   Female:16192   Min.   :    0   Min.   :   0.0   Min.   : 1.00  
##   Male  :32650   1st Qu.:    0   1st Qu.:   0.0   1st Qu.:40.00  
##                  Median :    0   Median :   0.0   Median :40.00  
##                  Mean   : 1079   Mean   :  87.5   Mean   :40.42  
##                  3rd Qu.:    0   3rd Qu.:   0.0   3rd Qu.:45.00  
##                  Max.   :99999   Max.   :4356.0   Max.   :99.00  
##                                                                  
##         Native-country     Income         
##   United-States:43832   Length:48842      
##   Mexico       :  951   Class :character  
##   Philippines  :  295   Mode  :character  
##   Germany      :  206                     
##   Puerto-Rico  :  184                     
##  (Other)       : 2517                     
##  NA's          :  857

The summary shows missing values (NA's), so we need to analyze them before continuing.

Missing-data analysis

library(Amelia)
## Cargando paquete requerido: Rcpp
## ## 
## ## Amelia II: Multiple Imputation
## ## (Version 1.8.3, built: 2024-11-07)
## ## Copyright (C) 2005-2025 James Honaker, Gary King and Matthew Blackwell
## ## Refer to http://gking.harvard.edu/amelia/ for more information
## ##
missmap(df,col=c("red","blue"),legend=TRUE)

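# Count the missing values and compute their percentage per column
# (" ?" was already mapped to NA by na.strings at read time; this assignment is only a safeguard)
df[df == " ?"] <- NA
conteo_preguntas <- colSums(is.na(df))
porcentaje_preguntas <- round((conteo_preguntas / nrow(df)) * 100, 2)

# Build the summary table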
resumen <- data.frame(
  Columna = names(conteo_preguntas),
  Cantidad = conteo_preguntas,
  Porcentaje = porcentaje_preguntas
)

print(resumen)
##                       Columna Cantidad Porcentaje
## Age                       Age        0       0.00
## Workclass           Workclass     2799       5.73
## Fnlwgt                 Fnlwgt        0       0.00
## Education           Education        0       0.00
## Education-num   Education-num        0       0.00
## Marital-status Marital-status        0       0.00
## Occupation         Occupation     2809       5.75
## Relationship     Relationship        0       0.00
## Race                     Race        0       0.00
## Sex                       Sex        0       0.00
## Capital-gain     Capital-gain        0       0.00
## Capital-loss     Capital-loss        0       0.00
## Hours-per-week Hours-per-week        0       0.00
## Native-country Native-country      857       1.75
## Income                 Income        0       0.00

Workclass, Occupation and Native-country contain missing values, though not to a problematic degree. In the first two the share exceeds 5% (5.73% and 5.75%), so we apply a simple imputation to them; in Native-country it is only 1.75%, so the NAs have little impact on the variable. We also check the normality of the numeric variables to decide which imputation would suit them: the mean if a variable looks normal, the median otherwise. According to the table above the numeric columns have no missing values, so for them this step is precautionary.

num_vars <- sapply(df, is.numeric)
for (col in names(df)[num_vars]) {
  
  # Extract the non-missing values
  x <- df[[col]][!is.na(df[[col]])]
  
  # ks.test against a normal with the mean and sd of x
  ks_res <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
  
  if (ks_res$p.value > 0.05) {
    # Treated as normal → impute with the mean
    df[[col]][is.na(df[[col]])] <- mean(x)
    cat(col, ": normal → imputada con media\n")
  } else {
    # Not normal → impute with the median
    df[[col]][is.na(df[[col]])] <- median(x)
    cat(col, ": NO normal → imputada con mediana\n")
  }
}
## Warning in ks.test.default(x, "pnorm", mean = mean(x), sd = sd(x)): ties should
## not be present for the one-sample Kolmogorov-Smirnov test
## Age : NO normal → imputada con mediana
## Warning in ks.test.default(x, "pnorm", mean = mean(x), sd = sd(x)): ties should
## not be present for the one-sample Kolmogorov-Smirnov test
## Fnlwgt : NO normal → imputada con mediana
## Warning in ks.test.default(x, "pnorm", mean = mean(x), sd = sd(x)): ties should
## not be present for the one-sample Kolmogorov-Smirnov test
## Capital-gain : NO normal → imputada con mediana
## Warning in ks.test.default(x, "pnorm", mean = mean(x), sd = sd(x)): ties should
## not be present for the one-sample Kolmogorov-Smirnov test
## Capital-loss : NO normal → imputada con mediana
## Warning in ks.test.default(x, "pnorm", mean = mean(x), sd = sd(x)): ties should
## not be present for the one-sample Kolmogorov-Smirnov test
## Hours-per-week : NO normal → imputada con mediana

Since none of the variables is normal, any NA values in them are imputed with the median.

df$Fnlwgt[is.na(df$Fnlwgt)] <- median(df$Fnlwgt, na.rm = TRUE)
df$`Capital-gain`[is.na(df$`Capital-gain`)] <- median(df$`Capital-gain`, na.rm = TRUE)
df$`Capital-loss`[is.na(df$`Capital-loss`)] <- median(df$`Capital-loss`, na.rm = TRUE)
df$`Hours-per-week`[is.na(df$`Hours-per-week`)]<-median(df$`Hours-per-week`,na.rm=TRUE)
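The four assignments above repeat the same pattern, and because the missing-data table reports zero NAs in the numeric columns they leave the data unchanged here. As a sketch only, the pattern could be wrapped in a small helper; `impute_median` and the toy vector are hypothetical names introduced for this example.

```r
# Hypothetical helper: replace NAs in a numeric vector with its median
impute_median <- function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
}

# Toy vector with one missing value and one extreme value
toy <- c(1, 2, NA, 4, 100)
impute_median(toy)

# The median of the observed values (1, 2, 4, 100) is 3, so the NA becomes 3;
# the mean of the same values would be 26.75, pulled up by the extreme value.
```

On the real data frame the helper could be applied to several columns at once, for example with `lapply` over the numeric column names.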

The remaining variables with NAs are categorical, so we impute them with the mode.

moda <- function(x) {
  ux <- na.omit(unique(x))
  ux[which.max(tabulate(match(x, ux)))]
}
df$Workclass[is.na(df$Workclass)] <- moda(df$Workclass)
df$`Education-num`[is.na(df$`Education-num`)]<-moda(df$`Education-num`)
df$Occupation[is.na(df$Occupation)]<-moda(df$Occupation)
df$`Native-country`[is.na(df$`Native-country`)]<-moda(df$`Native-country`)
colSums(is.na(df))
##            Age      Workclass         Fnlwgt      Education  Education-num 
##              0              0              0              0              0 
## Marital-status     Occupation   Relationship           Race            Sex 
##              0              0              0              0              0 
##   Capital-gain   Capital-loss Hours-per-week Native-country         Income 
##              0              0              0              0              0
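Mode imputation moves every missing record into the most frequent level: in the outputs of this report Private grows from 33906 to 36705 once the 2799 missing Workclass values are filled. The sketch below contrasts that behavior with one possible alternative, keeping the missing values as an explicit level. It uses a toy factor; `mode_level` and the label "Unknown" are assumptions made for this example, not part of the original analysis.

```r
# Hypothetical helper: name of the most frequent level (table() ignores NAs by default)
mode_level <- function(x) names(which.max(table(x)))

toy <- factor(c("a", "b", "b", NA, NA))

# Option 1: mode imputation, as done above
filled <- toy
filled[is.na(filled)] <- mode_level(toy)
table(filled)      # level "b" absorbs both missing records

# Option 2: keep the missing records as their own level
explicit <- addNA(toy)
levels(explicit)[is.na(levels(explicit))] <- "Unknown"
table(explicit)    # original counts are preserved, plus an "Unknown" level
```

Which option is preferable depends on the goal: the first keeps the set of levels unchanged, the second keeps the original frequencies visible in the plots.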

With this, all missing values have been imputed and we can finally carry out the exploratory analysis.

Univariate analysis

Analysis of categorical variables

We begin the univariate analysis by examining the behavior of the categorical variables.

library(ggplot2)

for (var in factores) {
  cat("\n========================\n")
  cat("Variable:", var, "\n")
  cat("========================\n")
  
  # Absolute frequency
  freq_table <- table(df[[var]])
  print(freq_table)
  
  # Percentage
  prop_table <- prop.table(freq_table) * 100
  print(round(prop_table, 2))
}
## 
## ========================
## Variable: Workclass 
## ========================
## 
##       Federal-gov         Local-gov      Never-worked           Private 
##              1432              3136                10             36705 
##      Self-emp-inc  Self-emp-not-inc         State-gov       Without-pay 
##              1695              3862              1981                21 
## 
##       Federal-gov         Local-gov      Never-worked           Private 
##              2.93              6.42              0.02             75.15 
##      Self-emp-inc  Self-emp-not-inc         State-gov       Without-pay 
##              3.47              7.91              4.06              0.04 
## 
## ========================
## Variable: Education 
## ========================
## 
##          10th          11th          12th       1st-4th       5th-6th 
##          1389          1812           657           247           509 
##       7th-8th           9th    Assoc-acdm     Assoc-voc     Bachelors 
##           955           756          1601          2061          8025 
##     Doctorate       HS-grad       Masters     Preschool   Prof-school 
##           594         15784          2657            83           834 
##  Some-college 
##         10878 
## 
##          10th          11th          12th       1st-4th       5th-6th 
##          2.84          3.71          1.35          0.51          1.04 
##       7th-8th           9th    Assoc-acdm     Assoc-voc     Bachelors 
##          1.96          1.55          3.28          4.22         16.43 
##     Doctorate       HS-grad       Masters     Preschool   Prof-school 
##          1.22         32.32          5.44          0.17          1.71 
##  Some-college 
##         22.27 
## 
## ========================
## Variable: Education-num 
## ========================
## 
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##    83   247   509   955   756  1389  1812   657 15784 10878  2061  1601  8025 
##    14    15    16 
##  2657   834   594 
## 
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
##  0.17  0.51  1.04  1.96  1.55  2.84  3.71  1.35 32.32 22.27  4.22  3.28 16.43 
##    14    15    16 
##  5.44  1.71  1.22 
## 
## ========================
## Variable: Marital-status 
## ========================
## 
##               Divorced      Married-AF-spouse     Married-civ-spouse 
##                   6633                     37                  22379 
##  Married-spouse-absent          Never-married              Separated 
##                    628                  16117                   1530 
##                Widowed 
##                   1518 
## 
##               Divorced      Married-AF-spouse     Married-civ-spouse 
##                  13.58                   0.08                  45.82 
##  Married-spouse-absent          Never-married              Separated 
##                   1.29                  33.00                   3.13 
##                Widowed 
##                   3.11 
## 
## ========================
## Variable: Occupation 
## ========================
## 
##       Adm-clerical       Armed-Forces       Craft-repair    Exec-managerial 
##               5611                 15               6112               6086 
##    Farming-fishing  Handlers-cleaners  Machine-op-inspct      Other-service 
##               1490               2072               3022               4923 
##    Priv-house-serv     Prof-specialty    Protective-serv              Sales 
##                242               8981                983               5504 
##       Tech-support   Transport-moving 
##               1446               2355 
## 
##       Adm-clerical       Armed-Forces       Craft-repair    Exec-managerial 
##              11.49               0.03              12.51              12.46 
##    Farming-fishing  Handlers-cleaners  Machine-op-inspct      Other-service 
##               3.05               4.24               6.19              10.08 
##    Priv-house-serv     Prof-specialty    Protective-serv              Sales 
##               0.50              18.39               2.01              11.27 
##       Tech-support   Transport-moving 
##               2.96               4.82 
## 
## ========================
## Variable: Relationship 
## ========================
## 
##         Husband   Not-in-family  Other-relative       Own-child       Unmarried 
##           19716           12583            1506            7581            5125 
##            Wife 
##            2331 
## 
##         Husband   Not-in-family  Other-relative       Own-child       Unmarried 
##           40.37           25.76            3.08           15.52           10.49 
##            Wife 
##            4.77 
## 
## ========================
## Variable: Race 
## ========================
## 
##  Amer-Indian-Eskimo  Asian-Pac-Islander               Black               Other 
##                 470                1519                4685                 406 
##               White 
##               41762 
## 
##  Amer-Indian-Eskimo  Asian-Pac-Islander               Black               Other 
##                0.96                3.11                9.59                0.83 
##               White 
##               85.50 
## 
## ========================
## Variable: Sex 
## ========================
## 
##  Female    Male 
##   16192   32650 
## 
##  Female    Male 
##   33.15   66.85 
## 
## ========================
## Variable: Native-country 
## ========================
## 
##                    Cambodia                      Canada 
##                          28                         182 
##                       China                    Columbia 
##                         122                          85 
##                        Cuba          Dominican-Republic 
##                         138                         103 
##                     Ecuador                 El-Salvador 
##                          45                         155 
##                     England                      France 
##                         127                          38 
##                     Germany                      Greece 
##                         206                          49 
##                   Guatemala                       Haiti 
##                          88                          75 
##          Holand-Netherlands                    Honduras 
##                           1                          20 
##                        Hong                     Hungary 
##                          30                          19 
##                       India                        Iran 
##                         151                          59 
##                     Ireland                       Italy 
##                          37                         105 
##                     Jamaica                       Japan 
##                         106                          92 
##                        Laos                      Mexico 
##                          23                         951 
##                   Nicaragua  Outlying-US(Guam-USVI-etc) 
##                          49                          23 
##                        Peru                 Philippines 
##                          46                         295 
##                      Poland                    Portugal 
##                          87                          67 
##                 Puerto-Rico                    Scotland 
##                         184                          21 
##                       South                      Taiwan 
##                         115                          65 
##                    Thailand             Trinadad&Tobago 
##                          30                          27 
##               United-States                     Vietnam 
##                       44689                          86 
##                  Yugoslavia 
##                          23 
## 
##                    Cambodia                      Canada 
##                        0.06                        0.37 
##                       China                    Columbia 
##                        0.25                        0.17 
##                        Cuba          Dominican-Republic 
##                        0.28                        0.21 
##                     Ecuador                 El-Salvador 
##                        0.09                        0.32 
##                     England                      France 
##                        0.26                        0.08 
##                     Germany                      Greece 
##                        0.42                        0.10 
##                   Guatemala                       Haiti 
##                        0.18                        0.15 
##          Holand-Netherlands                    Honduras 
##                        0.00                        0.04 
##                        Hong                     Hungary 
##                        0.06                        0.04 
##                       India                        Iran 
##                        0.31                        0.12 
##                     Ireland                       Italy 
##                        0.08                        0.21 
##                     Jamaica                       Japan 
##                        0.22                        0.19 
##                        Laos                      Mexico 
##                        0.05                        1.95 
##                   Nicaragua  Outlying-US(Guam-USVI-etc) 
##                        0.10                        0.05 
##                        Peru                 Philippines 
##                        0.09                        0.60 
##                      Poland                    Portugal 
##                        0.18                        0.14 
##                 Puerto-Rico                    Scotland 
##                        0.38                        0.04 
##                       South                      Taiwan 
##                        0.24                        0.13 
##                    Thailand             Trinadad&Tobago 
##                        0.06                        0.06 
##               United-States                     Vietnam 
##                       91.50                        0.18 
##                  Yugoslavia 
##                        0.05
for (var in factores) {
  cat("\nTop categorías en", var, "\n")
  # head() returns at most 5 entries; [1:5] would pad with <NA> for factors with fewer than 5 levels (e.g. Sex)
  print(head(sort(table(df[[var]]), decreasing = TRUE), 5))
}
## 
## Top categorías en Workclass 
## 
##           Private  Self-emp-not-inc         Local-gov         State-gov 
##             36705              3862              3136              1981 
##      Self-emp-inc 
##              1695 
## 
## Top categorías en Education 
## 
##       HS-grad  Some-college     Bachelors       Masters     Assoc-voc 
##         15784         10878          8025          2657          2061 
## 
## Top categorías en Education-num 
## 
##     9    10    13    14    11 
## 15784 10878  8025  2657  2061 
## 
## Top categorías en Marital-status 
## 
##  Married-civ-spouse       Never-married            Divorced           Separated 
##               22379               16117                6633                1530 
##             Widowed 
##                1518 
## 
## Top categorías en Occupation 
## 
##   Prof-specialty     Craft-repair  Exec-managerial     Adm-clerical 
##             8981             6112             6086             5611 
##            Sales 
##             5504 
## 
## Top categorías en Relationship 
## 
##        Husband  Not-in-family      Own-child      Unmarried           Wife 
##          19716          12583           7581           5125           2331 
## 
## Top categorías en Race 
## 
##               White               Black  Asian-Pac-Islander  Amer-Indian-Eskimo 
##               41762                4685                1519                 470 
##               Other 
##                 406 
## 
## Top categorías en Sex 
## 
##    Male  Female 
##   32650   16192 
## 
## Top categorías en Native-country 
## 
##  United-States         Mexico    Philippines        Germany    Puerto-Rico 
##          44689            951            295            206            184
for (var in factores) {
  p <- ggplot(df, aes(x = .data[[var]])) +
    geom_bar(fill = "steelblue") +
    labs(
      title = enc2utf8(paste("Distribución de", var)),
      x = var,
      y = "Frecuencia"
    ) +
    theme_minimal()
  
  print(p)
}

The ranking shows the mode of each variable. The plots and summaries also show that every categorical variable is imbalanced; for example, Private accounts for 75.15% of Workclass and United-States for 91.50% of Native-country. We now turn to the numeric variables.
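One way to put a number on the imbalance described here is the share of the most frequent level in each variable. The sketch below is a minimal illustration; `dominant_share` is a hypothetical helper, and the toy factor is rebuilt from the Sex counts reported above (16192 Female, 32650 Male).

```r
# Hypothetical helper: percentage of records in the most frequent level
dominant_share <- function(x) {
  round(100 * max(prop.table(table(x))), 2)
}

# Toy factor rebuilt from the Sex counts printed in this report
toy_sex <- factor(rep(c("Female", "Male"), times = c(16192, 32650)))
dominant_share(toy_sex)   # share of Male, the dominant level
```

Applied over all categorical columns (for instance with `sapply`), this gives a single figure per variable that can be compared across variables with different numbers of levels.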

Analysis of numeric variables

We start with descriptive measures for the quantitative variables. First we create a vector with their names, which makes them easier to identify and to loop over.

numericas <- names(df)[sapply(df, is.numeric)]
numericas
## [1] "Age"            "Fnlwgt"         "Capital-gain"   "Capital-loss"  
## [5] "Hours-per-week"

Next, a statistical summary:

for (var in numericas) {
  cat("\n========================\n")
  cat("Variable:", var, "\n")
  cat("========================\n")
  
  # Descriptive measures
  valores <- df[[var]]
  cat("Mínimo:", min(valores, na.rm = TRUE), "\n")
  cat("Máximo:", max(valores, na.rm = TRUE), "\n")
  cat("Media:", mean(valores, na.rm = TRUE), "\n")
  cat("Mediana:", median(valores, na.rm = TRUE), "\n")
  cat("Desviación estándar:", sd(valores, na.rm = TRUE), "\n")
  cat("Cuartiles:\n")
  print(quantile(valores, na.rm = TRUE))
}
## 
## ========================
## Variable: Age 
## ========================
## Mínimo: 17 
## Máximo: 90 
## Media: 38.64359 
## Mediana: 37 
## Desviación estándar: 13.71051 
## Cuartiles:
##   0%  25%  50%  75% 100% 
##   17   28   37   48   90 
## 
## ========================
## Variable: Fnlwgt 
## ========================
## Mínimo: 12285 
## Máximo: 1490400 
## Media: 189664.1 
## Mediana: 178144.5 
## Desviación estándar: 105604 
## Cuartiles:
##        0%       25%       50%       75%      100% 
##   12285.0  117550.5  178144.5  237642.0 1490400.0 
## 
## ========================
## Variable: Capital-gain 
## ========================
## Mínimo: 0 
## Máximo: 99999 
## Media: 1079.068 
## Mediana: 0 
## Desviación estándar: 7452.019 
## Cuartiles:
##    0%   25%   50%   75%  100% 
##     0     0     0     0 99999 
## 
## ========================
## Variable: Capital-loss 
## ========================
## Mínimo: 0 
## Máximo: 4356 
## Media: 87.50231 
## Mediana: 0 
## Desviación estándar: 403.0046 
## Cuartiles:
##   0%  25%  50%  75% 100% 
##    0    0    0    0 4356 
## 
## ========================
## Variable: Hours-per-week 
## ========================
## Mínimo: 1 
## Máximo: 99 
## Media: 40.42238 
## Mediana: 40 
## Desviación estándar: 12.39144 
## Cuartiles:
##   0%  25%  50%  75% 100% 
##    1   40   40   45   99

Then we draw a histogram of each variable to get a visual idea of its distribution, if it follows a recognizable one.

for (var in numericas) {
  p1 <- ggplot(df, aes(x = .data[[var]])) +
    geom_histogram(fill = "steelblue", color = "white") +
    labs(title = paste("Histograma de", var), x = var, y = "Frecuencia") +
    theme_minimal()
  print(p1)
}
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
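The `stat_bin()` messages above appear because no bin width was specified. One common heuristic, shown here only as a sketch and not as the choice made in this report, is the Freedman-Diaconis rule, which sets the width to 2 · IQR / n^(1/3). The helper name `fd_binwidth` and the simulated vector are assumptions for this example.

```r
# Hypothetical helper: Freedman-Diaconis bin width, 2 * IQR / n^(1/3)
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]
  2 * IQR(x) / length(x)^(1 / 3)
}

# Simulated ages in the range observed in the data (17 to 90); not the real column
set.seed(1)
toy_age <- sample(17:90, 1000, replace = TRUE)

bw <- fd_binwidth(toy_age)
bw

# With ggplot2 the value would be passed as geom_histogram(binwidth = bw).
# For columns whose IQR is 0 (Capital-gain and Capital-loss in this data set)
# the rule returns 0, so a fixed number of bins is a safer choice there.
```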

for (var in numericas) {
  p1 <- ggplot(df, aes(y = .data[[var]])) +
    geom_boxplot(fill = "orange") +
    labs(title = paste("Boxplot de", var), y = var) +
    theme_minimal()
  print(p1)
}

The boxplots reveal outliers, and in the histograms only Age looks roughly like a normal distribution. Analytical tests are therefore needed, both to assess normality and to understand the nature of these outliers.

# Apply the Lilliefors test to each numeric variable
library(moments)
library(nortest)
for (var in numericas) {
  cat("\n========================\n")
  cat("Variable:", var, "\n")
  cat("========================\n")
  
  resultado <- lillie.test(df[[var]])
  print(resultado)
}
## 
## ========================
## Variable: Age 
## ========================
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  df[[var]]
## D = 0.063157, p-value < 2.2e-16
## 
## 
## ========================
## Variable: Fnlwgt 
## ========================
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  df[[var]]
## D = 0.088684, p-value < 2.2e-16
## 
## 
## ========================
## Variable: Capital-gain 
## ========================
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  df[[var]]
## D = 0.47495, p-value < 2.2e-16
## 
## 
## ========================
## Variable: Capital-loss 
## ========================
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  df[[var]]
## D = 0.53922, p-value < 2.2e-16
## 
## 
## ========================
## Variable: Hours-per-week 
## ========================
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  df[[var]]
## D = 0.24712, p-value < 2.2e-16
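
With roughly 48,842 observations, a normality test has enough power to reject even small departures from normality, so the p-values alone do not say how far each variable is from normal. A minimal sketch of a descriptive complement, using base R and synthetic data (the helper names `skew` and `exkurt` are illustrative assumptions, not functions from the original analysis):

```r
# Moment-based sample skewness and excess kurtosis.
skew <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
exkurt <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

set.seed(42)
normal_sim <- rnorm(5000)  # symmetric: skewness close to 0
right_sim <- rexp(5000)    # right-skewed: theoretical skewness is 2

print(round(c(normal = skew(normal_sim), exponential = skew(right_sim)), 2))
```

The `moments` package loaded above provides `skewness()` and `kurtosis()` for the same purpose.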

We conclude that none of the variables is normally distributed: the test rejects normality for all five (p < 2.2e-16 in every case). Next, we analyze outliers using the interquartile range (IQR) criterion.

for (var in numericas) {
  cat("\n========================\n")
  cat("Variable:", var, "\n")
  cat("========================\n")
  
  x <- df[[var]]
  n_total <- sum(!is.na(x))  # number of non-NA values
  outliers <- boxplot.stats(x)$out
  
  if (length(outliers) > 0) {
    porcentaje <- (length(outliers) / n_total) * 100
    cat("Cantidad de outliers:", length(outliers), "\n")
    cat("Porcentaje de outliers:", round(porcentaje, 2), "%\n")
  } else {
    cat("No se detectaron valores atípicos.\n")
  }
}
## 
## ========================
## Variable: Age 
## ========================
## Cantidad de outliers: 216 
## Porcentaje de outliers: 0.44 %
## 
## ========================
## Variable: Fnlwgt 
## ========================
## Cantidad de outliers: 1453 
## Porcentaje de outliers: 2.97 %
## 
## ========================
## Variable: Capital-gain 
## ========================
## Cantidad de outliers: 4035 
## Porcentaje de outliers: 8.26 %
## 
## ========================
## Variable: Capital-loss 
## ========================
## Cantidad de outliers: 2282 
## Porcentaje de outliers: 4.67 %
## 
## ========================
## Variable: Hours-per-week 
## ========================
## Cantidad de outliers: 13496 
## Porcentaje de outliers: 27.63 %

for (var in numericas) {
  cat("\n========================\n")
  cat("Variable:", var, "\n")
  cat("========================\n")
  
  x <- df[[var]]
  Q1 <- quantile(x, 0.25, na.rm = TRUE)
  Q3 <- quantile(x, 0.75, na.rm = TRUE)
  IQR_value <- Q3 - Q1
  
  limite_inferior <- Q1 - 1.5 * IQR_value
  limite_superior <- Q3 + 1.5 * IQR_value
  
  # Exclude NAs explicitly so they are not counted as outliers
  outliers_bajos <- x[!is.na(x) & x < limite_inferior]
  outliers_altos <- x[!is.na(x) & x > limite_superior]
  
  cat("Outliers bajos:", length(outliers_bajos), "\n")
  cat("Outliers altos:", length(outliers_altos), "\n")
}
## 
## ========================
## Variable: Age 
## ========================
## Outliers bajos: 0 
## Outliers altos: 216 
## 
## ========================
## Variable: Fnlwgt 
## ========================
## Outliers bajos: 0 
## Outliers altos: 1453 
## 
## ========================
## Variable: Capital-gain 
## ========================
## Outliers bajos: 0 
## Outliers altos: 4035 
## 
## ========================
## Variable: Capital-loss 
## ========================
## Outliers bajos: 0 
## Outliers altos: 2282 
## 
## ========================
## Variable: Hours-per-week 
## ========================
## Outliers bajos: 8286 
## Outliers altos: 5210
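
The large share of outliers in Hours-per-week follows directly from its narrow interquartile range. Using the quartiles reported in the summary earlier (25% = 40, 75% = 45), the fences can be checked by hand:

```r
# Quartiles of Hours-per-week as printed in the descriptive summary
q1 <- 40
q3 <- 45
iqr <- q3 - q1                 # 5

lower_fence <- q1 - 1.5 * iqr  # 32.5
upper_fence <- q3 + 1.5 * iqr  # 52.5
print(c(lower = lower_fence, upper = upper_fence))
```

Under this rule any week shorter than 32.5 hours or longer than 52.5 hours is flagged, which likely includes ordinary part-time and overtime schedules and not only anomalous records.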

We now treat the outliers:

  • Age, Fnlwgt and Capital-loss each have fewer than 5% outliers, so no cleaning is needed.

  • Capital-gain has a moderate share of outliers (8.26%) that a simple method can handle, so we impute them with the median (the data are not normal, which makes the median preferable to the mean).

  • Hours-per-week has a considerable share of outliers (27.63%), so advanced imputation techniques will be applied.

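# Assumes the column is named `Capital-gain`, matching the label printed in the summaries above
mediana_capital_gain <- median(df$`Capital-gain`, na.rm = TRUE)
outliers_capital_gain <- boxplot.stats(df$`Capital-gain`)$out
df$`Capital-gain`[df$`Capital-gain` %in% outliers_capital_gain] <- mediana_capital_gain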
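
One consequence of this step is worth checking. The quartiles of Capital-gain reported earlier are all 0, so the IQR is 0 and every non-zero value is flagged as an outlier; replacing those values with the median (also 0) leaves a constant column. The sketch below reproduces the effect on a small hypothetical vector, not on the real data:

```r
# Toy vector shaped like Capital-gain: mostly zeros plus a few large values
gain_toy <- c(rep(0, 18), 2174, 14084)

flagged <- boxplot.stats(gain_toy)$out   # values beyond the 1.5 * IQR fences
imputed <- gain_toy
imputed[imputed %in% flagged] <- median(gain_toy)

print(flagged)
print(unique(imputed))
```

If the non-zero gains carry information needed later, a binary indicator such as `gain_toy > 0` might be one way to keep it; that choice is an assumption outside what the original analysis specifies.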

Bivariate analysis

We now carry out a bivariate analysis of the dataset, comparing variables pairwise to determine whether they are related.

Bivariate analysis of numeric variables

We compute a correlation matrix to measure the correlation between the numeric variables.

library(dplyr)
## 
## Adjuntando el paquete: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
Var_num <- df %>% 
  select(where(is.numeric))
cor_matrix <- cor(Var_num, use = "complete.obs")
library(reshape2)
cor_df <- melt(cor_matrix)
ggplot(cor_df, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limit = c(-1, 1), space = "Lab",
                       name = "Correlación") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 12, hjust = 1)) +
  coord_fixed() +
  ggtitle("Matriz de correlación (Heatmap)")

The heatmap shows that no pair of numeric variables is highly correlated, so there is no strong linear relationship among them. We will use scatterplots to check whether there is a non-linear relationship between the variables.
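
Pearson correlation only measures linear association. Alongside scatterplots, a rank-based coefficient such as Spearman's can flag relationships that are monotone but not linear. A minimal sketch on synthetic data (the variables `x` and `y` are hypothetical):

```r
set.seed(7)
x <- runif(500, min = 0, max = 5)
y <- exp(x)  # monotone in x, but strongly non-linear

pearson <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")
print(round(c(pearson = pearson, spearman = spearman), 3))
```

On the real data the equivalent call would be `cor(Var_num, method = "spearman", use = "complete.obs")`.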

Bivariate analysis of categorical variables

We build a contingency table for each pair of categorical variables and apply the chi-squared test of independence to determine whether the association between them is statistically significant.
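
The loop below calls `chisq.test()` with `simulate.p.value = TRUE`, which estimates the p-value by Monte Carlo simulation using `B = 2000` replicates by default. The reported value is (1 + number of simulated statistics at least as large as the observed one) / (B + 1), so it can never fall below 1/2001 ≈ 0.0004997501. A p-value equal to that floor means that no simulated table was as extreme as the observed one, not that the true p-value is exactly that number. A small sketch on a hypothetical 2x2 table:

```r
# Hypothetical contingency table with a very strong association
table_toy <- matrix(c(90, 10,
                      10, 90), nrow = 2, byrow = TRUE)

set.seed(123)
p_2000 <- chisq.test(table_toy, simulate.p.value = TRUE)$p.value              # default B = 2000
p_10000 <- chisq.test(table_toy, simulate.p.value = TRUE, B = 10000)$p.value  # lower floor

print(c(floor_2000 = 1 / 2001, p_2000 = p_2000, p_10000 = p_10000))
```

Raising `B` lowers the floor at the cost of a longer run time, which matters when many pairs of variables are tested.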

if(length(factores) >= 2){
  comb_cat <- combn(factores, 2, simplify = FALSE)
  
  for (par in comb_cat) {
    tabla <- table(df[[par[1]]], df[[par[2]]])
    cat("\nTabla de contingencia entre", par[1], "y", par[2], ":\n")
    print(tabla)
    if(min(dim(tabla)) > 1){ # Avoids an error when a variable has only one level
      chis <- chisq.test(tabla,simulate.p.value = TRUE)
      cat("chi cuadrado p-valor:", chis$p.value, "\n")
    }
  }
}
## 
## Tabla de contingencia entre Workclass y Education :
##                    
##                      10th  11th  12th  1st-4th  5th-6th  7th-8th   9th
##    Federal-gov         15    14     8        1        1        4     6
##    Local-gov           52    61    25        5       13       40    31
##    Never-worked         2     3     0        0        0        1     0
##    Private           1170  1590   570      221      450      734   637
##    Self-emp-inc        27    23    13        2        8       20    13
##    Self-emp-not-inc   104   106    30       16       33      138    59
##    State-gov           19    15    11        2        4       16    10
##    Without-pay          0     0     0        0        0        2     0
##                    
##                      Assoc-acdm  Assoc-voc  Bachelors  Doctorate  HS-grad
##    Federal-gov               81         61        313         22      395
##    Local-gov                129        124        700         35      761
##    Never-worked               0          0          0          0        2
##    Private                 1157       1571       5556        281    12492
##    Self-emp-inc              58         59        418         55      426
##    Self-emp-not-inc         111        171        607         76     1279
##    State-gov                 63         75        431        125      415
##    Without-pay                2          0          0          0       14
##                    
##                      Masters  Preschool  Prof-school  Some-college
##    Federal-gov           110          0           38           363
##    Local-gov             526          4           41           589
##    Never-worked            0          0            0             2
##    Private              1443         73          385          8375
##    Self-emp-inc          112          0          129           332
##    Self-emp-not-inc      207          5          196           724
##    State-gov             259          1           45           490
##    Without-pay             0          0            0             3
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Education-num :
##                    
##                         1     2     3     4     5     6     7     8     9    10
##    Federal-gov          0     1     1     4     6    15    14     8   395   363
##    Local-gov            4     5    13    40    31    52    61    25   761   589
##    Never-worked         0     0     0     1     0     2     3     0     2     2
##    Private             73   221   450   734   637  1170  1590   570 12492  8375
##    Self-emp-inc         0     2     8    20    13    27    23    13   426   332
##    Self-emp-not-inc     5    16    33   138    59   104   106    30  1279   724
##    State-gov            1     2     4    16    10    19    15    11   415   490
##    Without-pay          0     0     0     2     0     0     0     0    14     3
##                    
##                        11    12    13    14    15    16
##    Federal-gov         61    81   313   110    38    22
##    Local-gov          124   129   700   526    41    35
##    Never-worked         0     0     0     0     0     0
##    Private           1571  1157  5556  1443   385   281
##    Self-emp-inc        59    58   418   112   129    55
##    Self-emp-not-inc   171   111   607   207   196    76
##    State-gov           75    63   431   259    45   125
##    Without-pay          0     2     0     0     0     0
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Marital-status :
##                    
##                      Divorced  Married-AF-spouse  Married-civ-spouse
##    Federal-gov            238                  3                 721
##    Local-gov              529                  0                1536
##    Never-worked             1                  0                   1
##    Private               4971                 29               15400
##    Self-emp-inc           146                  0                1264
##    Self-emp-not-inc       432                  3                2554
##    State-gov              316                  2                 890
##    Without-pay              0                  0                  13
##                    
##                      Married-spouse-absent  Never-married  Separated  Widowed
##    Federal-gov                          15            368         39       48
##    Local-gov                            33            798        100      140
##    Never-worked                          1              7          0        0
##    Private                             497          13478       1216     1114
##    Self-emp-inc                          8            211         25       41
##    Self-emp-not-inc                     48            613         85      127
##    State-gov                            25            636         65       47
##    Without-pay                           1              6          0        1
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Occupation :
##                    
##                      Adm-clerical  Armed-Forces  Craft-repair  Exec-managerial
##    Federal-gov                487            15            93              268
##    Local-gov                  421             0           211              331
##    Never-worked                 0             0             0                0
##    Private                   4208             0          4748             3995
##    Self-emp-inc                47             0           167              617
##    Self-emp-not-inc            70             0           798              587
##    State-gov                  375             0            94              287
##    Without-pay                  3             0             1                1
##                    
##                      Farming-fishing  Handlers-cleaners  Machine-op-inspct
##    Federal-gov                     9                 36                 19
##    Local-gov                      43                 65                 24
##    Never-worked                    0                  0                  0
##    Private                       670               1923               2882
##    Self-emp-inc                   82                  6                 17
##    Self-emp-not-inc              653                 21                 59
##    State-gov                      25                 19                 19
##    Without-pay                     8                  2                  2
##                    
##                      Other-service  Priv-house-serv  Prof-specialty
##    Federal-gov                  55                0             253
##    Local-gov                   300                0            1061
##    Never-worked                  0                0              10
##    Private                    4057              242            6208
##    Self-emp-inc                 42                0             245
##    Self-emp-not-inc            276                0             575
##    State-gov                   191                0             629
##    Without-pay                   2                0               0
##                    
##                      Protective-serv  Sales  Tech-support  Transport-moving
##    Federal-gov                    47     17            96                37
##    Local-gov                     450     16            58               156
##    Never-worked                    0      0             0                 0
##    Private                       299   4439          1154              1880
##    Self-emp-inc                    5    420             9                38
##    Self-emp-not-inc                7    591            42               183
##    State-gov                     175     20            87                60
##    Without-pay                     0      1             0                 1
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Relationship :
##                    
##                      Husband  Not-in-family  Other-relative  Own-child
##    Federal-gov           658            398              31         99
##    Local-gov            1288            791              64        331
##    Never-worked            0              1               1          7
##    Private             13457           9781            1302       6551
##    Self-emp-inc         1189            253              14         93
##    Self-emp-not-inc     2343            773              61        244
##    State-gov             773            586              33        248
##    Without-pay             8              0               0          8
##                    
##                      Unmarried  Wife
##    Federal-gov             185    61
##    Local-gov               436   226
##    Never-worked              0     1
##    Private                3938  1676
##    Self-emp-inc             78    68
##    Self-emp-not-inc        250   191
##    State-gov               236   105
##    Without-pay               2     3
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Race :
##                    
##                      Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other
##    Federal-gov                       33                  64    255     11
##    Local-gov                         65                  59    437     16
##    Never-worked                       0                   0      3      0
##    Private                          313                1139   3575    343
##    Self-emp-inc                       2                  64     38      6
##    Self-emp-not-inc                  34                 100    136     16
##    State-gov                         23                  92    240     14
##    Without-pay                        0                   1      1      0
##                    
##                      White
##    Federal-gov        1069
##    Local-gov          2559
##    Never-worked          7
##    Private           31335
##    Self-emp-inc       1585
##    Self-emp-not-inc   3576
##    State-gov          1612
##    Without-pay          19
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Sex :
##                    
##                      Female  Male
##    Federal-gov          452   980
##    Local-gov           1258  1878
##    Never-worked           3     7
##    Private            12869 23836
##    Self-emp-inc         211  1484
##    Self-emp-not-inc     629  3233
##    State-gov            763  1218
##    Without-pay            7    14
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Workclass y Native-country :
##                    
##                      Cambodia  Canada  China  Columbia  Cuba
##    Federal-gov              1       4      2         2     3
##    Local-gov                0      10      3         2     8
##    Never-worked             0       0      0         0     0
##    Private                 24     132     94        71   100
##    Self-emp-inc             0      13      5         0    12
##    Self-emp-not-inc         2      19      5         8    15
##    State-gov                1       4     13         2     0
##    Without-pay              0       0      0         0     0
##                    
##                      Dominican-Republic  Ecuador  El-Salvador  England  France
##    Federal-gov                        0        0            2        4       1
##    Local-gov                          4        1            5        6       4
##    Never-worked                       0        0            0        0       0
##    Private                           94       39          143       98      27
##    Self-emp-inc                       2        0            1        2       2
##    Self-emp-not-inc                   3        3            4       13       3
##    State-gov                          0        2            0        4       1
##    Without-pay                        0        0            0        0       0
##                    
##                      Germany  Greece  Guatemala  Haiti  Holand-Netherlands
##    Federal-gov             8       0          1      2                   0
##    Local-gov              15       1          4      4                   0
##    Never-worked            0       0          0      0                   0
##    Private               150      29         80     62                   1
##    Self-emp-inc            8      10          0      2                   0
##    Self-emp-not-inc       15       9          3      3                   0
##    State-gov              10       0          0      2                   0
##    Without-pay             0       0          0      0                   0
##                    
##                      Honduras  Hong  Hungary  India  Iran  Ireland  Italy
##    Federal-gov              0     0        0      4     2        0      1
##    Local-gov                1     1        0      3     2        0      5
##    Never-worked             0     0        0      0     0        0      0
##    Private                 14    26       12    106    39       33     77
##    Self-emp-inc             2     1        1     10     2        1      6
##    Self-emp-not-inc         1     0        6     11    13        3     14
##    State-gov                2     2        0     17     1        0      2
##    Without-pay              0     0        0      0     0        0      0
##                    
##                      Jamaica  Japan  Laos  Mexico  Nicaragua
##    Federal-gov             1      3     1       4          1
##    Local-gov               7      2     1      21          3
##    Never-worked            0      0     0       0          0
##    Private                90     72    21     864         41
##    Self-emp-inc            1      4     0       9          0
##    Self-emp-not-inc        4      8     0      47          3
##    State-gov               3      3     0       6          1
##    Without-pay             0      0     0       0          0
##                    
##                      Outlying-US(Guam-USVI-etc)  Peru  Philippines  Poland
##    Federal-gov                                0     1           19       3
##    Local-gov                                  1     1           16       3
##    Never-worked                               0     0            0       0
##    Private                                   20    42          240      72
##    Self-emp-inc                               0     0            2       1
##    Self-emp-not-inc                           1     2            8       5
##    State-gov                                  1     0            9       3
##    Without-pay                                0     0            1       0
##                    
##                      Portugal  Puerto-Rico  Scotland  South  Taiwan  Thailand
##    Federal-gov              1           10         0      2       0         0
##    Local-gov                2           13         0      0       2         1
##    Never-worked             0            0         0      0       0         0
##    Private                 55          151        18     72      47        21
##    Self-emp-inc             4            1         0     12       8         4
##    Self-emp-not-inc         4            5         2     28       1         3
##    State-gov                1            4         1      1       7         1
##    Without-pay              0            0         0      0       0         0
##                    
##                      Trinadad&Tobago  United-States  Vietnam  Yugoslavia
##    Federal-gov                     1           1346        2           0
##    Local-gov                       2           2977        4           1
##    Never-worked                    0             10        0           0
##    Private                        20          33320       71          17
##    Self-emp-inc                    2           1565        1           1
##    Self-emp-not-inc                1           3576        7           4
##    State-gov                       1           1875        1           0
##    Without-pay                     0             20        0           0
## chi cuadrado p-valor: 0.005497251 
## 
## Tabla de contingencia entre Education y Education-num :
##                
##                     1     2     3     4     5     6     7     8     9    10
##    10th             0     0     0     0     0  1389     0     0     0     0
##    11th             0     0     0     0     0     0  1812     0     0     0
##    12th             0     0     0     0     0     0     0   657     0     0
##    1st-4th          0   247     0     0     0     0     0     0     0     0
##    5th-6th          0     0   509     0     0     0     0     0     0     0
##    7th-8th          0     0     0   955     0     0     0     0     0     0
##    9th              0     0     0     0   756     0     0     0     0     0
##    Assoc-acdm       0     0     0     0     0     0     0     0     0     0
##    Assoc-voc        0     0     0     0     0     0     0     0     0     0
##    Bachelors        0     0     0     0     0     0     0     0     0     0
##    Doctorate        0     0     0     0     0     0     0     0     0     0
##    HS-grad          0     0     0     0     0     0     0     0 15784     0
##    Masters          0     0     0     0     0     0     0     0     0     0
##    Preschool       83     0     0     0     0     0     0     0     0     0
##    Prof-school      0     0     0     0     0     0     0     0     0     0
##    Some-college     0     0     0     0     0     0     0     0     0 10878
##                
##                    11    12    13    14    15    16
##    10th             0     0     0     0     0     0
##    11th             0     0     0     0     0     0
##    12th             0     0     0     0     0     0
##    1st-4th          0     0     0     0     0     0
##    5th-6th          0     0     0     0     0     0
##    7th-8th          0     0     0     0     0     0
##    9th              0     0     0     0     0     0
##    Assoc-acdm       0  1601     0     0     0     0
##    Assoc-voc     2061     0     0     0     0     0
##    Bachelors        0     0  8025     0     0     0
##    Doctorate        0     0     0     0     0   594
##    HS-grad          0     0     0     0     0     0
##    Masters          0     0     0  2657     0     0
##    Preschool        0     0     0     0     0     0
##    Prof-school      0     0     0     0   834     0
##    Some-college     0     0     0     0     0     0
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Education y Marital-status :
##                
##                  Divorced  Married-AF-spouse  Married-civ-spouse
##    10th               172                  1                 525
##    11th               192                  0                 545
##    12th                63                  0                 199
##    1st-4th             17                  0                 125
##    5th-6th             31                  0                 271
##    7th-8th            101                  0                 541
##    9th                 98                  0                 349
##    Assoc-acdm         280                  2                 697
##    Assoc-voc          361                  2                1013
##    Bachelors          843                  6                4136
##    Doctorate           56                  1                 403
##    HS-grad           2416                 15                7243
##    Masters            367                  0                1527
##    Preschool            2                  0                  30
##    Prof-school         74                  1                 596
##    Some-college      1560                  9                4179
##                
##                  Married-spouse-absent  Never-married  Separated  Widowed
##    10th                             22            525         75       69
##    11th                             25            913         79       58
##    12th                             10            348         19       18
##    1st-4th                          18             50         12       25
##    5th-6th                          27            128         31       21
##    7th-8th                          20            158         39       96
##    9th                              13            220         41       35
##    Assoc-acdm                       21            522         43       36
##    Assoc-voc                        19            539         64       63
##    Bachelors                        98           2681        136      125
##    Doctorate                        13             97         11       13
##    HS-grad                         199           4671        607      633
##    Masters                          24            635         44       60
##    Preschool                         5             38          3        5
##    Prof-school                       5            139          9       10
##    Some-college                    109           4453        317      251
## chi cuadrado p-valor: 0.0004997501 
## 
## Tabla de contingencia entre Education y Occupation :
##                
##                  Adm-clerical  Armed-Forces  Craft-repair  Exec-managerial
##    10th                    59             0           239               42
##    11th                   100             0           270               51
##    12th                    52             1            92               18
##    1st-4th                  6             0            28                6
##    5th-6th                  8             0            71                6
##    7th-8th                 20             0           172               28
##    9th                     20             0           144               23
##    Assoc-acdm             281             0           167              240
##    Assoc-voc              269             1           375              234
##    Bachelors              765             1           332             2025
##    Doctorate                6             0             4               84
##    HS-grad               2047             5          2911             1192
##    Masters                105             2            34              779
##    Preschool                3             0             6                1
##    Prof-school             12             1             9               69
##    Some-college          1858             4          1258             1288
##                
##                  Farming-fishing  Handlers-cleaners  Machine-op-inspct
##    10th                       71                108                152
##    11th                       67                177                153
##    12th                       29                 55                 61
##    1st-4th                    33                 26                 36
##    5th-6th                    52                 59                 95
##    7th-8th                   106                 66                129
##    9th                        44                 72                102
##    Assoc-acdm                 25                 34                 51
##    Assoc-voc                  85                 43                 95
##    Bachelors                 113                 79                 99
##    Doctorate                   1                  0                  1
##    HS-grad                   573                943               1531
##    Masters                    14                  5                 12
##    Preschool                  17                  5                 12
##    Prof-school                 7                  0                  1
##    Some-college              253                400                492
##                
##                  Other-service  Priv-house-serv  Prof-specialty
##    10th                    280                8             164
##    11th                    368               18             215
##    12th                    129                8              71
##    1st-4th                  55               14              22
##    5th-6th                  98               20              43
##    7th-8th                 149               17             124
##    9th                     142               16              73
##    Assoc-acdm              110                3             279
##    Assoc-voc               160                5             328
##    Bachelors               259               12            2486
##    Doctorate                 2                1             468
##    HS-grad                1936               91            1156
##    Masters                  35                1            1369
##    Preschool                22                2              11
##    Prof-school               7                0             691
##    Some-college           1171               26            1481
##                
##                  Protective-serv  Sales  Tech-support  Transport-moving
##    10th                       12    120             5               129
##    11th                       18    232             9               134
##    12th                       11     70             4                56
##    1st-4th                     1      8             0                12
##    5th-6th                     1     17             1                38
##    7th-8th                    11     40             6                87
##    9th                         9     47             3                61
##    Assoc-acdm                 50    209           116                36
##    Assoc-voc                  67    163           181                55
##    Bachelors                 147   1268           346                93
##    Doctorate                   1     16             8                 2
##    HS-grad                   326   1580           270              1223
##    Masters                    20    206            61                14
##    Preschool                   0      2             0                 2
##    Prof-school                 1     23            10                 3
##    Some-college              308   1503           426               410
## chi-square p-value: 0.0004997501
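A note on reading these p-values: every pair in this output reports the same value, 0.0004997501, which equals 1/2001. That would be consistent with a Monte Carlo chi-square test, since `chisq.test()` with `simulate.p.value = TRUE` and the default `B = 2000` replicates cannot return a p-value below 1/(B + 1). The call that produced the output is not shown in this section, so this is an assumption rather than a confirmed description of the report's code. A minimal sketch, using three rows of the Education by Sex table further down as a toy input:

```r
# Assumption: the report's p-values came from a simulated chi-square test
# with the default number of replicates (B = 2000). This is not confirmed
# by the section itself.
B <- 2000

# Smallest p-value a Monte Carlo test with B replicates can report:
# (1 + 0) / (B + 1), i.e. no simulated statistic reached the observed one.
p_floor <- 1 / (B + 1)
print(p_floor)

# Toy table: three rows copied from the Education x Sex counts in this report.
tab <- matrix(
  c(457,  932,
    2477, 5548,
    113,  481),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    Education = c("10th", "Bachelors", "Doctorate"),
    Sex       = c("Female", "Male")
  )
)

set.seed(1)  # the simulated p-value depends on the random replicates
res <- chisq.test(tab, simulate.p.value = TRUE, B = B)

# A result sitting exactly on the floor should be read as "p <= 1/(B + 1)",
# not as an exact probability.
at_floor <- isTRUE(all.equal(res$p.value, p_floor))
print(res$p.value)
print(at_floor)
```

If that assumption holds, the identical p-values only say that each association is significant at the resolution of the simulation. They do not rank the pairs by strength; an effect-size measure such as Cramér's V, or a larger `B`, would be needed for that.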
## 
## Contingency table between Education and Relationship:
##                
##                  Husband  Not-in-family  Other-relative  Own-child  Unmarried
##    10th              471            303              62        326        181
##    11th              479            332              90        651        216
##    12th              166            128              38        239         65
##    1st-4th           111             67              23         11         29
##    5th-6th           238            105              61         19         65
##    7th-8th           503            208              46         66        103
##    9th               305            160              44        108        104
##    Assoc-acdm        581            483              30        198        201
##    Assoc-voc         883            536              38        212        274
##    Bachelors        3636           2503             132        774        516
##    Doctorate         377            143               4         11         33
##    HS-grad          6388           3841             607       2287       1937
##    Masters          1339            825              19         82        212
##    Preschool          23             38               6          7          6
##    Prof-school       558            182               7         17         32
##    Some-college     3658           2729             299       2573       1151
##                
##                  Wife
##    10th            46
##    11th            44
##    12th            21
##    1st-4th          6
##    5th-6th         21
##    7th-8th         29
##    9th             35
##    Assoc-acdm     108
##    Assoc-voc      118
##    Bachelors      464
##    Doctorate       26
##    HS-grad        724
##    Masters        180
##    Preschool        3
##    Prof-school     38
##    Some-college   468
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education and Race:
##                
##                  Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other  White
##    10th                          22                  16    182     11   1158
##    11th                          26                  27    252     22   1485
##    12th                           5                  15    105     17    515
##    1st-4th                        4                  10     24     13    196
##    5th-6th                        2                  28     41     23    415
##    7th-8th                       10                  14     90     23    818
##    9th                            9                  10    111     15    611
##    Assoc-acdm                    13                  49    161     10   1368
##    Assoc-voc                     31                  53    165      9   1803
##    Bachelors                     29                 408    504     50   7034
##    Doctorate                      3                  46     16      3    526
##    HS-grad                      176                 336   1780    105  13387
##    Masters                       13                 140    143     13   2348
##    Preschool                      1                   7     12      2     61
##    Prof-school                    2                  58     21      5    748
##    Some-college                 124                 302   1078     85   9289
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education and Sex:
##                
##                  Female  Male
##    10th             457   932
##    11th             650  1162
##    12th             211   446
##    1st-4th           61   186
##    5th-6th          127   382
##    7th-8th          239   716
##    9th              220   536
##    Assoc-acdm       627   974
##    Assoc-voc        734  1327
##    Bachelors       2477  5548
##    Doctorate        113   481
##    HS-grad         5097 10687
##    Masters          845  1812
##    Preschool         24    59
##    Prof-school      132   702
##    Some-college    4178  6700
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education and Native-country:
##                
##                  Cambodia  Canada  China  Columbia  Cuba  Dominican-Republic
##    10th                 0       2      3         4     4                   4
##    11th                 0       5      0         2     2                   4
##    12th                 1       3      1         3     5                   8
##    1st-4th              1       0      1         2     5                  10
##    5th-6th              0       1      2         2     8                   5
##    7th-8th              1       5      5         2     7                  12
##    9th                  0       3      2         4     6                   7
##    Assoc-acdm           0       4      0         4     4                   1
##    Assoc-voc            2      11      0         5     6                   2
##    Bachelors            5      38     29         6    19                   6
##    Doctorate            0      11     13         1     2                   0
##    HS-grad              9      51     28        34    33                  26
##    Masters              0      11     24         1     9                   3
##    Preschool            1       0      2         0     0                   2
##    Prof-school          0       3      3         4     6                   0
##    Some-college         8      34      9        11    22                  13
##                
##                  Ecuador  El-Salvador  England  France  Germany  Greece
##    10th                2            6        4       0        2       3
##    11th                1            7        2       0        5       1
##    12th                2            2        1       0        4       0
##    1st-4th             1           13        1       0        0       0
##    5th-6th             2           29        1       0        0       1
##    7th-8th             2            6        0       1        2       3
##    9th                 0           15        1       0        0       0
##    Assoc-acdm          0            4        6       3       11       0
##    Assoc-voc           2            1        3       1       16       5
##    Bachelors           5            8       34      10       50       5
##    Doctorate           0            1        7       3        6       0
##    HS-grad            14           34       37       6       47      17
##    Masters             3            3       12       7        8       5
##    Preschool           0            7        0       0        0       0
##    Prof-school         0            1        3       1        5       1
##    Some-college       11           18       15       6       50       8
##                
##                  Guatemala  Haiti  Holand-Netherlands  Honduras  Hong  Hungary
##    10th                  6      3                   0         1     0        0
##    11th                  7      3                   0         2     1        0
##    12th                  3      2                   0         0     0        0
##    1st-4th               9      1                   0         1     0        0
##    5th-6th              12      3                   0         3     1        1
##    7th-8th              11      2                   0         0     1        0
##    9th                   6      4                   0         0     2        0
##    Assoc-acdm            1      4                   0         0     2        1
##    Assoc-voc             2      0                   0         0     1        1
##    Bachelors             3      5                   0         2     6        5
##    Doctorate             0      0                   0         0     2        0
##    HS-grad              17     24                   0         5     7        6
##    Masters               0      2                   0         0     5        2
##    Preschool             2      4                   0         0     1        0
##    Prof-school           0      1                   0         1     0        1
##    Some-college          9     17                   1         5     1        2
##                
##                  India  Iran  Ireland  Italy  Jamaica  Japan  Laos  Mexico
##    10th              2     0        0      3        2      1     0      42
##    11th              8     0        2      1        5      0     1      40
##    12th              0     0        0      7        2      0     0      23
##    1st-4th           0     0        0      4        0      0     0     103
##    5th-6th           0     0        0     12        1      1     3     216
##    7th-8th           0     0        2      7        1      0     1      79
##    9th               1     0        1      0        2      0     0      76
##    Assoc-acdm        5     5        1      4        8      2     1       4
##    Assoc-voc         3     2        2      2        3      2     2      13
##    Bachelors        37    18        8     14       10     26     3      43
##    Doctorate        10     5        0      1        2      2     0       1
##    HS-grad           9     7       17     32       38     23     6     178
##    Masters          36    10        1      6        4     11     0       9
##    Preschool         1     0        0      0        1      0     1      31
##    Prof-school      22     0        0      2        0      5     1       2
##    Some-college     17    12        3     10       27     19     4      91
##                
##                  Nicaragua  Outlying-US(Guam-USVI-etc)  Peru  Philippines
##    10th                  1                           0     2            1
##    11th                  3                           1     3            9
##    12th                  1                           0     0            1
##    1st-4th               1                           0     0            5
##    5th-6th               3                           0     0           10
##    7th-8th               1                           0     2            3
##    9th                   1                           2     0            3
##    Assoc-acdm            1                           1     2           11
##    Assoc-voc             1                           1     1            9
##    Bachelors             3                           5     8          105
##    Doctorate             1                           0     0            0
##    HS-grad              15                           4    16           53
##    Masters               3                           0     2           11
##    Preschool             1                           0     0            2
##    Prof-school           0                           0     0           14
##    Some-college         13                           9    10           58
##                
##                  Poland  Portugal  Puerto-Rico  Scotland  South  Taiwan
##    10th               1         6            4         0      0       0
##    11th               3         2           18         1      3       0
##    12th               0         1            3         0      2       0
##    1st-4th            1         9            5         0      0       0
##    5th-6th            0         1            7         1      0       0
##    7th-8th            5        11           14         0      1       0
##    9th                1         6            9         0      0       0
##    Assoc-acdm         3         1            3         1      1       1
##    Assoc-voc          7         2            3         0      3       0
##    Bachelors         13         2           17         4     30      22
##    Doctorate          1         0            0         0      2      11
##    HS-grad           28        24           68         8     37       5
##    Masters            8         1            1         2      9      16
##    Preschool          0         0            1         0      1       0
##    Prof-school        0         0            0         0      2       4
##    Some-college      16         1           31         4     24       6
##                
##                  Thailand  Trinadad&Tobago  United-States  Vietnam  Yugoslavia
##    10th                 0                0           1276        3           1
##    11th                 0                1           1666        3           0
##    12th                 1                1            579        1           0
##    1st-4th              1                0             70        3           0
##    5th-6th              0                1            178        3           1
##    7th-8th              0                1            765        1           1
##    9th                  0                3            599        1           1
##    Assoc-acdm           3                1           1492        3           2
##    Assoc-voc            2                0           1941        3           1
##    Bachelors            6                0           7394       17           4
##    Doctorate            1                0            510        1           0
##    HS-grad             10                9          14768       25           9
##    Masters              2                3           2427        0           0
##    Preschool            0                0             25        0           0
##    Prof-school          1                0            750        1           0
##    Some-college         3                7          10249       21           3
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Marital-status:
##     
##       Divorced  Married-AF-spouse  Married-civ-spouse  Married-spouse-absent
##   1          2                  0                  30                      5
##   2         17                  0                 125                     18
##   3         31                  0                 271                     27
##   4        101                  0                 541                     20
##   5         98                  0                 349                     13
##   6        172                  1                 525                     22
##   7        192                  0                 545                     25
##   8         63                  0                 199                     10
##   9       2416                 15                7243                    199
##   10      1560                  9                4179                    109
##   11       361                  2                1013                     19
##   12       280                  2                 697                     21
##   13       843                  6                4136                     98
##   14       367                  0                1527                     24
##   15        74                  1                 596                      5
##   16        56                  1                 403                     13
##     
##       Never-married  Separated  Widowed
##   1              38          3        5
##   2              50         12       25
##   3             128         31       21
##   4             158         39       96
##   5             220         41       35
##   6             525         75       69
##   7             913         79       58
##   8             348         19       18
##   9            4671        607      633
##   10           4453        317      251
##   11            539         64       63
##   12            522         43       36
##   13           2681        136      125
##   14            635         44       60
##   15            139          9       10
##   16             97         11       13
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Occupation:
##     
##       Adm-clerical  Armed-Forces  Craft-repair  Exec-managerial
##   1              3             0             6                1
##   2              6             0            28                6
##   3              8             0            71                6
##   4             20             0           172               28
##   5             20             0           144               23
##   6             59             0           239               42
##   7            100             0           270               51
##   8             52             1            92               18
##   9           2047             5          2911             1192
##   10          1858             4          1258             1288
##   11           269             1           375              234
##   12           281             0           167              240
##   13           765             1           332             2025
##   14           105             2            34              779
##   15            12             1             9               69
##   16             6             0             4               84
##     
##       Farming-fishing  Handlers-cleaners  Machine-op-inspct  Other-service
##   1                17                  5                 12             22
##   2                33                 26                 36             55
##   3                52                 59                 95             98
##   4               106                 66                129            149
##   5                44                 72                102            142
##   6                71                108                152            280
##   7                67                177                153            368
##   8                29                 55                 61            129
##   9               573                943               1531           1936
##   10              253                400                492           1171
##   11               85                 43                 95            160
##   12               25                 34                 51            110
##   13              113                 79                 99            259
##   14               14                  5                 12             35
##   15                7                  0                  1              7
##   16                1                  0                  1              2
##     
##       Priv-house-serv  Prof-specialty  Protective-serv  Sales  Tech-support
##   1                 2              11                0      2             0
##   2                14              22                1      8             0
##   3                20              43                1     17             1
##   4                17             124               11     40             6
##   5                16              73                9     47             3
##   6                 8             164               12    120             5
##   7                18             215               18    232             9
##   8                 8              71               11     70             4
##   9                91            1156              326   1580           270
##   10               26            1481              308   1503           426
##   11                5             328               67    163           181
##   12                3             279               50    209           116
##   13               12            2486              147   1268           346
##   14                1            1369               20    206            61
##   15                0             691                1     23            10
##   16                1             468                1     16             8
##     
##       Transport-moving
##   1                  2
##   2                 12
##   3                 38
##   4                 87
##   5                 61
##   6                129
##   7                134
##   8                 56
##   9               1223
##   10               410
##   11                55
##   12                36
##   13                93
##   14                14
##   15                 3
##   16                 2
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Relationship:
##     
##       Husband  Not-in-family  Other-relative  Own-child  Unmarried  Wife
##   1        23             38               6          7          6     3
##   2       111             67              23         11         29     6
##   3       238            105              61         19         65    21
##   4       503            208              46         66        103    29
##   5       305            160              44        108        104    35
##   6       471            303              62        326        181    46
##   7       479            332              90        651        216    44
##   8       166            128              38        239         65    21
##   9      6388           3841             607       2287       1937   724
##   10     3658           2729             299       2573       1151   468
##   11      883            536              38        212        274   118
##   12      581            483              30        198        201   108
##   13     3636           2503             132        774        516   464
##   14     1339            825              19         82        212   180
##   15      558            182               7         17         32    38
##   16      377            143               4         11         33    26
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Race:
##     
##       Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other  White
##   1                    1                   7     12      2     61
##   2                    4                  10     24     13    196
##   3                    2                  28     41     23    415
##   4                   10                  14     90     23    818
##   5                    9                  10    111     15    611
##   6                   22                  16    182     11   1158
##   7                   26                  27    252     22   1485
##   8                    5                  15    105     17    515
##   9                  176                 336   1780    105  13387
##   10                 124                 302   1078     85   9289
##   11                  31                  53    165      9   1803
##   12                  13                  49    161     10   1368
##   13                  29                 408    504     50   7034
##   14                  13                 140    143     13   2348
##   15                   2                  58     21      5    748
##   16                   3                  46     16      3    526
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Sex:
##     
##       Female  Male
##   1       24    59
##   2       61   186
##   3      127   382
##   4      239   716
##   5      220   536
##   6      457   932
##   7      650  1162
##   8      211   446
##   9     5097 10687
##   10    4178  6700
##   11     734  1327
##   12     627   974
##   13    2477  5548
##   14     845  1812
##   15     132   702
##   16     113   481
## chi-square p-value: 0.0004997501
## 
## Contingency table between Education-num and Native-country:
##     
##       Cambodia  Canada  China  Columbia  Cuba  Dominican-Republic  Ecuador
##   1          1       0      2         0     0                   2        0
##   2          1       0      1         2     5                  10        1
##   3          0       1      2         2     8                   5        2
##   4          1       5      5         2     7                  12        2
##   5          0       3      2         4     6                   7        0
##   6          0       2      3         4     4                   4        2
##   7          0       5      0         2     2                   4        1
##   8          1       3      1         3     5                   8        2
##   9          9      51     28        34    33                  26       14
##   10         8      34      9        11    22                  13       11
##   11         2      11      0         5     6                   2        2
##   12         0       4      0         4     4                   1        0
##   13         5      38     29         6    19                   6        5
##   14         0      11     24         1     9                   3        3
##   15         0       3      3         4     6                   0        0
##   16         0      11     13         1     2                   0        0
##     
##       El-Salvador  England  France  Germany  Greece  Guatemala  Haiti
##   1             7        0       0        0       0          2      4
##   2            13        1       0        0       0          9      1
##   3            29        1       0        0       1         12      3
##   4             6        0       1        2       3         11      2
##   5            15        1       0        0       0          6      4
##   6             6        4       0        2       3          6      3
##   7             7        2       0        5       1          7      3
##   8             2        1       0        4       0          3      2
##   9            34       37       6       47      17         17     24
##   10           18       15       6       50       8          9     17
##   11            1        3       1       16       5          2      0
##   12            4        6       3       11       0          1      4
##   13            8       34      10       50       5          3      5
##   14            3       12       7        8       5          0      2
##   15            1        3       1        5       1          0      1
##   16            1        7       3        6       0          0      0
##     
##       Holand-Netherlands  Honduras  Hong  Hungary  India  Iran  Ireland  Italy
##   1                    0         0     1        0      1     0        0      0
##   2                    0         1     0        0      0     0        0      4
##   3                    0         3     1        1      0     0        0     12
##   4                    0         0     1        0      0     0        2      7
##   5                    0         0     2        0      1     0        1      0
##   6                    0         1     0        0      2     0        0      3
##   7                    0         2     1        0      8     0        2      1
##   8                    0         0     0        0      0     0        0      7
##   9                    0         5     7        6      9     7       17     32
##   10                   1         5     1        2     17    12        3     10
##   11                   0         0     1        1      3     2        2      2
##   12                   0         0     2        1      5     5        1      4
##   13                   0         2     6        5     37    18        8     14
##   14                   0         0     5        2     36    10        1      6
##   15                   0         1     0        1     22     0        0      2
##   16                   0         0     2        0     10     5        0      1
##     
##       Jamaica  Japan  Laos  Mexico  Nicaragua  Outlying-US(Guam-USVI-etc)  Peru
##   1         1      0     1      31          1                           0     0
##   2         0      0     0     103          1                           0     0
##   3         1      1     3     216          3                           0     0
##   4         1      0     1      79          1                           0     2
##   5         2      0     0      76          1                           2     0
##   6         2      1     0      42          1                           0     2
##   7         5      0     1      40          3                           1     3
##   8         2      0     0      23          1                           0     0
##   9        38     23     6     178         15                           4    16
##   10       27     19     4      91         13                           9    10
##   11        3      2     2      13          1                           1     1
##   12        8      2     1       4          1                           1     2
##   13       10     26     3      43          3                           5     8
##   14        4     11     0       9          3                           0     2
##   15        0      5     1       2          0                           0     0
##   16        2      2     0       1          1                           0     0
##     
##       Philippines  Poland  Portugal  Puerto-Rico  Scotland  South  Taiwan
##   1             2       0         0            1         0      1       0
##   2             5       1         9            5         0      0       0
##   3            10       0         1            7         1      0       0
##   4             3       5        11           14         0      1       0
##   5             3       1         6            9         0      0       0
##   6             1       1         6            4         0      0       0
##   7             9       3         2           18         1      3       0
##   8             1       0         1            3         0      2       0
##   9            53      28        24           68         8     37       5
##   10           58      16         1           31         4     24       6
##   11            9       7         2            3         0      3       0
##   12           11       3         1            3         1      1       1
##   13          105      13         2           17         4     30      22
##   14           11       8         1            1         2      9      16
##   15           14       0         0            0         0      2       4
##   16            0       1         0            0         0      2      11
##     
##       Thailand  Trinadad&Tobago  United-States  Vietnam  Yugoslavia
##   1          0                0             25        0           0
##   2          1                0             70        3           0
##   3          0                1            178        3           1
##   4          0                1            765        1           1
##   5          0                3            599        1           1
##   6          0                0           1276        3           1
##   7          0                1           1666        3           0
##   8          1                1            579        1           0
##   9         10                9          14768       25           9
##   10         3                7          10249       21           3
##   11         2                0           1941        3           1
##   12         3                1           1492        3           2
##   13         6                0           7394       17           4
##   14         2                3           2427        0           0
##   15         1                0            750        1           0
##   16         1                0            510        1           0
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Marital-status and Occupation:
##                         
##                           Adm-clerical  Armed-Forces  Craft-repair
##    Divorced                       1192             0           679
##    Married-AF-spouse                 6             0             4
##    Married-civ-spouse             1495             7          3818
##    Married-spouse-absent            84             0            77
##    Never-married                  2360             8          1301
##    Separated                       224             0           160
##    Widowed                         250             0            73
##                         
##                           Exec-managerial  Farming-fishing  Handlers-cleaners
##    Divorced                           890               90                197
##    Married-AF-spouse                    3                1                  1
##    Married-civ-spouse                3600              869                724
##    Married-spouse-absent               52               35                 32
##    Never-married                     1260              434               1029
##    Separated                          126               23                 63
##    Widowed                            155               38                 26
##                         
##                           Machine-op-inspct  Other-service  Priv-house-serv
##    Divorced                             434            762               46
##    Married-AF-spouse                      1              5                0
##    Married-civ-spouse                  1469           1088               27
##    Married-spouse-absent                 37             92                9
##    Never-married                        872           2442               99
##    Separated                            123            275               21
##    Widowed                               86            259               40
##                         
##                           Prof-specialty  Protective-serv  Sales  Tech-support
##    Divorced                         1065              121    664           239
##    Married-AF-spouse                   9                1      5             0
##    Married-civ-spouse               4110              583   2491           609
##    Married-spouse-absent             109                7     55             9
##    Never-married                    3091              237   1992           506
##    Separated                         242               23    146            48
##    Widowed                           355               11    151            35
##                         
##                           Transport-moving
##    Divorced                            254
##    Married-AF-spouse                     1
##    Married-civ-spouse                 1489
##    Married-spouse-absent                30
##    Never-married                       486
##    Separated                            56
##    Widowed                              39
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Marital-status and Relationship:
##                         
##                           Husband  Not-in-family  Other-relative  Own-child
##    Divorced                     0           3628             181        455
##    Married-AF-spouse           12              0               1          1
##    Married-civ-spouse       19704             23             201        143
##    Married-spouse-absent        0            330              54         61
##    Never-married                0           7114             920       6750
##    Separated                    0            637              79        146
##    Widowed                      0            851              70         25
##                         
##                           Unmarried  Wife
##    Divorced                    2369     0
##    Married-AF-spouse              0    23
##    Married-civ-spouse             0  2308
##    Married-spouse-absent        183     0
##    Never-married               1333     0
##    Separated                    668     0
##    Widowed                      572     0
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Marital-status and Race:
##                         
##                           Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other
##    Divorced                               90                 108    709     42
##    Married-AF-spouse                       0                   1      3      0
##    Married-civ-spouse                    168                 737   1263    157
##    Married-spouse-absent                  12                  64     89     17
##    Never-married                         163                 544   2032    160
##    Separated                              17                  26    396     21
##    Widowed                                20                  39    193      9
##                         
##                           White
##    Divorced                5684
##    Married-AF-spouse         33
##    Married-civ-spouse     20054
##    Married-spouse-absent    446
##    Never-married          13218
##    Separated               1070
##    Widowed                 1257
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Marital-status and Sex:
##                         
##                           Female  Male
##    Divorced                 4001  2632
##    Married-AF-spouse          25    12
##    Married-civ-spouse       2480 19899
##    Married-spouse-absent     304   324
##    Never-married            7218  8899
##    Separated                 931   599
##    Widowed                  1233   285
## chi-squared p-value: 0.0004997501
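Raw counts are hard to compare across rows of very different sizes, so row proportions can make the table above easier to read. The sketch below re-enters the Marital-status by Sex counts by hand and uses `prop.table()` to get the share of each sex within each marital status. The object names `marital_sex` and `row_share` are hypothetical and not part of the original analysis.

```r
# Counts copied from the Marital-status x Sex table above (Female, Male).
marital_sex <- matrix(
  c(4001,  2632,
      25,    12,
    2480, 19899,
     304,   324,
    7218,  8899,
     931,   599,
    1233,   285),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    `Marital-status` = c("Divorced", "Married-AF-spouse", "Married-civ-spouse",
                         "Married-spouse-absent", "Never-married",
                         "Separated", "Widowed"),
    Sex = c("Female", "Male")
  )
)

# margin = 1 normalises each row so that it sums to 1.
row_share <- prop.table(marital_sex, margin = 1)
print(round(row_share, 3))
```

Read this way, roughly 11% of the Married-civ-spouse group is female, against roughly 81% of the Widowed group.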
## 
## Contingency table between Marital-status and Native-country:
##                         
##                           Cambodia  Canada  China  Columbia  Cuba
##    Divorced                      1      27      6         9    22
##    Married-AF-spouse             0       0      0         0     0
##    Married-civ-spouse           16      93     81        33    73
##    Married-spouse-absent         1       4      8         6     3
##    Never-married                 9      47     21        27    26
##    Separated                     0       5      3         7     9
##    Widowed                       1       6      3         3     5
##                         
##                           Dominican-Republic  Ecuador  El-Salvador  England
##    Divorced                               13        5            6       22
##    Married-AF-spouse                       0        0            0        0
##    Married-civ-spouse                     36       22           52       54
##    Married-spouse-absent                  11        3            8        1
##    Never-married                          36       11           76       41
##    Separated                               6        3            9        2
##    Widowed                                 1        1            4        7
##                         
##                           France  Germany  Greece  Guatemala  Haiti
##    Divorced                    8       35       3          4      7
##    Married-AF-spouse           0        0       0          0      0
##    Married-civ-spouse         16       94      35         22     30
##    Married-spouse-absent       2        4       0          7      7
##    Never-married              11       58       9         45     26
##    Separated                   0        7       0          6      3
##    Widowed                     1        8       2          4      2
##                         
##                           Holand-Netherlands  Honduras  Hong  Hungary  India
##    Divorced                                0         5     0        2      4
##    Married-AF-spouse                       0         0     0        0      0
##    Married-civ-spouse                      0         4    19        9     89
##    Married-spouse-absent                   0         2     0        1     15
##    Never-married                           1         6    10        5     38
##    Separated                               0         3     1        0      2
##    Widowed                                 0         0     0        2      3
##                         
##                           Iran  Ireland  Italy  Jamaica  Japan  Laos  Mexico
##    Divorced                  7        3      5        9     13     0      42
##    Married-AF-spouse         0        0      0        0      0     0       0
##    Married-civ-spouse       33       14     73       32     48    13     461
##    Married-spouse-absent     1        2      2        8      2     2      55
##    Never-married            14       17     16       48     26     8     330
##    Separated                 2        1      3        8      2     0      45
##    Widowed                   2        0      6        1      1     0      18
##                         
##                           Nicaragua  Outlying-US(Guam-USVI-etc)  Peru
##    Divorced                       4                           5     5
##    Married-AF-spouse              0                           0     0
##    Married-civ-spouse            18                           5    17
##    Married-spouse-absent          0                           0     3
##    Never-married                 20                          11    17
##    Separated                      4                           1     4
##    Widowed                        3                           1     0
##                         
##                           Philippines  Poland  Portugal  Puerto-Rico  Scotland
##    Divorced                        21       4         7           20         3
##    Married-AF-spouse                1       0         0            0         0
##    Married-civ-spouse             146      44        41           68        10
##    Married-spouse-absent           13      10         0            9         0
##    Never-married                   96      21        13           59         6
##    Separated                       10       2         3           18         2
##    Widowed                          8       6         3           10         0
##                         
##                           South  Taiwan  Thailand  Trinadad&Tobago
##    Divorced                  11       1         3                2
##    Married-AF-spouse          0       0         0                0
##    Married-civ-spouse        54      36        13               14
##    Married-spouse-absent      4       3         2                0
##    Never-married             39      25        10               10
##    Separated                  2       0         0                1
##    Widowed                    5       0         2                0
##                         
##                           United-States  Vietnam  Yugoslavia
##    Divorced                        6280        7           2
##    Married-AF-spouse                 36        0           0
##    Married-civ-spouse             20411       35          15
##    Married-spouse-absent            426        3           0
##    Never-married                  14782       41           5
##    Separated                       1356        0           0
##    Widowed                         1398        0           1
## chi-squared p-value: 0.0009995002
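The two p-values reported in this output, 0.0004997501 and 0.0009995002, equal 1/2001 and 2/2001. That is the pattern `chisq.test()` produces with `simulate.p.value = TRUE` and its default `B = 2000` replicates, where the p-value is (1 + k)/(B + 1) and k is the number of simulated tables at least as extreme as the observed one. The call that produced this output is not shown in this section, so reading it as a Monte Carlo test is an assumption. The sketch below checks the arithmetic and reruns the test on the Marital-status by Sex counts printed earlier; the object names are hypothetical.

```r
B <- 2000  # default number of replicates when chisq.test() simulates

# Smallest attainable p-values: k = 0 and k = 1 simulated tables as extreme.
p_grid <- (1 + 0:1) / (B + 1)
print(format(p_grid, digits = 10))

# Counts copied from the Marital-status x Sex table (Female, Male per row).
marital_sex <- matrix(
  c(4001, 2632, 25, 12, 2480, 19899, 304, 324,
    7218, 8899, 931, 599, 1233, 285),
  ncol = 2, byrow = TRUE
)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
res <- chisq.test(marital_sex, simulate.p.value = TRUE, B = B)
print(res$p.value)
```

Under this reading the reported value can never fall below 1/(B + 1), so 0.0004997501 marks the floor of what 2000 replicates can resolve, not an exact probability.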
## 
## Contingency table between Occupation and Relationship:
##                     
##                       Husband  Not-in-family  Other-relative  Own-child
##    Adm-clerical           932           1730             211       1144
##    Armed-Forces             6              5               2          2
##    Craft-repair          3731           1217             148        587
##    Exec-managerial       3231           1564              74        349
##    Farming-fishing        841            277              52        210
##    Handlers-cleaners      673            458             138        618
##    Machine-op-inspct     1323            720             120        401
##    Other-service          772           1465             279       1305
##    Priv-house-serv          2             97              30         35
##    Prof-specialty        3404           2575             185       1344
##    Protective-serv        563            220              22        113
##    Sales                 2259           1344             166       1058
##    Tech-support           529            447              36        181
##    Transport-moving      1450            464              43        234
##                     
##                       Unmarried  Wife
##    Adm-clerical            1072   522
##    Armed-Forces               0     0
##    Craft-repair             381    48
##    Exec-managerial          523   345
##    Farming-fishing           92    18
##    Handlers-cleaners        156    29
##    Machine-op-inspct        340   118
##    Other-service            832   270
##    Priv-house-serv           61    17
##    Prof-specialty           812   661
##    Protective-serv           55    10
##    Sales                    481   196
##    Tech-support             180    73
##    Transport-moving         140    24
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Occupation and Race:
##                     
##                       Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other
##    Adm-clerical                       54                 198    738     43
##    Armed-Forces                        1                   0      1      0
##    Craft-repair                       61                 118    371     58
##    Exec-managerial                    48                 193    351     20
##    Farming-fishing                    14                  23     56     14
##    Handlers-cleaners                  34                  39    262     18
##    Machine-op-inspct                  28                  85    429     47
##    Other-service                      61                 184    824     51
##    Priv-house-serv                     1                   5     51      4
##    Prof-specialty                     83                 385    700     76
##    Protective-serv                    13                  21    149      8
##    Sales                              36                 167    353     38
##    Tech-support                        8                  62    132      9
##    Transport-moving                   28                  39    268     20
##                     
##                       White
##    Adm-clerical        4578
##    Armed-Forces          13
##    Craft-repair        5504
##    Exec-managerial     5474
##    Farming-fishing     1383
##    Handlers-cleaners   1719
##    Machine-op-inspct   2433
##    Other-service       3803
##    Priv-house-serv      181
##    Prof-specialty      7737
##    Protective-serv      792
##    Sales               4910
##    Tech-support        1235
##    Transport-moving    2000
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Occupation and Sex:
##                     
##                       Female  Male
##    Adm-clerical         3769  1842
##    Armed-Forces            0    15
##    Craft-repair          323  5789
##    Exec-managerial      1748  4338
##    Farming-fishing        95  1395
##    Handlers-cleaners     254  1818
##    Machine-op-inspct     804  2218
##    Other-service        2698  2225
##    Priv-house-serv       228    14
##    Prof-specialty       3515  5466
##    Protective-serv       122   861
##    Sales                1947  3557
##    Tech-support          562   884
##    Transport-moving      127  2228
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Occupation and Native-country:
##                     
##                       Cambodia  Canada  China  Columbia  Cuba
##    Adm-clerical              0      14      4         9    19
##    Armed-Forces              0       0      0         0     0
##    Craft-repair              9      23      5        13     8
##    Exec-managerial           1      20     16         6    20
##    Farming-fishing           1       3      0         1     3
##    Handlers-cleaners         0       2      0         5     5
##    Machine-op-inspct         5       8     10        14     9
##    Other-service             1      20     26        11    17
##    Priv-house-serv           0       0      0         3     2
##    Prof-specialty            6      60     49        11    22
##    Protective-serv           0       2      0         0     7
##    Sales                     4      17      6         4    15
##    Tech-support              0       4      4         3     1
##    Transport-moving          1       9      2         5    10
##                     
##                       Dominican-Republic  Ecuador  El-Salvador  England  France
##    Adm-clerical                        9        4            4        9       3
##    Armed-Forces                        0        0            0        0       0
##    Craft-repair                       11        8           23       12       1
##    Exec-managerial                     4        4            2       34       9
##    Farming-fishing                     0        0            3        1       1
##    Handlers-cleaners                   7        2           11        2       0
##    Machine-op-inspct                  27        9            7        3       0
##    Other-service                      19        4           56        9       2
##    Priv-house-serv                     1        1           11        4       2
##    Prof-specialty                      9        6           15       35      13
##    Protective-serv                     1        0            1        3       1
##    Sales                              10        3           12        9       2
##    Tech-support                        0        1            2        5       4
##    Transport-moving                    5        3            8        1       0
##                     
##                       Germany  Greece  Guatemala  Haiti  Holand-Netherlands
##    Adm-clerical            32       2          4      7                   0
##    Armed-Forces             0       0          0      0                   0
##    Craft-repair            23       8         15      6                   0
##    Exec-managerial         29      18          2      0                   0
##    Farming-fishing          3       0          5      2                   0
##    Handlers-cleaners        5       1         11      3                   0
##    Machine-op-inspct        8       2         12      7                   1
##    Other-service           12       6         10     26                   0
##    Priv-house-serv          1       0         14      1                   0
##    Prof-specialty          53       3          4     12                   0
##    Protective-serv          7       0          1      1                   0
##    Sales                   20       6          6      3                   0
##    Tech-support             8       1          2      1                   0
##    Transport-moving         5       2          2      6                   0
##                     
##                       Honduras  Hong  Hungary  India  Iran  Ireland  Italy
##    Adm-clerical              2     4        1     18     3        2     13
##    Armed-Forces              0     0        0      0     0        0      0
##    Craft-repair              3     5        5     10     5       10     19
##    Exec-managerial           1     4        4     16    12        4      8
##    Farming-fishing           0     1        0      0     0        1      2
##    Handlers-cleaners         2     0        0      3     0        2      7
##    Machine-op-inspct         2     2        0      5     4        4     11
##    Other-service             2     2        1      4     6        4     14
##    Priv-house-serv           1     0        1      0     0        0      0
##    Prof-specialty            3     8        5     61    18        8     20
##    Protective-serv           1     1        0      4     0        0      0
##    Sales                     3     2        0     21     8        2      7
##    Tech-support              0     1        1      7     1        0      1
##    Transport-moving          0     0        1      2     2        0      3
##                     
##                       Jamaica  Japan  Laos  Mexico  Nicaragua
##    Adm-clerical            21     13     4      53         10
##    Armed-Forces             0      0     0       0          0
##    Craft-repair             7      7     5     149          6
##    Exec-managerial         12     23     1      26          0
##    Farming-fishing          0      1     0     124          0
##    Handlers-cleaners        4      2     0     108          5
##    Machine-op-inspct        2      2     8     141          7
##    Other-service           25     13     1     160          8
##    Priv-house-serv          2      0     0      23          3
##    Prof-specialty          14     19     2      72          3
##    Protective-serv          1      1     0       5          0
##    Sales                    8      9     2      53          2
##    Tech-support             5      2     0       7          3
##    Transport-moving         5      0     0      30          2
##                     
##                       Outlying-US(Guam-USVI-etc)  Peru  Philippines  Poland
##    Adm-clerical                                3     1           56       3
##    Armed-Forces                                0     0            0       0
##    Craft-repair                                2     3           22      22
##    Exec-managerial                             4     2           18       6
##    Farming-fishing                             0     1            6       1
##    Handlers-cleaners                           3     4           15       8
##    Machine-op-inspct                           1     6           20      10
##    Other-service                               4    11           52       8
##    Priv-house-serv                             0     0            2       2
##    Prof-specialty                              3     7           58      15
##    Protective-serv                             0     2            5       1
##    Sales                                       2     6           22       6
##    Tech-support                                0     1           15       2
##    Transport-moving                            1     2            4       3
##                     
##                       Portugal  Puerto-Rico  Scotland  South  Taiwan  Thailand
##    Adm-clerical              6           29         3      5       3         3
##    Armed-Forces              0            0         0      0       0         0
##    Craft-repair             18           21         0     11       2         2
##    Exec-managerial           3           14         4     19      13         6
##    Farming-fishing           3            8         0      0       0         1
##    Handlers-cleaners         6            7         0      3       0         0
##    Machine-op-inspct        15           22         2      4       0         1
##    Other-service             6           33         5     15       2         8
##    Priv-house-serv           0            2         0      0       0         1
##    Prof-specialty            5           21         4     28      36         5
##    Protective-serv           1            4         2      0       0         1
##    Sales                     3           10         1     29       7         1
##    Tech-support              0            2         0      0       2         0
##    Transport-moving          1           11         0      1       0         1
##                     
##                       Trinadad&Tobago  United-States  Vietnam  Yugoslavia
##    Adm-clerical                     6           5211       17           1
##    Armed-Forces                     0             15        0           0
##    Craft-repair                     2           5595       13           3
##    Exec-managerial                  1           5708        4           8
##    Farming-fishing                  0           1315        2           1
##    Handlers-cleaners                0           1834        5           0
##    Machine-op-inspct                4           2613       11           3
##    Other-service                    6           4298       11           5
##    Priv-house-serv                  0            164        0           1
##    Prof-specialty                   3           8256        9           0
##    Protective-serv                  1            929        0           0
##    Sales                            1           5174        8           0
##    Tech-support                     2           1354        4           0
##    Transport-moving                 1           2223        2           1
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Relationship and Race:
##                  
##                    Amer-Indian-Eskimo  Asian-Pac-Islander  Black  Other  White
##    Husband                        135                 589   1011    123  17858
##    Not-in-family                  125                 325   1223    105  10805
##    Other-relative                  22                 123    242     41   1078
##    Own-child                       67                 249    839     57   6369
##    Unmarried                       93                 130   1144     57   3701
##    Wife                            28                 103    226     23   1951
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Relationship and Sex:
##                  
##                    Female  Male
##    Husband              1 19715
##    Not-in-family     5870  6713
##    Other-relative     689   817
##    Own-child         3376  4205
##    Unmarried         3928  1197
##    Wife              2328     3
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Relationship and Native-country:
##                  
##                    Cambodia  Canada  China  Columbia  Cuba  Dominican-Republic
##    Husband               11      77     63        27    64                  28
##    Not-in-family          7      55     24        23    28                  24
##    Other-relative         5       1     11         7     6                  10
##    Own-child              2      16      7         8    18                  15
##    Unmarried              2      18      6        17    15                  20
##    Wife                   1      15     11         3     7                   6
##                  
##                    Ecuador  El-Salvador  England  France  Germany  Greece
##    Husband              17           41       44      15       82      31
##    Not-in-family         9           29       47      14       54       8
##    Other-relative       10           29        1       0        3       2
##    Own-child             2           26       16       3       26       4
##    Unmarried             6           23       10       5       29       2
##    Wife                  1            7        9       1       12       2
##                  
##                    Guatemala  Haiti  Holand-Netherlands  Honduras  Hong
##    Husband                21     24                   0         1    12
##    Not-in-family          29     12                   0         4     7
##    Other-relative         17     10                   1         3     3
##    Own-child               9      8                   0         1     0
##    Unmarried              11     16                   0         8     1
##    Wife                    1      5                   0         3     7
##                  
##                    Hungary  India  Iran  Ireland  Italy  Jamaica  Japan  Laos
##    Husband               8     82    29       13     63       27     41     9
##    Not-in-family         9     34    12       19     18       30     29     4
##    Other-relative        1     10     1        2      3       11      1     1
##    Own-child             0     14     7        1      9       13      7     3
##    Unmarried             0      8     6        2      4       21      8     2
##    Wife                  1      3     4        0      8        4      6     4
##                  
##                    Mexico  Nicaragua  Outlying-US(Guam-USVI-etc)  Peru
##    Husband            400         13                           4    14
##    Not-in-family      183          4                          12     7
##    Other-relative     133         10                           1     3
##    Own-child           84          9                           2    10
##    Unmarried          123         10                           3     9
##    Wife                28          3                           1     3
##                  
##                    Philippines  Poland  Portugal  Puerto-Rico  Scotland  South
##    Husband                 107      38        36           52         7     42
##    Not-in-family            52      24         8           48         5     18
##    Other-relative           28       5         1           12         1     12
##    Own-child                53       8        10           16         1     21
##    Unmarried                29       8         7           40         4     14
##    Wife                     26       4         5           16         3      8
##                  
##                    Taiwan  Thailand  Trinadad&Tobago  United-States  Vietnam
##    Husband             30        10               11          18078       30
##    Not-in-family       14         8                6          11646       15
##    Other-relative       2         2                2           1133       12
##    Own-child           10         3                1           7122       13
##    Unmarried            4         5                4           4611       13
##    Wife                 5         2                3           2099        3
##                  
##                    Yugoslavia
##    Husband                 14
##    Not-in-family            4
##    Other-relative           0
##    Own-child                3
##    Unmarried                1
##    Wife                     1
## chi-squared p-value: 0.0004997501
## 
## Contingency table between Race and Sex:
##                      
##                        Female  Male
##    Amer-Indian-Eskimo     185   285
##    Asian-Pac-Islander     517  1002
##    Black                 2308  2377
##    Other                  155   251
##    White                13027 28735
## chi-squared p-value: 0.0004997501
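Nearly every pair above reports the same p-value, so the test alone does not show which associations are strong. One common complement, not used in the original output, is Cramér's V, computed as sqrt(X^2 / (n * (min(r, c) - 1))). The sketch below applies it to the Relationship by Sex and Race by Sex counts printed above; the helper `cramers_v` and the object names are hypothetical.

```r
# Cramér's V from a matrix of counts (hypothetical helper, base R only).
cramers_v <- function(tab) {
  # correct = FALSE: no continuity correction (it only affects 2 x 2 tables).
  x2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  sqrt(x2 / (sum(tab) * (min(dim(tab)) - 1)))
}

# Counts copied from the Relationship x Sex table above (Female, Male).
rel_sex <- matrix(
  c(   1, 19715,
    5870,  6713,
     689,   817,
    3376,  4205,
    3928,  1197,
    2328,     3),
  ncol = 2, byrow = TRUE
)

# Counts copied from the Race x Sex table above (Female, Male).
race_sex <- matrix(
  c(  185,   285,
      517,  1002,
     2308,  2377,
      155,   251,
    13027, 28735),
  ncol = 2, byrow = TRUE
)

v_rel  <- cramers_v(rel_sex)
v_race <- cramers_v(race_sex)
print(c(Relationship_Sex = v_rel, Race_Sex = v_race))
```

V ranges from 0 (no association) to 1. The Relationship by Sex value comes out far larger than the Race by Sex value, even though both tests report the same p-value.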
## 
## Contingency table between Race and Native-country:
##                      
##                        Cambodia  Canada  China  Columbia  Cuba
##    Amer-Indian-Eskimo         0       0      0         1     0
##    Asian-Pac-Islander        24       1    119         0     0
##    Black                      2       0      0         0     5
##    Other                      0       2      0         9     3
##    White                      2     179      3        75   130
##                      
##                        Dominican-Republic  Ecuador  El-Salvador  England
##    Amer-Indian-Eskimo                   0        0            0        0
##    Asian-Pac-Islander                   1        0            0        2
##    Black                               18        1            1       11
##    Other                               21       12            8        1
##    White                               63       32          146      113
##                      
##                        France  Germany  Greece  Guatemala  Haiti
##    Amer-Indian-Eskimo       0        1       0          0      0
##    Asian-Pac-Islander       1        4       1          1      1
##    Black                    1       11       0          0     72
##    Other                    1        1       0          8      0
##    White                   35      189      48         79      2
##                      
##                        Holand-Netherlands  Honduras  Hong  Hungary  India  Iran
##    Amer-Indian-Eskimo                   0         0     1        0      0     0
##    Asian-Pac-Islander                   0         0    25        0    124     9
##    Black                                0         3     0        0      2     0
##    Other                                0         1     0        0      9     6
##    White                                1        16     4       19     16    44
##                      
##                        Ireland  Italy  Jamaica  Japan  Laos  Mexico  Nicaragua
##    Amer-Indian-Eskimo        0      0        0      0     0      11          0
##    Asian-Pac-Islander        1      0        0     53    23       1          1
##    Black                     0      0       98      4     0       5          3
##    Other                     0      0        3      3     0      63          4
##    White                    36    105        5     32     0     871         41
##                      
##                        Outlying-US(Guam-USVI-etc)  Peru  Philippines  Poland
##    Amer-Indian-Eskimo                           0     0            1       0
##    Asian-Pac-Islander                           3     0          279       2
##    Black                                        8     0            2       0
##    Other                                        0     4            0       0
##    White                                       12    42           13      85
##                      
##                        Portugal  Puerto-Rico  Scotland  South  Taiwan  Thailand
##    Amer-Indian-Eskimo         0            1         0      2       0         0
##    Asian-Pac-Islander         1            1         0    110      61        25
##    Black                      0           13         1      1       0         1
##    Other                      1           30         0      1       1         0
##    White                     65          139        20      1       3         4
##                      
##                        Trinadad&Tobago  United-States  Vietnam  Yugoslavia
##    Amer-Indian-Eskimo                0            452        0           0
##    Asian-Pac-Islander                4            557       84           0
##    Black                            21           4401        0           0
##    Other                             1            213        0           0
##    White                             1          39066        2          23
## chi-square p-value: 0.0004997501 
## 
## Contingency table between Sex and Native-country :
##          
##            Cambodia  Canada  China  Columbia  Cuba  Dominican-Republic  Ecuador
##    Female         6      63     33        32    50                  48       16
##    Male          22     119     89        53    88                  55       29
##          
##            El-Salvador  England  France  Germany  Greece  Guatemala  Haiti
##    Female           54       45      14       87       9         26     31
##    Male            101       82      24      119      40         62     44
##          
##            Holand-Netherlands  Honduras  Hong  Hungary  India  Iran  Ireland
##    Female                   1        11    11        7     18    12        9
##    Male                     0         9    19       12    133    47       28
##          
##            Italy  Jamaica  Japan  Laos  Mexico  Nicaragua
##    Female     28       58     31     9     215         22
##    Male       77       48     61    14     736         27
##          
##            Outlying-US(Guam-USVI-etc)  Peru  Philippines  Poland  Portugal
##    Female                           9    18          115      24        14
##    Male                            14    28          180      63        53
##          
##            Puerto-Rico  Scotland  South  Taiwan  Thailand  Trinadad&Tobago
##    Female           75         8     44      19        14               14
##    Male            109        13     71      46        16               13
##          
##            United-States  Vietnam  Yugoslavia
##    Female          14857       30           5
##    Male            29832       56          18
## chi-square p-value: 0.0004997501

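The p-values obtained are all below 0.05, so for each pair of categorical variables we reject the null hypothesis of independence at the 5% level. In other words, the chi-square tests indicate that these variables are statistically associated with one another; they do not measure how strong that association is.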
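
Every test above reports exactly the same value, 0.0004997501, which equals 1/2001. That is what `chisq.test` returns with `simulate.p.value = TRUE` and the default `B = 2000` replicates when no simulated table is as extreme as the observed one, so the figure is better read as "p < 0.0005" than as an exact probability. The sketch below reproduces this for the Race and Sex table, using the counts printed above. It is an illustration under the assumption that the report used a Monte Carlo test with 2000 replicates; the names `race_sex`, `mc`, `asym` and `p_floor` are hypothetical and the seed is arbitrary.

```r
# Race x Sex counts copied from the contingency table printed above.
race_sex <- matrix(
  c(  185,   285,
      517,  1002,
     2308,  2377,
      155,   251,
    13027, 28735),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    Race = c("Amer-Indian-Eskimo", "Asian-Pac-Islander", "Black", "Other", "White"),
    Sex  = c("Female", "Male")
  )
)

set.seed(1)  # arbitrary seed, only to make the simulation repeatable
B <- 2000    # ASSUMPTION: number of replicates (the default of chisq.test)

# Monte Carlo version: the p-value is (1 + number of simulated statistics
# at least as large as the observed one) / (B + 1).
mc <- chisq.test(race_sex, simulate.p.value = TRUE, B = B)

# Asymptotic version, based on the chi-square distribution.
asym <- chisq.test(race_sex)

# Smallest p-value the Monte Carlo version can report.
p_floor <- 1 / (B + 1)

cat("Monte Carlo p-value:        ", mc$p.value, "\n")
cat("Smallest reportable 1/(B+1):", p_floor, "\n")
cat("Asymptotic p-value:         ", asym$p.value, "\n")
```

Comparing `mc$p.value` with `p_floor` shows whether the simulated p-value has hit its lower bound. When it has, raising `B` or using the asymptotic p-value gives a more informative figure.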