Faculty of Law and Political Science

School of Political Science

Statistics Class Guide


NOTES: Recoding

# comment: the link goes between quotes

link="https://docs.google.com/spreadsheets/d/e/2PACX-1vTFp8tPWkUD3qMcuXsqySAHJUZBoIjiFb_pyIJfTiQAK070YNo8G__7wOD_nl_UPYdWnMbW7I5VbRxr/pub?gid=477181888&single=true&output=csv"

# comment: read.csv reads the data and stores it in the object 'escolares';
# na.strings = '' turns empty cells into NA
escolares = read.csv(link, stringsAsFactors = FALSE, na.strings = '')

Checking the data types:

str(escolares)
## 'data.frame':    600 obs. of  13 variables:
##  $ ID    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ SEX   : chr  "HOMBRE" "MUJER" "HOMBRE" "HOMBRE" ...
##  $ RACE  : chr  "ASIATICO" "ASIATICO" "ASIATICO" "ASIATICO" ...
##  $ SES   : chr  "ALTO" "ALTO" "ALTO" "MEDIO" ...
##  $ SCTYP : chr  "PUBLICA" "PUBLICA" "PUBLICA" "PUBLICA" ...
##  $ LOCUS : num  0.29 -0.42 0.71 0.06 0.22 0.46 0.44 0.68 0.06 0.05 ...
##  $ CONCPT: num  0.88 0.03 0.03 0.03 -0.28 0.03 -0.47 0.25 0.56 0.15 ...
##  $ MOT   : num  0.67 0.33 0.67 0 0 0 0.33 1 0.33 1 ...
##  $ RDG   : num  33.6 46.9 41.6 38.9 36.3 49.5 62.7 44.2 46.9 44.2 ...
##  $ WRTG  : num  43.7 35.9 59.3 41.1 48.9 46.3 64.5 51.5 41.1 49.5 ...
##  $ MATH  : num  40.2 41.9 41.9 32.7 39.5 46.2 48 36.9 45.3 40.5 ...
##  $ SCI   : num  39 36.3 44.4 41.7 41.7 41.7 63.4 49.8 47.1 39 ...
##  $ CIV   : num  40.6 45.6 45.6 40.6 45.6 35.6 55.6 55.6 55.6 50.6 ...
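
To see what the two read.csv arguments do without depending on the spreadsheet link, here is a minimal sketch. The inline CSV text and the object name `toy` are made up for illustration and are not part of the class data.

```r
# Hypothetical inline CSV standing in for the real spreadsheet
csv_text = "ID,SEX,SES,MATH
1,HOMBRE,ALTO,40.2
2,MUJER,,41.9
3,HOMBRE,MEDIO,"

# text = ... lets read.csv parse a string instead of a file or URL
toy = read.csv(text = csv_text, stringsAsFactors = FALSE, na.strings = "")

# Text columns stay as 'chr' because stringsAsFactors = FALSE
str(toy)

# Empty cells were read as NA; count them per column
colSums(is.na(toy))
```

Keeping text as character at this stage leaves the decision about which columns become factors, and in which order, to a later explicit step.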

If you ask for a statistical summary at this point, you get this:

summary(escolares)
##        ID            SEX                RACE               SES           
##  Min.   :  1.0   Length:600         Length:600         Length:600        
##  1st Qu.:150.8   Class :character   Class :character   Class :character  
##  Median :300.5   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :300.5                                                           
##  3rd Qu.:450.2                                                           
##  Max.   :600.0                                                           
##     SCTYP               LOCUS              CONCPT         
##  Length:600         Min.   :-2.23000   Min.   :-2.620000  
##  Class :character   1st Qu.:-0.37250   1st Qu.:-0.300000  
##  Mode  :character   Median : 0.21000   Median : 0.030000  
##                     Mean   : 0.09653   Mean   : 0.004917  
##                     3rd Qu.: 0.51000   3rd Qu.: 0.440000  
##                     Max.   : 1.36000   Max.   : 1.190000  
##       MOT              RDG            WRTG            MATH      
##  Min.   :0.0000   Min.   :28.3   Min.   :25.50   Min.   :31.80  
##  1st Qu.:0.3300   1st Qu.:44.2   1st Qu.:44.30   1st Qu.:44.50  
##  Median :0.6700   Median :52.1   Median :54.10   Median :51.30  
##  Mean   :0.6608   Mean   :51.9   Mean   :52.38   Mean   :51.85  
##  3rd Qu.:1.0000   3rd Qu.:60.1   3rd Qu.:59.90   3rd Qu.:58.38  
##  Max.   :1.0000   Max.   :76.0   Max.   :67.10   Max.   :75.50  
##       SCI             CIV       
##  Min.   :26.00   Min.   :25.70  
##  1st Qu.:44.40   1st Qu.:45.60  
##  Median :52.60   Median :50.60  
##  Mean   :51.76   Mean   :52.05  
##  3rd Qu.:58.65   3rd Qu.:60.50  
##  Max.   :74.20   Max.   :70.50

The numeric variables are already fine, but if you had to convert them, you would do this:

# from column 6 to column 13:
# apply the function as.numeric
escolares[, c(6:13)] = lapply(escolares[, c(6:13)], as.numeric)
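
The conversion only matters when numbers arrive as text. The sketch below uses a made-up data frame (`toy`, with an invented "n/a" entry) to show that case; it is an illustration, not the class data.

```r
# Hypothetical data frame where the scores were read as text
toy = data.frame(ID = 1:3,
                 RDG = c("33.6", "46.9", "n/a"),
                 MATH = c("40.2", "41.9", "41.9"),
                 stringsAsFactors = FALSE)

cols = c("RDG", "MATH")

# lapply applies as.numeric to each selected column;
# text that cannot be parsed becomes NA (R issues a warning)
toy[, cols] = lapply(toy[, cols], as.numeric)

str(toy)
```

Selecting columns by name instead of by position is a design choice: it keeps working if the column order changes.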

Here we do need to change the nominal variables:

# columns 1 to 3 and column 5:
# apply as.factor
escolares[, c(1:3, 5)] = lapply(escolares[, c(1:3, 5)], as.factor)
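
A small sketch of what as.factor produces, using a made-up three-row data frame (`toy` is hypothetical). It shows that levels are sorted alphabetically and that each value is stored internally as the position of its level.

```r
# Hypothetical nominal columns stored as text
toy = data.frame(SEX = c("HOMBRE", "MUJER", "HOMBRE"),
                 SCTYP = c("PUBLICA", "PUBLICA", "PRIVADA"),
                 stringsAsFactors = FALSE)

toy[, c("SEX", "SCTYP")] = lapply(toy[, c("SEX", "SCTYP")], as.factor)

# Levels are the distinct values, sorted alphabetically
levels(toy$SCTYP)

# Internally each value is the index of its level
as.integer(toy$SCTYP)

# Factors can be tabulated directly
table(toy$SEX)
```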

Now the ordinal variables.

Let's look at them first:

table(escolares$SES)
## 
##  ALTO  BAJO MEDIO 
##   139   162   299

Let's use dplyr (install it if you don't have it):

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
escolares$SES = recode(escolares$SES,
                       'ALTO' = '3_alto',
                       'MEDIO' = '2_medio',
                       'BAJO' = '1_bajo')

# a numeric prefix makes the alphabetical order match the intended order,
# so as.ordered() yields 1_bajo < 2_medio < 3_alto
escolares$SES = as.ordered(escolares$SES)
str(escolares)
## 'data.frame':    600 obs. of  13 variables:
##  $ ID    : Factor w/ 600 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ SEX   : Factor w/ 2 levels "HOMBRE","MUJER": 1 2 1 1 1 2 2 1 2 1 ...
##  $ RACE  : Factor w/ 4 levels "ASIATICO","BLANCO",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ SES   : Ord.factor w/ 3 levels "1_bajo"<"2_medio"<..: 3 3 3 2 2 2 3 3 2 3 ...
##  $ SCTYP : Factor w/ 2 levels "PRIVADA","PUBLICA": 2 2 2 2 2 2 2 2 2 2 ...
##  $ LOCUS : num  0.29 -0.42 0.71 0.06 0.22 0.46 0.44 0.68 0.06 0.05 ...
##  $ CONCPT: num  0.88 0.03 0.03 0.03 -0.28 0.03 -0.47 0.25 0.56 0.15 ...
##  $ MOT   : num  0.67 0.33 0.67 0 0 0 0.33 1 0.33 1 ...
##  $ RDG   : num  33.6 46.9 41.6 38.9 36.3 49.5 62.7 44.2 46.9 44.2 ...
##  $ WRTG  : num  43.7 35.9 59.3 41.1 48.9 46.3 64.5 51.5 41.1 49.5 ...
##  $ MATH  : num  40.2 41.9 41.9 32.7 39.5 46.2 48 36.9 45.3 40.5 ...
##  $ SCI   : num  39 36.3 44.4 41.7 41.7 41.7 63.4 49.8 47.1 39 ...
##  $ CIV   : num  40.6 45.6 45.6 40.6 45.6 35.6 55.6 55.6 55.6 50.6 ...
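
As an alternative to the numeric-prefix approach, base R can set the order explicitly. This is a sketch on a made-up vector (`ses` and `ses_ord` are hypothetical names), not the method used in this guide.

```r
# Hypothetical SES values
ses = c("ALTO", "ALTO", "MEDIO", "BAJO", "MEDIO")

# State the order from lowest to highest instead of relying on alphabetical order
ses_ord = factor(ses, levels = c("BAJO", "MEDIO", "ALTO"), ordered = TRUE)

levels(ses_ord)

# Ordered factors support comparisons
ses_ord >= "MEDIO"

# Tables follow the declared order, not the alphabetical one
table(ses_ord)
```

This keeps the original labels intact, at the cost of typing the levels out by hand.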

Compare the summary as well; the categorical variables now show counts per category:

summary(escolares)
##        ID          SEX            RACE          SES          SCTYP    
##  1      :  1   HOMBRE:327   ASIATICO: 71   1_bajo :162   PRIVADA: 94  
##  2      :  1   MUJER :273   BLANCO  : 34   2_medio:299   PUBLICA:506  
##  3      :  1                HISPANO :437   3_alto :139                
##  4      :  1                NEGRO   : 58                              
##  5      :  1                                                          
##  6      :  1                                                          
##  (Other):594                                                          
##      LOCUS              CONCPT               MOT              RDG      
##  Min.   :-2.23000   Min.   :-2.620000   Min.   :0.0000   Min.   :28.3  
##  1st Qu.:-0.37250   1st Qu.:-0.300000   1st Qu.:0.3300   1st Qu.:44.2  
##  Median : 0.21000   Median : 0.030000   Median :0.6700   Median :52.1  
##  Mean   : 0.09653   Mean   : 0.004917   Mean   :0.6608   Mean   :51.9  
##  3rd Qu.: 0.51000   3rd Qu.: 0.440000   3rd Qu.:1.0000   3rd Qu.:60.1  
##  Max.   : 1.36000   Max.   : 1.190000   Max.   :1.0000   Max.   :76.0  
##                                                                        
##       WRTG            MATH            SCI             CIV       
##  Min.   :25.50   Min.   :31.80   Min.   :26.00   Min.   :25.70  
##  1st Qu.:44.30   1st Qu.:44.50   1st Qu.:44.40   1st Qu.:45.60  
##  Median :54.10   Median :51.30   Median :52.60   Median :50.60  
##  Mean   :52.38   Mean   :51.85   Mean   :51.76   Mean   :52.05  
##  3rd Qu.:59.90   3rd Qu.:58.38   3rd Qu.:58.65   3rd Qu.:60.50  
##  Max.   :67.10   Max.   :75.50   Max.   :74.20   Max.   :70.50  
## 
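
The difference between the two summaries comes from the variable type. A minimal sketch with a made-up vector (`sex_chr` is hypothetical) shows it:

```r
# Hypothetical text vector
sex_chr = c("HOMBRE", "MUJER", "HOMBRE")

# As character: summary only reports length, class and mode
summary(sex_chr)

# As factor: summary reports the count for each category
sex_fac = as.factor(sex_chr)
counts = summary(sex_fac)
counts
```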

From here you are ready to do statistics!
