Descriptive statistics

This document analyzes a bank's database, considering only its 10 main attributes.

Step 1: Importing the data

Next, we import the data, which is stored in CSV format.

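# Read the CSV file into a data frame
BD <- read.csv("~/Angela/BA_PUCP/2.Programacion_Estadistica/BankChurners.csv")
summary(BD)
# Keep columns 2 to 11, the 10 attributes analyzed in this document
data <- BD[, 2:11]
summary(data)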
##  Attrition_Flag      Customer_Age      Gender          Dependent_count
##  Length:10127       Min.   :26.00   Length:10127       Min.   :0.000  
##  Class :character   1st Qu.:41.00   Class :character   1st Qu.:1.000  
##  Mode  :character   Median :46.00   Mode  :character   Median :2.000  
##                     Mean   :46.33                      Mean   :2.346  
##                     3rd Qu.:52.00                      3rd Qu.:3.000  
##                     Max.   :73.00                      Max.   :5.000  
##  Education_Level    Marital_Status     Income_Category    Card_Category     
##  Length:10127       Length:10127       Length:10127       Length:10127      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  Months_on_book  Total_Relationship_Count
##  Min.   :13.00   Min.   :1.000           
##  1st Qu.:31.00   1st Qu.:3.000           
##  Median :36.00   Median :4.000           
##  Mean   :35.93   Mean   :3.813           
##  3rd Qu.:40.00   3rd Qu.:5.000           
##  Max.   :56.00   Max.   :6.000
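
For the character columns, the summary above reports only Length, Class, and Mode, so it says nothing about how the categories are distributed. A minimal sketch of one way to get counts, assuming those columns are meant to be treated as categorical, is to convert them to factors before calling summary(). The data frame `toy` and its values are invented for illustration and are not taken from BankChurners.csv; the names `is_text` and `counts` are likewise hypothetical.

```r
# Hypothetical toy data mimicking two of the columns; the values are invented.
toy <- data.frame(
  Gender = c("M", "F", "M", "F", "F"),
  Customer_Age = c(45L, 49L, 51L, 40L, 40L),
  stringsAsFactors = FALSE
)

# Find the character columns and convert each one to a factor.
is_text <- vapply(toy, is.character, logical(1))
toy[is_text] <- lapply(toy[is_text], factor)

# summary() now reports a count per category for Gender.
print(summary(toy))

# table() gives the same counts for a single column.
counts <- table(toy$Gender)
print(counts)
```

The same conversion could be applied to `data` itself, but whether every character column should become a factor is a judgment call that depends on the analysis.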

Step 2: Generate code to read through the data using for, if, and is.numeric

str(data)
## 'data.frame':    10127 obs. of  10 variables:
##  $ Attrition_Flag          : chr  "Existing Customer" "Existing Customer" "Existing Customer" "Existing Customer" ...
##  $ Customer_Age            : int  45 49 51 40 40 44 51 32 37 48 ...
##  $ Gender                  : chr  "M" "F" "M" "F" ...
##  $ Dependent_count         : int  3 5 3 4 3 2 4 0 3 2 ...
##  $ Education_Level         : chr  "High School" "Graduate" "Graduate" "High School" ...
##  $ Marital_Status          : chr  "Married" "Single" "Married" "Unknown" ...
##  $ Income_Category         : chr  "$60K - $80K" "Less than $40K" "$80K - $120K" "Less than $40K" ...
##  $ Card_Category           : chr  "Blue" "Blue" "Blue" "Blue" ...
##  $ Months_on_book          : int  39 44 36 34 21 36 46 27 36 36 ...
##  $ Total_Relationship_Count: int  5 6 4 3 5 3 6 2 5 6 ...
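columnas <- ncol(data)
# Arrange the plots in a grid with 2 rows; ceiling() also handles an odd column count
par(mfrow = c(2, ceiling(columnas / 2)))
for (i in seq_len(columnas)) {
  if (is.numeric(data[, i])) {
    # Numeric column: draw a histogram
    hist(data[, i], main = names(data)[i], xlab = names(data)[i])
  } else {
    # Non-numeric column: draw a pie chart of the category counts
    pie(table(data[, i]), main = names(data)[i])
  }
}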
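
The loop above chooses the plot type one column at a time with is.numeric. As a sketch of the same decision made up front, the column types can be computed in a single pass and inspected before any plotting. The `toy` data frame and the names `is_num` and `plot_type` are hypothetical and used only for illustration.

```r
# Hypothetical toy data with one numeric and two character columns.
toy <- data.frame(
  Customer_Age = c(45L, 49L, 51L, 40L),
  Gender = c("M", "F", "M", "F"),
  Card_Category = c("Blue", "Blue", "Blue", "Blue"),
  stringsAsFactors = FALSE
)

# TRUE for numeric columns, FALSE otherwise; vapply guarantees a logical vector.
is_num <- vapply(toy, is.numeric, logical(1))

# Map each column to the kind of plot the loop would draw for it.
plot_type <- ifelse(is_num, "hist", "pie")
print(plot_type)
```

Checking this vector first makes it easy to spot a column that was read with an unexpected type before ten plots are drawn.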