This analysis details the descriptive statistics of the dataset provided by Kaggle.
ToyotaCorolla <- read.csv("C:/Users/frank/OneDrive/Escritorio/DIPLOMADO/Statistics Programming for Business Analytics/Nueva carpeta/ToyotaCorolla.csv")
dim(ToyotaCorolla)
## [1] 1436 10
length(ToyotaCorolla)
## [1] 10
colnames(ToyotaCorolla)
## [1] "Price" "Age" "KM" "FuelType" "HP" "MetColor"
## [7] "Automatic" "CC" "Doors" "Weight"
names(ToyotaCorolla)
## [1] "Price" "Age" "KM" "FuelType" "HP" "MetColor"
## [7] "Automatic" "CC" "Doors" "Weight"
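For a data frame, `length()` counts columns and `names()` returns the same vector as `colnames()`, which is why the calls above repeat the same output. The following is a minimal sketch on a small made-up data frame; `toy` and its values are illustrative assumptions, not rows from the dataset.

```r
# Hypothetical three-row data frame; the values are invented for illustration.
toy <- data.frame(
  Price    = c(13500, 13750, 13950),
  Age      = c(23, 23, 24),
  FuelType = c("Diesel", "Diesel", "Petrol")
)

# A data frame is a list of columns, so length() equals ncol().
same_length <- length(toy) == ncol(toy)

# names() and colnames() return the same character vector for a data frame.
same_names <- identical(names(toy), colnames(toy))

# dim() returns c(number of rows, number of columns).
toy_dim <- dim(toy)

print(same_length)
print(same_names)
print(toy_dim)
```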
summary(ToyotaCorolla)
##      Price            Age              KM           FuelType
##  Min.   : 4350   Min.   : 1.00   Min.   :     1   Length:1436
##  1st Qu.: 8450   1st Qu.:44.00   1st Qu.: 43000   Class :character
##  Median : 9900   Median :61.00   Median : 63390   Mode  :character
##  Mean   :10731   Mean   :55.95   Mean   : 68533
##  3rd Qu.:11950   3rd Qu.:70.00   3rd Qu.: 87021
##  Max.   :32500   Max.   :80.00   Max.   :243000
##        HP           MetColor        Automatic             CC
##  Min.   : 69.0   Min.   :0.0000   Min.   :0.00000   Min.   :1300
##  1st Qu.: 90.0   1st Qu.:0.0000   1st Qu.:0.00000   1st Qu.:1400
##  Median :110.0   Median :1.0000   Median :0.00000   Median :1600
##  Mean   :101.5   Mean   :0.6748   Mean   :0.05571   Mean   :1567
##  3rd Qu.:110.0   3rd Qu.:1.0000   3rd Qu.:0.00000   3rd Qu.:1600
##  Max.   :192.0   Max.   :1.0000   Max.   :1.00000   Max.   :2000
##      Doors           Weight
##  Min.   :2.000   Min.   :1000
##  1st Qu.:3.000   1st Qu.:1040
##  Median :4.000   Median :1070
##  Mean   :4.033   Mean   :1072
##  3rd Qu.:5.000   3rd Qu.:1085
##  Max.   :5.000   Max.   :1615
str(ToyotaCorolla)
## 'data.frame': 1436 obs. of 10 variables:
## $ Price : int 13500 13750 13950 14950 13750 12950 16900 18600 21500 12950 ...
## $ Age : int 23 23 24 26 30 32 27 30 27 23 ...
## $ KM : int 46986 72937 41711 48000 38500 61000 94612 75889 19700 71138 ...
## $ FuelType : chr "Diesel" "Diesel" "Diesel" "Diesel" ...
## $ HP : int 90 90 90 90 90 90 90 90 192 69 ...
## $ MetColor : int 1 1 1 0 0 0 1 1 0 0 ...
## $ Automatic: int 0 0 0 0 0 0 0 0 0 0 ...
## $ CC : int 2000 2000 2000 2000 2000 2000 2000 2000 1800 1900 ...
## $ Doors : int 3 3 3 3 3 3 3 3 3 3 ...
## $ Weight : int 1165 1165 1165 1165 1170 1170 1245 1245 1185 1105 ...
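`str()` reports `FuelType` as `chr`, so `summary()` can only show its length and class, while `MetColor` and `Automatic` are stored as 0/1 integers. One possible follow-up, sketched here on made-up data, is to convert the text column to a factor so that `summary()` returns counts per level, and to read the mean of a 0/1 column as a proportion. The data frame `toy` and its values are hypothetical.

```r
# Hypothetical data frame mimicking the column types seen in str().
toy <- data.frame(
  Price     = c(13500, 13750, 9900, 8450),
  FuelType  = c("Diesel", "Diesel", "Petrol", "CNG"),
  Automatic = c(0L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Converting the character column to a factor makes summary() count each level.
toy$FuelType <- factor(toy$FuelType)
fuel_counts <- summary(toy$FuelType)

# The mean of a 0/1 indicator is the share of rows where it equals 1.
share_automatic <- mean(toy$Automatic)

print(fuel_counts)
print(share_automatic)
```

Whether to convert the indicator columns to factors as well depends on how they are used later; keeping them numeric preserves the proportion reading of the mean.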
The data were analyzed using a for loop:
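# BDN collects the indices of numeric columns, BDC those of non-numeric columns
BDN <- NULL
BDC <- NULL
columna <- ncol(ToyotaCorolla)
# 2 x 5 plotting grid: one panel for each of the 10 columns
par(mfrow = c(2, 5))
for (i in seq_len(columna)) {
  if (is.numeric(ToyotaCorolla[, i])) {
    # Numeric column: draw a histogram
    texto <- paste("Analysis of attribute", colnames(ToyotaCorolla)[i])
    hist(ToyotaCorolla[, i], main = texto, xlab = colnames(ToyotaCorolla)[i])
    BDN <- c(BDN, i)
  } else {
    # Non-numeric column: draw a pie chart of its frequencies
    texto2 <- paste("Analysis of attribute", colnames(ToyotaCorolla)[i])
    pie(table(ToyotaCorolla[, i]), main = texto2)
    BDC <- c(BDC, i)
  }
}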
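The loop leaves the numeric column indices in `BDN` and the non-numeric ones in `BDC`. A minimal sketch of how those indices might then be used to compute descriptive statistics is shown below. It runs on a small made-up data frame (`toy`, hypothetical values) and builds the indices with `vapply()` instead of the plotting loop, which is an alternative approach rather than the one used in this report.

```r
# Hypothetical data frame; the values are invented for illustration.
toy <- data.frame(
  Price    = c(13500, 13750, 9900, 8450),
  KM       = c(46986, 72937, 63390, 87021),
  FuelType = c("Diesel", "Diesel", "Petrol", "Petrol"),
  stringsAsFactors = FALSE
)

# Same classification as the loop, written without plotting.
is_num <- vapply(toy, is.numeric, logical(1))
BDN <- which(is_num)    # indices of numeric columns
BDC <- which(!is_num)   # indices of non-numeric columns

# Mean, standard deviation and median for each numeric column.
num_stats <- sapply(toy[, BDN, drop = FALSE], function(x) {
  c(mean = mean(x), sd = sd(x), median = median(x))
})

# Frequency table for each non-numeric column.
cat_freq <- lapply(toy[, BDC, drop = FALSE], table)

print(num_stats)
print(cat_freq)
```

Using `drop = FALSE` keeps the subset a data frame even when only one column is selected, so `sapply()` and `lapply()` still iterate over columns.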