The data come from the UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets.html. This example uses the “wine” dataset.

The column names are listed in https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.names, a document that describes the contents of the dataset. (Open it and look at section 4, which gives relevant background information as well as the list of attributes, or variables.)

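rm(list=ls())  # clear the workspace of the current session
library(data.table)
urlData <- "https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data"
# wine.data has no header row, so fread assigns default names (V1, V2, ...)
data <- fread(urlData)
# Column names, following the attribute list in wine.names
columnas <- c('clase','Alcohol','Acido Malic','Ceniza','Alcalinidad de la Ceniza',
              'Magnesio','Fenoloes Totales','Flavanoides','Fenoles Noflavanoides',
              'Proscaracyanins','Intensidad Color','Hue','OD280/OD315 of diluted wines','Proline')
# Convert the data.table to a plain data.frame, then assign the names
data <- as.data.frame(data)
colnames(data) <- columnas
head(data)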
##   clase Alcohol Acido Malic Ceniza Alcalinidad de la Ceniza Magnesio
## 1     1   14.23        1.71   2.43                     15.6      127
## 2     1   13.20        1.78   2.14                     11.2      100
## 3     1   13.16        2.36   2.67                     18.6      101
## 4     1   14.37        1.95   2.50                     16.8      113
## 5     1   13.24        2.59   2.87                     21.0      118
## 6     1   14.20        1.76   2.45                     15.2      112
##   Fenoloes Totales Flavanoides Fenoles Noflavanoides Proscaracyanins
## 1             2.80        3.06                  0.28            2.29
## 2             2.65        2.76                  0.26            1.28
## 3             2.80        3.24                  0.30            2.81
## 4             3.85        3.49                  0.24            2.18
## 5             2.80        2.69                  0.39            1.82
## 6             3.27        3.39                  0.34            1.97
##   Intensidad Color  Hue OD280/OD315 of diluted wines Proline
## 1             5.64 1.04                         3.92    1065
## 2             4.38 1.05                         3.40    1050
## 3             5.68 1.03                         3.17    1185
## 4             7.80 0.86                         3.45    1480
## 5             4.32 1.04                         2.93     735
## 6             6.75 1.05                         2.85    1450
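As a possible alternative to renaming the columns after the download, the names can be supplied while the file is read. The sketch below uses base R `read.csv` with `header = FALSE`, `col.names`, and `check.names = FALSE` (so that names containing spaces are kept as written). The inline text `muestra` and the three-name vector `nombres` are stand-ins introduced for illustration, with values copied from the first rows shown above; for the real file one would presumably pass `urlData` and the full `columnas` vector instead.

```r
# Illustrative stand-in for wine.data: first three columns of the first three rows
muestra <- "1,14.23,1.71\n1,13.20,1.78\n1,13.16,2.36"
nombres <- c("clase", "Alcohol", "Acido Malic")

# header = FALSE: the file has no header row
# check.names = FALSE: keep names such as "Acido Malic" unchanged
vino <- read.csv(text = muestra, header = FALSE,
                 col.names = nombres, check.names = FALSE)

names(vino)
str(vino)
```

Supplying the names at read time avoids the intermediate `V1`, `V2`, ... columns and keeps the import in a single step.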

The dataset is divided into three classes which, according to wine.names, correspond to three cultivars grown in the same region of Italy. We select the rows belonging to each class.

# clase is numeric, so compare against numbers; logical indexing makes which() unnecessary
clase1 <- data[data$clase == 1, ]
clase2 <- data[data$clase == 2, ]
clase3 <- data[data$clase == 3, ]
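The three subsets can also be obtained with a single `split()` call, which returns a named list holding one data frame per class. This is a minimal sketch on a toy data frame; `ejemplo` and its values are invented for illustration, and with the real data one would pass `data` and `data$clase`.

```r
# Toy data standing in for the wine data frame (values are invented)
ejemplo <- data.frame(clase = c(1, 1, 2, 3, 3, 3),
                      Alcohol = c(14.2, 13.2, 12.3, 13.1, 12.9, 13.4))

# One data frame per value of clase; the list elements are named "1", "2", "3"
por_clase <- split(ejemplo, ejemplo$clase)

# Number of rows in each class
conteos <- table(ejemplo$clase)
print(conteos)

# Same rows as ejemplo[ejemplo$clase == 3, ]
por_clase[["3"]]
```

Keeping the subsets in a list makes it easy to apply the same function to every class, for example `lapply(por_clase, summary)`.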

Summary statistics are computed for each class:

summary(clase1)
##      clase      Alcohol       Acido Malic        Ceniza     
##  Min.   :1   Min.   :12.85   Min.   :1.350   Min.   :2.040  
##  1st Qu.:1   1st Qu.:13.40   1st Qu.:1.665   1st Qu.:2.295  
##  Median :1   Median :13.75   Median :1.770   Median :2.440  
##  Mean   :1   Mean   :13.74   Mean   :2.011   Mean   :2.456  
##  3rd Qu.:1   3rd Qu.:14.10   3rd Qu.:1.935   3rd Qu.:2.615  
##  Max.   :1   Max.   :14.83   Max.   :4.040   Max.   :3.220  
##  Alcalinidad de la Ceniza    Magnesio     Fenoloes Totales  Flavanoides   
##  Min.   :11.20            Min.   : 89.0   Min.   :2.20     Min.   :2.190  
##  1st Qu.:16.00            1st Qu.: 98.0   1st Qu.:2.60     1st Qu.:2.680  
##  Median :16.80            Median :104.0   Median :2.80     Median :2.980  
##  Mean   :17.04            Mean   :106.3   Mean   :2.84     Mean   :2.982  
##  3rd Qu.:18.70            3rd Qu.:114.0   3rd Qu.:3.00     3rd Qu.:3.245  
##  Max.   :25.00            Max.   :132.0   Max.   :3.88     Max.   :3.930  
##  Fenoles Noflavanoides Proscaracyanins Intensidad Color      Hue       
##  Min.   :0.170         Min.   :1.250   Min.   :3.520    Min.   :0.820  
##  1st Qu.:0.255         1st Qu.:1.640   1st Qu.:4.550    1st Qu.:0.995  
##  Median :0.290         Median :1.870   Median :5.400    Median :1.070  
##  Mean   :0.290         Mean   :1.899   Mean   :5.528    Mean   :1.062  
##  3rd Qu.:0.320         3rd Qu.:2.090   3rd Qu.:6.225    3rd Qu.:1.130  
##  Max.   :0.500         Max.   :2.960   Max.   :8.900    Max.   :1.280  
##  OD280/OD315 of diluted wines    Proline      
##  Min.   :2.510                Min.   : 680.0  
##  1st Qu.:2.870                1st Qu.: 987.5  
##  Median :3.170                Median :1095.0  
##  Mean   :3.158                Mean   :1115.7  
##  3rd Qu.:3.420                3rd Qu.:1280.0  
##  Max.   :4.000                Max.   :1680.0
summary(clase2)
##      clase      Alcohol       Acido Malic        Ceniza     
##  Min.   :2   Min.   :11.03   Min.   :0.740   Min.   :1.360  
##  1st Qu.:2   1st Qu.:11.91   1st Qu.:1.270   1st Qu.:2.000  
##  Median :2   Median :12.29   Median :1.610   Median :2.240  
##  Mean   :2   Mean   :12.28   Mean   :1.933   Mean   :2.245  
##  3rd Qu.:2   3rd Qu.:12.52   3rd Qu.:2.145   3rd Qu.:2.420  
##  Max.   :2   Max.   :13.86   Max.   :5.800   Max.   :3.230  
##  Alcalinidad de la Ceniza    Magnesio      Fenoloes Totales
##  Min.   :10.60            Min.   : 70.00   Min.   :1.100   
##  1st Qu.:18.00            1st Qu.: 85.50   1st Qu.:1.895   
##  Median :20.00            Median : 88.00   Median :2.200   
##  Mean   :20.24            Mean   : 94.55   Mean   :2.259   
##  3rd Qu.:22.00            3rd Qu.: 99.50   3rd Qu.:2.560   
##  Max.   :30.00            Max.   :162.00   Max.   :3.520   
##   Flavanoides    Fenoles Noflavanoides Proscaracyanins Intensidad Color
##  Min.   :0.570   Min.   :0.1300        Min.   :0.410   Min.   :1.280   
##  1st Qu.:1.605   1st Qu.:0.2700        1st Qu.:1.350   1st Qu.:2.535   
##  Median :2.030   Median :0.3700        Median :1.610   Median :2.900   
##  Mean   :2.081   Mean   :0.3637        Mean   :1.630   Mean   :3.087   
##  3rd Qu.:2.475   3rd Qu.:0.4300        3rd Qu.:1.885   3rd Qu.:3.400   
##  Max.   :5.080   Max.   :0.6600        Max.   :3.580   Max.   :6.000   
##       Hue        OD280/OD315 of diluted wines    Proline     
##  Min.   :0.690   Min.   :1.590                Min.   :278.0  
##  1st Qu.:0.925   1st Qu.:2.440                1st Qu.:406.5  
##  Median :1.040   Median :2.830                Median :495.0  
##  Mean   :1.056   Mean   :2.785                Mean   :519.5  
##  3rd Qu.:1.205   3rd Qu.:3.160                3rd Qu.:625.0  
##  Max.   :1.710   Max.   :3.690                Max.   :985.0
summary(clase3)
##      clase      Alcohol       Acido Malic        Ceniza     
##  Min.   :3   Min.   :12.20   Min.   :1.240   Min.   :2.100  
##  1st Qu.:3   1st Qu.:12.80   1st Qu.:2.587   1st Qu.:2.300  
##  Median :3   Median :13.16   Median :3.265   Median :2.380  
##  Mean   :3   Mean   :13.15   Mean   :3.334   Mean   :2.437  
##  3rd Qu.:3   3rd Qu.:13.51   3rd Qu.:3.958   3rd Qu.:2.603  
##  Max.   :3   Max.   :14.34   Max.   :5.650   Max.   :2.860  
##  Alcalinidad de la Ceniza    Magnesio      Fenoloes Totales
##  Min.   :17.50            Min.   : 80.00   Min.   :0.980   
##  1st Qu.:20.00            1st Qu.: 89.75   1st Qu.:1.407   
##  Median :21.00            Median : 97.00   Median :1.635   
##  Mean   :21.42            Mean   : 99.31   Mean   :1.679   
##  3rd Qu.:23.00            3rd Qu.:106.00   3rd Qu.:1.808   
##  Max.   :27.00            Max.   :123.00   Max.   :2.800   
##   Flavanoides     Fenoles Noflavanoides Proscaracyanins Intensidad Color
##  Min.   :0.3400   Min.   :0.1700        Min.   :0.550   Min.   : 3.850  
##  1st Qu.:0.5800   1st Qu.:0.3975        1st Qu.:0.855   1st Qu.: 5.438  
##  Median :0.6850   Median :0.4700        Median :1.105   Median : 7.550  
##  Mean   :0.7815   Mean   :0.4475        Mean   :1.154   Mean   : 7.396  
##  3rd Qu.:0.9200   3rd Qu.:0.5300        3rd Qu.:1.350   3rd Qu.: 9.225  
##  Max.   :1.5700   Max.   :0.6300        Max.   :2.700   Max.   :13.000  
##       Hue         OD280/OD315 of diluted wines    Proline     
##  Min.   :0.4800   Min.   :1.270                Min.   :415.0  
##  1st Qu.:0.5875   1st Qu.:1.510                1st Qu.:545.0  
##  Median :0.6650   Median :1.660                Median :627.5  
##  Mean   :0.6827   Mean   :1.684                Mean   :629.9  
##  3rd Qu.:0.7525   3rd Qu.:1.820                3rd Qu.:695.0  
##  Max.   :0.9600   Max.   :2.470                Max.   :880.0
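To compare the classes side by side instead of reading three separate summaries, one option is `aggregate()`, which applies a function to every variable within each group. The sketch below runs on invented toy values; `ejemplo` is a hypothetical stand-in, and with the real data one would presumably use `data[, -1]` and `data$clase`.

```r
# Toy data standing in for the wine data frame (values are invented)
ejemplo <- data.frame(clase = c(1, 1, 2, 2, 3, 3),
                      Alcohol = c(14.0, 13.0, 12.0, 12.5, 13.0, 13.5),
                      Magnesio = c(127, 100, 88, 90, 97, 99))

# Mean of every variable (all columns except clase) within each class
medias <- aggregate(ejemplo[, -1], by = list(clase = ejemplo$clase), FUN = mean)
print(medias)
```

Applied to the full dataset, the `Alcohol` column of such a table should agree with the class means reported in the summaries above (13.74, 12.28, and 13.15).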

Histograms of each variable within each class:

hist(clase1$Alcohol, main = "Alcohol Histogram", xlab = "Alcohol content", ylab = "Frequency")
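Only one histogram call is shown above. To cover every variable in a class, the call can be wrapped in a loop. The helper `histogramas_clase` below is a hypothetical name introduced for illustration, and the toy data frame is invented; with the real data one would call it on `clase1`, `clase2`, and `clase3`.

```r
# Hypothetical helper: draws one histogram per column other than clase
# (assumes the remaining columns are numeric, as in the wine data)
histogramas_clase <- function(df, etiqueta) {
  variables <- setdiff(names(df), "clase")
  for (v in variables) {
    hist(df[[v]],
         main = paste("Histogram of", v, "-", etiqueta),
         xlab = v, ylab = "Frequency")
  }
  invisible(variables)  # return the names of the plotted columns
}

# Toy data standing in for clase1 (values are invented)
ejemplo <- data.frame(clase = 1,
                      Alcohol = c(14.23, 13.20, 13.16, 14.37),
                      Magnesio = c(127, 100, 101, 113))
graficadas <- histogramas_clase(ejemplo, "class 1")
```

Using `df[[v]]` rather than `df$v` lets the loop handle column names that contain spaces, such as `Acido Malic`.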