Objective

This exercise applies hierarchical clustering to a nutritional dataset in order to identify suitable food combinations. The dataset is available on Kaggle (https://www.kaggle.com/ofrancisco/emoji-diet-nutritional-data-sr28/data).

#Load the libraries

#Load the dataset and inspect its variables

##  [1] "names"                   "Calories (kcal)"        
##  [3] "Carbohydrates (g)"       "Total Sugar (g)"        
##  [5] "Protein (g)"             "Total Fat (g)"          
##  [7] "Saturated Fat (g)"       "Monounsaturated Fat (g)"
##  [9] "Polyunsaturated Fat (g)" "Total Fiber (g)"        
## [11] "Cholesterol (mg)"        "Vitamin B6 (mg)"        
## [13] "Vitamin A (IU)"          "Vitamin B12 (ug)"       
## [15] "Vitamin C (mg)"          "Vitamin D (IU)"         
## [17] "Vitamin E (IU)"          "Vitamin K (ug)"         
## [19] "Thiamin (mg)"            "Riboflavin (mg)"        
## [21] "Niacin (mg)"             "Folate (ug)"            
## [23] "Pantothenic Acid (mg)"   "Choline (mg)"           
## [25] "Calcium (g)"             "Copper (mg)"            
## [27] "Iron (mg)"               "Magnesium (mg)"         
## [29] "Manganese (mg)"          "Phosphorus (g)"         
## [31] "Potassium (g)"           "Selenium (ug)"          
## [33] "Sodium (g)"              "Zinc (mg)"
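
The loading code is not shown in this report. As a minimal sketch of how a listing like the one above could be produced, the block below writes a tiny hypothetical CSV to a temporary file and reads it back with base R. The toy rows, their values, and the temporary path are assumptions for illustration only; they are not the Kaggle file or its real name.

```r
# Hypothetical toy rows standing in for the real file (illustrative values only).
toy <- data.frame(
  names = c("apple", "cheese"),
  `Calories (kcal)` = c(0.52, 4.03),
  `Protein (g)` = c(0.003, 0.25),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Placeholder path; replace it with the location of the downloaded CSV.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# check.names = FALSE keeps headers such as "Calories (kcal)" unchanged
# instead of converting them to syntactic names like Calories..kcal.
foods <- read.csv(csv_path, check.names = FALSE, stringsAsFactors = FALSE)
print(names(foods))
```

Keeping the original headers means columns have to be referenced with backticks or `[[ ]]`, for example `foods[["Calories (kcal)"]]`.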

Compute descriptive statistics

##  Calories (kcal) Carbohydrates (g) Total Sugar (g)    Protein (g)      
##  Min.   :0.010   Min.   :0.00000   Min.   :0.00000   Min.   :0.000000  
##  1st Qu.:0.540   1st Qu.:0.06822   1st Qu.:0.00000   1st Qu.:0.006825  
##  Median :1.395   Median :0.16070   Median :0.03080   Median :0.020050  
##  Mean   :1.920   Mean   :0.23736   Mean   :0.08027   Mean   :0.050148  
##  3rd Qu.:2.735   3rd Qu.:0.30090   3rd Qu.:0.09440   3rd Qu.:0.069000  
##  Max.   :8.980   Max.   :0.98000   Max.   :0.82120   Max.   :0.274800  
##  Total Fat (g)     
##  Min.   :0.000000  
##  1st Qu.:0.001825  
##  Median :0.007250  
##  Mean   :0.084650  
##  3rd Qu.:0.125875  
##  Max.   :0.995000
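
A table with this layout is what base R's `summary()` prints for numeric columns. The sketch below shows the idea on a small hypothetical data frame; the food names and values are invented for illustration and are not taken from the dataset.

```r
# Hypothetical toy data (illustrative values, not the Kaggle dataset).
foods <- data.frame(
  names = c("apple", "banana", "cheese", "butter", "bread", "rice"),
  `Calories (kcal)` = c(0.52, 0.89, 4.03, 7.17, 2.65, 1.30),
  `Carbohydrates (g)` = c(0.14, 0.23, 0.01, 0.00, 0.49, 0.28),
  `Protein (g)` = c(0.003, 0.011, 0.25, 0.009, 0.09, 0.027),
  `Total Fat (g)` = c(0.002, 0.003, 0.33, 0.81, 0.032, 0.003),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the numeric columns; the name column carries no statistics.
numeric_cols <- sapply(foods, is.numeric)

# Min, quartiles, median, mean and max for each numeric variable.
stats_table <- summary(foods[, numeric_cols])
print(stats_table)
```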

#Scatter plots
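
The plots themselves are not reproduced here. One common way to get pairwise scatter plots in base R is `pairs()`; the sketch below uses a hypothetical toy data frame, and the choice of `pairs()` is an assumption rather than the method used in the original analysis.

```r
# Hypothetical toy data (illustrative values, not the Kaggle dataset).
foods <- data.frame(
  names = c("apple", "banana", "cheese", "butter", "bread", "rice"),
  `Calories (kcal)` = c(0.52, 0.89, 4.03, 7.17, 2.65, 1.30),
  `Carbohydrates (g)` = c(0.14, 0.23, 0.01, 0.00, 0.49, 0.28),
  `Protein (g)` = c(0.003, 0.011, 0.25, 0.009, 0.09, 0.027),
  `Total Fat (g)` = c(0.002, 0.003, 0.33, 0.81, 0.032, 0.003),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Numeric variables only; pairs() draws one panel per pair of columns.
num <- foods[, sapply(foods, is.numeric)]
pairs(num, main = "Pairwise scatter plots (toy data)")
```

With all 33 numeric variables of the real dataset the grid becomes hard to read, so plotting a subset of columns is usually more practical.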

Computing the correlation matrix
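
The matrix itself is not shown in this report. As a minimal sketch, base R's `cor()` returns the Pearson correlation between every pair of numeric columns; the toy data below is hypothetical, and Pearson is simply the function's default, not a confirmed choice of the original analysis.

```r
# Hypothetical toy data (illustrative values, not the Kaggle dataset).
foods <- data.frame(
  names = c("apple", "banana", "cheese", "butter", "bread", "rice"),
  `Calories (kcal)` = c(0.52, 0.89, 4.03, 7.17, 2.65, 1.30),
  `Carbohydrates (g)` = c(0.14, 0.23, 0.01, 0.00, 0.49, 0.28),
  `Protein (g)` = c(0.003, 0.011, 0.25, 0.009, 0.09, 0.027),
  `Total Fat (g)` = c(0.002, 0.003, 0.33, 0.81, 0.032, 0.003),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

num <- foods[, sapply(foods, is.numeric)]

# Square, symmetric matrix with ones on the diagonal.
cor_mat <- cor(num)
print(round(cor_mat, 2))
```

Strongly correlated variables carry overlapping information, which is worth keeping in mind because each of them still contributes separately to the distances used for clustering.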

Dendrogram visualization
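
The dendrogram image is not reproduced here. A typical base R workflow is to standardize the variables, compute pairwise distances, and pass them to `hclust()`. The sketch below follows that workflow on hypothetical toy data; the Euclidean distance and the `"ward.D2"` linkage are assumptions, since the report does not state which were used.

```r
# Hypothetical toy data (illustrative values, not the Kaggle dataset).
foods <- data.frame(
  names = c("apple", "banana", "cheese", "butter", "bread", "rice"),
  `Calories (kcal)` = c(0.52, 0.89, 4.03, 7.17, 2.65, 1.30),
  `Carbohydrates (g)` = c(0.14, 0.23, 0.01, 0.00, 0.49, 0.28),
  `Protein (g)` = c(0.003, 0.011, 0.25, 0.009, 0.09, 0.027),
  `Total Fat (g)` = c(0.002, 0.003, 0.33, 0.81, 0.032, 0.003),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Standardize so that variables on larger scales do not dominate the distances.
scaled <- scale(foods[, sapply(foods, is.numeric)])
rownames(scaled) <- foods$names

# Euclidean distances and Ward linkage (both assumed, see the note above).
d <- dist(scaled, method = "euclidean")
hc <- hclust(d, method = "ward.D2")

# Leaves are labeled with the food names taken from the row names.
plot(hc, main = "Dendrogram (toy data)", xlab = "", sub = "")
```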

## # A tibble: 3 x 35
##   names `Calories (kcal~ `Carbohydrates ~ `Total Sugar (g~ `Protein (g)`
##   <chr>            <dbl>            <dbl>            <dbl>         <dbl>
## 1 grap~           -0.685           -0.243            0.493        -0.634
## 2 melon           -0.914           -0.739           -0.155        -0.577
## 3 wate~           -0.902           -0.697           -0.121        -0.650
## # ... with 30 more variables: `Total Fat (g)` <dbl>, `Saturated Fat (g)` <dbl>,
## #   `Monounsaturated Fat (g)` <dbl>, `Polyunsaturated Fat (g)` <dbl>, `Total
## #   Fiber (g)` <dbl>, `Cholesterol (mg)` <dbl>, `Vitamin B6 (mg)` <dbl>,
## #   `Vitamin A (IU)` <dbl>, `Vitamin B12 (ug)` <dbl>, `Vitamin C (mg)` <dbl>,
## #   `Vitamin D (IU)` <dbl>, `Vitamin E (IU)` <dbl>, `Vitamin K (ug)` <dbl>,
## #   `Thiamin (mg)` <dbl>, `Riboflavin (mg)` <dbl>, `Niacin (mg)` <dbl>, `Folate
## #   (ug)` <dbl>, `Pantothenic Acid (mg)` <dbl>, `Choline (mg)` <dbl>, `Calcium
## #   (g)` <dbl>, `Copper (mg)` <dbl>, `Iron (mg)` <dbl>, `Magnesium (mg)` <dbl>,
## #   `Manganese (mg)` <dbl>, `Phosphorus (g)` <dbl>, `Potassium (g)` <dbl>,
## #   `Selenium (ug)` <dbl>, `Sodium (g)` <dbl>, `Zinc (mg)` <dbl>, ID <chr>
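
The rows above contain values centered around zero, which is consistent with standardized variables, and they appear to list the members of one group of similar foods. A minimal sketch of how such a group could be extracted is to cut the tree with `cutree()` and filter by cluster label. The toy data, the linkage method, and the number of clusters `k = 3` are all assumptions made for illustration.

```r
# Hypothetical toy data (illustrative values, not the Kaggle dataset).
foods <- data.frame(
  names = c("apple", "banana", "cheese", "butter", "bread", "rice"),
  `Calories (kcal)` = c(0.52, 0.89, 4.03, 7.17, 2.65, 1.30),
  `Carbohydrates (g)` = c(0.14, 0.23, 0.01, 0.00, 0.49, 0.28),
  `Protein (g)` = c(0.003, 0.011, 0.25, 0.009, 0.09, 0.027),
  `Total Fat (g)` = c(0.002, 0.003, 0.33, 0.81, 0.032, 0.003),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

scaled <- scale(foods[, sapply(foods, is.numeric)])
rownames(scaled) <- foods$names
hc <- hclust(dist(scaled), method = "ward.D2")  # linkage is an assumption

# Cut the tree into k groups; k = 3 is an arbitrary choice for the toy data.
groups <- cutree(hc, k = 3)

# Attach the cluster label to the standardized values.
result <- data.frame(names = foods$names, round(scaled, 3),
                     cluster = groups, check.names = FALSE)

# Foods that fall in the same cluster as "apple".
same_cluster <- result[result$cluster == groups[["apple"]], ]
print(same_cluster)
```

In practice the number of clusters is usually chosen by inspecting the dendrogram for a height at which the groups separate clearly.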