The aim of this work is to apply dimension reduction to the Open Food Facts data and then to perform clustering on the reduced data.
# Load the data
open_food_facts <- read.csv("trimmed_open_food_facts_dataset.csv", sep = ",")
head(open_food_facts)
## product_name brands manufacturing_places
## 1 Roasted & Salted Pistachios Paramount Farms
## 2 Panettone Coop
## 3 Creme legere semi epaisse Carrefour
## 4 Organic Cheese Crackers Full Circle
## 5 Miel de Montagne Eric Bur
## 6 Gummies, Peach Rings Kroger
## purchase_places allergens traces additives_n ingredients_from_palm_oil_n
## 1 0 0
## 2 NA NA
## 3 0 0
## 4 3 0
## 5 Lyon,France 0 0
## 6 7 0
## energy.from.fat_100g trans.fat_100g sugars_100g fiber_100g proteins_100g
## 1 NA 0 3.9 5.2 11.69
## 2 NA NA 29.0 2.0 7.00
## 3 NA NA 3.1 NA 2.80
## 4 NA 0 0.0 3.3 6.67
## 5 NA NA 80.0 0.0 0.40
## 6 NA 0 50.0 0.0 5.00
## sodium_100g alcohol_100g fruits.vegetables.nuts_100g cocoa_100g
## 1 0.286000000 NA NA NA
## 2 0.157480315 NA NA NA
## 3 0.055118110 NA NA NA
## 4 0.900000000 NA NA NA
## 5 0.002362205 NA NA NA
## 6 0.012000000 NA NA NA
## nutrition_grade_fr
## 1 a
## 2 d
## 3 d
## 4 d
## 5 d
## 6 d
str(open_food_facts)
## 'data.frame': 1500 obs. of 18 variables:
## $ product_name : chr "Roasted & Salted Pistachios" "Panettone" "Creme legere semi epaisse" "Organic Cheese Crackers" ...
## $ brands : chr "Paramount Farms" "Coop" "Carrefour" "Full Circle" ...
## $ manufacturing_places : chr "" "" "" "" ...
## $ purchase_places : chr "" "" "" "" ...
## $ allergens : chr "" "" "" "" ...
## $ traces : chr "" "" "" "" ...
## $ additives_n : num 0 NA 0 3 0 7 0 5 1 NA ...
## $ ingredients_from_palm_oil_n: num 0 NA 0 0 0 0 0 0 0 NA ...
## $ energy.from.fat_100g : num NA NA NA NA NA NA NA NA NA NA ...
## $ trans.fat_100g : num 0 NA NA 0 NA 0 NA NA 0 NA ...
## $ sugars_100g : num 3.9 29 3.1 0 80 50 30 1.5 2.27 4 ...
## $ fiber_100g : num 5.2 2 NA 3.3 0 0 NA 2 2.3 NA ...
## $ proteins_100g : num 11.69 7 2.8 6.67 0.4 ...
## $ sodium_100g : num 0.286 0.15748 0.05512 0.9 0.00236 ...
## $ alcohol_100g : num NA NA NA NA NA NA NA 0 NA NA ...
## $ fruits.vegetables.nuts_100g: num NA NA NA NA NA NA NA NA NA NA ...
## $ cocoa_100g : num NA NA NA NA NA NA NA NA NA NA ...
## $ nutrition_grade_fr : chr "a" "d" "d" "d" ...
summary(open_food_facts)
## product_name brands manufacturing_places purchase_places
## Length:1500 Length:1500 Length:1500 Length:1500
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## allergens traces additives_n
## Length:1500 Length:1500 Min. : 0.000
## Class :character Class :character 1st Qu.: 0.000
## Mode :character Mode :character Median : 1.000
## Mean : 2.041
## 3rd Qu.: 3.000
## Max. :22.000
## NA's :146
## ingredients_from_palm_oil_n energy.from.fat_100g trans.fat_100g
## Min. :0.00000 Min. : 0.0 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.: 65.0 1st Qu.:0.0000
## Median :0.00000 Median :130.0 Median :0.0000
## Mean :0.03323 Mean :193.3 Mean :0.0251
## 3rd Qu.:0.00000 3rd Qu.:290.0 3rd Qu.:0.0000
## Max. :2.00000 Max. :450.0 Max. :4.5500
## NA's :146 NA's :1497 NA's :746
## sugars_100g fiber_100g proteins_100g sodium_100g
## Min. : 0.00 Min. : 0.00 Min. : 0.000 Min. : 0.0000
## 1st Qu.: 1.23 1st Qu.: 0.00 1st Qu.: 1.820 1st Qu.: 0.0330
## Median : 5.00 Median : 1.40 Median : 5.710 Median : 0.2362
## Mean : 14.92 Mean : 2.79 Mean : 7.834 Mean : 0.4476
## 3rd Qu.: 23.00 3rd Qu.: 3.60 3rd Qu.:11.000 3rd Qu.: 0.5118
## Max. :100.00 Max. :51.60 Max. :55.000 Max. :23.7500
## NA's :1 NA's :254 NA's :1 NA's :1
## alcohol_100g fruits.vegetables.nuts_100g cocoa_100g nutrition_grade_fr
## Min. :0 Min. : 0.00 Min. :28.00 Length:1500
## 1st Qu.:0 1st Qu.: 7.75 1st Qu.:29.50 Class :character
## Median :0 Median : 45.00 Median :38.50 Mode :character
## Mean :0 Mean : 42.19 Mean :43.75
## 3rd Qu.:0 3rd Qu.: 53.75 3rd Qu.:52.75
## Max. :0 Max. :100.00 Max. :70.00
## NA's :1489 NA's :1486 NA's :1496
dim(open_food_facts)
## [1] 1500 18
# Remove columns in which 90% or more of the values are missing (NA)
data_cleaned <- open_food_facts[, colSums(is.na(open_food_facts)) / nrow(open_food_facts) < 0.9]
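The filter above can be illustrated on a small, made-up data frame (the column names and values below are hypothetical, not taken from the Open Food Facts data). The sketch also shows that `is.na()` counts only `NA`, so a character column holding empty strings is treated as fully populated and is kept.

```r
# Toy data frame with 10 rows; all names and values are invented for illustration.
toy <- data.frame(
  mostly_missing = rep(NA_real_, 10),                  # 100% NA
  partly_missing = c(1, NA, 3, 4, NA, 6, 7, 8, 9, 10), # 20% NA
  empty_strings  = rep("", 10)                         # 0% NA: "" is not NA
)

# Fraction of NA values per column (same quantity as colSums(...) / nrow(...))
na_fraction <- colMeans(is.na(toy))
print(na_fraction)

# Keep only the columns whose NA fraction is strictly below 0.9
toy_cleaned <- toy[, na_fraction < 0.9, drop = FALSE]
print(names(toy_cleaned))
```

If empty strings should also count as missing, one option is to convert them to `NA` before filtering, for example by passing `na.strings = c("", "NA")` to `read.csv()`.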
# Impute the remaining missing values
library(mice)
# mice() fills in the missing values of data_cleaned by multiple imputation
# method = "pmm" (predictive mean matching) predicts each missing value with a regression model, then replaces it with an observed value whose prediction is closest, so imputed values stay within the range of the observed data
imputed_data <- mice(data_cleaned, method = "pmm", m = 5, maxit = 50, seed = 486)
##
## iter imp variable
## 1 1 additives_n ingredients_from_palm_oil_n trans.fat_100g sugars_100g fiber_100g proteins_100g sodium_100g
## 1 2 additives_n ingredients_from_palm_oil_n trans.fat_100g sugars_100g fiber_100g proteins_100g sodium_100g
## ... (one line per iteration 1 to 50 and imputation 1 to 5, each listing the same seven variables)
## 50 5 additives_n ingredients_from_palm_oil_n trans.fat_100g sugars_100g fiber_100g proteins_100g sodium_100g
## Warning: Number of logged events: 7
# Extract a completed dataset with the missing values filled in
data_complete <- complete(imputed_data)
# Check variable types
sapply(data_complete, class)
## product_name brands
## "character" "character"
## manufacturing_places purchase_places
## "character" "character"
## allergens traces
## "character" "character"
## additives_n ingredients_from_palm_oil_n
## "numeric" "numeric"
## trans.fat_100g sugars_100g
## "numeric" "numeric"
## fiber_100g proteins_100g
## "numeric" "numeric"
## sodium_100g nutrition_grade_fr
## "numeric" "character"
# Convert character columns to factors, then to their integer level codes
data_complete <- data_complete %>%
  mutate_if(is.character, as.factor) %>%
  mutate_if(is.factor, as.numeric)
# To view the summary of the data
summary(data_complete)
## product_name brands manufacturing_places purchase_places
## Min. : 1.0 Min. : 1.0 Min. : 1.000 Min. : 1.0
## 1st Qu.: 356.8 1st Qu.: 251.0 1st Qu.: 1.000 1st Qu.: 1.0
## Median : 723.5 Median : 557.0 Median : 1.000 Median : 1.0
## Mean : 723.7 Mean : 561.3 Mean : 7.663 Mean : 13.5
## 3rd Qu.:1089.2 3rd Qu.: 864.2 3rd Qu.: 1.000 3rd Qu.: 1.0
## Max. :1458.0 Max. :1164.0 Max. :111.000 Max. :131.0
## allergens traces additives_n
## Min. : 1.00 Min. : 1.000 Min. : 0.000
## 1st Qu.: 1.00 1st Qu.: 1.000 1st Qu.: 0.000
## Median : 1.00 Median : 1.000 Median : 1.000
## Mean : 12.82 Mean : 5.603 Mean : 2.003
## 3rd Qu.: 1.00 3rd Qu.: 1.000 3rd Qu.: 3.000
## Max. :175.00 Max. :108.000 Max. :22.000
## ingredients_from_palm_oil_n trans.fat_100g sugars_100g
## Min. :0.00000 Min. :0.00000 Min. : 0.000
## 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.: 1.245
## Median :0.00000 Median :0.00000 Median : 5.000
## Mean :0.03333 Mean :0.02543 Mean : 14.938
## 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.: 23.020
## Max. :2.00000 Max. :4.55000 Max. :100.000
## fiber_100g proteins_100g sodium_100g nutrition_grade_fr
## Min. : 0.000 Min. : 0.000 Min. : 0.0000 Min. :1.000
## 1st Qu.: 0.000 1st Qu.: 1.830 1st Qu.: 0.0330 1st Qu.:2.000
## Median : 1.400 Median : 5.705 Median : 0.2362 Median :3.000
## Mean : 2.729 Mean : 7.832 Mean : 0.4476 Mean :3.212
## 3rd Qu.: 3.600 3rd Qu.:11.000 3rd Qu.: 0.5118 3rd Qu.:4.000
## Max. :51.600 Max. :55.000 Max. :23.7500 Max. :5.000
## [1] FALSE
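The conversion of character columns to numbers above replaces each distinct string with the position of that string in the sorted list of factor levels. The sketch below illustrates this on a hypothetical toy vector (the values are made up and are not taken from the dataset). It shows why an empty string receives code 1, which is consistent with the median of 1 seen for columns such as manufacturing_places, and why the resulting numbers are labels rather than measurements.

```r
# Hypothetical toy column; the values are made up for illustration.
places <- c("France", "", "Germany", "", "France")

# as.factor() sorts the distinct strings into levels;
# as.numeric() then returns each value's position in that level list.
place_factor <- as.factor(places)
place_codes <- as.numeric(place_factor)

levels(place_factor)  # the empty string sorts first
place_codes           # integer codes, one per observation
```

Because the codes only reflect alphabetical order, distances between them carry no nutritional or geographic meaning.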
# Normalize data for comparability
data_normalized <- data.Normalization(data_complete, type = "n1", normalization = "column")
str(data_normalized)
## 'data.frame': 1500 obs. of 14 variables:
## $ product_name : num 1.039 0.615 -0.833 0.49 0.241 ...
## $ brands : num 0.681 -0.987 -1.169 -0.51 -0.677 ...
## $ manufacturing_places : num -0.331 -0.331 -0.331 -0.331 -0.331 ...
## $ purchase_places : num -0.436 -0.436 -0.436 -0.436 2.282 ...
## $ allergens : num -0.335 -0.335 -0.335 -0.335 -0.335 ...
## $ traces : num -0.268 -0.268 -0.268 -0.268 -0.268 ...
## $ additives_n : num -0.75 -0.376 -0.75 0.373 -0.75 ...
## $ ingredients_from_palm_oil_n: num -0.178 -0.178 -0.178 -0.178 -0.178 ...
## $ trans.fat_100g : num -0.0834 -0.0834 -0.0834 -0.0834 -0.0834 ...
## $ sugars_100g : num -0.557 0.71 -0.597 -0.754 3.283 ...
## $ fiber_100g : num 0.546 -0.161 -0.183 0.126 -0.603 ...
## $ proteins_100g : num 0.484 -0.104 -0.632 -0.146 -0.933 ...
## $ sodium_100g : num -0.129 -0.231 -0.313 0.361 -0.355 ...
## $ nutrition_grade_fr : num -1.643 0.585 0.585 0.585 0.585 ...
## - attr(*, "normalized:shift")= Named num [1:14] 723.71 561.35 7.66 13.5 12.82 ...
## ..- attr(*, "names")= chr [1:14] "product_name" "brands" "manufacturing_places" "purchase_places" ...
## - attr(*, "normalized:scale")= Named num [1:14] 424.8 345.9 20.1 28.7 35.2 ...
## ..- attr(*, "names")= chr [1:14] "product_name" "brands" "manufacturing_places" "purchase_places" ...
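The shift and scale attributes in the output above look like column means and standard deviations, which suggests that type = "n1" is a z-score standardization. Assuming that reading is correct, the same transformation can be sketched in base R. The example uses the built-in USArrests data as a stand-in, not the Open Food Facts data.

```r
# Stand-in data: the built-in USArrests data set, not the Open Food Facts data.
x <- USArrests

# Centre each column on its mean, then divide by its standard deviation.
centred <- sweep(as.matrix(x), 2, colMeans(x), "-")
z_manual <- sweep(centred, 2, apply(x, 2, sd), "/")

# Base R's scale() performs the same column-wise standardization.
z_scale <- scale(x)

all.equal(z_manual, z_scale, check.attributes = FALSE)
round(colMeans(z_scale), 10)  # column means are 0 after standardization
apply(z_scale, 2, sd)         # column standard deviations are 1
```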
# Scree plot to find elbow point
fviz_eig(princomp(data_normalized), addlabels = TRUE, ylim = c(0, 20)) +
geom_vline(xintercept = 4, linetype = "dashed", color = "red") + # Add elbow point
labs(title = "Scree Plot with Suggested Elbow Point")# It was shown on the scree plot that the suggested elbow point is 4.
# Rotation (varimax) improves interpretability by maximizing the variance explained by each factor
# When nfactors = 4
pca <- principal(data_normalized, nfactors = 4, rotate = "varimax")
summary(pca)
##
## Factor analysis with Call: principal(r = data_normalized, nfactors = 4, rotate = "varimax")
##
## Test of the hypothesis that 4 factors are sufficient.
## The degrees of freedom for the model is 41 and the objective function was 1.15
## The number of observations was 1500 with Chi Square = 1711.94 with prob < 0
##
## The root mean square of the residuals (RMSA) is 0.09
# Identify variables with high loadings for each factor
print(loadings(pca), digits = 3, cutoff = 0.4, sort = TRUE)
##
## Loadings:
## RC1 RC2 RC3 RC4
## manufacturing_places 0.717
## purchase_places 0.798
## allergens 0.647
## traces 0.629
## additives_n 0.566
## sugars_100g 0.723
## nutrition_grade_fr 0.706
## fiber_100g 0.800
## proteins_100g 0.645 0.470
## sodium_100g 0.781
## product_name
## brands
## ingredients_from_palm_oil_n
## trans.fat_100g 0.412
##
## RC1 RC2 RC3 RC4
## SS loadings 2.019 1.719 1.300 1.161
## Proportion Var 0.144 0.123 0.093 0.083
## Cumulative Var 0.144 0.267 0.360 0.443
# When nfactors = 3
pca <- principal(data_normalized, nfactors = 3, rotate = "varimax")
summary(pca)
##
## Factor analysis with Call: principal(r = data_normalized, nfactors = 3, rotate = "varimax")
##
## Test of the hypothesis that 3 factors are sufficient.
## The degrees of freedom for the model is 52 and the objective function was 0.87
## The number of observations was 1500 with Chi Square = 1299.83 with prob < 7.8e-238
##
## The root mean square of the residuals (RMSA) is 0.09
# Identify variables with high loadings for each factor
print(loadings(pca), digits = 3, cutoff = 0.4, sort = TRUE)
##
## Loadings:
## RC1 RC2 RC3
## manufacturing_places 0.714
## purchase_places 0.799
## allergens 0.653
## traces 0.625
## additives_n 0.584
## sugars_100g 0.691
## nutrition_grade_fr 0.735
## fiber_100g 0.680
## proteins_100g 0.769
## product_name
## brands
## ingredients_from_palm_oil_n
## trans.fat_100g
## sodium_100g
##
## RC1 RC2 RC3
## SS loadings 2.030 1.737 1.295
## Proportion Var 0.145 0.124 0.093
## Cumulative Var 0.145 0.269 0.362
# When nfactors = 5
pca <- principal(data_normalized, nfactors = 5, rotate = "varimax")
summary(pca)
##
## Factor analysis with Call: principal(r = data_normalized, nfactors = 5, rotate = "varimax")
##
## Test of the hypothesis that 5 factors are sufficient.
## The degrees of freedom for the model is 31 and the objective function was 1.5
## The number of observations was 1500 with Chi Square = 2229.02 with prob < 0
##
## The root mean square of the residuals (RMSA) is 0.1
# Identify variables with high loadings for each factor
print(loadings(pca), digits = 3, cutoff = 0.4, sort = TRUE)
##
## Loadings:
## RC1 RC2 RC3 RC4 RC5
## manufacturing_places 0.721
## purchase_places 0.799
## allergens 0.647
## traces 0.635
## additives_n 0.592
## sugars_100g 0.702
## nutrition_grade_fr 0.724
## fiber_100g 0.840
## proteins_100g 0.564 0.560
## sodium_100g 0.780
## product_name 0.662
## brands 0.717
## ingredients_from_palm_oil_n
## trans.fat_100g 0.422
##
## RC1 RC2 RC3 RC4 RC5
## SS loadings 2.016 1.724 1.261 1.170 1.120
## Proportion Var 0.144 0.123 0.090 0.084 0.080
## Cumulative Var 0.144 0.267 0.357 0.441 0.521
The analysis used Principal Component Analysis (PCA) with 4 factors and varimax rotation. Varimax simplifies interpretation by maximizing the variance of the squared loadings within each factor, so that each variable tends to load strongly on only one factor.
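What the rotation does and does not change can be sketched with base R. The example below is only an illustration: it uses the built-in mtcars data as a stand-in and stats::varimax() in place of the rotation performed inside principal(), and the choice of 3 components is arbitrary.

```r
# Stand-in data: built-in mtcars, not the Open Food Facts data.
R <- cor(mtcars)

# Unrotated loadings of the first k principal components:
# eigenvectors scaled by the square roots of their eigenvalues.
k <- 3
e <- eigen(R)
unrotated <- e$vectors[, 1:k] %*% diag(sqrt(e$values[1:k]))
rownames(unrotated) <- colnames(mtcars)

# stats::varimax() applies an orthogonal rotation to the loadings.
rotated <- unclass(varimax(unrotated)$loadings)

# Each variable's communality (row sum of squared loadings) is unchanged.
communality_before <- rowSums(unrotated^2)
communality_after <- rowSums(rotated^2)
all.equal(communality_before, communality_after)

# The variance attributed to each component is redistributed by the rotation.
colSums(unrotated^2)
colSums(rotated^2)
```

The total variance accounted for by the retained components stays the same; only its split across components changes, which is why the rotated components are labelled RC rather than PC.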
RMSR (printed as RMSA in the output above) measures the average difference between the observed and reproduced correlations. Smaller values indicate a better fit.
Interpretation: RMSR < 0.05 indicates an excellent fit; RMSR from 0.05 to 0.10 an adequate fit; RMSR > 0.10 a poor fit.
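As a rough illustration of the quantity, the sketch below computes the root mean square of the off-diagonal residual correlations for an unrotated component solution in base R. It assumes this off-diagonal definition matches what the summary output reports, and it uses the built-in mtcars data as a stand-in; rmsr_for_k is a hypothetical helper name.

```r
# Stand-in data: built-in mtcars, not the Open Food Facts data.
R <- cor(mtcars)

# Hypothetical helper: RMSR of the off-diagonal residuals after keeping k components.
rmsr_for_k <- function(R, k) {
  e <- eigen(R)
  loadings <- e$vectors[, 1:k, drop = FALSE] %*% diag(sqrt(e$values[1:k]), nrow = k)
  residual <- R - loadings %*% t(loadings)  # observed minus reproduced correlations
  off_diag <- residual[upper.tri(residual)]
  sqrt(mean(off_diag^2))
}

rmsr_values <- sapply(1:4, function(k) rmsr_for_k(R, k))
names(rmsr_values) <- paste0("k=", 1:4)
rmsr_values
```

An orthogonal rotation such as varimax leaves the reproduced correlations unchanged, so the unrotated loadings are sufficient for this calculation.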
For comparison, PCA was also run with nfactors = 3 and nfactors = 5.
When nfactors = 4, Chi-Square = 1711.94 and the reported p-value is effectively 0 (printed as prob < 0). The null hypothesis that 4 factors are sufficient is therefore rejected, meaning 4 factors leave some of the correlation structure unexplained. This could be due to noise in the data, since the chi-square test is sensitive to it.
The objective function was 1.15. The smaller the value, the better the model fit.
The root mean square of the residuals (RMSR) is 0.09, which is considered an adequate fit.
When nfactors = 3, Chi-Square = 1299.83 and p-value < 7.8e-238. The null hypothesis that 3 factors are sufficient is rejected, indicating 3 factors are not sufficient to perfectly explain the data.
The objective function was 0.87, lower than the 1.15 obtained when nfactors = 4.
The root mean square of the residuals (RMSR) is 0.09, which is considered an adequate fit.
When nfactors = 5, Chi-Square = 2229.02 and the reported p-value is effectively 0 (printed as prob < 0). The null hypothesis that 5 factors are sufficient is rejected, indicating 5 factors are not sufficient to perfectly explain the data.
The objective function was 1.5, the worst of the three models.
The root mean square of the residuals (RMSR) is 0.1, hence it is on the borderline between an adequate fit and a poor fit.
| Metric | nfactors = 3 | nfactors = 4 | nfactors = 5 |
|---|---|---|---|
| Degrees of Freedom | 52 | 41 | 31 |
| Objective Function | 0.87 | 1.15 | 1.5 |
| Chi-Square | 1299.83 | 1711.94 | 2229.02 |
| RMSR | 0.09 | 0.09 | 0.10 |
# Elbow point is 4
pca <- prcomp(data_normalized, scale. = TRUE)
# Contribution of variables to the first two principal components
contrib_PC1 <- fviz_contrib(pca, choice = "var", axes = 1, xtickslab.rt = 90)
contrib_PC2 <- fviz_contrib(pca, choice = "var", axes = 2, xtickslab.rt = 90)
gridExtra::grid.arrange(contrib_PC1, contrib_PC2, top = "Contribution to the first two Principal Components")
# MDS
dist_matrix <- dist(data_normalized)
mds_fit <- mds(dist_matrix, ndim = 2, type = "ratio")
#summary(mds_fit)
plot(mds_fit, plot.type = "stressplot")
The red line in the plots created by fviz_contrib() represents the expected average contribution of each variable to the specified principal component.
Based on the "Contribution to the first two Principal Components" plot, purchase_places, manufacturing_places, allergens and traces contribute the most to Dim-1, each above the average contribution, whereas sugars_100g, nutrition_grade_fr, additives_n and proteins_100g contribute the most to Dim-2, again above the average.
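The contribution percentages and the reference line can be reproduced by hand from a prcomp() result. The sketch below assumes that a variable's contribution is its squared coordinate on a component divided by the component's total, and that the reference line is 100 divided by the number of variables. It uses the built-in USArrests data as a stand-in, and pca_demo is a hypothetical object name.

```r
# Stand-in data: built-in USArrests, not the Open Food Facts data.
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Variable coordinates: eigenvectors scaled by the component standard deviations.
coords <- sweep(pca_demo$rotation, 2, pca_demo$sdev, "*")

# Contribution (in percent) of each variable to each component.
contrib <- sweep(coords^2, 2, colSums(coords^2), "/") * 100

# Reference level if every variable contributed equally.
expected_average <- 100 / nrow(contrib)

round(contrib[, 1:2], 1)
rownames(contrib)[contrib[, "PC1"] > expected_average]
```

Variables whose contribution exceeds expected_average are the ones that would sit above the reference line in the bar chart.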
Clustering is performed for comparison.
# Determine optimal number of clusters
fviz_nbclust(data_normalized, kmeans, method = "silhouette") +
  labs(title = "Optimal Number of Clusters")
## K-means clustering
set.seed(791)
kmeans_result <- kmeans(data_normalized, centers = 4, nstart = 25)
fviz_cluster(kmeans_result, data = data_normalized, ellipse.type = "convex", main = "K-means Clustering")
## PAM clustering
pam_result <- pam(data_normalized, k = 4)
fviz_cluster(pam_result, data = data_normalized, ellipse.type = "convex", main = "PAM Clustering")
## Hierarchical clustering
hc <- hclust(dist_matrix, method = "ward.D2")
plot(hc, main = "Hierarchical Clustering Dendrogram")
rect.hclust(hc, k = 4, border = "red")
## Analysis and comparison
# Comparing clustering results
compare_clusters <- table(kmeans_result$cluster, pam_result$clustering)
print(compare_clusters)
##
## 1 2 3 4
## 1 10 147 209 1
## 2 14 49 19 161
## 3 0 5 3 0
## 4 362 508 4 8
| | PAM Cluster 1 | PAM Cluster 2 | PAM Cluster 3 | PAM Cluster 4 |
|---|---|---|---|---|
| K-Means 1 | 10 | 147 | 209 | 1 |
| K-Means 2 | 14 | 49 | 19 | 161 |
| K-Means 3 | 0 | 5 | 3 | 0 |
| K-Means 4 | 362 | 508 | 4 | 8 |
Based on the number of observations in the clusters shown in the table above, K-Means Cluster 4 overlaps most with PAM Cluster 2 (508 observations), although another 362 of its observations fall in PAM Cluster 1. K-Means Cluster 1 agrees partly with PAM Cluster 3 (209 observations), and K-Means Cluster 2 with PAM Cluster 4 (161 observations). On the other hand, K-Means Cluster 3 has very few points (8 in total), split between PAM Cluster 2 (5) and PAM Cluster 3 (3). This may suggest that K-Means Cluster 3 represents a small, distinct group or simply noise.
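The agreement can also be expressed as row shares. The sketch below re-enters the cross-tabulation printed above as a matrix (rows are K-means clusters, columns are PAM clusters) and reports, for each K-means cluster, the PAM cluster that holds most of its observations. The object names are hypothetical.

```r
# Cross-tabulation copied from the printed output
# (rows: K-means clusters, columns: PAM clusters).
agreement <- matrix(
  c( 10, 147, 209,   1,
     14,  49,  19, 161,
      0,   5,   3,   0,
    362, 508,   4,   8),
  nrow = 4, byrow = TRUE,
  dimnames = list(kmeans = as.character(1:4), pam = as.character(1:4))
)

# Share of each K-means cluster that falls into each PAM cluster.
row_share <- prop.table(agreement, margin = 1)
round(row_share, 2)

# PAM cluster holding the largest share of each K-means cluster.
best_match <- apply(agreement, 1, which.max)
best_match
```

A row whose largest share is close to 1 indicates that the two methods recover nearly the same group; shares near one half indicate that a K-means cluster is split across PAM clusters.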
Furthermore, in the cluster plots the K-means clusters are well defined and separated, whereas the PAM clusters overlap. The cluster shapes differ between the two methods.
For hierarchical clustering, cutting the dendrogram at a height of 50 gives 4 clusters.
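The link between a cut height and a number of clusters can be sketched with cutree(). The example below uses the built-in USArrests data as a stand-in, so its merge heights differ from those in the dendrogram above; hc_demo and h_cut are hypothetical names.

```r
# Stand-in data: built-in USArrests, standardized; not the Open Food Facts data.
d <- dist(scale(USArrests))
hc_demo <- hclust(d, method = "ward.D2")

# Cut by number of clusters.
by_k <- cutree(hc_demo, k = 4)

# Cut by height: any height between the merge that leaves 4 clusters
# and the merge that leaves 3 clusters gives the same partition.
n <- length(hc_demo$order)
h_cut <- mean(hc_demo$height[c(n - 4, n - 3)])
by_h <- cutree(hc_demo, h = h_cut)

table(by_k, by_h)
```

Reading the cluster count off a dendrogram at a given height is equivalent to calling cutree() with h set to that height.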