Introduction
Group : Alexis TARIS, Tom TREMEREL, Aashita Gloria NOAH, Saanika MAMORIA, Helene QU
In this presentation, we will analyze the relationship between countries and their cuisine preferences using Correspondence Analysis (CA) on the dataset provided.
Data Loading
We begin by loading the dataset and cleaning it up:
library (ca)
library (dplyr)
library (FactoMineR)
library (factoextra)
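# Read the tab-separated file; the separator must be the single character "\t"
cuisine <- read.delim("Cusine-pref-raw.txt", header = TRUE, na.strings = "", sep = "\t", dec = ".")
# Use the first column (cuisine names) as row names, then drop it
row.names(cuisine) <- cuisine$X
cuisine <- cuisine[, -1]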
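As a hedged illustration of this loading step, the sketch below writes a tiny tab-separated file to a temporary path and reads it back the same way. The counts are hypothetical, not the real data. It also shows why country names such as Hong.Kong contain a dot: by default, read.delim turns header entries into syntactically valid R names.

```r
# Build a tiny tab-separated file with hypothetical counts
tmp <- tempfile(fileext = ".txt")
writeLines(c("X\tHong Kong\tChina",
             "Spanish\t10\t20",
             "British\t30\t40"), tmp)

# Same reading pattern as above, applied to the toy file
toy <- read.delim(tmp, header = TRUE, na.strings = "", sep = "\t", dec = ".")
row.names(toy) <- toy$X   # first column holds the cuisine names
toy <- toy[, -1]          # drop it so that only the counts remain

print(toy)                # the "Hong Kong" header has become Hong.Kong
unlink(tmp)
```

Passing check.names = FALSE to read.delim would keep the original header text, at the cost of column names that need backticks in later code.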
Descriptive Statistics
We run some basic descriptive statistics to check the total number of cuisines, the number of countries, and the top 5 most preferred cuisines globally.
# Number of cuisines and countries
num_cuisines <- nrow (cuisine)
num_countries <- ncol (cuisine)
cat("Number of Cuisines:", num_cuisines, "\n")
cat("Number of Countries:", num_countries, "\n")
# Total preference count per cuisine
cuisine_totals <- rowSums (cuisine)
cuisine_totals <- sort (cuisine_totals, decreasing = TRUE )
# Top 5 most preferred cuisines globally
top5_cuisines <- head (cuisine_totals, 5 )
top5_cuisines
Italian Chinese Mexican Thai French
21929 20507 19609 18110 18021
library (knitr)
kable (as.data.frame (top5_cuisines), col.names = c ("Cuisine" , "Total Count" ), caption = "Top 5 Most Preferred Cuisines Globally" )
Top 5 Most Preferred Cuisines Globally
Cuisine    Total Count
Italian          21929
Chinese          20507
Mexican          19609
Thai             18110
French           18021
Per-Cuisine Summary Statistics
cuisine_stats <- data.frame (
Mean = round (rowMeans (cuisine), 2 ),
Median = apply (cuisine, 1 , median),
Min = apply (cuisine, 1 , min),
Max = apply (cuisine, 1 , max),
SD = apply (cuisine, 1 , sd)
)
kable (head (cuisine_stats, 10 ),
caption = "Summary Statistics for First 10 Cuisines" )
Summary Statistics for First 10 Cuisines
             Mean  Median  Min   Max        SD
Spanish    734.75   687.5  369  1428  273.2920
British    553.38   521.5  200  1603  281.8474
French     750.88   721.0  471  1265  182.0058
German     582.21   524.5  351  1775  287.1356
Italian    913.71   879.5  598  1816  271.4011
Danish     434.42   406.5  140   856  172.6174
Norwegian  388.58   401.0  170   673  118.7075
Finnish    320.62   338.0  131   530  102.2511
Swedish    470.08   453.0  180   926  188.2962
Greek      610.79   557.0  220  1612  321.6810
Per-Country Summary Statistics
country_stats <- data.frame (
Mean = round (colMeans (cuisine), 2 ),
Median = apply (cuisine, 2 , median),
Min = apply (cuisine, 2 , min),
Max = apply (cuisine, 2 , max),
SD = apply (cuisine, 2 , sd)
)
kable (head (country_stats, 10 ),
caption = "Summary Statistics for First 10 Countries" )
Summary Statistics for First 10 Countries
               Mean  Median  Min   Max        SD
Australia    644.35   704.0   51   918  185.6275
China        470.00   441.0  264   963  165.0315
Hong.Kong    611.41   608.5  286   951  207.2495
Indonesia    493.38   481.0  236   993  174.0707
Japan        389.71   330.0  110   880  222.2132
Malaysia     552.79   542.0  261   973  185.4720
Philippines  642.59   606.0  285   988  164.1572
Taiwan       594.21   569.5  342   995  169.1327
Thailand     465.91   438.5  214  1000  213.6809
Vietnam      525.38   480.0  263   971  181.3604
Bar Plot – Top 10 Most Preferred Cuisines Globally
library (ggplot2)
# Recalculate cuisine totals and sort
cuisine_totals <- rowSums (cuisine)
cuisine_totals_sorted <- sort (cuisine_totals, decreasing = TRUE )
# Create data frame for top 10
top10_df <- data.frame (
Cuisine = names (cuisine_totals_sorted[1 : 10 ]),
Total = cuisine_totals_sorted[1 : 10 ]
)
# Plot
ggplot (top10_df, aes (x = reorder (Cuisine, Total), y = Total)) +
geom_bar (stat = "identity" , fill = "steelblue" ) +
coord_flip () +
labs (title = "Top 10 Most Preferred Cuisines Globally" ,
x = "Cuisine" ,
y = "Total Preference Count" ) +
theme_minimal ()
Chi-square Test of Independence
Next, we perform a Chi-Square test to check the relationship between countries and their cuisine preferences.
cuisine <- as.table (as.matrix (cuisine))
chi <- chisq.test (cuisine)
chi
Pearson's Chi-squared test
data: cuisine
X-squared = 32158, df = 759, p-value < 2.2e-16
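The p-value is extremely small (< 2.2e-16), so we reject the hypothesis that cuisine preferences are independent of country. Statistical significance does not by itself indicate how strong the association is, which is what the next statistic measures.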
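To make the test statistic less of a black box, here is a minimal sketch that recomputes Pearson's chi-square by hand and compares it with chisq.test. The 3x2 table uses hypothetical counts, not the cuisine data, and the names toy, expected and chi_manual are ours.

```r
# Toy 3x2 contingency table (hypothetical counts)
toy <- matrix(c(30, 10,
                20, 20,
                10, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(cuisine = c("A", "B", "C"),
                              country = c("X", "Y")))

n <- sum(toy)
# Expected counts under independence: (row total * column total) / n
expected <- outer(rowSums(toy), colSums(toy)) / n
# Pearson statistic: sum over cells of (observed - expected)^2 / expected
chi_manual <- sum((toy - expected)^2 / expected)
# Degrees of freedom: (rows - 1) * (columns - 1)
df_manual <- (nrow(toy) - 1) * (ncol(toy) - 1)
p_manual <- pchisq(chi_manual, df = df_manual, lower.tail = FALSE)

# The built-in test should report the same statistic and p-value
chi_builtin <- chisq.test(toy)
print(c(manual = chi_manual, builtin = unname(chi_builtin$statistic)))
```

The same degrees-of-freedom formula gives (34 - 1) * (24 - 1) = 759 for the cuisine table, which matches the df reported above.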
Phi-squared Statistic
We calculate the Phi-squared statistic, which measures the strength of association between countries and their cuisine preferences.
Phi2 <- chi$statistic / sum(cuisine)
Phi2
The Phi-squared statistic is 0.0685, indicating a weak association between countries and their cuisine preferences.
Normalized Phi-squared
We normalize the Phi-squared value to understand the relative strength of the association.
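J <- ncol(cuisine)            # number of countries
I <- nrow(cuisine)            # number of cuisines
(I - 1) * (J - 1)             # degrees of freedom of the chi-square test
max_Phi <- min(I, J) - 1      # largest value Phi-squared can take
Phi2 / max_Phi
The normalized Phi-squared value is about 0.003 (this ratio is the square of Cramér's V). Being close to 0, it confirms that the association between countries and their cuisine preferences is weak.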
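The arithmetic behind the normalization can be checked directly from the figures reported in this presentation. The sketch below only plugs in those reported values (Phi-squared of 0.0685, 34 cuisines, 24 countries); the variable names are ours.

```r
# Values reported earlier in this presentation
phi2_reported <- 0.0685
I <- 34   # cuisines (rows)
J <- 24   # countries (columns)

df_check <- (I - 1) * (J - 1)          # should match the df of the chi-square test
max_phi2 <- min(I, J) - 1              # upper bound of Phi-squared for an I x J table
normalized <- phi2_reported / max_phi2 # Phi-squared rescaled to the [0, 1] range
cramers_v <- sqrt(normalized)          # Cramér's V is the square root of that ratio

print(c(df = df_check, max_phi2 = max_phi2))
print(round(normalized, 3))
```

Here df_check is 759 and max_phi2 is 23, so the rescaled value rounds to 0.003, in line with the figure quoted in the conclusion.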
Row Marginal Frequencies
We compute the row profiles (each cuisine's counts divided by its row total) to see how each cuisine's preferences are distributed across countries:
cuisine <- as.table(as.matrix(cuisine))
F <- as.matrix(prop.table(cuisine))        # overall relative frequencies
Row.F <- prop.table(cuisine, margin = 1)   # row profiles: each row sums to 1
round(addmargins(Row.F), 3)
Australia China Hong.Kong Indonesia Japan Malaysia Philippines
Spanish 0.043 0.022 0.039 0.021 0.036 0.025 0.050
British 0.055 0.033 0.047 0.045 0.015 0.049 0.054
French 0.040 0.035 0.041 0.026 0.038 0.032 0.046
German 0.046 0.033 0.048 0.027 0.033 0.025 0.043
Italian 0.042 0.027 0.038 0.033 0.039 0.034 0.042
Danish 0.046 0.031 0.039 0.028 0.013 0.032 0.050
Norwegian 0.043 0.034 0.046 0.025 0.018 0.033 0.051
Finnish 0.049 0.040 0.051 0.031 0.018 0.034 0.062
Swedish 0.042 0.031 0.044 0.024 0.016 0.032 0.045
Greek 0.054 0.022 0.033 0.019 0.015 0.021 0.040
Turkish 0.048 0.032 0.031 0.044 0.026 0.039 0.041
Moroccan 0.004 0.029 0.033 0.033 0.013 0.038 0.043
Lebanese 0.060 0.026 0.026 0.027 0.011 0.036 0.034
Saudi Arabian 0.044 0.033 0.033 0.069 0.012 0.070 0.051
Emirati 0.045 0.030 0.029 0.055 0.014 0.052 0.053
Indian 0.046 0.016 0.030 0.031 0.039 0.043 0.034
Chinese 0.043 0.047 0.045 0.028 0.043 0.036 0.044
Thai 0.048 0.025 0.047 0.029 0.027 0.050 0.043
Malaysian 0.061 0.033 0.054 0.053 0.020 0.077 0.052
Vietnamese 0.053 0.026 0.055 0.024 0.031 0.031 0.047
Singaporean 0.059 0.042 0.057 0.043 0.025 0.057 0.056
Hong Kong 0.049 0.049 0.067 0.034 0.037 0.038 0.057
Korean 0.046 0.036 0.055 0.044 0.045 0.045 0.061
Filipino 0.060 0.031 0.031 0.031 0.022 0.035 0.104
Indonesian 0.055 0.024 0.036 0.076 0.025 0.061 0.046
Taiwanese 0.046 0.049 0.069 0.031 0.050 0.037 0.053
Australian 0.073 0.033 0.047 0.034 0.022 0.043 0.060
Japanese 0.039 0.036 0.047 0.037 0.027 0.036 0.037
USA 0.032 0.029 0.055 0.045 0.039 0.041 0.032
Mexican 0.040 0.037 0.045 0.037 0.033 0.039 0.037
Caribbean 0.037 0.041 0.052 0.036 0.025 0.049 0.038
Argentinian 0.057 0.058 0.036 0.034 0.021 0.034 0.045
Brazilian 0.052 0.044 0.043 0.038 0.030 0.038 0.038
Peruvian 0.046 0.059 0.052 0.044 0.027 0.032 0.033
Sum 1.603 1.172 1.501 1.239 0.904 1.376 1.623
Taiwan Thailand Vietnam India Singapore Italy Spain USA
Spanish 0.036 0.023 0.026 0.038 0.039 0.049 0.056 0.053
British 0.039 0.034 0.046 0.060 0.058 0.024 0.020 0.044
French 0.041 0.035 0.040 0.045 0.044 0.033 0.039 0.045
German 0.046 0.039 0.037 0.041 0.046 0.029 0.031 0.053
Italian 0.038 0.032 0.033 0.042 0.041 0.045 0.043 0.048
Danish 0.039 0.028 0.039 0.046 0.047 0.024 0.019 0.047
Norwegian 0.046 0.033 0.039 0.047 0.052 0.029 0.029 0.048
Finnish 0.044 0.034 0.051 0.059 0.049 0.031 0.023 0.042
Swedish 0.045 0.026 0.036 0.049 0.056 0.029 0.027 0.050
Greek 0.034 0.016 0.022 0.036 0.034 0.049 0.042 0.056
Turkish 0.034 0.022 0.032 0.043 0.042 0.031 0.035 0.034
Moroccan 0.038 0.021 0.022 0.049 0.042 0.042 0.045 0.047
Lebanese 0.030 0.019 0.022 0.052 0.042 0.037 0.036 0.046
Saudi Arabian 0.036 0.024 0.030 0.074 0.045 0.027 0.030 0.030
Emirati 0.051 0.030 0.050 0.075 0.044 0.024 0.028 0.028
Indian 0.038 0.017 0.027 0.066 0.048 0.035 0.032 0.040
Chinese 0.044 0.040 0.036 0.044 0.046 0.030 0.035 0.049
Thai 0.043 0.055 0.042 0.042 0.051 0.029 0.031 0.044
Malaysian 0.050 0.027 0.035 0.051 0.073 0.017 0.019 0.032
Vietnamese 0.046 0.048 0.068 0.034 0.047 0.025 0.026 0.049
Singaporean 0.054 0.041 0.050 0.055 0.074 0.022 0.021 0.031
Hong Kong 0.057 0.049 0.048 0.042 0.065 0.025 0.027 0.042
Korean 0.052 0.054 0.056 0.033 0.059 0.022 0.025 0.047
Filipino 0.038 0.027 0.040 0.048 0.044 0.026 0.027 0.056
Indonesian 0.039 0.020 0.030 0.043 0.065 0.027 0.026 0.035
Taiwanese 0.076 0.042 0.048 0.037 0.067 0.024 0.027 0.043
Australian 0.041 0.039 0.046 0.055 0.060 0.028 0.022 0.041
Japanese 0.045 0.039 0.042 0.052 0.038 0.039 0.036 0.055
USA 0.034 0.042 0.037 0.047 0.040 0.032 0.029 0.063
Mexican 0.041 0.041 0.040 0.041 0.043 0.039 0.037 0.052
Caribbean 0.042 0.033 0.036 0.040 0.040 0.036 0.040 0.060
Argentinian 0.041 0.034 0.031 0.039 0.040 0.056 0.057 0.042
Brazilian 0.046 0.033 0.031 0.041 0.037 0.051 0.043 0.050
Peruvian 0.047 0.025 0.032 0.043 0.035 0.046 0.058 0.061
Sum 1.472 1.122 1.301 1.610 1.654 1.114 1.119 1.561
UAE Saudi.Arabia Great.Britain Germany France Denmark Sweden
Spanish 0.039 0.032 0.080 0.081 0.050 0.042 0.047
British 0.049 0.040 0.121 0.038 0.021 0.020 0.027
French 0.040 0.035 0.066 0.070 0.054 0.038 0.044
German 0.035 0.027 0.064 0.127 0.037 0.027 0.041
Italian 0.041 0.036 0.073 0.083 0.043 0.039 0.042
Danish 0.038 0.028 0.071 0.070 0.034 0.082 0.060
Norwegian 0.045 0.028 0.059 0.072 0.043 0.029 0.044
Finnish 0.047 0.034 0.053 0.069 0.043 0.017 0.047
Swedish 0.040 0.031 0.073 0.076 0.037 0.026 0.082
Greek 0.039 0.035 0.089 0.110 0.052 0.047 0.058
Turkish 0.053 0.057 0.079 0.087 0.042 0.034 0.044
Moroccan 0.054 0.057 0.098 0.072 0.069 0.037 0.043
Lebanese 0.070 0.070 0.084 0.059 0.056 0.032 0.056
Saudi Arabian 0.073 0.095 0.034 0.052 0.031 0.024 0.023
Emirati 0.088 0.073 0.041 0.056 0.031 0.018 0.023
Indian 0.045 0.047 0.091 0.077 0.044 0.041 0.041
Chinese 0.035 0.027 0.074 0.078 0.039 0.036 0.036
Thai 0.035 0.017 0.073 0.077 0.039 0.041 0.043
Malaysian 0.042 0.029 0.081 0.055 0.028 0.025 0.024
Vietnamese 0.028 0.013 0.065 0.081 0.052 0.044 0.038
Singaporean 0.042 0.020 0.074 0.052 0.024 0.022 0.018
Hong Kong 0.033 0.016 0.075 0.056 0.031 0.023 0.021
Korean 0.035 0.020 0.052 0.063 0.031 0.022 0.033
Filipino 0.052 0.027 0.054 0.082 0.033 0.032 0.027
Indonesian 0.037 0.036 0.073 0.078 0.034 0.034 0.034
Taiwanese 0.033 0.018 0.050 0.064 0.035 0.021 0.019
Australian 0.043 0.026 0.073 0.059 0.028 0.034 0.027
Japanese 0.037 0.027 0.063 0.074 0.044 0.038 0.041
USA 0.045 0.039 0.073 0.064 0.033 0.041 0.037
Mexican 0.037 0.033 0.068 0.074 0.040 0.040 0.040
Caribbean 0.036 0.025 0.086 0.076 0.041 0.033 0.036
Argentinian 0.034 0.022 0.060 0.091 0.040 0.034 0.031
Brazilian 0.038 0.030 0.067 0.077 0.045 0.033 0.031
Peruvian 0.044 0.027 0.055 0.067 0.046 0.025 0.032
Sum 1.480 1.177 2.390 2.464 1.352 1.133 1.289
Finland Norway Sum
Spanish 0.040 0.034 1.000
British 0.039 0.025 1.000
French 0.039 0.030 1.000
German 0.038 0.027 1.000
Italian 0.038 0.028 1.000
Danish 0.042 0.047 1.000
Norwegian 0.042 0.064 1.000
Finnish 0.046 0.027 1.000
Swedish 0.041 0.040 1.000
Greek 0.040 0.037 1.000
Turkish 0.040 0.030 1.000
Moroccan 0.042 0.029 1.000
Lebanese 0.042 0.029 1.000
Saudi Arabian 0.045 0.015 1.000
Emirati 0.043 0.018 1.000
Indian 0.039 0.034 1.000
Chinese 0.039 0.028 1.000
Thai 0.040 0.030 1.000
Malaysian 0.042 0.020 1.000
Vietnamese 0.040 0.029 1.000
Singaporean 0.042 0.019 1.000
Hong Kong 0.040 0.019 1.000
Korean 0.041 0.022 1.000
Filipino 0.041 0.031 1.000
Indonesian 0.041 0.024 1.000
Taiwanese 0.042 0.017 1.000
Australian 0.041 0.024 1.000
Japanese 0.045 0.028 1.000
USA 0.040 0.029 1.000
Mexican 0.035 0.031 1.000
Caribbean 0.034 0.029 1.000
Argentinian 0.033 0.029 1.000
Brazilian 0.036 0.027 1.000
Peruvian 0.038 0.026 1.000
Sum 1.367 0.976 34.000
Row.F <- as.data.frame (Row.F)
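Each row of this table is a cuisine's profile: it shows how that cuisine's total preference count is split across the 24 countries, so each row sums to 1. Reading down a country's column therefore compares that country's share across cuisines. For example, Australia accounts for 7.3% of all preferences recorded for Australian cuisine, the largest share it holds for any cuisine. The bottom Sum row adds proportions computed on different bases and has no direct interpretation.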
Column Marginal Frequencies
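We also compute the column profiles (each country's counts divided by its column total) to see how each country's preferences are distributed across cuisines:
Col.F <- prop.table(cuisine, margin = 2)   # column profiles: each column sums to 1
round(addmargins(Col.F), 3)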
Australia China Hong.Kong Indonesia Japan Malaysia Philippines
Spanish 0.034 0.024 0.033 0.022 0.048 0.023 0.040
British 0.033 0.027 0.030 0.035 0.015 0.035 0.033
French 0.033 0.040 0.036 0.028 0.051 0.031 0.038
German 0.029 0.029 0.032 0.023 0.035 0.019 0.028
Italian 0.042 0.037 0.040 0.043 0.064 0.039 0.042
Danish 0.022 0.020 0.020 0.017 0.011 0.018 0.024
Norwegian 0.018 0.020 0.021 0.014 0.013 0.017 0.022
Finnish 0.017 0.019 0.019 0.014 0.011 0.014 0.022
Swedish 0.022 0.022 0.024 0.016 0.014 0.019 0.023
Greek 0.036 0.020 0.023 0.016 0.017 0.017 0.027
Turkish 0.033 0.030 0.022 0.040 0.029 0.031 0.028
Moroccan 0.002 0.022 0.019 0.024 0.012 0.024 0.023
Lebanese 0.033 0.020 0.015 0.020 0.010 0.023 0.019
Saudi Arabian 0.019 0.020 0.015 0.039 0.008 0.035 0.022
Emirati 0.020 0.018 0.014 0.032 0.011 0.027 0.024
Indian 0.034 0.017 0.024 0.030 0.048 0.037 0.026
Chinese 0.040 0.060 0.045 0.035 0.066 0.039 0.041
Thai 0.040 0.028 0.041 0.031 0.036 0.049 0.035
Malaysian 0.035 0.026 0.033 0.040 0.019 0.052 0.030
Vietnamese 0.035 0.023 0.038 0.021 0.033 0.024 0.031
Singaporean 0.035 0.034 0.036 0.034 0.025 0.039 0.034
Hong Kong 0.032 0.043 0.046 0.029 0.040 0.029 0.037
Korean 0.031 0.033 0.038 0.038 0.050 0.035 0.041
Filipino 0.026 0.018 0.014 0.018 0.016 0.018 0.045
Indonesian 0.033 0.020 0.023 0.059 0.024 0.043 0.028
Taiwanese 0.027 0.041 0.043 0.024 0.050 0.026 0.032
Australian 0.041 0.026 0.029 0.025 0.021 0.029 0.035
Japanese 0.029 0.036 0.036 0.035 0.032 0.030 0.027
USA 0.025 0.031 0.045 0.046 0.050 0.037 0.025
Mexican 0.036 0.046 0.042 0.043 0.048 0.041 0.033
Caribbean 0.023 0.034 0.033 0.029 0.025 0.035 0.023
Argentinian 0.035 0.049 0.024 0.027 0.022 0.025 0.028
Brazilian 0.031 0.036 0.027 0.029 0.029 0.026 0.023
Peruvian 0.018 0.032 0.022 0.023 0.017 0.014 0.013
Sum 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Taiwan Thailand Vietnam India Singapore Italy Spain USA
Spanish 0.032 0.025 0.025 0.031 0.030 0.056 0.062 0.043
British 0.026 0.028 0.035 0.036 0.034 0.021 0.017 0.027
French 0.036 0.040 0.041 0.037 0.035 0.038 0.045 0.038
German 0.032 0.035 0.029 0.026 0.028 0.026 0.027 0.034
Italian 0.042 0.044 0.041 0.043 0.040 0.063 0.060 0.048
Danish 0.020 0.019 0.023 0.022 0.021 0.016 0.013 0.022
Norwegian 0.021 0.019 0.020 0.020 0.021 0.017 0.017 0.021
Finnish 0.017 0.017 0.022 0.021 0.016 0.015 0.011 0.015
Swedish 0.025 0.019 0.023 0.026 0.028 0.021 0.019 0.026
Greek 0.025 0.015 0.018 0.025 0.022 0.045 0.039 0.038
Turkish 0.025 0.021 0.027 0.030 0.028 0.030 0.034 0.023
Moroccan 0.023 0.016 0.015 0.027 0.022 0.032 0.034 0.026
Lebanese 0.018 0.014 0.015 0.029 0.022 0.029 0.027 0.026
Saudi Arabian 0.017 0.014 0.016 0.032 0.019 0.016 0.018 0.013
Emirati 0.025 0.019 0.028 0.034 0.019 0.015 0.017 0.013
Indian 0.031 0.017 0.025 0.050 0.034 0.036 0.033 0.030
Chinese 0.045 0.052 0.041 0.041 0.042 0.040 0.045 0.046
Thai 0.039 0.063 0.042 0.035 0.041 0.033 0.036 0.037
Malaysian 0.031 0.022 0.025 0.030 0.041 0.013 0.015 0.019
Vietnamese 0.032 0.043 0.054 0.022 0.030 0.023 0.024 0.032
Singaporean 0.035 0.033 0.036 0.033 0.043 0.019 0.017 0.019
Hong Kong 0.040 0.044 0.038 0.027 0.041 0.022 0.024 0.027
Korean 0.038 0.050 0.046 0.022 0.038 0.021 0.023 0.032
Filipino 0.018 0.016 0.021 0.021 0.018 0.016 0.017 0.024
Indonesian 0.025 0.017 0.022 0.026 0.037 0.022 0.022 0.021
Taiwanese 0.049 0.035 0.035 0.022 0.039 0.021 0.023 0.026
Australian 0.025 0.031 0.032 0.031 0.033 0.022 0.017 0.023
Japanese 0.036 0.039 0.037 0.038 0.027 0.040 0.036 0.040
USA 0.029 0.045 0.036 0.036 0.030 0.035 0.031 0.049
Mexican 0.040 0.050 0.044 0.037 0.037 0.049 0.046 0.047
Caribbean 0.028 0.028 0.027 0.025 0.024 0.031 0.034 0.037
Argentinian 0.028 0.029 0.023 0.024 0.024 0.048 0.049 0.026
Brazilian 0.030 0.027 0.023 0.025 0.021 0.042 0.036 0.030
Peruvian 0.020 0.014 0.015 0.017 0.013 0.025 0.032 0.024
Sum 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
UAE Saudi.Arabia Great.Britain Germany France Denmark Sweden
Spanish 0.035 0.036 0.042 0.041 0.047 0.046 0.046
British 0.033 0.033 0.048 0.015 0.015 0.017 0.020
French 0.037 0.040 0.036 0.037 0.052 0.043 0.044
German 0.024 0.024 0.027 0.051 0.027 0.023 0.032
Italian 0.045 0.050 0.048 0.053 0.050 0.054 0.052
Danish 0.020 0.019 0.022 0.021 0.019 0.053 0.035
Norwegian 0.021 0.017 0.016 0.020 0.021 0.017 0.023
Finnish 0.018 0.017 0.012 0.015 0.018 0.008 0.020
Swedish 0.023 0.022 0.025 0.025 0.022 0.018 0.052
Greek 0.028 0.033 0.039 0.047 0.040 0.043 0.048
Turkish 0.040 0.054 0.035 0.038 0.033 0.031 0.036
Moroccan 0.033 0.043 0.035 0.025 0.043 0.028 0.029
Lebanese 0.042 0.054 0.030 0.021 0.036 0.024 0.038
Saudi Arabian 0.035 0.057 0.009 0.014 0.016 0.014 0.012
Emirati 0.043 0.045 0.012 0.016 0.016 0.011 0.012
Indian 0.037 0.049 0.044 0.037 0.038 0.042 0.038
Chinese 0.036 0.034 0.045 0.046 0.043 0.046 0.041
Thai 0.032 0.020 0.039 0.040 0.038 0.046 0.044
Malaysian 0.027 0.024 0.030 0.020 0.019 0.020 0.017
Vietnamese 0.020 0.012 0.028 0.034 0.040 0.039 0.030
Singaporean 0.027 0.017 0.029 0.020 0.017 0.018 0.013
Hong Kong 0.023 0.014 0.031 0.023 0.024 0.021 0.016
Korean 0.026 0.019 0.023 0.027 0.024 0.020 0.027
Filipino 0.025 0.016 0.015 0.022 0.017 0.019 0.014
Indonesian 0.024 0.029 0.028 0.030 0.024 0.028 0.025
Taiwanese 0.022 0.015 0.019 0.024 0.025 0.018 0.014
Australian 0.027 0.020 0.027 0.021 0.019 0.026 0.019
Japanese 0.030 0.027 0.030 0.034 0.038 0.038 0.037
USA 0.039 0.042 0.037 0.031 0.030 0.044 0.035
Mexican 0.037 0.041 0.040 0.042 0.042 0.049 0.043
Caribbean 0.024 0.021 0.035 0.030 0.029 0.028 0.027
Argentinian 0.023 0.019 0.024 0.035 0.029 0.028 0.023
Brazilian 0.025 0.025 0.026 0.029 0.031 0.026 0.022
Peruvian 0.019 0.015 0.014 0.017 0.021 0.013 0.015
Sum 1.000 1.000 1.000 1.000 1.000 1.000 1.000
Finland Norway Sum
Spanish 0.038 0.044 0.905
British 0.027 0.024 0.662
French 0.038 0.040 0.933
German 0.028 0.028 0.697
Italian 0.045 0.045 1.128
Danish 0.024 0.036 0.535
Norwegian 0.021 0.044 0.481
Finnish 0.019 0.015 0.392
Swedish 0.025 0.034 0.571
Greek 0.032 0.041 0.732
Turkish 0.032 0.034 0.765
Moroccan 0.027 0.026 0.606
Lebanese 0.027 0.026 0.617
Saudi Arabian 0.022 0.011 0.493
Emirati 0.022 0.013 0.507
Indian 0.034 0.041 0.830
Chinese 0.042 0.042 1.054
Thai 0.039 0.040 0.923
Malaysian 0.028 0.019 0.635
Vietnamese 0.031 0.031 0.730
Singaporean 0.029 0.019 0.659
Hong Kong 0.031 0.020 0.722
Korean 0.032 0.023 0.756
Filipino 0.021 0.022 0.477
Indonesian 0.028 0.023 0.660
Taiwanese 0.029 0.017 0.677
Australian 0.027 0.022 0.631
Japanese 0.039 0.033 0.827
USA 0.036 0.037 0.882
Mexican 0.037 0.046 1.015
Caribbean 0.025 0.029 0.684
Argentinian 0.024 0.029 0.695
Brazilian 0.025 0.026 0.669
Peruvian 0.017 0.016 0.446
Sum 1.000 1.000 24.000
Col.F <- as.data.frame (Col.F)
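Each column of this table is a country's profile: it shows how that country's preference counts are split across the 34 cuisines, so each column sums to 1. In Australia, for instance, Italian (4.2%) narrowly leads Australian (4.1%), Chinese (4.0%), and Thai (4.0%).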
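Row and column profiles answer different questions, and the two are easy to confuse. The following sketch uses a toy table with hypothetical counts (not the cuisine data) to show that the same cell yields two different proportions depending on the margin.

```r
# Toy contingency table: cuisines in rows, countries in columns (hypothetical counts)
toy <- as.table(matrix(c(30, 10,
                         20, 20,
                         10, 30),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(cuisine = c("A", "B", "C"),
                                       country = c("X", "Y"))))

# Row profiles: of all votes for a cuisine, what share comes from each country?
row_profiles <- prop.table(toy, margin = 1)
# Column profiles: of all votes cast in a country, what share goes to each cuisine?
col_profiles <- prop.table(toy, margin = 2)

print(round(row_profiles, 3))   # every row sums to 1
print(round(col_profiles, 3))   # every column sums to 1
```

In this toy table, country X supplies 75% of the votes for cuisine A (row profile), while cuisine A receives 50% of the votes cast in country X (column profile).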
Correspondence Analysis
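We perform Correspondence Analysis (CA) with the CA() function from FactoMineR to visualize the relationship between countries and cuisines. The fitted object, named results in the code that follows, is summarized below.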
Call:
CA(X = cuisine)
The chi square of independence between the two variables is equal to 32158.4 (p-value = 0 ).
Eigenvalues
Dim.1 Dim.2 Dim.3 Dim.4 Dim.5 Dim.6 Dim.7
Variance 0.020 0.017 0.006 0.004 0.004 0.003 0.002
% of var. 29.441 25.046 9.444 6.154 5.540 4.410 3.602
Cumulative % of var. 29.441 54.487 63.931 70.085 75.625 80.035 83.637
Dim.8 Dim.9 Dim.10 Dim.11 Dim.12 Dim.13 Dim.14
Variance 0.002 0.002 0.001 0.001 0.001 0.001 0.001
% of var. 2.997 2.263 2.148 1.856 1.498 1.202 1.038
Cumulative % of var. 86.635 88.898 91.046 92.903 94.401 95.603 96.641
Dim.15 Dim.16 Dim.17 Dim.18 Dim.19 Dim.20 Dim.21
Variance 0.001 0.001 0.000 0.000 0.000 0.000 0.000
% of var. 0.884 0.798 0.571 0.364 0.285 0.229 0.104
Cumulative % of var. 97.525 98.323 98.893 99.257 99.542 99.772 99.875
Dim.22 Dim.23
Variance 0.000 0.000
% of var. 0.093 0.031
Cumulative % of var. 99.969 100.000
Rows (the 10 first)
Iner*1000 Dim.1 ctr cos2 Dim.2 ctr cos2
Spanish | 2.406 | -0.200 7.433 0.623 | -0.090 1.764 0.126 |
British | 2.838 | 0.141 2.795 0.199 | 0.157 4.041 0.244 |
French | 0.651 | -0.058 0.636 0.197 | -0.053 0.628 0.165 |
German | 1.711 | -0.034 0.169 0.020 | -0.128 2.838 0.284 |
Italian | 0.978 | -0.104 2.487 0.513 | -0.041 0.458 0.080 |
Danish | 2.579 | -0.108 1.291 0.101 | -0.045 0.267 0.018 |
Norwegian | 1.166 | -0.036 0.131 0.023 | -0.038 0.165 0.024 |
Finnish | 0.685 | 0.065 0.340 0.100 | 0.030 0.086 0.022 |
Swedish | 1.800 | -0.094 1.054 0.118 | -0.021 0.061 0.006 |
Greek | 3.506 | -0.298 13.714 0.789 | -0.068 0.851 0.042 |
Dim.3 ctr cos2
Spanish 0.029 0.484 0.013 |
British -0.109 5.218 0.119 |
French 0.026 0.408 0.040 |
German 0.011 0.054 0.002 |
Italian 0.048 1.637 0.108 |
Danish -0.232 18.499 0.464 |
Norwegian -0.092 2.595 0.144 |
Finnish -0.026 0.165 0.016 |
Swedish -0.148 8.106 0.291 |
Greek -0.073 2.561 0.047 |
Columns (the 10 first)
Iner*1000 Dim.1 ctr cos2 Dim.2 ctr cos2
Australia | 2.380 | 0.083 1.595 0.135 | 0.015 0.065 0.005 |
China | 2.673 | 0.116 2.285 0.172 | -0.074 1.093 0.070 |
Hong.Kong | 2.319 | 0.166 6.075 0.528 | -0.115 3.416 0.253 |
Indonesia | 3.934 | 0.143 3.624 0.186 | 0.216 9.687 0.422 |
Japan | 3.785 | 0.088 1.093 0.058 | -0.206 6.990 0.317 |
Malaysia | 3.277 | 0.166 5.468 0.336 | 0.172 6.940 0.363 |
Philippines | 2.820 | 0.119 3.246 0.232 | 0.013 0.044 0.003 |
Taiwan | 1.532 | 0.136 3.939 0.518 | -0.057 0.803 0.090 |
Thailand | 3.192 | 0.194 6.271 0.396 | -0.159 4.986 0.268 |
Vietnam | 2.587 | 0.192 6.968 0.543 | -0.071 1.105 0.073 |
Dim.3 ctr cos2
Australia -0.067 3.192 0.087 |
China 0.155 12.605 0.305 |
Hong.Kong 0.013 0.118 0.003 |
Indonesia 0.076 3.228 0.053 |
Japan 0.138 8.281 0.141 |
Malaysia -0.013 0.108 0.002 |
Philippines -0.057 2.303 0.053 |
Taiwan 0.040 1.044 0.044 |
Thailand 0.015 0.121 0.002 |
Vietnam -0.045 1.176 0.029 |
$coord
Dim 1 Dim 2 Dim 3 Dim 4 Dim 5
Australia 0.08302005 0.015458427 -0.06651434 0.048309050 -0.13749088
China 0.11636375 -0.074221549 0.15477191 0.036122312 -0.06783837
Hong.Kong 0.16633782 -0.115040602 0.01312089 -0.034251859 0.03487409
Indonesia 0.14300827 0.215668276 0.07644340 -0.091139659 -0.02254778
Japan 0.08839067 -0.206139310 0.13776466 -0.074998055 0.12097466
Malaysia 0.16597221 0.172453324 -0.01323118 -0.104430896 -0.02000844
Philippines 0.11859589 0.012681574 -0.05657836 0.140223977 -0.06519760
Taiwan 0.13586556 -0.056572381 0.03962091 0.027554751 0.01156622
Thailand 0.19360528 -0.159222967 0.01521012 0.008073417 0.10828190
Vietnam 0.19217897 -0.070595302 -0.04472258 0.069616494 0.08000705
India 0.01859268 0.160750801 -0.02082196 0.019764236 0.02358728
Singapore 0.17301652 0.020108128 -0.06858183 -0.031258811 -0.02764422
Italy -0.20517073 -0.063331621 0.13022158 -0.019302881 -0.07970447
Spain -0.19406168 -0.072168422 0.18260571 -0.012604036 -0.06944199
USA -0.07242346 -0.105271191 0.01557629 -0.006611319 0.02710857
UAE -0.04138612 0.228636926 0.02420461 0.076638085 0.03735896
Saudi.Arabia -0.17968909 0.406887194 0.07586517 0.031552953 0.09787759
Great.Britain -0.06293473 0.007351443 -0.08739636 -0.135649102 -0.04635864
Germany -0.12155128 -0.096561187 0.01088307 0.038673434 -0.03989288
France -0.17626602 -0.044630256 0.04232722 0.011030364 0.03780012
Denmark -0.17115684 -0.087737443 -0.11416890 -0.054505232 0.04752876
Sweden -0.23389641 -0.052763176 -0.13582623 0.027205028 0.05911801
Finland 0.01304349 0.024829005 -0.02092509 0.020400905 0.02122852
Norway -0.16092400 -0.069835342 -0.12285816 0.036780175 0.01071321
$contrib
Dim 1 Dim 2 Dim 3 Dim 4 Dim 5
Australia 1.59484529 0.06499741 3.1915596 2.58339692 23.24682875
China 2.28539903 1.09294812 12.6046493 1.05355995 4.12801106
Hong.Kong 6.07497542 3.41568449 0.1178444 1.23228854 1.41916073
Indonesia 3.62355346 9.68718987 3.2278415 7.04058479 0.47872205
Japan 1.09339871 6.99037159 8.2805885 3.76571363 10.88473648
Malaysia 5.46843332 6.93982020 0.1083452 10.35694687 0.42235933
Philippines 3.24564352 0.04362352 2.3029369 21.70639279 5.21300464
Taiwan 3.93898613 0.80276332 1.0443221 0.77506879 0.15170932
Thailand 6.27143323 4.98605160 0.1206751 0.05217096 10.42575531
Vietnam 6.96812514 1.10527094 1.1764618 4.37431545 6.41836140
India 0.07957391 6.99207353 0.3111357 0.43015839 0.68062179
Singapore 7.17207131 0.11387429 3.5132372 1.11994309 0.97306435
Italy 6.96039218 0.77957342 8.7415548 0.29473298 5.58253751
Spain 6.25887331 1.01747538 17.2768734 0.12630409 4.25916384
USA 1.20550025 2.99392358 0.1738428 0.04805809 0.89760502
UAE 0.35895882 12.87777811 0.3827819 5.88851984 1.55449099
Saudi.Arabia 5.39271540 32.50312085 2.9968761 0.79547346 8.50342784
Great.Britain 1.40562629 0.02254484 8.4507580 31.23953279 4.05334714
Germany 5.38332033 3.99345936 0.1345407 2.60698306 3.08166441
France 6.18124105 0.46581149 1.1112147 0.11579765 1.51073497
Denmark 4.96947975 1.53498902 6.8934701 2.41090656 2.03656699
Sweden 10.35926118 0.61966427 10.8910268 0.67044258 3.51709997
Finland 0.03370548 0.14356374 0.2704381 0.39445154 0.47447794
Norway 3.67448751 0.81342706 6.6770255 0.91825718 0.08654815
$cos2
Dim 1 Dim 2 Dim 3 Dim 4 Dim 5
Australia 0.135110269 0.004684389 0.086726713 0.0457487624 0.370569877
China 0.172355432 0.070121235 0.304911522 0.0166088899 0.058578789
Hong.Kong 0.528016964 0.252562094 0.003285432 0.0223889730 0.023209807
Indonesia 0.185695505 0.422329679 0.053058998 0.0754212741 0.004616223
Japan 0.058232493 0.316719035 0.141458212 0.0419230176 0.109079050
Malaysia 0.336365850 0.363148510 0.002137660 0.1331677732 0.004888410
Philippines 0.232045811 0.002653266 0.052812301 0.3243986110 0.070128996
Taiwan 0.518495765 0.089894962 0.044093578 0.0213264920 0.003757592
Thailand 0.396037367 0.267863292 0.002444371 0.0006886786 0.123883506
Vietnam 0.543041762 0.073277917 0.029408661 0.0712599972 0.094119162
India 0.008506262 0.635859451 0.010668369 0.0096120248 0.013690223
Singapore 0.654065308 0.008834648 0.102769469 0.0213496561 0.016697624
Italy 0.528989514 0.050403123 0.213099372 0.0046823136 0.079832844
Spain 0.424229573 0.058669956 0.375621170 0.0017895367 0.054320746
USA 0.161290432 0.340776065 0.007460676 0.0013440834 0.022597644
UAE 0.025316396 0.772654355 0.008659420 0.0868124082 0.020629184
Saudi.Arabia 0.146093177 0.749090979 0.026041821 0.0045047069 0.043346403
Great.Britain 0.087721627 0.001196936 0.169165814 0.4075301567 0.047597836
Germany 0.308266011 0.194541276 0.002471207 0.0312055645 0.033204532
France 0.562139978 0.036038446 0.032415064 0.0022013392 0.025851972
Denmark 0.302131137 0.079392030 0.134431978 0.0306396184 0.023298065
Sweden 0.537791761 0.027367107 0.181356998 0.0072755422 0.034356323
Finland 0.038606939 0.139893235 0.099360281 0.0944445590 0.102262756
Norway 0.352352364 0.066356902 0.205373109 0.0184061586 0.001561618
$inertia
[1] 0.0023796267 0.0026731030 0.0023193980 0.0039337967 0.0037852258
[6] 0.0032774015 0.0028197182 0.0015315030 0.0031923409 0.0025867910
[11] 0.0018858648 0.0022105590 0.0026525604 0.0029742252 0.0015067366
[16] 0.0028583893 0.0074414265 0.0032302923 0.0035204908 0.0022167149
[21] 0.0033158438 0.0038832314 0.0001760004 0.0021023171
The CA results describe how countries and cuisines relate in a low-dimensional space; the first two dimensions together account for 54.5% of the total inertia.
On the CA map, three regional groupings are already visible: a Western group, a Middle Eastern group, and an Asian group.
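For readers who want to see what CA computes, here is a minimal base-R sketch of the standard construction: a singular value decomposition of the standardized residuals. It is not the FactoMineR implementation, the 3x3 table is hypothetical, and all variable names are ours. It also illustrates that the total inertia equals Phi-squared.

```r
# Toy contingency table (hypothetical counts)
toy <- matrix(c(30, 10,  5,
                20, 20, 10,
                10, 30, 25),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))

P <- toy / sum(toy)        # correspondence matrix (relative frequencies)
r <- rowSums(P)            # row masses
c_mass <- colSums(P)       # column masses

# Standardized residuals: (observed - expected) / sqrt(expected), in proportions
S <- diag(1 / sqrt(r)) %*% (P - r %o% c_mass) %*% diag(1 / sqrt(c_mass))

dec <- svd(S)
eig <- dec$d^2             # principal inertias (eigenvalues); the last is ~0

# Principal coordinates of rows and columns
row_coord <- diag(1 / sqrt(r)) %*% dec$u %*% diag(dec$d)
col_coord <- diag(1 / sqrt(c_mass)) %*% dec$v %*% diag(dec$d)
rownames(row_coord) <- rownames(toy)
rownames(col_coord) <- colnames(toy)

total_inertia <- sum(eig)
phi2 <- unname(chisq.test(toy, correct = FALSE)$statistic) / sum(toy)
print(c(total_inertia = total_inertia, phi2 = phi2))
print(round(100 * eig / total_inertia, 2))   # percentage of inertia per dimension
```

A table with I rows and J columns has at most min(I, J) - 1 non-trivial dimensions, which is why the cuisine table yields 23 of them.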
Scree plot analysis
This visualization presents the percentage of variance explained by each dimension in our culinary preferences analysis. The scree plot is a critical tool for determining how many dimensions we should retain for meaningful interpretation. Each bar represents one dimension, with its height indicating the proportion of total variance explained by that dimension.
# Scree plot with fviz_screeplot
fviz_screeplot(results,
               addlabels = TRUE,   # add percentage labels
               ncp = 10,           # display the first 10 dimensions
               choice = "variance",
               main = "Percentage of Variance Explained by Dimensions",
               xlab = "Dimensions",
               ylab = "Percentage of Explained Variance",
               barfill = "steelblue",
               barcolor = "steelblue",
               linecolor = "red") +
  # Reference line at the average share of inertia per dimension.
  # results$eig has one row per dimension (23 here), whereas
  # results$row$coord only holds the 5 dimensions kept by default.
  geom_hline(yintercept = 100 / nrow(results$eig),
             linetype = "dashed",
             color = "gray70")
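One common rule of thumb, which we assume here rather than take from the analysis itself, is to keep the dimensions whose share of inertia exceeds the average share. The sketch below applies it to the percentages copied from the eigenvalue table above; the variable names are ours.

```r
# Percentage of inertia per dimension, copied from the eigenvalue table above
pct <- c(29.441, 25.046, 9.444, 6.154, 5.540, 4.410, 3.602, 2.997,
         2.263, 2.148, 1.856, 1.498, 1.202, 1.038, 0.884, 0.798,
         0.571, 0.364, 0.285, 0.229, 0.104, 0.093, 0.031)

avg_share <- 100 / length(pct)   # average share if all dimensions were equal
keep <- which(pct > avg_share)   # dimensions above the average share
cum_pct <- cumsum(pct)           # cumulative percentage of inertia

print(round(avg_share, 2))
print(keep)
print(cum_pct[c(2, max(keep))])
```

With 23 dimensions the average share is about 4.35%, so the first six dimensions pass the threshold and jointly cover about 80% of the inertia, while the two-dimensional map used afterwards covers 54.5%.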
Elbow method
After extracting country coordinates from our first two principal dimensions, we apply the elbow method to identify the optimal number of clusters. This method plots the within-cluster sum of squares against different cluster counts, helping us find where adding more clusters provides diminishing returns.
Based on the elbow plot, k = 3 emerges as a reasonable choice, suggesting three distinct culinary preference patterns among countries. This choice captures meaningful differences without adding unnecessary complexity to the classification.
# Extract the country coordinates (first 2 dimensions only);
# countries are the columns of the contingency table
ind_coordinates <- results$col$coord[, 1:2]
# Elbow method
fviz_nbclust(ind_coordinates, kmeans, method = "wss") +
  labs(title = "Elbow method")
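The quantity behind the elbow plot can be computed with base R alone. This is a rough sketch on simulated two-dimensional points (hypothetical data, three loose groups), not on the CA coordinates. The nstart = 25 restarts are our choice, made to reduce the risk of poor local optima.

```r
set.seed(123)
# Hypothetical 2-D points forming three loose groups of 10
pts <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2),
  cbind(rnorm(10, mean = 0, sd = 0.3), rnorm(10, mean = 3, sd = 0.3))
)

# Total within-cluster sum of squares for k = 1, ..., 6
k_values <- 1:6
wss <- sapply(k_values, function(k) {
  kmeans(pts, centers = k, nstart = 25)$tot.withinss
})

print(round(wss, 2))
```

Plotting wss against k_values gives the elbow curve: the drop is steep up to the true number of groups and flattens afterwards.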
K-means Clustering of Culinary Preferences
After determining three as our optimal number of clusters, we apply K-means clustering to group countries based on their culinary preference patterns. Setting a seed ensures our analysis is reproducible.
The visualization displays countries positioned according to their coordinates on the first two principal dimensions, with each cluster represented by a different color. Countries within the same cluster share similar culinary preferences, while those in different clusters exhibit distinct patterns. The convex hulls drawn around each cluster help visualize their boundaries and separation.
The text labels identify individual countries, with a repel feature preventing overlap for better readability. This clustering provides a meaningful segmentation of global culinary landscapes, revealing how geographical, cultural, or historical factors may influence food preferences across different regions of the world.
# K-means clustering
set.seed(123)   # for reproducibility
k <- 3          # chosen number of clusters
kmeans_result <- kmeans(ind_coordinates, centers = k)
# Plot the clusters with text labels
p <- fviz_cluster(kmeans_result,
                  data = ind_coordinates,
                  geom = c("point", "text"),
                  stand = TRUE,
                  ellipse.type = "convex",
                  repel = TRUE,
                  ggtheme = theme_minimal(),
                  main = "Clustering of Countries Based on Culinary Preferences")
# Display the plot
print(p)
Clustering sizes
This bar chart illustrates how countries are distributed among our three identified culinary preference clusters. Each bar represents a cluster, with the count of countries displayed above.
The varying sizes help us quickly identify which culinary patterns are more common globally versus those that represent more distinctive traditions. This simple visualization quantifies the membership of each culinary group, providing context for our subsequent interpretation of what characterizes each cluster of countries.
# Cluster sizes
cluster_sizes <- table(kmeans_result$cluster)
bp <- barplot(cluster_sizes,
              main = "Cluster Sizes",
              xlab = "Cluster",
              ylab = "Number of Countries",
              col = "steelblue",
              ylim = c(0, max(cluster_sizes) * 1.2))   # extra headroom for the labels
# Add the count above each bar
text(x = bp,
     y = cluster_sizes + max(cluster_sizes) * 0.05,    # slightly above the bars
     labels = cluster_sizes,
     col = "black",
     font = 2)
Dispersion within Clusters
This boxplot shows how tightly grouped countries are within each culinary cluster by measuring their distances from cluster centers.
Clusters with lower distances indicate groups of countries sharing very similar culinary preferences, while higher distances suggest more diverse traditions within that cluster. This visualization helps us evaluate the internal cohesion of each culinary pattern and understand which clusters represent more uniform versus more varied preference groups.
# Distance from each point to the center of its cluster
within_cluster_distances <- rep(0, nrow(ind_coordinates))
for (i in seq_len(nrow(ind_coordinates))) {
  cluster_i <- kmeans_result$cluster[i]
  center_i <- kmeans_result$centers[cluster_i, ]
  within_cluster_distances[i] <- dist(rbind(ind_coordinates[i, ], center_i))
}
boxplot(within_cluster_distances ~ kmeans_result$cluster,
        main = "Dispersion within Clusters",
        xlab = "Cluster",
        ylab = "Distance to Center")
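The same distances can also be obtained without a loop. The sketch below is one possible vectorized variant, run on simulated points rather than the CA coordinates, and the names pts, km and d are ours. As a sanity check, the squared distances summed per cluster should reproduce the withinss values that kmeans reports.

```r
set.seed(123)
# Hypothetical 2-D points forming three loose groups of 10
pts <- rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2),
  cbind(rnorm(10, mean = 0, sd = 0.3), rnorm(10, mean = 3, sd = 0.3))
)
km <- kmeans(pts, centers = 3, nstart = 25)

# Look up the center assigned to each point, one row per point
centers_per_point <- km$centers[km$cluster, , drop = FALSE]
# Euclidean distance from each point to its own cluster center
d <- sqrt(rowSums((pts - centers_per_point)^2))

# Squared distances summed by cluster should equal km$withinss
ss_by_cluster <- tapply(d^2, km$cluster, sum)
print(round(cbind(manual = ss_by_cluster, kmeans = km$withinss), 4))
```

The vector d can be passed to boxplot(d ~ km$cluster) exactly as in the loop-based version.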
Conclusion
This analysis highlights culinary preferences across 24 countries and 34 cuisines, using a combination of descriptive statistics, independence tests, and exploratory methods. Italian, Chinese, and Mexican cuisines emerge as the most globally preferred.
The chi-square test reveals a statistically significant relationship between countries and cuisines, but with weak association strength (Phi² ≈ 0.068, normalized ≈ 0.003), reflecting high diversity in preferences both between and within countries.
Correspondence Analysis (CA) was then used to visualize these complex relationships, revealing clusters of countries and cuisines with similar profiles. Finally, clustering techniques helped identify coherent groups, illustrating regional or cultural similarities in food preferences.
In summary, culinary preferences are both globalized and culturally rooted, and the multidimensional approach used here provides valuable insights into global taste patterns.