library(factoextra)
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
library(gridExtra)
This study aims to systematically analyze the overall evolution and temporal structure of the European aviation industry from 2013 to 2024 using dimensionality reduction techniques. Specifically, the objectives of this study are as follows:
To apply principal component analysis (PCA) to high-dimensional country–year aviation transport data in order to extract the main patterns of variation, identify the core common factors driving changes in the European aviation industry, and assess the relative roles and degree of synchronization among different countries within the overall market.
To characterize the temporal evolution of the European aviation industry, with a particular focus on the structural differences in the principal component space across the pre-pandemic period (2013–2019), the pandemic-induced shock period (2020–2021), and the post-pandemic recovery phase (2022–2024), thereby revealing the structural break caused by COVID-19 and the subsequent recovery process.
To complement the PCA results with multidimensional scaling (MDS), examining similarities between years from a distance-based perspective and comparing the results with those obtained from PCA, in order to assess the robustness of the identified temporal segmentation and its dependence on the choice of dimensionality reduction method.
To compare the aviation development patterns of EU and non-EU countries, exploring the high degree of synchronization within the aviation industry under the EU single market framework and the relative heterogeneity exhibited by non-EU countries in the overall structure.
Through these analyses, this study seeks to enhance the understanding of the evolution mechanisms of the European aviation market from both an overall and a structural perspective, and to provide quantitative evidence for subsequent policy analysis and industry-related research.
The data are sourced from the Eurostat database, the official statistical database maintained by the European Statistical Office (Eurostat). It provides comprehensive statistics covering multiple domains, including the economy, society, population, employment, education, health, and the environment. The data in this database are highly standardized and comparable across countries, covering all EU Member States as well as several related countries, and are regularly updated. As a result, Eurostat data are widely used in academic research and policy analysis.
https://ec.europa.eu/eurostat/web/main/data/database
Within the Traffic subdirectory, information on air transport activity for EU Member States and other European countries can be found; the extract used here covers 2013 to 2024. These data are used in this study to analyze annual changes in the aviation industry across EU and other European countries.
aireu<-read.csv("C:/Users/13640/Desktop/pass.csv", header=TRUE)
head(aireu)
## freq.unit.tra_meas.tra_cov.schedule.geo.TIME_PERIOD X2013 X2014
## 1 A,PAS,PAS_CRD,TOTAL,TOT,AT 25749724 26378676
## 2 A,PAS,PAS_CRD,TOTAL,TOT,BA : :
## 3 A,PAS,PAS_CRD,TOTAL,TOT,BE 26389927 28776258
## 4 A,PAS,PAS_CRD,TOTAL,TOT,BG 7079292 7520697
## 5 A,PAS,PAS_CRD,TOTAL,TOT,CH 44217568 46127426
## 6 A,PAS,PAS_CRD,TOTAL,TOT,CY 7011437 7328546
## X2015 X2016 X2017 X2018 X2019 X2020 X2021
## 1 26754007 27181511 28327279 31138417 35644188 9168431 11105564
## 2 : : : : : : 987659
## 3 30958841 30115832 33260493 34506309 35385188 9465828 13500020
## 4 7610949 9324217 11092651 12137714 11713068 3729017 5047877
## 5 48026375 50505492 53564943 56139549 57194328 16006811 19109708
## 6 7590787 8961817 10238913 10927101 11261410 2270577 5099704
## X2022 X2023 X2024
## 1 26381180 33063166 35281811
## 2 1769813 1917082 :
## 3 27873892 32341221 34759837
## 4 8807502 10561597 10961466
## 5 42568368 52090531 56674840
## 6 9200931 11616238 12264970
summary(aireu)
## freq.unit.tra_meas.tra_cov.schedule.geo.TIME_PERIOD X2013
## Length:37 Length:37
## Class :character Class :character
## Mode :character Mode :character
## X2014 X2015 X2016 X2017
## Length:37 Length:37 Length:37 Length:37
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## X2018 X2019 X2020 X2021
## Length:37 Length:37 Length:37 Length:37
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## X2022 X2023 X2024
## Length:37 Length:37 Length:37
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
We performed the following preprocessing steps: the missing-value marker ":" was replaced with NA, the year columns were identified by name, and their values were converted from character to numeric so that subsequent numerical analyses are possible.
aireu[aireu == ":"] <- NA
year_cols <- grep("^X[0-9]{4}$", colnames(aireu))
colnames(aireu)[year_cols]
## [1] "X2013" "X2014" "X2015" "X2016" "X2017" "X2018" "X2019" "X2020" "X2021"
## [10] "X2022" "X2023" "X2024"
aireu[year_cols] <- lapply(aireu[year_cols], as.numeric)
summary(aireu)
## freq.unit.tra_meas.tra_cov.schedule.geo.TIME_PERIOD X2013
## Length:37 Min. : 1265766
## Class :character 1st Qu.: 5487400
## Mode :character Median : 23939062
## Mean : 62310055
## 3rd Qu.: 38569165
## Max. :746100398
## NA's :5
## X2014 X2015 X2016
## Min. : 1307128 Min. : 1436003 Min. : 1404152
## 1st Qu.: 5806168 1st Qu.: 5145856 1st Qu.: 5232303
## Median : 26012624 Median : 26754007 Median : 18099954
## Mean : 65320751 Mean : 66690435 Mean : 67253019
## 3rd Qu.: 40870231 3rd Qu.: 42096402 3rd Qu.: 43236708
## Max. :781202599 Max. :819698948 Max. :871695782
## NA's :5 NA's :4 NA's :2
## X2017 X2018 X2019
## Min. : 1682133 Min. : 1810567 Min. :1.719e+06
## 1st Qu.: 6042792 1st Qu.: 6921444 1st Qu.:7.451e+06
## Median : 20054947 Median : 22173530 Median :2.329e+07
## Mean : 72534524 Mean : 76774819 Mean :7.948e+07
## 3rd Qu.: 48921892 3rd Qu.: 52638712 3rd Qu.:5.555e+07
## Max. :938854476 Max. :996295411 Max. :1.035e+09
## NA's :2 NA's :2 NA's :2
## X2020 X2021 X2022
## Min. : 287787 Min. : 419346 Min. : 968811
## 1st Qu.: 1837992 1st Qu.: 2450871 1st Qu.: 5614983
## Median : 6031034 Median : 5099704 Median : 13812577
## Mean : 19706028 Mean : 26179610 Mean : 57003915
## 3rd Qu.: 15461473 3rd Qu.: 19001760 3rd Qu.: 40958032
## Max. :276758108 Max. :373809763 Max. :816699952
## NA's :3 NA's :2 NA's :2
## X2023 X2024
## Min. : 1268352 Min. :1.438e+06
## 1st Qu.: 7514804 1st Qu.:8.691e+06
## Median : 19783568 Median :2.459e+07
## Mean : 70582699 Mean :7.877e+07
## 3rd Qu.: 54344226 3rd Qu.:6.052e+07
## Max. :972941917 Max. :1.054e+09
## NA's :1 NA's :2
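As a possible alternative to the two-step cleaning above, the ":" marker can be declared as missing when the file is read, so that the year columns are parsed as numbers directly. The following is a minimal sketch on a three-row toy extract; the `csv_text` string and the `geo` column name are illustrative assumptions, not the layout of the actual export.

```r
# Toy stand-in for the Eurostat export; `csv_text` and `geo` are hypothetical.
csv_text <- "geo,X2013,X2014
AT,25749724,26378676
BA,:,:
BE,26389927,28776258"

# na.strings = ":" turns the missing marker into NA at read time;
# strip.white = TRUE also catches markers padded with spaces such as ": ".
toy <- read.csv(text = csv_text, header = TRUE,
                na.strings = ":", strip.white = TRUE)

sapply(toy, class)    # year columns are read as numbers, no as.numeric() step needed
colSums(is.na(toy))   # number of missing values per column
```

Whether this is preferable depends on the export: if some cells carry additional flags next to the value, the explicit replacement used above may be easier to extend.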
Subsequently, the data were transposed from a “region × year” structure to a “year × region” time-series matrix, and the column names were cleaned and standardized.
dim(aireu)
## [1] 37 13
year_cols <- grep("^X[0-9]{4}$", colnames(aireu))
air_mat <- as.matrix(aireu[, year_cols])
air_time <- t(air_mat)
summary(air_time)
## V1 V2 V3 V4
## Min. : 9168431 Min. : 987659 Min. : 9465828 Min. : 3729017
## 1st Qu.:26221438 1st Qu.:1378736 1st Qu.:27502901 1st Qu.: 7410346
## Median :26967759 Median :1769813 Median :30537336 Median : 9065860
## Mean :26347830 Mean :1558185 Mean :28111137 Mean : 8798837
## 3rd Qu.:31619604 3rd Qu.:1843448 3rd Qu.:33571947 3rd Qu.:10994262
## Max. :35644188 Max. :1917082 Max. :35385188 Max. :12137714
## NA's :9
## V5 V6 V7 V8
## Min. :16006811 Min. : 2270577 Min. : 3821372 Min. : 57795978
## 1st Qu.:43805268 1st Qu.: 7249269 1st Qu.:11802022 1st Qu.:174413052
## Median :49265934 Median : 9081374 Median :13172183 Median :190191122
## Mean :45185495 Mean : 8647703 Mean :12995590 Mean :174579965
## 3rd Qu.:54208594 3rd Qu.:11010678 3rd Qu.:16620692 3rd Qu.:203612806
## Max. :57194328 Max. :12264970 Max. :18767088 Max. :226764086
##
## V9 V10 V11 V12
## Min. : 8658654 Min. : 857837 Min. :17341192 Min. : 57797305
## 1st Qu.:27257110 1st Qu.:2004496 1st Qu.:37844358 1st Qu.:163448780
## Median :30916440 Median :2425067 Median :47857050 Median :197799836
## Mean :27900418 Mean :2378618 Mean :46931793 Mean :183126034
## 3rd Qu.:33621195 3rd Qu.:2958663 3rd Qu.:56542888 3rd Qu.:222524165
## Max. :34865711 Max. :3471878 Max. :71028749 Max. :259739884
##
## V13 V14 V15 V16
## Min. :2.768e+08 Min. : 4554497 Min. : 50724011 Min. : 1943547
## 1st Qu.:7.724e+08 1st Qu.:15877188 1st Qu.:135461222 1st Qu.: 6036210
## Median :8.457e+08 Median :17325588 Median :143074086 Median : 8159258
## Mean :8.069e+08 Mean :16087157 Mean :135071213 Mean : 7863082
## 3rd Qu.:9.788e+08 3rd Qu.:18588702 3rd Qu.:160545991 3rd Qu.: 9954280
## Max. :1.054e+09 Max. :23287929 Max. :168726788 Max. :12610863
##
## V17 V18 V19 V20
## Min. : 3962687 Min. : 8268297 Min. : 1527633 Min. : 40405355
## 1st Qu.: 8901466 1st Qu.:25884030 1st Qu.: 3690027 1st Qu.:119693216
## Median :12026939 Median :32500800 Median : 6632646 Median :133451750
## Mean :11527551 Mean :29272366 Mean : 6032473 Mean :127890384
## 3rd Qu.:14980317 3rd Qu.:36745631 3rd Qu.: 8252340 3rd Qu.:155181318
## Max. :17781961 Max. :40837815 Max. :10166386 Max. :182370612
##
## V21 V22 V23 V24
## Min. :1804500 Min. :1426310 Min. :1995133 Min. : 521959
## 1st Qu.:3719172 1st Qu.:2367806 1st Qu.:4797276 1st Qu.:1845464
## Median :5016831 Median :3269486 Median :5376264 Median :2173494
## Mean :4706512 Mean :3297885 Mean :5369451 Mean :2024112
## 3rd Qu.:6057558 3rd Qu.:4134328 3rd Qu.:6723326 3rd Qu.:2493209
## Max. :6582749 Max. :5147854 Max. :7785726 Max. :2871774
## NA's :3
## V25 V26 V27 V28
## Min. : 709241 Min. :1752445 Min. :23594783 Min. :13216883
## 1st Qu.:1501623 1st Qu.:4225531 1st Qu.:60241570 1st Qu.:33059246
## Median :1998135 Median :5471022 Median :67444466 Median :37553124
## Mean :1989149 Mean :5424804 Mean :62714618 Mean :33338751
## 3rd Qu.:2303182 3rd Qu.:6933952 3rd Qu.:76246802 3rd Qu.:38399377
## Max. :3169418 Max. :8968239 Max. :81192507 Max. :40348437
## NA's :2
## V29 V30 V31 V32
## Min. :13825460 Min. :16548993 Min. : 6633447 Min. :1938468
## 1st Qu.:25104438 1st Qu.:31844002 1st Qu.:10776768 1st Qu.:4414858
## Median :34975764 Median :44301550 Median :16544246 Median :5521250
## Mean :34819941 Mean :42337867 Mean :16006462 Mean :5519979
## 3rd Qu.:44561354 3rd Qu.:52126117 3rd Qu.:20248698 3rd Qu.:6450643
## Max. :57045518 Max. :63996712 Max. :24590200 Max. :8715276
## NA's :3
## V33 V34 V35 V36
## Min. : 9317677 Min. : 287787 Min. : 500604 Min. :168558965
## 1st Qu.:28340082 1st Qu.:1191527 1st Qu.:1642755 1st Qu.:172087594
## Median :32104634 Median :1355640 Median :2050959 Median :175616222
## Mean :29467972 Mean :1250562 Mean :1963682 Mean :175616222
## 3rd Qu.:36368109 3rd Qu.:1498780 3rd Qu.:2494402 3rd Qu.:179144851
## Max. :38945096 Max. :1810567 Max. :2839787 Max. :182673480
## NA's :10
## V37
## Min. :210468980
## 1st Qu.:226146280
## Median :248868873
## Mean :246554629
## 3rd Qu.:268409804
## Max. :277432380
## NA's :5
head(air_time)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## X2013 25749724 NA 26389927 7079292 44217568 7011437 11891812 180783188
## X2014 26378676 NA 28776258 7520697 46127426 7328546 12079873 186445814
## X2015 26754007 NA 30958841 7610949 48026375 7590787 12672004 193936430
## X2016 27181511 NA 30115832 9324217 50505492 8961817 13672362 200687293
## X2017 28327279 NA 33260493 11092651 53564943 10238913 16245554 212389343
## X2018 31138417 NA 34506309 12137714 56139549 10927101 17838221 222422361
## [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
## X2013 27459623 1958565 34023934 157731973 746100398 16565391 132762875 5722447
## X2014 29015133 2019806 39117833 165354382 781202599 17171931 136360671 6140797
## X2015 30095505 2160978 42096402 174652503 819698948 17479246 140867569 6571698
## X2016 32763142 2214989 45543371 193872037 871695782 18099954 145280602 7475463
## X2017 33261214 2635145 50170728 209824089 938854476 20054947 154096485 8843053
## X2018 34701139 2995528 54258826 220611429 996295411 22173530 161991179 9731294
## [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
## X2013 8441319 24603640 3199266 115279105 3482358 2169327 4782257 NA
## X2014 9054848 26310826 3853614 121164587 3798110 2433966 4802282 NA
## X2015 10228352 29545020 4847288 127665221 4227389 2651751 5145856 NA
## X2016 11660366 32595709 6801814 134477781 4787561 2984242 5384160 1845464
## X2017 13350029 34271771 8728509 144306325 5246101 3554730 6077854 2173494
## X2018 15176493 36345005 10166386 153352444 6254178 3988804 7037070 2440486
## [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32]
## X2013 NA 4032029 58077271 36686364 23274484 29694146 10016933 NA
## X2014 NA 4290032 60963003 37603195 25714422 32560621 10907487 NA
## X2015 1452373 4619557 64570938 37503052 28907439 36005814 12580711 NA
## X2016 1649374 5080446 70317995 37727546 32266861 40930044 15153719 4414858
## X2017 1861282 6007731 76240304 38739778 37684668 47673057 17934774 4828171
## X2018 2152746 6805817 79644163 40030105 43767548 51018598 19809642 5521250
## [,33] [,34] [,35] [,36] [,37]
## X2013 31443225 1265766 1557149 NA 210468980
## X2014 32766043 1307128 1671290 NA 220022122
## X2015 34011263 1436003 1943656 NA 232270437
## X2016 35952558 1404152 2158261 NA 248868873
## X2017 38456213 1682133 2402651 NA 264629454
## X2018 38945096 1810567 2794094 NA 272190155
colnames(air_time)
## NULL
colnames(air_time) <- aireu[, 1]
head(colnames(air_time))
## [1] "A,PAS,PAS_CRD,TOTAL,TOT,AT" "A,PAS,PAS_CRD,TOTAL,TOT,BA"
## [3] "A,PAS,PAS_CRD,TOTAL,TOT,BE" "A,PAS,PAS_CRD,TOTAL,TOT,BG"
## [5] "A,PAS,PAS_CRD,TOTAL,TOT,CH" "A,PAS,PAS_CRD,TOTAL,TOT,CY"
colnames(air_time) <- sub(".*,", "", colnames(air_time))
head(colnames(air_time))
## [1] "AT" "BA" "BE" "BG" "CH" "CY"
summary(air_time)
## AT BA BE BG
## Min. : 9168431 Min. : 987659 Min. : 9465828 Min. : 3729017
## 1st Qu.:26221438 1st Qu.:1378736 1st Qu.:27502901 1st Qu.: 7410346
## Median :26967759 Median :1769813 Median :30537336 Median : 9065860
## Mean :26347830 Mean :1558185 Mean :28111137 Mean : 8798837
## 3rd Qu.:31619604 3rd Qu.:1843448 3rd Qu.:33571947 3rd Qu.:10994262
## Max. :35644188 Max. :1917082 Max. :35385188 Max. :12137714
## NA's :9
## CH CY CZ DE
## Min. :16006811 Min. : 2270577 Min. : 3821372 Min. : 57795978
## 1st Qu.:43805268 1st Qu.: 7249269 1st Qu.:11802022 1st Qu.:174413052
## Median :49265934 Median : 9081374 Median :13172183 Median :190191122
## Mean :45185495 Mean : 8647703 Mean :12995590 Mean :174579965
## 3rd Qu.:54208594 3rd Qu.:11010678 3rd Qu.:16620692 3rd Qu.:203612806
## Max. :57194328 Max. :12264970 Max. :18767088 Max. :226764086
##
## DK EE EL ES
## Min. : 8658654 Min. : 857837 Min. :17341192 Min. : 57797305
## 1st Qu.:27257110 1st Qu.:2004496 1st Qu.:37844358 1st Qu.:163448780
## Median :30916440 Median :2425067 Median :47857050 Median :197799836
## Mean :27900418 Mean :2378618 Mean :46931793 Mean :183126034
## 3rd Qu.:33621195 3rd Qu.:2958663 3rd Qu.:56542888 3rd Qu.:222524165
## Max. :34865711 Max. :3471878 Max. :71028749 Max. :259739884
##
## EU27_2020 FI FR HR
## Min. :2.768e+08 Min. : 4554497 Min. : 50724011 Min. : 1943547
## 1st Qu.:7.724e+08 1st Qu.:15877188 1st Qu.:135461222 1st Qu.: 6036210
## Median :8.457e+08 Median :17325588 Median :143074086 Median : 8159258
## Mean :8.069e+08 Mean :16087157 Mean :135071213 Mean : 7863082
## 3rd Qu.:9.788e+08 3rd Qu.:18588702 3rd Qu.:160545991 3rd Qu.: 9954280
## Max. :1.054e+09 Max. :23287929 Max. :168726788 Max. :12610863
##
## HU IE IS IT
## Min. : 3962687 Min. : 8268297 Min. : 1527633 Min. : 40405355
## 1st Qu.: 8901466 1st Qu.:25884030 1st Qu.: 3690027 1st Qu.:119693216
## Median :12026939 Median :32500800 Median : 6632646 Median :133451750
## Mean :11527551 Mean :29272366 Mean : 6032473 Mean :127890384
## 3rd Qu.:14980317 3rd Qu.:36745631 3rd Qu.: 8252340 3rd Qu.:155181318
## Max. :17781961 Max. :40837815 Max. :10166386 Max. :182370612
##
## LT LU LV ME
## Min. :1804500 Min. :1426310 Min. :1995133 Min. : 521959
## 1st Qu.:3719172 1st Qu.:2367806 1st Qu.:4797276 1st Qu.:1845464
## Median :5016831 Median :3269486 Median :5376264 Median :2173494
## Mean :4706512 Mean :3297885 Mean :5369451 Mean :2024112
## 3rd Qu.:6057558 3rd Qu.:4134328 3rd Qu.:6723326 3rd Qu.:2493209
## Max. :6582749 Max. :5147854 Max. :7785726 Max. :2871774
## NA's :3
## MK MT NL NO
## Min. : 709241 Min. :1752445 Min. :23594783 Min. :13216883
## 1st Qu.:1501623 1st Qu.:4225531 1st Qu.:60241570 1st Qu.:33059246
## Median :1998135 Median :5471022 Median :67444466 Median :37553124
## Mean :1989149 Mean :5424804 Mean :62714618 Mean :33338751
## 3rd Qu.:2303182 3rd Qu.:6933952 3rd Qu.:76246802 3rd Qu.:38399377
## Max. :3169418 Max. :8968239 Max. :81192507 Max. :40348437
## NA's :2
## PL PT RO RS
## Min. :13825460 Min. :16548993 Min. : 6633447 Min. :1938468
## 1st Qu.:25104438 1st Qu.:31844002 1st Qu.:10776768 1st Qu.:4414858
## Median :34975764 Median :44301550 Median :16544246 Median :5521250
## Mean :34819941 Mean :42337867 Mean :16006462 Mean :5519979
## 3rd Qu.:44561354 3rd Qu.:52126117 3rd Qu.:20248698 3rd Qu.:6450643
## Max. :57045518 Max. :63996712 Max. :24590200 Max. :8715276
## NA's :3
## SE SI SK TR
## Min. : 9317677 Min. : 287787 Min. : 500604 Min. :168558965
## 1st Qu.:28340082 1st Qu.:1191527 1st Qu.:1642755 1st Qu.:172087594
## Median :32104634 Median :1355640 Median :2050959 Median :175616222
## Mean :29467972 Mean :1250562 Mean :1963682 Mean :175616222
## 3rd Qu.:36368109 3rd Qu.:1498780 3rd Qu.:2494402 3rd Qu.:179144851
## Max. :38945096 Max. :1810567 Max. :2839787 Max. :182673480
## NA's :10
## UK
## Min. :210468980
## 1st Qu.:226146280
## Median :248868873
## Mean :246554629
## 3rd Qu.:268409804
## Max. :277432380
## NA's :5
dim(air_time)
## [1] 12 37
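The transposition and labelling steps above can also be done in one pass by attaching the country codes as row names before calling `t()`. Below is a small sketch on two rows copied from the table above; the object names `toy`, `mat`, and `toy_time` are illustrative. Stripping the `X` prefix from the year labels is an optional extra, not something the analysis above does.

```r
# Two rows from the raw table, kept in the original "region x year" layout
toy <- data.frame(
  key   = c("A,PAS,PAS_CRD,TOTAL,TOT,AT", "A,PAS,PAS_CRD,TOTAL,TOT,BE"),
  X2013 = c(25749724, 26389927),
  X2014 = c(26378676, 28776258)
)

year_cols <- grep("^X[0-9]{4}$", colnames(toy))
mat <- as.matrix(toy[, year_cols])

# Keep only the text after the last comma (the country code) as row name
rownames(mat) <- sub(".*,", "", toy$key)

# Transpose to "year x region"; the country codes become column names
toy_time <- t(mat)
rownames(toy_time) <- sub("^X", "", rownames(toy_time))  # optional: "X2013" -> "2013"
toy_time
```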
col_var <- apply(air_time, 2, var, na.rm = TRUE)
col_var
## AT BA BE BG CH CY
## 6.971638e+13 2.495467e+11 6.901332e+13 7.157922e+12 1.900447e+14 8.695569e+12
## CZ DE DK EE EL ES
## 2.276762e+13 2.965730e+15 7.991660e+13 6.120454e+11 2.239351e+14 3.486241e+15
## EU27_2020 FI FR HR HU IE
## 6.102672e+16 3.336933e+13 1.446896e+15 9.251603e+12 1.967285e+13 1.162854e+14
## IS IT LT LU LV ME
## 7.837456e+12 1.704888e+15 2.499450e+12 1.420915e+12 3.179985e+12 5.427816e+11
## MK MT NL NO PL PT
## 6.094041e+11 4.549305e+12 3.471177e+14 8.979898e+13 1.761256e+14 2.296188e+14
## RO RS SE SI SK TR
## 3.431209e+13 4.801279e+12 9.898155e+13 2.284819e+11 5.962417e+11 9.960977e+13
## UK
## 6.921428e+14
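The variance of each region over time, computed above, was then used to identify variables that exhibit no variation across the entire time period. Such variables would have to be removed before standardization, because a constant series cannot be scaled to unit variance. No region has zero variance here, so all 37 columns are retained.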
zero_var_cols <- which(col_var == 0)
zero_var_cols
## named integer(0)
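One caveat with `which(col_var == 0)`: a column with fewer than two observed values has an `NA` variance, and the comparison silently skips it. A hedged sketch of a slightly more defensive filter on a toy matrix (all names here are hypothetical) could look like this:

```r
# Toy matrix: "a" varies, "b" is constant, "c" has a single observed value
toy <- cbind(a = c(1, 2, 3), b = c(5, 5, 5), c = c(NA, NA, 4))

v <- apply(toy, 2, var, na.rm = TRUE)   # variance of "c" is NA (one observation)

# Drop columns whose variance is zero or cannot be computed
drop <- is.na(v) | v == 0
toy_kept <- toy[, !drop, drop = FALSE]  # drop = FALSE keeps the matrix structure
colnames(toy_kept)
```

In the data used here every series has at least two observed years, so this check and the simpler one above lead to the same result.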
Missing values were imputed with the mean of each country's observed values so that the computations could be carried out on a complete matrix. Finally, the data were standardized (each column centered and scaled to unit variance) to prevent variables with large dispersion from dominating the variance and unduly influencing the PCA results.
air_time <- apply(
air_time,
2,
function(x) {
x[is.na(x)] <- mean(x, na.rm = TRUE)
x
}
)
air_scaled <- scale(air_time)
summary(air_scaled)
## AT BA BE BG
## Min. :-2.05750 Min. :-2.678 Min. :-2.24441 Min. :-1.89496
## 1st Qu.:-0.01514 1st Qu.: 0.000 1st Qu.:-0.07322 1st Qu.:-0.51898
## Median : 0.07425 Median : 0.000 Median : 0.29205 Median : 0.09981
## Mean : 0.00000 Mean : 0.000 Mean : 0.00000 Mean : 0.00000
## 3rd Qu.: 0.63138 3rd Qu.: 0.000 3rd Qu.: 0.65734 3rd Qu.: 0.82059
## Max. : 1.11339 Max. : 1.685 Max. : 0.87561 Max. : 1.24798
## CH CY CZ DE
## Min. :-2.1166 Min. :-2.1626 Min. :-1.92269 Min. :-2.144458
## 1st Qu.:-0.1001 1st Qu.:-0.4742 1st Qu.:-0.25014 1st Qu.:-0.003065
## Median : 0.2960 Median : 0.1471 Median : 0.03701 Median : 0.286661
## Mean : 0.0000 Mean : 0.0000 Mean : 0.00000 Mean : 0.000000
## 3rd Qu.: 0.6545 3rd Qu.: 0.8013 3rd Qu.: 0.75973 3rd Qu.: 0.533119
## Max. : 0.8711 Max. : 1.2267 Max. : 1.20957 Max. : 0.958236
## DK EE EL ES
## Min. :-2.15242 Min. :-1.94390 Min. :-1.97739 Min. :-2.1226
## 1st Qu.:-0.07196 1st Qu.:-0.47821 1st Qu.:-0.60727 1st Qu.:-0.3333
## Median : 0.33738 Median : 0.05937 Median : 0.06183 Median : 0.2485
## Mean : 0.00000 Mean : 0.00000 Mean : 0.00000 Mean : 0.0000
## 3rd Qu.: 0.63994 3rd Qu.: 0.74143 3rd Qu.: 0.64226 3rd Qu.: 0.6673
## Max. : 0.77915 Max. : 1.39744 Max. : 1.61028 Max. : 1.2976
## EU27_2020 FI FR HR
## Min. :-2.1460 Min. :-1.99644 Min. :-2.21744 Min. :-1.94616
## 1st Qu.:-0.1396 1st Qu.:-0.03635 1st Qu.: 0.01025 1st Qu.:-0.60062
## Median : 0.1570 Median : 0.21439 Median : 0.21039 Median : 0.09737
## Mean : 0.0000 Mean : 0.00000 Mean : 0.00000 Mean : 0.00000
## 3rd Qu.: 0.6957 3rd Qu.: 0.43305 3rd Qu.: 0.66972 3rd Qu.: 0.68752
## Max. : 0.9991 Max. : 1.24654 Max. : 0.88479 Max. : 1.56093
## HU IE IS IT
## Min. :-1.7056 Min. :-1.9478 Min. :-1.6091 Min. :-2.1188
## 1st Qu.:-0.5921 1st Qu.:-0.3142 1st Qu.:-0.8367 1st Qu.:-0.1985
## Median : 0.1126 Median : 0.2994 Median : 0.2144 Median : 0.1347
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 0.7785 3rd Qu.: 0.6930 3rd Qu.: 0.7929 3rd Qu.: 0.6610
## Max. : 1.4101 Max. : 1.0725 Max. : 1.4766 Max. : 1.3194
## LT LU LV ME
## Min. :-1.8356 Min. :-1.57009 Min. :-1.892228 Min. :-2.3909
## 1st Qu.:-0.6245 1st Qu.:-0.78025 1st Qu.:-0.320861 1st Qu.:-0.2090
## Median : 0.1963 Median :-0.02382 Median : 0.003821 Median : 0.0000
## Mean : 0.0000 Mean : 0.00000 Mean : 0.000000 Mean : 0.0000
## 3rd Qu.: 0.8546 3rd Qu.: 0.70170 3rd Qu.: 0.759217 3rd Qu.: 0.6837
## Max. : 1.1868 Max. : 1.55196 Max. : 1.354983 Max. : 1.3492
## MK MT NL NO
## Min. :-1.8126 Min. :-1.72176 Min. :-2.0997 Min. :-2.1234
## 1st Qu.:-0.5509 1st Qu.:-0.56227 1st Qu.:-0.1327 1st Qu.:-0.0295
## Median : 0.0000 Median : 0.02167 Median : 0.2539 Median : 0.4447
## Mean : 0.0000 Mean : 0.00000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 0.3027 3rd Qu.: 0.70755 3rd Qu.: 0.7263 3rd Qu.: 0.5340
## Max. : 1.6715 Max. : 1.66131 Max. : 0.9918 Max. : 0.7397
## PL PT RO RS
## Min. :-1.58195 Min. :-1.7019 Min. :-1.60013 Min. :-1.9166
## 1st Qu.:-0.73207 1st Qu.:-0.6925 1st Qu.:-0.89280 1st Qu.:-0.4255
## Median : 0.01174 Median : 0.1296 Median : 0.09181 Median : 0.0000
## Mean : 0.00000 Mean : 0.0000 Mean : 0.00000 Mean : 0.0000
## 3rd Qu.: 0.73402 3rd Qu.: 0.6460 3rd Qu.: 0.72422 3rd Qu.: 0.3138
## Max. : 1.67472 Max. : 1.4293 Max. : 1.46539 Max. : 1.7100
## SE SI SK TR
## Min. :-2.0254 Min. :-2.0142 Min. :-1.8948 Min. :-2.345
## 1st Qu.:-0.1134 1st Qu.:-0.1235 1st Qu.:-0.4156 1st Qu.: 0.000
## Median : 0.2650 Median : 0.2198 Median : 0.1130 Median : 0.000
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000 Mean : 0.000
## 3rd Qu.: 0.6936 3rd Qu.: 0.5193 3rd Qu.: 0.6873 3rd Qu.: 0.000
## Max. : 0.9526 Max. : 1.1716 Max. : 1.1346 Max. : 2.345
## UK
## Min. :-1.8572
## 1st Qu.:-0.1838
## Median : 0.0000
## Mean : 0.0000
## 3rd Qu.: 0.3219
## Max. : 1.5892
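Mean imputation has a visible side effect in the summary above: for Turkey (TR), where 10 of the 12 years are imputed, the quartiles and the median of the standardized series are all exactly zero. The sketch below reproduces this on a short hypothetical series `x` (the values are made up for illustration). It shows that imputed entries become zeros after scaling and that the filled series has a smaller variance than the observed values alone.

```r
x <- c(NA, NA, 10, 14, NA, 12)                     # hypothetical series with three gaps

x_imp <- x
x_imp[is.na(x_imp)] <- mean(x_imp, na.rm = TRUE)   # same rule as in the analysis above
z <- as.numeric(scale(x_imp))                      # center and scale to unit variance

z                                                  # imputed positions are exactly 0
c(observed = var(x, na.rm = TRUE), imputed = var(x_imp))
```

Because an imputed value carries no information about co-movement with other countries, heavily imputed series will tend to look weakly related to the common factor in the PCA.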
We then conducted principal component analysis (PCA) on the standardized data. The results show that the first principal component (PC1) explains approximately 83.0% of the total variance, indicating that the vast majority of the information in the data can be summarized by a dominant common pattern of variation. The second principal component (PC2) explains an additional 7.3% of the variance, and together the first two principal components account for about 90.3% of the total variance. This suggests that most of the information in the original data can be retained using only two principal components.
The remaining principal components contribute very little additional variance, implying that their marginal role in explaining the data structure is limited. The country series are strongly correlated with one another, which is why PCA achieves such an effective reduction in dimensionality. Overall, retaining the first two principal components allows for efficient dimensionality reduction without significant loss of information.
pca_air <- prcomp(air_scaled, center = FALSE, scale. = FALSE)
summary(pca_air)
## Importance of components:
## PC1 PC2 PC3 PC4 PC5 PC6 PC7
## Standard deviation 5.542 1.64442 1.23790 1.12140 0.69030 0.4036 0.2854
## Proportion of Variance 0.830 0.07308 0.04142 0.03399 0.01288 0.0044 0.0022
## Cumulative Proportion 0.830 0.90310 0.94451 0.97850 0.99138 0.9958 0.9980
## PC8 PC9 PC10 PC11 PC12
## Standard deviation 0.18660 0.1614 0.09607 0.06734 7.488e-16
## Proportion of Variance 0.00094 0.0007 0.00025 0.00012 0.000e+00
## Cumulative Proportion 0.99892 0.9996 0.99988 1.00000 1.000e+00
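The proportions in this table follow directly from the standard deviations: each component's share is its squared standard deviation divided by the sum of all squared standard deviations, which for standardized data equals the number of variables (37 here). The last component has a standard deviation of essentially zero because 12 centered observations span at most 11 dimensions. A short sketch of the calculation follows; the matrix `X` is simulated and serves only as an illustration.

```r
# Check against the reported values: PC1 has sd 5.542 and there are 37 variables
5.542^2 / 37

# Same calculation on simulated data: 12 "years", 5 standardized variables
set.seed(1)
X <- scale(matrix(rnorm(12 * 5), nrow = 12))
p <- prcomp(X, center = FALSE, scale. = FALSE)

eig  <- p$sdev^2            # eigenvalues of the correlation matrix
prop <- eig / sum(eig)      # proportion of variance per component
round(cbind(eigenvalue = eig, proportion = prop, cumulative = cumsum(prop)), 4)
```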
The first principal component (Dim1) explains the vast majority of the common variation among variables (83%), while the second principal component (Dim2) contributes only a limited amount of additional information (7.3%). Most variables are strongly positively correlated with Dim1, indicating that the data are mainly driven by a dominant overall level or scale factor.
At the same time, all countries point in the same (rightward) direction along the first principal component, suggesting that their characteristics are highly similar on Dim1 and that this component has very strong explanatory power. Along the second principal component, the distribution of countries is relatively compact, largely within the range of −0.5 to 0.5. Although some countries differ in their directions on Dim2, the overall pattern indicates that the PCA performs very well: the first two principal components are sufficient to explain most of the features while preserving the essential structure of the data.
fviz_eig(pca_air, choice='eigenvalue') # eigenvalues on y-axis
## Warning in geom_bar(stat = "identity", fill = barfill, color = barcolor, :
## Ignoring empty aesthetic: `width`.
fviz_eig(pca_air) # percentage of explained variance on y-axis
## Warning in geom_bar(stat = "identity", fill = barfill, color = barcolor, :
## Ignoring empty aesthetic: `width`.
fviz_pca_var(pca_air, col.var="steelblue")
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## ℹ The deprecated feature was likely used in the ggpubr package.
## Please report the issue at <https://github.com/kassambara/ggpubr/issues>.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the factoextra package.
## Please report the issue at <https://github.com/kassambara/factoextra/issues>.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
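In a variable plot of this kind, the arrow coordinates are expected to be the correlations between each variable and the components. For a `prcomp` fit on standardized data, these correlations equal the loadings multiplied by the component standard deviations. The sketch below checks that identity on simulated data; `X` is random and purely illustrative, and `var_coord` is a name introduced here.

```r
set.seed(42)
X <- scale(matrix(rnorm(12 * 4), nrow = 12))   # 12 observations, 4 standardized variables
p <- prcomp(X, center = FALSE, scale. = FALSE)

# Multiply each loading column by the standard deviation of its component
var_coord <- sweep(p$rotation, 2, p$sdev, `*`)

# Compare with the correlations between variables and component scores
all.equal(var_coord, cor(X, p$x), check.attributes = FALSE)
```

This is why an arrow that almost reaches the unit circle along Dim1 indicates a variable that is almost perfectly correlated with the first component.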
We extracted and ranked the variable loadings of the first principal component (PC1) to identify which original variables contribute most strongly to it. The loadings of most countries are very similar: 34 of the 37 lie between about 0.15 and 0.18. This indicates that PC1 is not driven by a small subset of countries versus others, but rather reflects a pattern in which all countries tend to move together. In other words, although differences exist across national aviation industries, their dynamics over time exhibit a high degree of common fluctuation. This is consistent with an integrated European (and broader regional) aviation market in which EU member states and neighboring countries share a common underlying trend. Moreover, all loadings on PC1 are positive, implying that increases in PC1 are associated with simultaneous increases across the country-level variables.
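Notably, Bosnia and Herzegovina, the United Kingdom, and Turkey display much smaller loadings on PC1 (about 0.10, 0.07, and 0.01). Bosnia and Herzegovina and Turkey are not EU member states, and the United Kingdom left the EU in 2020. This suggests that their aviation development patterns are less tightly coupled with the EU single market and therefore contribute less to the dominant EU-wide factor captured by PC1. This interpretation should be treated with caution, however: these are also the three series with the most missing values (9, 5, and 10 of the 12 years, respectively), and replacing those values with the column mean flattens a series and mechanically weakens its correlation with PC1. With that caveat, the finding is consistent with the view that the EU single market fosters greater similarity in aviation industry dynamics among member states, whereas some non-EU countries exhibit greater heterogeneity and uncertainty.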
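# Loadings of all variables on the first principal component
loading_scores_PC_1 <- pca_air$rotation[, 1]
# Rank variables by the absolute size of their loading
fac_scores_PC_1 <- abs(loading_scores_PC_1)
fac_scores_PC_1_ranked <- names(sort(fac_scores_PC_1, decreasing = TRUE))
# Show the signed loadings in ranked order
pca_air$rotation[fac_scores_PC_1_ranked, 1]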
## IT ES EU27_2020 IE LV SK CY
## 0.17911015 0.17899747 0.17896279 0.17863464 0.17829860 0.17817018 0.17726000
## AT HU EE LT FR CZ NL
## 0.17704003 0.17682900 0.17649837 0.17632070 0.17627003 0.17591024 0.17543933
## BE BG CH MT ME HR DK
## 0.17470083 0.17416380 0.17398518 0.17286873 0.17203878 0.17190225 0.17151484
## PT PL EL IS DE LU RO
## 0.17011542 0.16772058 0.16571838 0.16458533 0.16375791 0.16300440 0.16221009
## FI SI NO MK RS SE BA
## 0.16034850 0.15908578 0.15700788 0.15212015 0.15112119 0.14913563 0.09649906
## UK TR
## 0.07120124 0.01316652
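A useful benchmark for these numbers: each loading vector has unit length, so if all 37 variables loaded equally on PC1, every loading would be 1/sqrt(37) and each variable would contribute 100/37 percent. The sketch below compares a few loadings copied from the output above with that benchmark; the vector `l1` holds only the three largest and three smallest values.

```r
p <- 37
equal_loading <- 1 / sqrt(p)    # loading if all variables contributed equally

# Three largest and three smallest PC1 loadings, copied from the ranked output
l1 <- c(IT = 0.17911015, ES = 0.17899747, EU27_2020 = 0.17896279,
        BA = 0.09649906, UK = 0.07120124, TR = 0.01316652)

round(l1 / equal_loading, 2)    # ratio to the equal-loading benchmark

# Contribution in percent is the squared loading times 100
contrib_pct <- 100 * l1^2
round(cbind(contribution = contrib_pct, uniform = 100 / p), 2)
```

Values close to the benchmark mean a variable takes part in the common factor about as much as any other; values far below it mark the variables that stand apart.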
Based on the contribution plots and the loadings above, the European aviation market appears to be dominated by the European Union, which constitutes the largest market in the region. Among the non-EU countries, Bosnia and Herzegovina, the United Kingdom, and Turkey contribute very little to the first principal component, whereas others such as Switzerland, Norway, and Iceland have PC1 loadings in the same range as the EU member states.
var<-get_pca_var(pca_air)
a<-fviz_contrib(pca_air, "var", axes=1, xtickslab.rt=90) # default angle=45°
b<-fviz_contrib(pca_air, "var", axes=2, xtickslab.rt=90)
grid.arrange(a,b,top='Contribution to the first two Principal Components')
Each year (2013–2024) is treated as an individual, and its position in the principal component space, as well as its contribution to each principal component, is calculated in order to analyze the temporal structure and evolution of the data. The individual coordinates reveal a very clear time-related structure. The first principal component (Dim.1) explains the vast majority of the variance, and its scores change in distinct stages over time. From 2013 to 2019, the Dim.1 scores rise steadily from about −2.3 to 5.4, reflecting overall growth in air passenger traffic during this period. In 2020 and 2021, the scores drop sharply to −11.3 and −9.1, far outside the previous trajectory, indicating a strong structural shock to the aviation industry. This abrupt break is consistent with the global impact of the COVID-19 pandemic on air transport activity. In 2022 the score returns to slightly above zero (0.7), and it continues to rise in 2023 (4.4) and 2024 (6.3). Only the 2024 value exceeds the 2019 pre-pandemic peak, which suggests that the recovery was completed by 2024 rather than immediately after the shock. Therefore, the first principal component can reasonably be interpreted as a composite indicator of the overall scale or activity level of the aviation industry over time.
In contrast, the second principal component (Dim.2) explains a much smaller proportion of the total variance and mainly reflects secondary structural differences beyond the overall trend captured by Dim.1. The Dim.2 scores are negative in every year from 2013 to 2019, rising from about −2.1 to almost zero (−0.04 in 2019), and positive in every year from 2020 onwards, reaching about 2.2 in 2024. The sign change therefore coincides with the onset of the pandemic, so Dim.2 distinguishes, to some extent, the pre-pandemic years from the pandemic and post-pandemic years. However, given its limited explanatory power, this component should not be overinterpreted. The third and higher-order principal components take noticeable values only for a small number of years and account for very little variance overall, suggesting that they mainly capture localized fluctuations or random noise and contribute little to explaining the overall temporal structure.
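The three-phase pattern on Dim.1 can be summarized numerically by averaging the scores within each period. The following sketch uses the Dim.1 coordinates as reported by `ind$coord`; the period boundaries follow the study's own segmentation, and the names `dim1`, `years`, and `period` are introduced here for illustration.

```r
# Dim.1 coordinates for 2013-2024, copied from the ind$coord output
dim1 <- c(-2.2964177, -1.4952409, -0.6189945, 0.6220097, 2.7295112, 4.5633485,
          5.4476581, -11.3140346, -9.1175468, 0.6937917, 4.4446719, 6.3412433)
years <- 2013:2024

# Assign each year to one of the three periods used in this study
period <- cut(years, breaks = c(2012, 2019, 2021, 2024),
              labels = c("pre-pandemic", "shock", "recovery"))

tapply(dim1, period, mean)            # average Dim.1 score per period
years[dim1 > dim1[years == 2019]]     # years whose score exceeds the 2019 peak
```

Period means are a coarse summary; they hide the steady growth within the pre-pandemic years, so they complement rather than replace the year-by-year coordinates.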
# Extract coordinates, cos2, and contributions for the individuals (years)
ind <- get_pca_ind(pca_air)
print(ind)
## Principal Component Analysis Results for individuals
## ===================================================
## Name Description
## 1 "$coord" "Coordinates for the individuals"
## 2 "$cos2" "Cos2 for the individuals"
## 3 "$contrib" "contributions of the individuals"
ind$coord
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5 Dim.6
## X2013 -2.2964177 -2.05493666 1.558274933 0.8587761 -0.24049236 -0.269035960
## X2014 -1.4952409 -1.89289428 1.178860944 0.6531005 -0.18740534 -0.079076811
## X2015 -0.6189945 -1.82087495 0.423267073 0.2602832 0.06130885 0.118584147
## X2016 0.6220097 -1.35264023 -0.547747668 -0.3574027 0.27697392 0.552906528
## X2017 2.7295112 -0.92222767 -1.295340671 -0.6405873 0.05913927 0.393696428
## X2018 4.5633485 -0.50252227 -1.598038327 -0.7070166 -0.09687421 0.021021491
## X2019 5.4476581 -0.03605946 -1.260883436 -0.4175117 -0.28192314 -0.913730091
## X2020 -11.3140346 0.93106732 -0.312627165 -0.8221830 1.19857114 -0.425177325
## X2021 -9.1175468 1.79602284 -1.063763079 0.7255411 -1.47569333 0.226928234
## X2022 0.6937917 1.63309164 0.679672586 -0.2710359 0.81981402 0.411574800
## X2023 4.4446719 2.01646261 2.246447245 -1.8607971 -0.65006043 -0.006651514
## X2024 6.3412433 2.20551111 -0.008122433 2.5788334 0.51664161 -0.031039928
## Dim.7 Dim.8 Dim.9 Dim.10 Dim.11
## X2013 0.22421271 0.193095386 0.027456507 -0.025192179 -0.131735522
## X2014 0.06231824 0.119930745 -0.009108138 0.046518286 0.173370426
## X2015 -0.42368083 -0.419868985 0.178752926 -0.004497616 -0.019484567
## X2016 -0.09343430 0.006363235 -0.368876617 -0.125729954 -0.007703347
## X2017 0.21484363 0.015643722 -0.001611594 0.241299969 -0.034543730
## X2018 0.38811806 0.025142745 0.259684390 -0.155318040 0.026479676
## X2019 -0.40017391 0.090371194 -0.101728111 0.020041055 -0.004865412
## X2020 0.20163692 -0.102246045 -0.037352900 0.006416580 0.007038375
## X2021 -0.09603629 0.024828679 0.008984570 -0.006414667 -0.002839429
## X2022 -0.42919625 0.314654730 0.173210984 -0.005106588 -0.007023877
## X2023 0.15732365 -0.136733863 -0.064769040 0.002573454 0.000145308
## X2024 0.19406838 -0.131181543 -0.064642977 0.005409701 0.001162100
## Dim.12
## X2013 4.440892e-16
## X2014 4.996004e-16
## X2015 4.996004e-16
## X2016 6.956241e-16
## X2017 9.853229e-16
## X2018 1.471046e-15
## X2019 3.885781e-16
## X2020 -1.665335e-16
## X2021 0.000000e+00
## X2022 5.282233e-16
## X2023 6.383782e-16
## X2024 1.026956e-15
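The stage-wise pattern in Dim.1 can be summarized numerically. The following is a minimal sketch that re-enters the Dim.1 coordinates printed above by hand and averages them over the three periods used in this study. The object names `dim1`, `years`, `period`, and `period_means` are introduced here for illustration only and are not part of the original analysis.

```r
# Dim.1 coordinates copied from the ind$coord output above
dim1 <- c(X2013 = -2.2964177, X2014 = -1.4952409, X2015 = -0.6189945,
          X2016 =  0.6220097, X2017 =  2.7295112, X2018 =  4.5633485,
          X2019 =  5.4476581, X2020 = -11.3140346, X2021 = -9.1175468,
          X2022 =  0.6937917, X2023 =  4.4446719, X2024 =  6.3412433)

# Recover the numeric year from the row names ("X2013" -> 2013)
years <- as.integer(sub("^X", "", names(dim1)))

# Assign each year to one of the three periods (intervals are right-closed)
period <- cut(years,
              breaks = c(2012, 2019, 2021, 2024),
              labels = c("2013-2019", "2020-2021", "2022-2024"))

# Mean Dim.1 score per period
period_means <- tapply(dim1, period, mean)
round(period_means, 2)
```

Averaging within periods hides the steady rise inside 2013–2019, so this summary complements the year-by-year coordinates rather than replacing them.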
The individual contributions show that the first principal component is defined mainly by a small number of years: 2020 and 2021 contribute 34.7% and 22.6% to Dim.1, far more than any other year. The orientation of the component is therefore largely driven by these extreme years, which is consistent with a strong structural shock to the aviation industry during this period. The intermediate years (for example 2015, 2016, and 2022, each at about 0.1%) contribute little to the construction of the component and mainly reflect stable phases of the overall trend.
In the PCA plane, the observations move gradually from the lower left to the right between 2013 and 2019, indicating steady growth in the European aviation industry up to its pre-pandemic peak in 2019. During 2020–2021, the points shift abruptly to the far left, a strong negative deviation that reflects the severe impact of the COVID-19 pandemic, when changes in entry policies, widespread flight suspensions, and temporary airspace closures led to a rapid contraction of the aviation sector. From 2022 to 2024, the points move rapidly back to the right, and by 2024 the Dim.1 score exceeds its 2019 level. This rebound may be attributed to pent-up travel demand following two to three years of restrictions, as well as policy measures implemented by the EU to revitalize the aviation industry.
Overall, the first principal component primarily captures the long-term temporal evolution of the aviation industry and clearly identifies the structural shock of 2020–2021. The second principal component appears to describe a secondary pattern: its values are negative before the pandemic, possibly reflecting a smoother development phase, and positive from 2020 onward, potentially corresponding to a more volatile phase of sharp contraction followed by rapid expansion. This interpretation requires further validation and should be regarded only as a supplementary explanation.
Finally, the cos²-based coloring confirms that the years associated with the pandemic shock and subsequent recovery are well represented by the first two principal components, indicating that these key periods are robustly and reliably captured by the PCA results.
ind$contrib
## Dim.1 Dim.2 Dim.3 Dim.4 Dim.5 Dim.6
## X2013 1.4309800 13.013357712 1.320485e+01 4.8871897 1.01145769 3.703697727
## X2014 0.6066723 11.041935577 7.557366e+00 2.8265673 0.61419898 0.319972873
## X2015 0.1039694 10.217690211 9.742586e-01 0.4489434 0.06573421 0.719561863
## X2016 0.1049847 5.638415679 1.631572e+00 0.8464768 1.34159958 15.642937334
## X2017 2.0216295 2.621010669 9.124589e+00 2.7192932 0.06116417 7.931180217
## X2018 5.6506586 0.778221312 1.388736e+01 3.3125201 0.16411999 0.022612152
## X2019 8.0528848 0.004007107 8.645601e+00 1.1551451 1.38997372 42.721916330
## X2020 34.7349604 2.671496795 5.314944e-01 4.4795699 25.12307424 9.250283877
## X2021 22.5573357 9.940682466 6.153683e+00 3.4883768 38.08354604 2.635068673
## X2022 0.1306141 8.218898167 2.512146e+00 0.4868026 11.75373346 8.667870666
## X2023 5.3605731 12.530628035 2.744339e+01 22.9454915 7.39013911 0.002263893
## X2024 10.9114041 14.990322938 3.587712e-04 44.0702900 4.66792546 0.049301061
## Dim.7 Dim.8 Dim.9 Dim.10 Dim.11 Dim.12
## X2013 5.1440860 8.923317277 0.241201448 0.573036269 3.189460e+01 2.9308675
## X2014 0.3973911 3.442259464 0.026542909 1.953880822 5.524099e+01 3.7093792
## X2015 18.3681497 42.190059047 10.223407602 0.018264818 6.977389e-01 3.7093792
## X2016 0.8933065 0.009690329 43.536309026 14.273424115 1.090612e-01 7.1912602
## X2017 4.7231611 0.058568311 0.000830999 52.573293993 2.193057e+00 14.4282258
## X2018 15.4140046 0.151289203 21.576502228 21.781834792 1.288655e+00 32.1594020
## X2019 16.3864667 1.954531097 3.311092526 0.362653434 4.350615e-02 2.2439455
## X2020 4.1603309 2.501932426 0.446414216 0.037175586 9.104503e-02 0.4121532
## X2021 0.9437534 0.147533207 0.025827593 0.037153429 1.481745e-02 0.0000000
## X2022 18.8494906 23.694692151 9.599314370 0.023545741 9.067035e-02 4.1465877
## X2023 2.5326537 4.474399037 1.342221768 0.005979757 3.880523e-05 6.0563630
## X2024 3.8538724 4.118395119 1.337001981 0.026423912 2.481979e-03 15.6732721
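The contribution values printed above appear consistent with the formula contrib = 100 × (1/n) × coord² / variance, where the component variance uses an n − 1 denominator. This formula is an assumption inferred from the printed numbers, not something stated in the analysis. The sketch below applies it to the Dim.1 coordinates copied from the `ind$coord` output; `dim1`, `var1`, and `contrib1` are illustrative names.

```r
# Dim.1 coordinates copied from the ind$coord output
dim1 <- c(X2013 = -2.2964177, X2014 = -1.4952409, X2015 = -0.6189945,
          X2016 =  0.6220097, X2017 =  2.7295112, X2018 =  4.5633485,
          X2019 =  5.4476581, X2020 = -11.3140346, X2021 = -9.1175468,
          X2022 =  0.6937917, X2023 =  4.4446719, X2024 =  6.3412433)

n <- length(dim1)

# Variance of the component, assuming centered scores and an n - 1 denominator
var1 <- sum(dim1^2) / (n - 1)

# Assumed contribution formula: share of each year in the component, in percent
contrib1 <- 100 * (1 / n) * dim1^2 / var1

round(contrib1, 2)

# Under this formula the column total is 100 * (n - 1) / n, not 100
sum(contrib1)
```

If the assumption holds, each column of `ind$contrib` sums to about 91.67 with twelve years, so the contributions are best compared with one another rather than read as exact shares of 100%.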
# Plot the years on Dim.1 and Dim.2, colored by quality of representation (cos2)
fviz_pca_ind(pca_air, col.ind = "cos2", geom = "point", gradient.cols = c("white", "#2E9FDF", "#FC4E07"))
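To make the cos² coloring concrete, the sketch below computes the quality of representation of 2020 on the Dim.1–Dim.2 plane from the coordinates printed in the `ind$coord` output. It assumes that all twelve components are retained, so that the sum of squared coordinates equals the squared distance of the year from the center; `coord_2020`, `cos2_2020`, and `quality_12` are illustrative names.

```r
# All twelve coordinates of 2020, copied from the ind$coord output
coord_2020 <- c(-11.3140346, 0.93106732, -0.312627165, -0.8221830,
                1.19857114, -0.425177325, 0.20163692, -0.102246045,
                -0.037352900, 0.006416580, 0.007038375, -1.665335e-16)
names(coord_2020) <- paste0("Dim.", 1:12)

# cos2 on each axis: squared coordinate divided by squared distance to the center
cos2_2020 <- coord_2020^2 / sum(coord_2020^2)

# Quality of representation on the first factorial plane (Dim.1 + Dim.2)
quality_12 <- sum(cos2_2020[1:2])
round(quality_12, 3)
```

The same calculation can be repeated for any other year by substituting its row of coordinates.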
We also attempted to cluster the years on their first two principal component scores (k = 3), but the performance was not satisfactory. This is mainly because the clustering results are largely dominated by the first principal component. Although PC1 effectively captures the overall trend, it has limited ability to distinguish structural differences between distinct phases, which restricts the interpretability of the clustering results in a real-world context.
Given the relatively small number of years in the dataset, we therefore prefer a classification based on observation and domain knowledge: 2013–2019 as years of smooth growth, 2020–2021 as years of pandemic-induced shock, and 2022–2024 as years of rapid rebound and expansion.
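# Cluster the years on their first two PC scores (eclust() uses k-means by default)
km <- eclust(
  pca_air$x[, 1:2],  # PCA scores on Dim.1 and Dim.2
  k = 3
)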
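One way to examine how a three-cluster solution relates to the experience-based periods is to cross-tabulate the two. The sketch below is an illustration only: it uses base `kmeans()` rather than `eclust()`, re-enters the Dim.1 and Dim.2 scores from the `ind$coord` output by hand, and introduces the names `scores`, `km_demo`, `period`, and `agreement`, none of which appear in the original analysis.

```r
# Dim.1 and Dim.2 scores copied from the ind$coord output
scores <- cbind(
  Dim.1 = c(-2.2964177, -1.4952409, -0.6189945, 0.6220097, 2.7295112,
            4.5633485, 5.4476581, -11.3140346, -9.1175468, 0.6937917,
            4.4446719, 6.3412433),
  Dim.2 = c(-2.05493666, -1.89289428, -1.82087495, -1.35264023,
            -0.92222767, -0.50252227, -0.03605946, 0.93106732,
            1.79602284, 1.63309164, 2.01646261, 2.20551111)
)
rownames(scores) <- paste0("X", 2013:2024)

# k-means depends on random starting centers, so fix the seed and use
# several restarts to stabilize the solution
set.seed(123)
km_demo <- kmeans(scores, centers = 3, nstart = 25)

# Experience-based classification of the years
period <- cut(2013:2024,
              breaks = c(2012, 2019, 2021, 2024),
              labels = c("growth", "shock", "rebound"))

# Rows: manual periods; columns: k-means clusters.
# A perfect match would leave exactly one non-zero cell in every row.
agreement <- table(period, cluster = km_demo$cluster)
agreement
```

Rows of the table that spread over several clusters indicate periods that k-means splits or merges, which is one way to see the dominance of Dim.1 in the clustering.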
We also compared an alternative dimensionality reduction method, multidimensional scaling (MDS), which projects the high-dimensional data onto a two-dimensional plane based on the Euclidean distances between years.
dist_year <- dist(air_time, method = "euclidean")
The results of the classical multidimensional scaling (MDS) analysis show that the distribution of years in the two-dimensional space exhibits a clear temporal structure. The years 2013–2019 are highly clustered in the MDS space, indicating a strong similarity in the aviation industry during the pre-pandemic period. In contrast, 2020–2021 are clearly separated from the other years, reflecting the significant structural shock caused by the COVID-19 pandemic. The years 2022–2024 gradually move closer to the pre-pandemic period, indicating a recovery trend in the aviation industry.
These results are highly consistent with the findings from the PCA individual score analysis: although PCA is based on variance maximization and MDS relies directly on Euclidean distances between years, both methods identify the same temporal segmentation. Part of this agreement is expected by construction, because classical MDS applied to Euclidean distances recovers the same configuration as PCA on the same centered data, up to the sign of each axis. The comparison is therefore better read as a consistency check than as fully independent evidence. With that caveat, the consistency suggests that the observed segmentation reflects intrinsic characteristics of the data rather than an artifact of one particular projection. Overall, the high degree of integration within the EU aviation market leads to strong synchronization among national aviation industries under normal conditions, resulting in relatively small differences between years, with pronounced divergence emerging only under major external shocks such as the pandemic.
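# Classical MDS in two dimensions; eig = TRUE also returns the eigenvalues
mds_res <- cmdscale(dist_year, k = 2, eig = TRUE)
Z_mds <- mds_res$points
# Scatter plot of the years in the MDS plane
plot(
  Z_mds[, 1], Z_mds[, 2],
  pch = 19,
  xlab = "MDS Dimension 1",
  ylab = "MDS Dimension 2",
  main = "MDS of Years"
)
# Label each point with its year
text(
  Z_mds[, 1], Z_mds[, 2],
  labels = rownames(Z_mds),
  cex = 0.7,
  pos = 3
)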
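The relationship between classical MDS and PCA can be checked on synthetic data. The sketch below uses a random 12 × 6 matrix (not the aviation data) and compares the first two PCA scores of the centered, unscaled data with the two-dimensional classical MDS coordinates of its Euclidean distances. The names `X`, `pca_demo`, `mds_demo`, and `max_diff` are illustrative.

```r
set.seed(42)

# Synthetic stand-in for a years-by-countries matrix (12 rows, 6 columns)
X <- matrix(rnorm(12 * 6), nrow = 12,
            dimnames = list(paste0("X", 2013:2024), NULL))

# PCA on centered, unscaled data
pca_demo <- prcomp(X, center = TRUE, scale. = FALSE)

# Classical MDS on the Euclidean distances between rows
mds_demo <- cmdscale(dist(X, method = "euclidean"), k = 2)

# Axes are only defined up to sign, so compare absolute values
max_diff <- max(abs(abs(pca_demo$x[, 1:2]) - abs(mds_demo)))
max_diff
```

The two configurations coincide up to numerical error here. In the actual analysis they can differ when the PCA and the distance matrix are computed from differently preprocessed data, for example if one uses standardized variables and the other does not.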
By computing the correlation between each country's aviation time series and the first MDS dimension, we find that MDS Dimension 1 closely tracks the overall evolution of the EU aviation industry: its correlation with the EU27_2020 aggregate is above 0.9999, and most EU member states correlate above 0.9. This indicates that the main structure identified by MDS is not driven by a small number of individual countries, but rather reflects a highly synchronized overall trend of the EU aviation market. In contrast, several non-EU countries show markedly lower correlations with this dimension, notably TR (0.07), UK (0.32), and BA (0.53), suggesting that their aviation development paths differ from the overall EU structure, whereas others such as CH (0.99) and ME (0.95) follow the EU trend closely.
dim1 <- Z_mds[,1]
dim2 <- Z_mds[,2]
cor_dim1 <- apply(air_time, 2, cor, y = dim1)
cor_dim2 <- apply(air_time, 2, cor, y = dim2)
sort(cor_dim1, decreasing = TRUE)[1:36]
## EU27_2020 FR BE NL IT CH AT IE
## 0.9999967 0.9953865 0.9904314 0.9898371 0.9897729 0.9890947 0.9876944 0.9875429
## SK DK ES CZ LV CY ME HU
## 0.9824974 0.9817974 0.9808813 0.9795346 0.9790786 0.9595260 0.9504110 0.9500485
## DE EE BG LT FI NO SI MT
## 0.9500243 0.9486498 0.9482696 0.9458141 0.9260874 0.9235008 0.9209768 0.9179689
## HR PT SE IS PL EL LU RO
## 0.9121098 0.8978454 0.8846912 0.8809248 0.8774151 0.8752077 0.8447133 0.8389763
## RS MK BA UK
## 0.8121083 0.8050209 0.5326404 0.3237185
sort(cor_dim1)[1:36]
## TR UK BA MK RS RO LU
## 0.07035906 0.32371853 0.53264044 0.80502088 0.81210825 0.83897626 0.84471326
## EL PL IS SE PT HR MT
## 0.87520774 0.87741512 0.88092481 0.88469120 0.89784536 0.91210979 0.91796892
## SI NO FI LT BG EE DE
## 0.92097684 0.92350080 0.92608740 0.94581409 0.94826965 0.94864980 0.95002431
## HU ME CY LV CZ ES DK
## 0.95004849 0.95041095 0.95952601 0.97907856 0.97953458 0.98088129 0.98179737
## SK IE AT CH IT NL BE
## 0.98249739 0.98754287 0.98769439 0.98909470 0.98977293 0.98983707 0.99043137
## FR
## 0.99538645
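The EU versus non-EU comparison can be made explicit by grouping the printed correlations. The sketch below re-enters the 36 country-level values from the output above (the EU27_2020 aggregate is left out) and compares group medians. The membership vector `non_eu` is an assumption based on the EU27_2020 composition, under which UK counts as non-EU. All object names are illustrative.

```r
# Correlations with MDS Dimension 1, copied from the output above
cor_dim1_demo <- c(
  FR = 0.9953865, BE = 0.9904314, NL = 0.9898371, IT = 0.9897729,
  CH = 0.9890947, AT = 0.9876944, IE = 0.9875429, SK = 0.9824974,
  DK = 0.9817974, ES = 0.9808813, CZ = 0.9795346, LV = 0.9790786,
  CY = 0.9595260, ME = 0.9504110, HU = 0.9500485, DE = 0.9500243,
  EE = 0.9486498, BG = 0.9482696, LT = 0.9458141, FI = 0.9260874,
  NO = 0.9235008, SI = 0.9209768, MT = 0.9179689, HR = 0.9121098,
  PT = 0.8978454, SE = 0.8846912, IS = 0.8809248, PL = 0.8774151,
  EL = 0.8752077, LU = 0.8447133, RO = 0.8389763, RS = 0.8121083,
  MK = 0.8050209, BA = 0.5326404, UK = 0.3237185, TR = 0.07035906
)

# Assumed non-EU codes under the EU27_2020 composition
non_eu <- c("CH", "ME", "NO", "IS", "RS", "MK", "BA", "UK", "TR")
group <- ifelse(names(cor_dim1_demo) %in% non_eu, "non-EU", "EU")

# Median and range of the correlations within each group
group_median <- tapply(cor_dim1_demo, group, median)
group_range <- tapply(cor_dim1_demo, group, range)
group_median
group_range
```

The median is used rather than the mean because a few very low values (TR, UK, BA) would otherwise dominate the non-EU summary.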
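The second MDS dimension does not represent the overall EU aviation trend: its correlation with the EU27_2020 aggregate is essentially zero. Instead, it distinguishes secondary differences among countries in terms of aviation structure or recovery paths, and its explanatory power is clearly weaker than that of the first dimension, with country-level correlations ranging only from about -0.44 (SE) to 0.54 (RO).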
sort(cor_dim2, decreasing = TRUE)[1:36]
## RO LU PL EL UK
## 0.5386647460 0.5303463664 0.4673689632 0.4579470824 0.4432114232
## PT HR MT MK RS
## 0.4350933184 0.3987166238 0.3743986202 0.3718903761 0.2998165886
## LT EE HU IS CY
## 0.2881653151 0.2785855707 0.2738939123 0.2729895829 0.2511898145
## ES BG IT IE LV
## 0.1810506391 0.1414987517 0.1063854368 0.1057219262 0.0792326081
## BA ME SK TR AT
## 0.0771344556 0.0670487723 0.0634026271 0.0500965430 0.0010886517
## EU27_2020 CZ FR NL BE
## -0.0004526982 -0.0335727519 -0.0786496378 -0.0919080765 -0.1109951960
## CH DK DE FI SI
## -0.1412580205 -0.1755016350 -0.3019937952 -0.3124779500 -0.3187286902
## NO
## -0.3635349164
sort(cor_dim2)[1:36]
## SE NO SI FI DE
## -0.4354981587 -0.3635349164 -0.3187286902 -0.3124779500 -0.3019937952
## DK CH BE NL FR
## -0.1755016350 -0.1412580205 -0.1109951960 -0.0919080765 -0.0786496378
## CZ EU27_2020 AT TR SK
## -0.0335727519 -0.0004526982 0.0010886517 0.0500965430 0.0634026271
## ME BA LV IE IT
## 0.0670487723 0.0771344556 0.0792326081 0.1057219262 0.1063854368
## BG ES CY IS HU
## 0.1414987517 0.1810506391 0.2511898145 0.2729895829 0.2738939123
## EE LT RS MK MT
## 0.2785855707 0.2881653151 0.2998165886 0.3718903761 0.3743986202
## HR PT UK EL PL
## 0.3987166238 0.4350933184 0.4432114232 0.4579470824 0.4673689632
## LU
## 0.5303463664
The first two dimensions account for 99.51% of the sum of the positive eigenvalues, indicating that the two-dimensional MDS representation almost completely preserves the original distance structure.
eig <- mds_res$eig
eig_pos <- eig[eig > 0]
prop_2d <- sum(eig_pos[1:2]) / sum(eig_pos)
prop_2d
## [1] 0.995103
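When `cmdscale()` is called with `eig = TRUE`, the returned list also contains a goodness-of-fit vector `GOF`, whose second element divides the first `k` eigenvalues by the sum of the non-negative eigenvalues. The sketch below, on a synthetic matrix rather than the aviation data, checks that this matches the manual calculation used above. `X`, `mds_demo`, `prop_manual`, and `prop_gof` are illustrative names.

```r
set.seed(1)

# Synthetic stand-in data: 12 observations, 6 variables
X <- matrix(rnorm(12 * 6), nrow = 12)

mds_demo <- cmdscale(dist(X), k = 2, eig = TRUE)

# Manual proportion, following the same steps as in the text
eig_demo <- mds_demo$eig
eig_pos_demo <- eig_demo[eig_demo > 0]
prop_manual <- sum(eig_pos_demo[1:2]) / sum(eig_pos_demo)

# Built-in goodness of fit (second element uses only non-negative eigenvalues)
prop_gof <- mds_demo$GOF[2]

all.equal(prop_manual, prop_gof)
```

Using `GOF` avoids filtering the eigenvalues by hand, at the cost of making the calculation less visible to the reader.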
The Pearson correlation between the pairwise distances in the original high-dimensional space and those in the two-dimensional MDS space is approximately 0.9999, indicating that the MDS distances are almost an exact linear function of the original ones. The relative proximities between years are therefore preserved nearly perfectly.
D_high <- dist(air_time)
D_mds <- dist(Z_mds)
cor_dist <- cor(as.vector(D_high), as.vector(D_mds))
cor_dist
## [1] 0.9998819
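The Pearson correlation measures linear agreement between the two sets of distances. If the preservation of their rank order is of interest as well, a Spearman correlation can be reported alongside it. The following sketch illustrates both on synthetic data, not the aviation data; all object names are illustrative.

```r
set.seed(7)

# Synthetic stand-in data: 12 observations, 6 variables
X <- matrix(rnorm(12 * 6), nrow = 12)

# Original distances and distances in the 2-D classical MDS configuration
D_high_demo <- dist(X)
D_mds_demo <- dist(cmdscale(D_high_demo, k = 2))

# Linear agreement between the two sets of pairwise distances
pearson <- cor(as.vector(D_high_demo), as.vector(D_mds_demo))

# Agreement in rank order of the pairwise distances
spearman <- cor(as.vector(D_high_demo), as.vector(D_mds_demo),
                method = "spearman")

c(pearson = pearson, spearman = spearman)
```

On random data both values fall well short of 1, which gives a sense of how strong a value close to 0.9999 is.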
The points are distributed almost perfectly along a straight line, with virtually no systematic curvature or dispersion. A near-linear Shepard diagram indicates an excellent fit of the MDS representation.
# Shepard diagram: original distances against distances in the MDS plane
plot(
  as.vector(D_high),
  as.vector(D_mds),
  pch = 19, col = rgb(0, 0, 0, 0.4),
  xlab = "Original distances",
  ylab = "MDS distances",
  main = "Shepard Diagram"
)
# Least-squares reference line
abline(lm(as.vector(D_mds) ~ as.vector(D_high)), col = "red", lwd = 2)
Because the country series are highly correlated and the data exhibit an almost one-dimensional structure dominated by time, a two-dimensional representation loses very little information, which explains the exceptionally good fit.
The analysis shows that principal component analysis (PCA) and classical multidimensional scaling (MDS) provide highly consistent conclusions in characterizing the structural relationships between years. PCA, based on the principle of variance maximization, identifies a time-evolution pattern dominated by the first principal component, clearly capturing the steady growth of the aviation industry before the pandemic, the sharp decline during the pandemic period, and the rapid recovery thereafter. In contrast, MDS directly relies on Euclidean distances between years to preserve similarity relationships in a low-dimensional space, and similarly separates the pandemic years (2020–2021) from other periods, yielding a temporal segmentation that closely aligns with the PCA individual score results.
Although the two methods differ in their theoretical foundations, with PCA focusing on variance structure and MDS emphasizing distance preservation, both reveal the same core pattern. Because classical MDS on Euclidean distances is closely related to PCA, this agreement is best read as a consistency check. With that caveat, it indicates that the observed temporal structure is not an artifact of one particular projection, but reflects intrinsic characteristics of the data itself. Furthermore, the near-perfect fit shown by the MDS Shepard diagram and the distance correlation confirms that the two-dimensional representation preserves the original distance structure almost without loss, which supports the reliability of the structure identified by PCA.
By synthesizing the results from PCA, MDS, and correlation analyses, a clear business-relevant conclusion can be drawn: under normal conditions, the EU aviation market exhibits a high degree of synchronization and integration, operating much more like a single unified market than a collection of independently driven national systems. During the period from 2013 to 2019, aviation indicators across countries are highly similar and differences between years are limited, suggesting that market demand, capacity allocation, and aviation activity levels are primarily shaped by EU-level macroeconomic conditions and the mechanisms of the single market.
However, the results for 2020–2021 clearly demonstrate that when confronted with a strong external shock such as the COVID-19 pandemic, the structure of the aviation industry undergoes a pronounced break, with all countries deviating simultaneously from their previous trajectories. This implies that in a highly integrated market, systemic risks are strongly synchronized and amplified, allowing shocks to propagate rapidly across the entire market. The post-2022 results show a rapid recovery and even an expansion beyond pre-pandemic levels, indicating that the EU aviation market possesses substantial resilience and recovery capacity supported by unified regulations, cross-border networks, and scale effects.
From a commercial and strategic perspective, these findings suggest that analyses and decisions concerning the EU aviation market gain limited additional insight from purely country-level distinctions. Instead, greater value lies in assessing overall market cycles and systemic risks. Airlines, airport operators, and related investors should therefore focus more on macroeconomic cycles, cross-border coordination, and the management of sudden external shocks, rather than overemphasizing short-term fluctuations in individual countries. In contrast, several non-EU countries (notably TR, UK, and BA) exhibit weaker synchronization with the EU-wide trend, reflecting higher uncertainty and structural heterogeneity, which from a business standpoint implies higher risk premia and lower predictability.