This study is based on original data collected personally last year as part of a survey for NLB Banka. The primary objective is to analyse and better understand the preferences of young individuals (specifically those aged 18 to 27) regarding the ongoing transition from cash to digital payment methods. In an increasingly dynamic financial ecosystem, understanding how “digital natives” perceive the security, speed, and convenience of emerging payment instruments is critically important for banking institutions seeking to support and guide this transformation effectively.
The resulting dataset consists of 440 observations and a comprehensive set of variables covering spending behaviour and risk perception, providing a robust foundation for multivariate statistical analysis. To address the complexity of the data structure, the quantitative techniques discussed during the course will be applied within an exploratory analytical framework. Specifically, the empirical analysis will be organised into two main stages:
Recent academic literature confirms that the transition towards a “cashless society” is a global phenomenon, although it is characterised by dynamics specific to younger populations.
According to Demir et al. (2024), in their study on the intention to use integrated payment systems, the most influential factor for young people is not simply perceived ease of use, but rather lifestyle compatibility. This suggests that young consumers are more likely to adopt digital payment instruments that integrate seamlessly into their daily routines, such as mobile applications and “one-click” payment solutions. Usman et al. (2025) also emphasise the crucial role of financial literacy and perceived behavioural control. Young people who feel more competent in managing their financial resources are significantly more likely to develop a clear behavioural intention towards fintech adoption.
Despite strong momentum towards digitalisation, cash continues to retain both psychological and practical relevance. Puusniekka (2020) introduces the concept of the “pain of paying,” arguing that the tangible nature of cash enhances mental accounting and expenditure control. In contrast, debit cards and other electronic instruments tend to increase willingness to spend, as the separation from money becomes less salient and less psychologically “visible.” This research highlights that many young people still perceive cash as a “safe” or even “sacred” method for preventing overspending and over-indebtedness – concerns that also emerge clearly from the data collected for NLB (Puusniekka, 2020).
The adoption of financial technologies is strongly shaped by social environments. Both Demir et al. (2024) and Usman et al. (2025) identify social influence – stemming from friends, family, and peer groups – as a significant predictor of behavioural intention, particularly for peer-to-peer (P2P) payment applications, where network effects are central. Nevertheless, substantial barriers persist, especially those related to privacy and data security. Concerns about the protection of personal information often hinder a full transition to exclusively smartphone-based systems or neobanking solutions.
library(readr)
library(dplyr)

# Import the raw survey export (read_csv2 expects a semicolon-delimited file)
NLB_data <- read_csv2(
  "~/Desktop/NLB/nlb_data.csv",
  locale = locale(encoding = "UTF-8")
)

# Remove columns 2 to 8 and 101 to 116
NLB_data <- NLB_data %>%
  select(-c(2:8, 101:116))

# Remove first row
NLB_data <- NLB_data[-1, ]

head(NLB_data)
## # A tibble: 6 × 93
## status Q1 Q3a Q3b Q3c Q3d Q4a Q4b Q4c Q4d Q5 Q5_4_text
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 6 7 3 2 5 2 4 3 2 1 2 -2
## 2 6 9 2 5 1 1 2 1 1 1 1 -2
## 3 6 9 2 4 4 1 2 2 1 1 2 -2
## 4 6 7 3 5 5 1 2 2 1 1 1 -2
## 5 6 7 2 5 5 1 2 1 1 1 1 -2
## 6 6 9 3 3 1 2 4 3 2 1 2 -2
## # ℹ 81 more variables: Q6a <chr>, Q6b <chr>, Q6c <chr>, Q6d <chr>, Q6e <chr>,
## # Q6f <chr>, Q6g <chr>, Q7 <chr>, Q8a <dbl>, Q9a <chr>, Q9b <chr>, Q9c <chr>,
## # Q9d <chr>, Q9e <chr>, Q9f <chr>, Q9g <chr>, Q9h <chr>, Q31_2a <chr>,
## # Q31_2b <chr>, Q31_2c <chr>, Q31_2d <chr>, Q31_2e <chr>, Q31_2f <chr>,
## # Q10a <chr>, Q10b <chr>, Q10c <chr>, Q10d <chr>, Q10e <chr>, Q11a <chr>,
## # Q11b <chr>, Q11c <chr>, Q11d <chr>, Q11e <chr>, Q12a <chr>, Q12b <chr>,
## # Q12c <chr>, Q12d <chr>, Q12e <chr>, Q13a <chr>, Q13b <chr>, Q13c <chr>, …
Status: status of the questionnaire (6 = valid, 5 = invalid)
Q1: Age
Q3: How many times a month do you use the following payment methods on average? (1 = never, 2 = 1-3 times a month, 3 = once a week, 4 = several times a week, 5 = every day)
Q4: How often do you use cash payment for the following purchases? (1 = never, 2 = less than half of the time, 3 = half of the time, 4 = more than half of the time, 5 = always)
Q5: How do you usually respond if a merchant does not accept digital payments?
Q6: Which of the following income sources do you have, and how do you receive them? (1 = fully cash, 2 = half cash/half digital, 3 = fully digital, 4 = don’t get income from this source, 5 = don’t want to answer)
Q7: Do you save money? (This question does NOT include savings from parents or family members.)
Q8: In what form do you save money? (Digital vs. cash savings) (1 = fully cash…7 = fully digital)
Q9: Various attitudes towards digital and cash payments. (1 = strongly disagree…7 = strongly agree)
Q31_2: Please rate the importance of each of the following factors that influence your choice of payment method. (1 = not important at all…7 = extremely important)
Q10: How safe do you think the following payment methods are? (1 = strongly not safe…7 = strongly safe)
Q11: How easy do you find the following payment methods to use? (1 = strongly not easy…7 = strongly easy)
Q12: How accepted do you think the following payment methods are in stores in your environment? (1 = strongly not accepted…7 = strongly accepted)
Q13: How fast do you consider the following payment methods? (1 = strongly not fast…7 = strongly fast)
Q14: How do you consider the following payment methods from a privacy perspective? (1 = strongly not private…7 = strongly private)
Q15: How do you consider the following payment methods from a spending-control perspective? (1 = I have no control at all…7 = I have total control)
Q16: Social influence on payment method choice. (1 = strongly disagree…7 = strongly agree)
Q17: How do you most often share expenses among friends?
Q18: Reasons for preferring digital payments. (0 = Yes, 1 = No)
Q19: Concerns about digital payment security. (1 = not concerned at all…7 = extremely concerned)
Q20: Experience with online fraud.
Q21: How did this affect your behavior in further payment habits? (0 = Yes, 1 = No)
Q22: If you had the opportunity, would you switch to digital payments entirely?
Q23: What is your gender?
Q24: What is your highest level of educational attainment?
Q25: What is your current status?
Q26: What is your current net monthly income?
Q27: Which bank do you currently use as your primary bank?
# Remove all questionnaires with a non-valid status (5 = invalid)
NLB_data <- NLB_data %>%
  filter(status != 5)

# Remove status column
NLB_data <- NLB_data %>%
  select(-status)

# Remove all respondents under 18 or over 27 (Q1 codes 1 and 12)
NLB_data <- NLB_data %>%
  filter(!Q1 %in% c(1, 12))
# Rename columns
colnames(NLB_data) <- c("Age", "Cash_Use", "Card_Use", "Phone_Use", "OtherPay_Use",
  "Cash_Up10", "Cash_11_99", "Cash_100_1000", "Cash_Over1000",
  "NoDigital_Response", "NoDigital_Response_Text", "Income_StudJobSalary", "Income_PocketMoney",
  "Income_Gifts", "Income_Occasional", "Income_Subsidy",
  "Income_Scholarship", "Income_Investments", "Save_Money",
  "Save_Form", "Spend_SameForm", "Concern_Security", "LessSecure_Digital",
  "Trust_2FA", "Safe_CashCarry", "Prefer_Digital_Convenience",
  "Prefer_Cash_Control", "Use_Cash_IfNoDigital", "Importance_Ease", "Importance_Speed",
  "Importance_Availability", "Importance_Security", "Importance_TrackingBudgeting",
  "Importance_Privacy", "Safe_Cash", "Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay",
  "Safe_Neobank", "Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay",
  "Easy_Neobank", "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard",
  "Accept_PhonePay", "Accept_Neobank", "Fast_Cash", "Fast_DebitCard",
  "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank", "Private_Cash",
  "Private_DebitCard", "Private_CreditCard", "Private_PhonePay",
  "Private_Neobank", "Control_Cash", "Control_DebitCard", "Control_CreditCard",
  "Control_PhonePay", "Control_Neobank", "Social_Friends", "Social_Family",
  "Expense_Sharing", "Expense_Sharing_Text", "Reason_ExactSum", "Reason_NoCash",
  "Reason_Convenient", "Reason_TrackFinances", "Reason_Other", "Reason_Other_Text",
  "Concern_Fraud", "Concern_PersonalInfo", "Concern_IDTheft",
  "Concern_Hacker", "OnlineFraud_Exp", "Behavior_MoreCash",
  "Behavior_SecureDigital", "Behavior_Cautious", "Behavior_SecureOption",
  "Behavior_NoChange", "Switch_Digital", "Gender", "Education",
  "Status_Employment", "Status_Employment_Text", "Income_Level", "Primary_Bank", "Primary_Bank_Text")
# Q1
NLB_data$AgeF <- factor(NLB_data$Age,
levels = c(2:11),
labels = c(18:27))
#Q3a
NLB_data$Cash_UseF <- factor(NLB_data$Cash_Use,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))
#Q3b
NLB_data$Card_UseF <- factor(NLB_data$Card_Use,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))
#Q3c
NLB_data$Phone_UseF <- factor(NLB_data$Phone_Use,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))
#Q3d
NLB_data$OtherPay_UseF <- factor(NLB_data$OtherPay_Use,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))
#Q4a
NLB_data$Cash_Up10F <- factor(NLB_data$Cash_Up10,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "Less than half", "Half", "More than half", "Always"))
#Q4b
NLB_data$Cash_11_99F <- factor(NLB_data$Cash_11_99,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "Less than half", "Half", "More than half", "Always"))
#Q4c
NLB_data$Cash_100_1000F <- factor(NLB_data$Cash_100_1000,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "Less than half", "Half", "More than half", "Always"))
#Q4d
NLB_data$Cash_Over1000F <- factor(NLB_data$Cash_Over1000,
levels = c(1, 2, 3, 4, 5),
labels = c("Never", "Less than half", "Half", "More than half", "Always"))
# Q5
NLB_data$NoDigital_ResponseF <- factor(NLB_data$NoDigital_Response,
levels = c(1, 2, 3, 4),
labels = c("Pay digital elsewhere", "Pay as available", "Never occurred", "Other"))
#Q6a
NLB_data$Income_StudJobSalaryF <- factor(NLB_data$Income_StudJobSalary,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6b
NLB_data$Income_PocketMoneyF <- factor(NLB_data$Income_PocketMoney,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6c
NLB_data$Income_GiftsF <- factor(NLB_data$Income_Gifts,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6d
NLB_data$Income_OccasionalF <- factor(NLB_data$Income_Occasional,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6e
NLB_data$Income_SubsidyF <- factor(NLB_data$Income_Subsidy,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6f
NLB_data$Income_ScholarshipF <- factor(NLB_data$Income_Scholarship,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6g
NLB_data$Income_InvestmentsF <- factor(NLB_data$Income_Investments,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
# Q17
NLB_data$Expense_SharingF <- factor(NLB_data$Expense_Sharing,
levels = c(1, 2, 3, 4, 5),
labels = c("Cash", "Mobile apps", "Bank transfer", "Don't share", "Other"))
# Q20
NLB_data$OnlineFraud_ExpF <- factor(NLB_data$OnlineFraud_Exp,
levels = c(1, 2, 3, 4),
labels = c("Yes - me", "Yes - others", "Yes - both", "No"))
# Q22
NLB_data$Switch_DigitalF <- factor(NLB_data$Switch_Digital,
levels = c(1, 2, 3, 4),
labels = c("Fully digital", "Balance digital-cash", "Cash", "Don't know"))
# Q23
NLB_data$GenderF <- factor(NLB_data$Gender,
levels = c(1, 2, 3, 4),
labels = c("Man", "Woman", "Other", "Don't want to answer"))
# Q24
NLB_data$EducationF <- factor(NLB_data$Education,
levels = c(1, 2, 3, 4, 5, 6, 7),
labels = c("Unfinished primary", "Primary school", "Vocational education", "High School", "Bachelor Degree", "Master Degree", "PhD" ))
# Q25
NLB_data$Status_EmploymentF <- factor(NLB_data$Status_Employment,
levels = c(1, 2, 3, 4, 5),
labels = c("Student", "Employed", "Self-employed", "Unemployed", "Other"))
# Q26
NLB_data$Income_LevelF <- factor(NLB_data$Income_Level,
levels = c(1, 2, 3, 4, 5),
labels = c("0-200 EUR", "201-500 EUR", "501-800 EUR", "801-1300 EUR", "Above 1300 EUR"))
# Q27
NLB_data$Primary_BankF <- factor(NLB_data$Primary_Bank,
levels = c(1, 2, 3, 4, 5, 6, 7),
labels = c("NLB", "OTP", "Intesa Sanpaolo", "Sparkasse", "Addiko Bank", "Delavska Hranilnica", "Other"))
## # A tibble: 6 × 25
## AgeF Cash_UseF Card_UseF Phone_UseF OtherPay_UseF Cash_Up10F Cash_11_99F
## <fct> <fct> <fct> <fct> <fct> <fct> <fct>
## 1 23 1 per week 1-3 monthly Daily 1-3 monthly More than… Half
## 2 25 1-3 monthly Daily Never Never Less than… Never
## 3 25 1-3 monthly Several tim… Several t… Never Less than… Less than …
## 4 23 1 per week Daily Daily Never Less than… Less than …
## 5 23 1-3 monthly Daily Daily Never Less than… Never
## 6 25 1 per week 1 per week Never 1-3 monthly More than… Half
## # ℹ 18 more variables: Cash_100_1000F <fct>, Cash_Over1000F <fct>,
## # NoDigital_ResponseF <fct>, Income_StudJobSalaryF <fct>,
## # Income_PocketMoneyF <fct>, Income_GiftsF <fct>, Income_OccasionalF <fct>,
## # Income_SubsidyF <fct>, Income_ScholarshipF <fct>,
## # Income_InvestmentsF <fct>, Expense_SharingF <fct>, OnlineFraud_ExpF <fct>,
## # Switch_DigitalF <fct>, GenderF <fct>, EducationF <fct>,
## # Status_EmploymentF <fct>, Income_LevelF <fct>, Primary_BankF <fct>
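The factor conversions above repeat the same `levels` and `labels` for every item that shares a response scale. As a minimal sketch, and not the code used to produce the output above, the same result can be obtained in one step per scale. The data frame `toy` and the vector `freq_labels` are hypothetical names introduced only for this illustration; the labels themselves are copied from the Q3 conversions.

```r
# Toy data standing in for three Q3 frequency items (codes stored as text,
# as in the imported survey file).
toy <- data.frame(
  Cash_Use  = c("1", "2", "5"),
  Card_Use  = c("4", "5", "3"),
  Phone_Use = c("5", "1", "4"),
  stringsAsFactors = FALSE
)

freq_labels <- c("Never", "1-3 monthly", "1 per week",
                 "Several times a week", "Daily")

# Convert all items that share the scale at once and store the result in
# new columns with an "F" suffix, mirroring the naming used above.
items <- c("Cash_Use", "Card_Use", "Phone_Use")
toy[paste0(items, "F")] <- lapply(toy[items], factor,
                                  levels = 1:5, labels = freq_labels)

str(toy)
```

Defining the labels once per scale reduces the risk of a typo in a single item going unnoticed.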
# Convert character columns in specified ranges to numeric
NLB_data <- NLB_data %>%
  mutate(across(c(20:66, 75:78), as.numeric))
# Select specific numerical data for summary
NLBdata_Likert <- NLB_data[, c("Save_Form", "Spend_SameForm", "Concern_Security", "LessSecure_Digital",
"Trust_2FA", "Safe_CashCarry", "Prefer_Digital_Convenience",
"Prefer_Cash_Control", "Use_Cash_IfNoDigital", "Importance_Ease",
"Importance_Speed", "Importance_Availability", "Importance_Security",
"Importance_TrackingBudgeting", "Importance_Privacy", "Safe_Cash",
"Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank",
"Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay",
"Easy_Neobank", "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard",
"Accept_PhonePay", "Accept_Neobank", "Fast_Cash", "Fast_DebitCard",
"Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank", "Private_Cash",
"Private_DebitCard", "Private_CreditCard", "Private_PhonePay",
"Private_Neobank", "Control_Cash", "Control_DebitCard", "Control_CreditCard",
"Control_PhonePay", "Control_Neobank", "Social_Friends", "Social_Family",
"Concern_Fraud", "Concern_PersonalInfo", "Concern_IDTheft", "Concern_Hacker")]
summary(NLBdata_Likert)
## Save_Form Spend_SameForm Concern_Security LessSecure_Digital
## Min. :-2.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.: 1.000 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:2.000
## Median : 4.000 Median :6.000 Median :3.000 Median :4.000
## Mean : 3.395 Mean :5.316 Mean :3.115 Mean :3.655
## 3rd Qu.: 7.000 3rd Qu.:7.000 3rd Qu.:4.000 3rd Qu.:5.000
## Max. : 7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Trust_2FA Safe_CashCarry Prefer_Digital_Convenience Prefer_Cash_Control
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:3.000 1st Qu.:6.000 1st Qu.:1.000
## Median :6.000 Median :4.000 Median :7.000 Median :4.000
## Mean :5.359 Mean :4.135 Mean :6.046 Mean :3.441
## 3rd Qu.:7.000 3rd Qu.:5.000 3rd Qu.:7.000 3rd Qu.:5.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Use_Cash_IfNoDigital Importance_Ease Importance_Speed Importance_Availability
## Min. :1.000 Min. :1.000 Min. :1 Min. :1.000
## 1st Qu.:5.000 1st Qu.:6.000 1st Qu.:6 1st Qu.:6.000
## Median :6.000 Median :7.000 Median :7 Median :7.000
## Mean :5.704 Mean :6.105 Mean :6 Mean :6.168
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7 Max. :7.000
## Importance_Security Importance_TrackingBudgeting Importance_Privacy
## Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:5.000 1st Qu.:4.000 1st Qu.:4.000
## Median :7.000 Median :5.000 Median :6.000
## Mean :6.076 Mean :5.201 Mean :5.658
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7.000
## Safe_Cash Safe_DebitCard Safe_CreditCard Safe_PhonePay
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:4.000 1st Qu.:4.000 1st Qu.:4.000
## Median :6.000 Median :5.000 Median :5.000 Median :5.000
## Mean :5.503 Mean :5.138 Mean :4.947 Mean :5.174
## 3rd Qu.:7.000 3rd Qu.:6.000 3rd Qu.:6.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Safe_Neobank Easy_Cash Easy_DebitCard Easy_CreditCard Easy_PhonePay
## Min. :1.000 Min. :1.000 Min. :3.000 Min. :3.00 Min. :1.00
## 1st Qu.:4.000 1st Qu.:4.000 1st Qu.:5.000 1st Qu.:5.00 1st Qu.:7.00
## Median :5.000 Median :5.000 Median :7.000 Median :7.00 Median :7.00
## Mean :4.826 Mean :5.102 Mean :6.148 Mean :6.02 Mean :6.48
## 3rd Qu.:6.000 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.00 3rd Qu.:7.00
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.00 Max. :7.00
## Easy_Neobank Accept_Cash Accept_DebitCard Accept_CreditCard
## Min. :1.000 Min. :1.000 Min. :3.000 Min. :2.000
## 1st Qu.:4.000 1st Qu.:6.000 1st Qu.:6.000 1st Qu.:6.000
## Median :6.000 Median :7.000 Median :7.000 Median :7.000
## Mean :5.592 Mean :6.227 Mean :6.484 Mean :6.336
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Accept_PhonePay Accept_Neobank Fast_Cash Fast_DebitCard
## Min. :1.00 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:5.00 1st Qu.:4.000 1st Qu.:2.750 1st Qu.:5.000
## Median :7.00 Median :5.000 Median :4.000 Median :7.000
## Mean :6.03 Mean :5.076 Mean :4.076 Mean :6.174
## 3rd Qu.:7.00 3rd Qu.:7.000 3rd Qu.:6.000 3rd Qu.:7.000
## Max. :7.00 Max. :7.000 Max. :7.000 Max. :7.000
## Fast_CreditCard Fast_PhonePay Fast_Neobank Private_Cash
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:5.000 1st Qu.:7.000 1st Qu.:4.000 1st Qu.:4.000
## Median :7.000 Median :7.000 Median :6.000 Median :7.000
## Mean :6.122 Mean :6.526 Mean :5.684 Mean :5.681
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Private_DebitCard Private_CreditCard Private_PhonePay Private_Neobank
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:4.000 1st Qu.:4.000 1st Qu.:3.000 1st Qu.:4.000
## Median :4.000 Median :4.000 Median :4.000 Median :4.000
## Mean :4.457 Mean :4.457 Mean :4.395 Mean :4.484
## 3rd Qu.:6.000 3rd Qu.:6.000 3rd Qu.:6.000 3rd Qu.:6.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Control_Cash Control_DebitCard Control_CreditCard Control_PhonePay
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
## 1st Qu.:3.000 1st Qu.:4.000 1st Qu.:4.000 1st Qu.:4.000
## Median :5.000 Median :5.000 Median :5.000 Median :5.000
## Mean :4.852 Mean :5.155 Mean :5.007 Mean :5.161
## 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.000
## Control_Neobank Social_Friends Social_Family Concern_Fraud
## Min. :1.000 Min. :1.000 Min. :1.000 Min. :-1.000
## 1st Qu.:4.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.: 4.000
## Median :5.000 Median :3.000 Median :4.000 Median : 5.000
## Mean :5.115 Mean :3.125 Mean :3.339 Mean : 4.507
## 3rd Qu.:7.000 3rd Qu.:4.000 3rd Qu.:5.000 3rd Qu.: 6.000
## Max. :7.000 Max. :7.000 Max. :7.000 Max. : 7.000
## Concern_PersonalInfo Concern_IDTheft Concern_Hacker
## Min. :-1.000 Min. :-1.000 Min. :-1.000
## 1st Qu.: 4.000 1st Qu.: 3.000 1st Qu.: 4.000
## Median : 4.000 Median : 4.000 Median : 5.000
## Mean : 4.457 Mean : 4.273 Mean : 4.849
## 3rd Qu.: 6.000 3rd Qu.: 6.000 3rd Qu.: 6.000
## Max. : 7.000 Max. : 7.000 Max. : 7.000
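The summary above reports minimum values of -2 for Save_Form and -1 for the four Concern_ items, although all of these questions use a 1 to 7 scale. A plausible reading, which is an assumption and is not confirmed by the questionnaire documentation shown here, is that negative values are codes for skipped or unanswered questions. Under that assumption, a minimal sketch of recoding them to NA before computing descriptive statistics could look as follows (the data frame `likert` is a toy example, not the survey data):

```r
# Toy data: two items on a 1-7 scale, with negative codes standing in for
# non-substantive answers (assumed meaning, see the note above).
likert <- data.frame(
  Save_Form     = c(-2, 1, 4, 7, 7),
  Concern_Fraud = c(5, -1, 6, 4, 7)
)

# Replace every negative code with NA so it is excluded from summaries.
likert_clean <- likert
likert_clean[likert_clean < 0] <- NA

colMeans(likert, na.rm = TRUE)        # pulled downwards by the negative codes
colMeans(likert_clean, na.rm = TRUE)  # based on substantive answers only
```

If the negative values do mark missing answers, the means reported for the affected items are slightly understated.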
Trust in Security Measures: Respondents express high trust in two-factor authentication (2FA). Trust_2FA has a median of 6 and a mean of 5.359 on the seven-point scale, so at least half of respondents rated their trust at 6 or 7, although answers span the full range from 1 to 7.
Security Concerns: Respondents report a moderate level of concern about the security of personal data, with Concern_PersonalInfo having a mean of 4.457 and a median of 4. Concerns about fraud (Concern_Fraud, median 5), identity theft (Concern_IDTheft, median 4), and hacking (Concern_Hacker, median 5) follow a similar pattern, pointing to a general but not extreme apprehension about fraud and security breaches in the digital space. These four items have a minimum of -1, a value outside the 1 to 7 response scale that most likely marks missing answers, so the reported means are probably slightly understated.
Preference for Digital Convenience: Prefer_Digital_Convenience has a median of 7 and a mean of 6.046, so at least half of participants fully agree that they prefer digital payments for their convenience. This is in line with the broader trend toward the digitalisation of financial transactions.
Preference for Cash Control: By contrast, agreement with Prefer_Cash_Control is only moderate, with a mean of 3.441 and a median of 4, the midpoint of the scale. A sizeable minority therefore still values the tangible control over spending that cash provides, but this is not the prevailing attitude in the sample.
Willingness to Use Cash in the Absence of Digital Payments: Use_Cash_IfNoDigital has a mean of 5.704 and a median of 6, indicating that although most respondents prefer digital methods, they are willing to fall back on cash when digital payment is not available.
Security of Digital Payment Methods: Perceived security differs only modestly across digital instruments. Safe_DebitCard, Safe_CreditCard, Safe_PhonePay, and Safe_Neobank all have a median of 5, with means of 5.138, 4.947, 5.174, and 4.826 respectively. Phone payments are therefore rated about as safe as debit cards, while neobanks receive the lowest rating. The attitude item Safe_CashCarry has a median of 4 and a mean of 4.135, which places it at the midpoint of the scale.
Cash vs. Digital Payment Security: Cash is rated as the safest payment method, with Safe_Cash having a mean of 5.503 and a median of 6, compared with a median of 5 for every digital instrument. The gap is small, however. Debit cards (5.138) and phone payments (5.174) follow closely, so mobile payments are not perceived as less safe than physical cards. Credit cards (4.947) and neobanks (4.826) score lowest.
Speed of Payment Methods: Fast_Cash has a mean of 4.076, so cash is seen as only moderately fast. By contrast, Fast_DebitCard, Fast_CreditCard, and Fast_PhonePay are rated highly, with means above 6, and phone payments are rated fastest (6.526). Speed is the dimension on which cash lags furthest behind the digital methods.
In summary, respondents combine a strong preference for digital convenience with continued trust in cash. Cash is rated as the safest and most private instrument, while cards and phone payments are rated as easier and faster. Concerns about fraud and data security are moderate, and trust in safeguards such as 2FA is high. Social influence appears limited, since Social_Friends and Social_Family have means of 3.125 and 3.339, both below the scale midpoint. Overall, the results suggest that digital convenience combined with robust security measures will continue to shape payment choices in this age group.
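The comparisons across payment instruments discussed above can be made explicit by arranging the reported means in a single table. The sketch below copies the means from the summary output into a matrix and identifies, for each attribute, the highest-rated instrument and the distance between cash and the best digital instrument. The object names (`means`, `highest`, `cash_gap`) are introduced only for this illustration.

```r
# Means copied from the summary output (rows: attribute, columns: instrument).
means <- matrix(
  c(5.503, 5.138, 4.947, 5.174, 4.826,   # Safe
    5.102, 6.148, 6.020, 6.480, 5.592,   # Easy
    6.227, 6.484, 6.336, 6.030, 5.076,   # Accept
    4.076, 6.174, 6.122, 6.526, 5.684,   # Fast
    5.681, 4.457, 4.457, 4.395, 4.484,   # Private
    4.852, 5.155, 5.007, 5.161, 5.115),  # Control
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Safe", "Easy", "Accept", "Fast", "Private", "Control"),
    c("Cash", "DebitCard", "CreditCard", "PhonePay", "Neobank")
  )
)

# Highest-rated instrument for each attribute.
highest <- colnames(means)[apply(means, 1, which.max)]
names(highest) <- rownames(means)
highest

# Cash mean minus the best digital mean: positive values favour cash.
cash_gap <- means[, "Cash"] - apply(means[, -1], 1, max)
round(cash_gap, 3)
```

These are differences between sample means only; whether they are statistically meaningful would require a formal test on the respondent-level data.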
# Select specific categorical data for summary
categorical_columns <- c("AgeF", "Cash_UseF", "Card_UseF", "Phone_UseF", "OtherPay_UseF",
"Cash_Up10F", "Cash_11_99F", "Cash_100_1000F", "Cash_Over1000F",
"NoDigital_ResponseF", "Income_StudJobSalaryF", "Income_PocketMoneyF",
"Income_GiftsF", "Income_OccasionalF", "Income_SubsidyF", "Income_ScholarshipF",
"Income_InvestmentsF", "Expense_SharingF", "OnlineFraud_ExpF", "Switch_DigitalF",
"GenderF", "EducationF", "Status_EmploymentF", "Income_LevelF", "Primary_BankF")
# Use summary to view frequency counts for categorical columns
summary(NLB_data[, categorical_columns])
## AgeF Cash_UseF Card_UseF
## 23 :64 Never : 23 Never : 22
## 22 :44 1-3 monthly :133 1-3 monthly : 62
## 25 :43 1 per week : 66 1 per week : 27
## 20 :40 Several times a week: 65 Several times a week:125
## 24 :30 Daily : 17 Daily : 68
## 21 :23
## (Other):60
## Phone_UseF OtherPay_UseF Cash_Up10F
## Never : 81 Never :180 Never : 34
## 1-3 monthly : 24 1-3 monthly : 96 Less than half:169
## 1 per week : 20 1 per week : 15 Half : 55
## Several times a week: 71 Several times a week: 12 More than half: 31
## Daily :108 Daily : 1 Always : 15
##
##
## Cash_11_99F Cash_100_1000F Cash_Over1000F
## Never : 88 Never :197 Never :252
## Less than half:153 Less than half: 68 Less than half: 21
## Half : 37 Half : 16 Half : 8
## More than half: 18 More than half: 17 More than half: 8
## Always : 8 Always : 6 Always : 15
##
##
## NoDigital_ResponseF Income_StudJobSalaryF
## Pay digital elsewhere: 52 Cash : 10
## Pay as available :195 Cash&Digitally : 43
## Never occurred : 42 Digitally :217
## Other : 15 Not using : 34
## Don't want to answer: 0
##
##
## Income_PocketMoneyF Income_GiftsF
## Cash : 76 Cash :251
## Cash&Digitally : 63 Cash&Digitally : 36
## Digitally : 57 Digitally : 3
## Not using :107 Not using : 13
## Don't want to answer: 1 Don't want to answer: 1
##
##
## Income_OccasionalF Income_SubsidyF
## Cash : 88 Cash : 1
## Cash&Digitally : 17 Cash&Digitally : 1
## Digitally : 3 Digitally : 62
## Not using :187 Not using :235
## Don't want to answer: 9 Don't want to answer: 5
##
##
## Income_ScholarshipF Income_InvestmentsF
## Cash : 0 Cash : 2
## Cash&Digitally : 2 Cash&Digitally : 2
## Digitally :137 Digitally : 93
## Not using :164 Not using :203
## Don't want to answer: 1 Don't want to answer: 4
##
##
## Expense_SharingF OnlineFraud_ExpF Switch_DigitalF
## Cash : 28 Yes - me : 22 Fully digital : 78
## Mobile apps :250 Yes - others: 96 Balance digital-cash:171
## Bank transfer: 14 Yes - both : 11 Cash : 44
## Don't share : 8 No :172 Don't know : 10
## Other : 4 NA's : 3 NA's : 1
##
##
## GenderF EducationF Status_EmploymentF
## Man :120 Unfinished primary : 0 Student :236
## Woman :182 Primary school : 8 Employed : 49
## Other : 1 Vocational education: 2 Self-employed: 7
## Don't want to answer: 1 High School :132 Unemployed : 3
## Bachelor Degree : 99 Other : 9
## Master Degree : 61
## PhD : 2
## Income_LevelF Primary_BankF
## 0-200 EUR :48 NLB :121
## 201-500 EUR :91 OTP :109
## 501-800 EUR :58 Intesa Sanpaolo : 17
## 801-1300 EUR :48 Sparkasse : 7
## Above 1300 EUR:57 Addiko Bank : 8
## NA's : 2 Delavska Hranilnica: 19
## Other : 23
Cash Usage:
The most common frequency of cash use is 1-3 times a month (133 respondents), followed by once a week (66) and several times a week (65); only 17 respondents use cash daily. Just 23 respondents never pay in cash, so cash is still used by the large majority of the sample, although mostly on an occasional basis.
Card Usage:
For card payments, the most common frequency is several times a week (125 participants), followed by daily usage (68 participants), showing that many respondents rely heavily on cards for transactions. Only 22 respondents never use cards for payments.
Phone Payments:
Phone payments are polarised. Daily use is the most common category (108 respondents), but the group that never pays by phone is also large (81 respondents), which could reflect concerns about mobile payment security or simply a preference for other methods.
Sharing Expenses:
Mobile apps are by far the most common way of sharing expenses (250 respondents). A smaller group of 28 respondents still uses cash, 14 use bank transfers, and 8 do not share expenses at all.
Switch to Digital Payments:
Asked whether they would switch to digital payments entirely if given the opportunity, 78 participants would go fully digital, while 171 would prefer a balance between digital and cash. A further 44 respondents chose cash, and 10 are undecided.
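Raw counts are easier to compare when expressed as shares of the valid answers. The following sketch converts the Switch_DigitalF counts from the frequency table into percentages; the object names are introduced only for this illustration, and the one missing answer is excluded from the base.

```r
# Counts copied from the Switch_DigitalF frequency table (NA's excluded).
switch_counts <- c("Fully digital"        = 78,
                   "Balance digital-cash" = 171,
                   "Cash"                 = 44,
                   "Don't know"           = 10)

# Share of each answer among valid responses, in percent.
switch_share <- round(100 * prop.table(switch_counts), 1)
switch_share
```

On the respondent-level data, the same shares could be obtained with `prop.table(table(NLB_data$Switch_DigitalF))`.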
Gender Distribution:
The sample consists of 120 men and 182 women; one respondent selected "Other" and one preferred not to answer.
Educational Background:
The largest group has completed high school (132 respondents), followed by respondents holding a bachelor's degree (99) and a master's degree (61). The educational distribution indicates a predominantly young, educated sample, which could influence financial decisions and preferences.
Employment Status:
The majority of the participants are students (236 respondents), followed by employed individuals (49 respondents). This suggests that many participants may be financially dependent, which could shape their preferences regarding cash, digital payments, and financial control.
Income Levels:
Net monthly income is spread across all brackets. The most frequent bracket is 201-500 EUR (91 respondents), and the median respondent falls in the 501-800 EUR bracket, which is consistent with a sample made up largely of students. At the same time, 57 respondents report an income above 1300 EUR, almost as many as in the 501-800 EUR bracket (58), so higher earners are not a negligible group.
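Because income is recorded in ordered brackets, its central tendency is best described by the modal and median brackets rather than by a mean. The sketch below derives both from the Income_LevelF counts in the frequency table; the object names are introduced only for this illustration, and the two missing answers are excluded.

```r
# Counts copied from the Income_LevelF frequency table (NA's excluded).
income_counts <- c("0-200 EUR"      = 48,
                   "201-500 EUR"    = 91,
                   "501-800 EUR"    = 58,
                   "801-1300 EUR"   = 48,
                   "Above 1300 EUR" = 57)

# Modal bracket: the category with the largest count.
modal_bracket <- names(income_counts)[which.max(income_counts)]

# Median bracket: the first category whose cumulative share reaches 50%.
cum_share <- cumsum(income_counts) / sum(income_counts)
median_bracket <- names(cum_share)[which(cum_share >= 0.5)[1]]

modal_bracket
median_bracket
round(cum_share, 3)
```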
Primary Bank Usage:
NLB and OTP are the two most common primary banks, with 121 and 109 respondents respectively. Delavska Hranilnica (19), Intesa Sanpaolo (17), Addiko Bank (8), and Sparkasse (7) are named far less often, so the sample is concentrated in two major banks.
The categorical analysis shows that digital payments, especially by card and phone, dominate everyday transactions. Cash nevertheless remains relevant, less as a daily payment method than as the form in which gifts, pocket money, and occasional income are received. The sample consists mostly of students with low to moderate incomes, and most respondents would prefer a balance between digital and cash rather than a fully digital system.
Furthermore, mobile apps clearly dominate as the preferred way of dividing costs among peers. Whether payment preferences differ by age, income level, or primary bank cannot be read from these univariate frequencies and would need to be examined in a multivariate analysis. The descriptive picture is one of growing but cautious adoption of digital financial solutions among young, mostly student respondents.
The decision to use Principal Component Analysis (PCA) as the initial step in the analytical sequence addresses several methodological considerations identified during the exploratory assessment of the NLB dataset.
Dimensionality Reduction and Management of Analytical Complexity
The original dataset contains a substantial number of metric variables reflecting young individuals’ perceptions of five distinct payment instruments (Cash, Debit Cards, Credit Cards, Smartphone Payments, and Neobanks). Treating each of the 24 variables for the four non-cash instruments individually – six attributes (security, ease of use, availability, speed, privacy, and control) rated for each instrument – would have made the analysis both diffuse and redundant. PCA was therefore selected to reduce dimensionality, synthesising these measures into six components, one per attribute (PCASafety, PCAEase, PCAAvailability, PCASpeed, PCAPrivacy, PCAControl), that capture the essence of respondents’ perceptions while preserving most of the informational content of the dataset.
Mitigation of Multicollinearity
Previous research in the field of digital payments indicates that variables such as “ease of use” and “speed” are often highly correlated, as systems perceived as user-friendly are frequently considered efficient. In our dataset, the biplot interpretation confirms that instruments such as cards and mobile payments cluster according to shared efficiency characteristics. The application of PCA allows correlated variables to be transformed into orthogonal (uncorrelated) components, a critical prerequisite for ensuring the robustness of subsequent multivariate analyses, such as Cluster Analysis, and for preventing the undue influence of redundant variables on analytical outcomes.
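The orthogonality property mentioned above can be checked directly. The sketch below uses simulated data, not the NLB survey, in which two hypothetical ratings (`ease` and `speed`) share a common source and are therefore strongly correlated. After a PCA with base R's `prcomp()`, the component scores are uncorrelated by construction.

```r
set.seed(1)
n <- 300

# Simulated ratings: "ease" and "speed" share a latent efficiency perception,
# while "accept" is generated independently (all names are hypothetical).
latent <- rnorm(n)
ease   <- latent + rnorm(n, sd = 0.5)
speed  <- latent + rnorm(n, sd = 0.5)
accept <- rnorm(n)
X <- cbind(ease, speed, accept)

# Correlation between the original variables: ease and speed overlap heavily.
round(cor(X), 2)

# PCA on standardised variables; the resulting scores are orthogonal.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
round(cor(pca$x), 2)
```

The second correlation matrix is the identity matrix up to rounding, which is why component scores are often used as input for distance-based methods in place of the correlated original items.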
Construction of a Perception Map (Visual Interpretation)
In addition to data reduction, PCA was employed for its capacity to generate a perceptual map. The first two dimensions (Dim1 and Dim2) together account for 93.1% of the total variance, providing sufficient fidelity to visually represent the psychological trade-offs faced by young consumers:
This visualisation is consistent with findings in the literature (Usman et al., 2025), which highlight financial literacy and perceived behavioural control as primary drivers of digital adoption, while perceived risk continues to act as a deterrent.
Identification of Latent Constructs Substantiated by Prior Research
The extraction of components related to security and spending control finds empirical support in studies such as Puusniekka (2020), which highlight the enduring perception of cash as a “sacred” instrument for tangible budgetary control. Through PCA, these latent constructs were statistically isolated, confirming that cash remains strongly associated with privacy while exhibiting negative correlations with speed and availability.
NLB_PCASafety <- NLB_data[ , c("Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank")]
NLB_PCAEase <- NLB_data[ , c("Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", "Easy_Neobank")]
NLB_PCAAvailability <- NLB_data[ , c("Accept_DebitCard", "Accept_CreditCard", "Accept_PhonePay", "Accept_Neobank")]
NLB_PCASpeed <- NLB_data[ , c("Fast_DebitCard", "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank")]
NLB_PCAPrivacy <- NLB_data[ , c("Private_DebitCard", "Private_CreditCard", "Private_PhonePay", "Private_Neobank")]
NLB_PCAControl <- NLB_data[ , c("Control_DebitCard", "Control_CreditCard", "Control_PhonePay", "Control_Neobank")]
library(pastecs)
## Safe_DebitCard Safe_CreditCard Safe_PhonePay Safe_Neobank
## median 5.00 5.00 5.00 5.00
## mean 5.14 4.95 5.17 4.83
## SE.mean 0.07 0.08 0.09 0.09
## CI.mean.0.95 0.14 0.15 0.17 0.17
## var 1.58 1.85 2.28 2.30
## std.dev 1.26 1.36 1.51 1.52
## coef.var 0.25 0.28 0.29 0.31
## Easy_DebitCard Easy_CreditCard Easy_PhonePay Easy_Neobank
## median 7.00 7.00 7.00 6.00
## mean 6.15 6.02 6.48 5.59
## SE.mean 0.06 0.07 0.06 0.09
## CI.mean.0.95 0.13 0.14 0.12 0.17
## var 1.26 1.45 1.22 2.31
## std.dev 1.12 1.20 1.10 1.52
## coef.var 0.18 0.20 0.17 0.27
## Accept_DebitCard Accept_CreditCard Accept_PhonePay Accept_Neobank
## median 7.00 7.00 7.00 5.00
## mean 6.48 6.34 6.03 5.08
## SE.mean 0.05 0.06 0.07 0.10
## CI.mean.0.95 0.11 0.13 0.14 0.19
## var 0.88 1.24 1.51 2.81
## std.dev 0.94 1.11 1.23 1.68
## coef.var 0.14 0.18 0.20 0.33
## Fast_DebitCard Fast_CreditCard Fast_PhonePay Fast_Neobank
## median 7.00 7.00 7.00 6.00
## mean 6.17 6.12 6.53 5.68
## SE.mean 0.07 0.07 0.06 0.08
## CI.mean.0.95 0.13 0.14 0.12 0.17
## var 1.34 1.47 1.08 2.18
## std.dev 1.16 1.21 1.04 1.48
## coef.var 0.19 0.20 0.16 0.26
## Private_DebitCard Private_CreditCard Private_PhonePay Private_Neobank
## median 4.00 4.00 4.00 4.00
## mean 4.46 4.46 4.39 4.48
## SE.mean 0.10 0.10 0.10 0.09
## CI.mean.0.95 0.20 0.20 0.20 0.19
## var 3.01 2.99 3.22 2.71
## std.dev 1.73 1.73 1.80 1.64
## coef.var 0.39 0.39 0.41 0.37
## Control_DebitCard Control_CreditCard Control_PhonePay Control_Neobank
## median 5.00 5.00 5.00 5.00
## mean 5.15 5.01 5.16 5.12
## SE.mean 0.10 0.10 0.11 0.10
## CI.mean.0.95 0.19 0.20 0.21 0.19
## var 2.92 3.03 3.38 2.84
## std.dev 1.71 1.74 1.84 1.69
## coef.var 0.33 0.35 0.36 0.33
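library(FactoMineR)
# PCA on the four availability items, keeping only the first component
components12 <- PCA(NLB_PCAAvailability,
scale.unit = TRUE,
graph = FALSE,
ncp = 1)
# components10, components11, components13, components14 and components15 are assumed
# to be fitted in the same way on the other five item blocks (those calls are not shown)
NLB_data$PCASafety <- components10$ind$coord[ , 1]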
NLB_data$PCAEase <- components11$ind$coord[ , 1]
NLB_data$PCAAvailability <- components12$ind$coord[ , 1]
NLB_data$PCASpeed <- components13$ind$coord[ , 1]
NLB_data$PCAPrivacy <- components14$ind$coord[ , 1]
NLB_data$PCAControl <- components15$ind$coord[ , 1]
head(NLB_data)
## # A tibble: 6 × 123
## Age Cash_Use Card_Use Phone_Use OtherPay_Use Cash_Up10 Cash_11_99
## <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 7 3 2 5 2 4 3
## 2 9 2 5 1 1 2 1
## 3 9 2 4 4 1 2 2
## 4 7 3 5 5 1 2 2
## 5 7 2 5 5 1 2 1
## 6 9 3 3 1 2 4 3
## # ℹ 116 more variables: Cash_100_1000 <chr>, Cash_Over1000 <chr>,
## # NoDigital_Response <chr>, NoDigital_Response_Text <chr>,
## # Income_StudJobSalary <chr>, Income_PocketMoney <chr>, Income_Gifts <chr>,
## # Income_Occasional <chr>, Income_Subsidy <chr>, Income_Scholarship <chr>,
## # Income_Investments <chr>, Save_Money <chr>, Save_Form <dbl>,
## # Spend_SameForm <dbl>, Concern_Security <dbl>, LessSecure_Digital <dbl>,
## # Trust_2FA <dbl>, Safe_CashCarry <dbl>, Prefer_Digital_Convenience <dbl>, …
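The block-wise extraction used above (one PCA per attribute, first component kept as a composite score) can be sketched as a loop. This is an illustration on simulated ratings, not the survey data: the toy data frame, the helper `first_component()`, and the restriction to two attributes are all assumptions, and base R's `prcomp()` stands in for FactoMineR's `PCA()`, so the scores may differ from the study's by sign and by a scaling factor.

```r
set.seed(42)
n <- 100
attributes  <- c("Safe", "Easy")   # hypothetical subset of the six attributes
instruments <- c("DebitCard", "CreditCard", "PhonePay", "Neobank")

# Simulate 1-7 ratings driven by one latent factor per attribute
sim <- list()
for (a in attributes) {
  latent <- rnorm(n)
  for (i in instruments) {
    raw <- round(4.5 + latent + rnorm(n, sd = 0.7))
    sim[[paste(a, i, sep = "_")]] <- pmin(7, pmax(1, raw))
  }
}
toy <- as.data.frame(sim)

# Hypothetical helper: scores on the first principal component of a block
first_component <- function(block) {
  prcomp(block, center = TRUE, scale. = TRUE)$x[, 1]
}

# One PCA per attribute block; store the first component as a new column
for (a in attributes) {
  cols <- paste(a, instruments, sep = "_")
  toy[[paste0("PCA", a)]] <- first_component(toy[, cols])
}

print(head(toy[, c("PCASafe", "PCAEasy")]))
```

Because the sign of a principal component is arbitrary, it is worth checking the loadings of each block before interpreting a high score as "more secure" or "easier".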
Following the reduction of dataset complexity through PCA, the second phase of the analytical sequence involved applying Cluster Analysis. This technique was chosen to address the limitations of mean-based analyses and to capture the intrinsic heterogeneity of Generation Z regarding digital payment behaviours.
In line with Puusniekka (2020), young individuals do not form a homogeneous group: their path towards financial independence is shaped by personal preferences, family habits, and varying levels of digital competence. The primary aim of the clustering procedure is therefore to identify distinct behavioural profiles, enabling the bank to develop targeted strategies and move from a generic understanding of the population to segmentation grounded in the latent perceptions extracted through PCA.
To ensure the robustness of the results, a two-step procedure was adopted:
library(dplyr)
library(factoextra)
# Standardise the six component scores
NLB_CluStd <- as.data.frame(scale(NLB_data[c("PCASafety", "PCAEase", "PCAAvailability", "PCASpeed", "PCAPrivacy", "PCAControl")]))
# Euclidean distance of each respondent from the centroid of the standardised scores
NLB_CluStd$Dissimilarity <- sqrt(NLB_CluStd$PCASafety^2 + NLB_CluStd$PCAEase^2 + NLB_CluStd$PCAAvailability^2 + NLB_CluStd$PCASpeed^2 + NLB_CluStd$PCAPrivacy^2 + NLB_CluStd$PCAControl^2)
# Pairwise distances between respondents (Dissimilarity is still a column of NLB_CluStd here)
Distances <- get_dist(NLB_CluStd,
method = "euclidean")
fviz_dist(Distances,
gradient = list(low = "#230078", # NLB INDIGO BLUE
mid = "#A7A8AA", # NLB LIGHT GRAY
high = "white"))
# Give the clustering variables readable names
NLB_CluStd <- NLB_CluStd %>% rename(Security = PCASafety,
`Ease of use` = PCAEase,
Availability = PCAAvailability,
Speed = PCASpeed,
Privacy = PCAPrivacy,
`Spending Control` = PCAControl)
## $hopkins_stat
## [1] 0.6785538
##
## $plot
## NULL
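The `Dissimilarity` column computed above is the Euclidean norm of each respondent's standardised scores, that is, the distance from the centroid of the standardised data. A minimal sketch on simulated scores (the columns `a`, `b`, `c` are hypothetical) confirms the equivalence. Inspecting the largest values for potential outliers is one common use; that purpose is an assumption here, not something the text states.

```r
set.seed(7)
scores <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))

# Standardise, then compute the row-wise Euclidean norm
z <- as.data.frame(scale(scores))
vars <- c("a", "b", "c")
z$Dissimilarity <- sqrt(rowSums(z[, vars]^2))

# Cross-check: distance of every row from the origin (the centroid after scaling)
with_origin <- rbind(rep(0, length(vars)), as.matrix(z[, vars]))
centroid_dist <- as.matrix(dist(with_origin))[-1, 1]

# The three respondents furthest from the centroid
print(head(z[order(-z$Dissimilarity), ], 3))
```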
library(dplyr)
library(factoextra)
WARD <- NLB_CluStd %>%
get_dist(method = "euclidean") %>%
hclust(method = "ward.D2")
WARD
##
## Call:
## hclust(d = ., method = "ward.D2")
##
## Cluster method : ward.D2
## Distance : euclidean
## Number of objects: 304
library(factoextra)
fviz_dend(WARD,
k=3,
cex = 0.5,
palette = "jama",
color_labels_by_k = TRUE,
rect = TRUE)
## Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
## of ggplot2 3.3.4.
## ℹ The deprecated feature was likely used in the factoextra package.
## Please report the issue at <https://github.com/kassambara/factoextra/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
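For readers who want to see the hierarchical step in isolation, the following sketch runs Ward's method on simulated two-dimensional data with three well-separated groups and cuts the tree into three clusters with `cutree()`. The data and the choice of `k = 3` are illustrative assumptions; only base R functions are used.

```r
set.seed(3)

# Three simulated groups of 30 points each around different centres
centres <- rbind(c(-3, -3), c(0, 3), c(3, -3))
toy <- do.call(rbind, lapply(1:3, function(k) {
  cbind(rnorm(30, mean = centres[k, 1]), rnorm(30, mean = centres[k, 2]))
}))
toy <- scale(toy)

# Ward's minimum-variance linkage on Euclidean distances
tree <- hclust(dist(toy, method = "euclidean"), method = "ward.D2")

# Cut the dendrogram into three clusters and tabulate their sizes
groups <- cutree(tree, k = 3)
print(table(groups))
```

The number of clusters passed to `cutree()` is a modelling decision; the dendrogram only suggests where large jumps in merging height occur.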
library(factoextra)
library(NbClust)
fviz_nbclust(NLB_CluStd, kmeans, method = "wss") +
labs(subtitle = "Elbow Method")
## K-means clustering with 4 clusters of sizes 63, 93, 103, 45
##
## Cluster means:
## Security Ease of use Availability Speed Privacy Spending Control
## 1 0.4866563 0.4709095 -0.15616090 0.24994550 0.8603929 -0.8773218
## 2 -0.7517767 0.7902180 -0.57596893 0.69484422 -0.8043929 0.5618051
## 3 0.1430845 -0.5409715 0.04180835 -0.07174967 -0.0235824 0.1264165
## 4 0.5448485 -1.0541669 1.31326640 -1.62170805 0.5118394 -0.2221667
## Dissimilarity
## 1 2.469170
## 2 2.324214
## 3 1.698921
## 4 3.443997
##
## Clustering vector:
## [1] 2 1 2 3 2 2 3 3 3 3 2 3 1 3 2 2 1 3 3 2 1 2 2 4 3 3 3 2 2 4 1 4 3 3 3 2 2
## [38] 4 2 2 1 2 4 3 3 4 1 3 3 2 4 3 2 2 1 3 2 2 1 3 2 4 1 1 1 1 2 3 3 2 4 1 3 2
## [75] 2 1 2 1 3 3 1 4 3 1 1 3 3 3 2 3 4 2 1 1 2 2 3 2 4 1 1 2 2 2 3 4 1 2 2 1 3
## [112] 1 1 1 2 3 1 4 2 3 2 1 4 4 1 2 3 2 2 3 1 1 4 2 3 3 4 1 1 4 3 3 2 2 2 4 2 2
## [149] 1 3 2 3 3 1 3 3 2 3 3 2 2 3 2 4 3 3 2 1 4 4 4 2 4 1 2 4 1 1 3 2 3 3 2 3 1
## [186] 3 3 3 4 1 4 3 2 3 4 3 3 3 1 3 1 4 1 4 2 4 4 4 4 3 3 3 2 1 2 3 1 3 2 3 1 3
## [223] 3 3 3 4 2 2 4 2 2 1 2 3 3 2 2 2 2 3 2 3 1 1 3 3 2 2 2 2 2 2 2 4 2 1 3 3 1
## [260] 1 1 3 2 2 3 2 3 3 3 3 2 2 3 3 3 3 3 3 3 3 3 3 2 1 1 1 1 4 1 4 4 2 2 1 2 3
## [297] 3 4 1 4 3 4 2 4
##
## Within cluster sum of squares by cluster:
## [1] 301.0014 275.2221 329.2284 299.6181
## (between_SS / total_SS = 40.2 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss" "tot.withinss"
## [6] "betweenss" "size" "iter" "ifault"
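The `between_SS / total_SS` figure in the output above is the share of total variation accounted for by the cluster partition. The sketch below, on simulated data with an assumed three-group structure, shows how that ratio and the elbow-method curve can be derived directly from `kmeans()` objects in base R.

```r
set.seed(11)

# Simulated, standardised data with three loose groups (illustrative only)
toy <- scale(rbind(
  matrix(rnorm(120, mean = -2), ncol = 2),
  matrix(rnorm(120, mean =  2), ncol = 2),
  cbind(rnorm(60, mean = -2), rnorm(60, mean = 2))
))

# Elbow method: total within-cluster sum of squares for k = 1..6
wss <- sapply(1:6, function(k) kmeans(toy, centers = k, nstart = 25)$tot.withinss)
print(round(wss, 1))

# Fit the chosen solution and compute the share of variation between clusters
fit <- kmeans(toy, centers = 3, nstart = 25)
explained <- fit$betweenss / fit$totss
print(round(100 * explained, 1))
```

Using several random starts (`nstart`) reduces the risk of reporting a poor local optimum; the value 25 is an arbitrary choice for this example.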
library(factoextra)
# Define the NLB colors
NLB_colors <- c("#230078", "#84BD00", "#FA7800", "#63666A", "#A7A8AA", "black", "orange")
fviz_cluster(Clustering,
palette = NLB_colors, # Use NLB colors for clusters
repel = FALSE,
ggtheme = theme_bw(), # Black and white theme
data = NLB_CluStd)
## Security Ease of use Availability Speed Privacy Spending Control
## 1 0.4866563 0.4709095 -0.15616090 0.24994550 0.8603929 -0.8773218
## 2 -0.7517767 0.7902180 -0.57596893 0.69484422 -0.8043929 0.5618051
## 3 0.1430845 -0.5409715 0.04180835 -0.07174967 -0.0235824 0.1264165
## 4 0.5448485 -1.0541669 1.31326640 -1.62170805 0.5118394 -0.2221667
## Dissimilarity
## 1 2.469170
## 2 2.324214
## 3 1.698921
## 4 3.443997
Figure <- pivot_longer(Figure, cols = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"))
Figure$Group <- factor(Figure$id,
levels = c(1, 2, 3, 4, 5),
labels = c("1", "2", "3", "4", "5"))
Figure$ImeF <- factor(Figure$name,
levels = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"),
labels = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"))
library(ggplot2)
ggplot(Figure, aes(x = ImeF, y = value)) +
geom_hline(yintercept = 0) +
theme_bw() +
geom_point(aes(shape = Group, col = Group), size = 3) +
geom_line(aes(group = id), linewidth = 1) +
ylab("Averages") +
xlab("Cluster variables") +
scale_color_manual(values = NLB_colors) + # Use NLB colors for points and lines
ylim(-3, 3) +
theme(axis.text.x = element_text(angle = 45, vjust = 0.50, size = 10))
# Attach the k-means cluster membership to the standardised data
NLB_CluStd$Group <- Clustering$cluster
# One-way ANOVA of each clustering variable across the clusters
fit <- aov(cbind(`Security`, `Ease of use`, `Availability`, `Speed`, `Privacy`, `Spending Control`) ~ as.factor(Group),
data = NLB_CluStd)
summary(fit)
## Response Security :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 82.949 27.6495 37.695 < 2.2e-16 ***
## Residuals 300 220.051 0.7335
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Ease of use :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 152.19 50.731 100.92 < 2.2e-16 ***
## Residuals 300 150.81 0.503
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Availability :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 110.18 36.726 57.14 < 2.2e-16 ***
## Residuals 300 192.82 0.643
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Speed :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 167.71 55.905 123.97 < 2.2e-16 ***
## Residuals 300 135.29 0.451
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Privacy :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 118.66 39.553 64.37 < 2.2e-16 ***
## Residuals 300 184.34 0.614
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response Spending Control :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(Group) 3 81.711 27.2370 36.925 < 2.2e-16 ***
## Residuals 300 221.289 0.7376
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
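Each block of the output above is a one-way ANOVA of one clustering variable across the groups. As a minimal sketch of what the F value represents, the example below uses simulated data (four hypothetical groups of 25) and compares the F statistic reported by `aov()` with the ratio of between-group to within-group mean squares computed by hand.

```r
set.seed(5)

# Four simulated groups with different means (illustrative only)
group <- factor(rep(1:4, each = 25))
score <- rnorm(100) + c(-1, 0, 0.5, 1.5)[group]

# ANOVA table from aov()
fit <- aov(score ~ group)
tab <- summary(fit)[[1]]
print(tab)

# Manual computation of the same F statistic
grand_mean  <- mean(score)
group_means <- tapply(score, group, mean)
ss_between  <- sum(25 * (group_means - grand_mean)^2)
ss_within   <- sum((score - group_means[group])^2)
f_manual    <- (ss_between / 3) / (ss_within / 96)   # df: 4 - 1 and 100 - 4
print(f_manual)
```

When the groups come from a clustering of the same variables, as in this study, such F tests are best read descriptively: the clusters were formed precisely to separate respondents on these variables, so small p-values are expected.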
NLB_data$Phone_Use_merged <- ifelse(
NLB_data$Phone_Use == 1, "never",
ifelse(NLB_data$Phone_Use %in% c(2, 3), "irregular basis", "regular basis")
)
# Convert the merged column to a factor with levels in the desired order
NLB_data$Phone_Use_merged <- factor(
NLB_data$Phone_Use_merged,
levels = c("never", "irregular basis", "regular basis")
)
##
## Pearson's Chi-squared test
##
## data: NLB_data$Phone_Use_merged and as.factor(NLB_data$Group)
## X-squared = 19.394, df = 6, p-value = 0.003547
NLB_data$Cash_Up10_merged <- ifelse(
NLB_data$Cash_Up10 %in% c(1, 2), "less than half",
ifelse(NLB_data$Cash_Up10 == 3, "half", "more than half")
)
# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_Up10_merged <- factor(
NLB_data$Cash_Up10_merged,
levels = c("less than half", "half", "more than half")
)
##
## Pearson's Chi-squared test
##
## data: NLB_data$Cash_Up10_merged and as.factor(NLB_data$Group)
## X-squared = 20.107, df = 6, p-value = 0.00265
# Calculate frequency by Mobile payment usage
Phone_freq <- NLB_data %>%
group_by(Group, Phone_Use_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Mobile payment usage
ggplot(Phone_freq, aes(x = Group, y = Percentage, fill = Phone_Use_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Mobile Payment Usage Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Frequency of Mobile Payment Usage"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Assuming Phone_freq is already calculated, reshape it
Phone_wide <- Phone_freq %>%
select(-Count) %>% # Remove the Count column (optional)
spread(key = Phone_Use_merged, value = Percentage) # Spread data across columns
# Create and style the wide format table with borders and grey title row
Phone_wide %>%
kable(caption = "Mobile Payment Usage Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | never | irregular basis | regular basis |
|---|---|---|---|
| 1 | 38.1 | 14.3 | 47.6 |
| 2 | 18.3 | 17.2 | 64.5 |
| 3 | 22.3 | 8.7 | 68.9 |
| 4 | 37.8 | 22.2 | 40.0 |
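The link between this percentage table and the chi-squared test reported earlier for mobile payment usage can be made explicit with a short sketch. The counts below are not taken from the raw data: they are back-calculated from the percentages in the table and the k-means cluster sizes (63, 93, 103, 45), so they should be treated as an inference. Only base R is used.

```r
# Back-calculated counts (an inference from the reported percentages and
# cluster sizes, not the raw survey data)
counts <- matrix(
  c(24,  9, 30,
    17, 16, 60,
    23,  9, 71,
    17, 10, 18),
  nrow = 4, byrow = TRUE,
  dimnames = list(Group = as.character(1:4),
                  Usage = c("never", "irregular basis", "regular basis"))
)

# Row percentages: share of each usage level within a cluster
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
print(row_pct)

# Pearson chi-squared test of independence; df = (4 - 1) * (3 - 1) = 6
test <- chisq.test(counts)
print(test$statistic)
print(test$parameter)
```

With these counts the statistic comes out very close to the reported X-squared of 19.394 on 6 degrees of freedom, which suggests the table and the test describe the same cross-tabulation.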
# Calculate frequency by cash usage for purchases up to 10 EUR
Purchases_freq <- NLB_data %>%
group_by(Group, Cash_Up10_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by cash usage for purchases up to 10 EUR
ggplot(Purchases_freq, aes(x = Group, y = Percentage, fill = Cash_Up10_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Cash Payment for Purchases up to 10 EUR Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Number of Cash Payments (Up to 10 EUR)"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Assuming Purchases_freq is already calculated, reshape it
Purchases_wide <- Purchases_freq %>%
select(-Count) %>% # Remove the Count column (optional)
spread(key = Cash_Up10_merged, value = Percentage) # Spread data across columns
# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
kable(caption = "Cash Usage for Purchases Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | less than half | half | more than half |
|---|---|---|---|
| 1 | 65.1 | 23.8 | 11.1 |
| 2 | 72.0 | 8.6 | 19.4 |
| 3 | 68.9 | 23.3 | 7.8 |
| 4 | 53.3 | 17.8 | 28.9 |
NLB_data$Card_Use_merged <- ifelse(
NLB_data$Card_Use == 1, "never",
ifelse(NLB_data$Card_Use %in% c(2, 3), "irregular basis", "regular basis"))
# Convert the merged column to a factor with levels in the desired order
NLB_data$Card_Use_merged <- factor(
NLB_data$Card_Use_merged,
levels = c("never", "irregular basis", "regular basis"))
# Calculate frequency by card payment usage
Card_freq <- NLB_data %>%
group_by(Group, Card_Use_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by card payment usage
ggplot(Card_freq, aes(x = Group, y = Percentage, fill = Card_Use_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Card Payment Usage Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Frequency of Card Payment Usage"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Assuming Card_freq is already calculated, reshape it
Card_wide <- Card_freq %>%
select(-Count) %>% # Remove the Count column (optional)
spread(key = Card_Use_merged, value = Percentage) # Spread data across columns
# Create and style the wide format table with borders and grey title row
Card_wide %>%
kable(caption = "Card Usage for Purchases Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | never | irregular basis | regular basis |
|---|---|---|---|
| 1 | 6.3 | 28.6 | 65.1 |
| 2 | 10.8 | 28.0 | 61.3 |
| 3 | 4.9 | 32.0 | 63.1 |
| 4 | 6.7 | 26.7 | 66.7 |
NLB_data$OtherPay_Use_merged <- ifelse(
NLB_data$OtherPay_Use == 1, "never",
ifelse(NLB_data$OtherPay_Use %in% c(2, 3), "irregular basis", "regular basis"))
# Convert the merged column to a factor with levels in the desired order
NLB_data$OtherPay_Use_merged <- factor(
NLB_data$OtherPay_Use_merged,
levels = c("never", "irregular basis", "regular basis"))
# Calculate frequency by usage of other payment methods
OtherPay_freq <- NLB_data %>%
group_by(Group, OtherPay_Use_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by usage of other payment methods
ggplot(OtherPay_freq, aes(x = Group, y = Percentage, fill = OtherPay_Use_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Other Payment (PayPal, Stripe, etc.) Usage Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Frequency of Other Payment Usage"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Assuming OtherPay_freq is already calculated, reshape it
OtherPay_wide <- OtherPay_freq %>%
select(-Count) %>% # Remove the Count column (optional)
spread(key = OtherPay_Use_merged, value = Percentage) # Spread data across columns
# Create and style the wide format table with borders and grey title row
OtherPay_wide %>%
kable(caption = "Other Payment Methods Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | never | irregular basis | regular basis |
|---|---|---|---|
| 1 | 73.0 | 25.4 | 1.6 |
| 2 | 57.0 | 38.7 | 4.3 |
| 3 | 50.5 | 42.7 | 6.8 |
| 4 | 64.4 | 33.3 | 2.2 |
# Merge cash usage for purchases between 11-99 EUR
NLB_data$Cash_11_99_merged <- ifelse(
NLB_data$Cash_11_99 %in% c(1, 2), "less than half",
ifelse(NLB_data$Cash_11_99 == 3, "half", "more than half")
)
# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_11_99_merged <- factor(
NLB_data$Cash_11_99_merged,
levels = c("less than half", "half", "more than half")
)
# Calculate frequency by cash usage for purchases between 11-99 EUR
Purchases1199_freq <- NLB_data %>%
group_by(Group, Cash_11_99_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by cash usage for purchases between 11-99 EUR
ggplot(Purchases1199_freq, aes(x = Group, y = Percentage, fill = Cash_11_99_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Cash Payment for Purchases Between 11-99 EUR Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Number of Cash Payments (11-99 EUR)"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Reshape Purchases1199_freq to a wide format using pivot_wider()
Purchases_wide <- Purchases1199_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(names_from = Cash_11_99_merged, values_from = Percentage) # Reshape data
# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
kable(caption = "Cash Usage for Purchases (11-99 EUR) Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | less than half | half | more than half |
|---|---|---|---|
| 1 | 79.4 | 15.9 | 4.8 |
| 2 | 76.3 | 12.9 | 10.8 |
| 3 | 87.4 | 6.8 | 5.8 |
| 4 | 66.7 | 17.8 | 15.6 |
# Merge cash usage for purchases over 1000 EUR
NLB_data$Cash_Over1000_merged <- ifelse(
NLB_data$Cash_Over1000 %in% c(1, 2), "less than half",
ifelse(NLB_data$Cash_Over1000 == 3, "half", "more than half")
)
# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_Over1000_merged <- factor(
NLB_data$Cash_Over1000_merged,
levels = c("less than half", "half", "more than half")
)
# Calculate frequency by cash usage for purchases over 1000 EUR
PurchasesOver1000_freq <- NLB_data %>%
group_by(Group, Cash_Over1000_merged) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by cash usage for purchases over 1000 EUR
ggplot(PurchasesOver1000_freq, aes(x = Group, y = Percentage, fill = Cash_Over1000_merged)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Cash Payment for Purchases Over 1000 EUR Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage(%)",
fill = "Number of Cash Payments (Over 1000 EUR)"
) +
scale_fill_manual(values = NLB_colors) + # Use NLB colors for the fill
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Reshape PurchasesOver1000_freq to a wide format using pivot_wider()
Purchases_wide <- PurchasesOver1000_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(names_from = Cash_Over1000_merged, values_from = Percentage) # Reshape data
# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
kable(caption = "Cash Usage for Purchases (Over 1000 EUR) Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | less than half | half | more than half |
|---|---|---|---|
| 1 | 87.3 | 1.6 | 11.1 |
| 2 | 92.5 | 3.2 | 4.3 |
| 3 | 92.2 | 1.0 | 6.8 |
| 4 | 82.2 | 6.7 | 11.1 |
# Calculate percentage by Group
Age_freq <- NLB_data %>%
group_by(Group, AgeF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
NLB_colors2 <- c("#230078", # Deep Blue
"#3A1A8B", # Purple-blue
"#4D33B1", # Lighter purple-blue
"#5F47D8", # Soft lavender
"#84BD00", # Green (from the original)
"#A7C700", # Soft green
"#98FA00", # Light green
"#FA7800", # Orange (from the original)
"#FF9A33", # Lighter orange
"#FFB266") # Light peachy-orange
# Plot the percentage by age
ggplot(Age_freq, aes(x = Group, y = Percentage, fill = as.factor(AgeF))) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Age by Cluster Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage (%)",
fill = "Age"
) +
scale_fill_manual(values = NLB_colors2) + # Apply the new homogeneous color palette
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Assuming Age_freq is already calculated, reshape it
Age_wide <- Age_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = AgeF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
) # Reshape data
# Create and style the wide format table with borders and grey title row
Age_wide %>%
kable(caption = "Age Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4.8 | 3.2 | 14.3 | 9.5 | 12.7 | 23.8 | 14.3 | 9.5 | 6.3 | 1.6 |
| 2 | 1.1 | 7.5 | 8.6 | 8.6 | 15.1 | 24.7 | 8.6 | 20.4 | 2.2 | 3.2 |
| 3 | 4.9 | 9.7 | 12.6 | 7.8 | 13.6 | 15.5 | 10.7 | 13.6 | 4.9 | 6.8 |
| 4 | 13.3 | 4.4 | 22.2 | 2.2 | 17.8 | 22.2 | 4.4 | 8.9 | 4.4 | 0.0 |
# Convert AgeF from factor to numeric
NLB_data$AgeF <- as.numeric(as.character(NLB_data$AgeF))
# Now aggregate and calculate the mean
Age_means <- aggregate(AgeF ~ Group, data = NLB_data, FUN = mean, na.rm = TRUE)
# Print results
print(Age_means)
## Group AgeF
## 1 1 22.47619
## 2 2 22.75269
## 3 3 22.49515
## 4 4 21.62222
# Calculate frequency by Income (Income_LevelF) and Group
income_freq <- NLB_data %>%
group_by(Group, Income_LevelF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Income
ggplot(income_freq, aes(x = Group, y = Percentage, fill = Income_LevelF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Income Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage (%)",
fill = "Income"
) +
scale_fill_manual(values = NLB_colors) + # Apply the previous color palette
theme_minimal() # Using minimal theme
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Reshape the data using pivot_wider and remove NA column
income_wide <- income_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(names_from = Income_LevelF, values_from = Percentage) %>%
select(-any_of("NA")) # Remove the column for missing income levels, if it exists
# Create and style the wide format table with borders and grey title row
income_wide %>%
kable(caption = "Income Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | 0-200 EUR | 201-500 EUR | 501-800 EUR | 801-1300 EUR | Above 1300 EUR |
|---|---|---|---|---|---|
| 1 | 12.7 | 41.3 | 15.9 | 15.9 | 14.3 |
| 2 | 17.2 | 25.8 | 20.4 | 17.2 | 19.4 |
| 3 | 12.6 | 29.1 | 22.3 | 13.6 | 21.4 |
| 4 | 24.4 | 24.4 | 13.3 | 17.8 | 17.8 |
# Create a new column with calculated midpoints of Income Level
NLB_data <- NLB_data %>%
mutate(Income_Mid = case_when(
Income_LevelF == "0-200 EUR" ~ (0 + 200) / 2,
Income_LevelF == "201-500 EUR" ~ (201 + 500) / 2,
Income_LevelF == "501-800 EUR" ~ (501 + 800) / 2,
Income_LevelF == "801-1300 EUR" ~ (800 + 1300) / 2,
Income_LevelF == "Above 1300 EUR" ~ (1300 + 1700) / 2, # Assuming 1300-1700 as range
TRUE ~ NA_real_ # Assign NA for any unexpected values
))
# Print first rows to verify transformation
print(head(NLB_data[, c("Income_LevelF", "Income_Mid")]))
## # A tibble: 6 × 2
## Income_LevelF Income_Mid
## <fct> <dbl>
## 1 201-500 EUR 350.
## 2 201-500 EUR 350.
## 3 501-800 EUR 650.
## 4 201-500 EUR 350.
## 5 801-1300 EUR 1050
## 6 201-500 EUR 350.
# Compute mean income for each cluster
Income_means <- aggregate(Income_Mid ~ Group, data = NLB_data, FUN = mean, na.rm = TRUE)
# Print results
print(Income_means)
## Group Income_Mid
## 1 1 641.5556
## 2 2 711.5215
## 3 3 730.1618
## 4 4 664.9659
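The cluster means above are averages of bracket midpoints, so they depend on the midpoint assumed for the open-ended top bracket. The sketch below works through Group 1 as an example. The counts are back-calculated from the income percentage table and the cluster size of 63, so they are an inference rather than raw data; the alternative upper bound of 2300 EUR is purely hypothetical and serves only to show the sensitivity.

```r
# Bracket midpoints as defined in the recoding step above
midpoints <- c("0-200 EUR"      = (0 + 200) / 2,
               "201-500 EUR"    = (201 + 500) / 2,
               "501-800 EUR"    = (501 + 800) / 2,
               "801-1300 EUR"   = (800 + 1300) / 2,
               "Above 1300 EUR" = (1300 + 1700) / 2)

# Group 1 counts, back-calculated from the reported percentages (inference)
counts_g1 <- c(8, 26, 10, 10, 9)

# Mean income as a count-weighted average of the midpoints
mean_g1 <- weighted.mean(midpoints, counts_g1)
print(mean_g1)

# Sensitivity check with a hypothetical wider top bracket (1300-2300 EUR)
midpoints_alt <- midpoints
midpoints_alt["Above 1300 EUR"] <- (1300 + 2300) / 2
mean_g1_alt <- weighted.mean(midpoints_alt, counts_g1)
print(mean_g1_alt)
```

With the original midpoints the weighted average agrees with the 641.56 EUR reported for Group 1. Differences of a few tens of euros between clusters should therefore be interpreted with caution, since a different assumption for the top bracket shifts every cluster mean.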
# Calculate frequency by Employment status (Status_Employment) and Group
status_freq <- NLB_data %>%
group_by(Group, Status_EmploymentF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by status
ggplot(status_freq, aes(x = Group, y = Percentage, fill = Status_EmploymentF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Employment Status Among Young People (Aged 18-27)",
x = "Group",
y = "Percentage (%)",
fill = "Employment status"
) +
scale_fill_manual(values = NLB_colors) + # Apply the NLB colors palette
theme_minimal() # Keep the minimal theme
# Print results
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Reshape the data using pivot_wider
status_wide <- status_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = Status_EmploymentF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
) # Reshape data to wide format
# Create and style the wide format table with borders and grey title row
status_wide %>%
kable(caption = "Employment Status Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | Student | Employed | Self-employed | Other | Unemployed |
|---|---|---|---|---|---|
| 1 | 82.5 | 14.3 | 1.6 | 1.6 | 0.0 |
| 2 | 82.8 | 15.1 | 2.2 | 0.0 | 0.0 |
| 3 | 72.8 | 17.5 | 3.9 | 2.9 | 2.9 |
| 4 | 71.1 | 17.8 | 0.0 | 11.1 | 0.0 |
# Reclassify Employment Status
NLB_data <- NLB_data %>%
mutate(Status_EmploymentF = case_when(
Status_EmploymentF == "Employed" ~ "With Job",
Status_EmploymentF == "Self-employed" ~ "With Job",
TRUE ~ Status_EmploymentF # Keep other categories unchanged
))
# Recalculate the frequency by employment status and group
status_freq <- NLB_data %>%
group_by(Group, Status_EmploymentF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Print only the percentages for "Student" and "With Job"
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Filter and reshape the data for "Student" and "With Job"
status_wide_filtered <- status_freq %>%
filter(Status_EmploymentF %in% c("Student", "With Job")) %>% # Filter specific statuses
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(names_from = Status_EmploymentF, values_from = Percentage) # Reshape data to wide format
# Create and style the wide format table
status_wide_filtered %>%
kable(caption = "Percentage of Students and Individuals With Jobs by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | Student | With Job |
|---|---|---|
| 1 | 82.5 | 15.9 |
| 2 | 82.8 | 17.2 |
| 3 | 72.8 | 21.4 |
| 4 | 71.1 | 17.8 |
# Calculate frequency by Education (Q24F) and Group
education_freq <- NLB_data %>%
group_by(Group, EducationF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Education
ggplot(education_freq, aes(x = Group, y = Percentage, fill = EducationF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Education Level among Young People (18-27 y.o.)",
x = "Group",
y = "Percentage",
fill = "Education level"
) +
scale_fill_manual(values = NLB_colors) + # Apply the NLB colors palette
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)
# Reshape the data using pivot_wider
education_wide <- education_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = EducationF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
) # Reshape data to wide format
# Create and style the wide format table
education_wide %>%
kable(caption = "Education Level Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | High School | Bachelor Degree | Master Degree | PhD | Primary school | Vocational education |
|---|---|---|---|---|---|---|
| 1 | 46.0 | 41.3 | 12.7 | 0.0 | 0.0 | 0.0 |
| 2 | 44.1 | 32.3 | 21.5 | 2.2 | 0.0 | 0.0 |
| 3 | 39.8 | 31.1 | 24.3 | 0.0 | 2.9 | 1.9 |
| 4 | 46.7 | 24.4 | 17.8 | 0.0 | 11.1 | 0.0 |
# Calculate frequency by Response
response_freq <- NLB_data %>%
group_by(Group, NoDigital_ResponseF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Response
ggplot(response_freq, aes(x = Group, y = Percentage, fill = NoDigital_ResponseF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Responses to Merchants Not Accepting Digital Payments - Young People (18-27 y.o.)",
x = "Group",
y = "Percentage",
fill = "Response"
) +
scale_fill_manual(values = NLB_colors) + # Apply the NLB colors palette
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Reshape the data using pivot_wider and replace NA with 0
response_wide <- response_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = NoDigital_ResponseF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
)
# Create and style the wide format table
response_wide %>%
kable(caption = "Responses to Merchants Not Accepting Digital Payments by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | Pay digital elsewhere | Pay as available | Never occurred | Other |
|---|---|---|---|---|
| 1 | 14.3 | 63.5 | 12.7 | 9.5 |
| 2 | 22.6 | 64.5 | 7.5 | 5.4 |
| 3 | 15.5 | 62.1 | 18.4 | 3.9 |
| 4 | 13.3 | 68.9 | 17.8 | 0.0 |
# Calculate frequency by Banks
Banks_freq <- NLB_data %>%
group_by(Group, Primary_BankF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Bank
ggplot(Banks_freq, aes(x = Group, y = Percentage, fill = Primary_BankF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Primary Banks Used by Young People (18-27 y.o.)",
x = "Group",
y = "Percentage",
fill = "Bank"
) +
scale_fill_manual(values = NLB_colors) + # Apply the NLB colors palette
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Reshape the data using pivot_wider and replace NA with 0
banks_wide <- Banks_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = Primary_BankF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
)
# Create and style the wide format table
banks_wide %>%
kable(caption = "Distribution of Primary Banks Used by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | NLB | OTP | Intesa Sanpaolo | Addiko Bank | Delavska Hranilnica | Other | Sparkasse |
|---|---|---|---|---|---|---|---|
| 1 | 39.7 | 36.5 | 9.5 | 1.6 | 3.2 | 9.5 | 0.0 |
| 2 | 31.2 | 36.6 | 7.5 | 2.2 | 5.4 | 11.8 | 5.4 |
| 3 | 48.5 | 34.0 | 2.9 | 2.9 | 7.8 | 2.9 | 1.0 |
| 4 | 37.8 | 37.8 | 2.2 | 4.4 | 8.9 | 6.7 | 2.2 |
# Calculate frequency by Gender
Gender_freq <- NLB_data %>%
group_by(Group, GenderF) %>%
summarise(Count = n(), .groups = 'drop') %>%
group_by(Group) %>%
mutate(Percentage = Count / sum(Count) * 100)
# Plot the frequency by Gender
ggplot(Gender_freq, aes(x = Group, y = Percentage, fill = GenderF)) +
geom_bar(stat = "identity", position = "stack") +
labs(
title = "Distribution of Gender among Young People (18-27 y.o.)",
x = "Group",
y = "Percentage",
fill = "Gender"
) +
scale_fill_manual(values = NLB_colors) + # Apply the NLB colors palette
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Rotate x-axis labels if needed
# Reshape the data using pivot_wider and replace NA with 0
gender_wide <- Gender_freq %>%
select(-Count) %>% # Remove the Count column (optional)
pivot_wider(
names_from = GenderF,
values_from = Percentage,
values_fill = list(Percentage = 0) # Replace NA with 0
)
# Create and style the wide format table
gender_wide %>%
kable(caption = "Gender Distribution by Group (in %)", digits = 1) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>% # Style header row
column_spec(1, bold = TRUE) %>% # Make the first column bold (Group column)
kable_styling(bootstrap_options = "bordered") # Add borders around the table
| Group | Man | Woman | Other | Don’t want to answer |
|---|---|---|---|---|
| 1 | 33.3 | 65.1 | 1.6 | 0.0 |
| 2 | 40.9 | 58.1 | 0.0 | 1.1 |
| 3 | 40.8 | 59.2 | 0.0 | 0.0 |
| 4 | 42.2 | 57.8 | 0.0 | 0.0 |
Group demographics: The average age in this group is 22.5 years. The gender distribution is slightly unbalanced, with 65.1% women and 33.3% men. The largest share of individuals have completed high school (46.0%), followed by a bachelor’s degree (41.3%). A relative majority use NLB as their primary bank (39.7%), and the average income is 641.56 EUR.
Group behaviors: Regarding mobile payments, 47.6% of people in this group use mobile payment methods regularly, and 65.1% of payments are made by card.
Cash is used for more than half of small purchases (up to 10 EUR) in only 11.1% of cases, for medium purchases (11–99 EUR) in 4.8% of cases, and for large purchases (100–1000 EUR) in 11.1% of cases.
These individuals prioritize security and privacy, particularly when it comes to transactions involving larger sums or unfamiliar methods. While they are open to digital payments, they remain cautious and conservative in their payment preferences, sticking to well-known and reliable forms like cash and debit cards. This group’s higher reliance on traditional payment methods may reflect broader generational trust issues with new technologies, particularly with respect to data privacy.
Group demographics: The average age in this group is 22.8 years. The gender distribution is slightly unbalanced, with 58.1% women and 40.9% men. The largest share of individuals have completed high school (44.1%), followed by a bachelor’s degree (32.3%) and a master’s degree (21.5%). The primary banks used by this group are OTP (36.6%) and NLB (31.2%), and the average income is 711.52 EUR.
Group behaviors: Regarding mobile payments, 64.5% of people in this group use mobile payment methods regularly, and 61.3% of payments are made by card.
Cash is used for more than half of small purchases (up to 10 EUR) in 19.4% of cases, for medium purchases (11–99 EUR) in 10.8% of cases, and for large purchases (100–1000 EUR) in 4.3% of cases.
The Tech-Savvy Enthusiasts are the early adopters of digital innovations. They prioritize convenience, flexibility, and speed, which digital payment solutions like mobile payments and neobanks deliver. This group is more comfortable embracing new payment systems and frequently uses them to support their on-the-go, fast-paced lifestyles. Their high engagement with mobile payments (64.5% use regularly) showcases their affinity for technology and digital banking solutions.
Group demographics: The average age in this group is 22.5 years. The gender distribution is slightly unbalanced, with 59.2% women and 40.8% men. The largest share of individuals have completed high school (39.8%), followed by a bachelor’s degree (31.1%) and a master’s degree (24.3%). The most common bank in this group is NLB (48.5%), and the average income is 730.2 EUR.
Group behaviors: Regarding mobile payments, 68.9% of people in this group use mobile payment methods regularly, and 63.1% of payments are made by card.
Cash is used for more than half of small purchases (up to 10 EUR) in 7.8% of cases, for medium purchases (11–99 EUR) in 5.8% of cases, and for large purchases (100–1000 EUR) in 6.8% of cases.
The Pragmatic Minimalists are skeptical of excessive novelty but still open to using digital payment methods that are straightforward and well-accepted. They are not as enthusiastic as the Tech-Savvy Enthusiasts, but they prefer the familiarity and simplicity of physical cards and cash for transactions. This group likely seeks functionality and efficiency without unnecessary complexity.
Group demographics: The average age in this group is 21.6 years, the youngest among all clusters. The gender distribution leans toward women (57.8%), with 42.2% men. The largest share of individuals have completed high school (46.7%), while 24.4% have a bachelor’s degree and 17.8% have a master’s degree. The most popular banks in this group are OTP (37.8%) and NLB (37.8%), and the average income is 665.0 EUR.
Group behaviors: Regarding mobile payments, 40.0% of people in this group use mobile payment methods regularly, and 66.7% of payments are made by card.
Cash use for small and medium purchases is the highest of all clusters: cash is used for more than half of small purchases (up to 10 EUR) in 28.9% of cases, for medium purchases (11–99 EUR) in 15.6% of cases, and for large purchases (100–1000 EUR) in 11.1% of cases.
The Skeptical Traditionalists show resistance to change, preferring more familiar, traditional payment methods. Their inclination towards cash and reluctance toward mobile payments or neobanks likely stems from their concerns about security and privacy. They have a strong belief in having control over their spending (mean: -0.22 for Spending Control), which they feel is best achieved with tangible forms of payment like cash or debit cards.
library(pastecs)
NLB_PCA <- NLB_data[ , c("Safe_Cash", "Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank",
  "Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", "Easy_Neobank",
  "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard", "Accept_PhonePay", "Accept_Neobank",
  "Fast_Cash", "Fast_DebitCard", "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank",
  "Private_Cash", "Private_DebitCard", "Private_CreditCard", "Private_PhonePay", "Private_Neobank",
  "Control_Cash", "Control_DebitCard", "Control_CreditCard", "Control_PhonePay", "Control_Neobank")]
library(dplyr)
NLB_PCA <- NLB_PCA %>%
rename(
Cash_Security = Safe_Cash,
Debit_Security = Safe_DebitCard,
Credit_Security = Safe_CreditCard,
Mobile_Security = Safe_PhonePay,
NeoBanks_Security = Safe_Neobank,
Cash_Ease = Easy_Cash,
Debit_Ease = Easy_DebitCard,
Credit_Ease = Easy_CreditCard,
Mobile_Ease = Easy_PhonePay,
NeoBanks_Ease = Easy_Neobank,
Cash_Availability = Accept_Cash,
Debit_Availability = Accept_DebitCard,
Credit_Availability = Accept_CreditCard,
Mobile_Availability = Accept_PhonePay,
NeoBanks_Availability = Accept_Neobank,
Cash_Speed = Fast_Cash,
Debit_Speed = Fast_DebitCard,
Credit_Speed = Fast_CreditCard,
Mobile_Speed = Fast_PhonePay,
NeoBanks_Speed = Fast_Neobank,
Cash_Privacy = Private_Cash,
Debit_Privacy = Private_DebitCard,
Credit_Privacy = Private_CreditCard,
Mobile_Privacy = Private_PhonePay,
NeoBanks_Privacy = Private_Neobank,
Cash_Control = Control_Cash,
Debit_Control = Control_DebitCard,
Credit_Control = Control_CreditCard,
Mobile_Control = Control_PhonePay,
NeoBanks_Control = Control_Neobank)
library(tibble)
perceptual <- NLB_PCA %>%
pivot_longer(everything(), names_to = "name", values_to = "score") %>%
separate(name, into = c("Payment method", "Variable"), sep = "_")%>%
pivot_wider(names_from = Variable, values_from = score, values_fn = mean) %>%
column_to_rownames(var = "Payment method")
print(perceptual)
## Security Ease Availability Speed Privacy Control
## Cash 5.503289 5.101974 6.226974 4.075658 5.680921 4.851974
## Debit 5.138158 6.148026 6.483553 6.174342 4.457237 5.154605
## Credit 4.947368 6.019737 6.335526 6.121711 4.457237 5.006579
## Mobile 5.174342 6.480263 6.029605 6.526316 4.394737 5.161184
## NeoBanks 4.825658 5.592105 5.075658 5.684211 4.483553 5.115132
## Dim.1 Dim.2
## Security -0.7563320 0.534605714
## Ease 0.8659604 0.465478553
## Availability -0.1808968 0.942766082
## Speed 0.9753947 0.204881709
## Privacy -0.9934258 0.026725632
## Control 0.9275822 -0.001610424
This Principal Component Analysis (PCA) biplot provides a visual representation of financial payment methods and how they relate to key perceptual factors such as security, privacy, speed, availability, control, and ease of use.
This perception map illustrates the trade-offs in different payment methods:
This visualization helps us understand consumer preferences when choosing payment methods and their perceived benefits and drawbacks.
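The code that produced the `Dim.1` and `Dim.2` coordinates is not shown in the report. As a rough sketch of how such a perceptual map can be obtained, the block below runs a standardised PCA on the table of mean scores printed above and computes the correlation of each attribute with the first two components. It uses base R `prcomp` as an assumption (the `Dim.` naming suggests a different package may have been used originally), and the signs of the components may be flipped relative to the reported output.

```r
# Mean scores per payment method, copied from the printed `perceptual` table
perceptual <- data.frame(
  Security     = c(5.503289, 5.138158, 4.947368, 5.174342, 4.825658),
  Ease         = c(5.101974, 6.148026, 6.019737, 6.480263, 5.592105),
  Availability = c(6.226974, 6.483553, 6.335526, 6.029605, 5.075658),
  Speed        = c(4.075658, 6.174342, 6.121711, 6.526316, 5.684211),
  Privacy      = c(5.680921, 4.457237, 4.457237, 4.394737, 4.483553),
  Control      = c(4.851974, 5.154605, 5.006579, 5.161184, 5.115132),
  row.names    = c("Cash", "Debit", "Credit", "Mobile", "NeoBanks")
)

# PCA on centred and scaled attributes (assumption: base R prcomp)
pca <- prcomp(perceptual, center = TRUE, scale. = TRUE)

# Correlation of each attribute with the first two components;
# these play the role of variable coordinates on the map
var_coord <- cor(perceptual, pca$x[, 1:2])
print(round(var_coord, 3))

# Payment methods (points) and attributes (arrows) in one plot
biplot(pca)
```

Attributes with correlations of opposite sign on the first component pull the payment methods to opposite sides of the map, which is how the trade-offs between methods can be read from the plot.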
Let’s check whether the assumptions for a parametric test are met:
n * 𝜋 > 5 -> 304 * 0.5 = 152 > 5
n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5
Both assumptions are met, so we can proceed with the parametric test, the test of population proportion.
H0: π = 0.5
H1: π > 0.5
## [1] 281
##
## 1-sample proportions test without continuity correction
##
## data: 281 out of 304, null probability 0.5
## X-squared = 218.96, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.8954807 1.0000000
## sample estimates:
## p
## 0.9243421
We reject H0 at p < 0.001: the proportion of young people who still use cash at least once a month is larger than 50%.
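The call behind this output is not shown in the report. A minimal sketch that reproduces the reported statistic, assuming the counts 281 and 304 from the output and base R `prop.test`, could look as follows (the variable names `x`, `n`, `p0` and `res` are illustrative):

```r
# Counts taken from the printed output: 281 of 304 respondents
x  <- 281
n  <- 304
p0 <- 0.5   # proportion under H0

# Rule-of-thumb check for the normal approximation
stopifnot(n * p0 > 5, n * (1 - p0) > 5)

# One-sided test of a population proportion, no continuity correction
res <- prop.test(x = x, n = n, p = p0,
                 alternative = "greater", correct = FALSE)
print(res)
```

With `alternative = "greater"` the confidence interval is one-sided, which is why its upper bound is 1.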
To test this, we will do a test of population proportion. Let’s check if assumptions are met.
n * 𝜋> 5 -> 304 * 0.5 = 152 > 5
n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5
Assumptions are met, so we can proceed with the test of population proportion.
H0: 𝜋 = 0.5
H1: 𝜋 > 0.5
## [1] 270
##
## 1-sample proportions test without continuity correction
##
## data: 270 out of 304, null probability 0.5
## X-squared = 183.21, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.8549349 1.0000000
## sample estimates:
## p
## 0.8881579
We reject H0 at p < 0.001. We have found that more than 50% use cash for their small payments (under 10 EUR) at least some of the time.
Let’s run the same test to check whether the majority of people use cash for less than half of their small payments.
H0: 𝜋 = 0.5
H1: 𝜋 > 0.5
## [1] 169
##
## 1-sample proportions test without continuity correction
##
## data: 169 out of 304, null probability 0.5
## X-squared = 3.8026, df = 1, p-value = 0.02559
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.5087589 1.0000000
## sample estimates:
## p
## 0.5559211
We reject H0 at p = 0.026. We have found that the proportion of people who use cash for less than half of their small payments is greater than 50%.
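Because this result is much closer to the 5% threshold than the previous ones, it can help to see where the statistic comes from. The sketch below recomputes it by hand from the counts in the output (169 of 304), using the normal approximation without continuity correction. The variable names are illustrative.

```r
x  <- 169   # respondents in the category, from the output
n  <- 304   # sample size
p0 <- 0.5   # proportion under H0

p_hat <- x / n                                # sample proportion
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # z statistic under H0
chi_sq  <- z^2                                # X-squared with df = 1
p_value <- pnorm(z, lower.tail = FALSE)       # one-sided p-value for H1: pi > 0.5

cat("p_hat =", round(p_hat, 4),
    " z =", round(z, 3),
    " X-squared =", round(chi_sq, 4),
    " p-value =", round(p_value, 5), "\n")
```

The squared z statistic equals the X-squared value reported by the test, and the one-sided p-value is the upper tail of the standard normal distribution at z.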
NLB_data_save$Save_Form <-ifelse(test = NLB_data_save$Save_Form == -2,
yes = NA,
no = NLB_data_save$Save_Form)
## [1] 34
NLB_data_save$Save_FormF<- factor(NLB_data_save$Save_Form,
levels = c(1, 2, 3, 4, 5, 6, 7),
labels = c("Fully in cash", "Up to 25% in cash", "25% and 50% in cash", "About 50% in cash, 50% digital", "25% to 50% digital", "Up to 25% digital", "Fully digital"))
library(ggplot2)
library(dplyr)
# Calculate percentages
data_percent <- NLB_data_save %>%
count(Save_FormF) %>%
mutate(percentage = (n / sum(n)) * 100)
# Create the bar plot with NLB colors
ggplot(data_percent, aes(x = Save_FormF, y = percentage)) +
geom_bar(stat = "identity", fill = NLB_colors[1], color = "black") + # Apply NLB Indigo Blue color
labs(title = "How do you save?",
x = "Preference",
y = "Percentage") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
There are 233 people in our sample who save. Based on the frequency graph and the calculations, only 34 of them (14.6%) keep more than 50% of their savings in cash. We therefore reject our hypothesis that young people who save mostly do so in cash.
To test this, we will do a test of population proportion. Let’s check if assumptions are met.
n * 𝜋> 5 -> 304 * 0.5 = 152 > 5
n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5
Assumptions are met, so we can proceed with the test of population proportion.
H0: 𝜋 = 0.5
H1: 𝜋 > 0.5
## [1] 207
##
## 1-sample proportions test without continuity correction
##
## data: 207 out of 304, null probability 0.5
## X-squared = 39.803, df = 1, p-value = 1.405e-10
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.6355172 1.0000000
## sample estimates:
## p
## 0.6809211
We reject H0 at p<0.001. We have found that more than 50% of young people somewhat agree, agree or completely agree that they tend to use money in the same form as they receive it.
## # A tibble: 6 × 2
## Importance_Ease Importance_Security
## <dbl> <dbl>
## 1 7 6
## 2 6 7
## 3 7 6
## 4 7 7
## 5 7 7
## 6 5 7
## tibble [304 × 2] (S3: tbl_df/tbl/data.frame)
## $ Importance_Ease : num [1:304] 7 6 7 7 7 5 6 6 4 7 ...
## $ Importance_Security: num [1:304] 6 7 6 7 7 7 5 7 7 7 ...
## Importance_Ease Importance_Security
## Min. :1.000 Min. :1.000
## 1st Qu.:6.000 1st Qu.:5.000
## Median :7.000 Median :7.000
## Mean :6.105 Mean :6.076
## 3rd Qu.:7.000 3rd Qu.:7.000
## Max. :7.000 Max. :7.000
# Calculate the mean of the first column, Importance_Ease (Enostavnost = ease of use)
mean_enostavnost <- mean(NLB_R6H6[[1]], na.rm = TRUE)
print(mean_enostavnost)
## [1] 6.105263
# Calculate the mean of the second column, Importance_Security (Varnost = security)
mean_varnost <- mean(NLB_R6H6[[2]], na.rm = TRUE)
print(mean_varnost)
## [1] 6.075658
##
## Shapiro-Wilk normality test
##
## data: NLB_R6H6$diffs
## W = 0.8771, p-value = 6.971e-15
Since the p-value is much smaller than 0.05, we reject the null hypothesis that the differences are normally distributed.
This means the data does not follow a normal distribution, which justifies using a non-parametric test like the Wilcoxon Signed-Rank test instead of a paired t-test.
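The output above refers to a `diffs` column whose construction is not shown. A plausible reading is that it holds the paired differences between the two ratings. The sketch below illustrates that step on simulated 7-point ratings; the data frame `toy` and its sampling probabilities are purely hypothetical stand-ins for `NLB_R6H6`, so the numbers it prints will not match the report.

```r
set.seed(1)

# Hypothetical stand-in for NLB_R6H6: 304 ratings on a 1-7 scale,
# skewed towards the top of the scale as in the summary above
likert_prob <- c(1, 1, 2, 3, 8, 20, 65)
toy <- data.frame(
  Importance_Ease     = sample(1:7, 304, replace = TRUE, prob = likert_prob),
  Importance_Security = sample(1:7, 304, replace = TRUE, prob = likert_prob)
)

# Paired differences: the quantity whose normality matters for a paired t-test
toy$diffs <- toy$Importance_Ease - toy$Importance_Security

# Shapiro-Wilk test of normality on the differences
sw <- shapiro.test(toy$diffs)
print(sw)
```

For a paired comparison the normality assumption concerns the differences, not the two ratings separately, which is why the test is applied to `diffs`.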
wilcox.test(
NLB_R6H6[[1]],
NLB_R6H6[[2]],
paired = TRUE,
correct = FALSE,
exact = FALSE,
alternative = "two.sided")
##
## Wilcoxon signed rank test
##
## data: NLB_R6H6[[1]] and NLB_R6H6[[2]]
## V = 5908, p-value = 0.537
## alternative hypothesis: true location shift is not equal to 0
p-value = 0.537
Since the p-value is greater than 0.05, we fail to reject the null hypothesis.
This indicates that there is no statistically significant difference between the two variables: the importance of ease of use (Enostavnost) and the importance of security (Varnost).
In the context of our hypothesis, this means that we do not have enough evidence to say that convenience is a significantly stronger factor than security in influencing digital payment adoption among young people.
The Wilcoxon test results indicate that there is no statistically significant difference between the perceived importance of convenience and security (p = 0.537). This means that we fail to reject the null hypothesis, suggesting that convenience is not significantly more important than security in influencing digital payment adoption among young people (18-27 years old).
Therefore, our data does not support the hypothesis that convenience is a more important factor than security in digital payment adoption. Both factors appear to have similar levels of importance for young users.
NLB_R4H4$Q22F <- factor(NLB_R4H4$Switch_Digital,
levels = c(1, 2, 3, 4),
labels = c("Fully digital", "Balance digital-cash", "Cash", "Don't know"))
library(ggplot2)
library(dplyr)
# Calculate percentages
data_percent <- NLB_R4H4 %>%
count(Q22F) %>%
mutate(percentage = (n / sum(n)) * 100)
# Create the bar plot with percentages
ggplot(data_percent, aes(x = Q22F, y = percentage)) +
geom_bar(stat = "identity", fill = NLB_colors[1], color = "black") + # Apply NLB Blue color
labs(title = "Would you switch to a fully digital society?",
x = "Preference",
y = "Percentage") +
theme_minimal()
## # A tibble: 4 × 3
## Q22F n percentage
## <fct> <int> <dbl>
## 1 Fully digital 78 25.7
## 2 Balance digital-cash 171 56.2
## 3 Cash 44 14.5
## 4 Don't know 11 3.62
From the graph, we can clearly see that the largest share of our respondents answered that they would consider switching to a mostly digital society, but they do not want to give up cash completely.
To further test this, we will do a test of population proportion. Let’s check if assumptions are met.
n * 𝜋> 5 -> 304 * 0.5 = 152 > 5
n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5
Assumptions are met, so we can proceed with the test of population proportion.
H0: 𝜋 = 0.5
H1: 𝜋 > 0.5
##
## 1-sample proportions test without continuity correction
##
## data: 171 out of 304, null probability 0.5
## X-squared = 4.75, df = 1, p-value = 0.01465
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.5153528 1.0000000
## sample estimates:
## p
## 0.5625
We reject H0 at p = 0.015. The population proportion of people who favour digital payments but are not prepared to give up cash completely is larger than 50%.
## # A tibble: 6 × 1
## Reason_ExactSum
## <chr>
## 1 1
## 2 1
## 3 1
## 4 1
## 5 1
## 6 1
## tibble [304 × 1] (S3: tbl_df/tbl/data.frame)
## $ Reason_ExactSum: chr [1:304] "1" "1" "1" "1" ...
col_name <- colnames(NLB_R8H8)[1]
NLB_R8H8$Reason_ExactSum <- factor(NLB_R8H8$Reason_ExactSum,
levels = c(1, 0, -2),
labels = c("Transfer whole amounts", "Other reasons", "Not applicable"))
library(dplyr)
NLB_R8H8 %>% count(Reason_ExactSum)
## # A tibble: 3 × 2
## Reason_ExactSum n
## <fct> <int>
## 1 Transfer whole amounts 236
## 2 Other reasons 28
## 3 Not applicable 40
To test this, we will do a test of population proportion. Let’s check if assumptions are met.
n * 𝜋> 5 -> 264 * 0.5 = 132 > 5
n(1-𝜋) > 5 -> 264 * 0.5 = 132 > 5
Assumptions are met, so we can proceed with the test of population proportion.
H0: 𝜋 = 0.5
H1: 𝜋 > 0.5
##
## 1-sample proportions test without continuity correction
##
## data: 236 out of 264, null probability 0.5
## X-squared = 163.88, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
## 0.8586738 1.0000000
## sample estimates:
## p
## 0.8939394
Since p-value < 0.05, we reject the null hypothesis H₀. This means there is strong statistical evidence that the true population proportion is greater than 50%.
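Unlike the earlier tests, this one uses n = 264 rather than 304, because the 40 "Not applicable" answers are left out of the denominator. A short sketch of that step, assuming the counts from the frequency table above and base R `prop.test` (the names `counts`, `n_valid` and `res` are illustrative):

```r
# Frequencies copied from the count() output above
counts <- c("Transfer whole amounts" = 236,
            "Other reasons"          = 28,
            "Not applicable"         = 40)

# Valid answers only: drop "Not applicable" from the denominator
n_valid <- sum(counts[c("Transfer whole amounts", "Other reasons")])

# One-sided test of H0: pi = 0.5 against H1: pi > 0.5
res <- prop.test(x = counts[["Transfer whole amounts"]], n = n_valid,
                 p = 0.5, alternative = "greater", correct = FALSE)
print(res)
```

Keeping the 40 excluded answers in the denominator would lower the estimated proportion to 236 / 304, so the choice of n should be stated alongside the result.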
Monthly Cash Usage
Cash Usage Out of Necessity (Lack of Digital Alternatives)
Small Payments (Under €10)
Savings in Physical Cash
Consistency Between Income Form and Spending Form
Convenience vs. Security
“Cashless” Society, But Not Completely
Splitting Expenses Among Friends
5.1.1.5 Social Influence on Financial Decisions
Social_Friends and 4 for Social_Family, it indicates that social circles play a critical role in shaping financial behavior, particularly in the areas of trust and digital adoption.