library(readr)
library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Load the CSV file
NLB_data <- read.csv("~/Desktop/NLB/nlb_data.csv", sep = ";")

# Remove columns 2 to 8 and 101 to 116
NLB_data <- NLB_data %>% 
  select(-c(2:8, 101:116))

#Remove first row
NLB_data <- NLB_data[-1, ]

head(NLB_data)

##   status Q1 Q3a Q3b Q3c Q3d Q4a Q4b Q4c Q4d Q5 Q5_4_text Q6a Q6b Q6c Q6d Q6e
## 2      6  7   3   2   5   2   4   3   2   1  2        -2   3   2   1   1   3
## 3      6  9   2   5   1   1   2   1   1   1  1        -2   4   2   1   4   4
## 4      6  9   2   4   4   1   2   2   1   1  2        -2   3   3   1   4   4
## 5      6  7   3   5   5   1   2   2   1   1  1        -2   3   3   2   4   4
## 6      6  7   2   5   5   1   2   1   1   1  1        -2   4   3   3   4   4
## 7      6  9   3   3   1   2   4   3   2   1  2        -2   3   4   1   1   3
##   Q6f Q6g Q7 Q8a Q9a Q9b Q9c Q9d Q9e Q9f Q9g Q9h Q31_2a Q31_2b Q31_2c Q31_2d
## 2   3   3  1   5   6   3   5   6   4   7   6   7      7      7      6      6
## 3   4   3  1   7   7   6   3   6   3   7   1   6      6      7      5      7
## 4   4   2  1   7   6   2   3   6   3   7   1   7      7      6      6      6
## 5   3   4  1   7   6   5   6   7   4   7   1   7      7      7      7      7
## 6   4   4  2  -2   7   4   7   4   4   7   4   7      7      7      7      7
## 7   3   4  2  -2   7   3   5   7   7   5   6   5      5      3      7      7
##   Q31_2e Q31_2f Q10a Q10b Q10c Q10d Q10e Q11a Q11b Q11c Q11d Q11e Q12a Q12b
## 2      3      6    6    6    6    7    6    5    6    6    7    6    6    7
## 3      7      6    3    7    7    3    6    7    7    7    6    6    6    6
## 4      4      5    6    6    4    6    6    4    7    7    7    7    7    7
## 5      7      7    6    5    4    6    6    3    6    6    7    7    6    7
## 6      7      7    7    4    4    4    4    7    7    7    7    7    7    7
## 7      4      4    7    6    6    5    5    6    7    7    6    6    6    7
##   Q12c Q12d Q12e Q13a Q13b Q13c Q13d Q13e Q14a Q14b Q14c Q14d Q14e Q15a Q15b
## 2    7    7    6    5    6    6    7    7    7    6    5    5    6    7    5
## 3    6    6    6    5    7    7    7    7    7    1    1    1    1    2    7
## 4    7    7    7    4    7    7    7    7    6    5    5    5    5    5    7
## 5    7    6    6    1    6    6    7    7    7    4    4    4    4    6    5
## 6    7    7    4    7    7    7    7    7    4    7    7    7    7    4    7
## 7    7    6    6    2    7    7    6    6    5    7    7    7    7    7    6
##   Q15c Q15d Q15e Q16a Q16b Q17 Q17_5_text Q18a Q18b Q18c Q18d Q18e Q18e_text
## 2    5    3    3    5    2   2         -2    1    0    1    0    0        -2
## 3    4    1    6    4    4   2         -2    1    1    1    0    0        -2
## 4    7    7    7    4    4   2         -2    1    1    1    1    0        -2
## 5    1    5    5    7    7   2         -2    1    1    1    1    0        -2
## 6    7    7    7    4    4   2         -2    1    1    1    1    0        -2
## 7    5    3    5    3    5   2         -2    1    0    0    0    0        -2
##   Q19a Q19b Q19c Q19d Q20 Q21a Q21b Q21c Q21d Q21e Q22 Q23 Q24 Q25 Q25_5_text
## 2    3    5    3    3   2    0    0    1    0    0   2   1   6   1         -2
## 3    5    5    6    7   3    0    0    1    0    0   2   2   5   1         -2
## 4    3    3    3    3   4   -2   -2   -2   -2   -2   2   2   4   1         -2
## 5    5    5    5    6   4   -2   -2   -2   -2   -2   1   2   4   1         -2
## 6    7    7    7    7   1   -1   -1   -1   -1   -1   1   2   4   1         -2
## 7    5    4    6    6   2    1    0    0    0    0   4   2   6   1         -2
##   Q26 Q27 Q27_7_text
## 2   2   2         -2
## 3   2   2         -2
## 4   3   7  Unicredit
## 5   2   1         -2
## 6   4   1         -2
## 7   2   1         -2

1 Dataset overview

Unit of observation: Each row represents a single respondent from the questionnaire, which was completed by young people aged 18 to 27.
Sample size: The dataset includes responses from a total of 440 young individuals.

1.1 Variables description

Status: status of the questionnaire (6 = valid, 5 = invalid)

Q1: Age (coded 2 to 11 for ages 18 to 27; codes 1 and 12 mark respondents under 18 or over 27)

Q3: How many times a month do you use the following payment methods on average? (1 = never, 2 = 1-3 times a month, 3 = once a week, 4 = several times a week, 5 = every day)

Q3a: Cash
Q3b: Debit or Credit (physical) cards
Q3c: Phone
Q3d: Other (PayPal, Stripe…)

Q4: How often do you use cash payment for the following purchases? (1 = never, 2 = 1-3 times a month, 3 = once a week, 4 = several times a week, 5 = every day)

Q4a: Purchases up to €10
Q4b: Purchases between €11 and €99
Q4c: Purchases between €100 and €1000
Q4d: Purchases over €1000

Q5: How do you usually respond if a merchant does not accept digital payments?

1: I prefer to go elsewhere and pay with a digital payment method.
2: I pay in the form available, even if it means withdrawing cash from an ATM.
3: I have not encountered such a situation yet.
4: Other (please specify).

Q6: Which of the following income sources do you have, and how do you receive them? (1 = fully cash, 2 = half cash/half digital, 3 = fully digital, 4 = don’t get income from this source, 5 = don’t want to answer)

Q6a: Salary or earnings from student work
Q6b: Pocket Money (from family members)
Q6c: Gifts (for birthdays, holidays, etc.)
Q6d: Unreported or occasional work (childcare, tutoring, etc.)
Q6e: Government and/or social benefits
Q6f: Government and other forms of scholarships (e.g., Zois, corporate)
Q6g: Returns from investments (stocks, bonds, cryptocurrencies, etc.)

Q7: Do you save money? (This question does NOT include savings from parents or family members.)

1: Yes
2: No

Q8: In what form do you save money? (Digital vs. cash savings) (1 = fully cash…7 = fully digital)

Q9: Various attitudes towards digital and cash payments. (1 = strongly disagree…7 = strongly agree)

Q9a: I usually spend money in the form in which I received it.
Q9b: I feel concerned about the security of my personal information when using digital payment methods.
Q9c: I find digital payments less secure than cash payments.
Q9d: I have more confidence in digital payment methods if they offer features like two-factor authentication.
Q9e: I feel safe when I carry cash with me.
Q9f: I prefer to use digital payments because they are more convenient and save time.
Q9g: I prefer to use cash to avoid overspending.
Q9h: I use cash only when digital payments are not possible.

Q31_2: Please rate the importance of each of the following factors that influence your choice of payment method. (1 = not important at all…7 = extremely important)

Q31_2a: Ease of use
Q31_2b: Speed of transaction
Q31_2c: Ability to use in stores
Q31_2d: Security of the payment method
Q31_2e: Features for tracking and budgeting
Q31_2f: Privacy

Q10: How safe do you think the following payment methods are? (1 = not safe at all…7 = very safe)

Q10a: Cash
Q10b: (Physical) Debit Card
Q10c: (Physical) Credit Card
Q10d: Paying with your phone (Flik, Apple Pay…)
Q10e: Neobanks (Revolut, N26…)

Q11: How easy do you find the following payment methods to use? (1 = not easy at all…7 = very easy)

Q11a: Cash
Q11b: (Physical) Debit Card
Q11c: (Physical) Credit Card
Q11d: Paying with your phone (Flik, Apple Pay…)
Q11e: Neobanks (Revolut, N26…)

Q12: How widely accepted do you think the following payment methods are in stores in your environment? (1 = not accepted at all…7 = widely accepted)

Q12a: Cash
Q12b: (Physical) Debit Card
Q12c: (Physical) Credit Card
Q12d: Paying with your phone (Flik, Apple Pay…)
Q12e: Neobanks (Revolut, N26…)

Q13: How fast do you consider the following payment methods? (1 = not fast at all…7 = very fast)

Q13a: Cash
Q13b: (Physical) Debit Card
Q13c: (Physical) Credit Card
Q13d: Paying with your phone (Flik, Apple Pay…)
Q13e: Neobanks (Revolut, N26…)

Q14: How do you consider the following payment methods from a privacy perspective? (1 = not private at all…7 = very private)

Q14a: Cash
Q14b: (Physical) Debit Card
Q14c: (Physical) Credit Card
Q14d: Paying with your phone (Flik, Apple Pay…)
Q14e: Neobanks (Revolut, N26…)

Q15: How much control over your spending do the following payment methods give you? (1 = I have no control at all…7 = I have total control)

Q15a: Cash
Q15b: (Physical) Debit Card
Q15c: (Physical) Credit Card
Q15d: Paying with your phone (Flik, Apple Pay…)
Q15e: Neobanks (Revolut, N26…)

Q16: Social influence on payment method choice. (1 = strongly disagree…7 = strongly agree)

Q16a: I choose the payment methods my friends choose.
Q16b: I choose the payment methods my family members choose.

Q17: How do you most often share expenses among friends?

1: With cash
2: Through mobile applications (Flik, Revolut, PayPal…)
3: By bank transfer
4: I don’t share expenses among friends
5: Other (please specify)

Q18: Reasons for preferring digital payments. (1 = Yes, 0 = No)

Q18a: I can quote the exact sum.
Q18b: To avoid paying with cash.
Q18c: Because the process is quick and convenient.
Q18d: Because I have my transactions recorded and it is easier to manage finances.
Q18e: Other (please specify).

Q19: Concerns about digital payment security. (1 = not concerned at all…7 = extremely concerned)

Q19a: Fraud (e.g., stealing money)
Q19b: Disclosure of Personal Information
Q19c: Identity theft
Q19d: Loss of access due to a hacker attack

Q20: Experience with online fraud.

1: Yes, it happened to me.
2: Yes, I know people who have had this happen.
3: Yes, it has happened to me and others I know.
4: I’ve never encountered such a situation.

Q21: How did this affect your subsequent payment habits? (1 = Yes, 0 = No)

Q21a: I use cash more often in unfamiliar or suspicious situations (e.g., when traveling).
Q21b: I use digital payment methods (e.g., virtual or disposable cards facilitated by neobanks) more often in unfamiliar or suspicious situations.
Q21c: I am more cautious with online payments.
Q21d: I switched to more secure payment options (e.g., mobile wallets with authentication).
Q21e: My behavior hasn’t changed.

Q22: If we had the opportunity, would you switch to digital payments entirely?

1: Yes, immediately.
2: I would consider it, but I wouldn’t want to give up cash completely.
3: No, I prefer to use cash.
4: I don’t know.

Q23: What is your gender?

1: Man.
2: Woman.
3: Another.
4: I don’t want to answer.

Q24: What is your highest level of educational attainment?

1: Unfinished primary school.
2: Completed primary school.
3: Completed lower or secondary vocational education.
4: Completed secondary professional or general education.
5: Completed tertiary professional education (including 1st Bologna level).
6: Completed higher university education (including 2nd Bologna level).
7: Completed specialization, scientific master’s degree, PhD.

Q25: What is your current status?

1: Student.
2: Employed.
3: Self-employed.
4: Unemployed.
5: Other (please specify).

Q26: What is your current net monthly income?

1: 0-200 EUR.
2: 201-500 EUR.
3: 501-800 EUR.
4: 801-1300 EUR.
5: More than 1300 EUR.

Q27: Which bank do you currently use as your primary bank?

1: NLB.
2: OTP.
3: Intesa Sanpaolo.
4: Sparkasse.
5: Addiko Bank.
6: Workers’ Savings Bank.
7: Other (please specify).

2 Data manipulation

# Remove all questionnaires with invalid status (status == 5)
NLB_data <- NLB_data %>%
  filter(status != 5)

# Remove status column
NLB_data <- NLB_data[ , -1]

# Remove respondents under 18 or over 27 (Q1 codes 1 and 12)
NLB_data <- NLB_data %>%
  filter(!Q1 %in% c(1, 12))

# Rename columns
colnames(NLB_data) <- c("Age", "Cash_Use", "Card_Use", "Phone_Use", "OtherPay_Use",
                        "Cash_Up10", "Cash_11_99", "Cash_100_1000", "Cash_Over1000",
                        "NoDigital_Response", "NoDigital_Response_Text", "Income_StudJobSalary", "Income_PocketMoney", 
                        "Income_Gifts", "Income_Occasional", "Income_Subsidy", 
                        "Income_Scholarship", "Income_Investments", "Save_Money", 
                        "Save_Form", "Spend_SameForm", "Concern_Security", "LessSecure_Digital", 
                        "Trust_2FA", "Safe_CashCarry", "Prefer_Digital_Convenience", 
                        "Prefer_Cash_Control", "Use_Cash_IfNoDigital", "Importance_Ease", "Importance_Speed",  
                        "Importance_Availability", "Importance_Security", "Importance_TrackingBudgeting", 
                        "Importance_Privacy", "Safe_Cash", "Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", 
                        "Safe_Neobank", "Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", 
                        "Easy_Neobank", "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard", 
                        "Accept_PhonePay", "Accept_Neobank", "Fast_Cash", "Fast_DebitCard", 
                        "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank", "Private_Cash", 
                        "Private_DebitCard", "Private_CreditCard", "Private_PhonePay", 
                        "Private_Neobank", "Control_Cash", "Control_DebitCard", "Control_CreditCard", 
                        "Control_PhonePay", "Control_Neobank", "Social_Friends", "Social_Family", 
                        "Expense_Sharing", "Expense_Sharing_Text", "Reason_ExactSum", "Reason_NoCash", 
                        "Reason_Convenient", "Reason_TrackFinances", "Reason_Other", "Reason_Other_Text",  
                        "Concern_Fraud", "Concern_PersonalInfo", "Concern_IDTheft", 
                        "Concern_Hacker", "OnlineFraud_Exp", "Behavior_MoreCash", 
                        "Behavior_SecureDigital", "Behavior_Cautious", "Behavior_SecureOption", 
                        "Behavior_NoChange", "Switch_Digital", "Gender", "Education", 
                        "Status_Employment", "Status_Employment_Text", "Income_Level", "Primary_Bank", "Primary_Bank_Text")

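Assigning a full vector to colnames() depends on the names being in exactly the same order as the columns. As a hedged alternative, the sketch below renames by old name instead of by position, using a toy data frame and a shortened, hypothetical mapping (name_map) that stands in for the full list above.

```r
# Toy stand-in for NLB_data with three of the original question columns.
toy <- data.frame(Q1 = c(7, 9), Q3a = c(3, 2), Q3b = c(2, 5))

# Hypothetical mapping: names are the old column names, values the new ones.
name_map <- c(Q1 = "Age", Q3a = "Cash_Use", Q3b = "Card_Use")

# Stop early if any expected column is missing instead of mislabelling data.
stopifnot(all(names(name_map) %in% names(toy)))

# Look up each old name's position and overwrite only those names.
idx <- match(names(name_map), names(toy))
names(toy)[idx] <- unname(name_map)

print(names(toy))
```

Because each new name is tied to its old name, dropping or reordering columns earlier in the script cannot silently shift the labels.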
2.1 Factoring

# Q1
NLB_data$AgeF <- factor(NLB_data$Age,
                       levels = c(2:11),
                       labels = c(18:27))

#Q3a

NLB_data$Cash_UseF <- factor(NLB_data$Cash_Use,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))

#Q3b
NLB_data$Card_UseF <- factor(NLB_data$Card_Use,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))

#Q3c

NLB_data$Phone_UseF <- factor(NLB_data$Phone_Use,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))

#Q3d

NLB_data$OtherPay_UseF <- factor(NLB_data$OtherPay_Use,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "1-3 monthly", "1 per week", "Several times a week", "Daily"))

#Q4a

NLB_data$Cash_Up10F <- factor(NLB_data$Cash_Up10,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "Less than half", "Half", "More than half", "Always"))

#Q4b

NLB_data$Cash_11_99F <- factor(NLB_data$Cash_11_99,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "Less than half", "Half", "More than half", "Always"))

#Q4c

NLB_data$Cash_100_1000F <- factor(NLB_data$Cash_100_1000,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "Less than half", "Half", "More than half", "Always"))

#Q4d

NLB_data$Cash_Over1000F <- factor(NLB_data$Cash_Over1000,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Never", "Less than half", "Half", "More than half", "Always"))

# Q5
NLB_data$NoDigital_ResponseF <- factor(NLB_data$NoDigital_Response,
                       levels = c(1, 2, 3, 4),
                       labels = c("Pay digital elsewhere", "Pay as available", "Never occurred", "Other"))

#Q6a 

NLB_data$Income_StudJobSalaryF <- factor(NLB_data$Income_StudJobSalary,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

#Q6b

NLB_data$Income_PocketMoneyF <- factor(NLB_data$Income_PocketMoney,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

#Q6c

NLB_data$Income_GiftsF <- factor(NLB_data$Income_Gifts,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

#Q6d

NLB_data$Income_OccasionalF <- factor(NLB_data$Income_Occasional,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

#Q6e

NLB_data$Income_SubsidyF <- factor(NLB_data$Income_Subsidy,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

#Q6f

NLB_data$Income_ScholarshipF <- factor(NLB_data$Income_Scholarship,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))
#Q6g

NLB_data$Income_InvestmentsF <- factor(NLB_data$Income_Investments,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Cash&Digitally", "Digitally", "Not using", "Don't want to answer"))

# Q17

NLB_data$Expense_SharingF <- factor(NLB_data$Expense_Sharing,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Cash", "Mobile apps", "Bank transfer", "Don't share", "Other"))

# Q20

NLB_data$OnlineFraud_ExpF <- factor(NLB_data$OnlineFraud_Exp,
                       levels = c(1, 2, 3, 4),
                       labels = c("Yes - me", "Yes - others", "Yes - both", "No"))

# Q22

NLB_data$Switch_DigitalF <- factor(NLB_data$Switch_Digital,
                       levels = c(1, 2, 3, 4),
                       labels = c("Fully digital", "Balance digital-cash", "Cash", "Don't know"))

# Q23

NLB_data$GenderF <- factor(NLB_data$Gender,
                       levels = c(1, 2, 3, 4),
                       labels = c("Man", "Woman", "Other", "Don't want to answer"))

# Q24

NLB_data$EducationF <- factor(NLB_data$Education,
                       levels = c(1, 2, 3, 4, 5, 6, 7),
                       labels = c("Unfinished primary", "Primary school", "Vocational education", "High School", "Bachelor Degree", "Master Degree", "PhD" ))

# Q25

NLB_data$Status_EmploymentF <- factor(NLB_data$Status_Employment,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("Student", "Employed", "Self-employed", "Unemployed", "Other"))

# Q26

NLB_data$Income_LevelF <- factor(NLB_data$Income_Level,
                       levels = c(1, 2, 3, 4, 5),
                       labels = c("0-200 EUR", "201-500 EUR", "501-800 EUR", "801-1300 EUR", "Above 1300 EUR"))

# Q27

NLB_data$Primary_BankF <- factor(NLB_data$Primary_Bank,
                       levels = c(1, 2, 3, 4, 5, 6, 7),
                       labels = c("NLB", "OTP", "Intesa Sanpaolo", "Sparkasse", "Addiko Bank", "Delavska Hranilnica", "Other"))

head(NLB_data %>% select(ends_with("F")))

##   AgeF   Cash_UseF            Card_UseF           Phone_UseF OtherPay_UseF
## 1   23  1 per week          1-3 monthly                Daily   1-3 monthly
## 2   25 1-3 monthly                Daily                Never         Never
## 3   25 1-3 monthly Several times a week Several times a week         Never
## 4   23  1 per week                Daily                Daily         Never
## 5   23 1-3 monthly                Daily                Daily         Never
## 6   25  1 per week           1 per week                Never   1-3 monthly
##       Cash_Up10F    Cash_11_99F Cash_100_1000F Cash_Over1000F
## 1 More than half           Half Less than half          Never
## 2 Less than half          Never          Never          Never
## 3 Less than half Less than half          Never          Never
## 4 Less than half Less than half          Never          Never
## 5 Less than half          Never          Never          Never
## 6 More than half           Half Less than half          Never
##     NoDigital_ResponseF Income_StudJobSalaryF Income_PocketMoneyF
## 1      Pay as available             Digitally      Cash&Digitally
## 2 Pay digital elsewhere             Not using      Cash&Digitally
## 3      Pay as available             Digitally           Digitally
## 4 Pay digital elsewhere             Digitally           Digitally
## 5 Pay digital elsewhere             Not using           Digitally
## 6      Pay as available             Digitally           Not using
##    Income_GiftsF Income_OccasionalF Income_SubsidyF Income_ScholarshipF
## 1           Cash               Cash       Digitally           Digitally
## 2           Cash          Not using       Not using           Not using
## 3           Cash          Not using       Not using           Not using
## 4 Cash&Digitally          Not using       Not using           Digitally
## 5      Digitally          Not using       Not using           Not using
## 6           Cash               Cash       Digitally           Digitally
##   Income_InvestmentsF Expense_SharingF OnlineFraud_ExpF      Switch_DigitalF
## 1           Digitally      Mobile apps     Yes - others Balance digital-cash
## 2           Digitally      Mobile apps       Yes - both Balance digital-cash
## 3      Cash&Digitally      Mobile apps               No Balance digital-cash
## 4           Not using      Mobile apps               No        Fully digital
## 5           Not using      Mobile apps         Yes - me        Fully digital
## 6           Not using      Mobile apps     Yes - others           Don't know
##   GenderF      EducationF Status_EmploymentF Income_LevelF Primary_BankF
## 1     Man   Master Degree            Student   201-500 EUR           OTP
## 2   Woman Bachelor Degree            Student   201-500 EUR           OTP
## 3   Woman     High School            Student   501-800 EUR         Other
## 4   Woman     High School            Student   201-500 EUR           NLB
## 5   Woman     High School            Student  801-1300 EUR           NLB
## 6   Woman   Master Degree            Student   201-500 EUR           NLB
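Many of the factor() calls above share the same levels and labels. A minimal sketch of how such groups could be converted in one step with base R is shown here on a toy data frame; the helper vectors use_cols and use_labels are illustrative names, not part of the original script.

```r
# Toy stand-in for three of the payment-frequency columns.
toy <- data.frame(
  Cash_Use  = c(3, 2, 5),
  Card_Use  = c(2, 5, 4),
  Phone_Use = c(5, 1, 4)
)

# Columns that share one coding scheme, and the labels for codes 1 to 5.
use_cols   <- c("Cash_Use", "Card_Use", "Phone_Use")
use_labels <- c("Never", "1-3 monthly", "1 per week",
                "Several times a week", "Daily")

# Create one "...F" factor column per source column, all with the same labels.
toy[paste0(use_cols, "F")] <- lapply(
  toy[use_cols], factor, levels = 1:5, labels = use_labels
)

str(toy)
```

The same pattern would apply to the Q4 and Q6 groups with their own label vectors, which keeps each coding scheme defined in a single place.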

3 Descriptive statistics

3.1 Numerical data

# Convert the Likert-scale columns (selected by position) to numeric
NLB_data <- NLB_data %>%
  mutate(across(c(20:66, 75:78), as.numeric))

# Select specific numerical data for summary
NLBdata_Likert <- NLB_data[, c("Save_Form", "Spend_SameForm", "Concern_Security", "LessSecure_Digital", 
                                "Trust_2FA", "Safe_CashCarry", "Prefer_Digital_Convenience", 
                                "Prefer_Cash_Control", "Use_Cash_IfNoDigital", "Importance_Ease", 
                                "Importance_Speed", "Importance_Availability", "Importance_Security", 
                                "Importance_TrackingBudgeting", "Importance_Privacy", "Safe_Cash", 
                                "Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank", 
                                "Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", 
                                "Easy_Neobank", "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard", 
                                "Accept_PhonePay", "Accept_Neobank", "Fast_Cash", "Fast_DebitCard", 
                                "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank", "Private_Cash", 
                                "Private_DebitCard", "Private_CreditCard", "Private_PhonePay", 
                                "Private_Neobank", "Control_Cash", "Control_DebitCard", "Control_CreditCard", 
                                "Control_PhonePay", "Control_Neobank", "Social_Friends", "Social_Family", 
                                "Concern_Fraud", "Concern_PersonalInfo", "Concern_IDTheft", "Concern_Hacker")]

summary(NLBdata_Likert)

##    Save_Form      Spend_SameForm  Concern_Security LessSecure_Digital
##  Min.   :-2.000   Min.   :1.000   Min.   :1.000    Min.   :1.000     
##  1st Qu.: 1.000   1st Qu.:4.000   1st Qu.:1.000    1st Qu.:2.000     
##  Median : 4.000   Median :6.000   Median :3.000    Median :4.000     
##  Mean   : 3.395   Mean   :5.316   Mean   :3.115    Mean   :3.655     
##  3rd Qu.: 7.000   3rd Qu.:7.000   3rd Qu.:4.000    3rd Qu.:5.000     
##  Max.   : 7.000   Max.   :7.000   Max.   :7.000    Max.   :7.000     
##    Trust_2FA     Safe_CashCarry  Prefer_Digital_Convenience Prefer_Cash_Control
##  Min.   :1.000   Min.   :1.000   Min.   :1.000              Min.   :1.000      
##  1st Qu.:4.000   1st Qu.:3.000   1st Qu.:6.000              1st Qu.:1.000      
##  Median :6.000   Median :4.000   Median :7.000              Median :4.000      
##  Mean   :5.359   Mean   :4.135   Mean   :6.046              Mean   :3.441      
##  3rd Qu.:7.000   3rd Qu.:5.000   3rd Qu.:7.000              3rd Qu.:5.000      
##  Max.   :7.000   Max.   :7.000   Max.   :7.000              Max.   :7.000      
##  Use_Cash_IfNoDigital Importance_Ease Importance_Speed Importance_Availability
##  Min.   :1.000        Min.   :1.000   Min.   :1        Min.   :1.000          
##  1st Qu.:5.000        1st Qu.:6.000   1st Qu.:6        1st Qu.:6.000          
##  Median :6.000        Median :7.000   Median :7        Median :7.000          
##  Mean   :5.704        Mean   :6.105   Mean   :6        Mean   :6.168          
##  3rd Qu.:7.000        3rd Qu.:7.000   3rd Qu.:7        3rd Qu.:7.000          
##  Max.   :7.000        Max.   :7.000   Max.   :7        Max.   :7.000          
##  Importance_Security Importance_TrackingBudgeting Importance_Privacy
##  Min.   :1.000       Min.   :1.000                Min.   :1.000     
##  1st Qu.:5.000       1st Qu.:4.000                1st Qu.:4.000     
##  Median :7.000       Median :5.000                Median :6.000     
##  Mean   :6.076       Mean   :5.201                Mean   :5.658     
##  3rd Qu.:7.000       3rd Qu.:7.000                3rd Qu.:7.000     
##  Max.   :7.000       Max.   :7.000                Max.   :7.000     
##    Safe_Cash     Safe_DebitCard  Safe_CreditCard Safe_PhonePay  
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
##  1st Qu.:4.000   1st Qu.:4.000   1st Qu.:4.000   1st Qu.:4.000  
##  Median :6.000   Median :5.000   Median :5.000   Median :5.000  
##  Mean   :5.503   Mean   :5.138   Mean   :4.947   Mean   :5.174  
##  3rd Qu.:7.000   3rd Qu.:6.000   3rd Qu.:6.000   3rd Qu.:7.000  
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :7.000  
##   Safe_Neobank     Easy_Cash     Easy_DebitCard  Easy_CreditCard Easy_PhonePay 
##  Min.   :1.000   Min.   :1.000   Min.   :3.000   Min.   :3.00    Min.   :1.00  
##  1st Qu.:4.000   1st Qu.:4.000   1st Qu.:5.000   1st Qu.:5.00    1st Qu.:7.00  
##  Median :5.000   Median :5.000   Median :7.000   Median :7.00    Median :7.00  
##  Mean   :4.826   Mean   :5.102   Mean   :6.148   Mean   :6.02    Mean   :6.48  
##  3rd Qu.:6.000   3rd Qu.:7.000   3rd Qu.:7.000   3rd Qu.:7.00    3rd Qu.:7.00  
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :7.00    Max.   :7.00  
##   Easy_Neobank    Accept_Cash    Accept_DebitCard Accept_CreditCard
##  Min.   :1.000   Min.   :1.000   Min.   :3.000    Min.   :2.000    
##  1st Qu.:4.000   1st Qu.:6.000   1st Qu.:6.000    1st Qu.:6.000    
##  Median :6.000   Median :7.000   Median :7.000    Median :7.000    
##  Mean   :5.592   Mean   :6.227   Mean   :6.484    Mean   :6.336    
##  3rd Qu.:7.000   3rd Qu.:7.000   3rd Qu.:7.000    3rd Qu.:7.000    
##  Max.   :7.000   Max.   :7.000   Max.   :7.000    Max.   :7.000    
##  Accept_PhonePay Accept_Neobank    Fast_Cash     Fast_DebitCard 
##  Min.   :1.00    Min.   :1.000   Min.   :1.000   Min.   :1.000  
##  1st Qu.:5.00    1st Qu.:4.000   1st Qu.:2.750   1st Qu.:5.000  
##  Median :7.00    Median :5.000   Median :4.000   Median :7.000  
##  Mean   :6.03    Mean   :5.076   Mean   :4.076   Mean   :6.174  
##  3rd Qu.:7.00    3rd Qu.:7.000   3rd Qu.:6.000   3rd Qu.:7.000  
##  Max.   :7.00    Max.   :7.000   Max.   :7.000   Max.   :7.000  
##  Fast_CreditCard Fast_PhonePay    Fast_Neobank    Private_Cash  
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
##  1st Qu.:5.000   1st Qu.:7.000   1st Qu.:4.000   1st Qu.:4.000  
##  Median :7.000   Median :7.000   Median :6.000   Median :7.000  
##  Mean   :6.122   Mean   :6.526   Mean   :5.684   Mean   :5.681  
##  3rd Qu.:7.000   3rd Qu.:7.000   3rd Qu.:7.000   3rd Qu.:7.000  
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   :7.000  
##  Private_DebitCard Private_CreditCard Private_PhonePay Private_Neobank
##  Min.   :1.000     Min.   :1.000      Min.   :1.000    Min.   :1.000  
##  1st Qu.:4.000     1st Qu.:4.000      1st Qu.:3.000    1st Qu.:4.000  
##  Median :4.000     Median :4.000      Median :4.000    Median :4.000  
##  Mean   :4.457     Mean   :4.457      Mean   :4.395    Mean   :4.484  
##  3rd Qu.:6.000     3rd Qu.:6.000      3rd Qu.:6.000    3rd Qu.:6.000  
##  Max.   :7.000     Max.   :7.000      Max.   :7.000    Max.   :7.000  
##   Control_Cash   Control_DebitCard Control_CreditCard Control_PhonePay
##  Min.   :1.000   Min.   :1.000     Min.   :1.000      Min.   :1.000   
##  1st Qu.:3.000   1st Qu.:4.000     1st Qu.:4.000      1st Qu.:4.000   
##  Median :5.000   Median :5.000     Median :5.000      Median :5.000   
##  Mean   :4.852   Mean   :5.155     Mean   :5.007      Mean   :5.161   
##  3rd Qu.:7.000   3rd Qu.:7.000     3rd Qu.:7.000      3rd Qu.:7.000   
##  Max.   :7.000   Max.   :7.000     Max.   :7.000      Max.   :7.000   
##  Control_Neobank Social_Friends  Social_Family   Concern_Fraud   
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :-1.000  
##  1st Qu.:4.000   1st Qu.:1.000   1st Qu.:1.000   1st Qu.: 4.000  
##  Median :5.000   Median :3.000   Median :4.000   Median : 5.000  
##  Mean   :5.115   Mean   :3.125   Mean   :3.339   Mean   : 4.507  
##  3rd Qu.:7.000   3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.: 6.000  
##  Max.   :7.000   Max.   :7.000   Max.   :7.000   Max.   : 7.000  
##  Concern_PersonalInfo Concern_IDTheft  Concern_Hacker  
##  Min.   :-1.000       Min.   :-1.000   Min.   :-1.000  
##  1st Qu.: 4.000       1st Qu.: 3.000   1st Qu.: 4.000  
##  Median : 4.000       Median : 4.000   Median : 5.000  
##  Mean   : 4.457       Mean   : 4.273   Mean   : 4.849  
##  3rd Qu.: 6.000       3rd Qu.: 6.000   3rd Qu.: 6.000  
##  Max.   : 7.000       Max.   : 7.000   Max.   : 7.000
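The summary above shows minimum values of -2 (Save_Form) and -1 (the four Concern_ items), although these items are measured on a 1 to 7 scale. Assuming that negative values are non-response or skip codes (an assumption, since the codebook above does not define them), one option is to recode them to NA before computing descriptive statistics. The sketch below illustrates this on toy data.

```r
# Toy data with negative codes mixed into 1-7 Likert responses.
toy <- data.frame(
  Save_Form     = c(-2, 7, 4, 1),
  Concern_Fraud = c(-1, 5, 6, 7)
)
likert_cols <- c("Save_Form", "Concern_Fraud")

# Assumption: every negative value marks a missing answer, not a real rating.
toy_clean <- toy
toy_clean[likert_cols] <- lapply(toy_clean[likert_cols], function(x) {
  x[!is.na(x) & x < 0] <- NA
  x
})

# Compare means with and without the negative codes.
raw_means   <- colMeans(toy[likert_cols])
clean_means <- colMeans(toy_clean[likert_cols], na.rm = TRUE)
print(rbind(raw = raw_means, clean = clean_means))
```

In the toy data the means rise once the negative codes are excluded, which is the direction of bias to expect in the affected columns of the real data.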

3.1.1 Summary of key insights - Numerical data

3.1.1.1 Trust and Security Preferences

Trust in Security Measures: Trust_2FA has a median of 6 and a mean of 5.359, so most respondents agree that features such as two-factor authentication (2FA) increase their confidence in digital payment methods.
Security Concerns: Respondents report a moderate level of concern about digital payment security. Concern_PersonalInfo has a mean of 4.457 and a median of 4. Concern_Fraud and Concern_Hacker both have a median of 5 (means 4.507 and 4.849), while identity theft (Concern_IDTheft) ranks slightly lower with a median of 4 (mean 4.273). The minimum of -1 on all four items lies outside the 1 to 7 scale, which suggests that a non-response code is still included and pulls these means down slightly.

3.1.1.2 Preferences for Cash and Digital Payments

Preference for Digital Convenience: The Prefer_Digital_Convenience variable shows that most participants favor digital payment solutions for their ease of use, with a high median score of 7. This aligns with the growing trend toward digitalization of financial transactions.
Preference for Cash Control: In contrast, the preference for cash as a means of control is more muted. Prefer_Cash_Control has a mean of 3.441 and a median of 4, so a sizeable share of users still values the tangible control over spending that cash offers, but this preference is clearly weaker than the preference for digital convenience.
Willingness to Use Cash in the Absence of Digital Payments: Interestingly, the Use_Cash_IfNoDigital variable has a mean of 5.704, indicating that while most users prefer digital methods, they are still willing to revert to cash if necessary.

3.1.1.3 Digital Payment Security

Security of Digital Payment Methods: Perceived security differs little between digital payment methods. Safe_DebitCard, Safe_CreditCard, Safe_PhonePay, and Safe_Neobank all have a median of 5, with means of 5.14, 4.95, 5.17, and 4.83 respectively (see the descriptive statistics in Section 4). Safe_CashCarry has a lower median of 4.
Cash vs. Digital Payment Security: The data therefore do not show that carrying cash is perceived as safer than paying digitally: the median for Safe_CashCarry (4) is below the median of 5 for every digital method. Nor is there a gap between mobile payments and physical cards. Safe_PhonePay has the highest mean of the four digital methods (5.17), marginally above debit cards (5.14), while neobanks score lowest (4.83) and show the largest spread (standard deviation 1.52).
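
To compare perceived safety across the digital methods, the means reported by stat.desc() for the four Safe_* items in Section 4 can be ranked directly. The sketch below copies those four means by hand; the object names are illustrative only.

```r
# Means copied from the stat.desc() output for the four Safe_* items.
safe_means <- c(Safe_DebitCard = 5.14, Safe_CreditCard = 4.95,
                Safe_PhonePay = 5.17, Safe_Neobank = 4.83)

# Order the methods from highest to lowest mean perceived safety.
safe_ranked <- sort(safe_means, decreasing = TRUE)
safe_ranked

# Gap between the highest and the lowest mean.
safe_gap <- max(safe_means) - min(safe_means)
safe_gap
```

The gap between the highest and lowest mean is 0.34 scale points, which is small relative to the standard deviations of 1.26 to 1.52 reported for these items.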

3.1.1.4 Payment Speed

Preference for Fast Payments: Digital payments are perceived as faster than cash. Fast_Cash has a mean of 4.076, so cash is rated as only moderately fast, whereas Fast_CreditCard and Fast_PhonePay are rated highly, with means above 6 (6.12 and 6.53 in the descriptive statistics in Section 4). This reflects the growing demand for quick transactions.

3.1.1.6 Conclusion

In summary, the findings show a clear contrast between cash and digital payments in users’ preferences. Digital payments are favoured for convenience and speed and are rated at least as safe as carrying cash, while cash retains a role as a fallback and, for some users, as a means of controlling spending. Concern about fraud and data security is moderate, and trust in two-factor authentication is high. The influence of social circles appears modest: Social_Friends and Social_Family have means of 3.125 and 3.339, below the midpoint of the scale. The research points toward a future where the convergence of digital convenience and robust security measures will drive the evolution of financial decision-making.

3.2 Categorical data

# Select specific categorical data for summary
categorical_columns <- c("AgeF", "Cash_UseF", "Card_UseF", "Phone_UseF", "OtherPay_UseF", 
                         "Cash_Up10F", "Cash_11_99F", "Cash_100_1000F", "Cash_Over1000F", 
                         "NoDigital_ResponseF", "Income_StudJobSalaryF", "Income_PocketMoneyF", 
                         "Income_GiftsF", "Income_OccasionalF", "Income_SubsidyF", "Income_ScholarshipF", 
                         "Income_InvestmentsF", "Expense_SharingF", "OnlineFraud_ExpF", "Switch_DigitalF", 
                         "GenderF", "EducationF", "Status_EmploymentF", "Income_LevelF", "Primary_BankF")

# Use summary to view frequency counts for categorical columns
summary(NLB_data[, categorical_columns])

##       AgeF                   Cash_UseF                  Card_UseF  
##  23     :64   Never               : 23   Never               : 22  
##  22     :44   1-3 monthly         :133   1-3 monthly         : 62  
##  25     :43   1 per week          : 66   1 per week          : 27  
##  20     :40   Several times a week: 65   Several times a week:125  
##  24     :30   Daily               : 17   Daily               : 68  
##  21     :23                                                        
##  (Other):60                                                        
##                 Phone_UseF               OtherPay_UseF          Cash_Up10F 
##  Never               : 81   Never               :180   Never         : 34  
##  1-3 monthly         : 24   1-3 monthly         : 96   Less than half:169  
##  1 per week          : 20   1 per week          : 15   Half          : 55  
##  Several times a week: 71   Several times a week: 12   More than half: 31  
##  Daily               :108   Daily               :  1   Always        : 15  
##                                                                            
##                                                                            
##          Cash_11_99F         Cash_100_1000F        Cash_Over1000F
##  Never         : 88   Never         :197    Never         :252   
##  Less than half:153   Less than half: 68    Less than half: 21   
##  Half          : 37   Half          : 16    Half          :  8   
##  More than half: 18   More than half: 17    More than half:  8   
##  Always        :  8   Always        :  6    Always        : 15   
##                                                                  
##                                                                  
##             NoDigital_ResponseF          Income_StudJobSalaryF
##  Pay digital elsewhere: 52      Cash                : 10      
##  Pay as available     :195      Cash&Digitally      : 43      
##  Never occurred       : 42      Digitally           :217      
##  Other                : 15      Not using           : 34      
##                                 Don't want to answer:  0      
##                                                               
##                                                               
##            Income_PocketMoneyF              Income_GiftsF
##  Cash                : 76      Cash                :251  
##  Cash&Digitally      : 63      Cash&Digitally      : 36  
##  Digitally           : 57      Digitally           :  3  
##  Not using           :107      Not using           : 13  
##  Don't want to answer:  1      Don't want to answer:  1  
##                                                          
##                                                          
##             Income_OccasionalF             Income_SubsidyF
##  Cash                : 88      Cash                :  1   
##  Cash&Digitally      : 17      Cash&Digitally      :  1   
##  Digitally           :  3      Digitally           : 62   
##  Not using           :187      Not using           :235   
##  Don't want to answer:  9      Don't want to answer:  5   
##                                                           
##                                                           
##            Income_ScholarshipF           Income_InvestmentsF
##  Cash                :  0      Cash                :  2     
##  Cash&Digitally      :  2      Cash&Digitally      :  2     
##  Digitally           :137      Digitally           : 93     
##  Not using           :164      Not using           :203     
##  Don't want to answer:  1      Don't want to answer:  4     
##                                                             
##                                                             
##       Expense_SharingF     OnlineFraud_ExpF             Switch_DigitalF
##  Cash         : 28     Yes - me    : 22     Fully digital       : 78   
##  Mobile apps  :250     Yes - others: 96     Balance digital-cash:171   
##  Bank transfer: 14     Yes - both  : 11     Cash                : 44   
##  Don't share  :  8     No          :172     Don't know          : 10   
##  Other        :  4     NA's        :  3     NA's                :  1   
##                                                                        
##                                                                        
##                  GenderF                   EducationF      Status_EmploymentF
##  Man                 :120   Unfinished primary  :  0   Student      :236     
##  Woman               :182   Primary school      :  8   Employed     : 49     
##  Other               :  1   Vocational education:  2   Self-employed:  7     
##  Don't want to answer:  1   High School         :132   Unemployed   :  3     
##                             Bachelor Degree     : 99   Other        :  9     
##                             Master Degree       : 61                         
##                             PhD                 :  2                         
##         Income_LevelF             Primary_BankF
##  0-200 EUR     :48    NLB                :121  
##  201-500 EUR   :91    OTP                :109  
##  501-800 EUR   :58    Intesa Sanpaolo    : 17  
##  801-1300 EUR  :48    Sparkasse          :  7  
##  Above 1300 EUR:57    Addiko Bank        :  8  
##  NA's          : 2    Delavska Hranilnica: 19  
##                       Other              : 23
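
The labelled columns ending in F are factors. How they were built is not shown in this section. One common way to derive such a column from numeric codes is sketched below, under the assumption that codes 1 to 5 map to the five Cash_UseF levels in the order printed above. That mapping is consistent with the first six rows of the data, but it remains an assumption.

```r
# Numeric Cash_Use codes of the first six respondents, as printed by head().
cash_use <- c(3, 2, 2, 3, 2, 3)

# Assumed mapping of codes 1-5 to labels, in the order shown by summary().
cash_levels <- c("Never", "1-3 monthly", "1 per week",
                 "Several times a week", "Daily")
cash_use_f <- factor(cash_use, levels = 1:5, labels = cash_levels)

# Frequency counts, including levels that do not occur in this subset.
cash_table <- table(cash_use_f)
cash_table
```

Fixing the level order in factor() is what keeps summary() and table() output in questionnaire order rather than alphabetical order.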

3.2.1 Summary of key insights - Categorical data

3.2.1.1 Payment Frequency and Method Preferences

Cash Usage:
A large proportion of participants (133 of 304 respondents) reported using cash 1-3 times a month, while 66 use cash once a week, 65 several times a week, and 17 daily. The group that never uses cash for payments is small (23 respondents), so cash remains in regular use for most respondents, although mostly at low frequency.
Card Usage:
For card payments, the most common frequency is several times a week (125 participants), followed by daily usage (68 participants), highlighting that many users rely heavily on cards for transactions. Only 22 respondents never use cards for payments.
Phone Payments:
Mobile payments, as expected, have a high daily usage rate with 108 respondents using their phones for payments daily. The group that never uses phone payments is notably large (81 respondents), which could reflect concerns around mobile payment security or simply a preference for other methods.
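
Raw counts are easier to compare across payment methods when expressed as shares of the sample. A short sketch using the Cash_UseF counts from the summary output above (the object names are illustrative):

```r
# Cash_UseF counts copied from the summary() output.
cash_counts <- c("Never" = 23, "1-3 monthly" = 133, "1 per week" = 66,
                 "Several times a week" = 65, "Daily" = 17)

# Total number of respondents and percentage share per category.
n_total    <- sum(cash_counts)
cash_share <- 100 * cash_counts / n_total
round(cash_share, 1)

# Share of respondents who use cash at least once a week.
weekly_or_more <- sum(cash_share[c("1 per week",
                                   "Several times a week",
                                   "Daily")])
weekly_or_more
```

The same calculation can be repeated for Card_UseF and Phone_UseF to put the three methods on a common scale.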

3.2.1.2 Cash Spend and Usage Distribution

Frequency of Cash Usage in Different Ranges:
Cash use declines as the purchase amount rises:
- Up to 10 EUR: Only 34 respondents never pay in cash for purchases in this range, while 169 do so less than half the time.
- 11 EUR to 99 EUR: Most respondents use cash less than half the time (153) or never (88); only 8 always pay in cash.
- 100 EUR to 1000 EUR: 197 respondents never pay in cash.
- Above 1000 EUR: The most common answer is never (252 respondents), although 15 respondents report always paying in cash.
Other Payment Methods:
Other payment methods are used rarely: 180 respondents never use them and 96 use them 1-3 times a month, while only 28 use them once a week or more often.

3.2.1.3 Income Sources and Digital vs. Cash Payments

Income and Payment Preferences:
Participants reported diverse income sources, and the form in which income is received depends strongly on the source:
- Cash dominates for gifts, which 251 respondents receive in cash and a further 36 in both cash and digital form. Pocket money is more mixed (76 in cash, 63 in both forms, 57 digitally), and occasional income is mostly received in cash (88 respondents) by those who have it.
- Digitally received income is the norm for student job or salary income (217 respondents), scholarships (137 respondents), and investments (93 respondents); few or no respondents receive these in cash.
Responses to Digital Payments:
NoDigital_ResponseF appears to record how respondents react when digital payment is not available. Most respondents (195) pay with whatever method is available, 52 prefer to pay digitally elsewhere, and 42 report that the situation has never occurred.

3.2.1.4 Spending Habits and Financial Control

Sharing Expenses:
Mobile apps are commonly used for expense-sharing, with 250 respondents using them for this purpose. A smaller group of 28 respondents still use cash for sharing expenses, while 8 respondents do not share their expenses at all.
Switch to Digital Payments:
There is a notable shift toward digital payments, with 78 participants fully digital and 171 participants balancing digital and cash. However, 44 respondents still primarily use cash for their payments.

3.2.1.5 Demographic Breakdown

Gender Distribution:
The sample consists of 120 men and 182 women, with one participant selecting other and one preferring not to answer.
Educational Background:
A significant number of participants have completed high school (132 respondents), with 99 respondents holding a bachelor’s degree, and a smaller portion possessing a master’s degree (61 respondents). The educational distribution indicates a predominantly young, educated sample, which could influence financial decisions and preferences.
Employment Status:
The majority of the participants are students (236 respondents), followed by employed individuals (49 respondents). The data suggests that many participants may be financially dependent, influencing their preferences toward cash, digital payments, and financial control.
Income Levels:
The most common income bracket is 201-500 EUR (91 respondents), and the median respondent falls in the 501-800 EUR bracket. 48 respondents report 0-200 EUR and 57 report more than 1300 EUR, so incomes are spread fairly evenly across brackets rather than concentrated at the bottom. The frequency table alone does not show how income level relates to employment status or income source; that would require a cross-tabulation.
Primary Bank Usage:
NLB and OTP are the two most used banks among respondents, with 121 and 109 participants, respectively, favoring these institutions. Other banks like Intesa Sanpaolo and Sparkasse are used by significantly fewer respondents, indicating a reliance on a few major banks for digital transactions.
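
For an ordered variable such as Income_LevelF, the median bracket can be read from the cumulative shares of the frequency counts. A brief sketch using the counts from the summary output, with the 2 missing values excluded (the object names are illustrative):

```r
# Income_LevelF counts copied from the summary() output (NA's excluded).
income <- c("0-200 EUR" = 48, "201-500 EUR" = 91, "501-800 EUR" = 58,
            "801-1300 EUR" = 48, "Above 1300 EUR" = 57)

# Cumulative share of respondents up to and including each bracket.
cum_share <- cumsum(income) / sum(income)
round(cum_share, 3)

# The median bracket is the first one whose cumulative share reaches 50%.
median_bracket <- names(income)[which(cum_share >= 0.5)[1]]
median_bracket

# The modal (most frequent) bracket.
modal_bracket <- names(income)[which.max(income)]
modal_bracket
```

With these counts the median bracket is 501-800 EUR and the modal bracket is 201-500 EUR.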

3.2.1.6 Conclusion

The categorical analysis highlights the shift towards digital payments, especially with mobile phones and cards. However, cash remains relevant for many respondents, particularly for small purchases and for income received as gifts, pocket money, or occasional earnings. The sample consists predominantly of students and young adults, so these patterns describe a young population and may not generalise to other groups.

Furthermore, mobile apps dominate as the preferred way of dividing costs among peers (250 respondents). Whether preferences between cash and digital methods vary with age, income level, or primary bank cannot be read from the frequency tables alone and would need to be tested with cross-tabulations, but the overall picture is one of growing yet cautious adoption of digital financial solutions among younger, student populations.

4 PCA Creation

NLB_PCASafety <- NLB_data[ , c("Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank")]

NLB_PCAEase <- NLB_data[ , c("Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", "Easy_Neobank")]

NLB_PCAAvailability <- NLB_data[ , c("Accept_DebitCard", "Accept_CreditCard", "Accept_PhonePay", "Accept_Neobank")]

NLB_PCASpeed <- NLB_data[ , c("Fast_DebitCard", "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank")]

NLB_PCAPrivacy <- NLB_data[ , c("Private_DebitCard", "Private_CreditCard", "Private_PhonePay", "Private_Neobank")]

NLB_PCAControl <- NLB_data[ , c("Control_DebitCard", "Control_CreditCard", "Control_PhonePay", "Control_Neobank")]
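
The six blocks above follow one naming pattern: an attribute prefix followed by one of four payment methods. As an alternative sketch (not the approach used in this report), the column names could be generated from that pattern, which avoids typing 24 names by hand. The objects `prefixes`, `methods` and `pca_columns` are illustrative names.

```r
# Attribute prefixes used in the column names, keyed by block name.
prefixes <- c(Safety = "Safe", Ease = "Easy", Availability = "Accept",
              Speed = "Fast", Privacy = "Private", Control = "Control")

# The four digital payment methods included in every block.
methods <- c("DebitCard", "CreditCard", "PhonePay", "Neobank")

# Build one character vector of column names per block.
pca_columns <- lapply(prefixes, function(p) paste(p, methods, sep = "_"))

pca_columns$Safety
pca_columns$Control
```

Each element can then be used to subset the data, for example `NLB_data[ , pca_columns$Safety]`.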

library(pastecs)

## 
## Attaching package: 'pastecs'

## The following objects are masked from 'package:dplyr':
## 
##     first, last

round(stat.desc(NLB_PCASafety, basic = FALSE), 2)

##              Safe_DebitCard Safe_CreditCard Safe_PhonePay Safe_Neobank
## median                 5.00            5.00          5.00         5.00
## mean                   5.14            4.95          5.17         4.83
## SE.mean                0.07            0.08          0.09         0.09
## CI.mean.0.95           0.14            0.15          0.17         0.17
## var                    1.58            1.85          2.28         2.30
## std.dev                1.26            1.36          1.51         1.52
## coef.var               0.25            0.28          0.29         0.31

round(stat.desc(NLB_PCAEase, basic = FALSE), 2)

##              Easy_DebitCard Easy_CreditCard Easy_PhonePay Easy_Neobank
## median                 7.00            7.00          7.00         6.00
## mean                   6.15            6.02          6.48         5.59
## SE.mean                0.06            0.07          0.06         0.09
## CI.mean.0.95           0.13            0.14          0.12         0.17
## var                    1.26            1.45          1.22         2.31
## std.dev                1.12            1.20          1.10         1.52
## coef.var               0.18            0.20          0.17         0.27

round(stat.desc(NLB_PCAAvailability, basic = FALSE), 2)

##              Accept_DebitCard Accept_CreditCard Accept_PhonePay Accept_Neobank
## median                   7.00              7.00            7.00           5.00
## mean                     6.48              6.34            6.03           5.08
## SE.mean                  0.05              0.06            0.07           0.10
## CI.mean.0.95             0.11              0.13            0.14           0.19
## var                      0.88              1.24            1.51           2.81
## std.dev                  0.94              1.11            1.23           1.68
## coef.var                 0.14              0.18            0.20           0.33

round(stat.desc(NLB_PCASpeed, basic = FALSE), 2)

##              Fast_DebitCard Fast_CreditCard Fast_PhonePay Fast_Neobank
## median                 7.00            7.00          7.00         6.00
## mean                   6.17            6.12          6.53         5.68
## SE.mean                0.07            0.07          0.06         0.08
## CI.mean.0.95           0.13            0.14          0.12         0.17
## var                    1.34            1.47          1.08         2.18
## std.dev                1.16            1.21          1.04         1.48
## coef.var               0.19            0.20          0.16         0.26

round(stat.desc(NLB_PCAPrivacy, basic = FALSE), 2)

##              Private_DebitCard Private_CreditCard Private_PhonePay
## median                    4.00               4.00             4.00
## mean                      4.46               4.46             4.39
## SE.mean                   0.10               0.10             0.10
## CI.mean.0.95              0.20               0.20             0.20
## var                       3.01               2.99             3.22
## std.dev                   1.73               1.73             1.80
## coef.var                  0.39               0.39             0.41
##              Private_Neobank
## median                  4.00
## mean                    4.48
## SE.mean                 0.09
## CI.mean.0.95            0.19
## var                     2.71
## std.dev                 1.64
## coef.var                0.37

round(stat.desc(NLB_PCAControl, basic = FALSE), 2)

##              Control_DebitCard Control_CreditCard Control_PhonePay
## median                    5.00               5.00             5.00
## mean                      5.15               5.01             5.16
## SE.mean                   0.10               0.10             0.11
## CI.mean.0.95              0.19               0.20             0.21
## var                       2.92               3.03             3.38
## std.dev                   1.71               1.74             1.84
## coef.var                  0.33               0.35             0.36
##              Control_Neobank
## median                  5.00
## mean                    5.12
## SE.mean                 0.10
## CI.mean.0.95            0.19
## var                     2.84
## std.dev                 1.69
## coef.var                0.33
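
The rows SE.mean, CI.mean.0.95 and coef.var in the tables above are derived from the mean and standard deviation. The base R sketch below shows what these rows are understood to represent (standard error of the mean, half-width of the t-based 95% confidence interval, and standard deviation divided by the mean); the pastecs documentation should be consulted for the exact definitions. The vector `x` holds the Safe_DebitCard values of the first six respondents only, so its results differ from the full-sample table.

```r
# Safe_DebitCard values of the first six respondents, as printed by head().
x <- c(6, 7, 6, 5, 4, 6)
n <- length(x)

# Standard error of the mean.
se_mean <- sd(x) / sqrt(n)

# Half-width of the 95% confidence interval, based on the t distribution.
ci_half <- qt(0.975, df = n - 1) * se_mean

# Coefficient of variation: spread relative to the mean.
coef_var <- sd(x) / mean(x)

c(mean = mean(x), SE.mean = se_mean,
  CI.mean.0.95 = ci_half, coef.var = coef_var)
```

As a check against the full-sample table, the Safe_DebitCard row gives 1.26 / 5.14, which rounds to the reported coef.var of 0.25.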

library(FactoMineR)

# Run one PCA per block of four items. Variables are standardised
# (scale.unit = TRUE) and only the first component is kept (ncp = 1).

# Perceived safety
components10 <- PCA(NLB_PCASafety,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)

# Ease of use
components11 <- PCA(NLB_PCAEase,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)

# Acceptance / availability
components12 <- PCA(NLB_PCAAvailability,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)

# Speed
components13 <- PCA(NLB_PCASpeed,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)

# Privacy
components14 <- PCA(NLB_PCAPrivacy,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)

# Control over spending
components15 <- PCA(NLB_PCAControl,
                  scale.unit = TRUE,
                  graph = FALSE,
                  ncp = 1)
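
Keeping a single component per block is only adequate if that component captures most of the variance of the four items. With FactoMineR this can be inspected in the `eig` element of each result (for example `components10$eig`). The sketch below illustrates the same check with base R's prcomp() in place of FactoMineR, so that it runs without the package or the data file. It uses only the first six rows of the Safe_* items as printed by head(), so the numbers it produces are not the report's results.

```r
# First six rows of the four Safe_* items, as printed by head().
safety_head <- data.frame(
  Safe_DebitCard  = c(6, 7, 6, 5, 4, 6),
  Safe_CreditCard = c(6, 7, 4, 4, 4, 6),
  Safe_PhonePay   = c(7, 3, 6, 6, 4, 5),
  Safe_Neobank    = c(6, 6, 6, 6, 4, 5)
)

# PCA on standardised variables (base R stand-in for FactoMineR::PCA).
pca_fit <- prcomp(safety_head, center = TRUE, scale. = TRUE)

# Share of total variance captured by each component.
var_share <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
round(var_share, 2)

# Loadings of the four items on the first component.
pca_fit$rotation[, 1]
```

If the first share is low, or the loadings have mixed signs, a single score may not summarise the block well.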

NLB_data$PCASafety <- components10$ind$coord[ , 1]
NLB_data$PCAEase <- components11$ind$coord[ , 1]
NLB_data$PCAAvailability <- components12$ind$coord[ , 1]
NLB_data$PCASpeed <- components13$ind$coord[ , 1]
NLB_data$PCAPrivacy <- components14$ind$coord[ , 1]
NLB_data$PCAControl <- components15$ind$coord[ , 1]
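
The sign of a principal component is arbitrary, so a high score does not automatically mean a high rating. One simple check, sketched here with base R's prcomp() on the first six rows of the Safe_* items (a stand-in for the FactoMineR results, not the report's own computation), is to correlate the score with the row mean of the items and flip the sign if the correlation is negative.

```r
# First six rows of the four Safe_* items, as printed by head().
items <- data.frame(
  Safe_DebitCard  = c(6, 7, 6, 5, 4, 6),
  Safe_CreditCard = c(6, 7, 4, 4, 4, 6),
  Safe_PhonePay   = c(7, 3, 6, 6, 4, 5),
  Safe_Neobank    = c(6, 6, 6, 6, 4, 5)
)

# First-component scores from a PCA on standardised variables.
score <- prcomp(items, center = TRUE, scale. = TRUE)$x[, 1]

# Orient the score so that higher values mean higher average ratings.
item_mean <- rowMeans(items)
if (cor(score, item_mean) < 0) {
  score <- -score
}

cor(score, item_mean)
```

The same check could be applied to each of the six scores stored in NLB_data before they are interpreted or used in further models.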

head(NLB_data)

##   Age Cash_Use Card_Use Phone_Use OtherPay_Use Cash_Up10 Cash_11_99
## 1   7        3        2         5            2         4          3
## 2   9        2        5         1            1         2          1
## 3   9        2        4         4            1         2          2
## 4   7        3        5         5            1         2          2
## 5   7        2        5         5            1         2          1
## 6   9        3        3         1            2         4          3
##   Cash_100_1000 Cash_Over1000 NoDigital_Response NoDigital_Response_Text
## 1             2             1                  2                      -2
## 2             1             1                  1                      -2
## 3             1             1                  2                      -2
## 4             1             1                  1                      -2
## 5             1             1                  1                      -2
## 6             2             1                  2                      -2
##   Income_StudJobSalary Income_PocketMoney Income_Gifts Income_Occasional
## 1                    3                  2            1                 1
## 2                    4                  2            1                 4
## 3                    3                  3            1                 4
## 4                    3                  3            2                 4
## 5                    4                  3            3                 4
## 6                    3                  4            1                 1
##   Income_Subsidy Income_Scholarship Income_Investments Save_Money Save_Form
## 1              3                  3                  3          1         5
## 2              4                  4                  3          1         7
## 3              4                  4                  2          1         7
## 4              4                  3                  4          1         7
## 5              4                  4                  4          2        -2
## 6              3                  3                  4          2        -2
##   Spend_SameForm Concern_Security LessSecure_Digital Trust_2FA Safe_CashCarry
## 1              6                3                  5         6              4
## 2              7                6                  3         6              3
## 3              6                2                  3         6              3
## 4              6                5                  6         7              4
## 5              7                4                  7         4              4
## 6              7                3                  5         7              7
##   Prefer_Digital_Convenience Prefer_Cash_Control Use_Cash_IfNoDigital
## 1                          7                   6                    7
## 2                          7                   1                    6
## 3                          7                   1                    7
## 4                          7                   1                    7
## 5                          7                   4                    7
## 6                          5                   6                    5
##   Importance_Ease Importance_Speed Importance_Availability Importance_Security
## 1               7                7                       6                   6
## 2               6                7                       5                   7
## 3               7                6                       6                   6
## 4               7                7                       7                   7
## 5               7                7                       7                   7
## 6               5                3                       7                   7
##   Importance_TrackingBudgeting Importance_Privacy Safe_Cash Safe_DebitCard
## 1                            3                  6         6              6
## 2                            7                  6         3              7
## 3                            4                  5         6              6
## 4                            7                  7         6              5
## 5                            7                  7         7              4
## 6                            4                  4         7              6
##   Safe_CreditCard Safe_PhonePay Safe_Neobank Easy_Cash Easy_DebitCard
## 1               6             7            6         5              6
## 2               7             3            6         7              7
## 3               4             6            6         4              7
## 4               4             6            6         3              6
## 5               4             4            4         7              7
## 6               6             5            5         6              7
##   Easy_CreditCard Easy_PhonePay Easy_Neobank Accept_Cash Accept_DebitCard
## 1               6             7            6           6                7
## 2               7             6            6           6                6
## 3               7             7            7           7                7
## 4               6             7            7           6                7
## 5               7             7            7           7                7
## 6               7             6            6           6                7
##   Accept_CreditCard Accept_PhonePay Accept_Neobank Fast_Cash Fast_DebitCard
## 1                 7               7              6         5              6
## 2                 6               6              6         5              7
## 3                 7               7              7         4              7
## 4                 7               6              6         1              6
## 5                 7               7              4         7              7
## 6                 7               6              6         2              7
##   Fast_CreditCard Fast_PhonePay Fast_Neobank Private_Cash Private_DebitCard
## 1               6             7            7            7                 6
## 2               7             7            7            7                 1
## 3               7             7            7            6                 5
## 4               6             7            7            7                 4
## 5               7             7            7            4                 7
## 6               7             6            6            5                 7
##   Private_CreditCard Private_PhonePay Private_Neobank Control_Cash
## 1                  5                5               6            7
## 2                  1                1               1            2
## 3                  5                5               5            5
## 4                  4                4               4            6
## 5                  7                7               7            4
## 6                  7                7               7            7
##   Control_DebitCard Control_CreditCard Control_PhonePay Control_Neobank
## 1                 5                  5                3               3
## 2                 7                  4                1               6
## 3                 7                  7                7               7
## 4                 5                  1                5               5
## 5                 7                  7                7               7
## 6                 6                  5                3               5
##   Social_Friends Social_Family Expense_Sharing Expense_Sharing_Text
## 1              5             2               2                   -2
## 2              4             4               2                   -2
## 3              4             4               2                   -2
## 4              7             7               2                   -2
## 5              4             4               2                   -2
## 6              3             5               2                   -2
##   Reason_ExactSum Reason_NoCash Reason_Convenient Reason_TrackFinances
## 1               1             0                 1                    0
## 2               1             1                 1                    0
## 3               1             1                 1                    1
## 4               1             1                 1                    1
## 5               1             1                 1                    1
## 6               1             0                 0                    0
##   Reason_Other Reason_Other_Text Concern_Fraud Concern_PersonalInfo
## 1            0                -2             3                    5
## 2            0                -2             5                    5
## 3            0                -2             3                    3
## 4            0                -2             5                    5
## 5            0                -2             7                    7
## 6            0                -2             5                    4
##   Concern_IDTheft Concern_Hacker OnlineFraud_Exp Behavior_MoreCash
## 1               3              3               2                 0
## 2               6              7               3                 0
## 3               3              3               4                -2
## 4               5              6               4                -2
## 5               7              7               1                -1
## 6               6              6               2                 1
##   Behavior_SecureDigital Behavior_Cautious Behavior_SecureOption
## 1                      0                 1                     0
## 2                      0                 1                     0
## 3                     -2                -2                    -2
## 4                     -2                -2                    -2
## 5                     -1                -1                    -1
## 6                      0                 0                     0
##   Behavior_NoChange Switch_Digital Gender Education Status_Employment
## 1                 0              2      1         6                 1
## 2                 0              2      2         5                 1
## 3                -2              2      2         4                 1
## 4                -2              1      2         4                 1
## 5                -1              1      2         4                 1
## 6                 0              4      2         6                 1
##   Status_Employment_Text Income_Level Primary_Bank Primary_Bank_Text AgeF
## 1                     -2            2            2                -2   23
## 2                     -2            2            2                -2   25
## 3                     -2            3            7         Unicredit   25
## 4                     -2            2            1                -2   23
## 5                     -2            4            1                -2   23
## 6                     -2            2            1                -2   25
##     Cash_UseF            Card_UseF           Phone_UseF OtherPay_UseF
## 1  1 per week          1-3 monthly                Daily   1-3 monthly
## 2 1-3 monthly                Daily                Never         Never
## 3 1-3 monthly Several times a week Several times a week         Never
## 4  1 per week                Daily                Daily         Never
## 5 1-3 monthly                Daily                Daily         Never
## 6  1 per week           1 per week                Never   1-3 monthly
##       Cash_Up10F    Cash_11_99F Cash_100_1000F Cash_Over1000F
## 1 More than half           Half Less than half          Never
## 2 Less than half          Never          Never          Never
## 3 Less than half Less than half          Never          Never
## 4 Less than half Less than half          Never          Never
## 5 Less than half          Never          Never          Never
## 6 More than half           Half Less than half          Never
##     NoDigital_ResponseF Income_StudJobSalaryF Income_PocketMoneyF
## 1      Pay as available             Digitally      Cash&Digitally
## 2 Pay digital elsewhere             Not using      Cash&Digitally
## 3      Pay as available             Digitally           Digitally
## 4 Pay digital elsewhere             Digitally           Digitally
## 5 Pay digital elsewhere             Not using           Digitally
## 6      Pay as available             Digitally           Not using
##    Income_GiftsF Income_OccasionalF Income_SubsidyF Income_ScholarshipF
## 1           Cash               Cash       Digitally           Digitally
## 2           Cash          Not using       Not using           Not using
## 3           Cash          Not using       Not using           Not using
## 4 Cash&Digitally          Not using       Not using           Digitally
## 5      Digitally          Not using       Not using           Not using
## 6           Cash               Cash       Digitally           Digitally
##   Income_InvestmentsF Expense_SharingF OnlineFraud_ExpF      Switch_DigitalF
## 1           Digitally      Mobile apps     Yes - others Balance digital-cash
## 2           Digitally      Mobile apps       Yes - both Balance digital-cash
## 3      Cash&Digitally      Mobile apps               No Balance digital-cash
## 4           Not using      Mobile apps               No        Fully digital
## 5           Not using      Mobile apps         Yes - me        Fully digital
## 6           Not using      Mobile apps     Yes - others           Don't know
##   GenderF      EducationF Status_EmploymentF Income_LevelF Primary_BankF
## 1     Man   Master Degree            Student   201-500 EUR           OTP
## 2   Woman Bachelor Degree            Student   201-500 EUR           OTP
## 3   Woman     High School            Student   501-800 EUR         Other
## 4   Woman     High School            Student   201-500 EUR           NLB
## 5   Woman     High School            Student  801-1300 EUR           NLB
## 6   Woman   Master Degree            Student   201-500 EUR           NLB
##    PCASafety   PCAEase PCAAvailability  PCASpeed PCAPrivacy PCAControl
## 1 -1.7204789 0.1593095      -1.2385549 0.3545114 -1.2168529 -1.2370388
## 2 -1.1689414 0.9246044       0.2539360 1.3676446  3.9978412 -0.6158432
## 3 -0.6604604 1.4565407      -1.4619345 1.3676446 -0.6395533  2.1738887
## 4 -0.2295463 0.3880241      -0.8265122 0.3545114  0.5197953 -1.2939568
## 5  1.4772851 1.4565407      -0.7917959 1.3676446 -2.9582505  2.1738887
## 6 -0.7464178 0.9246044      -0.8265122 0.7468479 -2.9582505 -0.3666127

5 Clustering

NLB_CluStd <- as.data.frame(scale(NLB_data[c("PCASafety", "PCAEase", "PCAAvailability", "PCASpeed", "PCAPrivacy", "PCAControl")]))

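# Distance of each respondent from the sample centroid (Euclidean norm of the z-scores).
# Note: this column stays in NLB_CluStd, so it also enters the distance matrix and the k-means input below.
NLB_CluStd$Dissimilarity <- sqrt(NLB_CluStd$PCASafety^2 + NLB_CluStd$PCAEase^2 + NLB_CluStd$PCAAvailability^2 + NLB_CluStd$PCASpeed^2 + NLB_CluStd$PCAPrivacy^2 + NLB_CluStd$PCAControl^2)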
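
The dissimilarity score is the Euclidean norm of each respondent's standardized profile. The following is a minimal sketch on made-up data (the data frame `toy` and its columns `a`, `b`, `c` are hypothetical, not survey variables). It shows an equivalent `rowSums()` form and keeps the score in a separate vector, so that the standardized table still holds only the variables meant for clustering.

```r
# Toy data (hypothetical values, not from the NLB survey)
set.seed(1)
toy <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))

# Standardize, then take the Euclidean norm of each row
toy_std <- as.data.frame(scale(toy))
dissimilarity <- sqrt(rowSums(toy_std^2))

# The score lives in its own vector, so toy_std still has exactly 3 columns
ncol(toy_std)

# Row indices of the three most atypical units
head(order(dissimilarity, decreasing = TRUE), 3)
```

Whether unusually distant respondents should be dropped before clustering is a judgment call that the sketch does not settle.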

library(factoextra)

## Loading required package: ggplot2

## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa

Distances <- get_dist(NLB_CluStd,
                      method = "euclidean")

fviz_dist(Distances, 
          gradient = list(low = "#230078",    # NLB INDIGO BLUE
                          mid = "#A7A8AA",    # NLB LIGHT GRAY
                          high = "white"))

NLB_CluStd <- NLB_CluStd %>% rename(Security = PCASafety, 
                                    `Ease of use` = PCAEase, 
                                    Availability = PCAAvailability, 
                                    Speed = PCASpeed, 
                                    Privacy = PCAPrivacy, 
                                    `Spending Control` = PCAControl)

library(factoextra)
get_clust_tendency(NLB_CluStd,
                   n = nrow(NLB_CluStd) -1,
                   graph = FALSE)

## $hopkins_stat
## [1] 0.6785538
## 
## $plot
## NULL

library(dplyr)
library(factoextra)
WARD <- NLB_CluStd %>%
  get_dist(method = "euclidean") %>%
  hclust(method = "ward.D2")

WARD

## 
## Call:
## hclust(d = ., method = "ward.D2")
## 
## Cluster method   : ward.D2 
## Distance         : euclidean 
## Number of objects: 304

library(factoextra)
fviz_dend(WARD)

## Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
## of ggplot2 3.3.4.
## ℹ The deprecated feature was likely used in the factoextra package.
##   Please report the issue at <https://github.com/kassambara/factoextra/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
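
A dendrogram only suggests a number of groups; `cutree()` turns it into cluster labels that can be compared with a k-means solution. The sketch below uses simulated data (the matrix `toy` is hypothetical) and base R only. The choice of four groups mirrors the k-means call later in this section and is otherwise an assumption.

```r
# Simulated, standardized data (not the survey data)
set.seed(1)
toy <- scale(matrix(rnorm(60 * 3), ncol = 3))

# Ward's method on Euclidean distances, as in the WARD object above
ward_toy <- hclust(dist(toy, method = "euclidean"), method = "ward.D2")

# Cut the tree into 4 groups
ward_groups <- cutree(ward_toy, k = 4)

# k-means with the same number of groups
km_toy <- kmeans(toy, centers = 4, nstart = 25)

# Cross-tabulate the two partitions; heavy concentration in a few cells
# means the two methods largely agree
agreement <- table(Ward = ward_groups, Kmeans = km_toy$cluster)
agreement
```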

library(factoextra)
library(NbClust)
fviz_nbclust(NLB_CluStd, kmeans, method = "wss") +
  labs(subtitle = "Elbow Method")

fviz_nbclust(NLB_CluStd, kmeans, method = "silhouette") +
  labs(subtitle = "Silhouette analysis")

Clustering <- kmeans(NLB_CluStd,
                     centers = 4,
                     nstart = 25)
Clustering

## K-means clustering with 4 clusters of sizes 63, 93, 103, 45
## 
## Cluster means:
##     Security Ease of use Availability       Speed    Privacy Spending Control
## 1  0.4866563   0.4709095  -0.15616090  0.24994550  0.8603929       -0.8773218
## 2 -0.7517767   0.7902180  -0.57596893  0.69484422 -0.8043929        0.5618051
## 3  0.1430845  -0.5409715   0.04180835 -0.07174967 -0.0235824        0.1264165
## 4  0.5448485  -1.0541669   1.31326640 -1.62170805  0.5118394       -0.2221667
##   Dissimilarity
## 1      2.469170
## 2      2.324214
## 3      1.698921
## 4      3.443997
## 
## Clustering vector:
##   [1] 2 1 2 3 2 2 3 3 3 3 2 3 1 3 2 2 1 3 3 2 1 2 2 4 3 3 3 2 2 4 1 4 3 3 3 2 2
##  [38] 4 2 2 1 2 4 3 3 4 1 3 3 2 4 3 2 2 1 3 2 2 1 3 2 4 1 1 1 1 2 3 3 2 4 1 3 2
##  [75] 2 1 2 1 3 3 1 4 3 1 1 3 3 3 2 3 4 2 1 1 2 2 3 2 4 1 1 2 2 2 3 4 1 2 2 1 3
## [112] 1 1 1 2 3 1 4 2 3 2 1 4 4 1 2 3 2 2 3 1 1 4 2 3 3 4 1 1 4 3 3 2 2 2 4 2 2
## [149] 1 3 2 3 3 1 3 3 2 3 3 2 2 3 2 4 3 3 2 1 4 4 4 2 4 1 2 4 1 1 3 2 3 3 2 3 1
## [186] 3 3 3 4 1 4 3 2 3 4 3 3 3 1 3 1 4 1 4 2 4 4 4 4 3 3 3 2 1 2 3 1 3 2 3 1 3
## [223] 3 3 3 4 2 2 4 2 2 1 2 3 3 2 2 2 2 3 2 3 1 1 3 3 2 2 2 2 2 2 2 4 2 1 3 3 1
## [260] 1 1 3 2 2 3 2 3 3 3 3 2 2 3 3 3 3 3 3 3 3 3 3 2 1 1 1 1 4 1 4 4 2 2 1 2 3
## [297] 3 4 1 4 3 4 2 4
## 
## Within cluster sum of squares by cluster:
## [1] 301.0014 275.2221 329.2284 299.6181
##  (between_SS / total_SS =  40.2 %)
## 
## Available components:
## 
## [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
## [6] "betweenss"    "size"         "iter"         "ifault"
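
The `between_SS / total_SS` figure printed above is the ratio of two components of the fitted object. Because `kmeans()` uses random starting centers, the cluster numbering can change between runs unless a seed is set. A minimal sketch on simulated data follows (the object `toy` and the seed value 123 are arbitrary choices made for this illustration).

```r
# Simulated, standardized data (not the survey data)
set.seed(42)
toy <- scale(matrix(rnorm(100 * 2), ncol = 2))

# Fix the seed immediately before each call to make the result repeatable
set.seed(123)
km1 <- kmeans(toy, centers = 3, nstart = 25)
set.seed(123)
km2 <- kmeans(toy, centers = 3, nstart = 25)
identical(km1$cluster, km2$cluster)

# Share of total variation explained by the partition
explained <- km1$betweenss / km1$totss
round(100 * explained, 1)
```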

library(factoextra)

# Define the NLB colors
NLB_colors <- c("#230078", "#84BD00", "#FA7800", "#63666A", "#A7A8AA", "black", "orange")

fviz_cluster(Clustering, 
             palette = NLB_colors, # Use NLB colors for clusters
             repel = FALSE,
             ggtheme = theme_bw(), # Black and white theme
             data = NLB_CluStd)

Averages <- Clustering$centers
Averages

##     Security Ease of use Availability       Speed    Privacy Spending Control
## 1  0.4866563   0.4709095  -0.15616090  0.24994550  0.8603929       -0.8773218
## 2 -0.7517767   0.7902180  -0.57596893  0.69484422 -0.8043929        0.5618051
## 3  0.1430845  -0.5409715   0.04180835 -0.07174967 -0.0235824        0.1264165
## 4  0.5448485  -1.0541669   1.31326640 -1.62170805  0.5118394       -0.2221667
##   Dissimilarity
## 1      2.469170
## 2      2.324214
## 3      1.698921
## 4      3.443997

Figure <- as.data.frame(Averages)
Figure$id <- 1:nrow(Figure)

library(tidyr)

## 
## Attaching package: 'tidyr'

## The following object is masked from 'package:pastecs':
## 
##     extract

Figure <- pivot_longer(Figure, cols = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"))

Figure$Group <- factor(Figure$id, 
                       levels = c(1, 2, 3, 4), 
                       labels = c("1", "2", "3", "4"))

Figure$ImeF <- factor(Figure$name, 
              levels = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"), 
              labels = c("Security", "Ease of use", "Availability", "Speed", "Privacy", "Spending Control"))


library(ggplot2)
ggplot(Figure, aes(x = ImeF, y = value)) +
  geom_hline(yintercept = 0) +
  theme_bw() +
  geom_point(aes(shape = Group, col = Group), size = 3) +
  geom_line(aes(group = id), linewidth = 1) +
  ylab("Averages") +
  xlab("Cluster variables") +
  scale_color_manual(values = NLB_colors) +  # Use NLB colors for points and lines
  ylim(-3, 3) +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.50, size = 10))

NLB_CluStd$Group <- Clustering$cluster

fit <- aov(cbind(`Security`, `Ease of use`, `Availability`, `Speed`, `Privacy`, `Spending Control`) ~ as.factor(Group), 
           data = NLB_CluStd)

summary(fit)

##  Response Security :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3  82.949 27.6495  37.695 < 2.2e-16 ***
## Residuals        300 220.051  0.7335                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Ease of use :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3 152.19  50.731  100.92 < 2.2e-16 ***
## Residuals        300 150.81   0.503                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Availability :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3 110.18  36.726   57.14 < 2.2e-16 ***
## Residuals        300 192.82   0.643                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Speed :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3 167.71  55.905  123.97 < 2.2e-16 ***
## Residuals        300 135.29   0.451                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Privacy :
##                   Df Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3 118.66  39.553   64.37 < 2.2e-16 ***
## Residuals        300 184.34   0.614                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Spending Control :
##                   Df  Sum Sq Mean Sq F value    Pr(>F)    
## as.factor(Group)   3  81.711 27.2370  36.925 < 2.2e-16 ***
## Residuals        300 221.289  0.7376                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
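
All six F-tests are significant, which is expected because the groups were formed from these same variables, so the sums of squares are more informative than the p-values. As a rough sketch, eta squared (the share of each variable's variation explained by cluster membership) can be computed from the figures printed above. The vectors below are copied by hand from that output.

```r
# Sums of squares copied from the summary(fit) output above
ss_between <- c("Security" = 82.949, "Ease of use" = 152.19,
                "Availability" = 110.18, "Speed" = 167.71,
                "Privacy" = 118.66, "Spending Control" = 81.711)
ss_residual <- c(220.051, 150.81, 192.82, 135.29, 184.34, 221.289)

# Eta squared = between-group SS / total SS
eta_sq <- ss_between / (ss_between + ss_residual)

# Variables ordered by how strongly they separate the clusters
round(sort(eta_sq, decreasing = TRUE), 3)
```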

6 Criterion validity (significant descriptors)

NLB_data$Group <- NLB_CluStd$Group

6.1 Frequency of Mobile Payment Usage per Month

NLB_data$Phone_Use_merged <- ifelse(
  NLB_data$Phone_Use == 1, "never",
  ifelse(NLB_data$Phone_Use %in% c(2, 3), "irregular basis", "regular basis")
)

# Convert the merged column to a factor with levels in the desired order
NLB_data$Phone_Use_merged <- factor(
  NLB_data$Phone_Use_merged,
  levels = c("never", "irregular basis", "regular basis")
)

chi_square <- chisq.test(NLB_data$Phone_Use_merged, as.factor(NLB_data$Group))
chi_square

## 
##  Pearson's Chi-squared test
## 
## data:  NLB_data$Phone_Use_merged and as.factor(NLB_data$Group)
## X-squared = 19.394, df = 6, p-value = 0.003547
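
The test shows that an association exists but not how strong it is. One common effect size is Cramér's V, sketched here from the printed statistic. The sample size of 304 is taken from the `hclust` output and assumes that no respondents were dropped from the cross-table.

```r
chi_sq <- 19.394   # X-squared from the output above
n      <- 304      # number of respondents (assumed: no missing values in the table)
n_rows <- 3        # levels of Phone_Use_merged
n_cols <- 4        # clusters

# Degrees of freedom should match the df = 6 printed by chisq.test()
(n_rows - 1) * (n_cols - 1)

# Cramér's V: 0 = no association, 1 = perfect association
cramers_v <- sqrt(chi_sq / (n * (min(n_rows, n_cols) - 1)))
round(cramers_v, 3)
```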

6.2 Frequency of Cash Usage for Purchases Up to 10 EUR

NLB_data$Cash_Up10_merged <- ifelse(
  NLB_data$Cash_Up10 %in% c(1, 2), "less than half",
  ifelse(NLB_data$Cash_Up10 == 3, "half", "more than half")
)

# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_Up10_merged <- factor(
  NLB_data$Cash_Up10_merged,
  levels = c("less than half", "half", "more than half")
)

chi_square <- chisq.test(NLB_data$Cash_Up10_merged, as.factor(NLB_data$Group))
chi_square

## 
##  Pearson's Chi-squared test
## 
## data:  NLB_data$Cash_Up10_merged and as.factor(NLB_data$Group)
## X-squared = 20.107, df = 6, p-value = 0.00265

7 Demographics

7.1 Demographics - significant :)

7.1.1 Frequency of Mobile Payment Usage per Month

# Calculate frequency by Mobile payment usage
Phone_freq <- NLB_data %>%
  group_by(Group, Phone_Use_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>% 
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Mobile payment usage
ggplot(Phone_freq, aes(x = Group, y = Percentage, fill = Phone_Use_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Mobile Payment Usage Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Frequency of Mobile Payment Usage"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

## 
## Attaching package: 'kableExtra'

## The following object is masked from 'package:dplyr':
## 
##     group_rows

# Assuming Phone_freq is already calculated, reshape it
Phone_wide <- Phone_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  spread(key = Phone_Use_merged, value = Percentage)  # Spread data across columns

# Create and style the wide format table with borders and grey title row
Phone_wide %>%
  kable(caption = "Mobile Payment Usage Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Mobile Payment Usage Distribution by Group (in %)
Group	never	irregular basis	regular basis
1	38.1	14.3	47.6
2	18.3	17.2	64.5
3	22.3	8.7	68.9
4	37.8	22.2	40.0
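
The within-group percentages in such a table can be cross-checked with base R. The sketch below uses a small made-up data frame (`toy`, with hypothetical columns `Group` and `Use`). Unlike the `group_by()` and `summarise()` pipeline, `table()` keeps empty combinations as 0 instead of dropping them.

```r
# Made-up example data (not the survey data)
toy <- data.frame(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  Use   = factor(c("never", "regular", "regular", "irregular",
                   "never", "never", "regular", "regular", "regular"),
                 levels = c("never", "irregular", "regular"))
)

# Row percentages: margin = 1 makes every Group row sum to 100
pct <- prop.table(table(toy$Group, toy$Use), margin = 1) * 100
round(pct, 1)
```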

7.1.2 Frequency of Cash Usage for Purchases Up to 10 EUR

# Calculate frequency by cash usage for purchases up to 10 EUR
Purchases_freq <- NLB_data %>%
  group_by(Group, Cash_Up10_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by cash usage for purchases up to 10 EUR
ggplot(Purchases_freq, aes(x = Group, y = Percentage, fill = Cash_Up10_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Cash Payment for Purchases up to 10 EUR Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Number of Cash Payments (Up to 10 EUR)"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Assuming Purchases_freq is already calculated, reshape it
Purchases_wide <- Purchases_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  spread(key = Cash_Up10_merged, value = Percentage)  # Spread data across columns

# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
  kable(caption = "Cash Usage for Purchases Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Cash Usage for Purchases Distribution by Group (in %)
Group	less than half	half	more than half
1	65.1	23.8	11.1
2	72.0	8.6	19.4
3	68.9	23.3	7.8
4	53.3	17.8	28.9

7.2 Demographics - not significant :(

7.2.1 Frequency of Card Payment Usage per Month

NLB_data$Card_Use_merged <- ifelse(
  NLB_data$Card_Use == 1, "never",
  ifelse(NLB_data$Card_Use %in% c(2, 3), "irregular basis", "regular basis"))

# Convert the merged column to a factor with levels in the desired order
NLB_data$Card_Use_merged <- factor(
  NLB_data$Card_Use_merged,
  levels = c("never", "irregular basis", "regular basis"))

# Calculate frequency by card payment usage
Card_freq <- NLB_data %>%
  group_by(Group, Card_Use_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>% 
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by card payment usage
ggplot(Card_freq, aes(x = Group, y = Percentage, fill = Card_Use_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Card Payment Usage Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Frequency of Card Payment Usage"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Assuming Card_freq is already calculated, reshape it
Card_wide <- Card_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  spread(key = Card_Use_merged, value = Percentage)  # Spread data across columns

# Create and style the wide format table with borders and grey title row
Card_wide %>%
  kable(caption = "Card Usage for Purchases Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Card Usage for Purchases Distribution by Group (in %)
Group	never	irregular basis	regular basis
1	6.3	28.6	65.1
2	10.8	28.0	61.3
3	4.9	32.0	63.1
4	6.7	26.7	66.7

7.2.2 Frequency of Other Payment (PayPal, Stripe, etc.) Usage per Month

NLB_data$OtherPay_Use_merged <- ifelse(
  NLB_data$OtherPay_Use == 1, "never",
  ifelse(NLB_data$OtherPay_Use %in% c(2, 3), "irregular basis", "regular basis"))

# Convert the merged column to a factor with levels in the desired order
NLB_data$OtherPay_Use_merged <- factor(
  NLB_data$OtherPay_Use_merged,
  levels = c("never", "irregular basis", "regular basis"))

# Calculate frequency by other payment usage
OtherPay_freq <- NLB_data %>%
  group_by(Group, OtherPay_Use_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>% 
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by other payment usage
ggplot(OtherPay_freq, aes(x = Group, y = Percentage, fill = OtherPay_Use_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Other Payment (PayPal, Stripe, etc.) Usage Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Frequency of Other Payment Usage"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Assuming OtherPay_freq is already calculated, reshape it
OtherPay_wide <- OtherPay_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  spread(key = OtherPay_Use_merged, value = Percentage)  # Spread data across columns

# Create and style the wide format table with borders and grey title row
OtherPay_wide %>%
  kable(caption = "Other Payment Methods Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Other Payment Methods Distribution by Group (in %)
Group	never	irregular basis	regular basis
1	73.0	25.4	1.6
2	57.0	38.7	4.3
3	50.5	42.7	6.8
4	64.4	33.3	2.2
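
With so few respondents in the "regular basis" column, the chi-squared approximation may be unreliable. The sketch below checks expected counts and falls back on a simulated p-value. The counts are reconstructed from the percentages above and the cluster sizes (63, 93, 103, 45) by rounding, so they are an approximation that assumes no missing answers; the seed and the number of replicates are arbitrary.

```r
# Counts reconstructed from the table above (approximate, see lead-in)
counts <- matrix(c(46, 16, 1,
                   53, 36, 4,
                   52, 44, 7,
                   29, 15, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Group = 1:4,
                                 Use = c("never", "irregular basis", "regular basis")))

# Expected counts under independence; cells below 5 are a warning sign
test_asym <- suppressWarnings(chisq.test(counts))
round(test_asym$expected, 1)

# Monte Carlo p-value, which does not rely on the large-sample approximation
set.seed(1)
test_sim <- chisq.test(counts, simulate.p.value = TRUE, B = 10000)
test_sim$p.value
```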

7.2.3 Frequency of Cash Usage for Purchases 11-99 EUR

# Merge cash usage for purchases between 11-99 EUR
NLB_data$Cash_11_99_merged <- ifelse(
  NLB_data$Cash_11_99 %in% c(1, 2), "less than half",
  ifelse(NLB_data$Cash_11_99 == 3, "half", "more than half")
)

# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_11_99_merged <- factor(
  NLB_data$Cash_11_99_merged,
  levels = c("less than half", "half", "more than half")
)

# Calculate frequency by cash usage for purchases between 11-99 EUR
Purchases1199_freq <- NLB_data %>%
  group_by(Group, Cash_11_99_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by cash usage for purchases between 11-99 EUR
ggplot(Purchases1199_freq, aes(x = Group, y = Percentage, fill = Cash_11_99_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Cash Payment for Purchases Between 11-99 EUR Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Number of Cash Payments (11-99 EUR)"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Reshape Purchases1199_freq to a wide format using pivot_wider()
Purchases_wide <- Purchases1199_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(names_from = Cash_11_99_merged, values_from = Percentage)  # Reshape data

# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
  kable(caption = "Cash Usage for Purchases (11-99 EUR) Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Cash Usage for Purchases (11-99 EUR) Distribution by Group (in %)
Group	less than half	half	more than half
1	79.4	15.9	4.8
2	76.3	12.9	10.8
3	87.4	6.8	5.8
4	66.7	17.8	15.6

7.2.4 Frequency of Cash Usage for Purchases Over 1000 EUR

# Merge cash usage for purchases over 1000 EUR
NLB_data$Cash_Over1000_merged <- ifelse(
  NLB_data$Cash_Over1000 %in% c(1, 2), "less than half",
  ifelse(NLB_data$Cash_Over1000 == 3, "half", "more than half")
)

# Convert the merged column to a factor with levels in the desired order
NLB_data$Cash_Over1000_merged <- factor(
  NLB_data$Cash_Over1000_merged,
  levels = c("less than half", "half", "more than half")
)

# Calculate frequency by cash usage for purchases over 1000 EUR
PurchasesOver1000_freq <- NLB_data %>%
  group_by(Group, Cash_Over1000_merged) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by cash usage for purchases over 1000 EUR
ggplot(PurchasesOver1000_freq, aes(x = Group, y = Percentage, fill = Cash_Over1000_merged)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Cash Payment for Purchases Over 1000 EUR Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Number of Cash Payments (Over 1000 EUR)"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Use NLB colors for the fill
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Reshape PurchasesOver1000_freq to a wide format using pivot_wider()
Purchases_wide <- PurchasesOver1000_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(names_from = Cash_Over1000_merged, values_from = Percentage)  # Reshape data

# Create and style the wide format table with borders and grey title row
Purchases_wide %>%
  kable(caption = "Cash Usage for Purchases (Over 1000 EUR) Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Cash Usage for Purchases (Over 1000 EUR) Distribution by Group (in %)
Group	less than half	half	more than half
1	87.3	1.6	11.1
2	92.5	3.2	4.3
3	92.2	1.0	6.8
4	82.2	6.7	11.1

7.2.5 Age

# Calculate percentage by Group
Age_freq <- NLB_data %>%
  group_by(Group, AgeF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

NLB_colors2 <- c("#230078",    # Deep Blue
                 "#3A1A8B",    # Purple-blue
                 "#4D33B1",    # Lighter purple-blue
                 "#5F47D8",    # Soft lavender
                 "#84BD00",    # Green (from the original)
                 "#A7C700",    # Soft green
                 "#98FA00",    # Light green
                 "#FA7800",    # Orange (from the original)
                 "#FF9A33",    # Lighter orange
                 "#FFB266")    # Light peachy-orange


# Plot the age distribution (in %) within each group
ggplot(Age_freq, aes(x = Group, y = Percentage, fill = as.factor(AgeF))) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Age Across Clusters of Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Age"
  ) +
  scale_fill_manual(values = NLB_colors2) +  # Apply the new homogeneous color palette
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Assuming Age_freq is already calculated, reshape it
Age_wide <- Age_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = AgeF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )  # Reshape data

# Create and style the wide format table with borders and grey title row
Age_wide %>%
  kable(caption = "Age Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Age Distribution by Group (in %)
Group	18	19	20	21	22	23	24	25	26	27
1	4.8	3.2	14.3	9.5	12.7	23.8	14.3	9.5	6.3	1.6
2	1.1	7.5	8.6	8.6	15.1	24.7	8.6	20.4	2.2	3.2
3	4.9	9.7	12.6	7.8	13.6	15.5	10.7	13.6	4.9	6.8
4	13.3	4.4	22.2	2.2	17.8	22.2	4.4	8.9	4.4	0.0

# Convert AgeF from factor to numeric
NLB_data$AgeF <- as.numeric(as.character(NLB_data$AgeF))

# Now aggregate and calculate the mean
Age_means <- aggregate(AgeF ~ Group, data = NLB_data, FUN = mean, na.rm = TRUE)

# Print results
print(Age_means)

##   Group     AgeF
## 1     1 22.47619
## 2     2 22.75269
## 3     3 22.49515
## 4     4 21.62222
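
The group means above differ by about one year at most, but the means alone do not show whether that difference is statistically meaningful. One option, sketched here on simulated data, is a Kruskal-Wallis test, which does not assume normally distributed ages. The data frame `toy`, its group sizes and the seed are all hypothetical.

```r
# Simulated ages for four groups (not the survey data)
set.seed(1)
toy <- data.frame(
  Group = factor(rep(1:4, times = c(20, 30, 30, 15))),
  Age   = sample(18:27, size = 95, replace = TRUE)
)

# Group means, as in the aggregate() call above
aggregate(Age ~ Group, data = toy, FUN = mean)

# Rank-based test of equal age distributions across groups
kw <- kruskal.test(Age ~ Group, data = toy)
kw$p.value
```

Applied to the real data, the equivalent call would be `kruskal.test(AgeF ~ Group, data = NLB_data)`, assuming `AgeF` has already been converted to numeric.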

7.2.6 Income

# Calculate frequency by Income (Income_LevelF) and Group
income_freq <- NLB_data %>%
  group_by(Group, Income_LevelF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Income
ggplot(income_freq, aes(x = Group, y = Percentage, fill = Income_LevelF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Income Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Income"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the previous color palette
  theme_minimal()  # Using minimal theme

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Reshape the data using pivot_wider and remove the NA column
income_wide <- income_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(names_from = Income_LevelF, values_from = Percentage) %>%
  select(-any_of("NA"))  # Drop the NA column only if it exists (no error otherwise)

# Create and style the wide format table with borders and grey title row
income_wide %>%
  kable(caption = "Income Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Income Distribution by Group (in %)
Group	0-200 EUR	201-500 EUR	501-800 EUR	801-1300 EUR	Above 1300 EUR
1	12.7	41.3	15.9	15.9	14.3
2	17.2	25.8	20.4	17.2	19.4
3	12.6	29.1	22.3	13.6	21.4
4	24.4	24.4	13.3	17.8	17.8

# Create a new column with calculated midpoints of Income Level
NLB_data <- NLB_data %>%
  mutate(Income_Mid = case_when(
    Income_LevelF == "0-200 EUR" ~ (0 + 200) / 2,
    Income_LevelF == "201-500 EUR" ~ (201 + 500) / 2,
    Income_LevelF == "501-800 EUR" ~ (501 + 800) / 2,
    Income_LevelF == "801-1300 EUR" ~ (800 + 1300) / 2,  # Lower bound taken as 800 here (not 801), giving 1050
    Income_LevelF == "Above 1300 EUR" ~ (1300 + 1700) / 2,  # Assuming 1300-1700 as range
    TRUE ~ NA_real_  # Assign NA for any unexpected values
  ))

# Print first rows to verify transformation
print(head(NLB_data[, c("Income_LevelF", "Income_Mid")]))

##   Income_LevelF Income_Mid
## 1   201-500 EUR      350.5
## 2   201-500 EUR      350.5
## 3   501-800 EUR      650.5
## 4   201-500 EUR      350.5
## 5  801-1300 EUR     1050.0
## 6   201-500 EUR      350.5

# Compute mean income for each cluster
Income_means <- aggregate(Income_Mid ~ Group, data = NLB_data, FUN = mean, na.rm = TRUE)

# Print results
print(Income_means)

##   Group Income_Mid
## 1     1   641.5556
## 2     2   711.5215
## 3     3   730.1618
## 4     4   664.9659
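
The class midpoints behind these means can also be held in a named lookup vector, which keeps the assumptions visible in one place. The sketch below reuses the midpoint values produced by the `case_when()` call above, including the assumed 1300-1700 range for the open-ended top class. The vector `income_level` is a made-up example.

```r
# Midpoints as used above; 1500 rests on the assumed 1300-1700 range
midpoints <- c("0-200 EUR" = 100, "201-500 EUR" = 350.5,
               "501-800 EUR" = 650.5, "801-1300 EUR" = 1050,
               "Above 1300 EUR" = 1500)

# Made-up answers, including a missing one
income_level <- c("201-500 EUR", "Above 1300 EUR", NA, "0-200 EUR")

# Look up each answer; missing answers stay NA
income_mid <- unname(midpoints[income_level])
income_mid
```

Because the top class has no real upper bound, means based on these midpoints are sensitive to the value chosen for it.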

7.2.7 Employment status

# Calculate frequency by employment status (Status_EmploymentF) and Group
status_freq <- NLB_data %>%
  group_by(Group, Status_EmploymentF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by status
ggplot(status_freq, aes(x = Group, y = Percentage, fill = Status_EmploymentF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Employment Status Among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage (%)",
    fill = "Employment status"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the NLB colors palette
  theme_minimal()  # Keep the minimal theme

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Reshape the data using pivot_wider
status_wide <- status_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = Status_EmploymentF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )  # Reshape data to wide format

# Create and style the wide format table with borders and grey title row
status_wide %>%
  kable(caption = "Employment Status Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Employment Status Distribution by Group (in %)
Group	Student	Employed	Self-employed	Other	Unemployed
1	82.5	14.3	1.6	1.6	0.0
2	82.8	15.1	2.2	0.0	0.0
3	72.8	17.5	3.9	2.9	2.9
4	71.1	17.8	0.0	11.1	0.0
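
The percentages in tables like this one are row percentages: each group's shares add up to 100. A minimal base R sketch of the same computation, on hypothetical toy data (the data frame `toy` and its `Status` column are illustrative, not taken from `NLB_data`):

```r
# Hypothetical toy data standing in for NLB_data.
toy <- data.frame(
  Group  = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  Status = c("Student", "Student", "Student", "Employed",
             "Student", "Employed", "Employed", "Other", "Student")
)

# Cross-tabulate, then convert counts to row percentages (margin = 1),
# so that every group's row sums to 100.
counts <- table(toy$Group, toy$Status)
pct <- round(prop.table(counts, margin = 1) * 100, 1)
print(pct)
```

Combinations that never occur show up as 0 in the cross-tabulation, which plays the same role as `values_fill` in the `pivot_wider()` call above.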

# Reclassify Employment Status
NLB_data <- NLB_data %>%
  mutate(Status_EmploymentF = case_when(
    Status_EmploymentF == "Employed" ~ "With Job",
    Status_EmploymentF == "Self-employed" ~ "With Job",
    TRUE ~ Status_EmploymentF  # Keep other categories unchanged
  ))

# Recalculate the frequency by employment status and group
status_freq <- NLB_data %>%
  group_by(Group, Status_EmploymentF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Print only the percentages for "Student" and "With Job"

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Filter and reshape the data for "Student" and "With Job"
status_wide_filtered <- status_freq %>%
  filter(Status_EmploymentF %in% c("Student", "With Job")) %>%  # Filter specific statuses
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(names_from = Status_EmploymentF, values_from = Percentage)  # Reshape data to wide format

# Create and style the wide format table
status_wide_filtered %>%
  kable(caption = "Percentage of Students and Individuals With Jobs by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Percentage of Students and Individuals With Jobs by Group (in %)
Group	Student	With Job
1	82.5	15.9
2	82.8	17.2
3	72.8	21.4
4	71.1	17.8

7.2.8 Education

# Calculate frequency by Education (Q24F) and Group
education_freq <- NLB_data %>%
  group_by(Group, EducationF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Education
ggplot(education_freq, aes(x = Group, y = Percentage, fill = EducationF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Education Level among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage",
    fill = "Education level"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the NLB colors palette
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Load required libraries
library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

# Reshape the data using pivot_wider
education_wide <- education_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = EducationF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )  # Reshape data to wide format

# Create and style the wide format table
education_wide %>%
  kable(caption = "Education Level Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Education Level Distribution by Group (in %)
Group	High School	Bachelor Degree	Master Degree	PhD	Primary school	Vocational education
1	46.0	41.3	12.7	0.0	0.0	0.0
2	44.1	32.3	21.5	2.2	0.0	0.0
3	39.8	31.1	24.3	0.0	2.9	1.9
4	46.7	24.4	17.8	0.0	11.1	0.0

7.2.9 Response

# Calculate frequency by Response
response_freq <- NLB_data %>%
  group_by(Group, NoDigital_ResponseF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Response
ggplot(response_freq, aes(x = Group, y = Percentage, fill = NoDigital_ResponseF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Responses to Merchants Not Accepting Digital Payments - Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage",
    fill = "Response"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the NLB colors palette
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Reshape the data using pivot_wider and replace NA with 0
response_wide <- response_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = NoDigital_ResponseF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )

# Create and style the wide format table
response_wide %>%
  kable(caption = "Responses to Merchants Not Accepting Digital Payments by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Responses to Merchants Not Accepting Digital Payments by Group (in %)
Group	Pay digital elsewhere	Pay as available	Never occurred	Other
1	14.3	63.5	12.7	9.5
2	22.6	64.5	7.5	5.4
3	15.5	62.1	18.4	3.9
4	13.3	68.9	17.8	0.0

7.2.10 Banks

# Calculate frequency by Banks
Banks_freq <- NLB_data %>%
  group_by(Group, Primary_BankF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Bank
ggplot(Banks_freq, aes(x = Group, y = Percentage, fill = Primary_BankF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Primary Banks Used by Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage",
    fill = "Bank"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the NLB colors palette
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Reshape the data using pivot_wider and replace NA with 0
banks_wide <- Banks_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = Primary_BankF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )

# Create and style the wide format table
banks_wide %>%
  kable(caption = "Distribution of Primary Banks Used by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Distribution of Primary Banks Used by Group (in %)
Group	NLB	OTP	Intesa Sanpaolo	Addiko Bank	Delavska Hranilnica	Other	Sparkasse
1	39.7	36.5	9.5	1.6	3.2	9.5	0.0
2	31.2	36.6	7.5	2.2	5.4	11.8	5.4
3	48.5	34.0	2.9	2.9	7.8	2.9	1.0
4	37.8	37.8	2.2	4.4	8.9	6.7	2.2

7.2.11 Gender

# Calculate frequency by Gender
Gender_freq <- NLB_data %>%
  group_by(Group, GenderF) %>%
  summarise(Count = n(), .groups = 'drop') %>%
  group_by(Group) %>%
  mutate(Percentage = Count / sum(Count) * 100)

# Plot the frequency by Gender
ggplot(Gender_freq, aes(x = Group, y = Percentage, fill = GenderF)) +
  geom_bar(stat = "identity", position = "stack") +
  labs(
    title = "Distribution of Gender among Young People (18-27 y.o.)",
    x = "Group",
    y = "Percentage",
    fill = "Gender"
  ) +
  scale_fill_manual(values = NLB_colors) +  # Apply the NLB colors palette
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels if needed

# Reshape the data using pivot_wider and replace NA with 0
gender_wide <- Gender_freq %>%
  select(-Count) %>%  # Remove the Count column (optional)
  pivot_wider(
    names_from = GenderF,
    values_from = Percentage,
    values_fill = list(Percentage = 0)  # Replace NA with 0
  )

# Create and style the wide format table
gender_wide %>%
  kable(caption = "Gender Distribution by Group (in %)", digits = 1) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = F) %>%
  row_spec(0, background = "lightgrey", bold = TRUE, color = "black") %>%  # Style header row
  column_spec(1, bold = TRUE) %>%  # Make the first column bold (Group column)
  kable_styling(bootstrap_options = "bordered")  # Add borders around the table

Gender Distribution by Group (in %)
Group	Man	Woman	Other	Don’t want to answer
1	33.3	65.1	1.6	0.0
2	40.9	58.1	0.0	1.1
3	40.8	59.2	0.0	0.0
4	42.2	57.8	0.0	0.0

8 Cluster description

8.1 Group 1: The Traditionalists

Group demographics: The average age in this group is slightly higher compared to other groups, indicating that they may be in later stages of their education or early stages of employment. Women make up the majority (65.1%), with 33.3% men and 1.6% identifying as other. In terms of education, 46.0% have completed high school, 41.3% hold a bachelor’s degree and 12.7% a master’s degree. The most popular financial institutions in Group 1 are traditional banks, with NLB (39.7%) and OTP Bank (36.5%) being the most commonly used. This suggests a preference for well-established institutions over newer, digital-first banking solutions. Most people in Group 1 are students (82.5%), while 15.9% have a job, and more than half have incomes in the lower-to-middle range (201-800 EUR, 57.2%).

Group behaviors: When it comes to payment behavior, Group 1 is the least engaged with mobile and digital payment solutions. A significant portion of the group never uses mobile payments, and only a small percentage make regular use of them. Instead, cash remains the preferred payment method, even for small transactions. Many individuals in this group use cash for at least half of their purchases under 10 EUR, and some rely on cash even for larger expenses.

The hesitancy toward digital payments is not due to security concerns but is more about habit and preference. Many in this group view cash as a more tangible and controlled way of managing their finances, avoiding the potential pitfalls of overspending that can accompany digital transactions. These individuals can be best described as practical, cautious, and financially disciplined. Their reliance on cash payments reflects a strong preference for traditional financial management, possibly influenced by family habits or a general skepticism toward modern financial technologies. They are likely to:

Be conservative spenders: They prioritize budgeting and may be reluctant to adopt new payment technologies.
Prefer familiarity over convenience: They tend to stick with what they know and trust rather than experimenting with new financial tools.
Avoid financial risks: Their cautious nature extends beyond payments, likely influencing other financial decisions such as avoiding credit or loans.

8.2 Group 2: The Security Freaks

Group demographics: The average age in this group is 23. In the group, 40.9% are men, 58.1% are women and 1.1% did not want to answer. The composition of education is as follows: 44.1% have finished high school, 32.3% have a bachelor’s degree, 21.5% have a master’s degree and 2.2% have a PhD. The largest share uses OTP Bank (36.6%), followed by NLB (31.2%), with smaller percentages for Other banks (11.8%), Intesa Sanpaolo (7.5%), Delavska Hranilnica (5.4%), Sparkasse (5.4%) and Addiko Bank (2.2%). Most people in Group 2 are students (82.8%), followed by employed individuals at 15.1% and self-employed at 2.2%. The most common income range is 201-500 EUR with 25.8%, followed by 501-800 EUR with 20.4%, Above 1300 EUR with 19.4%, and 0-200 EUR and 801-1300 EUR with 17.2% each.

Group behaviors: Regarding mobile phone payments, 38.1% of people never use their phones to pay, 14.3% pay on an irregular basis and 47.6% of the group pay on a regular basis. In Group 2, the majority of people (72.0%) reported using cash for less than half of their purchases up to 10 EUR, followed by more than half at 19.4% and half at 8.6%.

These young adults from Slovenia are highly aware of online risks and tend to be cautious, perhaps due to personal concerns about fraud or privacy breaches. Despite their mistrust of digital payments regarding security and privacy, they still use them regularly, likely valuing convenience over their worries. They feel digital payments are easy to use and provide excellent control over spending, making them financially conscious. These individuals may display characteristics like:

Cautious and analytical: They weigh risks before adopting technologies and prefer transparency.
Efficiency seekers: They want fast, intuitive systems but expect them to ensure a high level of data protection.
Emotionally conflicted: Their reliance on digital payments may cause underlying anxiety due to distrust in security systems.
Hobbies & Interests: Budget planning, cybersecurity awareness, or exploring apps that promote secure spending habits.

8.3 Group 3: The Balanced Users

Group demographics: Group 3 strikes a balance between digital and cash-based payments, making them the most financially adaptable of all clusters. The average age in this group is similar to Group 2, and the gender split is 59.2% women and 40.8% men. The majority of individuals have completed high school or higher education, with a notable proportion holding bachelor’s (31.1%) or master’s (24.3%) degrees. When it comes to banking, NLB is the most common primary bank (48.5%), followed by OTP (34.0%), with the remainder spread across smaller banks. Their employment status reflects a mix of students (72.8%) and working professionals (21.4%), suggesting they are in a transition phase between education and financial independence. Income levels are widely distributed, from low-income students to those earning above 1300 EUR, reinforcing their financial diversity.

Group behaviors: Unlike Group 1, these individuals are more flexible in their payment choices. While they still use cash, they are comfortable switching to digital payments depending on the context. Their use of mobile and card payments is moderate, meaning they neither fully embrace digital finance nor reject it entirely. In small transactions (e.g., purchases up to 10 EUR), most individuals use cash sparingly, though some still prefer it for specific purchases. They are less resistant to financial technology than Group 1 but also less eager to rely on digital payments than Groups 2 and 4. Their behavior suggests that convenience, rather than ideology, guides their payment choices.

Group 3 consists of rational, pragmatic, and tech-aware individuals who make financial decisions based on practicality rather than strict preferences. Unlike the Security Freaks (Group 2), they don’t have major concerns about digital security, and unlike the Speeders (Group 4), they don’t demand ultra-fast transactions. Instead, they choose what works best for them in the moment.

Practical decision-makers: They adapt their payment methods based on the situation rather than strong personal biases.
Financially balanced: They are mindful of their spending but not overly cautious or restrictive.
Tech-aware but not dependent: They use fintech when it adds value but don’t abandon cash entirely.

8.4 Group 4: The Speeders

Group demographics: The average age in this group is 22. In the group, 42.2% are men and 57.8% are women. The composition of education is as follows: 11.1% have finished primary school, 46.7% have finished high school, 24.4% have a bachelor’s degree and 17.8% have a master’s degree. OTP Bank and NLB are the most commonly used banks, each with 37.8% of users, followed by Delavska Hranilnica with 8.9%, Other banks with 6.7%, Addiko Bank with 4.4%, and Intesa Sanpaolo and Sparkasse with 2.2% each. The most common status is student with 71.1%, followed by employed individuals at 17.8% and others at 11.1%. The 0-200 EUR and 201-500 EUR income ranges are the most common, each with 24.4%, followed by 801-1300 EUR and Above 1300 EUR, both at 17.8%, 501-800 EUR with 13.3%, and missing income data at 2.2%.

Group behaviors: Regarding mobile phone payments, 37.8% of people never use their phones to pay, 22.2% pay on an irregular basis and 40% of the group pay on a regular basis. In Group 4, 53.3% of people reported using cash for less than half of their purchases up to 10 EUR, followed by more than half at 28.9% and half at 17.8%.

This group views digital payments favorably in terms of security and privacy but paradoxically struggles with perceived inefficiencies or complexity. They might get frustrated with transaction errors, processing times, or navigating payment interfaces. Some rarely use digital payments, while others are regular users, creating a spectrum of familiarity within the group. Traits include:

Impatient and task-focused: They seek quick solutions and may dislike anything that delays or complicates payment processes.
Trusting and privacy-oriented: They prioritize systems that protect their personal information and value reliability in payments.
Emotionally reactive: These individuals may become easily frustrated with poorly designed apps but feel reassured knowing their money is safe.
Hobbies & Interests: Tech innovations, time-saving gadgets, and exploring payment platforms that promise better user experiences.

9 Perception map

library(pastecs)

NLB_PCA <- NLB_data[ , c("Safe_Cash", "Safe_DebitCard", "Safe_CreditCard", "Safe_PhonePay", "Safe_Neobank", 
                         "Easy_Cash", "Easy_DebitCard", "Easy_CreditCard", "Easy_PhonePay", "Easy_Neobank", 
                         "Accept_Cash", "Accept_DebitCard", "Accept_CreditCard", "Accept_PhonePay", "Accept_Neobank", 
                         "Fast_Cash", "Fast_DebitCard", "Fast_CreditCard", "Fast_PhonePay", "Fast_Neobank", 
                         "Private_Cash", "Private_DebitCard", "Private_CreditCard", "Private_PhonePay", "Private_Neobank", 
                         "Control_Cash", "Control_DebitCard", "Control_CreditCard", "Control_PhonePay", "Control_Neobank")]

library(dplyr)

NLB_PCA <- NLB_PCA %>%
  rename(
    Cash_Security = Safe_Cash,
    Debit_Security = Safe_DebitCard,
    Credit_Security = Safe_CreditCard,
    Mobile_Security = Safe_PhonePay,
    NeoBanks_Security = Safe_Neobank,
    
    Cash_Ease = Easy_Cash,
    Debit_Ease = Easy_DebitCard,
    Credit_Ease = Easy_CreditCard,
    Mobile_Ease = Easy_PhonePay,
    NeoBanks_Ease = Easy_Neobank,
    
    Cash_Availability = Accept_Cash,
    Debit_Availability = Accept_DebitCard,
    Credit_Availability = Accept_CreditCard,
    Mobile_Availability = Accept_PhonePay,
    NeoBanks_Availability = Accept_Neobank,
    
    Cash_Speed = Fast_Cash,
    Debit_Speed = Fast_DebitCard,
    Credit_Speed = Fast_CreditCard,
    Mobile_Speed = Fast_PhonePay,
    NeoBanks_Speed = Fast_Neobank,
    
    Cash_Privacy = Private_Cash,
    Debit_Privacy = Private_DebitCard,
    Credit_Privacy = Private_CreditCard,
    Mobile_Privacy = Private_PhonePay,
    NeoBanks_Privacy = Private_Neobank,
    
    Cash_Control = Control_Cash,
    Debit_Control = Control_DebitCard,
    Credit_Control = Control_CreditCard,
    Mobile_Control = Control_PhonePay,
    NeoBanks_Control = Control_Neobank)

library(tibble)
perceptual <- NLB_PCA %>% 
  pivot_longer(everything(), names_to = "name", values_to = "score")  %>% 
  separate(name, into = c("Payment method", "Variable"), sep = "_")%>% 
  pivot_wider(names_from = Variable, values_from = score, values_fn = mean) %>%
  column_to_rownames(var = "Payment method")

print(perceptual)

##          Security     Ease Availability    Speed  Privacy  Control
## Cash     5.503289 5.101974     6.226974 4.075658 5.680921 4.851974
## Debit    5.138158 6.148026     6.483553 6.174342 4.457237 5.154605
## Credit   4.947368 6.019737     6.335526 6.121711 4.457237 5.006579
## Mobile   5.174342 6.480263     6.029605 6.526316 4.394737 5.161184
## NeoBanks 4.825658 5.592105     5.075658 5.684211 4.483553 5.115132

library(FactoMineR)
pca <- PCA(perceptual, 
           scale.unit = TRUE, 
           graph = FALSE,
           ncp = 2)

print(pca$var$cor)

##                   Dim.1        Dim.2
## Security     -0.7563320  0.534605714
## Ease          0.8659604  0.465478553
## Availability -0.1808968  0.942766082
## Speed         0.9753947  0.204881709
## Privacy      -0.9934258  0.026725632
## Control       0.9275822 -0.001610424

library(factoextra)
fviz_pca_biplot(pca, 
                repel = TRUE)

9.1 Interpretation of the PCA Biplot (Perception Map)

This Principal Component Analysis (PCA) biplot provides a visual representation of financial payment methods and how they relate to key perceptual factors such as security, privacy, speed, availability, control, and ease of use.

9.1.1 Axes Interpretation:

Dim1 (69.2%) and Dim2 (23.9%) together explain 93.1% of the variance, meaning most of the variation in the data can be understood from this two-dimensional representation.
Dim1 (X-axis) likely captures a contrast between traditional vs. modern payment methods, with Cash positioned far to the left and Mobile payments, Credit, and Debit cards toward the right.
Dim2 (Y-axis) is driven mainly by Availability (correlation 0.94), with moderate contributions from Security (0.53) and Ease (0.47), so it separates widely accepted payment methods from less widely accepted ones.
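
The explained-variance shares quoted above can be cross-checked without FactoMineR. The sketch below re-enters the printed means from the perceptual table and runs base R's `prcomp()` on standardized variables. It is an illustrative check, not the method used in this report. The sign of each component is arbitrary, so the correlations may come out mirrored relative to the `PCA()` output.

```r
# Mean ratings copied from the printed perceptual table.
perceptual <- data.frame(
  Security     = c(5.503289, 5.138158, 4.947368, 5.174342, 4.825658),
  Ease         = c(5.101974, 6.148026, 6.019737, 6.480263, 5.592105),
  Availability = c(6.226974, 6.483553, 6.335526, 6.029605, 5.075658),
  Speed        = c(4.075658, 6.174342, 6.121711, 6.526316, 5.684211),
  Privacy      = c(5.680921, 4.457237, 4.457237, 4.394737, 4.483553),
  Control      = c(4.851974, 5.154605, 5.006579, 5.161184, 5.115132),
  row.names    = c("Cash", "Debit", "Credit", "Mobile", "NeoBanks")
)

# PCA on centered and scaled variables.
pca_base <- prcomp(perceptual, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component.
explained <- pca_base$sdev^2 / sum(pca_base$sdev^2)
print(round(explained[1:2], 3))

# Correlation of each attribute with the first two components.
var_cor <- cor(perceptual, pca_base$x[, 1:2])
print(round(var_cor, 3))
```

With only five payment methods as observations, the map summarizes average perceptions and says nothing about variation between individual respondents.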

9.1.2 Key Observations:

Cash vs. Digital Payments
- Cash is located far left, strongly associated with privacy and security but negatively associated with factors like speed, ease of use, and control.
- Credit, Debit, and Mobile Payments are clustered together on the right, indicating that they share common characteristics, particularly ease of use, speed, and control.
NeoBanks (Fintech-Driven Banking)
- NeoBanks are positioned lower in Dim2, which mainly reflects their lower perceived availability (5.08, the lowest of all methods).
- They are also less associated with security and privacy, as they are further away from these vectors; their security rating (4.83) is the lowest of all methods.
Direction and Meaning of Vectors (Arrows):
- Security points toward the upper left and Privacy almost directly left, aligning with Cash, which receives the highest average ratings on both factors.
- Speed, Ease of Use, and Control point toward the right, aligning with Mobile and Digital payment methods, suggesting that these are perceived as efficient and user-friendly.
- Availability points upwards and is almost unrelated to Dim1, which suggests that perceived acceptance varies largely independently of the cash vs. digital contrast.

9.1.3 Conclusion:

This perception map illustrates the trade-offs in different payment methods:
- Cash is viewed as the safest and most private option, but at the cost of convenience and speed.
- Digital payment methods (Credit, Debit, Mobile) are associated with efficiency, ease of use, and control, but potentially at the expense of security and privacy.
- NeoBanks are viewed as a distinct category, perhaps due to concerns over trust or unfamiliarity compared to traditional banking systems.

This visualization helps us understand consumer preferences when choosing payment methods and their perceived benefits and drawbacks.

10 Hypotheses

10.1 Majority of young people use cash at least once a month.

Let’s check whether the assumptions for a parametric test are met:

n * 𝜋> 5 -> 304 * 0.5 = 152 > 5

n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5

Both assumptions are met, so we can proceed with the parametric test: the test of population proportion.

H0: π = 0.5

H1: π > 0.5

sum(NLB_data$Q3a > 1, na.rm = TRUE)

## [1] 0

prop.test(x = 281,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  281 out of 304, null probability 0.5
## X-squared = 218.96, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.8954807 1.0000000
## sample estimates:
##         p 
## 0.9243421

We reject H0 at p<0.001: the proportion of young people who still use cash at least once a month is larger than 50%.
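
The X-squared value reported by `prop.test()` without continuity correction is the square of the usual z statistic for a single proportion. A minimal sketch that recomputes it by hand from the counts used above (the variable names are illustrative):

```r
# Counts from the test above.
x   <- 281   # respondents who use cash at least once a month
n   <- 304   # sample size
pi0 <- 0.5   # proportion under H0

# z statistic for one proportion, using the standard error under H0.
p_hat <- x / n
z <- (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

# One-sided p-value for H1: pi > pi0.
p_value <- pnorm(z, lower.tail = FALSE)

# Same test via prop.test(); its X-squared statistic should equal z^2.
res <- prop.test(x = x, n = n, p = pi0, correct = FALSE,
                 alternative = "greater")
print(c(z = z, z_squared = z^2, X_squared = unname(res$statistic)))
```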

10.2 Cash is primarily used by young people (18–27) when digital payment methods are unavailable.

To test this, we will do a test of population proportion. Let’s check if assumptions are met.

n * 𝜋> 5 -> 304 * 0.5 = 152 > 5

n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5

Assumptions are met, so we can proceed with the test of population proportion.

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

sum(NLB_data$Q9h > 5, na.rm = TRUE)

## [1] 0

prop.test(x = 200,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  200 out of 304, null probability 0.5
## X-squared = 30.316, df = 1, p-value = 1.836e-08
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.6119223 1.0000000
## sample estimates:
##         p 
## 0.6578947

We reject H0. The proportion of people who agree or completely agree that they use cash when digital payments are not available is statistically larger than 50% (p<0.001).
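
The success count passed to `prop.test()` can be derived from the data instead of being typed in, which keeps the count and the test in sync. The sketch below uses simulated answers in a hypothetical column `Agree_CashFallback` (not a column of `NLB_data`). It also checks that the column exists first, because `$` on a data frame returns `NULL` for an unknown name and the sum then silently evaluates to 0. The `## [1] 0` outputs in this section are consistent with that behavior, although the report does not confirm the cause.

```r
# Simulated 7-point answers in a hypothetical column.
set.seed(1)
toy <- data.frame(Agree_CashFallback = sample(1:7, 50, replace = TRUE))

# Guard: fail loudly if the column name is wrong.
col <- "Agree_CashFallback"
stopifnot(col %in% names(toy))

answers <- toy[[col]]
x <- sum(answers > 5, na.rm = TRUE)  # "agree" or "completely agree"
n <- sum(!is.na(answers))            # valid answers only

res <- prop.test(x = x, n = n, p = 0.5, correct = FALSE,
                 alternative = "greater")
print(res$estimate)

# Without the guard, a misspelled column name yields 0 instead of an error.
missing_count <- sum(toy$No_Such_Column > 5, na.rm = TRUE)
print(missing_count)
```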

10.3 Most young people use cash for small payments (up to 10 EUR).

To test this, we will do a test of population proportion. Let’s check if assumptions are met.

n * 𝜋> 5 -> 304 * 0.5 = 152 > 5

n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5

Assumptions are met, so we can proceed with the test of population proportion.

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

sum(NLB_data$Q4a > 1, na.rm = TRUE)

## [1] 0

prop.test(x = 270,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  270 out of 304, null probability 0.5
## X-squared = 183.21, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.8549349 1.0000000
## sample estimates:
##         p 
## 0.8881579

We reject H0 at p<0.001. More than 50% of young people use cash for their small payments (up to 10 EUR) at least some of the time.

Let’s run the same test to check whether most people use cash for less than half of their small payments.

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

sum(NLB_data$Q4a == 2, na.rm = TRUE)

## [1] 0

prop.test(x = 169,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  169 out of 304, null probability 0.5
## X-squared = 3.8026, df = 1, p-value = 0.02559
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.5087589 1.0000000
## sample estimates:
##         p 
## 0.5559211

We reject H0 at p = 0.026. The proportion of people who use cash for less than half of their small payments is greater than 50%.
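
With `correct = FALSE`, the confidence bound printed by `prop.test()` is a Wilson score bound, and for a one-sided test it uses the one-sided critical value. A minimal sketch that recomputes the lower bound for the last test by hand (variable names are illustrative):

```r
# Counts from the test above.
x <- 169
n <- 304
conf_level <- 0.95

p_hat <- x / n
z <- qnorm(conf_level)  # one-sided critical value, about 1.645

# Wilson score lower bound.
centre <- p_hat + z^2 / (2 * n)
half   <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2))
wilson_lower <- (centre - half) / (1 + z^2 / n)

# Compare with the bound returned by prop.test().
res <- prop.test(x = x, n = n, p = 0.5, correct = FALSE,
                 alternative = "greater")
print(c(manual = wilson_lower, prop_test = res$conf.int[1]))
```

The bound lies only slightly above 0.5, so this result is much less decisive than the other tests in this section.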

10.4 Young people who save, mainly save in cash.

NLB_data_save <- NLB_data %>% select(20)

NLB_data_save$Save_Form <-ifelse(test = NLB_data_save$Save_Form == -2,
                              yes = NA,
                              no = NLB_data_save$Save_Form)

library(tidyr)
NLB_data_save <- drop_na(NLB_data_save)

sum(NLB_data_save$Save_Form < 4, na.rm = TRUE)

## [1] 34

NLB_data_save$Save_FormF<- factor(NLB_data_save$Save_Form,
                       levels = c(1, 2, 3, 4, 5, 6, 7),
                       labels = c("Fully in cash", "Up to 25% in cash", "25% and 50% in cash", "About 50% in cash, 50% digital", "25% to 50% digital", "Up to 25% digital", "Fully digital"))

library(ggplot2)
library(dplyr)

# Calculate percentages
data_percent <- NLB_data_save %>%
  count(Save_FormF) %>%
  mutate(percentage = (n / sum(n)) * 100)

# Create the bar plot with NLB colors
ggplot(data_percent, aes(x = Save_FormF, y = percentage)) +
  geom_bar(stat = "identity", fill = NLB_colors[1], color = "black") +  # Apply NLB Indigo Blue color
  labs(title = "How do you save?", 
       x = "Preference", 
       y = "Percentage") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

We have 233 people who save in our sample. Based on the frequency graph and the calculations, only 34 of them (14.6%) save more than 50% in cash. Based on this, we reject our hypothesis that young people who save mostly do so in cash.
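
The decision above is descriptive. If a formal check were wanted, one option would be an exact binomial test of whether the share of savers who save mostly in cash exceeds 50%. This is a sketch of how that could look, assuming the counts 34 and 233 reported above; it is not part of the original analysis.

```r
# Counts reported above.
mostly_cash <- 34    # savers who keep more than 50% of savings in cash
savers      <- 233   # respondents who save

# Exact binomial test of H0: pi = 0.5 against H1: pi > 0.5.
res <- binom.test(mostly_cash, savers, p = 0.5, alternative = "greater")
print(res$estimate)
print(res$p.value)
```

A large p-value here would mean there is no evidence that the majority of savers save mainly in cash, in line with the descriptive conclusion.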

10.5 Young people (18–27) mostly use the money in the same form they receive it.

To test this, we will do a test of population proportion. Let’s check if assumptions are met.

n * 𝜋> 5 -> 304 * 0.5 = 152 > 5

n(1-𝜋) > 5 -> 304 * 0.5 = 152 > 5

Assumptions are met, so we can proceed with the test of population proportion.

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

NLB_form <- NLB_data %>% select(21)

sum(NLB_form$Spend_SameForm > 4, na.rm = TRUE)

## [1] 207

prop.test(x = 207,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  207 out of 304, null probability 0.5
## X-squared = 39.803, df = 1, p-value = 1.405e-10
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.6355172 1.0000000
## sample estimates:
##         p 
## 0.6809211

We reject H0 at p<0.001. We have found that more than 50% of young people somewhat agree, agree or completely agree that they tend to use money in the same form as they receive it.

10.6 The convenience of digital payments is a more important factor influencing adoption among young people (18–27) than security.

library(dplyr)

NLB_R6H6 <- NLB_data %>% select(c(29, 32))

head(NLB_R6H6)

##   Importance_Ease Importance_Security
## 1               7                   6
## 2               6                   7
## 3               7                   6
## 4               7                   7
## 5               7                   7
## 6               5                   7

str(NLB_R6H6)

## 'data.frame':    304 obs. of  2 variables:
##  $ Importance_Ease    : num  7 6 7 7 7 5 6 6 4 7 ...
##  $ Importance_Security: num  6 7 6 7 7 7 5 7 7 7 ...

NLB_R6H6 <- NLB_R6H6 %>% mutate_all(as.numeric)

str(NLB_R6H6)

## 'data.frame':    304 obs. of  2 variables:
##  $ Importance_Ease    : num  7 6 7 7 7 5 6 6 4 7 ...
##  $ Importance_Security: num  6 7 6 7 7 7 5 7 7 7 ...

summary(NLB_R6H6)

##  Importance_Ease Importance_Security
##  Min.   :1.000   Min.   :1.000      
##  1st Qu.:6.000   1st Qu.:5.000      
##  Median :7.000   Median :7.000      
##  Mean   :6.105   Mean   :6.076      
##  3rd Qu.:7.000   3rd Qu.:7.000      
##  Max.   :7.000   Max.   :7.000

NLB_R6H6 <- na.omit(NLB_R6H6)

# Calculate the mean of column 29 (Importance_Ease; "enostavnost" = ease)
mean_enostavnost <- mean(NLB_R6H6[[1]], na.rm = TRUE)

print(mean_enostavnost)

## [1] 6.105263

# Calculate the mean of Importance_Security (column 32 of NLB_data; "Varnost" = security)
mean_varnost <- mean(NLB_R6H6[[2]], na.rm = TRUE)

print(mean_varnost)

## [1] 6.075658

NLB_R6H6$diffs <- NLB_R6H6$Importance_Ease - NLB_R6H6$Importance_Security

shapiro_result <- shapiro.test(NLB_R6H6$diffs)

print(shapiro_result)

## 
##  Shapiro-Wilk normality test
## 
## data:  NLB_R6H6$diffs
## W = 0.8771, p-value = 6.971e-15

Since the p-value is far below 0.05, we reject the null hypothesis that the paired differences (Importance_Ease minus Importance_Security) are normally distributed.

The paired t-test assumes normally distributed differences, so we use a non-parametric alternative, the Wilcoxon signed-rank test, instead.

wilcox.test(
  NLB_R6H6[[1]], 
  NLB_R6H6[[2]], 
  paired = TRUE, 
  correct = FALSE, 
  exact = FALSE, 
  alternative = "two.sided")

## 
##  Wilcoxon signed rank test
## 
## data:  NLB_R6H6[[1]] and NLB_R6H6[[2]]
## V = 5908, p-value = 0.537
## alternative hypothesis: true location shift is not equal to 0

The Wilcoxon signed-rank test gives p = 0.537. The test was run two-sided, so it checks for a difference in either direction. Since the p-value is greater than 0.05, we fail to reject the null hypothesis: there is no statistically significant difference between the perceived importance of convenience (Importance_Ease) and security (Importance_Security). This is consistent with the sample means, which are almost identical (6.11 vs. 6.08).

Therefore, our data do not support the hypothesis that convenience is a more important factor than security in digital payment adoption among young people (18–27). Both factors appear to be similarly important to young users.
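The hypothesis is directional (convenience more important than security), while the test above is two-sided. As a sketch of how a directional version could be run, the example below uses simulated 7-point ratings rather than the survey data; `ease` and `security` are hypothetical vectors introduced only for illustration.

```r
set.seed(1)  # simulated data only, for illustration

# Hypothetical paired ratings on a 1-7 scale (not the NLB survey data)
ease     <- sample(1:7, 100, replace = TRUE)
security <- sample(1:7, 100, replace = TRUE)

# H1: ratings of ease tend to be higher than ratings of security
one_sided <- wilcox.test(ease, security,
                         paired = TRUE,
                         correct = FALSE,
                         exact = FALSE,
                         alternative = "greater")

# Same settings as in the analysis above, for comparison
two_sided <- wilcox.test(ease, security,
                         paired = TRUE,
                         correct = FALSE,
                         exact = FALSE,
                         alternative = "two.sided")

c(one_sided = one_sided$p.value, two_sided = two_sided$p.value)
```

With the normal approximation and no continuity correction, the two-sided p-value is twice the smaller of the two one-sided p-values. A two-sided p of 0.537 therefore corresponds to a one-sided p of about 0.27 or 0.73, neither of which is significant at the 5% level.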

10.7 Young people (18–27) mostly prefer a society that predominantly uses digital payments but remain reluctant to completely eliminate cash.

library(dplyr)

NLB_R4H4 <- NLB_data %>% select(85)

NLB_R4H4$Switch_Digital[NLB_R4H4$Switch_Digital == -1] <- 4

NLB_R4H4$Q22F <- factor(NLB_R4H4$Switch_Digital,
                       levels = c(1, 2, 3, 4),
                       labels = c("Fully digital", "Balance digital-cash", "Cash", "Don't know"))

library(ggplot2)
library(dplyr)

# Calculate percentages
data_percent <- NLB_R4H4 %>%
  count(Q22F) %>%
  mutate(percentage = (n / sum(n)) * 100)

# Create the bar plot with percentages
ggplot(data_percent, aes(x = Q22F, y = percentage)) +
  geom_bar(stat = "identity", fill = NLB_colors[1], color = "black") +  # Apply NLB Blue color
  labs(title = "Would you switch to a fully digital society?", 
       x = "Preference", 
       y = "Percentage") +
  theme_minimal()

head(data_percent)

##                   Q22F   n percentage
## 1        Fully digital  78  25.657895
## 2 Balance digital-cash 171  56.250000
## 3                 Cash  44  14.473684
## 4           Don't know  11   3.618421

From the graph, we can clearly see that the largest share of respondents (56.3%) would consider switching to a mostly digital society, but they do not want to give up cash entirely.

To further test this, we will do a test of population proportion. Let’s check if assumptions are met.

n * 𝜋 > 5 -> 304 * 0.5 = 152 > 5

n * (1 - 𝜋) > 5 -> 304 * (1 - 0.5) = 152 > 5

Assumptions are met, so we can proceed with the test of population proportion.
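The same check can be written as a small helper so that it is repeated consistently for each hypothesis. This is only a sketch; `check_np` is a hypothetical function name and the threshold of 5 follows the rule of thumb used above.

```r
# Hypothetical helper: checks n * p0 > 5 and n * (1 - p0) > 5
check_np <- function(n, p0, threshold = 5) {
  expected <- c(successes = n * p0, failures = n * (1 - p0))
  list(expected = expected, ok = all(expected > threshold))
}

# Values for this hypothesis: n = 304, hypothesised proportion 0.5
check <- check_np(n = 304, p0 = 0.5)
check$expected
check$ok
```

For n = 304 and a hypothesised proportion of 0.5, both expected counts are 152, so the check passes.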

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

prop.test(x = 171,
          n = 304,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  171 out of 304, null probability 0.5
## X-squared = 4.75, df = 1, p-value = 0.01465
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.5153528 1.0000000
## sample estimates:
##      p 
## 0.5625

We reject H0 at p = 0.015. The population proportion of young people who prefer a predominantly digital society but are not prepared to give up cash completely is larger than 50%.
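The count passed to prop.test (171) was typed in by hand. A possible alternative is to derive it from the data so that the test stays in sync with the table. The sketch below rebuilds a stand-in response vector from the counts reported above; `responses` is a hypothetical substitute for NLB_R4H4$Q22F, not the original column.

```r
# Rebuild a response vector from the counts in the table above
# (a stand-in for NLB_R4H4$Q22F; the real column is not reproduced here)
levels_q22 <- c("Fully digital", "Balance digital-cash", "Cash", "Don't know")
responses  <- factor(rep(levels_q22, times = c(78, 171, 44, 11)),
                     levels = levels_q22)

x <- sum(responses == "Balance digital-cash")  # number of "successes"
n <- length(responses)                         # sample size

res <- prop.test(x, n, p = 0.5, correct = FALSE, alternative = "greater")
res$p.value
```

With the real data, replacing `responses` by the factor column would give the same x and n without hard-coding them.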

10.8 Young people (18–27) are motivated to use digital payments when splitting bills with friends due to the simplicity of transferring exact amounts.

library(dplyr) 

NLB_R8H8 <- NLB_data %>% select(c(69))

head(NLB_R8H8)

##   Reason_ExactSum
## 1               1
## 2               1
## 3               1
## 4               1
## 5               1
## 6               1

str(NLB_R8H8)

## 'data.frame':    304 obs. of  1 variable:
##  $ Reason_ExactSum: chr  "1" "1" "1" "1" ...

col_name <- colnames(NLB_R8H8)[1] 

NLB_R8H8$Reason_ExactSum <- factor(NLB_R8H8$Reason_ExactSum,
                       levels = c(1, 0, -2),
                       labels = c("Transfer whole amounts", "Other reasons", "Not applicable"))



library(dplyr)

NLB_R8H8 %>% count (Reason_ExactSum)

##          Reason_ExactSum   n
## 1 Transfer whole amounts 236
## 2          Other reasons  28
## 3         Not applicable  40

To test this, we will do a test of population proportion. We exclude the 40 "Not applicable" responses, which leaves n = 236 + 28 = 264. Let’s check if assumptions are met.

n * 𝜋 > 5 -> 264 * 0.5 = 132 > 5

n * (1 - 𝜋) > 5 -> 264 * (1 - 0.5) = 132 > 5

Assumptions are met, so we can proceed with the test of population proportion.

H0: 𝜋 = 0.5

H1: 𝜋 > 0.5

prop.test(x = 236,
          n = 264,
          p = 0.5,
          correct = FALSE,
          alternative = "greater")

## 
##  1-sample proportions test without continuity correction
## 
## data:  236 out of 264, null probability 0.5
## X-squared = 163.88, df = 1, p-value < 2.2e-16
## alternative hypothesis: true p is greater than 0.5
## 95 percent confidence interval:
##  0.8586738 1.0000000
## sample estimates:
##         p 
## 0.8939394

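Since the p-value is below 0.05 (p < 2.2e-16), we reject the null hypothesis H0. Among respondents for whom the question was applicable, 89.4% named the simplicity of transferring exact amounts as a reason, and there is strong statistical evidence that the true population proportion is greater than 50%.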
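The sample size of 264 comes from dropping the "Not applicable" answers before testing. A minimal sketch of that step is shown below; `reason` is a hypothetical stand-in for NLB_R8H8$Reason_ExactSum, rebuilt from the counts in the table above.

```r
# Stand-in for NLB_R8H8$Reason_ExactSum, rebuilt from the reported counts
reason_levels <- c("Transfer whole amounts", "Other reasons", "Not applicable")
reason <- factor(rep(reason_levels, times = c(236, 28, 40)),
                 levels = reason_levels)

# Keep only respondents for whom the question applies
applicable <- droplevels(reason[reason != "Not applicable"])

x <- sum(applicable == "Transfer whole amounts")  # successes
n <- length(applicable)                           # applicable respondents

res <- prop.test(x, n, p = 0.5, correct = FALSE, alternative = "greater")
unname(res$estimate)
```

Keeping the filter in code documents why n differs from the full sample of 304.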

NLB Project - Youngs

Group 3

7th February 2025
