Correspondence Analysis

Correspondence Analysis (CA) is a multivariate statistical technique used to visualize the relationships between qualitative variables (e.g. Yes/No). It works by reducing the dimensions of the dataset, much as principal component analysis (PCA) does for quantitative variables. CA output is displayed in the form of a scatterplot. For ease of interpretation, a four-quadrant graph is best for displaying the output. Each quadrant groups the row and column categories that are associated with each other.

I have generated a dataset on banks and how they are perceived by their customers, e.g. whether they are perceived as modern, transparent, tech savvy, etc. Through CA, the output has been clustered into four quadrants to reflect how each bank is perceived.
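To make the "reducing dimensions" step a little more concrete, here is a rough sketch of the usual CA computation in standard textbook notation. The symbols are assumptions introduced for this note rather than anything defined in the code of this post: N is the table of counts with grand total n, D_r and D_c are diagonal matrices holding the row masses r and column masses c, and F and G are the row and column coordinates plotted in a symmetric map.

```latex
\begin{aligned}
P &= \tfrac{1}{n}\,N, \qquad r = P\,\mathbf{1}, \qquad c = P^{\top}\mathbf{1}, \\
S &= D_r^{-1/2}\,\bigl(P - r\,c^{\top}\bigr)\,D_c^{-1/2} = U\,\Sigma\,V^{\top}, \\
F &= D_r^{-1/2}\,U\,\Sigma, \qquad G = D_c^{-1/2}\,V\,\Sigma, \\
\text{total inertia} &= \sum_k \sigma_k^{2} = \frac{\chi^{2}}{n}.
\end{aligned}
```

Keeping only the first two columns of F and G gives the two-dimensional scatterplot, and the squared singular values show how much of the total inertia each dimension accounts for.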

library(MASS)
library(FactoMineR)
library(factoextra)
library(knitr)
library(kableExtra)
setwd("D:\\My projects\\banks")
# Read the perception counts; the first column (attribute names) becomes the row names
bank <- read.table("banks.csv", sep = ",", header = TRUE, row.names = 1)
# Display the table with compact, striped Bootstrap styling
bank %>%
  kable("html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                font_size = 10)
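|  | Bank.A | Bank.B | Bank.C | Bank.D | Bank.E | Bank.F | Bank.G | Bank.H | Bank.I | Bank.J | Bank.K | Bank.L | Bank.M | Bank.N | Bank.O |
|:---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| Tech savvy | 25 | 6 | 0 | 3 | 114 | 53 | 3 | 0 | 20 | 2 | 18 | 30 | 6 | 79 | 59 |
| Tech averse | 50 | 11 | 5 | 1 | 82 | 79 | 36 | 3 | 53 | 10 | 42 | 64 | 22 | 45 | 41 |
| For the rich | 46 | 21 | 6 | 8 | 202 | 107 | 13 | 4 | 33 | 17 | 34 | 63 | 31 | 134 | 69 |
| For the poor | 11 | 3 | 1 | 4 | 57 | 62 | 27 | 0 | 20 | 7 | 32 | 47 | 12 | 35 | 51 |
| Modern | 37 | 13 | 5 | 7 | 174 | 103 | 17 | 1 | 30 | 16 | 42 | 60 | 9 | 111 | 83 |
| Traditional | 13 | 5 | 3 | 2 | 66 | 73 | 29 | 3 | 35 | 9 | 26 | 61 | 34 | 43 | 36 |
| Trustworthy | 7 | 11 | 3 | 2 | 84 | 47 | 8 | 0 | 8 | 4 | 16 | 17 | 6 | 53 | 40 |
| Untrustworthy | 52 | 9 | 5 | 3 | 76 | 86 | 38 | 1 | 43 | 10 | 42 | 57 | 29 | 37 | 60 |
| Customer friendly | 15 | 8 | 4 | 3 | 106 | 76 | 6 | 0 | 13 | 9 | 21 | 25 | 3 | 58 | 62 |
| Customer unfriendly | 56 | 11 | 3 | 2 | 118 | 78 | 35 | 4 | 46 | 11 | 32 | 62 | 29 | 57 | 58 |
| Stable | 15 | 3 | 0 | 2 | 104 | 57 | 7 | 0 | 12 | 6 | 17 | 28 | 14 | 59 | 44 |
| unstable | 65 | 13 | 5 | 2 | 87 | 105 | 44 | 7 | 56 | 16 | 59 | 63 | 18 | 55 | 80 |
| High interest rate | 26 | 3 | 5 | 4 | 119 | 50 | 7 | 1 | 15 | 8 | 10 | 28 | 5 | 59 | 44 |
| Low interest rate | 32 | 14 | 3 | 3 | 120 | 112 | 36 | 1 | 43 | 13 | 51 | 87 | 24 | 70 | 65 |
| A brand that adds joy to life | 15 | 6 | 1 | 3 | 91 | 66 | 10 | 0 | 7 | 1 | 16 | 18 | 5 | 77 | 46 |
| Good advertisements | 23 | 4 | 3 | 3 | 216 | 91 | 11 | 0 | 13 | 3 | 27 | 25 | 1 | 48 | 110 |
| Poor advertisements | 63 | 10 | 5 | 5 | 54 | 71 | 44 | 0 | 73 | 11 | 46 | 84 | 39 | 77 | 51 |
| A market leader | 8 | 0 | 1 | 0 | 68 | 27 | 6 | 1 | 2 | 3 | 9 | 9 | 1 | 37 | 26 |
| Not a market leader | 62 | 20 | 6 | 5 | 111 | 142 | 57 | 5 | 84 | 21 | 65 | 107 | 43 | 81 | 91 |
| Transparent & accountable | 11 | 5 | 4 | 0 | 61 | 42 | 5 | 0 | 11 | 5 | 17 | 17 | 4 | 50 | 38 |
| A profitable institution | 13 | 7 | 2 | 2 | 104 | 57 | 4 | 0 | 5 | 4 | 19 | 20 | 6 | 51 | 40 |
| A respectable Institution | 13 | 6 | 2 | 2 | 95 | 46 | 7 | 0 | 6 | 7 | 17 | 19 | 18 | 58 | 27 |
| A corruption free institution | 15 | 7 | 1 | 5 | 136 | 63 | 11 | 1 | 17 | 4 | 24 | 28 | 10 | 66 | 71 |
| Supports the community | 13 | 3 | 5 | 0 | 54 | 41 | 6 | 0 | 5 | 2 | 13 | 17 | 1 | 62 | 30 |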
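# Run CA on the contingency table without drawing FactoMineR's default plots
res.ca <- CA(bank, graph = FALSE)
# Symmetric biplot of dimensions 1 and 2 (rows = attributes, columns = banks)
fviz_ca_biplot(res.ca, repel = TRUE, axes = c(1, 2), map = "symmetric",
               labelsize = 3,
               pointsize = 0.5,
               invisible = c("row.sup", "col.sup"))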
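For readers who want to see roughly what a CA routine computes, here is a minimal base R sketch of the textbook procedure, a singular value decomposition of the standardized residuals. It is an illustration under stated assumptions, not a description of FactoMineR's internals. The counts are copied from four rows and three columns of the table above, so the numbers will not match the full analysis, and all object names (N, P, r, cm, S, row_coord, col_coord) are made up for this sketch.

```r
# Base R sketch of correspondence analysis via SVD (illustrative only).
# Counts copied from the bank table in this post: 4 attributes x 3 banks.
N <- matrix(
  c(25, 114,  53,   # Tech savvy
    50,  82,  79,   # Tech averse
    37, 174, 103,   # Modern
    13,  66,  73),  # Traditional
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Tech savvy", "Tech averse", "Modern", "Traditional"),
    c("Bank.A", "Bank.E", "Bank.F")
  )
)

P  <- N / sum(N)   # correspondence matrix (relative frequencies)
r  <- rowSums(P)   # row masses
cm <- colSums(P)   # column masses

# Standardized residuals from the independence model (outer product of masses)
S <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))

dec <- svd(S)
k   <- min(dim(N)) - 1        # number of non-trivial dimensions
sv  <- dec$d[seq_len(k)]      # singular values of those dimensions

# Principal coordinates of rows and columns, as plotted in a symmetric map
row_coord <- diag(1 / sqrt(r))  %*% dec$u[, seq_len(k), drop = FALSE] %*% diag(sv, k)
col_coord <- diag(1 / sqrt(cm)) %*% dec$v[, seq_len(k), drop = FALSE] %*% diag(sv, k)
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

inertia       <- sv^2                    # principal inertias (eigenvalues)
inertia_share <- inertia / sum(inertia)  # share of total inertia per dimension

print(round(row_coord, 3))
print(round(col_coord, 3))
print(round(inertia_share, 3))
```

The inertia shares are worth a glance before reading a biplot: if the first two dimensions carry only a modest share of the total inertia, positions on the two-dimensional map should be interpreted with more caution.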

Based on the graph, Banks N and D are perceived to be for the rich, respectable, tech savvy, stable, modern and trustworthy, to have high interest rates, to support the community and to have loyal customers.

Banks E, F and O are perceived as profitable, corruption free and customer friendly, as having good advertisements and as market leaders.

Banks B, C, J, M and L are perceived as traditional and as having poor advertisements.

Banks A, K, G, I and H are perceived as customer unfriendly, untrustworthy, tech averse and unstable, as having low interest rates, as not being market leaders and as being for the poor.