Correspondence analysis (CA) is a multivariate statistical technique used to visualize the relationships between qualitative (categorical) variables, e.g. Yes/No responses. Like principal component analysis (PCA), it works by reducing the dimensions of the dataset; the difference is that PCA is suited to quantitative variables, whereas CA works on a contingency table of counts. CA output is displayed in the form of a scatterplot. For ease of interpretation, a four-quadrant graph works best: each quadrant groups the row and column categories that are associated with each other. I have generated a dataset on banks and how they are perceived by their customers, e.g. whether they are perceived as modern, transparent, tech savvy, etc. Through CA, the output has been grouped into four quadrants to reflect how each bank is perceived.
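To make the "reducing dimensions" step concrete, here is a minimal sketch in base R of what CA computes under the hood. It uses a small subset of the perception counts from the table in this post (four attributes, three banks) and takes a singular value decomposition of the standardized residuals from independence. The variable names (`N`, `row_mass`, `col_mass` and so on) are my own, and FactoMineR's `CA()` may differ in implementation details.

```r
# Subset of the perception counts from the table in this post
N <- matrix(c(25, 114,  3,
              50,  82, 36,
              37, 174, 17,
              13,  66, 29),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("Tech savvy", "Tech averse", "Modern", "Traditional"),
                            c("Bank.A", "Bank.E", "Bank.G")))

P        <- N / sum(N)   # correspondence matrix (relative frequencies)
row_mass <- rowSums(P)   # row masses
col_mass <- colSums(P)   # column masses

# Standardized residuals: departure of P from the independence model
S <- diag(1 / sqrt(row_mass)) %*% (P - row_mass %o% col_mass) %*% diag(1 / sqrt(col_mass))

dec     <- svd(S)
inertia <- dec$d^2       # principal inertias (eigenvalues), one per dimension

# Principal coordinates of rows and columns (what a symmetric map plots)
row_coord <- diag(1 / sqrt(row_mass)) %*% dec$u %*% diag(dec$d)
col_coord <- diag(1 / sqrt(col_mass)) %*% dec$v %*% diag(dec$d)
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

# Total inertia equals the Pearson chi-square statistic divided by the grand total
total_inertia <- sum(inertia)
chisq_stat    <- unname(chisq.test(N)$statistic)

round(100 * inertia / total_inertia, 1)  # share of inertia carried by each dimension
```

The first two columns of `row_coord` and `col_coord` are the coordinates that end up on a two-dimensional map; the remaining dimensions are what the reduction discards.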
library(MASS)
library(FactoMineR)
library(factoextra)
## Loading required package: ggplot2
## Welcome! Related Books: `Practical Guide To Cluster Analysis in R` at https://goo.gl/13EFCZ
library(knitr)
library(kableExtra)
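# Adjust this path to wherever banks.csv lives on your machine
setwd("D:\\My projects\\banks")
# Read the table of counts; the first column holds the attribute labels and becomes the row names
bank <- read.table("banks.csv", sep = ",", header = TRUE, row.names = 1)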
# Render the table as styled HTML (kableExtra re-exports the %>% pipe)
bank %>%
  kable("html") %>%
  kable_styling(font_size = 10,
                bootstrap_options = c("striped", "hover", "condensed", "responsive"))
| Attribute | Bank.A | Bank.B | Bank.C | Bank.D | Bank.E | Bank.F | Bank.G | Bank.H | Bank.I | Bank.J | Bank.K | Bank.L | Bank.M | Bank.N | Bank.O |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tech savvy | 25 | 6 | 0 | 3 | 114 | 53 | 3 | 0 | 20 | 2 | 18 | 30 | 6 | 79 | 59 |
| Tech averse | 50 | 11 | 5 | 1 | 82 | 79 | 36 | 3 | 53 | 10 | 42 | 64 | 22 | 45 | 41 |
| For the rich | 46 | 21 | 6 | 8 | 202 | 107 | 13 | 4 | 33 | 17 | 34 | 63 | 31 | 134 | 69 |
| For the poor | 11 | 3 | 1 | 4 | 57 | 62 | 27 | 0 | 20 | 7 | 32 | 47 | 12 | 35 | 51 |
| Modern | 37 | 13 | 5 | 7 | 174 | 103 | 17 | 1 | 30 | 16 | 42 | 60 | 9 | 111 | 83 |
| Traditional | 13 | 5 | 3 | 2 | 66 | 73 | 29 | 3 | 35 | 9 | 26 | 61 | 34 | 43 | 36 |
| Trustworthy | 7 | 11 | 3 | 2 | 84 | 47 | 8 | 0 | 8 | 4 | 16 | 17 | 6 | 53 | 40 |
| Untrustworthy | 52 | 9 | 5 | 3 | 76 | 86 | 38 | 1 | 43 | 10 | 42 | 57 | 29 | 37 | 60 |
| Customer friendly | 15 | 8 | 4 | 3 | 106 | 76 | 6 | 0 | 13 | 9 | 21 | 25 | 3 | 58 | 62 |
| Customer unfriendly | 56 | 11 | 3 | 2 | 118 | 78 | 35 | 4 | 46 | 11 | 32 | 62 | 29 | 57 | 58 |
| Stable | 15 | 3 | 0 | 2 | 104 | 57 | 7 | 0 | 12 | 6 | 17 | 28 | 14 | 59 | 44 |
| unstable | 65 | 13 | 5 | 2 | 87 | 105 | 44 | 7 | 56 | 16 | 59 | 63 | 18 | 55 | 80 |
| High interest rate | 26 | 3 | 5 | 4 | 119 | 50 | 7 | 1 | 15 | 8 | 10 | 28 | 5 | 59 | 44 |
| Low interest rate | 32 | 14 | 3 | 3 | 120 | 112 | 36 | 1 | 43 | 13 | 51 | 87 | 24 | 70 | 65 |
| A brand that adds joy to life | 15 | 6 | 1 | 3 | 91 | 66 | 10 | 0 | 7 | 1 | 16 | 18 | 5 | 77 | 46 |
| Good advertisements | 23 | 4 | 3 | 3 | 216 | 91 | 11 | 0 | 13 | 3 | 27 | 25 | 1 | 48 | 110 |
| Poor advertisements | 63 | 10 | 5 | 5 | 54 | 71 | 44 | 0 | 73 | 11 | 46 | 84 | 39 | 77 | 51 |
| A market leader | 8 | 0 | 1 | 0 | 68 | 27 | 6 | 1 | 2 | 3 | 9 | 9 | 1 | 37 | 26 |
| Not a market leader | 62 | 20 | 6 | 5 | 111 | 142 | 57 | 5 | 84 | 21 | 65 | 107 | 43 | 81 | 91 |
| Transparent & accountable | 11 | 5 | 4 | 0 | 61 | 42 | 5 | 0 | 11 | 5 | 17 | 17 | 4 | 50 | 38 |
| A profitable institution | 13 | 7 | 2 | 2 | 104 | 57 | 4 | 0 | 5 | 4 | 19 | 20 | 6 | 51 | 40 |
| A respectable Institution | 13 | 6 | 2 | 2 | 95 | 46 | 7 | 0 | 6 | 7 | 17 | 19 | 18 | 58 | 27 |
| A corruption free institution | 15 | 7 | 1 | 5 | 136 | 63 | 11 | 1 | 17 | 4 | 24 | 28 | 10 | 66 | 71 |
| Supports the community | 13 | 3 | 5 | 0 | 54 | 41 | 6 | 0 | 5 | 2 | 13 | 17 | 1 | 62 | 30 |
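# Fit the correspondence analysis without FactoMineR's default plots
res.ca <- CA(bank, graph = FALSE)
# Symmetric biplot of rows (attributes) and columns (banks) on dimensions 1 and 2
fviz_ca_biplot(res.ca, repel = TRUE, axes = c(1, 2), map = "symmetric",
               labelsize = 3,
               pointsize = 0.5,
               invisible = c("row.sup", "col.sup"))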
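Before reading the biplot, it is worth checking how much of the total inertia the first two dimensions retain, since the plot only shows those two. The sketch below does this on a small subset of the table above (six attributes, five banks) so that it runs on its own; the names `small` and `res.small` are made up for the example. It relies on the `eig` element returned by FactoMineR's `CA()`, whose columns hold the eigenvalue, the percentage of variance and the cumulative percentage.

```r
library(FactoMineR)

# Subset of the perception table above, used only to keep the example short
small <- data.frame(
  Bank.A = c(25, 50, 37, 13, 7, 52),
  Bank.E = c(114, 82, 174, 66, 84, 76),
  Bank.G = c(3, 36, 17, 29, 8, 38),
  Bank.I = c(20, 53, 30, 35, 8, 43),
  Bank.N = c(79, 45, 111, 43, 53, 37),
  row.names = c("Tech savvy", "Tech averse", "Modern",
                "Traditional", "Trustworthy", "Untrustworthy")
)

res.small <- CA(small, graph = FALSE)

# One row per dimension: eigenvalue, % of variance, cumulative % of variance
eig <- res.small$eig
eig

# Cumulative percentage of inertia captured by dimensions 1 and 2
share_2d <- eig[2, 3]
share_2d
```

On the full data the same summary is available as `res.ca$eig`, and factoextra's `fviz_screeplot(res.ca, addlabels = TRUE)` should draw it as a scree plot. If the first two dimensions carry only a modest share, the quadrant reading should be treated with more caution.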
Based on the graph, Banks N and D are perceived as being for the rich, respectable, tech savvy, stable, modern and trustworthy; they are also seen as having high interest rates, supporting the community and tending to have loyal customers.
Banks E, F and O are perceived as profitable, corruption-free and customer friendly, with good advertisements, and as market leaders.
Banks B, C, J, M and L are perceived as traditional and as having poor advertisements.
Banks A, K, G, I and H are perceived as customer unfriendly, untrustworthy, tech averse, unstable, not market leaders and for the poor, with low interest rates.
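The quadrant reading above can also be done programmatically instead of by eye. The helper below, `quadrant_of()`, is a hypothetical function of my own (it is not part of FactoMineR or factoextra) that labels each point by the signs of its first two coordinates. The coordinates in `toy` are made up purely for illustration and are not taken from the bank data.

```r
# Hypothetical helper: label each point by the quadrant it falls in on the
# first two dimensions (points lying exactly on an axis go to the positive side)
quadrant_of <- function(coord) {
  x <- coord[, 1]
  y <- coord[, 2]
  horiz <- ifelse(x >= 0, "right", "left")
  vert  <- ifelse(y >= 0, "upper", "lower")
  setNames(paste(vert, horiz), rownames(coord))
}

# Made-up coordinates, for illustration only (NOT from the bank data)
toy <- matrix(c( 0.3,  0.2,
                -0.4,  0.1,
                -0.2, -0.5,
                 0.6, -0.1),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("p1", "p2", "p3", "p4"), c("Dim 1", "Dim 2")))

quads <- quadrant_of(toy)
split(names(quads), quads)   # points grouped by quadrant

# With the model fitted in this post, banks and attributes could be grouped together:
# quads <- quadrant_of(rbind(res.ca$row$coord, res.ca$col$coord))
```

Keep in mind that the orientation of CA axes is arbitrary, so which quadrant is "upper right" can flip between runs or packages; what matters is which banks and attributes land together. In a symmetric map, sharing a quadrant is a rough guide to association rather than an exact distance statement.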