05/01/18 9:15pm
Over the last three years, I have manually categorized more than 4,000 rows of my household’s card transactions to help me keep track of our budget. It was time to start building a model that could categorize new transactions for me based on the text of the Payee field.
This model combines text-mining techniques with k-Nearest Neighbour (k-NN) classification. It is a work in progress and currently reaches about 60% accuracy.
library(plyr)
library(tm)
library(class)
library(dplyr)
## # A tibble: 6 x 4
## Date Spend Payee Category
## <dttm> <dbl> <chr> <chr>
## 1 2014-01-06 00:00:00 17.7 1090Watercare Water
## 2 2014-01-06 00:00:00 144. NZ TRANSPORT AGENCY PALM NO~ Transport
## 3 2014-01-07 00:00:00 385. DREAM LIFESTYLE NORTH S~ Home & Décor
## 4 2014-01-07 00:00:00 115. THE HOME STORE LIMITED AUCKLAN~ Home & Décor
## 5 2014-01-10 00:00:00 15.6 Westpac Life Bank/Service ~
## 6 2014-01-10 00:00:00 107. CALTEX SYLVIA PARK SYLVIA ~ Petrol
set.seed(1234)
rows<-sample(nrow(Exp))
Exp <- Exp[rows,]
glimpse(Exp)
## Observations: 4,245
## Variables: 4
## $ Date <dttm> 2014-07-23, 2016-08-08, 2016-07-21, 2016-08-08, 2017...
## $ Spend <dbl> 11.80, 11.00, 20.79, 24.78, 5.00, 15.00, 850.00, 23.2...
## $ Payee <chr> "Kaze Sushi", "BUTTERFLY CREEK AUCKLAND N...
## $ Category <chr> "Eat Out", "Butterfly Creek", "Crafts & Decorations",...
payeeCorpus <- VCorpus(VectorSource(Exp$Payee))
Here, I convert the text to lowercase and remove punctuation, numbers, URLs, stopwords, and extra whitespace from the corpus.
payeeCorpus <- tm_map(payeeCorpus, content_transformer(tolower))
payeeCorpus <- tm_map(payeeCorpus, removePunctuation)
payeeCorpus <- tm_map(payeeCorpus, removeNumbers)
removeURL <- function(x) gsub("http[[:alnum:]]*", "", x) # create remove URL function
payeeCorpus <- tm_map(payeeCorpus, content_transformer(removeURL)) # remove URLs using function
payeeCorpus <- tm_map(payeeCorpus, removeWords, c(stopwords(kind="en")))
payeeCorpus <- tm_map(payeeCorpus, stripWhitespace)
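To make the effect of these steps easier to see, here is a minimal base-R sketch that applies roughly the same transformations to a single string. It is only an approximation: `clean_payee` and `toy_stopwords` are hypothetical names made up for this illustration, the stopword list is a tiny stand-in for `stopwords(kind="en")`, and the final `trimws()` is an extra step that the tm pipeline above does not perform.

```r
# Base-R approximation of the cleaning steps above, applied to one string.
# `clean_payee` and `toy_stopwords` are hypothetical, for illustration only.
toy_stopwords <- c("the", "and", "of")

clean_payee <- function(x, stopwords = toy_stopwords) {
  x <- tolower(x)                              # lowercase
  x <- gsub("[[:punct:]]+", "", x)             # remove punctuation
  x <- gsub("[[:digit:]]+", "", x)             # remove numbers
  x <- gsub("http[[:alnum:]]*", "", x)         # remove what is left of URLs
  pattern <- paste0("\\b(", paste(stopwords, collapse = "|"), ")\\b")
  x <- gsub(pattern, "", x, perl = TRUE)       # remove stopwords
  x <- gsub("[[:space:]]+", " ", x)            # collapse runs of whitespace
  trimws(x)                                    # extra: trim the ends
}

cleaned <- clean_payee("THE HOME STORE LIMITED   AUCKLAND 123")
print(cleaned)
# [1] "home store limited auckland"
```

Because punctuation is stripped before the URL step, a URL such as `http://x.co` has already been reduced to `httpxco` by the time `removeURL` sees it, which is why the pattern only needs to match `http` followed by letters and digits.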
Just to be sure I have cleaned the data properly, I want to see the first 10 rows of the corpus.
for (i in 1:10) {
cat(paste("[[", i, "]] ", sep = ""))
writeLines(as.character(payeeCorpus[[i]]))
}
## [[1]] kaze sushi
## [[2]] butterfly creek auckland nzl
## [[3]] spotlight stores nz lt wairau park nzl
## [[4]] countdown glenfield glenfield nzl
## [[5]] waitemata district hea auckland nzl
## [[6]] new zealand red cross wellington nzl
## [[7]] luke hughes
## [[8]] delissimos takeaway devonport nzl
## [[9]] david jones wellington nzl
## [[10]] major sprout auckland nzl
The function below lets me specify whether the terms in my Document-Term Matrix are unigrams, bigrams, trigrams, or a mix. I pass “min” and “max” as arguments when calling it.
library(rJava)
library(RWeka)
token_delim <- " \\t\\r\\n.!?,;\"()"
NgramTokenizer <- function(min, max) {
result <- function(x){
RWeka::NGramTokenizer(x, RWeka::Weka_control(min = min, max = max, delimiters=token_delim))
}
return(result)
}
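As a rough illustration of what a tokenizer with min=1 and max=2 produces, the following base-R sketch lists the unigrams and bigrams of one cleaned payee string. `ngrams_sketch` is a hypothetical helper written for this example; it is not part of RWeka, and RWeka may return the same terms in a different order.

```r
# Sketch only: `ngrams_sketch` is a hypothetical base-R helper, not RWeka's API.
ngrams_sketch <- function(x, min = 1, max = 2) {
  stopifnot(min >= 1, min <= max)
  words <- strsplit(x, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]                # drop empty tokens
  out <- character(0)
  for (n in min:max) {
    if (length(words) < n) next                # string too short for this n
    starts <- seq_len(length(words) - n + 1)
    grams <- vapply(
      starts,
      function(i) paste(words[i:(i + n - 1)], collapse = " "),
      character(1)
    )
    out <- c(out, grams)
  }
  out
}

grams <- ngrams_sketch("countdown glenfield nzl", min = 1, max = 2)
print(grams)
# [1] "countdown"           "glenfield"           "nzl"
# [4] "countdown glenfield" "glenfield nzl"
```

Bigrams such as "countdown glenfield" keep a merchant name together with its location, which single words cannot do.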
A Document-Term Matrix (DTM) shows which terms appear in which documents of a corpus. I’ve set min=1 and max=2 so that the terms are unigrams and bigrams.
payeeDTM<- DocumentTermMatrix(payeeCorpus, control=list(tokenize=NgramTokenizer(min=1, max=2)))
inspect(payeeDTM[4080:4090,100:110])
## <<DocumentTermMatrix (documents: 11, terms: 11)>>
## Non-/sparse entries: 0/121
## Sparsity : 100%
## Maximal term length: 18
## Weighting : term frequency (tf)
## Sample :
## Terms
## Docs alexander albany allium allium interiors alphabet alphabet bistro
## 4080 0 0 0 0 0
## 4081 0 0 0 0 0
## 4082 0 0 0 0 0
## 4083 0 0 0 0 0
## 4084 0 0 0 0 0
## 4085 0 0 0 0 0
## 4086 0 0 0 0 0
## 4087 0 0 0 0 0
## 4088 0 0 0 0 0
## 4089 0 0 0 0 0
## 4090 0 0 0 0 0
## Terms
## Docs alterations alterations albany aly aly albany amazon
## 4080 0 0 0 0 0
## 4081 0 0 0 0 0
## 4082 0 0 0 0 0
## 4083 0 0 0 0 0
## 4084 0 0 0 0 0
## 4085 0 0 0 0 0
## 4086 0 0 0 0 0
## 4087 0 0 0 0 0
## 4088 0 0 0 0 0
## 4089 0 0 0 0 0
## 4090 0 0 0 0 0
## Terms
## Docs amazon mktplace
## 4080 0
## 4081 0
## 4082 0
## 4083 0
## 4084 0
## 4085 0
## 4086 0
## 4087 0
## 4088 0
## 4089 0
## 4090 0
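The slice above reports 100% sparsity because none of those 11 documents contains any of those 11 terms, so all 121 cells are zero. A toy example makes the structure easier to see. The sketch below builds a small unigram DTM by hand in base R; the three documents and the names `docs`, `dtm` and `sparsity` are made up for illustration, and this is not how tm stores the matrix internally.

```r
# Toy document-term matrix built by hand (illustrative data, not from Exp).
docs <- c("kaze sushi", "countdown glenfield", "countdown albany")

tokens <- strsplit(docs, " ", fixed = TRUE)
terms  <- sort(unique(unlist(tokens)))

# One row per document, one column per term, cell = term frequency.
dtm <- t(vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = terms))),
  integer(length(terms))
))
dimnames(dtm) <- list(Docs = as.character(seq_along(docs)), Terms = terms)
print(dtm)

# Sparsity: share of cells that are zero (9 of 15 here).
sparsity <- mean(dtm == 0)
print(sparsity)
# [1] 0.6
```

With thousands of payees and both unigrams and bigrams as terms, most cells are zero, which is why tm keeps the matrix in a sparse format until it is converted with `as.matrix()`.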
payeeDTM_freq <- sort(colSums(as.matrix(payeeDTM)), decreasing=TRUE)
barplot(payeeDTM_freq[1:20], col="dark blue", las=2, main = "Most Frequent Words", ylab = "Frequency")
payeeDTM.df <- as.data.frame(data.matrix(payeeDTM), stringsAsFactors=FALSE)
payeeDTM.df <- cbind(payeeDTM.df, Exp$Category)
colnames(payeeDTM.df)[ncol(payeeDTM.df)] <-"category"
train <- sample(nrow(payeeDTM.df), ceiling(nrow(payeeDTM.df)*0.5))
test <- (1:nrow(payeeDTM.df))[-train]
classifier <- payeeDTM.df[,"category"]
modeldata <- payeeDTM.df[,!colnames(payeeDTM.df) %in% "category"]
knn.pred <- knn(modeldata[train, ], modeldata[test, ], classifier[train])
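Since `knn()` is called above without a `k` argument, it uses the default of k = 1: each test row receives the category of its single nearest training row, measured by Euclidean distance over the term counts. The toy example below shows the idea on a three-term matrix; the data and the names `train_x`, `train_y`, `test_x` and `toy_pred` are invented for illustration.

```r
library(class)  # provides knn(); a recommended package shipped with R

# Three training "documents", each described by counts of three terms.
train_x <- rbind(
  c(countdown = 1, sushi = 0, caltex = 0),
  c(countdown = 0, sushi = 1, caltex = 0),
  c(countdown = 0, sushi = 0, caltex = 1)
)
train_y <- factor(c("Groceries", "Eat Out", "Petrol"))

# A new transaction whose only known term is "sushi".
test_x <- rbind(c(countdown = 0, sushi = 1, caltex = 0))

toy_pred <- knn(train_x, test_x, cl = train_y, k = 1)
print(as.character(toy_pred))
# [1] "Eat Out"
```

The new row matches the second training row exactly, so it inherits that row's label. Larger values of `k` take a majority vote among several neighbours instead.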
# Confusion Matrix
conf.mat <- table("Predictions" = knn.pred, Actual = classifier[test])
conf.mat
## Actual
## Predictions AA AMI Bank/Service Fees Big Boys Toys
## AA 7 0 0 0
## AMI 0 16 0 0
## Babysitting 0 0 0 0
## Bank/Service Fees 0 0 69 0
## Butterfly Creek 0 0 0 0
## C. Mackenzie 0 0 0 0
## Children 0 0 0 0
## Clothing & Accessories 0 0 0 0
## Crafts & Decorations 0 0 0 0
## Destiny Rescue 0 0 0 0
## Eat Out 0 0 0 0
## Education 0 0 0 0
## Electricity 0 0 0 0
## Family Leisure 0 0 0 0
## Fitness 0 0 0 0
## Furniture & Appliances 0 0 0 0
## Gas 0 0 0 0
## Gift 0 0 0 0
## Groceries 0 0 0 0
## Haircuts 0 0 0 0
## Home & Décor 0 0 0 0
## Home Improvement 1 0 0 0
## Hospitality 0 0 0 0
## Internet 0 0 0 0
## J&L Carroll 0 0 0 0
## Medical 0 0 0 0
## Miscellaneous 0 0 1 0
## Mobile Phone 0 0 0 0
## Mortgage 0 0 0 0
## Movies 0 0 0 0
## Online 0 0 0 0
## Other 0 0 0 0
## Perpetual Guardian 0 0 0 0
## Petrol 0 0 0 0
## Rates 0 0 0 0
## Red Cross 0 0 0 0
## Retail 0 0 0 0
## Southern Cross 0 0 0 0
## Sovereign 0 0 6 1
## Technology 0 0 0 0
## Transport 0 0 0 0
## Travel 0 0 1 0
## Water 0 0 0 0
## Xtreme 0 0 0 0
## Actual
## Predictions Butterfly Creek C. Mackenzie Children
## AA 0 0 0
## AMI 0 0 0
## Babysitting 0 0 0
## Bank/Service Fees 0 0 0
## Butterfly Creek 1 0 0
## C. Mackenzie 0 29 0
## Children 0 0 41
## Clothing & Accessories 0 0 3
## Crafts & Decorations 0 0 2
## Destiny Rescue 0 0 0
## Eat Out 1 0 4
## Education 0 0 0
## Electricity 0 0 0
## Family Leisure 0 0 0
## Fitness 0 0 0
## Furniture & Appliances 0 0 0
## Gas 0 0 0
## Gift 0 0 0
## Groceries 0 0 0
## Haircuts 0 0 0
## Home & Décor 0 0 0
## Home Improvement 0 0 1
## Hospitality 0 0 0
## Internet 0 0 0
## J&L Carroll 0 0 0
## Medical 0 0 0
## Miscellaneous 0 0 0
## Mobile Phone 0 0 0
## Mortgage 0 0 0
## Movies 0 0 0
## Online 0 0 0
## Other 0 0 0
## Perpetual Guardian 0 0 0
## Petrol 0 0 1
## Rates 0 0 0
## Red Cross 0 0 0
## Retail 0 0 8
## Southern Cross 0 0 0
## Sovereign 0 0 5
## Technology 0 0 0
## Transport 0 0 0
## Travel 0 0 0
## Water 0 0 0
## Xtreme 0 0 0
## Actual
## Predictions Clothing & Accessories Crafts & Decorations
## AA 1 0
## AMI 0 0
## Babysitting 0 0
## Bank/Service Fees 0 0
## Butterfly Creek 0 0
## C. Mackenzie 0 0
## Children 1 0
## Clothing & Accessories 7 0
## Crafts & Decorations 0 43
## Destiny Rescue 0 0
## Eat Out 4 1
## Education 0 0
## Electricity 0 0
## Family Leisure 0 0
## Fitness 0 0
## Furniture & Appliances 0 0
## Gas 0 0
## Gift 0 0
## Groceries 1 0
## Haircuts 0 0
## Home & Décor 0 1
## Home Improvement 1 1
## Hospitality 0 0
## Internet 0 0
## J&L Carroll 0 0
## Medical 0 0
## Miscellaneous 0 0
## Mobile Phone 0 0
## Mortgage 0 0
## Movies 0 0
## Online 0 0
## Other 0 0
## Perpetual Guardian 0 0
## Petrol 0 0
## Rates 0 0
## Red Cross 0 0
## Retail 2 1
## Southern Cross 0 0
## Sovereign 6 2
## Technology 0 0
## Transport 0 0
## Travel 0 0
## Water 0 0
## Xtreme 0 0
## Actual
## Predictions Destiny Rescue Eat Out Education Electricity
## AA 0 0 0 0
## AMI 0 0 0 0
## Babysitting 0 0 0 0
## Bank/Service Fees 0 1 0 0
## Butterfly Creek 0 0 0 0
## C. Mackenzie 0 0 0 0
## Children 0 0 0 0
## Clothing & Accessories 0 0 0 0
## Crafts & Decorations 0 1 0 0
## Destiny Rescue 16 0 0 0
## Eat Out 0 459 0 0
## Education 0 0 11 0
## Electricity 0 0 0 23
## Family Leisure 0 0 0 0
## Fitness 0 0 0 0
## Furniture & Appliances 0 0 0 0
## Gas 0 0 0 0
## Gift 0 2 0 0
## Groceries 0 8 0 0
## Haircuts 0 0 0 0
## Home & Décor 0 5 0 0
## Home Improvement 0 4 0 0
## Hospitality 0 0 0 0
## Internet 0 0 0 0
## J&L Carroll 0 0 0 0
## Medical 0 3 0 0
## Miscellaneous 0 0 0 0
## Mobile Phone 0 1 0 0
## Mortgage 0 0 0 0
## Movies 0 1 0 0
## Online 0 1 0 0
## Other 0 0 0 0
## Perpetual Guardian 0 0 0 0
## Petrol 0 0 0 0
## Rates 0 0 0 0
## Red Cross 0 0 0 0
## Retail 0 2 0 0
## Southern Cross 0 0 0 0
## Sovereign 0 61 0 0
## Technology 0 0 0 0
## Transport 0 4 0 0
## Travel 0 0 0 0
## Water 0 0 0 0
## Xtreme 0 0 0 0
## Actual
## Predictions Family Leisure Fitness Furniture & Appliances Gas
## AA 0 1 0 0
## AMI 0 0 0 0
## Babysitting 0 0 0 0
## Bank/Service Fees 0 0 0 0
## Butterfly Creek 0 0 0 0
## C. Mackenzie 0 0 0 0
## Children 0 0 0 0
## Clothing & Accessories 0 0 0 0
## Crafts & Decorations 0 0 0 0
## Destiny Rescue 0 0 0 0
## Eat Out 1 4 1 0
## Education 0 0 0 0
## Electricity 0 0 0 0
## Family Leisure 0 0 0 0
## Fitness 0 10 0 0
## Furniture & Appliances 0 0 1 0
## Gas 0 0 0 18
## Gift 0 0 0 0
## Groceries 0 0 0 0
## Haircuts 0 0 0 0
## Home & Décor 0 0 0 0
## Home Improvement 0 0 0 0
## Hospitality 0 0 0 0
## Internet 0 0 0 0
## J&L Carroll 0 0 0 0
## Medical 0 0 0 0
## Miscellaneous 0 0 0 0
## Mobile Phone 0 0 0 0
## Mortgage 0 0 0 0
## Movies 0 0 0 0
## Online 0 0 0 0
## Other 0 0 0 0
## Perpetual Guardian 0 0 0 0
## Petrol 0 0 0 0
## Rates 0 0 0 0
## Red Cross 0 0 0 0
## Retail 0 0 0 0
## Southern Cross 0 0 0 0
## Sovereign 0 6 3 0
## Technology 0 0 0 0
## Transport 1 0 0 0
## Travel 0 0 0 0
## Water 0 0 0 0
## Xtreme 0 0 0 0
## Actual
## Predictions Gift Groceries Haircuts Home & Décor
## AA 0 0 0 0
## AMI 0 0 0 0
## Babysitting 0 0 0 0
## Bank/Service Fees 0 0 0 0
## Butterfly Creek 0 0 0 0
## C. Mackenzie 0 0 0 0
## Children 5 1 0 0
## Clothing & Accessories 1 0 0 4
## Crafts & Decorations 1 0 0 2
## Destiny Rescue 0 0 0 0
## Eat Out 12 4 1 11
## Education 0 0 0 0
## Electricity 0 0 0 0
## Family Leisure 0 0 0 0
## Fitness 0 0 0 0
## Furniture & Appliances 0 0 0 0
## Gas 0 0 0 0
## Gift 12 1 0 5
## Groceries 1 402 0 1
## Haircuts 0 0 7 0
## Home & Décor 3 1 0 13
## Home Improvement 1 0 0 1
## Hospitality 0 0 0 0
## Internet 0 0 0 0
## J&L Carroll 0 0 0 0
## Medical 1 0 0 1
## Miscellaneous 1 0 0 0
## Mobile Phone 1 0 0 0
## Mortgage 0 0 0 0
## Movies 0 0 0 0
## Online 1 0 0 0
## Other 0 0 0 0
## Perpetual Guardian 0 0 0 0
## Petrol 0 0 0 0
## Rates 0 0 0 0
## Red Cross 0 0 0 0
## Retail 6 0 0 2
## Southern Cross 0 0 0 0
## Sovereign 19 9 0 9
## Technology 0 0 0 0
## Transport 0 1 0 2
## Travel 0 1 0 0
## Water 0 0 0 0
## Xtreme 0 0 0 0
## Actual
## Predictions Home Improvement Hospitality Internet J&L Carroll
## AA 0 0 0 0
## AMI 0 0 0 0
## Babysitting 0 0 0 0
## Bank/Service Fees 0 0 0 0
## Butterfly Creek 0 0 0 0
## C. Mackenzie 0 0 0 0
## Children 0 0 0 0
## Clothing & Accessories 0 0 0 0
## Crafts & Decorations 0 0 0 0
## Destiny Rescue 0 0 0 0
## Eat Out 5 13 0 0
## Education 0 0 0 0
## Electricity 0 0 0 0
## Family Leisure 0 2 0 0
## Fitness 0 0 0 0
## Furniture & Appliances 0 0 0 0
## Gas 0 0 0 0
## Gift 0 0 0 0
## Groceries 0 32 0 0
## Haircuts 0 0 0 0
## Home & Décor 0 1 0 0
## Home Improvement 29 0 0 0
## Hospitality 0 0 0 0
## Internet 0 0 26 0
## J&L Carroll 0 0 0 24
## Medical 0 0 0 0
## Miscellaneous 0 0 0 0
## Mobile Phone 0 0 0 0
## Mortgage 0 0 0 0
## Movies 1 0 0 0
## Online 0 0 0 0
## Other 0 0 0 0
## Perpetual Guardian 0 0 0 0
## Petrol 0 0 0 0
## Rates 0 0 0 0
## Red Cross 0 0 0 0
## Retail 0 0 0 0
## Southern Cross 0 0 0 0
## Sovereign 12 4 0 0
## Technology 0 0 0 0
## Transport 0 2 0 0
## Travel 0 0 0 0
## Water 0 0 0 0
## Xtreme 0 0 0 0
## Actual
## Predictions Medical Miscellaneous Mobile Phone Movies Online
## AA 0 1 0 0 0
## AMI 0 0 0 0 0
## Babysitting 0 0 0 0 0
## Bank/Service Fees 0 0 0 0 0
## Butterfly Creek 0 0 0 0 0
## C. Mackenzie 0 0 0 0 0
## Children 0 0 0 0 0
## Clothing & Accessories 0 0 0 0 0
## Crafts & Decorations 0 0 0 0 0
## Destiny Rescue 0 0 0 0 0
## Eat Out 4 2 0 0 4
## Education 0 0 0 0 0
## Electricity 0 0 0 0 0
## Family Leisure 0 0 0 0 0
## Fitness 1 1 0 0 0
## Furniture & Appliances 0 0 0 0 0
## Gas 0 0 0 0 0
## Gift 0 0 0 0 0
## Groceries 2 0 0 0 0
## Haircuts 0 0 0 0 0
## Home & Décor 0 0 0 0 0
## Home Improvement 0 0 0 0 0
## Hospitality 0 0 0 0 0
## Internet 0 0 0 0 0
## J&L Carroll 0 0 0 0 0
## Medical 34 0 0 0 0
## Miscellaneous 0 4 0 0 0
## Mobile Phone 0 0 20 0 1
## Mortgage 0 0 0 0 0
## Movies 0 0 0 25 0
## Online 0 0 0 0 14
## Other 0 0 0 0 0
## Perpetual Guardian 0 0 0 0 0
## Petrol 0 0 0 0 0
## Rates 0 0 0 0 0
## Red Cross 0 0 0 0 0
## Retail 0 0 0 0 0
## Southern Cross 0 0 0 0 0
## Sovereign 3 12 0 3 4
## Technology 0 0 0 0 0
## Transport 1 0 0 0 0
## Travel 0 1 0 0 0
## Water 0 0 0 0 0
## Xtreme 0 0 0 0 0
## Actual
## Predictions Party Kingdom Petrol Rates Red Cross Retail
## AA 0 2 0 0 1
## AMI 0 0 0 0 0
## Babysitting 0 0 0 0 0
## Bank/Service Fees 0 0 0 0 0
## Butterfly Creek 0 0 0 0 0
## C. Mackenzie 0 0 0 0 0
## Children 0 0 0 0 5
## Clothing & Accessories 0 0 0 0 0
## Crafts & Decorations 0 0 0 0 0
## Destiny Rescue 0 0 0 0 0
## Eat Out 0 5 0 0 2
## Education 0 0 0 0 0
## Electricity 0 0 0 0 0
## Family Leisure 0 0 0 0 0
## Fitness 0 0 0 0 0
## Furniture & Appliances 0 0 0 0 3
## Gas 0 0 0 0 0
## Gift 0 0 0 0 2
## Groceries 0 0 0 0 0
## Haircuts 0 0 0 0 0
## Home & Décor 0 0 0 0 0
## Home Improvement 0 0 0 0 0
## Hospitality 0 0 0 0 0
## Internet 0 0 0 0 0
## J&L Carroll 0 0 0 0 0
## Medical 0 0 0 0 0
## Miscellaneous 0 0 0 0 0
## Mobile Phone 0 0 0 0 0
## Mortgage 0 0 0 0 0
## Movies 0 0 0 0 0
## Online 0 0 0 0 0
## Other 0 0 0 0 0
## Perpetual Guardian 0 0 0 0 0
## Petrol 0 50 0 0 0
## Rates 0 0 4 0 0
## Red Cross 0 0 0 21 0
## Retail 0 2 0 0 58
## Southern Cross 0 0 0 0 0
## Sovereign 2 6 0 0 2
## Technology 0 0 0 0 0
## Transport 0 0 0 0 0
## Travel 0 0 0 0 1
## Water 0 0 0 0 0
## Xtreme 0 0 0 0 0
## Actual
## Predictions Southern Cross Sovereign Technology
## AA 0 0 0
## AMI 0 0 0
## Babysitting 0 0 0
## Bank/Service Fees 0 0 0
## Butterfly Creek 0 0 0
## C. Mackenzie 0 0 0
## Children 0 0 0
## Clothing & Accessories 0 0 0
## Crafts & Decorations 0 0 0
## Destiny Rescue 0 0 0
## Eat Out 0 0 6
## Education 0 0 0
## Electricity 0 0 0
## Family Leisure 0 0 0
## Fitness 0 0 0
## Furniture & Appliances 0 0 0
## Gas 0 0 0
## Gift 0 0 0
## Groceries 0 0 0
## Haircuts 0 0 0
## Home & Décor 0 0 0
## Home Improvement 0 0 0
## Hospitality 0 0 0
## Internet 0 0 0
## J&L Carroll 0 0 0
## Medical 0 0 1
## Miscellaneous 0 0 0
## Mobile Phone 0 0 0
## Mortgage 0 0 0
## Movies 0 0 0
## Online 0 0 0
## Other 0 0 0
## Perpetual Guardian 0 0 0
## Petrol 0 0 0
## Rates 0 0 0
## Red Cross 0 0 0
## Retail 0 0 0
## Southern Cross 13 0 0
## Sovereign 0 37 7
## Technology 0 0 1
## Transport 0 0 0
## Travel 0 0 0
## Water 0 0 0
## Xtreme 0 0 0
## Actual
## Predictions Tower Insurance Transport Travel UNICEF Water
## AA 0 0 0 0 0
## AMI 0 0 0 0 0
## Babysitting 0 0 0 0 0
## Bank/Service Fees 0 1 6 0 0
## Butterfly Creek 0 0 0 0 0
## C. Mackenzie 0 0 0 0 0
## Children 0 0 1 0 0
## Clothing & Accessories 0 0 1 0 0
## Crafts & Decorations 0 0 0 0 0
## Destiny Rescue 0 0 0 0 0
## Eat Out 0 3 2 0 0
## Education 0 0 0 0 0
## Electricity 0 0 0 0 0
## Family Leisure 0 0 0 0 0
## Fitness 0 0 0 0 0
## Furniture & Appliances 0 0 0 0 0
## Gas 0 0 0 0 0
## Gift 0 1 0 0 0
## Groceries 0 0 0 0 0
## Haircuts 0 0 0 0 0
## Home & Décor 2 1 0 0 0
## Home Improvement 0 0 0 0 0
## Hospitality 0 0 0 0 0
## Internet 0 0 0 0 0
## J&L Carroll 0 0 0 0 0
## Medical 0 0 0 0 0
## Miscellaneous 0 0 0 0 0
## Mobile Phone 0 0 0 0 0
## Mortgage 0 0 0 0 0
## Movies 0 0 0 0 0
## Online 0 0 0 0 0
## Other 0 0 0 0 0
## Perpetual Guardian 0 0 0 0 0
## Petrol 0 0 0 0 0
## Rates 0 0 0 0 0
## Red Cross 0 0 0 0 0
## Retail 0 0 0 0 0
## Southern Cross 0 0 0 0 0
## Sovereign 0 1 24 0 0
## Technology 0 0 0 0 0
## Transport 0 44 0 0 0
## Travel 0 0 25 1 0
## Water 0 0 0 0 24
## Xtreme 0 0 0 0 0
## Actual
## Predictions Xtreme Zoo
## AA 0 0
## AMI 0 0
## Babysitting 0 0
## Bank/Service Fees 0 0
## Butterfly Creek 0 0
## C. Mackenzie 0 0
## Children 0 0
## Clothing & Accessories 0 0
## Crafts & Decorations 0 0
## Destiny Rescue 0 0
## Eat Out 0 1
## Education 0 0
## Electricity 0 0
## Family Leisure 0 0
## Fitness 0 0
## Furniture & Appliances 0 0
## Gas 0 0
## Gift 0 0
## Groceries 0 0
## Haircuts 0 0
## Home & Décor 0 0
## Home Improvement 0 0
## Hospitality 0 0
## Internet 0 0
## J&L Carroll 0 0
## Medical 0 0
## Miscellaneous 0 0
## Mobile Phone 0 0
## Mortgage 0 0
## Movies 0 0
## Online 0 0
## Other 0 0
## Perpetual Guardian 0 0
## Petrol 0 0
## Rates 0 0
## Red Cross 0 0
## Retail 0 0
## Southern Cross 0 0
## Sovereign 0 0
## Technology 0 0
## Transport 0 0
## Travel 0 0
## Water 0 0
## Xtreme 2 0
# Accuracy
(accuracy <- sum(diag(conf.mat))/length(test) * 100)
## [1] 60.32045
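One caveat about this calculation: `diag()` counts a prediction as correct only when row i and column i of the table refer to the same category. In the table above, the Predictions rows include categories such as Babysitting that have no Actual column, and the Actual columns include categories such as Big Boys Toys that have no Predictions row, so the diagonal may not line up category by category. The sketch below uses made-up labels (`actual`, `predicted` and the `acc_*` names are hypothetical) to show two ways of computing accuracy that do not depend on the table layout, next to the unaligned version for comparison.

```r
# Illustrative labels only; not taken from the transaction data.
actual    <- c("Eat Out", "Zoo", "Petrol", "Eat Out")
predicted <- factor(
  c("Eat Out", "Eat Out", "Petrol", "Groceries"),
  levels = c("Eat Out", "Groceries", "Petrol")
)

# Unaligned table: rows and columns list different categories.
naive     <- table(Predictions = predicted, Actual = actual)
acc_naive <- sum(diag(naive)) / length(actual) * 100

# Option 1: compare the labels directly.
acc_direct <- mean(as.character(predicted) == actual) * 100

# Option 2: force both factors onto the same levels so the table is square.
all_levels <- union(levels(predicted), unique(actual))
square <- table(
  Predictions = factor(predicted, levels = all_levels),
  Actual      = factor(actual, levels = all_levels)
)
acc_diag <- sum(diag(square)) / length(actual) * 100

print(c(naive = acc_naive, direct = acc_direct, aligned = acc_diag))
#   naive  direct aligned
#      25      50      50
```

In this toy case two of the four predictions are correct, which both the direct comparison and the aligned table report, while the unaligned diagonal finds only one of them.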