For every transaction recorded, there are 40 different metrics associated with that specific transaction. Most of them are redundant, null, or not important to this evaluation, so we will condense the data as much as possible by eliminating those columns. One phenomenon that occurred while parsing the raw data was that certain transactions that seem to be duplicates, and that were written in parentheses, were parsed as negative amounts, so we will remove those rows as well.
## Isolate the three columns we need (columns 8, 18, and 38 of df)
## in a new data frame called RE_Data
RE_Data = data.frame(df[8],df[18],df[38])
## Eliminate the duplicates by keeping only rows with a non-negative POSTING.AMOUNT
## (which() also drops rows where the amount is NA)
RE_Data = RE_Data[which(RE_Data$POSTING.AMOUNT>=0),]
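
To make the parenthesis behaviour concrete, here is a minimal sketch on invented data. It assumes the raw amounts arrive as text in accounting notation, where `(125.00)` means -125.00; the `toy` data frame and the `parse_amount` helper are hypothetical and are not part of the original analysis.

```r
# Toy stand-in for the real data; every value here is invented for illustration.
toy <- data.frame(
  POSTING.AMOUNT = c("125.00", "(125.00)", "40.50", "0.00"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: read an accounting-style "(x)" as -x.
parse_amount <- function(x) {
  negative <- grepl("^\\(.*\\)$", x)          # TRUE where the value is wrapped in parentheses
  value <- as.numeric(gsub("[(),]", "", x))   # strip parentheses and thousands separators
  ifelse(negative, -value, value)
}

toy$POSTING.AMOUNT <- parse_amount(toy$POSTING.AMOUNT)

# Same filter as above: keep rows whose amount is zero or positive.
# drop = FALSE keeps the result a data frame even with a single column.
kept <- toy[which(toy$POSTING.AMOUNT >= 0), , drop = FALSE]
removed <- nrow(toy) - nrow(kept)
```

Comparing `nrow(toy)` with `nrow(kept)` shows how many rows the filter discards, which is a quick sanity check worth running on `RE_Data` too before treating every negative amount as a duplicate.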