The algorithm I use is a very simple approach that leaves a lot of room for improving accuracy, but it works for simple texts. It is a frequency-based algorithm: it matches the input string provided by the user against the cleaned data (stored in RDS format; see the milestone report), collects all the matches, whether bigram, trigram, or quadgram, and then chooses the matched phrase with the highest frequency.
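
The lookup described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual implementation (which reads its n-gram tables from RDS files). The function names `build_ngram_counts` and `predict_next`, the toy corpus, and the tie-breaking rule (prefer the longer n-gram when frequencies are equal) are all assumptions made for the example.

```python
from collections import Counter


def build_ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def predict_next(text, tables):
    """Return the next word from the highest-frequency matching n-gram, or None.

    `tables` maps n (2, 3, 4) to a Counter of n-gram tuples.
    """
    words = text.lower().split()
    candidates = []  # (frequency, n, next_word)
    for n, counts in tables.items():
        context = tuple(words[-(n - 1):])
        if len(context) < n - 1:
            continue  # input is too short to match an n-gram of this order
        for gram, freq in counts.items():
            if gram[:-1] == context:
                candidates.append((freq, n, gram[-1]))
    if not candidates:
        return None
    # Highest frequency wins; ties go to the longer n-gram (an assumption).
    best = max(candidates, key=lambda c: (c[0], c[1]))
    return best[2]


# Toy corpus standing in for the cleaned data.
corpus = "the cat sat on the mat the cat sat on the sofa the cat ran away".split()
tables = {n: build_ngram_counts(corpus, n) for n in (2, 3, 4)}

print(predict_next("the cat", tables))  # sat
print(predict_next("on the", tables))   # cat
```

In the second call the bigram "the cat" (frequency 3) beats the trigrams "on the mat" and "on the sofa" (frequency 1 each). This shows the main weakness of a pure frequency comparison: shorter n-grams tend to have higher counts, so they can override the more specific context.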