Yunfeng Xi
04/11/15
When you type part of a word, the ten highest-frequency words that start with what you typed are shown in a word cloud. Font size is scaled to frequency.
For example, if you type “trans”, the word cloud plot will look like this.
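The prefix lookup described above can be sketched roughly as follows. This is only an illustration, not the code behind the plot: the names `top_completions` and `font_sizes`, the linear size scaling, and the toy counts are all assumptions.

```python
from collections import Counter


def top_completions(prefix, word_counts, k=10):
    """Return up to k (word, count) pairs for words starting with prefix,
    most frequent first (ties broken alphabetically)."""
    matches = [(w, c) for w, c in word_counts.items() if w.startswith(prefix)]
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return matches[:k]


def font_sizes(matches, min_size=10, max_size=40):
    """Map each word to a font size proportional to its count.
    The linear scaling and the size limits are assumptions."""
    if not matches:
        return {}
    top = max(c for _, c in matches)
    return {w: max(min_size, max_size * c / top) for w, c in matches}


# Made-up counts, for illustration only.
word_counts = Counter(
    {"transfer": 50, "transport": 30, "translate": 20, "tree": 99, "trans": 5}
)
best = top_completions("trans", word_counts, k=2)
sizes = font_sizes(best)
```

With these toy counts, "tree" is ignored because it does not start with "trans", and the most frequent match gets the largest font.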
To keep things simple, only typos at an edit distance of one from the intended word are considered. There are four kinds of typos:
If you type “mistkae”, you will get the plot on the right.
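A minimal sketch of generating candidates at edit distance one is shown here. It assumes the four kinds of typos are the usual ones (a deleted letter, two swapped neighbouring letters, a replaced letter, an inserted letter) and that candidates are ranked by corpus frequency; the names `edits1` and `corrections` are hypothetical.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """All strings one edit away from word (assumed four edit kinds)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in alphabet]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def corrections(typed, word_counts, k=10):
    """Known words one edit away from typed, most frequent first."""
    known = [(w, word_counts[w]) for w in edits1(typed) if w in word_counts]
    known.sort(key=lambda wc: (-wc[1], wc[0]))
    return known[:k]


# Made-up counts, for illustration only.
toy_counts = {"mistake": 12, "mistaken": 3}
suggestions = corrections("mistkae", toy_counts)
```

Here “mistkae” becomes "mistake" by swapping the neighbouring letters "k" and "a", while "mistaken" is more than one edit away and is not suggested.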
To keep the corpus upload time short, only n-grams up to 3-grams are considered. If fewer than ten candidates are found in the corpus, the algorithm steps back and searches the 2-grams; if the number of candidates is still fewer than ten, it searches the 1-grams until there are ten words in the plot. If you type “university of ”, the plot shows the candidates found for that phrase.
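The step-back search can be sketched as below. This is a simplified illustration under stated assumptions, not the actual implementation: the table layout, the names `build_tables` and `predict_next`, and the rule that lower-order candidates only fill the remaining slots are all assumed.

```python
from collections import Counter, defaultdict


def build_tables(sentences, max_n=3):
    """For each n, map the (n-1)-word context to counts of the next word."""
    tables = {n: defaultdict(Counter) for n in range(1, max_n + 1)}
    for sentence in sentences:
        words = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                context = tuple(words[i:i + n - 1])
                tables[n][context][words[i + n - 1]] += 1
    return tables


def predict_next(text, tables, k=10, max_n=3):
    """Collect up to k next-word candidates, starting at the longest
    context and stepping back to shorter ones while fewer than k are found."""
    words = text.lower().split()
    picked = []
    for n in range(max_n, 0, -1):
        if len(words) < n - 1:
            continue  # not enough typed words for this context length
        context = tuple(words[len(words) - (n - 1):])
        for word, _ in tables[n].get(context, Counter()).most_common():
            if word not in picked:
                picked.append(word)
            if len(picked) == k:
                return picked
    return picked


# Made-up corpus, for illustration only.
tables = build_tables([
    "university of california",
    "university of california",
    "university of texas",
    "state of mind",
    "of course",
])
top2 = predict_next("university of ", tables, k=2)
top3 = predict_next("university of ", tables, k=3)
```

In this toy corpus the 3-gram table supplies only two candidates after “university of ”, so a request for three falls back to the 2-gram context "of" for the third.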
Below are ten phrases randomly taken from Twitter. For 4 of the 10, the expected word is in the plot, which means it ranks in the top 10. It is a small sample, just to show how I did the test. For a larger sample, the accuracy is no more than 20%.
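One way such a test could be scored is sketched here. It assumes the last word of each phrase is held out and counted as a hit when it appears among the top k predictions; the function `top_k_accuracy`, the stand-in predictor, and the sample phrases are hypothetical and are not the data used above.

```python
def top_k_accuracy(phrases, predict, k=10):
    """Fraction of phrases whose held-out last word is in the top-k list.

    predict is any callable taking (context_text, k) and returning a list
    of candidate words.
    """
    hits = 0
    total = 0
    for phrase in phrases:
        words = phrase.lower().split()
        if len(words) < 2:
            continue  # need at least one context word plus a target
        context, target = " ".join(words[:-1]), words[-1]
        if target in predict(context, k):
            hits += 1
        total += 1
    return hits / total if total else 0.0


def toy_predict(context, k):
    """Stand-in predictor that always proposes the same words."""
    return ["you", "day", "time"][:k]


# Made-up phrases, for illustration only.
sample = ["thank you", "have a nice day", "see you tomorrow", "hello"]
score = top_k_accuracy(sample, toy_predict, k=10)
```

The single-word phrase is skipped because it leaves no context to predict from, so the score is computed over the three remaining phrases.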