Text Clustering
Often, your data will contain text that also needs to be analysed. For example, you might have a dataset where the same thing is written in different ways by different people (color and colour, say), and you would like all the variants to be treated as the same value.
As an example, recall the dataset from the final project of Reproducible Research, which had a column of storm types. Officially there were supposed to be 48 types, but because of data entry errors, spelling mistakes, and other inconsistencies, the column listed more than 900 unique values.
One solution in such a case is to group similar strings together, just as you group similar points together based on how close they are to each other (as done in the Exploratory Data Analysis class).
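
To make the idea of "closeness" between strings concrete, here is a minimal sketch in Python. It is only one possible approach, not necessarily the method this lesson goes on to use. It measures closeness with the Levenshtein (edit) distance and puts two strings in the same group whenever they are within a chosen number of edits of each other, directly or through a chain of similar strings. The function names, the threshold of 2 edits, and the sample labels are all assumptions made up for illustration; the labels are not taken from the storm dataset.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def cluster_strings(strings, max_distance=2):
    """Group strings so that any two strings within max_distance edits
    share a cluster, directly or via a chain of similar strings.
    max_distance=2 is an illustrative threshold, not a recommended value."""
    # Normalise case and surrounding whitespace, then drop exact duplicates.
    items = sorted(set(s.strip().lower() for s in strings))
    parent = {s: s for s in items}

    def find(s):
        # Follow parent links to the cluster representative.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge the clusters of every pair of sufficiently similar strings.
    for a, b in combinations(items, 2):
        if levenshtein(a, b) <= max_distance:
            parent[find(a)] = find(b)

    clusters = {}
    for s in items:
        clusters.setdefault(find(s), []).append(s)
    return sorted(clusters.values())


# Hypothetical labels, invented for this example.
labels = ["color", "colour", "Color", "thunderstorm wind",
          "thunderstorm winds", "thunderstrom wind", "hail"]
clusters = cluster_strings(labels)
for group in clusters:
    print(group)
```

With these sample labels, the spelling variants of color fall into one group, the three thunderstorm wind variants into another, and hail stays on its own. Comparing every pair of strings is fine for a few hundred unique values, but the threshold needs care: too small and real variants stay apart, too large and genuinely different categories get merged.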
