The idea of tf-idf is to find the words that are important to the content of each document by decreasing the weight of commonly used words and increasing the weight of words that are not used very much in a collection or corpus of documents, in this case the group of Jane Austen’s novels as a whole.
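
To make the weighting concrete, here is a minimal sketch in Python. It assumes one common definition: term frequency is a word's count divided by the document's length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the word. The function name `tf_idf` and the three toy documents are hypothetical stand-ins for illustration, not text from the novels.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term in every document.

    docs maps a document name to its list of tokens.
    Assumed definition: tf = count / document length,
    idf = ln(number of documents / number of documents containing the term).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy corpus, invented for this sketch.
toy_corpus = {
    "doc_a": "the captain and the ship".split(),
    "doc_b": "the ball and the dance".split(),
    "doc_c": "the letter and the visit".split(),
}

scores = tf_idf(toy_corpus)

# Print each document's terms from highest to lowest tf-idf.
for name, term_scores in scores.items():
    ranked = sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)
    print(name, ranked)
```

In this toy corpus, "the" and "and" appear in every document, so their idf is ln(1) = 0 and their tf-idf is zero no matter how often they occur. Words found in only one document, such as "captain", get a positive score and rise to the top of that document's ranking.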

Interpretation