In text processing, we are interested in the words or phrases that help us differentiate a given text from the other texts in the corpus. Let’s call these words or phrases key phrases. Every text mining application needs a way to find them. An information retrieval application needs key phrases for easy retrieval and ranking of search results. A text classification system needs key phrases as the features that are fed to a classifier. Finding them is easier once the words that carry little distinguishing value have been set aside. This is where stop words come into the picture.

“Sometimes, some extremely common words which would appear to be of little value in helping select documents matching a user need are excluded from the vocabulary entirely. These words are called stop words.” (Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze)
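To make the idea concrete, here is a minimal sketch of stop word removal in Python. The stop word list, the function name `remove_stop_words`, and the simple regular-expression tokenizer are all assumptions made for illustration; a real application would likely use a much longer list and a proper tokenizer.

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only;
# lists used in practice are much longer.
STOP_WORDS = {"a", "an", "and", "are", "as", "for", "in", "is", "of", "the", "to"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Naive tokenizer: runs of lowercase letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The ranking of search results is a key part of retrieval"))
# prints ['ranking', 'search', 'results', 'key', 'part', 'retrieval']
```

What remains after filtering is a smaller set of candidate words, which is closer to the key phrases an information retrieval or classification system would actually want to work with.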