
Which words and terms get analyzed, and which do not?

Modified on: Wed, Jun 19, 2019 at 1:21 PM


Terms that provide little or no information (also called stopwords) are ignored in an analysis. 


These include articles (a, an, the, ...), pronouns (he, she, they, it, ...), conjunctions (and, but, or, ...), auxiliary verbs (can, have, do, ...), and some prepositions (of, for, ...).
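
As a rough illustration of stopword filtering, here is a minimal Python sketch. The `STOPWORDS` set and the `remove_stopwords` function are hypothetical: the set contains only the example words listed above, not the full list used in an actual analysis, and the simple whitespace tokenization is an assumption.

```python
# Hypothetical stopword list built only from the examples in this article;
# the real list used in an analysis is larger and is not shown here.
STOPWORDS = {
    "a", "an", "the",            # articles
    "he", "she", "they", "it",   # pronouns
    "and", "but", "or",          # conjunctions
    "can", "have", "do",         # auxiliary verbs
    "of", "for",                 # some prepositions
}

def remove_stopwords(text):
    """Return the lowercased, whitespace-separated tokens that are not stopwords."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOPWORDS]

print(remove_stopwords("The power button and the screen"))
# prints ['power', 'button', 'screen']
```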


The remaining words are collocated, or grouped together, if they belong to meaningful multi-term units (e.g., "Amazon Kindle Fire" or "power button" or "purchased for my wife").


Currently, collocations can be up to seven words long. They can be longer when negation words (such as "no" and "never") are collocated with other terms. For example, if "buy milk from the grocery store" is a collocation, "never will I have to buy milk from the grocery store" could also be a concept, because the negation ("never") and the intervening stopwords ("will I have to") don't count toward the seven-word limit.
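
The counting rule in the example above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual implementation: the `NEGATIONS` and `INTERVENING_STOPWORDS` sets, the `counted_length` function, and the choice to handle only a negation at the start of the phrase are all hypothetical.

```python
# Hypothetical word lists taken from the example in this article.
NEGATIONS = {"no", "never"}
INTERVENING_STOPWORDS = {"will", "i", "have", "to"}
MAX_COLLOCATION_WORDS = 7

def counted_length(phrase):
    """Count the words that go toward the limit.

    Assumption: a leading negation word and the stopwords that directly
    follow it are skipped; every remaining word is counted.
    """
    tokens = phrase.lower().split()
    skipped = 0
    if tokens and tokens[0] in NEGATIONS:
        skipped = 1
        # Skip the stopwords between the negation and the rest of the phrase.
        while skipped < len(tokens) and tokens[skipped] in INTERVENING_STOPWORDS:
            skipped += 1
    return len(tokens) - skipped

plain = "buy milk from the grocery store"
negated = "never will I have to buy milk from the grocery store"

print(counted_length(plain))    # prints 6
print(counted_length(negated))  # prints 6
print(counted_length(negated) <= MAX_COLLOCATION_WORDS)  # prints True
```

The negated phrase has eleven words in total, but under this sketch only six count toward the limit, the same as the plain phrase.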


NOTE: Most terms are not collocated.

