OVERVIEW: Which words and terms get analyzed, and which do not?

Last Updated: Sep 02, 2016
Terms that provide little or no information (also called stopwords) are ignored in our analysis. Stopwords include, among others, articles (a, an, the, ...), pronouns (he, she, they, it, ...), conjunctions (and, but, or, ...), auxiliary verbs (can, have, do, ...), and some prepositions (of, for, ...).
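
As a rough illustration of what ignoring stopwords means, the sketch below drops stopwords from a sentence. It is not our actual implementation: the STOPWORDS set contains only the example words named above (the real list is longer), and the function name remove_stopwords and the simple word-splitting rule are assumptions made for this example.

```python
import re

# Illustrative only: just the example words named in this article,
# not the full stopword list used in the analysis.
STOPWORDS = {
    "a", "an", "the",            # articles
    "he", "she", "they", "it",   # pronouns
    "and", "but", "or",          # conjunctions
    "can", "have", "do",         # auxiliary verbs
    "of", "for",                 # some prepositions
}


def remove_stopwords(text):
    """Return the lowercased words of `text` that are not in STOPWORDS."""
    # Assumed tokenization: runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOPWORDS]


print(remove_stopwords("She can have the book and the pen"))
# prints ['book', 'pen']
```

In this example only "book" and "pen" remain, because every other word in the sentence appears in the example stopword set.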

The remaining words are collocated, or grouped together, if they belong to meaningful multi-term units (e.g., "Amazon Kindle Fire"). Currently, a collocation can be up to seven words long. It can be longer when a negation word (such as "no" or "never") is collocated with other terms, because the negation word and any stopwords between it and the rest of the collocation do not count toward the seven-word limit. For example, if "buy milk from the grocery store" is a collocation, "never will I have to buy milk from the grocery store" could also be a term: the negation ("never") and the intervening stopwords ("will I have to") are not counted, so only six words count toward the limit. Most words are not collocated.
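
The counting rule for negated collocations can be sketched as follows. This is a minimal illustration, not our actual collocation logic: the function name counted_length is hypothetical, NEGATIONS holds only the two negation words named above, and EXAMPLE_STOPWORDS holds only the intervening stopwords from the example.

```python
MAX_COUNTED_WORDS = 7

# Illustrative only: the words named in this article, not the full lists.
NEGATIONS = {"no", "never"}
EXAMPLE_STOPWORDS = {"will", "i", "have", "to"}


def counted_length(term, base_collocation):
    """Return how many words of `term` count toward the length limit.

    Assumes `term` is either the base collocation itself, or a negation
    word, optional intervening stopwords, and then the base collocation.
    """
    words = term.lower().split()
    base = base_collocation.lower().split()
    split_at = len(words) - len(base)
    if split_at < 0 or words[split_at:] != base:
        raise ValueError("term must end with the base collocation")

    prefix = words[:split_at]
    if prefix:
        negation, intervening = prefix[0], prefix[1:]
        if negation not in NEGATIONS:
            raise ValueError("prefix must start with a negation word")
        if any(word not in EXAMPLE_STOPWORDS for word in intervening):
            raise ValueError("only stopwords may follow the negation")

    # The negation and intervening stopwords are not counted.
    return len(base)


term = "never will I have to buy milk from the grocery store"
base = "buy milk from the grocery store"
print(len(term.split()), counted_length(term, base))
# prints 11 6
```

The term is eleven words long in total, but only the six words of the underlying collocation count, which keeps it within the seven-word limit.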

