U.S. Patent 8423350 was granted by the United States Patent and Trademark Office in April 2013 and assigned to Google.
Methods, systems, and apparatus, including computer program products, for segmenting text for searching are disclosed. In one implementation, a method is provided. The method includes receiving text; segmenting the text into one or more unigrams; filtering the one or more unigrams to identify one or more core unigrams; and generating a searchable resource, including: for each of the one or more core unigrams: identifying a stem, indexing the stem, and associating one or more second n-grams with the indexed stem. Each of the one or more second n-grams is derived from the text and includes a core unigram that is related to the indexed stem.
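
The steps in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the abstract does not say how core unigrams are selected, how stems are identified, or how long the associated n-grams may be. The stop-word filter, the toy suffix stripper, the `max_n` limit, and every function name below are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical stop-word list; the abstract does not specify the filter.
STOP_WORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "is"}


def segment(text):
    """Split text into lowercase unigrams on whitespace, trimming punctuation."""
    tokens = (tok.strip(".,;:!?\"'()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def is_core(unigram):
    """Assumed filter: a unigram is 'core' if it is not a stop word."""
    return unigram not in STOP_WORDS


def naive_stem(unigram):
    """Toy suffix stripper standing in for a real stemmer (an assumption)."""
    for suffix in ("ing", "es", "s"):
        if unigram.endswith(suffix) and len(unigram) - len(suffix) >= 3:
            return unigram[: -len(suffix)]
    return unigram


def build_index(text, max_n=2):
    """Map each stem to the n-grams (n <= max_n) that contain its core unigram."""
    unigrams = segment(text)
    index = defaultdict(set)
    for i, unigram in enumerate(unigrams):
        if not is_core(unigram):
            continue
        stem = naive_stem(unigram)
        for n in range(1, max_n + 1):
            # Every window of length n that covers position i.
            first = max(0, i - n + 1)
            last = min(i, len(unigrams) - n)
            for start in range(first, last + 1):
                index[stem].add(" ".join(unigrams[start:start + n]))
    return dict(index)


index = build_index("Searching the indexes")
```

Under these assumptions, a query term such as "searches" reduces to the stem "search", and a lookup on that stem returns the n-grams from the original text that contain the matching core unigram.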