Patent 7617182 was granted by the United States Patent and Trademark Office and assigned to Microsoft in November 2009.
For each document in a document set, entities are identified, and a set of association rules is derived based on the appearance of those entities in the paragraphs of the documents in the set. Documents are clustered based on the association rules. As documents are added to the clusters, additional association rules specific to the clusters can optionally be derived as well.
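
The flow described above can be illustrated with a minimal Python sketch. It is not the patented method. It makes several assumptions the summary does not state: entities are found by lookup in a hypothetical fixed vocabulary (KNOWN_ENTITIES), a rule "a implies b" is kept when the two entities share enough paragraphs (min_support) and b appears in a large enough fraction of the paragraphs containing a (min_confidence), and a document joins the cluster whose rule set overlaps its own most strongly by Jaccard similarity (min_overlap). All function names, parameters, and thresholds are illustrative.

```python
from itertools import permutations

# Hypothetical entity vocabulary (assumption); a real system would use an
# entity recognizer rather than a fixed word list.
KNOWN_ENTITIES = {"microsoft", "windows", "linux", "kernel"}


def entities_in(paragraph):
    """Return the set of known entities mentioned in one paragraph."""
    words = {w.strip(".,;:!?").lower() for w in paragraph.split()}
    return words & KNOWN_ENTITIES


def derive_rules(paragraphs, min_support=2, min_confidence=0.6):
    """Derive rules (a, b), read as 'a implies b', from paragraph co-occurrence."""
    entity_sets = [entities_in(p) for p in paragraphs]
    all_entities = set().union(*entity_sets)
    rules = set()
    for a, b in permutations(sorted(all_entities), 2):
        count_a = sum(1 for s in entity_sets if a in s)
        count_ab = sum(1 for s in entity_sets if a in s and b in s)
        # count_a is at least 1 because a was drawn from the observed entities.
        if count_ab >= min_support and count_ab / count_a >= min_confidence:
            rules.add((a, b))
    return rules


def cluster_documents(documents, min_overlap=0.5):
    """Group documents whose rule sets overlap; documents maps name -> paragraphs."""
    clusters = []
    for name, paragraphs in documents.items():
        doc_rules = derive_rules(paragraphs)
        best, best_score = None, 0.0
        for cluster in clusters:
            union = doc_rules | cluster["rules"]
            # Jaccard similarity between the document's rules and the cluster's.
            score = len(doc_rules & cluster["rules"]) / len(union) if union else 0.0
            if score >= min_overlap and score > best_score:
                best, best_score = cluster, score
        if best is None:
            clusters.append(
                {"docs": [name], "paragraphs": list(paragraphs), "rules": doc_rules}
            )
        else:
            best["docs"].append(name)
            best["paragraphs"].extend(paragraphs)
            # Optional step: re-derive rules over the cluster's pooled paragraphs,
            # which may surface rules specific to the cluster.
            best["rules"] = derive_rules(best["paragraphs"])
    return clusters


if __name__ == "__main__":
    sample = {
        "doc1": ["Microsoft ships Windows.", "Windows is made by Microsoft."],
        "doc2": ["Microsoft updated Windows today.", "Many users run Windows from Microsoft."],
        "doc3": ["The Linux kernel was released.", "Linux kernel patches landed."],
    }
    for cluster in cluster_documents(sample):
        print(cluster["docs"], sorted(cluster["rules"]))
```

Re-deriving the rules each time a document joins a cluster is one simple way to model the optional cluster-specific rules. Other assignment criteria or rule-mining algorithms would fit the description equally well.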