Patent attributes
A method of detecting contextual duplicate items can include identifying a plurality of representations of items in a data repository, each item representation including one or more textual attributes. For a pair of item representations, a degree of fit between the attributes of one item representation and the other item can be calculated. The degree of fit can reflect the relevance of the attributes of the one item to the other item. A degree of association between the two item representations can be calculated based at least in part on the calculated degree of fit, and can reflect the similarity of the two items. The degree of association can be assessed to determine whether the items are contextual duplicates.
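
The flow described above can be illustrated with a minimal sketch. The abstract does not specify how the degree of fit or the degree of association is computed, so everything concrete here is an assumption made for illustration: the degree of fit is taken as the mean fraction of each attribute's tokens that also appear in the other item, the degree of association as the average of the two directional fits, and the assessment as a comparison against a hypothetical threshold of 0.6. The function names and sample items are likewise hypothetical.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase alphanumeric tokens of a textual attribute."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def degree_of_fit(item, other):
    """Assumed measure: mean share of each attribute's tokens found in `other`."""
    other_tokens = set()
    for value in other.values():
        other_tokens |= tokens(value)
    scores = []
    for value in item.values():
        attr_tokens = tokens(value)
        if attr_tokens:
            scores.append(len(attr_tokens & other_tokens) / len(attr_tokens))
    return sum(scores) / len(scores) if scores else 0.0


def degree_of_association(a, b):
    """Assumed measure: average of the two directional degrees of fit."""
    return (degree_of_fit(a, b) + degree_of_fit(b, a)) / 2


def contextual_duplicates(items, threshold=0.6):
    """Pairs of item ids whose association meets the (hypothetical) threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(items.items(), 2)
        if degree_of_association(a, b) >= threshold
    ]


# Hypothetical item representations, each with textual attributes.
items = {
    "A": {"title": "USB-C charging cable 1m", "brand": "Acme"},
    "B": {"title": "Acme 1m cable USB-C charging", "brand": "Acme"},
    "C": {"title": "Stainless steel water bottle", "brand": "Hydra"},
}

print(contextual_duplicates(items))  # [('A', 'B')]
```

Items A and B word their titles differently but share every token, so both directional fits are 1.0 and the pair is flagged; item C shares no tokens with either and is not. A practical system would likely weight attributes or use a richer relevance measure than token overlap.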