A computer-implemented method of document classification includes receiving a text document. A first classification is generated for the document, and a text corpus is searched for one or more terms from the document. Searched terms whose incidence in the text corpus is lower than a threshold incidence are flagged, and at least one further classification is generated after removing at least one flagged term from the document. An output is generated if the further classification differs from the first classification.
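
The steps of the method can be illustrated with a minimal sketch. It makes several assumptions that are not part of the source: incidence is taken to be the number of corpus documents containing a term, terms are produced by a lowercase whitespace tokenizer, flagged terms are removed one at a time, and the classifier is any callable that maps a list of tokens to a label. The names `tokenize`, `term_incidence`, `check_document`, and `toy_classify` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (an assumption, not from the source)."""
    return text.lower().split()


def term_incidence(corpus: Iterable[str]) -> Counter:
    """Count how many corpus documents contain each term."""
    counts: Counter = Counter()
    for doc in corpus:
        counts.update(set(tokenize(doc)))
    return counts


def check_document(
    document: str,
    classify: Callable[[List[str]], str],
    corpus: Iterable[str],
    threshold: int,
) -> List[Tuple[str, str, str]]:
    """Return (term, first, further) for each flagged term whose removal
    changes the classification."""
    tokens = tokenize(document)
    first = classify(tokens)  # first classification of the full document

    # Search the corpus and flag terms with incidence below the threshold.
    incidence = term_incidence(corpus)
    flagged = sorted({t for t in tokens if incidence[t] < threshold})

    outputs = []
    for term in flagged:
        # Further classification with one flagged term removed.
        reduced = [t for t in tokens if t != term]
        further = classify(reduced)
        if further != first:  # generate an output only on a change
            outputs.append((term, first, further))
    return outputs


def toy_classify(tokens: List[str]) -> str:
    """Hypothetical stand-in classifier that keys on one rare token."""
    return "spam" if "xqzt" in tokens else "ham"


CORPUS = [
    "the meeting is at noon",
    "please review the report",
    "the report is due at noon",
]

if __name__ == "__main__":
    print(check_document("please review the xqzt report", toy_classify, CORPUS, 1))
```

In this example the term `xqzt` does not occur in the corpus, so it falls below the threshold of 1 and is flagged. Removing it changes the toy classifier's label from `spam` to `ham`, so one output entry is produced. A document whose terms all meet the threshold yields no output.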