Systems and methods for automatically generating classifiers are provided. A labeled dataset is initially received. The dataset may be labeled for a positive class, a negative class, or a false positive class. N features that are predictive of the class (positive, negative, or false positive) are identified and combined into a classifier dictionary. Received medical records may be processed to make them machine readable. Features within the medical records are identified and compared against the classifier dictionary; matches indicate the classes present in the medical record. The classifier dictionary may be updated periodically in response to insufficient classification accuracy or when new data becomes available.
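
The flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation. It assumes that features are single lowercase tokens, that "predictive" means a higher document frequency in class-member records than in non-member records, and that a single dictionary match is enough to indicate a class. The function names (`tokenize`, `top_n_features`, `build_dictionary`, `classify`) and the sample records are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Assumed feature extractor: lowercase tokens with edge punctuation stripped."""
    return {tok.strip(".,;:()").lower() for tok in text.split()} - {""}


def top_n_features(labeled_records, n):
    """Return up to n tokens that are more frequent in member records.

    labeled_records is a list of (text, is_member) pairs. The scoring rule
    (difference in document frequency) is an assumption for illustration.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for text, is_member in labeled_records:
        feats = tokenize(text)
        if is_member:
            pos.update(feats)
            n_pos += 1
        else:
            neg.update(feats)
            n_neg += 1
    scores = {f: pos[f] / max(n_pos, 1) - neg[f] / max(n_neg, 1) for f in pos}
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return [f for f in ranked[:n] if scores[f] > 0]


def build_dictionary(datasets, n):
    """Combine the selected features of every class into one dictionary."""
    return {label: set(top_n_features(recs, n)) for label, recs in datasets.items()}


def classify(record_text, dictionary):
    """Return every class whose dictionary terms appear in the record."""
    feats = tokenize(record_text)
    return {label for label, terms in dictionary.items() if feats & terms}


# Hypothetical labeled dataset for a single class.
datasets = {
    "diabetes": [
        ("Patient on insulin with elevated glucose.", True),
        ("Elevated glucose, insulin dose adjusted.", True),
        ("Patient reports knee pain after a fall.", False),
        ("Routine visit, patient has no complaints.", False),
    ],
}
dictionary = build_dictionary(datasets, n=3)
print(dictionary)
print(classify("Started insulin today.", dictionary))
```

Updating the dictionary, under the same assumptions, amounts to calling `build_dictionary` again once new labeled records arrive or measured accuracy falls below a chosen threshold.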