Patent attributes
In one embodiment, a machine learning model evaluation system may define standardized, extensible class hierarchies for evaluating performance of a given machine learning model. The class hierarchies may include a plurality of target classes that formalize an expected output of the given machine learning model based on a given dataset, a plurality of output classes that formalize an actual output of the given machine learning model based on the given dataset, a plurality of metric classes that formalize a comparison of the expected output of the given machine learning model with the actual output of the given machine learning model, and a plurality of datasets. When a machine learning model is received for evaluation, the system may identify a target class, an output class, and a metric class that are applicable to the machine learning model. The system may also retrieve a dataset applicable to the machine learning model.
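By way of illustration only, the arrangement described above may be sketched as follows. The sketch assumes Python abstract base classes for the target, output, and metric hierarchies. Every name in it (`Target`, `Output`, `Metric`, `LabelTarget`, `PredictedLabelOutput`, `Accuracy`, `EvaluationSpec`, `REGISTRY`, `evaluate`, `threshold_model`) is hypothetical and does not appear in the source. The lookup of applicable classes by a task-type string and the toy dataset are likewise assumptions, since the abstract does not state how applicability is determined.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Sequence


class Target(ABC):
    """Formalizes the expected output of a model for a dataset."""

    @abstractmethod
    def expected(self, dataset):
        ...


class Output(ABC):
    """Formalizes the actual output of a model for a dataset."""

    @abstractmethod
    def actual(self, model, dataset):
        ...


class Metric(ABC):
    """Formalizes a comparison of expected output with actual output."""

    @abstractmethod
    def compare(self, expected, actual) -> float:
        ...


class LabelTarget(Target):
    def expected(self, dataset):
        # Each dataset row is assumed to be a (features, label) pair.
        return [label for _, label in dataset]


class PredictedLabelOutput(Output):
    def actual(self, model, dataset):
        # The model is assumed to be a callable mapping features to a label.
        return [model(features) for features, _ in dataset]


class Accuracy(Metric):
    def compare(self, expected, actual) -> float:
        if not expected or len(expected) != len(actual):
            raise ValueError("expected and actual must be non-empty and equal in length")
        return sum(e == a for e, a in zip(expected, actual)) / len(expected)


@dataclass(frozen=True)
class EvaluationSpec:
    """Bundles the classes and dataset applicable to one kind of model."""
    target: Target
    output: Output
    metric: Metric
    dataset: Sequence


# Hypothetical registry keyed by task type (an assumption of this sketch).
REGISTRY = {
    "classification": EvaluationSpec(
        target=LabelTarget(),
        output=PredictedLabelOutput(),
        metric=Accuracy(),
        dataset=[((0.2,), 0), ((0.9,), 1), ((0.6,), 1), ((0.4,), 1)],
    ),
}


def evaluate(model, task_type: str) -> float:
    """Identify the applicable classes, retrieve the dataset, and score the model."""
    spec = REGISTRY[task_type]
    expected = spec.target.expected(spec.dataset)
    actual = spec.output.actual(model, spec.dataset)
    return spec.metric.compare(expected, actual)


def threshold_model(features):
    # Toy stand-in for a received machine learning model.
    return 1 if features[0] > 0.5 else 0


score = evaluate(threshold_model, "classification")
```

In this sketch, a further kind of model might be supported by subclassing the three base classes and adding one registry entry, which is one possible reading of the hierarchies being "extensible".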