Patent attributes
One or more processors store rules for performing rules-based cleaning operations on a plurality of datasets, wherein each rule comprises one or more functions to be executed against a dataset during the rules-based cleaning operations, the one or more functions each having one or more associated conditions and actions, wherein the one or more actions are performed on the dataset responsive to the one or more associated conditions being satisfied. The one or more processors further apply the rules to each of the plurality of datasets to perform the rules-based cleaning operations. To apply the rules to a given dataset, the one or more processors identify an ordered list of the one or more functions to be executed with respect to the given dataset during the rules-based cleaning operations and determine, for each of the one or more functions, whether the given dataset satisfies one or more conditions associated with a respective function of the one or more functions. Responsive to the given dataset satisfying the one or more conditions associated with the respective function, the one or more processors perform, on the given dataset, one or more actions associated with the respective function and provide a derived dataset comprising at least one modification to the given dataset resulting from the one or more actions associated with the respective function.
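The flow described above can be illustrated with a minimal sketch; it is not the claimed implementation. Everything named here is hypothetical and chosen only for illustration: the `Function`, `Rule`, and `apply_rules` names, the representation of a dataset as a list of row dictionaries, and the sample conditions and actions. The sketch also assumes that conditions are evaluated against the dataset as modified by earlier functions in the ordered list, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a dataset is modeled as a list of row dictionaries.
Dataset = List[Dict[str, object]]
Condition = Callable[[Dataset], bool]
Action = Callable[[Dataset], Dataset]


@dataclass(frozen=True)
class Function:
    """A function with its associated conditions and actions."""
    name: str
    conditions: List[Condition]
    actions: List[Action]


@dataclass(frozen=True)
class Rule:
    """A rule comprising an ordered list of functions."""
    name: str
    functions: List[Function]


def apply_rules(rules: List[Rule], dataset: Dataset) -> Dataset:
    """Return a derived dataset; the input dataset is left unmodified."""
    derived = [dict(row) for row in dataset]  # copy rows so the source stays intact
    for rule in rules:
        for fn in rule.functions:  # functions run in their listed order
            # Actions are performed only if every associated condition holds.
            if all(cond(derived) for cond in fn.conditions):
                for action in fn.actions:
                    derived = action(derived)
    return derived


# Hypothetical conditions and actions for the example.
def has_column(col: str) -> Condition:
    return lambda ds: any(col in row for row in ds)


def strip_whitespace(col: str) -> Action:
    def action(ds: Dataset) -> Dataset:
        return [
            {**row, col: row[col].strip()} if isinstance(row.get(col), str) else dict(row)
            for row in ds
        ]
    return action


def drop_missing(col: str) -> Action:
    return lambda ds: [dict(row) for row in ds if row.get(col) is not None]


rules = [
    Rule("basic_cleanup", [
        Function("trim_names", [has_column("name")], [strip_whitespace("name")]),
        Function("require_id", [has_column("id")], [drop_missing("id")]),
    ])
]

raw = [{"id": 1, "name": "  Ada "}, {"id": None, "name": "Bob"}]
cleaned = apply_rules(rules, raw)
print(cleaned)
```

In this sketch, a function whose conditions are not satisfied is skipped, so a dataset lacking a `name` column would pass through `trim_names` unchanged while still being filtered by `require_id`.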