Patent attributes
Some embodiments described herein cover a machine learning architecture with a separated perception subsystem and application subsystem. These subsystems can be co-trained. In one example embodiment, a data item is received and information from the data item is processed by a first node to generate a first feature vector comprising a plurality of features, each of the plurality of features having a similarity value representing a similarity to one of a plurality of centroids. The first node selects a subset of the features from the first feature vector, the subset containing the one or more features that have the highest similarity values. The first node generates a second feature vector from the first feature vector by replacing the similarity values of features that are not in the subset with zeros. A second node then processes the second feature vector to determine an output.
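The example embodiment above can be illustrated with a minimal sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the similarity measure (here exp of the negative squared Euclidean distance), the subset size k, and the second node (here a plain weighted sum). The function names similarity_vector, sparsify_top_k, and second_node are hypothetical.

```python
import math


def similarity_vector(x, centroids):
    """First feature vector: one similarity value per centroid.

    Assumption: similarity = exp(-squared Euclidean distance). The source
    does not specify which similarity measure is used.
    """
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        for c in centroids
    ]


def sparsify_top_k(features, k):
    """Second feature vector: keep the k highest similarity values, zero the rest."""
    # Indices ordered by descending similarity; sorted() is stable, so ties
    # are broken in favor of the lower index.
    order = sorted(range(len(features)), key=lambda i: -features[i])
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(features)]


def second_node(sparse_features, weights):
    """Hypothetical application node: a weighted sum of the sparse features."""
    return sum(w * f for w, f in zip(weights, sparse_features))


# Illustrative data: three centroids in 2-D and one input near the second centroid.
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
x = [0.9, 1.1]

first = similarity_vector(x, centroids)   # dense similarity values
second = sparsify_top_k(first, k=1)       # only the best match survives
output = second_node(second, weights=[1.0, 2.0, 3.0])
```

With k=1, only the similarity to the nearest centroid is passed on, so the second node sees a sparse vector whose remaining entries are zero. Larger values of k would retain more of the highest-similarity features.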