Patent attributes
An apparatus includes a memory and a processor. The memory stores a machine learning algorithm configured to decide between using an algorithmic cart and a virtual cart to process a transaction. The processor receives feedback for a decision made by the algorithm, the feedback indicating whether the algorithmic cart and the virtual cart match. The processor assigns a reward to the feedback. A first positive reward is assigned when the algorithmic cart is selected and the feedback indicates that the carts match. A second positive reward is assigned when the virtual cart is selected and the feedback indicates that the carts do not match. A first negative reward is assigned when the algorithmic cart is selected and the feedback indicates that the carts do not match. A second negative reward is assigned when the virtual cart is selected and the feedback indicates that the carts match. The processor uses the reward to update the machine learning algorithm.
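
The four reward cases described in the abstract can be illustrated with a minimal Python sketch. The names `Cart` and `assign_reward` and the reward magnitudes of 1.0 are hypothetical and chosen only for illustration; the abstract specifies only the sign of each reward and does not say how the reward is used to update the algorithm, so no update step is shown.

```python
from enum import Enum


class Cart(Enum):
    """The two options the algorithm decides between (hypothetical type)."""
    ALGORITHMIC = "algorithmic"
    VIRTUAL = "virtual"


# Assumed magnitudes; the abstract fixes only whether each reward
# is positive or negative.
FIRST_POSITIVE = 1.0    # algorithmic cart selected, carts match
SECOND_POSITIVE = 1.0   # virtual cart selected, carts do not match
FIRST_NEGATIVE = -1.0   # algorithmic cart selected, carts do not match
SECOND_NEGATIVE = -1.0  # virtual cart selected, carts match


def assign_reward(selected: Cart, carts_match: bool) -> float:
    """Map a decision and its feedback to one of the four rewards."""
    if selected is Cart.ALGORITHMIC:
        return FIRST_POSITIVE if carts_match else FIRST_NEGATIVE
    return SECOND_NEGATIVE if carts_match else SECOND_POSITIVE
```

Under these assumptions, the reward is positive exactly when the selection agrees with the feedback: the algorithmic cart when the carts match, and the virtual cart when they do not.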