Patent attributes
Methods and systems for using reinforcement learning to optimize promotions. A promotion for a prepaid calling card can be offered to a customer using a reinforcement learning model with a sensitivity parameter. The reinforcement learning model can estimate a time period during which the customer will purchase the prepaid calling card. The customer's reaction to the promotion can be observed. A reward or a penalty can be collected based on the customer's reaction. The reinforcement learning model can be adapted based on the reward or the penalty to optimize the timing of the promotion by estimating a new time period during which the customer will purchase the prepaid calling card. A proxy for the reward and/or the penalty can comprise frequency of usage.
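
The offer, observe, collect, and adapt loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not disclose the update rule, so everything here is an assumption made for illustration: the class name PromotionTimingModel, the fixed window half-width, the +1/-1 reward and penalty values, and the use of the sensitivity parameter as a step size are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PromotionTimingModel:
    """Hypothetical model that tracks the estimated day of a customer's next purchase."""

    estimated_day: float  # current estimate, in days since the last purchase
    sensitivity: float    # assumed here to act as a step size in (0, 1]
    window: float = 2.0   # assumed half-width of the estimated purchase period, in days

    def purchase_period(self):
        """Return (start, end) of the estimated purchase period."""
        return (self.estimated_day - self.window, self.estimated_day + self.window)

    def signal(self, observed_day):
        """Return +1.0 (reward) if the purchase fell inside the period, else -1.0 (penalty)."""
        start, end = self.purchase_period()
        return 1.0 if start <= observed_day <= end else -1.0

    def adapt(self, observed_day):
        """Move the estimate toward the observed purchase day and return the signal."""
        r = self.signal(observed_day)
        # Assumption: a penalty triggers a full-size step, a reward only a half-size refinement.
        step = self.sensitivity if r < 0 else self.sensitivity * 0.5
        self.estimated_day += step * (observed_day - self.estimated_day)
        return r


# Usage: the customer keeps buying around day 21, earlier than the initial estimate of day 30.
model = PromotionTimingModel(estimated_day=30.0, sensitivity=0.5)
for observed in [20.0, 22.0, 21.0]:
    model.adapt(observed)
    print(model.purchase_period())
```

In this sketch a larger sensitivity makes the estimated period follow each observation more closely, while a smaller one smooths over noisy purchase dates. A proxy such as frequency of usage could replace the +1/-1 signal, but how the patent maps that proxy to a reward or penalty is not stated in the abstract.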