A model-free reinforcement learning technique in machine learning and data mining that estimates, for each state, the expected long-term reward (the Q-value) of every available action, without requiring a model of the environment. Given sufficient exploration and a suitably decaying learning rate, Q-learning converges to an optimal action-selection policy for any finite Markov decision process (MDP).
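
As a rough illustration, the sketch below runs tabular Q-learning on a hypothetical five-state chain in which only the move into the last state is rewarded. The environment, the function names (`step`, `q_learning`, `greedy_policy`), and the hyperparameter values (`alpha`, `gamma`, `epsilon`) are assumptions made for this example and are not part of the definition above.

```python
import random

N_STATES = 5       # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment: reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table from sampled transitions only (no model of the environment)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability epsilon, else act greedily
            # (ties between equal Q-values are broken at random).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy moves right in every non-terminal state, which is the optimal policy for the chain. A fixed learning rate is used here for brevity; the convergence guarantee for general finite MDPs assumes a decaying one.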

