Reinforcement Learning

An area of machine learning concerned with how machines and software agents should act in a specific context to maximize performance, guided by a reward known as the reinforcement signal.

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. It allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize their performance. The agent needs only simple reward feedback to learn its behavior; this feedback is known as the reinforcement signal.

Reinforcement learning is defined by a specific type of problem, and all its solutions are classed as reinforcement learning algorithms. In the problem, an agent is supposed to decide the best action to select based on its current state. The environment is typically formulated as a Markov decision process (MDP), as many reinforcement learning algorithms for this context use dynamic programming techniques. The main difference between the classical dynamic programming techniques and reinforcement learning algorithms is that the latter do not need knowledge of the MDP, and they target large MDPs where exact methods become impractical. The problem has also been studied in the theory of optimal control, although most studies there are concerned with the existence of optimal solutions and their characterization, not with the learning or approximation aspects. In economics and game theory, reinforcement learning may be used to analyze how equilibrium may arise under bounded rationality.
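As a point of comparison, the classical dynamic programming approach can be sketched with value iteration on a tiny MDP whose transition probabilities and rewards are fully known. The three states, the two actions, the numbers, and the discount factor gamma below are all illustrative assumptions and do not come from this article.

```python
# Hypothetical 3-state MDP, fully known in advance.
# TRANSITIONS[state][action] = list of (probability, next_state, reward).
TRANSITIONS = {
    "start": {
        "left": [(1.0, "start", 0.0)],
        "right": [(1.0, "middle", 0.0)],
    },
    "middle": {
        "left": [(1.0, "start", 0.0)],
        "right": [(0.8, "goal", 10.0), (0.2, "start", 0.0)],
    },
    "goal": {},  # terminal state: no actions available
}


def action_value(outcomes, values, gamma):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)


def value_iteration(transitions, gamma=0.9, tolerance=1e-8):
    """Repeatedly apply the Bellman optimality update until values settle."""
    values = {state: 0.0 for state in transitions}
    while True:
        largest_change = 0.0
        for state, actions in transitions.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(action_value(o, values, gamma) for o in actions.values())
            largest_change = max(largest_change, abs(best - values[state]))
            values[state] = best
        if largest_change < tolerance:
            return values


def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in every non-terminal state, the action with the highest value."""
    return {
        state: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for state, actions in transitions.items()
        if actions
    }


values = value_iteration(TRANSITIONS)
policy = greedy_policy(TRANSITIONS, values)
```

This only works because `TRANSITIONS` is given up front. A reinforcement learning algorithm would instead have to estimate good actions from sampled experience, without ever reading such a table.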

The simplest context in which to think about reinforcement learning is in games with a clear objective and a point system.

For example, consider a game in which a mouse is looking for the cheese at the end of a maze (+500 points) or the lesser reward of water along the way (+10 points). Meanwhile, the mouse tries to avoid electric shocks (-100 points).

The reward is not always immediate. Here, the robot-mouse may have to travel a long stretch of the maze, walking through its paths and facing several decision points, before it reaches the cheese.
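One common way to compare an immediate small reward with a larger delayed one is the discounted return, in which a reward received k steps in the future is weighted by gamma to the power k. The sketch below uses the point values from the maze example; the discount factor gamma = 0.9 and the two reward sequences are assumptions made for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step k is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return afterwards).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequences for two routes through the maze.
water_only = [10]                   # drink the water immediately, then stop
long_walk = [0, 0, 0, 0, 500]       # four unrewarded steps, then the cheese

water_value = discounted_return(water_only)
cheese_value = discounted_return(long_walk)
```

Under these assumptions the long walk is still worth far more than the water, because 500 discounted over four steps remains much larger than 10. A smaller gamma would make the agent more short-sighted.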

The agent observes the environment, takes an action to interact with the environment, and receives a positive or negative reward.
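This observe, act, receive-reward cycle can be sketched as a short loop. The `CorridorEnv` class, its `reset` and `step` methods, and the one-dimensional layout are hypothetical names and simplifications chosen for this example; only the point values are taken from the maze description above.

```python
class CorridorEnv:
    """Hypothetical 1-D maze with cells 0..5.

    Cell 0 gives an electric shock, cell 3 holds water (one serving),
    and cell 5 holds the cheese. The mouse starts in cell 2.
    """

    def __init__(self):
        self.position = 2
        self.water_left = True

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 2
        self.water_left = True
        return self.position

    def step(self, action):
        """Move by -1 (left) or +1 (right); return (observation, reward, done)."""
        self.position += action
        if self.position == 0:
            return self.position, -100, True   # shock ends the episode
        if self.position == 5:
            return self.position, 500, True    # cheese ends the episode
        if self.position == 3 and self.water_left:
            self.water_left = False
            return self.position, 10, False    # water can be collected once
        return self.position, 0, False


def run_episode(env, policy, max_steps=100):
    """Run the agent-environment loop and return the total reward collected."""
    observation = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(observation)                  # agent chooses an action
        observation, reward, done = env.step(action)  # environment responds
        total_reward += reward
        if done:
            break
    return total_reward


always_right = run_episode(CorridorEnv(), lambda observation: +1)
always_left = run_episode(CorridorEnv(), lambda observation: -1)
```

Here the policy is a fixed function of the observation. A learning agent would additionally use each reward to adjust that function between steps or episodes.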

With the advance of neural networks, deep reinforcement learning, a strategy that uses neural networks to evaluate states (e.g. by estimating Q-values), has become more popular. It allows researchers and engineers to create agents that do well in more complex environments.
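To make the idea of a Q-value concrete, here is a minimal tabular Q-learning sketch. It stores one number per (state, action) pair in a dictionary; deep reinforcement learning replaces that table with a neural network, which this dependency-free example does not attempt. The corridor layout, the learning rate `alpha`, the discount `gamma`, the episode count, and the uniformly random behaviour policy are all assumptions for illustration.

```python
import random

ACTIONS = (-1, +1)               # step left or step right
SHOCK, START, CHEESE = 0, 2, 4   # hypothetical corridor with cells 0..4


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    next_state = state + action
    if next_state == CHEESE:
        return next_state, 500.0, True
    if next_state == SHOCK:
        return next_state, -100.0, True
    return next_state, 0.0, False


def q_learning(episodes=1000, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q-values from sampled transitions only, without a model."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(SHOCK, CHEESE + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore with random actions; Q-learning is off-policy, so it
            # can still learn the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best-looking action in every non-terminal cell."""
    return {
        s: max(ACTIONS, key=lambda a: q[(s, a)])
        for s in range(SHOCK + 1, CHEESE)
    }


q_table = q_learning()
learned_policy = greedy_policy(q_table)
```

After training, the greedy policy should head right toward the cheese from every cell. A table like `q_table` stops being practical when there are too many states to enumerate, which is the situation where a neural network approximation of the Q-values is used instead.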

Due to its generality, reinforcement learning is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms.

Practical Applications of Reinforcement Learning

  • Computer Games

Further reading

  • Luca M. Gambardella and Marco Dorigo. Ant-Q: A Reinforcement Learning approach to the traveling salesman problem. Academic paper.
  • Michael L. Littman and Csaba Szepesvári. A Generalized Reinforcement-Learning Model: Convergence and Application.
  • Shakir Mohamed and Danilo Jimenez Rezende. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning. Academic paper.
  • Vishal Maini. Machine Learning for Humans, Part 5: Reinforcement Learning. Exploration and exploitation. Markov decision processes. Q-learning, policy learning, and deep reinforcement learning.
  • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller. Playing Atari with Deep Reinforcement Learning.
  • Xin Xu, Lei Zuo and Zhenhua Huang. Reinforcement learning algorithms with function approximation: Recent advances and applications. Academic paper.

Companies

  • DeepMind. CEO: Demis Hassabis. Location: London. Products/Services: AI research and application.
