Other attributes
Reservoir computing is an approach to recurrent neural network design and training that maps input signals into higher-dimensional computational spaces through a fixed, nonlinear system called a reservoir. The reservoir is treated as a black box from which a simple readout mechanism is trained to read the state of the reservoir and map it to the desired output. Reservoir computing is well suited for temporal or sequential data processing. This computing setup features two key elements: a dynamical system that can respond to inputs (the reservoir) and a readout layer that is used to analyze the state of the system.
Reservoir computing differs from traditional recurrent neural network (RNN) learning techniques by making a conceptual and computational separation between the reservoir and the readout. In contrast to traditional supervised learning, the weights to the input and within the reservoir are set at the start of learning and do not change, so training errors influence only the weights of the readout layer. In traditional supervised learning, by comparison, the error between the desired output and the computed output influences the weights of the entire network.
The reservoir in reservoir computing is the internal structure of the computer. The reservoir must have two properties:
- It must be made of individual, non-linear units
- It must be capable of storing information
Each unit responds to an input nonlinearly, and it is this nonlinear response that allows reservoir computers to solve complex problems. The reservoir stores information by connecting the units in recurrent loops, where previous input affects the next response; this dependence of the reaction on the past is what allows the computer to be trained to complete specific tasks.
Reservoirs can be virtual or physical. Virtual reservoirs are typically randomly generated and designed like neural networks, and they can further be designed to have nonlinearity and recurrent loops. Unlike trained neural networks, however, the connections between units are randomized and remain unchanged during computation.
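As a rough illustration of how such a virtual reservoir might be set up, the following NumPy sketch generates random, fixed input and recurrent connections of the kind described above. The reservoir size, sparsity, and spectral radius below are arbitrary illustrative assumptions, not values from any specific published design.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_units = 100          # reservoir size (illustrative choice)
sparsity = 0.1         # fraction of nonzero recurrent connections
spectral_radius = 0.9  # common heuristic for keeping a fading "echo" of past inputs

# Random sparse recurrent connections; these stay fixed and are never trained.
W = rng.uniform(-1.0, 1.0, (n_units, n_units))
W[rng.random((n_units, n_units)) > sparsity] = 0.0

# Rescale so the largest eigenvalue magnitude equals the chosen spectral radius.
W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))

# Random fixed input weights for a one-dimensional input signal.
W_in = rng.uniform(-1.0, 1.0, (n_units, 1))
```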
Physical reservoirs are possible because of the nonlinearity of certain natural systems. The interaction between ripples on the surface of water contains the nonlinear dynamics required in reservoir creation. A pattern-recognition reservoir computer was developed by using electric motors to create input ripples and analyzing the resulting ripples in the readout. The framework of exploiting physical systems as information-processing devices is especially suited for edge computing, in which information is processed at the edge in a decentralized manner to reduce the adaptation delays caused by data-transmission overhead.
The readout is a neural network layer that performs a linear transformation on the output of the reservoir. The weights of the readout layer are, in turn, trained through analyzing the spatiotemporal patterns of the reservoir after excitation by known inputs, and utilizing training methods such as linear regression or ridge regression. Because the readout implementation depends on the reservoir patterns, the details of readout methods are tailored to a specific reservoir.
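A minimal sketch of such a readout, assuming the reservoir's responses to a known input have already been collected into a state matrix, is a single linear map fitted by least squares. The function names and array shapes here are illustrative assumptions rather than a specific published implementation.

```python
import numpy as np

def train_readout(states, targets):
    """Fit linear readout weights with ordinary least squares.

    states  : (T, n_units) reservoir responses recorded for a known input
    targets : (T, n_outputs) desired outputs for that input
    """
    W_out, *_ = np.linalg.lstsq(states, targets, rcond=None)
    return W_out                    # shape (n_units, n_outputs)

def readout(states, W_out):
    # The readout itself is only a linear transformation of the reservoir state.
    return states @ W_out
```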
An early example of reservoir computing, the context reverberation network architecture has an input layer that feeds into a high-dimensional dynamical system read out by a trainable single-layer perceptron. In this network, two kinds of dynamical systems were described:
- A recurrent neural network with fixed random weights
- A continuous reaction-diffusion system, inspired by Alan Turing's model of morphogenesis
At the trainable layer, the perceptron associates the current input with the signals reverberating in the dynamical system; the latter were said to provide a dynamic context for the input. In the language of later work, the reaction-diffusion system served as a reservoir.
Echo state networks (ESN) provide an architecture and a supervised learning principle for recurrent neural networks. The echo state network principle drives a large, random, fixed recurrent neural network with the input signal, thereby inducing a nonlinear response signal in each neuron of this 'reservoir' network. A desired output signal is then obtained as a trainable linear combination of all of these response signals.
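The following sketch shows how such a fixed random network can be driven by a one-dimensional input so that each neuron yields a nonlinear response signal. The leaky-integrator tanh update and all parameter values are illustrative assumptions; ESN implementations vary in these details.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_units = 200
W = rng.uniform(-0.5, 0.5, (n_units, n_units))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # fix the spectral radius
W_in = rng.uniform(-1.0, 1.0, n_units)

def run_reservoir(inputs, leak=0.3):
    """Drive the fixed random network with a 1-D input signal and collect
    the nonlinear response of every neuron at every time step."""
    x = np.zeros(n_units)
    states = []
    for u in inputs:
        x = (1.0 - leak) * x + leak * np.tanh(W @ x + W_in * u)  # leaky update
        states.append(x.copy())
    return np.asarray(states)   # (T, n_units): the reservoir's response signals

# A desired output is then a trainable linear combination of these responses,
# e.g. y_pred = run_reservoir(u_sequence) @ W_out for some trained W_out.
```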
A liquid state machine (LSM) is a reservoir computer that uses a spiking neural network for computation. The name comes from an analogy to a stone dropped into a body of water or other liquid: the ripples it generates convert the input (the falling stone) into a pattern of liquid displacement. An LSM consists of a large collection of units (called nodes or neurons) that receive time-varying input from external sources and from other nodes, to which they are randomly connected. The recurrent nature of the connections turns the time-varying input into a spatio-temporal pattern of activations across the network nodes, which is in turn read out by linear discriminant units. The result is the computation of nonlinear functions of the input; given a large enough variety of such nonlinear functions, linear combinations of them can perform mathematical operations and achieve tasks such as speech recognition and computer vision.
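A highly simplified sketch of this idea uses leaky integrate-and-fire-style units with sparse random connections and an exponential spike trace as the readout feature. The thresholds, decay constants, and connection statistics below are illustrative assumptions; real LSMs use far more detailed, biologically inspired parameters.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 100                                       # number of spiking units
W = rng.normal(0.0, 0.5, (n, n))
W[rng.random((n, n)) > 0.1] = 0.0             # sparse random recurrent links
W_in = rng.normal(0.0, 1.0, n)                # random input projection

u = (rng.random(300) < 0.05).astype(float)    # toy input spike train

v = np.zeros(n)                               # membrane potentials
trace = np.zeros(n)                           # low-pass filtered spike trains
states = []

for t in range(len(u)):
    spikes = (v >= 1.0).astype(float)         # threshold crossing emits a spike
    v = np.where(spikes > 0, 0.0, v)          # reset the neurons that fired
    v = 0.95 * v + W @ spikes + W_in * u[t]   # leaky integration of inputs
    trace = 0.9 * trace + spikes              # exponential trace of activity
    states.append(trace.copy())

states = np.asarray(states)   # (T, n): features on which a linear readout is fit
```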
The extension of the reservoir computing framework toward deep learning, with the introduction of Deep Reservoir Computing and of the Deep Echo State Network (DeepESN) allows the development of trained models for hierarchical processing of temporal data. As well, DeepESN enables the investigation of the inherent role of layered composition in recurrent neural networks.
Quantum reservoir computing utilizes the nonlinear nature of quantum mechanical interactions or processes to form a reservoir. It may also be accomplished using linear reservoirs, when the injection of the input to the reservoir creates the nonlinearity. As well, the possible combination of quantum devices with machine learning could lead to the development of quantum neuromorphic computing.
Reservoir computing has, for several reasons, advantages over classical fully trained recurrent neural networks. The reservoir computing paradigm has facilitated the practical application of recurrent neural networks, and reservoir computer-trained recurrent neural networks have outperformed classically trained recurrent neural networks in many tasks. The advantages of reservoir computing over other forms of recurrent neural network training include:
- Training is performed at the readout stage, simplifying the overall training process.
- The computational power for reservoir computing comes from naturally available systems, either classical or quantum mechanical, and can be utilized to reduce the computational cost.
- The reservoir without adaptive updating is amenable to hardware implementation using a variety of physical systems, substrates, and devices.
- Physical reservoir computing approaches have the potential for low-energy, high-performance computing.
In a reservoir computing framework, the reservoir is fixed and does not need to be trained or weighted. Rather, the readout is trained with a method such as linear regression or classification. Compared to other recurrent neural networks, learning can therefore occur faster and at lower training cost.
To train an RNN conventionally, a backpropagation-through-time (BPTT) method is often used. In the BPTT method the weights of the network are tuned toward a target function. This takes time and has been known to be unstable, in that it cannot always obtain the optimal set of weights after learning.
In a reservoir computer, by contrast, only the readout is trained towards a target function, which results in fewer parameters needing to be tuned and a reduction in training time. More specifically, if the readout is set as a linear layer with static weights, training can be executed with linear regression or ridge regression, and the optimal weights can be obtained through a batch learning procedure, making the learning process simple and stable.
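A sketch of such a batch procedure, assuming the reservoir states and target outputs have already been recorded, could look like the following; the washout length and regularization strength are illustrative assumptions. The single closed-form solve is what keeps the learning simple and stable.

```python
import numpy as np

def batch_train_readout(states, targets, ridge=1e-6, washout=50):
    """Batch ridge-regression training of a static linear readout.

    The first `washout` steps are discarded so the reservoir's arbitrary
    initial state does not bias the fit; the remaining states are used in a
    single closed-form solve.
    """
    S = states[washout:]                       # (T - washout, n_units)
    Y = targets[washout:]                      # (T - washout, n_outputs)
    A = S.T @ S + ridge * np.eye(S.shape[1])   # regularized normal equations
    return np.linalg.solve(A, S.T @ Y)         # optimal readout weights
```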
Another advantage to reservoir computing is the ease in multi-tasking or sequential learning. In the approach of BPTT, an entire network is first optimized for a task and then additionally trained for other tasks, which can interfere during the update of weights within the same network. In this situation, there is a danger that a network forgets previously learned tasks. In the reservoir computing framework, because training occurs at the readout part, no interference occurs among tasks, so multi-tasking can be safely implemented.
The arbitrariness and diversity in the choice of a reservoir is another advantage over conventional training frameworks. The basic concept of reservoir computing is to exploit the intrinsic dynamics of a reservoir by outsourcing learning to the readout part. This means the reservoir does not have to be a recurrent neural network but can be any dynamical system, which opens the possibility of exploiting physical dynamics as a reservoir instead of simulating dynamics inside a computer. This makes the framework different from other machine learning methods. As well, reservoir computing could provide insight into dynamical systems studied in fields such as physics, materials science, and biological science.
Because of their relatively low computational cost and simplicity of use, reservoir computing systems are considered well suited for forecasting dynamical systems (a minimal forecasting sketch follows the list below). This could include training recurrent neural networks for:
- Speech recognition
- Computer vision
- Dynamic systems prediction (weather, financial data)
- System control
- Identification
- Adaptive filtering
- Noise reduction
- Robotics (gait generation, planning)
- Vision and speech (recognition, processing, production)
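As referenced above, the following is a minimal end-to-end forecasting sketch on a toy dynamical signal: a fixed random reservoir is driven by a sine wave, and only a ridge-regression readout is trained to predict the signal one step ahead. The reservoir size, signal, and hyperparameters are all arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
n_units, washout, ridge = 300, 100, 1e-6

# Toy dynamical signal to forecast one step ahead.
t = np.arange(2000)
signal = np.sin(0.1 * t)
u, y = signal[:-1], signal[1:]                 # input and one-step-ahead target

# Fixed random reservoir (never trained).
W = rng.uniform(-0.5, 0.5, (n_units, n_units))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-1.0, 1.0, n_units)

# Drive the reservoir and record its states.
x, states = np.zeros(n_units), []
for u_t in u:
    x = np.tanh(W @ x + W_in * u_t)
    states.append(x.copy())
S = np.asarray(states)

# Train only the linear readout with ridge regression, then predict.
A = S[washout:].T @ S[washout:] + ridge * np.eye(n_units)
w_out = np.linalg.solve(A, S[washout:].T @ y[washout:])
prediction = S @ w_out
print("mean squared error:", np.mean((prediction[washout:] - y[washout:]) ** 2))
```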
In research predating reservoir computing, some have argued that the brain exhibits similarities to a reservoir computing framework, especially in the random connection between neurons and linear learning on the output layer. In this way, the usage of the reservoir computing paradigm could have potential as either a machine learning tool or as a biologically feasible mechanism.
The reservoir computing framework is used to test hardware systems for neuromorphic computing. One preferred task for benchmarking devices is speech recognition. This requires acoustic transformations from sound waveforms with varying amplitudes to frequency domain maps. The use of speech recognition has been shown to be an appropriate benchmark for different hardware, as the nonlinearity in acoustic transformation plays a critical role in the speech recognition success rate.
The possibility of reservoir computing realized on purpose-built hardware has suggested speed gains and further power savings in implementation. However, it has been considered difficult to develop hardware for reservoir computing, as many components, perhaps one for each reservoir neuron, could be required, each needing to be adapted whenever a new architecture was devised.
Nevertheless, a passive silicon photonics reservoir has been proposed. The generic chip has been demonstrated to perform arbitrary Boolean logic operations with memory, as well as 5-bit header recognition at up to 12.5 Gbit/s. The chip was also capable of performing isolated spoken digit recognition. This success suggests that exploiting optical phase for computing could be scalable to larger networks and higher bitrates, offering integrated photonic reservoir computing for a range of applications.
Photonic integrated circuits have been further proposed for performing prediction and classification tasks, with the main challenge for the miniaturization of photonic reservoir computing being its implementation as integrated circuits. Reservoir computing with a photonic integrated circuit has been demonstrated using a semiconductor laser and a short external cavity. A method for increasing the number of virtual nodes was also proposed, in which delayed feedback with short node intervals and outputs from multiple delay times was used. A photonic integrated circuit of this construction using optical feedback has been shown to outperform a similar photonic integrated circuit without optical feedback, specifically in prediction tasks.
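As a rough conceptual sketch of the virtual-node idea (not of any specific photonic device), a single nonlinear node with delayed feedback can be time-multiplexed in software: each input sample is spread across several short node intervals by a fixed random mask, and the node states sampled at those intervals act as virtual nodes. The version below is a simplified discrete model that ignores the coupling between neighbouring virtual nodes present in real delay systems; all names and parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
n_virtual = 50                                # virtual nodes per input sample
mask = rng.choice([-0.5, 0.5], n_virtual)     # fixed random input mask

def delay_reservoir(inputs, feedback=0.8, scale=0.5):
    """Single nonlinear node with delayed feedback, time-multiplexed so that
    states sampled at successive short node intervals act as virtual nodes."""
    delayed = np.zeros(n_virtual)             # node states one delay period ago
    states = []
    for u in inputs:
        current = np.empty(n_virtual)
        for i in range(n_virtual):
            # Each virtual node mixes the masked input with the delayed feedback.
            current[i] = np.tanh(scale * mask[i] * u + feedback * delayed[i])
        delayed = current
        states.append(current.copy())
    return np.asarray(states)                 # (T, n_virtual) readout features
```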
In February 2021, a study by Nakajima et al. looked at the possibility of using an on-chip photonic implementation of a simplified recurrent neural network. This study used an integrated coherent linear photonic processor, and, in contrast to previous approaches, the input and recurrent weights were encoded in the spatiotemporal domain using photonic linear processing. This could enable computing beyond the input electrical bandwidth of traditional computing systems. The device was also capable of processing multiple wavelength inputs over the telecom C-band simultaneously. The tests showed good performance for chaotic time-series forecasting and image classification. The study also confirmed the potential of photonic neuromorphic processing towards peta-scale neuromorphic supercomputing on a photonic chip.
In machine learning, feed-forward structures, such as artificial neural networks, graphical Bayesian models, and kernel methods, have been studied for the processing of non-temporal problems. These methods are well understood due to their non-dynamic nature, and the feed-forward network is a fundamental building block of a neural network. However, part of the appeal of neural networks is the possibility of paralleling the human brain, whose network architecture is not feedforward; this understanding led to recurrent neural networks. In 2001, amid the difficulties of developing recurrent neural networks, a new approach to their design and training was proposed independently by Wolfgang Maass and Herbert Jaeger. These respective approaches were called Liquid State Machines and Echo State Networks.
The Liquid State Machine (LSM), proposed by Wolfgang Maass, was originally presented as a framework to perform real-time computation on temporal signals. Most descriptions use an abstract cortical microcolumn model, in which a 3D structured, locally connected network of spiking neurons is created using biologically inspired parameters and excited by external input spikes. The responses of all neurons are projected to the next cortical layer, where the training is performed. This readout is usually modeled as a simple linear regression function, but the description of the LSM supports more advanced readout layers such as parallel perceptrons. Because the biologically inspired parameters leave LSMs slow and computationally intensive, these systems have not been commonly used for engineering purposes.
The Echo State Network (ESN), developed around the same time by Herbert Jaeger, consists of a random, recurrent network of analog neurons driven by a one-dimensional or multi-dimensional time signal. The activations of the neurons are used for linear classification and regression tasks. The ESN was introduced as a better way to use the computational power of recurrent neural networks without needing to train the internal weights. In this framework, the reservoir works as a complex nonlinear dynamic filter that transforms input signals via a temporal map. It is possible to use an ESN to solve several classification tasks on an input signal by adding multiple readouts to a single reservoir. Because ESNs are more motivated by machine learning theory, they typically use sigmoid neurons rather than the biologically inspired spiking models of LSMs.
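As an illustration of the multiple-readout idea, several independent linear readouts can be fitted on the same recorded reservoir states, one per task, without re-running or retraining the reservoir. The function and task names below are hypothetical.

```python
import numpy as np

def train_multiple_readouts(states, targets_per_task, ridge=1e-6):
    """Fit one independent linear readout per task on shared reservoir states."""
    n = states.shape[1]
    A = states.T @ states + ridge * np.eye(n)   # shared, task-independent part
    return {task: np.linalg.solve(A, states.T @ y)
            for task, y in targets_per_task.items()}

# e.g. readouts = train_multiple_readouts(S, {"classify": y_labels, "predict": y_next})
```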
Proposed by Schiller and Steil, the algorithm called Backpropagation-Decorrelation was put forward as a new RNN training method that also treated the reservoir and readout layer separately, offering fast convergence and good practical results. The proposition also provided a conceptual bridge between traditional BPTT and the reservoir computing framework.
Because the LSM and ESN models rested on different underlying theories, the literature on them was spread across different research domains that rarely, if ever, interacted. Once they did, it was proposed that the two ideas be combined into a common research stream, which was named reservoir computing. These methods, along with Backpropagation-Decorrelation, are now considered reservoir computing.
The concept of reservoir computing uses the recursive connections within neural networks to create a complex dynamical system, generalizing the LSM and ESN. Recurrent neural networks had previously been found useful for language processing and dynamic system modeling, but training them was challenging and computationally expensive. Reservoir computing reduced these training-related challenges by using a fixed dynamic reservoir and requiring only the output to be trained. Reservoir computing was also shown to be able to use a variety of nonlinear dynamical systems as a reservoir to perform computations. Increased interest in reservoir computing has led to research into the use of photonics and lasers for computation, in order to increase efficiency compared to electrical components.
Reservoir computing has also been extended to physical, or natural, computing devices. For example, one experiment used a bucket of water into which inputs were projected; the resulting waves were recorded and used to train a pattern recognizer. In another, an E. coli bacterial colony was used, with chemical stimuli as input and protein measurements as output. Both experiments showed that reservoir computing can combine computational power with unexpected hardware materials.