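A highway network is a neural network architecture designed to ease gradient-based training of very deep networks. It is closely related to residual networks (ResNets): both rely on shortcut (skip) connections that pass a layer's input directly to its output, and a ResNet block can be viewed as a highway layer whose gates are fixed open rather than learned. Highway networks were proposed before ResNets, with gating inspired by long short-term memory (LSTM) recurrent networks.
A highway layer augments the shortcut with learned gates that determine, for each unit, how much of the output comes from the skip connection and how much from the nonlinear transformation. This makes it practical to train networks with many more layers than plain feedforward networks of the same design.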
Its gating mechanisms regulate the flow of information across layers. Unlike a traditional neural layer, which always transforms its input, a highway layer can copy its input, transform it, or blend the two. A feedforward neural network layer is defined as,
y = H(x, W_H)
while a highway network is defined as,
y = H(x, W_H) · T(x, W_T) + x · C(x, W_C).
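T is the transform gate and C is the carry gate; they express how much of the output is produced by transforming the input and by carrying it through unchanged, respectively. The products are elementwise, so the input x, the output y, the transformed input and both gates must all have the same dimensionality.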
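The definition above can be sketched in a few lines of Python. This is a minimal illustration and not a reference implementation. It assumes the common simplification C = 1 - T, a tanh nonlinearity for H, and a sigmoid for T. The helper names (sigmoid, affine, highway_layer) and the bias parameters b_H and b_T are hypothetical and introduced only for this example.

```python
import math


def sigmoid(z):
    """Logistic function, used here for the transform gate."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(x, W, b):
    """Return W x + b, where W is a list of rows and x, b are lists."""
    return [sum(w * v for w, v in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def highway_layer(x, W_H, b_H, W_T, b_T):
    """Compute y = H(x) * T(x) + x * (1 - T(x)), elementwise.

    Assumptions for this sketch: C = 1 - T, H uses tanh, T uses a
    sigmoid, and the output has the same size as the input.
    """
    h = [math.tanh(v) for v in affine(x, W_H, b_H)]  # candidate transform H
    t = [sigmoid(v) for v in affine(x, W_T, b_T)]    # transform gate T in (0, 1)
    return [h_i * t_i + x_i * (1.0 - t_i)
            for h_i, t_i, x_i in zip(h, t, x)]


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    zeros = [[0.0] * 3 for _ in range(3)]
    # A strongly negative gate bias closes the transform gate,
    # so the layer passes its input through almost unchanged.
    print(highway_layer(x, identity, [0.0] * 3, zeros, [-30.0] * 3))
```

With a strongly negative gate bias the layer behaves like a skip connection, and with a strongly positive one it behaves like an ordinary nonlinear layer. Starting the gate bias at a negative value, so that layers initially favor carrying their input, is a commonly described initialization choice for this architecture.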
Highway networks have been used as components of systems for text sequence labeling and speech recognition.