The present disclosure provides an improved approach to structure learning of neural networks that exploits correlations in the data or problem the networks are intended to address. A greedy approach is described that finds bottlenecks of information gain, proceeding from the bottom convolutional layers all the way to the fully connected layers. Rather than simply making the architecture deeper, additional computation and capacity are added only where they are required.
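By way of illustration only, the greedy search summarized above might be sketched as follows. The sketch assumes each layer can be assigned a scalar score (for example, the accuracy of a classifier trained on that layer's features) and treats a small score increase between consecutive layers as a bottleneck. The names `first_bottleneck`, `greedy_grow`, and `toy_evaluate`, the parameters `step` and `min_gain`, and the choice to add capacity by widening a layer are hypothetical and are not taken from the disclosure.

```python
from typing import Callable, List, Optional


def first_bottleneck(scores: List[float], min_gain: float) -> Optional[int]:
    """Scan bottom-up and return the index of the first layer whose score
    improves on the layer below it by less than min_gain, or None."""
    for i in range(1, len(scores)):
        if scores[i] - scores[i - 1] < min_gain:
            return i
    return None


def greedy_grow(
    widths: List[int],
    evaluate: Callable[[List[int]], List[float]],
    step: int = 16,
    min_gain: float = 0.01,
    max_rounds: int = 10,
) -> List[int]:
    """Greedily widen only the layers found to be bottlenecks.

    `evaluate` is an assumed callback returning one score per layer,
    ordered from the bottom layer to the top layer.
    """
    widths = list(widths)  # do not mutate the caller's list
    for _ in range(max_rounds):
        scores = evaluate(widths)
        idx = first_bottleneck(scores, min_gain)
        if idx is None:
            break  # no bottleneck remains, so stop adding capacity
        widths[idx] += step  # add capacity only where it is needed
    return widths


def toy_evaluate(widths: List[int]) -> List[float]:
    """Toy stand-in for training and scoring: each layer adds a gain
    proportional to its width, capped at a width of 64."""
    scores, total = [], 0.0
    for w in widths:
        total += min(w, 64) / 1000.0
        scores.append(total)
    return scores


if __name__ == "__main__":
    print(greedy_grow([64, 16, 64], toy_evaluate, step=16, min_gain=0.05))
```

In this toy run only the narrow middle layer is widened; the other layers are left unchanged, which reflects the idea of adding capacity selectively rather than deepening the whole architecture. In practice the scoring callback would involve training and evaluating the network, which the sketch does not attempt.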