Balancing the observed signals used to train network intrusion detection models allows computing resources to be allocated more accurately to defend the network from malicious parties. The models are trained against historic data and against live data defined within a rolling window to detect user-defined features in the data. Automated attacks ensure that various kinds of attacks are always present in the rolling training window. The set of models is continually trained to determine which model to place into production, to alert analysts of intrusions, and/or to automatically deploy countermeasures. The models are continually updated as the features are redefined and as the data in the rolling window changes, and the content of the rolling window is balanced to provide sufficient data of each observed type by which to train the models. To balance the dataset, low-population signals are overlaid onto high-population signals so that their relative numbers even out.
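
The balancing step can be illustrated with a minimal sketch. The abstract does not specify how signals are represented or how the overlay is performed, so everything here is an assumption for illustration: signals are modeled as dictionaries of feature values, `overlay` copies a high-population signal as background and lets the fields of a low-population signal take precedence, and `balance_by_overlay` repeats this until the low-population class reaches the size of the high-population class. The function names, the field names, and the random pairing are all hypothetical.

```python
import random
from typing import Dict, List

# Assumed representation: one signal is a mapping of feature name to value.
Signal = Dict[str, object]


def overlay(low: Signal, high: Signal) -> Signal:
    """Synthesize one signal: the high-population signal supplies the
    background fields, and the low-population signal's fields override them."""
    merged = dict(high)
    merged.update(low)
    return merged


def balance_by_overlay(
    low_pop: List[Signal], high_pop: List[Signal], seed: int = 0
) -> List[Signal]:
    """Grow the low-population class until it is as large as the
    high-population class by overlaying randomly paired signals."""
    if not low_pop or not high_pop:
        return list(low_pop)  # nothing to overlay
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    balanced = list(low_pop)   # keep every originally observed signal
    while len(balanced) < len(high_pop):
        balanced.append(overlay(rng.choice(low_pop), rng.choice(high_pop)))
    return balanced


if __name__ == "__main__":
    # Hypothetical rolling-window contents: many benign signals, few attacks.
    benign = [
        {"src": f"10.0.0.{i}", "bytes": 100 + i, "label": "benign"}
        for i in range(10)
    ]
    attacks = [
        {"payload": "sqli", "label": "attack"},
        {"payload": "scan", "label": "attack"},
    ]
    balanced_attacks = balance_by_overlay(attacks, benign)
    print(len(balanced_attacks), len(benign))
```

Under these assumptions, each synthesized signal keeps the attack's label and payload while inheriting realistic background fields from a benign signal, so the low-population class gains varied training examples rather than exact duplicates.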