Adversarial machine learning

Is a: Technology

Technology attributes

Related Industries: Machine learning, Artificial neural network, Artificial Intelligence (AI)

Other attributes

Child Industry: Clustering

Wikidata ID: Q20312394

Neural networks execute tasks such as clustering, classification, association, and prediction. An artificial neural network is a computational model developed through iterative exposure to large sets of training data, which adjusts the statistical weights and biases of the model.

Adversarial training entails intentionally incorporating statistical noise into the training data, initially with the intent to deceive the model, in order to identify vulnerabilities and ways to improve the model's robustness and resilience. In the context of machine learning, robustness refers to reliable operation of a system across a range of conditions (including attacks), and resilience refers to adaptable operation and recovery from disruptions (including attacks).
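As a rough illustration of adversarial training, the sketch below fits a small logistic-regression classifier with NumPy and, at every step, also trains on perturbed copies of the inputs. It rests on several assumptions that are not from the article: the "noise" is a gradient-sign perturbation of size `epsilon` rather than random noise, the data are two synthetic Gaussian clusters, and the function names (`fgsm`, `train`, `accuracy`) are hypothetical. Accuracy is measured on the training data itself, so the numbers only indicate whether the model tolerates the perturbation it was trained against.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(w, b, X, y, epsilon):
    """Shift each row of X by epsilon along the sign of the loss gradient w.r.t. the input."""
    p = sigmoid(X @ w + b)
    # For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    grad_X = (p - y)[:, None] * w[None, :]
    return X + epsilon * np.sign(grad_X)

def train(X, y, epsilon=0.0, lr=0.1, epochs=200):
    """Gradient-descent logistic regression; if epsilon > 0, each step also uses perturbed copies."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        if epsilon > 0:
            # Perturb the inputs against the current model and train on clean + perturbed data.
            X_batch = np.vstack([X, fgsm(w, b, X, y, epsilon)])
            y_batch = np.concatenate([y, y])
        else:
            X_batch, y_batch = X, y
        err = sigmoid(X_batch @ w + b) - y_batch
        w -= lr * (X_batch.T @ err) / len(y_batch)
        b -= lr * err.mean()
    return w, b

def accuracy(w, b, X, y):
    """Fraction of rows whose predicted class matches the label."""
    return float(np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1)))

# Synthetic two-class data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(200, 2)),
               rng.normal(2.0, 0.5, size=(200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b = train(X, y, epsilon=0.5)
clean_acc = accuracy(w, b, X, y)
robust_acc = accuracy(w, b, fgsm(w, b, X, y, epsilon=0.5), y)
```

Comparing `clean_acc` with `robust_acc` gives a simple robustness check for this toy model. Calling `train(X, y)` with the default `epsilon=0.0` yields a baseline trained without perturbed inputs.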

For developers and maintainers of machine learning models, the ultimate goal of incorporating adversarial methods is to train a model to accommodate and process inputs which may be malicious or otherwise differ from a narrow set of expected inputs. For malicious actors, the goal is to identify a vulnerability in the system which allows them to destroy, invalidate, or subvert a machine learning model.

Taxonomy of attacks, defenses, and consequences

In October 2019, the National Institute of Standards and Technology (NIST) released a draft taxonomy and terminology guide for adversarial machine learning.

Taxonomy of Attacks, Defenses, and Consequences in Adversarial Machine Learning

Adversarial examples

Adversarial examples are intentionally manipulated data that are fed into a neural network with the intent of deceiving it. An adversarial example is generated by introducing a small perturbation to a sample of known-good training data, such that the newly generated adversarial example reliably causes undesired behaviors or outputs (e.g., consistently misclassifying images) from a machine learning model.

To simulate real-world malicious behavior against a neural network, adversarial examples often appear indistinguishable from legitimate samples from the training data. Adversarial examples of image or audio data, for example, may look or sound nearly identical to legitimate samples to avoid detection by human observers of the input stream.
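The small-perturbation idea can be sketched on a toy model. The example below assumes a fixed logistic-regression classifier with hypothetical weights `w` and bias `b`, and uses a gradient-sign step (in the style of the fast gradient sign method) as one possible way to choose the perturbation; the article does not prescribe a specific method. Each input feature moves by at most `epsilon`, yet the predicted class flips.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(x @ w + b)

def fgsm_perturb(w, b, x, y, epsilon):
    """Return x shifted by epsilon per feature in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad_x = (p - y) * w
    return x + epsilon * np.sign(grad_x)

# Hypothetical model parameters and a known-good sample of class 1.
w = np.array([1.0, -2.0, 0.5, 1.5])
b = 0.0
x = np.array([0.2, -0.1, 0.1, 0.1])
y = 1.0

x_adv = fgsm_perturb(w, b, x, y, epsilon=0.15)

p_clean = predict_proba(w, b, x)      # above 0.5: classified as class 1
p_adv = predict_proba(w, b, x_adv)    # below 0.5: classified as class 0
max_change = np.abs(x_adv - x).max()  # no feature moves by more than epsilon
```

The bound on `max_change` is what keeps an adversarial example close to the legitimate sample. For image data, `epsilon` would typically be chosen small enough that the change is hard for a human observer to notice.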

An example of adversarial example generation applied to GoogLeNet.

Adversarial examples of image data can also be generated physically, by printing images on paper and then photographing the printout. In addition to these real-world methods, there are open source software tools that can be used to generate adversarial examples.