Decentralized networks are designed such that multiple separate actors can follow a common set of rules in order to achieve a consensus on a single state for the distributed system. Byzantine fault tolerance is a measure of the network's ability to defend against failures that make it more difficult to achieve a consensus. Some examples of possible Byzantine failures are:
- Hardware components crashing or breaking
- Network congestion and disconnection
- Requests are processed incorrectly
- Local states (the system states of one of the actors) are corrupted
- Producing incorrect or inconsistent outputs
- Malicious attacks from individuals or groups in the network
This system property was named after the Byzantine General's Problem which was outlined in a 1982 research paper from Microsoft. The paper uses a real world analogy in which a group of generals from the Byzantine army have encircled a city and must now agree collectively on whether or not to attack or retreat. If some generals lead their portions of the army to attack while others retreat, the odds of a successful attack diminish.
To complicate matters, the generals are physically separated and must send their attack/retreat votes via messengers to each of the other generals, individually. This could fail if the messengers can't deliver the votes or choose to forge false votes maliciously (i.e. telling some generals one vote and other generals the opposite vote), resulting in some of the generals believing that their is a majority decision for the entire army to attack while others believe that the decision is to retreat.
In computer terms, the generals are processors while the undelivered or forged votes are possible failures such as faulty processors or faulty system components. A couple of prominent solutions to the Byzantine General's Problem are:
- Using public-key cryptography to achieve unforgeable digital signatures (i.e. unforgeable messages).
- Broadcasting all messages to all of the participants in the network simultaneously, which is known as atomic broadcasting.
Decentralized systems that need to be highly Byzantine fault tolerant can implement both of these solutions together, along with other possible solutions as necessary.
Byzantine fault tolerance is an important property for blockchains, as bad actors have economic incentives to participate in blockchain networks and cause faults maliciously for their own financial gain. Bitcoin uses a Proof-of-Work (PoW) consensus system as a solution in which their is an economic disincentive to attempt to attack Bitcoin. Bitcoin's solution for Byzantine fault tolerance also includes both public-key cryptography and atomic broadcasting.
Consensus Algorithms, Blockchain Technology and Bitcoin UCL - by Andreas M. Antonopoulos
January 31, 2016
L6: Byzantine Fault Tolerance
October 21, 2016
Practical Byzantine Fault Tolerance
Miguel Castro and Barbara Liskov
Re: Bitcoin P2P e-cash paper
The Byzantine Generals Problem - An Intro To Blockchain
August 8th, 2016