LoginSign Up
Byzantine fault tolerance

Byzantine fault tolerance

Byzantine fault tolerance is a property of a distributed system such that it can tolerate components of a system failing in arbitrary ways, processing incorrect states, rather than simply stopping or crashing.

Decentralized networks are designed such that multiple separate actors can follow a common set of rules in order to achieve a consensus on a single state for the distributed system. Byzantine fault tolerance is a measure of the network's ability to defend against failures that make it more difficult to achieve a consensus. Some examples of possible Byzantine failures are:

  • Hardware components crashing or breaking
  • Network congestion and disconnection
  • Requests are processed incorrectly
  • Local states (the system states of one of the actors) are corrupted
  • Producing incorrect or inconsistent outputs
  • Malicious attacks from individuals or groups in the network

This system property was named after the Byzantine General's Problem which was outlined in a 1982 research paper from Microsoft. The paper uses a real world analogy in which a group of generals from the Byzantine army have encircled a city and must now agree collectively on whether or not to attack or retreat. If some generals lead their portions of the army to attack while others retreat, the odds of a successful attack diminish.

To complicate matters, the generals are physically separated and must send their attack/retreat votes via messengers to each of the other generals, individually. This could fail if the messengers can't deliver the votes or choose to forge false votes maliciously (i.e. telling some generals one vote and other generals the opposite vote), resulting in some of the generals believing that their is a majority decision for the entire army to attack while others believe that the decision is to retreat.

In computer terms, the generals are processors while the undelivered or forged votes are possible failures such as faulty processors or faulty system components. A couple of prominent solutions to the Byzantine General's Problem are:

  1. Using public-key cryptography to achieve unforgeable digital signatures (i.e. unforgeable messages).
  2. Broadcasting all messages to all of the participants in the network simultaneously, which is known as atomic broadcasting.

Decentralized systems that need to be highly Byzantine fault tolerant can implement both of these solutions together, along with other possible solutions as necessary.

Byzantine fault tolerance is an important property for blockchains, as bad actors have economic incentives to participate in blockchain networks and cause faults maliciously for their own financial gain. Bitcoin uses a Proof-of-Work (PoW) consensus system as a solution in which their is an economic disincentive to attempt to attack Bitcoin. Bitcoin's solution for Byzantine fault tolerance also includes both public-key cryptography and atomic broadcasting.

Timeline

People

Name
Role
Related Golden topics

Further reading

Title
Author
Link
Type

Practical Byzantine Fault Tolerance

Miguel Castro and Barbara Liskov

Academic paper

Understanding Blockchain Fundamentals, Part 1: Byzantine Fault Tolerance

Georgios Konstantopoulos

Web

Documentaries, videos and podcasts

Title
Date
Link

Consensus Algorithms, Blockchain Technology and Bitcoin UCL - by Andreas M. Antonopoulos

January 31, 2016

L6: Byzantine Fault Tolerance

October 21, 2016

The Byzantine Generals Problem - An Intro To Blockchain

August 8th, 2016

Companies

Company
CEO
Location
Products/Services

References