Byzantine Fault Tolerance algorithm
July 10, 2025
⏳ 12 min read
One of the most prominent blockchain consensus algorithms is Byzantine Fault Tolerance algorithm.
Now this concept has been democratized, with many popular implementations like CometBFT or HotStuff out there. All BFT implementations are based on Lamport’s paper The Byzantine Generals Problem.
WTH is Byzantine Generals Problem
This situation can be expressed abstractly in terms of a group of generals of the Byzantine army camped with their troops around an enemy city. Communicating only by messenger, the generals must agree upon a common battle plan. However, one or more of them may be traitors who will try to confuse the others. The problem is to find an algorithm to ensure that the loyal generals will reach agreement.
Specifically, these are the conditions:
- A: All loyal generals decide upon the same plan of action.
- B: A small number of traitors cannot cause the loyal generals to adopt a bad plan.
Now, let be the information from th general.
Each general uses some method to combine .
Then, obviously, condition A can be achieved if all generals use the same method (or a function) to combine , because it will always give the same result, giving them the same plan of action.
To fulfill condition B, the first intuitive method is to use the majority vote. Let every be either ‘attack’ or ‘retreat’, and we just count the number of each option from each general. In this case, a small number of traitors may be able to twist the result from attack to retreat or retreat to attack, but both decisions will kind of make sense (if it is at 56% or something like that). The way is communicated is by sending a messenger to every other general.
But this way, traitorous generals may send different values to different generals. For example, 2nd general may send attack to 1st, and retreat to 3rd general. This makes condition A fail, where all loyal generals decide upon the same plan of action, because they need the same set of in order to decide the same plan of action.
So if we want condition A to be true, we need two other conditions:
-
Every loyal general must obtain the same . This is because a general cannot trust the value of from ith general due to the presence of a traitorous general who may send out different information to different generals.
Not only that, we must not allow a loyal general to use info from a traitorous general for condition B to be fulfilled. So that leads to the next condition:
-
If the ith general is loyal, then the value that he sends must be used by every loyal general as the value of .
Now, we can rephrase condition 1 as condition 1’: for every i, any two loyal generals use the same value of .
Here’s one simple example where condition 1 and condition 2 are fulfilled:
, and traitorous generals are 2nd and 5th generals. Magically, the traitorous generals send a different message to each other only, fulfilling condition 1 and condition 2.
Conditions 1’ and 2 relate to the value sent by ith general, so let’s restrict the problem to how a single general sends his information to others (this is the most crucial point to understand, as we will only focus on a single general at a time when working with the algorithm):
Byzantine Generals Problem. A commanding general must send an order to his n - 1 lieutenant generals such that
- IC1. All loyal lieutenants obey the same order.
- IC2. If the commanding general is loyal, then every loyal lieutenant obeys the order he sends.
Note that there’s no restriction to who becomes a commanding general. Any one general at a time can be a commanding general.
It can’t work with three generals
It’s known that there is no solution that fulfills IC1 and IC2 for three generals with at least one traitor. Think about these two different cases below:
In both cases, the 1st general is the commander. In the first case, the 2nd general is a traitor. In the second case, the first general is. The 3rd general, being a loyal one, has no legitimate way to figure out which message is real, because there are only two generals sending him two different messages.
We thus come up with this theorem (we will skip the proof simply because I don’t want to study more about the proof):
No solution with fewer than generals can cope with traitors.
Oral message solution
Before going into the solution, we define three assumptions about the messages among the generals.
- A1. Every message that is sent is delivered correctly
- A2. The receiver of a message knows who sent it
- A3. The absence of a message can be detected
We need A1 and A2 so that a traitor can’t tamper with the communication between two other generals. A3 is so that a traitor won’t be able to prevent a decision making process by not sending a message, which will have to be just replaced by a default value such as retreat.
Now we define the ‘Oral Message algorithms’ where , where a commanding general sends an oral message to other generals.
In this algorithm, we have a function where when a majority of values equal . If does not exist, we fall back to the value .
Now let’s look at the trivial case of , where (zero traitors):
- The commander sends to every other general.
- Every other general uses the value from the commander, or uses the value if no value received.
What about ?
- The commander sends his value to every other general (remember, this could be fake or real info - we don’t have a reason to trust the commander).
- Every other general receives value from the commander, or chooses if no value received. Every ith general acts as the commanding general in to send to other generals (n - 2 because we exclude ith general and the commanding general at ).
- For each and , ith general receives the value from from jth general, which is . ith general uses . If no value received, default to .
For example, if :
We start from , which is the top level. Because we’re at the top level, only one commander initiates the message, which is the 1st general in the illustration above. He sends out to everyone.
Then, at , remember everyone sends out their value to everyone else. This is illustrated by green and blue arrows. We only care about the value that the 3rd general receives at , so we draw them in blue. The 2nd general is a traitor, so he sends a forged value .
The 3rd general in total receives 2 attack orders and 1 retreat order. , so the 3rd general is able to get he correct value.
But does it also work when the commanding general is a traitor given the same and ? Let’s take another example:
The 1st commander is a traitor. He sends out respectively to the 2nd, 3rd and 4th general at . These values can be either attack or retreat, but it doesn’t really matter.
At , each loyal general broadcasts the value that he received. We find that all loyal generals receive the same value no matter what. So they can all use the same , which is good.
We can go on and on and try a bigger , but basically the whole concept is built around recursion. invokes as many as times, till we get to . So will be called times.
How could we differentiate the same value (attack or retreat) received from different generals? We simply prefix the number assigned to each general to .
For example, when general sends a message, he adds his own number to the back of it.
So if general 1 sends value , it becomes:
[ATTACK, 1]
If general 3 passes it on, it becomes:
[ATTACK, 1, 3]
If general 2 passes it,
[ATTACK, 1, 3, 2]
Something like that.
Signed message solution
On top of A1, A2 and A3, we add A4:
- a. A loyal general’s signature cannot be forged, and any alteration of the contents of his signed messages can be detected.
- b. Anyone can verify the authenticity of a general’s signature.
And now we’ll see that a solution using signed messages is much more useful. The algorithm is denoted by , where is the number of maximum traitors we have in a problem.
The only requirement for this algorithm is that it has a function called that will be applied to the set of final orders that each general will have collected. The function satisfies two properties:
-
If the set consists of the single element , then .
For example, .
-
, where is the empty set.
The requirement is not exhaustive; these are the two essential properties that function needs to have. Othere than these, any deterministic rule can be applied to the set of orders that a general receives, to decide on one final order. Lamport doesn’t specify a specific choice function with more properties, leaving a room for more specific algorithms such as PBFT, HotStuff, and CometBFT to be developed, each with their own unique rule.
Order vs message
And just to make it clear before talking about the algorithm, an order and a message are different:
-
order: just a value or command that’s being passed on. In our example, it’s been attack or retreat.
-
message: a message is an order, followed by a list of signatures from the generals that have forwarded the order so far.
For example, would mean that general 0 issued an attack order, and forwarded that to general 1.
Algorithm
Initially, each general has an empty set of orders received.
The commanding general signs and sends his order to every other general.
For each ,
-
if a general receives a message of the form (commander is the 0th general) and he has an empty set of orders, then
-
if general receives a message of the form and v is not in the set ,
-
general adds to . For example, maybe , but , because someone was a traitor. In this case, the commanding general must be a traitor, or another traitor forged/inserted a message pretending to have come from a traitor commanding general, which is also only possible if commanding general’s a traitor. If the commander is loyal, no one can forge his sig according to A4, so no new should appear.
-
if , then he sends the message to every general other than . This just means sending a message to every general that hasn’t received this message yet.
For example, the commanding general at 0 sends to general 1, but sends to others.
Then, each general sends the message to every general other than . Note that every message containing an order already in the set is ignored.
Upon receiving the message, each general would know that the general 0 is a traitor, because each would have , but the message originated from general 0.
For example, upon receiving , general 1 would have , and after receiving or , he would have . But since he knows that the message originated from general 0 but it turns out to be a different order, he can take steps to deduce that general 0 is a traitor. But the specifics of how this works is out of the scope of the Byzantine Generals Problem. The BGP only focuses on two properties:
- IC1. All loyal lieutenants obey the same order.
- IC2. If the commanding general is loyal, then every loyal general obeys the order he sends.
And all generals that receive the order from general 0 will end up with , which will give a deterministic result.
-
For each i: when general i will receive no more messages, he obeys the order .
-