Summary
Resolving bank hash mismatches between different validators and validator releases is an arduous process.
The most widely used approach involves:
- dumping bank hash pre-images to the validator log files (shared with arbitrary other log output)
- using a log parsing tool to extract information
- accumulating the changes every slot, then constructing a diff
Background
- The bank hash commits to the execution inputs and state changes of a slot
- A bank hash mismatch occurs when two Solana runtime implementations output different bank hashes for the same inputs (same state, same slot)
- This implies that these two runtime implementations are incompatible, which is a severe bug that has to be fixed
- The bank hash pre-image refers to the raw inputs fed into the hash function
- This pre-image is highly useful for debugging, as it pinpoints the input that differs
- There is no standard for encoding the pre-image; all solutions so far rely on hacks that are incompatible across different validator code bases
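To illustrate why the pre-image pinpoints the differing input, here is a minimal sketch of a per-slot diff. It assumes each validator's pre-image has already been decoded into a dictionary mapping an account key to its fields; that layout and the name `diff_preimages` are hypothetical and not part of any existing tool.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical decoded layout: account key (hex string) -> pre-image fields.
AccountFields = Dict[str, object]
Mismatch = Tuple[str, str, Optional[object], Optional[object]]


def diff_preimages(
    a: Dict[str, AccountFields], b: Dict[str, AccountFields]
) -> List[Mismatch]:
    """Return (key, field, value_a, value_b) for every input that differs."""
    mismatches: List[Mismatch] = []
    for key in sorted(set(a) | set(b)):
        fields_a = a.get(key)
        fields_b = b.get(key)
        if fields_a is None or fields_b is None:
            # The account was written by only one of the two validators.
            mismatches.append((key, "<account>", fields_a, fields_b))
            continue
        for field in sorted(set(fields_a) | set(fields_b)):
            if fields_a.get(field) != fields_b.get(field):
                mismatches.append(
                    (key, field, fields_a.get(field), fields_b.get(field))
                )
    return mismatches
```

With a shared encoding, a comparison like this could run directly on the files produced by two validators, without scraping logs first.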
Requirements
The following pseudocode declares the hash constructions that make up the bank hash.
account_hash := blake3 {
  le u64   lamports
  le u64   slot
  le u64   rent_epoch
  []u8     data
  u8       executable
  [32]u8   owner
  [32]u8   key
}
accounts_delta_hash := merkle {
  leaf = [32]byte account_hash
  branch = sha256 {
    [1..=16][32]byte node
  }
}
bank_hash := sha256 {
  [32]u8   parent_bank_hash
  [32]u8   accounts_delta_hash
  le u64   signature_count
  [32]u8   last_blockhash
}
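As a rough illustration of the account_hash and accounts_delta_hash constructions above, the sketch below builds the account pre-image bytes and reduces a list of leaves with a 16-ary SHA-256 tree. The function names are hypothetical. The handling of an empty leaf list and of a single leaf is an assumption rather than something the pseudocode specifies, and the final blake3 step is left out because blake3 is not in Python's standard library.

```python
import hashlib
import struct
from typing import List

FANOUT = 16  # maximum number of child nodes per branch, per the pseudocode


def account_preimage(lamports: int, slot: int, rent_epoch: int, data: bytes,
                     executable: bool, owner: bytes, key: bytes) -> bytes:
    """Concatenate the account fields in the order listed for account_hash."""
    if len(owner) != 32 or len(key) != 32:
        raise ValueError("owner and key must be 32 bytes each")
    return (
        struct.pack("<QQQ", lamports, slot, rent_epoch)  # three little-endian u64
        + data                                           # raw account data
        + struct.pack("<B", 1 if executable else 0)      # single flag byte
        + owner
        + key
    )


def merkle_root(leaves: List[bytes]) -> bytes:
    """Reduce 32-byte leaves to one root, hashing up to FANOUT nodes per branch."""
    if not leaves:
        # Assumption: an empty tree hashes the empty string.
        return hashlib.sha256(b"").digest()
    nodes = list(leaves)
    # Assumption: a single leaf is returned unchanged (no extra hashing level).
    while len(nodes) > 1:
        nodes = [
            hashlib.sha256(b"".join(nodes[i:i + FANOUT])).digest()
            for i in range(0, len(nodes), FANOUT)
        ]
    return nodes[0]
```

The bytes returned by `account_preimage` are exactly what a pre-image dump needs to capture for each account; any two implementations that agree on those bytes will agree on the resulting hash.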
The solution must be able to serialize all of the above data in a language-agnostic format. There should be consensus among validator developers, and every team should be willing to implement and work with this format.
The serialized size is estimated to be hundreds of megabytes per slot.
Therefore, the serialization scheme used should also be efficient.
Stretch Goals
Ideally, this file format should support streaming use and compress well.
For example, the serialized blobs (Protobuf messages, under the proposal below) could be wrapped in a binary container format, such as .tar.zst or a custom format.
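One simple way a custom container could support streaming is length-prefixed framing, sketched below. This is an illustration under stated assumptions, not a proposed wire format: the 4-byte little-endian length prefix and the names `write_record` and `read_records` are hypothetical, and compression is omitted.

```python
import io
import struct
from typing import BinaryIO, Iterator


def write_record(out: BinaryIO, payload: bytes) -> None:
    """Append one record: a 4-byte little-endian length, then the payload."""
    out.write(struct.pack("<I", len(payload)))
    out.write(payload)


def read_records(inp: BinaryIO) -> Iterator[bytes]:
    """Yield records one at a time without loading the whole stream.

    Assumes a blocking file-like object whose read(n) returns fewer than
    n bytes only at end of stream.
    """
    while True:
        header = inp.read(4)
        if not header:
            return  # clean end of stream
        if len(header) != 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<I", header)
        payload = inp.read(size)
        if len(payload) != size:
            raise ValueError("truncated record")
        yield payload
```

Because each record carries its own length, a reader can process one account or one slot at a time, and the whole stream can still be piped through a general-purpose compressor.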
Possible Solutions
Designing a data structure representing the above information is trivial.
However, it is not obvious which serialization scheme should be used.
JSON
steviez at Solana Labs has been working on a JSON-based solution.
This format can be easily upgraded, but we’d argue it is a little too free-form and does not offer great performance.
Custom Binary Format
I’ve worked on a custom binary format for maximum performance.
There are a number of obvious shortcomings:
- It is not easily upgradable
- It is more difficult to implement and debug
Protobuf
After meeting with the Firedancer team on this topic, we settled on a middle ground between the two options above. A Protobuf schema can be upgraded just like JSON structures, but it also offers powerful cross-language tooling, a schema language for coordinating these upgrades, and decent performance. Finally, Solana validators already use the Protobuf stack for RPC.
We would like to request comments from client developers, and invite validator developers to collaborate on a solution.