Opportunities to simplify TLS over peer-to-peer connections
I wanted to share some progress on fd_tls and kick off a general discussion about the use of TLS in the Solana protocol.
Disclaimer: fd_tls is not an officially supported component of Firedancer.
Since the adoption of the QUIC protocol, Solana’s peer-to-peer layer depends on the TLS protocol for securing connections. Currently, the Solana Labs client uses the rustls library, and Firedancer uses quictls, a fork of OpenSSL.
I started fd_tls as an experiment to replace third-party network dependencies in Firedancer, with the intention of making fd_quic entirely self-hosted. It aims to implement the minimum number of components required to secure peer-to-peer connectivity, while staying compliant with TLS 1.3 (RFC 8446) and QUIC-TLS (RFC 9001).
TLS is commonly seen as a complex standard due to its lengthy history of bugs and changes, all while maintaining backwards compatibility. Since the deployment of TLS in Solana has no such backwards-compatibility requirements, there is an opportunity to shed some complexity and make the handshake logic of the QUIC protocol more robust against various types of attacks.
The development philosophy for Firedancer thus far has been to own the entire Solana validator stack from OSI Layer 2 upwards. This is a lot of work, but has the advantage of reducing the number of unknowns (such as: “How would our QUIC library behave in a specific edge case?”). It also reveals opportunities for deep optimization. However, all of this new networking code presents additional attack surface and will have to be audited.
Considering the above, we strongly suggest minimizing code complexity and the number of cryptographic algorithms in the Solana validator network.
https://quic.xargs.org/ is a great resource explaining every step of the QUIC-TLS handshake. I will try to summarize it in my own words.
QUIC-TLS in Solana is a combination of three separate protocols:
- The TLS handshake layer (as the name implies, only active during the handshake)
- X.509 certificates (mostly unused)
- The QUIC record layer, which specifies how QUIC packets get encrypted (comparable to the TLS or DTLS record layers for TLS connections over TCP or UDP)
In TLS version 1.3, the latest version at the time of writing, creating a connection involves the following high-level steps:
- Negotiate a suite of cryptographic algorithms
- Establish a “handshake-level” symmetric encryption key using X25519, an Elliptic Curve Diffie-Hellman key exchange algorithm
- Exchange and verify X.509 peer certificates containing Ed25519 signatures
- Establish an “application-level” symmetric encryption key
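To make the key-establishment steps a bit more concrete, here is a minimal Python sketch of the part of the TLS 1.3 key schedule (RFC 8446, section 7.1) that leads to the handshake-level secrets, plus the QUIC packet key expansion from RFC 9001. It assumes a SHA-256 based cipher suite with AES-128-GCM key sizes. The function names are my own and are not taken from fd_tls; the X25519 shared secret and the transcript hash are passed in rather than computed.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = 32  # output size of SHA-256 in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): chain HMAC blocks until `length` bytes exist."""
    out = b""
    block = b""
    counter = 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, context: bytes, length: int) -> bytes:
    """HKDF-Expand-Label (RFC 8446, section 7.1) with the "tls13 " label prefix."""
    full_label = b"tls13 " + label
    info = (
        length.to_bytes(2, "big")
        + bytes([len(full_label)]) + full_label
        + bytes([len(context)]) + context
    )
    return hkdf_expand(secret, info, length)


def handshake_traffic_secrets(shared_secret: bytes, hello_transcript_hash: bytes):
    """Derive (client, server) handshake traffic secrets without a PSK.

    `shared_secret` is the X25519 output; `hello_transcript_hash` is the hash
    over ClientHello..ServerHello. Both are inputs to this sketch.
    """
    zeros = bytes(HASH_LEN)
    empty_hash = HASH(b"").digest()
    early_secret = hkdf_extract(zeros, zeros)
    derived = hkdf_expand_label(early_secret, b"derived", empty_hash, HASH_LEN)
    handshake_secret = hkdf_extract(derived, shared_secret)
    client = hkdf_expand_label(handshake_secret, b"c hs traffic", hello_transcript_hash, HASH_LEN)
    server = hkdf_expand_label(handshake_secret, b"s hs traffic", hello_transcript_hash, HASH_LEN)
    return client, server


def quic_packet_keys(traffic_secret: bytes):
    """Expand a traffic secret into QUIC (key, iv, hp) per RFC 9001, AES-128-GCM sizes."""
    key = hkdf_expand_label(traffic_secret, b"quic key", b"", 16)
    iv = hkdf_expand_label(traffic_secret, b"quic iv", b"", 12)
    hp = hkdf_expand_label(traffic_secret, b"quic hp", b"", 16)
    return key, iv, hp
```

The application-level secrets follow the same pattern one stage later in the schedule, with different labels and a transcript hash that covers the full handshake.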
An obvious first step is to drop support for legacy TLS versions. TLS 1.3 is more secure and much simpler than older TLS versions. It is almost universally supported and is currently the default in Solana peer-to-peer connections.
TLS 1.3 incorporates a flexible mechanism for negotiating cryptographic algorithms. In early steps of the handshake, the client advertises a list of algorithms it supports. The server then picks a combination of them.
The main types of algorithms being negotiated are as follows:
- Key Exchange cryptography. Solana Labs validators support X25519, secp256r1, and secp384r1.
- Cipher Suites: Solana Labs validators support the TLS 1.3 recommended Authenticated Encryption suites: AES-128-GCM-SHA256, AES-256-GCM-SHA384, and ChaCha20-Poly1305-SHA256. (Note: The hash in each suite name implies the corresponding HMAC, e.g. HMAC-SHA256, not just “pure” SHA)
- Signature Algorithms: Solana Labs validators support 9 signature algorithms, including EdDSA (Ed25519), 2 ECDSA-based schemes, and 6 RSA-based schemes.
Some of the above cryptography is already in use in the Solana protocol.
- SHA-256 (almost everywhere in the Solana protocol)
- Ed25519 (transaction signatures)
- by extension, Curve25519 used in X25519
- The ChaCha20 block function (on-chain randomness)
Other algorithms were newly introduced by adopting QUIC. Notably, RSA-based signature schemes are considerably slower than the elliptic curve alternatives.
Luckily, to establish a TLS connection, only one cryptographic algorithm of each type is required. Therefore, the first version of fd_tls will only support X25519 KEX, AES-128-GCM-SHA256 AEAD, and Ed25519 signatures (and potentially also ChaCha20-Poly1305-SHA256).
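With only one algorithm per type, negotiation on the server side reduces to a membership check against the client's offer. The sketch below illustrates that idea in Python. The numeric codepoints are the ones registered for TLS 1.3 (RFC 8446); the dictionary layout, the key names, and the `negotiate` function are hypothetical and only serve the illustration.

```python
# TLS 1.3 codepoints (RFC 8446 / IANA TLS registries).
GROUP_X25519 = 0x001D
GROUP_SECP256R1 = 0x0017
GROUP_SECP384R1 = 0x0018
SUITE_AES_128_GCM_SHA256 = 0x1301
SUITE_AES_256_GCM_SHA384 = 0x1302
SUITE_CHACHA20_POLY1305_SHA256 = 0x1303
SIG_ED25519 = 0x0807
SIG_ECDSA_SECP256R1_SHA256 = 0x0403
SIG_RSA_PSS_RSAE_SHA256 = 0x0804

# Hypothetical minimal profile: exactly one algorithm of each type.
MINIMAL_PROFILE = {
    "group": GROUP_X25519,
    "cipher_suite": SUITE_AES_128_GCM_SHA256,
    "signature": SIG_ED25519,
}


def negotiate(client_offer: dict) -> dict:
    """Return the chosen algorithms, or raise if the client lacks one of them.

    `client_offer` maps each algorithm type to the list of codepoints the
    client advertised in its ClientHello.
    """
    chosen = {}
    for kind, required in MINIMAL_PROFILE.items():
        if required not in client_offer.get(kind, ()):
            raise ValueError(f"client does not offer required {kind} 0x{required:04x}")
        chosen[kind] = required
    return chosen


# Example: a client that advertises a broader set still lands on the minimal profile.
broad_client = {
    "group": [GROUP_X25519, GROUP_SECP256R1, GROUP_SECP384R1],
    "cipher_suite": [
        SUITE_AES_256_GCM_SHA384,
        SUITE_AES_128_GCM_SHA256,
        SUITE_CHACHA20_POLY1305_SHA256,
    ],
    "signature": [SIG_ECDSA_SECP256R1_SHA256, SIG_RSA_PSS_RSAE_SHA256, SIG_ED25519],
}
selection = negotiate(broad_client)
```

A peer that offers none of the required algorithms in some category cannot complete the handshake, which is the trade-off of shrinking the algorithm set.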
Another obvious opportunity for reducing complexity is eliminating the use of X.509 certificates. X.509 secures peer identity through a chain of trust, anchored in a set of root CAs. This model is a poor fit for permissionless networks, in which peers are inherently identified by their public keys rather than by a domain name (like that of the forum.solana.com site you are currently reading).
Consequently, the use of X.509 certificates in Solana is awkward: Nodes serve auto-generated certificates that are signed by themselves, and their peers verify this useless signature.
For each connection the validator makes, it then generates an additional “CertificateVerify” proof. It involves using the certificate’s key to sign a hash that is tied to the current connection. This proves that it is in possession of the key advertised by the certificate.
From the perspective of the verifier, this means the following steps are involved when accepting a new QUIC connection:
- Parse the X.509 certificate (DER serialization over various complex ASN.1 data structures)
- Verify the certificate chain (signature verification)
- Extract the Ed25519 public key of the peer
- Verify the “CertificateVerify” proof
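For reference, the message covered by the “CertificateVerify” signature has a fixed layout defined in RFC 8446, section 4.4.3: 64 space bytes, a context string, a zero byte, and the transcript hash of the handshake so far. A small Python sketch of building that input follows. The function name is my own, and the actual Ed25519 verification over the returned bytes is left out because it needs a library outside the standard library.

```python
import hashlib

SERVER_CONTEXT = b"TLS 1.3, server CertificateVerify"
CLIENT_CONTEXT = b"TLS 1.3, client CertificateVerify"


def certificate_verify_input(transcript_hash: bytes, is_server: bool) -> bytes:
    """Build the byte string that the CertificateVerify signature covers."""
    context = SERVER_CONTEXT if is_server else CLIENT_CONTEXT
    return b"\x20" * 64 + context + b"\x00" + transcript_hash


# Example with a placeholder transcript; a real one hashes all handshake
# messages up to and including the Certificate message.
example_hash = hashlib.sha256(b"placeholder handshake transcript").digest()
signed_content = certificate_verify_input(example_hash, is_server=True)
```

Because the transcript hash differs for every connection, a signature over this content cannot be replayed on another connection.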
Raw Public Keys
RFC 7250 introduces a second certificate type: Raw Public Keys (RPKs).
RPKs consist of a minimal ASN.1/DER prefix followed by a copy of the serialized public key.
The new verifier steps then become:
- Negotiate RPK support via the certificate type extensions (client_certificate_type and server_certificate_type)
- Parse the RPK ASN.1 prefix
- Verify the “CertificateVerify” proof
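To give an idea of how small the RPK parsing step is: for Ed25519, the SubjectPublicKeyInfo encoding (RFC 8410) is a constant 12-byte DER prefix followed by the 32-byte public key, so a parser can simply compare the prefix and slice. This is a Python sketch under that assumption; `parse_ed25519_rpk` is a hypothetical helper and not part of fd_tls.

```python
# DER prefix of an Ed25519 SubjectPublicKeyInfo:
#   30 2a          SEQUENCE, 42 bytes
#   30 05          SEQUENCE, 5 bytes (AlgorithmIdentifier)
#   06 03 2b 65 70 OID 1.3.101.112 (id-Ed25519)
#   03 21 00       BIT STRING, 33 bytes, 0 unused bits
ED25519_SPKI_PREFIX = bytes.fromhex("302a300506032b6570032100")
ED25519_KEY_LEN = 32


def parse_ed25519_rpk(spki: bytes) -> bytes:
    """Return the 32-byte Ed25519 public key from a raw public key, or raise."""
    if len(spki) != len(ED25519_SPKI_PREFIX) + ED25519_KEY_LEN:
        raise ValueError("unexpected raw public key length")
    if not spki.startswith(ED25519_SPKI_PREFIX):
        raise ValueError("not an Ed25519 SubjectPublicKeyInfo")
    return spki[len(ED25519_SPKI_PREFIX):]


# Example with a dummy key (32 bytes counting up from zero).
dummy_key = bytes(range(32))
extracted = parse_ed25519_rpk(ED25519_SPKI_PREFIX + dummy_key)
```

Compared to a general X.509 parser, nothing here depends on attacker-controlled lengths or nested structures.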
Not only is this mechanism much simpler; it also decreases the maximum byte count of a TLS handshake.
Unfortunately, support for RPKs is sparse. Stable releases of OpenSSL, GnuTLS, quictls, rustls, and the Go standard library currently do not support them. OpenSSL and GnuTLS both provide experimental support. My attempt to use an Ed25519 RPK with GnuTLS failed for unknown reasons.
If time permits, I would like to contribute RFC 7250 support to the Go standard library and rustls. I would greatly appreciate any help with this task.
Finally, an update on fd_tls:
fd_tls is currently able to correctly derive TLS 1.3 decryption keys up to the handshake level when speaking to OpenSSL. So far, my experience with implementing TLS 1.3 has been quite pleasant. There are no obvious blockers to completing self-hosted QUIC-TLS support; it is simply a matter of time. I am currently working on additional TLS extension types and the Certificate/CertificateVerify message types.
Whether we’ll use fd_tls in production is unclear. Certainly, before attempting to do so, fd_tls needs to pass tlsfuzzer torture and various other conformance tests.
I hope this post was informative, and I look forward to continuing the discussion on Solana’s network protocols.