sRFC 00015: Interfaces

toly · May 24, 2023, 2:29pm

Interfaces for programs

Summary

The goal is to define a simple interface specification for programs that avoids creating additional CPIs during execution.

Interfaces must be discoverable from the program ELF.
Discovery shouldn’t require a CPI.
Calling an interface should be as fast as calling a non-interfaced program.
Transactions should be the same size.
There needs to be some way to add interfaces to the current set of programs on Solana.

Implementation:

1 interface defines 1 method. So instead of a “Token” interface, there is a specific interface for the Transfer method. The interface is identified by a u128 GUID, a random number that can be generated by a dev.

Publishing Interfaces
The interface is published in .rs files, as documentation for the GUID constant in mod interfaces.

For example:

    /// interfaces.rs

    /// Transfers tokens from one account to another either directly or via a
    /// delegate.  If this account is associated with the native mint then equal
    /// amounts of SOL and Tokens will be transferred to the destination
    /// account.
    ///
    /// Accounts expected by this instruction:
    ///
    ///   * Single owner/delegate
    ///   0. `[writable]` The source account.
    ///   1. `[writable]` The destination account.
    ///   2. `[signer]` The source account's owner/delegate.
    ///
    ///   * Multisignature owner/delegate
    ///   0. `[writable]` The source account.
    ///   1. `[writable]` The destination account.
    ///   2. `[]` The source account's multisignature owner/delegate.
    ///   3. ..3+M `[signer]` M signer accounts.
    /// Transfer {
    ///    /// The amount of tokens to transfer.
    ///    amount: u64,
    ///},
    const TRANSFER: u128 = 0x2423423fda2344321u128;

Registering Interfaces

///program's instruction.rs
interfaces!([(interfaces::TRANSFER, interface::Instruction::U8(TokenInstruction::Transfer))]);

This macro takes an array of mappings from GUID to instruction id. The instruction type is coerced into the right format by Instruction::U8, so instruction indexes of any type can be handled. The macro generates a segment that is included in the program ELF file and can be easily parsed from the program bytecode.
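A minimal sketch of what such a macro could expand to, assuming a single U8 instruction encoding and a plain static table. The names Entry, INTERFACE_TABLE and the cut-down TokenInstruction enum are hypothetical and not part of the proposal, and the ELF section placement is only noted in a comment.

```rust
/// Hypothetical instruction-index encoding; only `U8` is mentioned in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Instruction {
    U8(u8),
}

/// One row of the table: interface GUID -> instruction index (assumed layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Entry {
    pub guid: u128,
    pub instruction: Instruction,
}

/// Sketch of `interfaces!`: expands to a static table of entries.
/// A real implementation would additionally place this table in a dedicated
/// ELF section so it can be found without executing the program.
macro_rules! interfaces {
    ([$(($guid:expr, $ix:expr)),* $(,)?]) => {
        pub static INTERFACE_TABLE: &[Entry] = &[
            $(Entry { guid: $guid, instruction: $ix }),*
        ];
    };
}

/// GUID taken from the example above.
pub const TRANSFER: u128 = 0x2423423fda2344321u128;

/// Illustrative stand-in for a program's instruction enum.
#[repr(u8)]
pub enum TokenInstruction {
    InitializeMint = 0,
    Transfer = 3,
}

interfaces!([(TRANSFER, Instruction::U8(TokenInstruction::Transfer as u8))]);
```

Keeping the table as plain data means a client can read it straight from the deployed bytes, which is what keeps discovery free of CPIs.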

Discovering Interfaces

///implementation code
fn get_interface(program_account: Account, guid: u128) -> Result<interface::Instruction>

This helper function locates the interface lookup table at a well-defined spot in the program bytecode and returns the instruction index registered for the given GUID.
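A rough sketch of the lookup, assuming the table bytes have already been located in the ELF. The layout used here (a little-endian u32 entry count followed by 17-byte entries of GUID plus a u8 instruction index) is an assumption made for illustration, as is the encode_table helper; the proposal does not fix a format.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Instruction {
    U8(u8),
}

/// Assumed entry size: 16-byte little-endian GUID + 1-byte instruction index.
const ENTRY_LEN: usize = 17;

/// Hypothetical encoder, included only so the decoder below can be exercised.
pub fn encode_table(entries: &[(u128, u8)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + entries.len() * ENTRY_LEN);
    out.extend_from_slice(&(entries.len() as u32).to_le_bytes());
    for (guid, ix) in entries {
        out.extend_from_slice(&guid.to_le_bytes());
        out.push(*ix);
    }
    out
}

/// Scans the table for `guid`; returns `None` if it is absent or the table is truncated.
pub fn get_interface(table: &[u8], guid: u128) -> Option<Instruction> {
    if table.len() < 4 {
        return None;
    }
    let mut count_bytes = [0u8; 4];
    count_bytes.copy_from_slice(&table[..4]);
    let count = u32::from_le_bytes(count_bytes) as usize;

    // `chunks_exact` skips a trailing partial entry; `take` honors the declared count.
    for entry in table[4..].chunks_exact(ENTRY_LEN).take(count) {
        let mut guid_bytes = [0u8; 16];
        guid_bytes.copy_from_slice(&entry[..16]);
        if u128::from_le_bytes(guid_bytes) == guid {
            return Some(Instruction::U8(entry[16]));
        }
    }
    None
}
```

Because the lookup is a pure function over bytes, it can run in a client or an indexer without touching the runtime.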

toly · May 24, 2023, 4:54pm

GitHub issues related to BTF changes that would make interfaces discoverable

github.com/solana-labs/solana

Loader v3 Built-In Program

opened 05:27PM - 24 Jan 23 UTC

Lichtso

runtime

- New loader built-in program
- Pin account allocations host ptrs by reserving (without allocating) host address space for account resizing
- Support multiple entrypoints (generalized methods instead of one `main`) per executable in RBPF
- [Optional] Use page table `dirty` bit to track which parts of accounts were actually modified and report that back to the accounts DB to allow for a partial write back to disk. This could be done either using [`/proc/PID/pagemap`](https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/mm/soft-dirty.rst) or using CPU virtualization.
- Actual ABI changes:
- Rework executable / program account ownership chain and `is_executable` flag (workaround for avoiding finalization)
- [Not MVP] Dynamic ABI memory layout using BTF relocations
- Unify the currently separate virtual address spaces and remove address translation at runtime
- Share the host address space for all programs in a transaction (similar to [Native Client](https://developer.chrome.com/docs/native-client/))
- Dynamic function calls replacing CPI and Syscalls
- Replace VM nesting by dynamic linking using two levels of indirection
- [Not MVP] Replace syscalls by CPI to built-in programs
- Allocation and lifetime tracking
- Stack allocation for internal types
- Normal pointers with memory layout information: Load always possible, store only if pointer is mutable
- Heap allocation (transaction global) for external types and persistent structures (accounts)
- Opaque pointers: No load or store possible
- Runtime provides table of these opaque pointers for programs to lookup their members
- [Not MVP] How to inline dynamic arrays / vectors into structs (especially accounts)?

github.com/solana-labs/solana

Generate and Verify BTF

opened 05:22PM - 24 Jan 23 UTC

Lichtso

runtime

- BTF type info in toolchain
- [Not MVP] ELF dynamic loader instructions for defining on-chain addresses of dependencies
- [Optional] cargo (dependencies)
- rustc (attribute)
- LLVM: Lift C type restriction, inject runtime code for lifetime tracking
- New stricter verifier in RBPF
- [Not MVP] Reject cyclic dependencies when deploying a program
- [Not MVP] Enforce redeployment maintains existing interfaces (function signatures and types), optionally support type migration
- Type inference
- Emit type line info for transmutation / reinterpretation
- Reject ptr transmutation / reinterpretation
- Forbid ptr introspection or order based comparison, only allow equality test (check for aliasing)
- Track types of stack slots
- Canonicalize bounds checks of sub-slices
- Rustc needs to emit the canonical conditional branch
- Verifier checks that all sub-slicing has such a runtime bound check
- Restrict control flow
- Forbid jumps outside of the current function
- Enforce that functions end with an `exit` instruction
- Constrain `call` and `callx` to functions with the same signature

toly · May 24, 2023, 4:55pm

GitHub issues related to BTF changes that would make interfaces discoverable

github.com/solana-labs/solana

Program-Runtime v2 - Road Map

opened 05:32PM - 06 Nov 22 UTC

Lichtso

enhancement work in progress runtime

The performance optimizations we have been working on inside the runtime under the ABIv2 project the past year will be deployed separately as ["account data direct mapping"](https://github.com/solana-labs/solana/pull/28053). The redesign of the ABI between runtime and on-chain program is pushed out further as we changed the goals again. A discussion with @alessandrod and @pgarg66 led to the following:

Tasks:
- [ ] #29803 - [x] #29654 - [x] #30154 - [x] #30139 - [x] #30282 - [x] #30336 - [x] #30337 - [x] #30348 - [x] #30371 - [x] #30275 - [x] #30425 - [x] #30803 - [x] #30900 - [x] #30902 - [x] #30561 - [x] #30703 - [x] #30900 - [x] #30902 - [x] #30940 - [x] #30945 - [x] #30950 - [x] #30959 - [x] #31034 - [x] #31036 - [x] #31116 - [x] #31118 - [x] #31142 - [x] #31311 - [x] #31331 - [x] #31395 - [x] #31413 - [x] #31465 - [x] #31493 - [x] #31494 - [ ] #20323 - [x] https://github.com/solana-labs/rbpf/pull/454 - [x] https://github.com/solana-labs/rbpf/pull/460 - [ ] #29863 - [ ] #29864 - [x] #29728 - [x] #30579 - [x] #30614 - [x] #30464 - [x] #30693 - [x] #30893 - [x] #31007 - [x] #31088 - [x] #31244 - [x] #31221 - [x] #31244 - [x] #31324 - [x] #31329 - [x] #31345 - [x] #31429 - [x] #31488
- [Not MVP] Adjust other built-in programs and the testing framework

ripatel-jump · May 24, 2023, 7:50pm

The addition of a GUID and the interfaces macro seems redundant. It also seems susceptible to type confusion security issues if an attacker manages to create two distinct function signatures with the same ID.

How about the following?

Adjust the compiler to generate BTF for all public and externally linked entrypoints
Adjust the compiler to generate BTF for all imported entrypoints
When generating BTF for a function, also generate BTF for all transitive types (the types of the function arguments and the return type, recursively)

The ELF format only supports the specification of one entrypoint, so we could instead signal what is public through a custom flag in the dynamic symbol table, e.g. STV_PUBLIC_ENTRYPOINT.

The Solana SDK could make this more developer-friendly through macro annotations, like so:

// Callee
#[solana_program::entrypoint]
pub fn transfer(bla: u32) -> Result<u64, String>

// Caller
use callee::transfer;

fn bla() {
  transfer(...);
}

The ELFs of both the caller and the callee would contain the BTF definitions of the following types:

Result<u64, String>
String
type of callee::transfer

The runtime would then check both types for equality before execution.

This avoids the use of GUIDs and brings it more in line with regular dynamic linking.
The drawback of this is that it uses more space, as the caller ELFs will now have to store copies of the BTF. I would expect the BTF footprint to be negligible though unless devs make excessive use of templates.
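To make the equality check described above concrete, here is a simplified sketch. TypeDesc is a hypothetical stand-in for BTF type information (real BTF is a binary format with its own encoding), and check_link is an assumed name for the load-time comparison. It only shows the idea of comparing the caller's imported signature with the callee's exported one, recursively over all referenced types.

```rust
/// Hypothetical, heavily simplified type descriptor standing in for BTF.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum TypeDesc {
    U32,
    U64,
    Str,
    ResultOf(Box<TypeDesc>, Box<TypeDesc>),
    Func { args: Vec<TypeDesc>, ret: Box<TypeDesc> },
}

/// Descriptor for `fn transfer(bla: u32) -> Result<u64, String>`.
pub fn transfer_signature() -> TypeDesc {
    TypeDesc::Func {
        args: vec![TypeDesc::U32],
        ret: Box::new(TypeDesc::ResultOf(
            Box::new(TypeDesc::U64),
            Box::new(TypeDesc::Str),
        )),
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum LinkError {
    SignatureMismatch,
}

/// Assumed load-time check: the caller's copy of the signature must equal the
/// callee's. Derived `PartialEq` compares the nested types structurally.
pub fn check_link(imported: &TypeDesc, exported: &TypeDesc) -> Result<(), LinkError> {
    if imported == exported {
        Ok(())
    } else {
        Err(LinkError::SignatureMismatch)
    }
}
```

A caller compiled against a stale signature, for example one returning Result<u64, u64>, would be rejected at load time rather than misinterpreting memory at run time.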

toly · May 25, 2023, 2:11am

Security issues aren’t a concern because the callee never trusts the caller and has to validate all inputs.

In the BTF approach the runtime does that at link time. I generally think it’s the better option, but we need an actual design for the conventions we want programs to use. Something needs to do the dispatching from a wallet signed message string to the public entry points.

alessandrod · May 25, 2023, 11:18am

This is all already planned and even mostly implemented in LLVM: emitting BTF for a type recursively triggers BTF emission of all the referenced types - this includes function prototypes and definitions. The footprint is indeed negligible - BTF for the whole Linux kernel (millions of LOC) is 4.5MB today. Also, since BTF is only emitted for types reachable from public entrypoints - and not emitted for unused types - even depending on crates with a large API surface like solana_program won’t significantly impact ELF size.

For CPI, the idea is that this will work completely transparently. There’s no explicit dynamic dispatch or discovery needed. We’re planning to teach rustc, cargo and the linker to use the type info generated when compiling programs, so you’ll be able to invoke a callee program just like any other function, you’ll get compile-time errors if you try to misuse something, etc. Compile-time errors are just for improved developer experience; the runtime will obviously still not trust the resulting bytecode.

At load time then the program runtime will resolve links, apply BTF relocations and enforce whatever security constraints can be enforced based on type info. Higher level properties that can’t be expressed via the type system will be enforced at runtime.

Having said all that, as @toly pointed out we do still need a way to invoke entrypoints from a tx, so we do need a <something> => <entrypoint> mapping. Since the names of public entrypoints will likely not be mangled - the Rust mangling scheme is not stable yet and we need to interop with C and one day Move programs too - could we use symbol names? If we can use symbol names then essentially we don’t have to do anything special in the runtime: we already have a symbol table, so we can just look them up.
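As a rough illustration of the symbol-name idea, the sketch below dispatches from a name carried in a transaction to an exported function. SymbolTable, the Entrypoint signature and the status-code convention are all assumptions for the example; in the actual runtime the lookup would go through the ELF symbol table rather than a HashMap.

```rust
use std::collections::HashMap;

/// Hypothetical entrypoint shape: raw instruction data in, status code out.
pub type Entrypoint = fn(&[u8]) -> u64;

/// Stand-in for the program's symbol table: unmangled name -> function.
pub struct SymbolTable {
    symbols: HashMap<&'static str, Entrypoint>,
}

impl SymbolTable {
    pub fn new() -> Self {
        SymbolTable { symbols: HashMap::new() }
    }

    /// Registers a public entrypoint under its symbol name.
    pub fn export(&mut self, name: &'static str, f: Entrypoint) {
        self.symbols.insert(name, f);
    }

    /// Resolves `name` and calls it; `None` means the symbol is not exported.
    pub fn invoke(&self, name: &str, data: &[u8]) -> Option<u64> {
        self.symbols.get(name).map(|f| (*f)(data))
    }
}

/// Example entrypoint: expects an 8-byte amount, returns 0 on success, 1 otherwise.
pub fn transfer(data: &[u8]) -> u64 {
    if data.len() == 8 { 0 } else { 1 }
}
```

The entrypoint still validates its own input, in line with the point above that the callee never trusts the caller.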

toly · May 25, 2023, 8:54pm

So with Runtime V2 something like this should work:

///token.rs
///Token Transfer interface

struct Token {
   authority: Pubkey,
   balance: u64
}

trait Token {
   fn transfer(from: &mut Token, to: &mut Token, amount: u64) -> Result<()>;
}

Then the implementation can look like this:

///mytoken.rs
///my token implementation, counts the number of balance transfers

struct MyToken {
  token: Token,
  counter: u64,
}

impl Token for MyToken {
   fn transfer(from: &mut Token, to: &mut Token, amount: u64) -> Result<()> {
     let from: &mut Account = to_account(from);
     let from: &mut MyToken = from_account(from);
     from.counter += 1;
     from.token.transfer(to, amount)
   }
}
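For comparison, here is a version of the same idea that compiles with today's Rust. It is only a sketch: the trait is renamed TokenTransfer because a struct and a trait cannot share the name Token, Pubkey is a byte-array stand-in, TokenError is invented for the example, and the account conversions (to_account, from_account) are left out because they depend on the not-yet-designed runtime.

```rust
/// Stand-in for the real Pubkey type.
pub type Pubkey = [u8; 32];

#[derive(Debug, PartialEq)]
pub enum TokenError {
    InsufficientFunds,
    Overflow,
}

pub struct Token {
    pub authority: Pubkey,
    pub balance: u64,
}

/// The interface: a single transfer method.
pub trait TokenTransfer {
    fn transfer(&mut self, to: &mut Self, amount: u64) -> Result<(), TokenError>;
}

impl TokenTransfer for Token {
    fn transfer(&mut self, to: &mut Self, amount: u64) -> Result<(), TokenError> {
        // Compute both balances first so a failure leaves state untouched.
        let new_from = self.balance.checked_sub(amount).ok_or(TokenError::InsufficientFunds)?;
        let new_to = to.balance.checked_add(amount).ok_or(TokenError::Overflow)?;
        self.balance = new_from;
        to.balance = new_to;
        Ok(())
    }
}

/// Custom token that counts successful outgoing transfers.
pub struct MyToken {
    pub token: Token,
    pub counter: u64,
}

impl TokenTransfer for MyToken {
    fn transfer(&mut self, to: &mut Self, amount: u64) -> Result<(), TokenError> {
        self.token.transfer(&mut to.token, amount)?;
        self.counter += 1;
        Ok(())
    }
}
```

Unlike the sketch above, the counter here is bumped only after the inner transfer succeeds, since plain Rust has no transaction rollback to undo it.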

ripatel-jump · May 26, 2023, 12:01am

@toly would PRv2 support dynamically dispatching this trait? This looks very interesting. How would we address (lack of) ABI stability in rustc v1.x.x?

toly · May 26, 2023, 12:39am

It should be equivalent to a global extern function in C.

Do you mean dispatching from the transaction message processor or between programs? I think the tricky part will be figuring out which token implementation gets called when more than one is present.

I think we will need to be able to call the extern functions from the program object.

splintr · May 26, 2023, 12:54am

I’d like to suggest that interface GUIDs should be the first 128 bits of a hash of the specification detailing what accounts/data the interface expects. This convention should prevent contention/confusion over specific IDs (i.e. low-numbered IDs).
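A small sketch of deriving a GUID from the specification text. The suggestion above presumably means a cryptographic hash truncated to 128 bits; since the Rust standard library ships none, this example swaps in FNV-1a 128, a non-cryptographic hash, purely to stay dependency-free. The function name interface_guid and the idea of hashing the spec as a UTF-8 string are assumptions.

```rust
/// FNV-1a 128-bit parameters (non-cryptographic; a stand-in for a real
/// cryptographic hash truncated to 128 bits).
const FNV128_OFFSET_BASIS: u128 = 0x6c62272e07bb014262b821756295c58d;
const FNV128_PRIME: u128 = (1u128 << 88) + 0x13b;

/// Derives an interface GUID from the text of its specification.
pub fn interface_guid(spec: &str) -> u128 {
    let mut hash = FNV128_OFFSET_BASIS;
    for byte in spec.as_bytes() {
        hash ^= *byte as u128;
        hash = hash.wrapping_mul(FNV128_PRIME);
    }
    hash
}
```

With this convention, anyone holding the spec text can recompute the GUID, and changing the spec necessarily changes the ID. A production scheme would need a collision-resistant hash, which FNV is not.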

Additionally, I’d like to note that there is a need for data interfaces. For example, for any ownable object, it should be possible to determine the owning address. In different implementations that owner may be stored at different offsets into an account. It would be a huge burden on indexers/applications to require that these offsets be found manually. Additionally, a single program may have multiple different account types, so these offsets might be different within a single program.

I propose a solution in two parts: Account discriminators and an offsets section within the ELF. The idea would be that, within the binary, each different discriminator would be followed by a list of interface-offset pairs, designating the location within each account that the interface’s data can be found.
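One way the two-part scheme could look, as a sketch. It assumes a one-byte discriminator at the start of account data, a 32-byte owner field, and a made-up GUID for the "owner" data interface; the AccountLayout table stands in for the proposed ELF offsets section.

```rust
/// Hypothetical GUID for the "owning address" data interface.
pub const OWNER_INTERFACE: u128 = 0x1;

/// Per-account-type layout: discriminator plus (interface GUID, byte offset) pairs.
pub struct AccountLayout {
    pub discriminator: u8,
    pub fields: &'static [(u128, usize)],
}

/// Stand-in for the offsets section: two account types storing the owner
/// at different offsets.
pub static LAYOUTS: &[AccountLayout] = &[
    AccountLayout { discriminator: 1, fields: &[(OWNER_INTERFACE, 1)] },
    AccountLayout { discriminator: 2, fields: &[(OWNER_INTERFACE, 9)] },
];

/// Reads the 32-byte owner from raw account data using the layout table.
/// Returns `None` for unknown discriminators or data that is too short.
pub fn read_owner(layouts: &[AccountLayout], account_data: &[u8]) -> Option<[u8; 32]> {
    let discriminator = *account_data.first()?;
    let layout = layouts.iter().find(|l| l.discriminator == discriminator)?;
    let (_, offset) = layout.fields.iter().find(|(guid, _)| *guid == OWNER_INTERFACE)?;
    let end = offset.checked_add(32)?;
    let bytes = account_data.get(*offset..end)?;
    let mut owner = [0u8; 32];
    owner.copy_from_slice(bytes);
    Some(owner)
}
```

An indexer would only need the table and the account bytes, not program-specific parsing code.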

Note: If I remember correctly, some current implementations vary account type via account size. These implementations will need to be manually grandfathered in by indexers. In fact, since these implementations already exist and are indexed, it would require minimal work. However, all future programs would need to adhere to the discriminator system.

cavemanloverboy · June 9, 2023, 4:11am

If I understand correctly, it seems to me like this goes against last year’s trend of composability (since everyone has their own impl of a primitive), will contribute to chain bloat, and opens up a can of worms re: vulnerability.

Why are these issues not a concern?
How big is the byte code for each interface implementation (e.g., for these simple transfers)?
Are there any guarantees about state transitions that can be provided beyond Rust/Move aliasing rules? For example, instead of a &mut self, can we mark only part of state as mutable for a given implementation? Also, as another example, can we provide default implementations for particular methods that cannot be overridden?

To illustrate this last point, consider another take on your last example:

struct Token {
   authority: Pubkey,
   balance: u64
}

trait Token {
   fn transfer_hook(from: &mut Token, to: &mut Token, amount: u64) -> Result<()>;
  
   #[immutable]
   fn transfer(from: &mut Token, to: &mut Token, amount: u64) -> Result<()> {
     let to_account: &mut MyToken = to_account(to);
     let from_account: &mut MyToken = from_account(from);
     from_account.token.transfer(to_account, amount);

     transfer_hook(from, to, amount)
   }
}

and

struct MyToken {
  token: Token,
  counter: u64,
}

impl Token for MyToken {
   fn transfer_hook(from: &mut Token, to: &mut Token, amount: u64) -> Result<()> {
     let from: &mut MyToken = from_account(from);
     from.counter += 1;
     Ok(())
   }
}

which provides the same functionality while providing a base implementation that need only be audited once.
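Current Rust has no #[immutable] attribute for default trait methods, so the closest approximation is probably a free generic function that owns the base logic and calls into a hook trait. The sketch below assumes that pattern; TransferHook, the token() accessor and the string errors are invented for illustration.

```rust
pub struct Token {
    pub balance: u64,
}

/// What an implementation may customize: access to its embedded Token and a hook.
pub trait TransferHook {
    fn token(&mut self) -> &mut Token;
    fn transfer_hook(&mut self, amount: u64);
}

/// Base transfer logic. As a free function it cannot be overridden by
/// implementors, so it only needs to be audited once.
pub fn transfer<T: TransferHook>(
    from: &mut T,
    to: &mut T,
    amount: u64,
) -> Result<(), &'static str> {
    let new_from = from.token().balance.checked_sub(amount).ok_or("insufficient funds")?;
    let new_to = to.token().balance.checked_add(amount).ok_or("overflow")?;
    from.token().balance = new_from;
    to.token().balance = new_to;
    // Implementation-specific behaviour runs only after the audited base logic.
    from.transfer_hook(amount);
    Ok(())
}

pub struct MyToken {
    pub token: Token,
    pub counter: u64,
}

impl TransferHook for MyToken {
    fn token(&mut self) -> &mut Token {
        &mut self.token
    }

    fn transfer_hook(&mut self, _amount: u64) {
        self.counter += 1;
    }
}
```

Note that the hook still receives &mut self and could alter the balance, which is exactly the partial-mutability question raised above; this pattern fixes the control flow but not the state access.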
