sRFC 00014: Rethinking SPL Token

Summary

This sRFC highlights a critical issue with the current implementation of the SPL token program: the tightly coupled interface and implementation. This results in performance degradation and significant barriers to innovation. To overcome these challenges, the sRFC proposes a modular token program that offers flexible functionality while maintaining security.

Problem

One of the biggest differences between the SVM and the EVM is the SVM’s separation of state and code. This mechanism is useful because it enables parallel processing of transactions. However, it has also had a less-desirable consequence: tightly coupling interfaces and implementations.

This issue is particularly relevant to Solana tokens. Unlike EVM chains, which have a standard ERC20 interface and multiple implementations, Solana requires all projects to use the same shared implementation if they want their token adopted within the ecosystem. Consequently, all token instructions flow through a common codepath. This presents three major drawbacks:

  • Performance degradation due to ‘lowest common denominator’ approach: The shared implementation must accommodate all possible token features, even if they are unnecessary for certain tokens. As a result, token instructions often traverse unnecessary codepaths, leading to performance inefficiencies and increased data storage requirements. For example, all transfers check if the source and destination accounts are frozen, even if a token doesn’t need / isn’t using the freeze functionality. Further, 66 bytes of the 75-byte mint account (88%) and 88 bytes of the 136-byte token account (65%) are only used for a subset of tokens. Given the mass-market adoption that Solana aims to achieve, these differences could add up to tens of millions of dollars of added cost to Solana end-users.

  • Raised barriers to innovation: While the Token-2022 initiative aims to add more features, it fails to address the core problem of inhibiting permissionless innovation. Many projects on Ethereum, such as MakerDAO with DAI, Compound with cTokens, OlympusDAO with OHM, and others, have required custom token implementations tailored to their specific needs. The inability to create alternative token programs easily limits experimentation and adoption within the Solana ecosystem.

  • Opaque governance: Because the token program is a singleton and new features that cannot be predicted in advance are expected, it must be upgradeable. Currently, the governance of the token program is driven by Solana Labs, with validators theoretically having the ability to prevent an upgrade. However, this arrangement raises concerns about centralization and the lack of practical oversight by validators.

[Image: jordan-tokenkeg-tweet]

Solutions

To address these issues, several potential solutions are proposed:

Wrapper contract

One idea discussed by @joncinque is the utilization of a wrapper contract that automatically freezes token accounts. Additional logic can be implemented in this wrapper contract, imposing it on users during token transfers.

However, this approach does not sufficiently address the problem of permissionless innovation, as widespread support for custom wrappers from various applications and wallets is unlikely.

Change the runtime to allow fine-grained control over CPIs

Enhancing the Solana runtime to provide developers with fine-grained control over Cross-Program Invocations (CPIs) is another solution. Developers could specify specific access rights for callees, such as restricting the ability to pass signed accounts to other programs via CPIs. Similar to Linux’s containerization tools, these runtime additions would offer enhanced security and control.

A modular token program

The advocated approach in this sRFC is the adoption of a modular token program. This program would allow anyone to create a token handler program and register it with the main token program. The token program would then pass along all calls to the relevant handler (this is defined at the mint), taking care to never pass along signed user accounts. It would do so through two means:

  • Upon receiving an initialize_x call, such as initialize_token_account, it would pre-allocate any accounts and pass ownership to the handler. This way, users can prevent passing any account initialization ‘payer’ to the handler.

  • Upon receiving a call where someone needs to be authorized (e.g., from.authority, in the case of a transfer), the token program would authorize the user on behalf of the handler and pass the handler a different signed account to signify that the relevant user has signed.
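
The two rules above can be sketched as a toy model in plain Rust. This is not the proof-of-concept's code: `AccountRef`, `ForwardedCall`, `forward_transfer`, and the `proxy_authority` marker are hypothetical names chosen for illustration, and a real program would work with Solana account infos and a PDA signer rather than strings and booleans.

```rust
/// Simplified stand-in for an account passed to an instruction (hypothetical).
#[derive(Clone, Debug, PartialEq)]
struct AccountRef {
    key: &'static str,
    is_signer: bool,
}

/// What the main token program would forward to the handler (hypothetical).
#[derive(Debug, PartialEq)]
struct ForwardedCall {
    handler: &'static str,
    accounts: Vec<AccountRef>,
}

/// Hypothetical marker account that only the main token program can sign for.
const PROXY_AUTHORITY: &str = "proxy_authority";

/// Authorize the user inside the main program, then forward the call with
/// every user signature downgraded and a program-signed marker appended.
fn forward_transfer(
    handler: &'static str,
    authority: &'static str,
    accounts: &[AccountRef],
) -> Result<ForwardedCall, String> {
    // The main program itself checks that the required authority signed.
    let authorized = accounts
        .iter()
        .any(|a| a.key == authority && a.is_signer);
    if !authorized {
        return Err(format!("{} did not sign", authority));
    }
    // Downgrade all signer flags so the handler never receives a signed user account.
    let mut forwarded: Vec<AccountRef> = accounts
        .iter()
        .map(|a| AccountRef { key: a.key, is_signer: false })
        .collect();
    // The marker signals to the handler that the main program vouches for the user.
    forwarded.push(AccountRef { key: PROXY_AUTHORITY, is_signer: true });
    Ok(ForwardedCall { handler, accounts: forwarded })
}
```

In this sketch the handler only ever sees one signer, the marker, so it has to trust the main program's authorization check rather than inspect the user's signature itself.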

A proof-of-concept of such a program can be found here.

Open questions

  • Should a token handler be able to specify extra accounts required for basic instructions like transfers? If so, how should this be standardized?

  • How important is backwards-compatibility, and what steps can be taken to ensure compatibility with the existing SPL token? How would migration be facilitated?

  • Are there any security vulnerabilities in this design, and if so, how can they be addressed?

Conclusion

This proposal introduces a method for Solana developers to create new token mechanisms while preserving end-user security. Feedback and questions within this forum are greatly appreciated, particularly from esteemed SPL contributors such as @joncinque, criesofcarrots, and mvines.

Thanks for bringing this up and thinking so much about the problem. Certainly, the lack of composability with token programs is a huge hindrance to open development in the Solana ecosystem, and I would love to see a better solution than the current monolith of Tokenkeg....

I view this in a very similar way, but rather than having everything go through a centralized program, I’ve always preferred simply having programs that implement many interfaces. For example, Tokenkeg... can really be broken down into a program that contains many different interfaces:

  • transferable: transfer and transfer_checked
  • mintable: mint_to
  • freezable: freeze and thaw
  • closeable: close_account
  • approvable: approve and revoke
  • burnable: burn
  • initialize mint / account

If we can write and implement program interfaces for each of these, then we can re-compose everything. sRFC 00010: Program Trait - Transfer Spec is the first step towards this future.
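
As a rough illustration of the re-composition idea, here is a sketch that models two of the interfaces as Rust traits. The trait names mirror the list above, but `SimpleToken`, `pay`, and the string-keyed balances are assumptions made for the example; on-chain interfaces would be defined over instructions and accounts, not in-process traits.

```rust
use std::collections::HashMap;

/// Hypothetical "transferable" interface.
trait Transferable {
    fn transfer(&mut self, from: &str, to: &str, amount: u64) -> Result<(), String>;
}

/// Hypothetical "mintable" interface.
trait Mintable {
    fn mint_to(&mut self, to: &str, amount: u64) -> Result<(), String>;
}

/// A toy token that opts into minting and transfers, but not freezing or burning.
#[derive(Default)]
struct SimpleToken {
    balances: HashMap<String, u64>,
    supply: u64,
}

impl SimpleToken {
    fn balance(&self, owner: &str) -> u64 {
        self.balances.get(owner).copied().unwrap_or(0)
    }
}

impl Mintable for SimpleToken {
    fn mint_to(&mut self, to: &str, amount: u64) -> Result<(), String> {
        // Reject mints that would overflow the total supply.
        self.supply = self.supply.checked_add(amount).ok_or("supply overflow")?;
        *self.balances.entry(to.to_string()).or_insert(0) += amount;
        Ok(())
    }
}

impl Transferable for SimpleToken {
    fn transfer(&mut self, from: &str, to: &str, amount: u64) -> Result<(), String> {
        let from_balance = self.balance(from);
        if from_balance < amount {
            return Err("insufficient funds".to_string());
        }
        self.balances.insert(from.to_string(), from_balance - amount);
        *self.balances.entry(to.to_string()).or_insert(0) += amount;
        Ok(())
    }
}

/// A caller that depends only on the transfer interface, not on a concrete token.
fn pay<T: Transferable>(token: &mut T, from: &str, to: &str, amount: u64) -> Result<(), String> {
    token.transfer(from, to, amount)
}
```

Because `pay` is written against `Transferable` alone, any token that implements that one interface can be used with it, whatever other interfaces it does or does not support.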

In the interface specs, we can allow for an arbitrary number of additional accounts, to be specified through an instruction or through some sort of lookup account, as in the “transfer-hook” interface created for token-2022 https://github.com/solana-labs/solana-program-library/pull/4147.

We also need to figure out “state interfaces”, i.e., for defining initialize_mint and initialize_account, similar to your example, but doing it through a spec / interface rather than a centralized program.

What do you think about this interface approach?

This interface approach makes a lot of sense and pairs nicely with Solana’s use of Rust.

Regarding state interfaces, I’ve pondered potential solutions. One idea is to incorporate a preflight function that returns a structure resembling:

[
  {
    field_name: "mint_authority",
    type: Pubkey,
  },
  {
    field_name: "supply",
    type: u64,
  },
  {
    field_name: "freeze_authority",
    type: Option<Pubkey>,
  }
]

The calling program could then match the field names with its own knowledge (e.g., desiring an Alice mint_authority and a supply of 1000), use None for unknown Option fields, and trigger a revert if encountering non-optional, unknown fields.
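
A minimal sketch of that matching rule, under the assumption that the preflight result is a list of named, typed fields. `FieldType`, `FieldSpec`, `FieldValue`, and `resolve_fields` are hypothetical names, and pubkeys are represented as strings to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Type tags a preflight call might report (hypothetical).
#[derive(Clone, Debug, PartialEq)]
enum FieldType {
    Pubkey,
    U64,
    OptionPubkey,
}

/// One entry of the preflight result.
#[derive(Clone, Debug, PartialEq)]
struct FieldSpec {
    field_name: &'static str,
    ty: FieldType,
}

/// A value the caller can supply for a field.
#[derive(Clone, Debug, PartialEq)]
enum FieldValue {
    Pubkey(String),
    U64(u64),
    OptionPubkey(Option<String>),
}

/// Check that a supplied value has the type the implementor asked for.
fn has_type(value: &FieldValue, ty: &FieldType) -> bool {
    matches!(
        (value, ty),
        (FieldValue::Pubkey(_), FieldType::Pubkey)
            | (FieldValue::U64(_), FieldType::U64)
            | (FieldValue::OptionPubkey(_), FieldType::OptionPubkey)
    )
}

/// Fill in every requested field: use the caller's value when it knows the
/// field, None for unknown optional fields, and an error (a revert) otherwise.
fn resolve_fields(
    spec: &[FieldSpec],
    known: &HashMap<&str, FieldValue>,
) -> Result<Vec<FieldValue>, String> {
    spec.iter()
        .map(|field| match known.get(field.field_name) {
            Some(value) if has_type(value, &field.ty) => Ok(value.clone()),
            Some(_) => Err(format!("type mismatch for field: {}", field.field_name)),
            None if field.ty == FieldType::OptionPubkey => Ok(FieldValue::OptionPubkey(None)),
            None => Err(format!("unknown required field: {}", field.field_name)),
        })
        .collect()
}
```

With the three fields from the structure above and a caller that knows only `mint_authority` and `supply`, `freeze_authority` resolves to `None`; adding a non-optional field the caller does not know makes the whole call fail.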

However, my primary concern with the interface approach, unless mediated by a program like the one I’ve developed, lies in security. How can we ensure implementors don’t misuse signed inputs? This becomes crucial, especially considering implementors can request additional accounts. Without safeguards, a malicious program might grief the user by creating a 1KB account when it only needs 100 bytes. Couldn’t a program also use the pre-flight mechanism to request a legitimate token program and the user’s token account at that program, thus allowing them to steal the user’s legitimate tokens?

To some extent, user wallets simulate transactions to prevent such risks (e.g., identifying a transaction attempting to steal SOL and aborting it). However, there are ways to bypass these protections, such as malicious codepaths dependent on semi-random events (e.g., stealing funds if wallclock time % 1000 == 0). We could argue that it’s the user’s responsibility to verify the code they interact with, but that weakens Solana’s value proposition, as EVM users, for example, don’t face similar concerns when purchasing tokens on Uniswap.

Thus, it appears necessary to introduce a mediator between the client and implementor, preventing the client’s signed account from leaking through. I would appreciate your thoughts on this matter.

These concerns are definitely all valid, and could be a model that’s used on top of interfaces, a sort of “safe-interface-wrapper” program that’ll enforce all of the correct signer / writable flags on accounts, downgrade signers, and CPI to the next program.

With the model you’re proposing, since you don’t want a signature to propagate down, then you’ll have to also provide some signed PDA from the interface wrapper program to ensure that this is a “valid” call to the program, which may be a bit restrictive.

Rather than restricting program design, I’d prefer for the interfaces to be well-designed, for wallets to catch potentially dangerous situations, and for everyone to make heavy use of token delegates.

For example, you should never send an instruction that requires your wallet to be signer and writable, along with the system program. This is the current pattern with PDA creation, but it stinks! The program should only allocate + assign the PDA, and your wallet can do a direct system transfer of the required lamports to the PDA at the top-level of the transaction, so only the system program gets your wallet as a signer.

If an interface needs to create a PDA from a wallet, it’s very risky for the reasons that you’ve mentioned, and should not be used.

For tokens, the best option is to use the CPI guard extension on token-2022 (see solana-program-library/instruction.rs at 8f9c33b3a04250938a573809cd9dfdb698025972 in solana-labs/solana-program-library on GitHub).

But otherwise, wallets / clients should always use delegates when transferring tokens. A client should never sign a transaction containing an instruction to a program that requires an owner’s signature, their token account, and the token program. Unless that’s the token program, of course.

If an interface requires these things, it should also be changed. And wallets can catch if there’s a potentially risky set of accounts in an instruction.
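
One way a wallet might flag the risky combination described above is sketched below. The types are toy stand-ins rather than the Solana SDK's, `is_risky` is a hypothetical helper, and a real wallet would need to recognize token accounts and token programs from on-chain data rather than from a boolean flag.

```rust
/// Toy stand-in for an account reference in an instruction (not the SDK type).
struct ToyAccountMeta {
    key: &'static str,
    is_token_account: bool,
    is_signer: bool,
}

/// Toy stand-in for a transaction instruction.
struct ToyInstruction {
    program_id: &'static str,
    accounts: Vec<ToyAccountMeta>,
}

/// Placeholder identifier for the token program.
const TOKEN_PROGRAM: &str = "token_program";

/// Flag an instruction to a non-token program that receives the owner's
/// signature, a token account, and the token program all at once.
fn is_risky(ix: &ToyInstruction, owner: &str) -> bool {
    // Calling the token program directly with the owner's signature is expected.
    if ix.program_id == TOKEN_PROGRAM {
        return false;
    }
    let owner_signs = ix.accounts.iter().any(|a| a.key == owner && a.is_signer);
    let has_token_account = ix.accounts.iter().any(|a| a.is_token_account);
    let has_token_program = ix.accounts.iter().any(|a| a.key == TOKEN_PROGRAM);
    owner_signs && has_token_account && has_token_program
}
```

The check is deliberately conservative: it says nothing about what the program will actually do, only that it has been handed everything it would need to move the owner's tokens.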

Or we can consider expanding the runtime / transaction format to “scope” signatures so they can’t go past a few levels. The GitHub issue “Bad actors can abuse the privilege extension feature for Cross-Program Invocations via system_instruction::transfer, spl_token::instruction::approve, spl_token::instruction::transfer” (Issue #17762 in solana-labs/solana) has some interesting ideas on that.

Thank you for your thoughtful response, sire. I am also not yet convinced that the solution I’ve presented here is the best one.

@ngundotra maybe want to offer your counsel as well, given your commendable work on sRFCs 2, 3, and 10? @joec as well, given your leadership of Nautilus and your involvement in the interfaces project?

For the purposes of clarity, I am going to enumerate the options that have emerged from this discussion thus far.

Option 1: Status quo

In the status quo option, users would continue passing in signed accounts directly to programs. For example, if someone wanted to submit an order to a CLOB, they would need to directly pass their signed token account, which the CLOB would in turn pass to a token program so that the CLOB may claim the user’s funds.

Here, a user needs to trust every program that they pass their signed account into. Hence, in order to remain secure, the user would need to do due diligence on programs.

Programs, on the other hand, could adopt a more relaxed posture, since a PDA generally only has control over 1 asset (e.g., a quote_vault of a CLOB holding only quote tokens, so that even a malicious token program wouldn’t be able to steal their other assets). Still, programs would need to ensure that their PDAs are not drained of lamports, which could be done like so:

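// Record the PDA's lamport balance before the CPI.
let pda_lamports_before = pda.lamports();
solana_program::program::invoke_signed(&instruction, &account_infos, signer_seeds)?;
// Fail the transaction if the callee changed the PDA's lamport balance.
assert!(pda.lamports() == pda_lamports_before);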

Option 2: Discourage passing in signed accounts

Another option is to discourage passing signed accounts into programs. In the CLOB example, a user would first delegate an amount to the CLOB before their trade, and then the CLOB could pull the funds even without the user’s signed account. This is analogous to the EVM approve and transferFrom combination, although this could be done in a single transaction because Solana transactions can contain multiple instructions.

To create accounts, the program would Allocate and Assign the account, and the user would Transfer lamports to the account in a separate instruction.

Of course, the user would still need to trust the token program they interact with and any other programs that require an account to authorize itself (e.g., an NFT program).
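
The delegate flow can be modeled with a small in-memory ledger. This is only a sketch of the approve-then-pull pattern: `Ledger`, `approve`, and `transfer_as_delegate` are hypothetical names, the `owner_signed` flag stands in for a top-level signature check, and the account names are illustrative.

```rust
use std::collections::HashMap;

/// Toy token account holding a balance and an optional delegate allowance.
#[derive(Default)]
struct TokenAccount {
    amount: u64,
    delegate: Option<String>,
    delegated_amount: u64,
}

/// Toy ledger keyed by account name.
#[derive(Default)]
struct Ledger {
    accounts: HashMap<String, TokenAccount>,
}

impl Ledger {
    /// The owner signs only this top-level instruction to the token program.
    fn approve(&mut self, account: &str, owner_signed: bool, delegate: &str, amount: u64) -> Result<(), String> {
        if !owner_signed {
            return Err("owner must sign approve".to_string());
        }
        let acct = self.accounts.get_mut(account).ok_or("unknown account")?;
        acct.delegate = Some(delegate.to_string());
        acct.delegated_amount = amount;
        Ok(())
    }

    /// The delegate (e.g. the CLOB) pulls funds without the owner's signature.
    fn transfer_as_delegate(&mut self, source: &str, destination: &str, delegate: &str, amount: u64) -> Result<(), String> {
        let src = self.accounts.get_mut(source).ok_or("unknown source")?;
        if src.delegate.as_deref() != Some(delegate) {
            return Err("not the approved delegate".to_string());
        }
        if amount > src.delegated_amount || amount > src.amount {
            return Err("amount exceeds allowance or balance".to_string());
        }
        src.delegated_amount -= amount;
        src.amount -= amount;
        // Clear the delegate once the allowance is used up.
        if src.delegated_amount == 0 {
            src.delegate = None;
        }
        self.accounts.entry(destination.to_string()).or_default().amount += amount;
        Ok(())
    }
}
```

The point of the pattern is that the CLOB is bounded by the approved amount: even a malicious CLOB can pull no more than the allowance, and it never holds the owner's signature.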

Option 3: Proxy

This is what I originally proposed in this sRFC. Interfaced programs would sit behind a proxy that allocates accounts on behalf of the user and signs on behalf of the user.

proof of concept code

Option 4: Runtime changes

Allow users to sandbox their program invocations.

Some ideas here include:

I personally am unconvinced that any of these approaches is ideal.

Option #1 is possibly the worst one, as it requires users to completely trust programs that they interact with. As a result, users will naturally limit their interactions to a small set of programs.

I consider option #2 a step-function improvement over option #1, but insufficient by itself. In the CLOB example, one must still trust the token program, even if one may not need to trust the CLOB program. In the EVM, on the other hand, you needn’t place such a high degree of trust in token programs.

At first, I considered option #3 estimable (hence the sRFC :slightly_smiling_face:), but it also seems insufficient. In the CLOB example, one would still need to pass one’s signed account into the CLOB, even if it will never reach the token program. Doing every CPI through a proxy would also double the number of CPIs, which is likely to remain expensive until runtime v2.

Option #4 seems the least poor choice. However, I am not well-versed enough in the validator codebase to determine what can be introduced without degrading performance.

The solution to this is to open-source programs, and verify that the code compiles to the program executable that we see on-chain. There’s ongoing work funded by the Foundation to make this an easy-to-use tool.

Realistically, I think we’ll end up in 2 different worlds with some amount of interoperability.
1 - Tokenkeg + Token22 programs
2 - ERC 721 world where each program is its own token with innovative or malicious rules around how it requests accounts and transfers balance

With Token22 being the bridge between the two worlds (cc @joncinque).

I think this is a great idea, and would love to see more research here. @austbot came up with a similar program structure he called Digital Asset Spec that has the same spirit: shared state, modular programs control specific logic (royalty payout, transfer rules, trait swaps, etc), and a slim program that defines this lifecycle.

If you continue to build this out, I’d recommend pursuing 2 checkpoints:

  1. Can you build a marketplace that swaps / lists assets built with your modular token interface?
  2. Can you index assets appropriately given your token interface?

Thanks for your research so far @metaproph3t !