sRFC 33 - Sign message in Actions/blinks

TLDR

Add the ability for Actions and blinks to ask users to sign a plaintext message. This is a common and no-cost way to validate a user actually controls a given wallet address.

Background

Currently, all actions ultimately require the user to sign and send a transaction to be confirmed on chain. So there is always a transaction fee for users, and therefore always a realized cost. Even in action chaining, the user must sign a transaction before proceeding to the next action, paying a transaction fee each time.

A very common flow within dApps is asking the user to sign a plaintext message to validate they do in fact control a specific wallet address. This is great because it is completely free for users. There is no transaction fee paid.

Adding the ability for actions/blinks to request that users sign a message will unlock new use cases for blinks, including:

  • reducing costs for users
  • blinks could authenticate a user’s wallet address natively within the blink
  • allow blinks to craft customized blink metadata based on the wallet interacting with it

Proposal

Note: specific implementation details to be worked out in PRs.

  • add the ability for an action to request that a user sign a transaction OR a plaintext message, by updating LinkedAction to support both
  • ActionPostResponse should handle different “action response types”, one for transaction signature requests and another for message signature requests
  • messages being signed should have a consistent structure, and there should be a security mechanism that lets an action api confirm that a message originated from itself

Requesting the user sign a message

LinkedAction should be updated to support multiple types. Something similar to this:

export type LinkedActionType = "transaction" | "sign-message";

export interface LinkedAction {
  type?: LinkedActionType;
  href: string;
  label: string;
  parameters?: Array<TypedActionParameter>;
}

When the LinkedActionType is a sign message type:

  • the blink client will make a POST request to the href with the user’s account address (just as transaction focused actions do now)
  • the api server will respond with the message for the user to sign (see message sign request)
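  • after the user signs the message, the blink client should make a POST request to the next action link included in the api server’s response (see Response payload for a sign message request)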
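
The client side of the steps above can be sketched roughly as follows. This is only an illustration, not part of the proposal: the `post` and `sign` helpers are hypothetical injected dependencies (an HTTP helper and a wallet adapter), and the response shape is an assumption trimmed to the fields the sketch needs.

```typescript
// Shapes assumed for this sketch, trimmed to the fields it needs.
type SignMessageData = {
  domain: string;
  address: string;
  statement: string;
  nonce: string;
  issuedAt: string;
  chainId?: string;
};

interface SignMessageResponse {
  type: "sign-message";
  data: SignMessageData;
  state?: string;
  links: { next: { type: "post"; href: string } };
}

// Injected dependencies (hypothetical): an HTTP POST helper that returns the
// parsed JSON body, and a wallet signer that returns a base58 signature.
type PostFn = (href: string, body: unknown) => Promise<unknown>;
type SignFn = (data: SignMessageData) => Promise<string>;

async function runSignMessageAction(
  href: string,
  account: string,
  post: PostFn,
  sign: SignFn,
): Promise<unknown> {
  // 1. POST the user's account address to the action's href.
  const res = (await post(href, { account })) as Partial<SignMessageResponse>;
  if (res.type !== "sign-message" || !res.data || !res.links) {
    throw new Error("expected a sign-message response with data and links.next");
  }

  // 2. Ask the wallet to sign the message the api server returned.
  const signature = await sign(res.data);

  // 3. POST the signature, the signed data, and the untouched state to the next action.
  return post(res.links.next.href, {
    account,
    data: res.data,
    signature,
    state: res.state,
  });
}
```

Passing `post` and `sign` in as parameters keeps the sketch independent of any particular wallet or HTTP library.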

Response payload for a sign message request

When an action api server is requesting that a user sign a plaintext message, instead of a transaction, the server should respond with an updated ActionPostResponse described below:

Note: Sign message requires action chaining. Therefore, when an action api sends a sign message request to the client, the api server must also include a PostNextActionLink to inform action-clients where to send the signature of the message the user signed.

export type ActionPostResponse = TransactionResponse | SignMessageResponse;

export interface TransactionResponse {
  type?: "transaction";
  transaction: string;
  message?: string;
  links?: {
    next: NextActionLink;
  };
}

export interface SignMessageResponse {
  type: "sign-message";
  data: SignMessageData; // see "Structured sign message data"
  state?: string;
  message?: string;
  links: {
    next: PostNextActionLink;
  };
}

The state should be a utf-8 string of a MAC created by the action api server using a secret stored on that server. Action clients should pass this value back to the api server in the PostNextActionLink request. This enables api servers to cryptographically verify that the initial sign message request came from their server, by regenerating the HMAC on their server and comparing it. It also means they are not required to maintain server state recording which messages their api requested users sign.
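
As a rough illustration of the state mechanism, the sketch below creates and verifies an HMAC-SHA256 using Node's built-in crypto module. The helper names, the fixed-order serialization, and the choice to compute the MAC over the message data are all assumptions; the proposal does not prescribe what the MAC covers or how it is encoded.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type SignMessageData = {
  domain: string;
  address: string;
  statement: string;
  nonce: string;
  issuedAt: string;
  chainId?: string;
};

// Serialize the fields in a fixed order so the same data always yields the same bytes.
function canonicalize(data: SignMessageData): string {
  return JSON.stringify([
    data.domain,
    data.address,
    data.statement,
    data.nonce,
    data.issuedAt,
    data.chainId ?? null,
  ]);
}

// Create the `state` value: a hex-encoded HMAC-SHA256 over the message data.
function createState(data: SignMessageData, secret: string): string {
  return createHmac("sha256", secret).update(canonicalize(data)).digest("hex");
}

// Recompute the HMAC for the data the client sent back and compare in constant time.
function verifyState(
  data: SignMessageData,
  state: string | undefined,
  secret: string,
): boolean {
  if (typeof state !== "string") return false;
  const expected = Buffer.from(createState(data, secret), "utf8");
  const received = Buffer.from(state, "utf8");
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Because the MAC is recomputed from the data relayed by the client, the server in this sketch does not need to remember which messages it issued. Any change to the data or the state makes verification fail.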

After the user signs the provided message, the blink client will make a POST request to the included next action (i.e. perform action chaining) with a payload as follows:

  • account (required) - the user’s wallet address that signed the message
  • data (required) - the structured data that the user was requested to sign. See Structured sign message data
  • signature (required) - the signature created by the account signing the data (as a base58 encoded string)
  • state (optional) - the same state value the action api initially provided, relayed back from the client.

An example of the updated NextActionPostRequest looks like this:

export interface NextActionPostRequest extends ActionPostRequest {
  /** signature produced from the previous action (either a transaction id or message signature) */
  signature: string;
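  /** structured data that the user was requested to sign */
  data: SignMessageData; // see "Structured sign message data"
  /** the same state value the action api initially provided, relayed back from the client */
  state?: string;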
}

Note: since the data already supports a key-value object, no change to that type should be required.

After receiving the NextActionPostRequest, the action api should perform all required validation checks on the signature, data, and state to satisfy their business constraints.
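
One of those checks is verifying the signature itself. The sketch below shows one possible way to do that with only Node's built-in crypto module. It makes several assumptions: the helper names are hypothetical, the caller supplies the exact bytes the wallet signed (this proposal does not define how SignMessageData is serialized into those bytes), and the account is a 32-byte ed25519 public key encoded as base58.

```typescript
import * as crypto from "node:crypto";

const BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Minimal base58 decoder (Bitcoin alphabet, as used for Solana addresses).
function base58Decode(input: string): Uint8Array {
  const bytes: number[] = []; // little-endian while decoding
  for (let i = 0; i < input.length; i++) {
    let carry = BASE58_ALPHABET.indexOf(input[i]);
    if (carry < 0) throw new Error(`invalid base58 character at index ${i}`);
    for (let j = 0; j < bytes.length; j++) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) {
      bytes.push(carry & 0xff);
      carry >>= 8;
    }
  }
  // Each leading "1" stands for one leading zero byte.
  for (let i = 0; i < input.length && input[i] === "1"; i++) bytes.push(0);
  return Uint8Array.from(bytes.reverse());
}

// DER header that turns a raw 32-byte ed25519 public key into an SPKI structure.
const ED25519_SPKI_PREFIX = Buffer.from("302a300506032b6570032100", "hex");

// Returns true only if `signature` is a valid ed25519 signature by `account`
// over `message`. Malformed input is treated as an invalid signature.
function verifyMessageSignature(
  message: Uint8Array,
  signature: string,
  account: string,
): boolean {
  try {
    const publicKeyBytes = base58Decode(account);
    const signatureBytes = base58Decode(signature);
    if (publicKeyBytes.length !== 32 || signatureBytes.length !== 64) return false;
    const key = crypto.createPublicKey({
      key: Buffer.concat([ED25519_SPKI_PREFIX, publicKeyBytes]),
      format: "der",
      type: "spki",
    });
    return crypto.verify(null, message, key, signatureBytes);
  } catch {
    return false;
  }
}
```

An api server would typically run this check before looking at anything else in the request, since the other fields mean little if the signature does not match the account.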

The action api can now return the metadata for the next action, and the user can continue within the blink experience.

Structured sign message data

For better user experience and improved security, the plaintext message a user will be prompted to sign should be structured with a few required fields:

  • domain (required) - domain requesting the user to sign the message
  • address (required) - base58 string of the Solana address requested to sign this message
  • statement (required) - human readable string message to the user. It should not contain new line characters (i.e. \n)
  • nonce (required) - a random alphanumeric string of at least 8 characters. This value is generated by the action api, should only be used once, and is used to prevent replay attacks
  • issuedAt (required) - ISO 8601 datetime string. This represents the time at which the sign request was issued by the action api.
  • chainId (optional) - Chain id compatible with CAIPs, defaults to Solana mainnet-beta in clients. If not provided, the blink client should not include chain id in the message to be signed by the user.

type SignMessageData = {
    domain: string;
    address: string;
    statement: string;
    nonce: string;
    issuedAt: string;
    chainId?: string;
}
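
A server-side helper for producing this structure might look something like the sketch below. The function name and input shape are hypothetical, and generating the nonce from 16 random bytes rendered as hex is just one way to satisfy the "random alphanumeric, at least 8 characters" requirement.

```typescript
import { randomBytes } from "node:crypto";

type SignMessageData = {
  domain: string;
  address: string;
  statement: string;
  nonce: string;
  issuedAt: string;
  chainId?: string;
};

// Hypothetical helper: build the structured data for one sign message request.
function createSignMessageData(
  input: { domain: string; address: string; statement: string; chainId?: string },
  now: Date = new Date(),
): SignMessageData {
  if (input.statement.includes("\n")) {
    throw new Error("statement must not contain new line characters");
  }
  const data: SignMessageData = {
    domain: input.domain,
    address: input.address,
    statement: input.statement,
    // 16 random bytes as hex: 32 alphanumeric characters, fresh for every request.
    nonce: randomBytes(16).toString("hex"),
    // ISO 8601 timestamp of when this request was issued.
    issuedAt: now.toISOString(),
  };
  // Only include chainId when the caller supplied one.
  if (input.chainId !== undefined) data.chainId = input.chainId;
  return data;
}
```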

Note: this structured data is similar to the Sign In With Solana spec, but without the additional rarely used fields. Therefore the structure of this data is compatible with SIWS.

chainId is consistent with sRFC 31: Compatibility of Blinks and Actions.

issuedAt should be validated by the action api during its signature verification process in order to perform any desired expiration checks (e.g. was this sign request issued in the last 10 minutes? If not, my api will reject it).
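
A minimal sketch of such checks is shown below, including the 10 minute expiration window from the example. The function name, the `expected` parameter, and the in-memory set of used nonces are assumptions made for illustration; a real api would choose its own storage and limits.

```typescript
type SignMessageData = {
  domain: string;
  address: string;
  statement: string;
  nonce: string;
  issuedAt: string;
  chainId?: string;
};

const MAX_AGE_MS = 10 * 60 * 1000; // assumed limit: the 10 minute window from the example

// Returns null when the data passes every check, otherwise a reason for rejecting it.
function checkSignMessageData(
  data: SignMessageData,
  expected: { domain: string; account: string },
  usedNonces: Set<string>,
  now: Date = new Date(),
): string | null {
  if (data.domain !== expected.domain) return "unexpected domain";
  if (data.address !== expected.account) return "address does not match the signing account";
  if (data.statement.includes("\n")) return "statement contains a new line";
  if (!/^[a-zA-Z0-9]{8,}$/.test(data.nonce)) return "malformed nonce";

  const issuedAt = Date.parse(data.issuedAt);
  if (Number.isNaN(issuedAt)) return "issuedAt is not a valid datetime";
  const ageMs = now.getTime() - issuedAt;
  if (ageMs < 0 || ageMs > MAX_AGE_MS) return "sign request expired or issued in the future";

  // Record the nonce only after every other check has passed.
  if (usedNonces.has(data.nonce)) return "nonce already used";
  usedNonces.add(data.nonce);
  return null;
}
```

Returning a reason string rather than a boolean makes it easier for the api to log or report why a request was rejected.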

At a minimum, the required fields in the structured message should be presented to the user at or before they are prompted to sign said message.

@nickfrosty, this is great

General question - is there a reason for enforcing a consistent structure, and specifically a SIWS-like structure, for every signed message in the context of Blinks?

My understanding is that SIWS is great for initial authentication, but it is not strictly needed for every single message signing operation in solana. There can be application specific message signing operations, that are not SIWS compliant and this is fine. Imo, same applies to Blinks.

From the SIWS docs:

SIWS aims to standardize message formats in order to improve authentication UX and security, replacing the traditionally clunky connect + signMessage flow with a one-click signIn method.
SIWS shifts the responsibility of message construction from dapps to the wallet.

I like the idea of SIWS, but I would propose to allow more freedom and let developers decide on the final message structure by supporting arbitrary message signing, instead of enforcing a SIWS-like SignMessageData structure. The approach below also has less coupling by relying on composition:

export interface SignMessageResponse {
  type: "sign-message"
  data: string // can still be SIWS-like message if it's needed by use-case
  state?: string;
  message?: string;
  links: {
    next: PostNextActionLink;
  };
}

This way we still can support both options:

a) use SIWS via composition, we even can provide utility functions for this
b) use arbitrary business specific messages

Like the idea of having optional message signature, can serve as an extra security level if use-case needs it. Do you have an idea or example of what attack can be prevented by verifying that the initial sign message request came from their server?

I like the idea of SIWS, but I would propose to allow more freedom and let developers decide on the final message structure by supporting arbitrary message signing, instead of enforcing a SIWS-like SignMessageData structure. The approach below also has less coupling by relying on composition

Developers still have the ability to ask for any message to be signed via the statement field in my proposed SignMessageData. This is the plain text message that the user will see when signing. So they get the same composability and business logic specific messages put into the statement field.

Enforcing a structure like this allows wallets to present all of this info to the user, so the user can be sure that the data is coming from a place that they expect. Since blinks can (eventually) be on any domain, having this data provided to the wallet can help boost user confidence and present a better UX in the wallet signing modal itself.

The fact that it is compatible with SIWS is sort of just a nice side effect and convenience.

Do you have an idea or example of what attack can be prevented by verifying that the initial sign message request came from their server?

Since anyone and any bot can send a signed message to the api server, including an HMAC allows the api server to validate the initial request came from their server to begin with (and do other logic checks like expiration checks, etc).

hey everyone, thanks for moving sign message forward.
I agree with @tsmbl that SIWS seems to be too specific for a feature called “sign message”.
I would vote to keep sign message generic, as it’s done in the wallets, and create a separate sRFC for SIWS support in actions/blinks if needed. As a rule of thumb, I’d propose keeping the same level of abstraction that is used for wallets and for the wallet standard, and not overthinking it for developers. I believe a lot of devs might want to use a good old low-level sign message for their needs.

Thanks for the details! I understand the general idea, but I can’t think of a use case where this is useful.

Regarding bots: in my understanding, any bot can first make a request to the API to generate the message to be signed, so what difference does having an HMAC make?

Do you have an example use-case in mind, that demonstrates in which context this extra validation is needed? I guess this feature is inspired by some user request or specific idea of Blink, so curious to learn more about the specifics.

the state can help so that the server does not need to store the state of the message-to-be-signed, since it can be verified via it being an HMAC string (if they choose to use it).
also to note, since the state is an arbitrary string that should not be modified by the client, they can also pass any other data if they want. HMAC was more of just an example of something to pass, which can be used for extra verification/validation if the api provider wants to use it

One example use case for sign message is if I want to integrate non-blockchain actions that are still done by the same user/wallet. Take, for example, doing email verification for a user who is signing up for a platform via a blink. Why should I pay, or force the user to pay, a tx fee (however small) just to move on to the next action in a multi-action blink?

Or, as another example, if I’m playing a game via a blink, not every action/move the user makes is a tx, especially for games that are not fully on-chain.

In both cases, SIWS is too specific because I don’t want to force a user to sign in every time they take an action that doesn’t require a tx.

Hey all, chiming in here to get this initiative unblocked. Had some IRL chats including with Jon. Plan based on those conversations is as follows:

  1. We’re going to get started on an implementation here alongside a couple design partners, specifically DRiP.
  2. We’ll start with Nick F.'s proposal on a more opinionated implementation as is proposed here in the original post. While this isn’t strictly sign message in the most general sense, & I am concerned about that from an educational & specification perspective, I understand the desire to enforce a stricter message structure for safety & security.
  3. As we proceed with the implementation, we’ll report back if we learn that maintaining the original, generic implementation of sign message is what teams want.

Let’s get some great experiences shipped.

This is good feedback rexap. Let’s connect on Telegram and discuss what you’re building. My username is @ chrisoss if you want to send me a message.

@nickfrosty if the above sounds good, can you update the actions spec types to reflect this? Then we’ll have what we need to get started.

there is another proposal here (SRFC #32) that covers “optional transactions” in actions/blinks.

effectively, it proposes the ability for actions to only make a POST request and never ask the user to sign anything. I think this exact proposal is what you would be asking for here: “non-blockchain actions”?
