sRFC 28: Blinks Chaining

Blinks as they exist now have a single depth to interactions. You fetch the entrypoint of the blink with a GET request, then use one of the button actions to make a POST request to the server to fetch a transaction with the user’s account info.

This single depth prevents good error handling, and can be easily expanded to allow for blink chaining by just reusing the ActionGetResult for the POST request (with an optional transaction field) and rerendering the blink.

This allows branching logic on the part of the blink, where actions can be shown specific to the user’s account, and even multiple transactions could be carried out. This also allows for better error messages and error handling, giving more info to the end user.

So what does this look like?

Currently ActionGetResponse looks like this:

export interface ActionGetResponse {
  /** url of some descriptive image for the action */
  icon: string;
  /** title of the action */
  title: string;
  /** brief description of the action */
  description: string;
  /** text to be rendered on the action button */
  label: string;
  /** optional state for disabling the action button(s) */
  disabled?: boolean;
  /** optional list of related Actions */
  links?: {
    actions: LinkedAction[];
  };
  /** optional (non-fatal) error message */
  error?: ActionError;
}

We can expand it to add:

export interface ActionUnifiedResponse {
  /** url of some descriptive image for the action */
  icon: string;
  /** title of the action */
  title: string;
  /** brief description of the action */
  description: string;
  /** text to be rendered on the action button */
  label: string;
  /** optional state for disabling the action button(s) */
  disabled?: boolean;
  /** optional list of related Actions */
  links?: {
    actions: LinkedAction[];
  };
  /** optional (non-fatal) error message */
  error?: ActionError;
  /** optional b64 encoded transaction */
  transaction?: string;
}

This would get rid of the ActionPostResponse completely, and just use this unified response for all blinks.

So what would the flow look like?

  1. The user fetches the entrypoint to the blink with a standard GET request and gets a list of actions.
  2. The user clicks on an action, which makes a POST request with their account to the given action URL.
  3. This returns a new ActionUnifiedResponse with an optional transaction they can sign, plus a rerender of potential links or any errors the server had. Errors cover cases where the account + path from the previous request resulted in a transaction that didn’t work (like if that account was specifically timed out from interacting with that path).
  4. Loop until done.
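The loop above can be sketched roughly as follows. This is only an illustration of the proposed ActionUnifiedResponse flow: `PostFn`, `SignFn`, `runChain`, and the in-memory `demoServer` are hypothetical names standing in for the real HTTP POST and wallet signing steps, `LinkedAction` and the error shape are reduced to the fields needed here, and the "user" always clicks the first action.

```typescript
// Reduced shapes; the real LinkedAction and ActionError carry more fields.
interface LinkedAction { href: string; label: string; }

interface ActionUnifiedResponse {
  icon: string;
  title: string;
  description: string;
  label: string;
  disabled?: boolean;
  links?: { actions: LinkedAction[] };
  error?: { message: string };
  /** optional b64 encoded transaction */
  transaction?: string;
}

// Hypothetical stand-ins for "POST the account to the action URL" and "ask the wallet to sign".
type PostFn = (href: string, account: string) => ActionUnifiedResponse;
type SignFn = (transaction: string) => void;

// Walks the chain, always picking the first action, and returns the titles rendered along the way.
function runChain(
  entry: ActionUnifiedResponse,
  account: string,
  post: PostFn,
  sign: SignFn,
  maxSteps = 10, // assumption: a client-side guard against endless chains
): string[] {
  const rendered: string[] = [entry.title];
  let current = entry;
  for (let step = 0; step < maxSteps; step++) {
    const choice = current.links?.actions[0];
    if (current.disabled || !choice) break; // nothing left to click
    current = post(choice.href, account);
    if (current.transaction) sign(current.transaction); // the transaction is optional
    rendered.push(current.title);
  }
  return rendered;
}

// In-memory "server" used only for this demo.
const demoServer: PostFn = (href) =>
  href === "/step1"
    ? {
        icon: "", title: "Step 2", description: "", label: "Next",
        transaction: "dHgx",
        links: { actions: [{ href: "/step2", label: "Finish" }] },
      }
    : { icon: "", title: "Done", description: "", label: "Done", disabled: true };
```

A real client would wait for user clicks and for transaction confirmation between iterations; the synchronous loop here only shows the order of the steps.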

Notes

  1. Does chaining mean that servers have to hold state?

    1. Not necessarily, all state can be URL encoded as path params
  2. What are some other cool things this allows?

    1. Right now one of the biggest challenges is that generating blinks unique to a user has to be done off platform: if you want users to have their own blinks, they have to go to Telegram or Discord or a website or something to generate a blink, because Twitter API access is trash. With this you could give out new blinks to users on Twitter itself.
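As a rough illustration of the first note (keeping the server stateless by carrying state in the URL), here is a minimal sketch. `ChainState`, `withState`, and `readState` are hypothetical names, and a query string is used here as one possible encoding; a path segment would work the same way.

```typescript
// Hypothetical per-user chain state that the server would otherwise have to store.
interface ChainState { step: number; choice: string; }

// Encode the state into the href handed out for the next action.
function withState(baseHref: string, state: ChainState): string {
  const query = `step=${state.step}&choice=${encodeURIComponent(state.choice)}`;
  return `${baseHref}?${query}`;
}

// Recover the state from the href of an incoming POST.
function readState(href: string): ChainState {
  const query = href.split("?")[1] ?? "";
  const pairs: Record<string, string | undefined> = {};
  for (const part of query.split("&")) {
    const [key, value = ""] = part.split("=");
    pairs[key] = decodeURIComponent(value);
  }
  return { step: Number(pairs["step"] ?? 0), choice: pairs["choice"] ?? "" };
}
```

Anything placed in the URL is visible to and editable by the client, so a server relying on this would still need to validate the state it reads back.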

Hi @spacemandev this is a great idea, and definitely something I’m interested in.

For a secure authenticated session to take place, there needs to still be some form of transaction or signature. In the current implementation of Blinks only transactions are supported (looks like signatures are coming soon) - and it seems that once a transaction is sent, the blink reaches a final state. I would love to see this proposal also enable multiple subsequent states after a transaction takes place


Actually, this implementation would support secure sessions. After the first transaction (which can be a memo ix or a transfer of lamports from and to the user), you can give a JWT to the user encoded in the query parameters.

This allows you to do a multi-state system with a secure session.


I like this general idea of “blink chaining”. Support for chaining the Actions API responses in a sequential way like this could open some interesting designs and applications for actions/blinks.

Aside: I think technically it should be titled “actions chaining” since the blink is just the renderer and not always required to be used. The actions are being chained and can be client agnostic.

Questions:

  1. How many actions could be chained together? No limit or some limit?

Having no limit might be nice for some use cases, but I suspect it will lead to a lot of user drop-off while submitting actions and signing transactions. Chaining alone might lead to some users getting confused when the UI gets updated on the subsequent chained action if the messaging in the previous action is unclear (I guess these are all general UI design best practices and not necessarily actions-specific though).

  2. How would the action-aware client (like a blink) know to stop requesting for subsequent chained actions? Is it simply if the proposed UnifiedResponse has no links.actions declared?

In the current actions spec, if the GET response does not provide links.actions, then a single button is expected to be rendered in the client and the POST request is made to the same url as the GET request (aka backwards compatible with Solana Pay transaction requests). However, if links.actions does exist, then the POST request is made to the corresponding href value for the action a user submits via button click.

  3. What should happen if the action api returns a transaction and the rest of the metadata to support chaining an additional action? Should the user be prompted to sign the transaction before rendering the new chained actions? Should it be after?

  4. Why remove the ActionPostResponse in favor of your proposed ActionUnifiedResponse that seems to only add the optional transaction field?

Even in your proposal, it seems like after the very first GET response to collect the initial metadata and available actions, you are proposing to continue to make POST requests. Why not just add the optional transaction field to the ActionPostResponse interface? If this transaction field exists, it is the action api implicitly saying “I am performing action chaining”

  5. With the idea of action chaining via your proposed “unified response” interface, should the error field become fatal, halting the chain of events? Or should the only error that halts the chain be a proper HTTP error-coded response returned from the action api?
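For reference, the current-spec behavior described above (where the POST goes depending on whether links.actions is present) can be sketched like this. `Button` and `buttonsFor` are hypothetical helper names, the interfaces are trimmed to the fields involved, and hrefs are passed through unresolved.

```typescript
interface LinkedAction { href: string; label: string; }

// Trimmed to the fields that decide which buttons get rendered.
interface ActionGetResponse {
  label: string;
  links?: { actions: LinkedAction[] };
}

// Hypothetical client-side view model: one entry per rendered button.
interface Button { label: string; postUrl: string; }

function buttonsFor(getUrl: string, res: ActionGetResponse): Button[] {
  const actions = res.links?.actions;
  if (!actions) {
    // No links.actions: a single button that POSTs back to the GET url
    // (the Solana Pay compatible case).
    return [{ label: res.label, postUrl: getUrl }];
  }
  // links.actions present: each button POSTs to its own href.
  return actions.map((a) => ({ label: a.label, postUrl: a.href }));
}
```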

Also, with a goal to maintain backwards compatibility between action-aware clients that support action chaining and those that do not, do you have concerns or thoughts on how an action api should react if the client does not perform the chaining?

Without another change to the spec for action-aware clients somehow declaring “what features they support” (which I dislike the idea of), there is no way for an action api to know if their request to chain multiple actions will actually be facilitated by the client UI.

  1. How many actions could be chained together?
  • With the session-based model on actions, it’d be up to the app developer to make sure they aren’t making their sessions so long that users drop off, and that they have a good handle on recovery
  2. How would the action-aware client (like a blink) know to stop requesting for subsequent chained actions? Is it simply if the proposed UnifiedResponse has no links.actions declared?
  • Yes, you could still have the ActionGet flow be the kickoff flow to remain backwards compatible and have no links.actions on that. You “end” the session when the user is done pressing buttons, not when there’s no links.actions left. I’m assuming the final branch of a flow would just return disabled as true to end the session
  3. What should happen if the action api returns a transaction and the rest of the metadata to support chaining an additional action? Should the user be prompted to sign the transaction before rendering the new chained actions? Should it be after?
  • Definitely before. As a follow on, it’s also important that there’s a way to fetch the txn signature from the client and then have it in the context of the following post request so the backend can confirm it
  4. Why remove the ActionPostResponse in favor of your proposed ActionUnifiedResponse that seems to only add the optional transaction field?
  • You’re confusing ActionPostResponse with ActionGetResponse. ActionPostResponse in the current spec only has transaction and message fields. Specifically the reason you wouldn’t want to add transaction field to ActionGetResponse is because passive rendering of an action (such as through a blink on twitter) should not popup a client transaction. It’d be pretty bad UX if you were scrolling on twitter and your wallet kept popping up. Only after the first GET should there be an option of POST requests with transactions
  5. With the idea of action chaining via your proposed “unified response” interface, should the error field become fatal, halting the chain of events? Or should the only error that halts the chain be a proper HTTP error-coded response returned from the action api?
  • I don’t think the error field should become fatal. It can be used for a “try again” by users for example when the transaction cannot be confirmed or in case of other app logic.
  6. In terms of backwards compatibility, this is a tough one. You could add a client version header on the initial GET request maybe?
  • Yes, you could still have the ActionGet flow be the kickoff flow to remain backwards compatible and have no links.actions on that. You “end” the session when the user is done pressing buttons, not when there’s no links.actions left. I’m assuming the final branch of a flow would just return disabled as true to end the session

Makes sense. I think using disabled to end the chaining session would be the easiest implementation for sure. For some reason, my brain does not like it and it seems like not the best dev experience, but I cannot articulate why…

  • Definitely before. As a follow on, it’s also important that there’s a way to fetch the txn signature from the client and then have it in the context of the following post request so the backend can confirm it

A few people have suggested wanting some sort of “callback” functionality to verify the signature on their server side. With action chaining, it makes sense to desire this too. But on the flip side, since the transaction id is being provided by the client, it can easily be spoofed.
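One way to picture the spoofing concern: the server treats the client-supplied signature only as a hint and checks it against what it can look up itself. The sketch below assumes a hypothetical `TxLookup` function (standing in for whatever RPC query the backend uses) and a simplified `ConfirmedTx` shape; neither is part of the spec.

```typescript
// Simplified view of a confirmed transaction, as the backend might see it after an RPC lookup.
interface ConfirmedTx { signers: string[]; succeeded: boolean; }

// Hypothetical lookup: returns undefined when the signature is unknown on chain.
type TxLookup = (signature: string) => ConfirmedTx | undefined;

// Accept the client's claim only if the transaction exists, succeeded,
// and was actually signed by the claimed account.
function isGenuine(signature: string, account: string, lookup: TxLookup): boolean {
  const tx = lookup(signature);
  if (!tx || !tx.succeeded) return false;
  return tx.signers.indexOf(account) !== -1;
}
```

A production check would likely also confirm that the transaction is the one the action api issued (for example by inspecting its instructions), which this sketch leaves out.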

You’re confusing ActionPostResponse with ActionGetResponse. ActionPostResponse in the current spec only has transaction and message fields. Specifically the reason you wouldn’t want to add transaction field to ActionGetResponse is because passive rendering of an action (such as through a blink on twitter) should not popup a client transaction. It’d be pretty bad UX if you were scrolling on twitter and your wallet kept popping up. Only after the first GET should there be an option of POST requests with transactions

Ohh you are right. I misthought / mistyped. I meant in the PostResponse to add the same fields as the GetResponse. Updating the interface to be this is what I meant to suggest:

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse extends ActionGetResponse {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
}

Specifically the reason you wouldn’t want to add transaction field to ActionGetResponse is because passive rendering of an action (such as through a blink on twitter) should not popup a client transaction. It’d be pretty bad UX if you were scrolling on twitter and your wallet kept popping up.

Totally agree lol. This is also why the initial GET request has no body payload since if it did, it would in theory send the user’s wallet address to every blink on the page. Bad experience and removes the user’s ability to interact or not interact with a specific blink.

  • I don’t think the error field should become fatal. It can be used for a “try again” by users for example when the transaction cannot be confirmed or in case of other app logic.

Makes sense, but the current non-fatal error does not follow this “try again” flow really.


Currently the ActionPostResponse looks like

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
}

When the user signs the transaction, the action server has no way to verify or know that the user has signed it. The server therefore needs to scan all the transactions for the given programId and check whether the user has made the transaction. Scanning all the transactions is not a feasible approach; it can be replaced by a callback URL. The callback URL would accept a signature (signed by the user) and the account (base58-encoded representation of the public key of the user) in the body to link the transaction.

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
  /** callback URL to be called after the transaction is confirmed */
  callback?: string;
}

After the user has signed the transaction, the blink-client would use the callback URL field to send an HTTP POST request with the following JSON payload:

/**
 * Response body payload returned from the Callback Post Request
 */
export interface CallbackPostResponse {
  /** user signature */
  signature: string;
  /** base58-encoded representation of the public key of the user  */
  account: string;
}
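On the receiving side, the action server would presumably want to validate this payload before acting on it. A minimal sketch, assuming the body arrives as a raw JSON string; `parseCallbackBody` is a hypothetical helper, and it only checks the shape, not whether the signature is real.

```typescript
interface CallbackPostResponse {
  /** user signature */
  signature: string;
  /** base58-encoded representation of the public key of the user */
  account: string;
}

// Returns the parsed payload, or null when the body is not valid JSON
// or does not carry both fields as non-empty strings.
function parseCallbackBody(raw: string): CallbackPostResponse | null {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof body !== "object" || body === null) return null;
  const { signature, account } = body as Record<string, unknown>;
  if (typeof signature !== "string" || signature.length === 0) return null;
  if (typeof account !== "string" || account.length === 0) return null;
  return { signature, account };
}
```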

with regards to @spacemandev’s proposal for action chaining, how does your callback idea fit into it?

dev’s proposal suggests the post response would return the optimistic UI items that should be rendered after the previous transaction is successful, allowing 1 less network request and the user to immediately interact with the next action in the chain (as if it was a freshly rendered blink)

for your callback proposal, is the purpose to simply tell the action api server that the transaction was successful or to get the next action in the chain only after it was successful? or something else?


For the callback proposal, the idea was simply to tell the action API server that the transaction was successful so that the API server can trigger any event linked to the user’s successful tx.
Now that I think of it, in addition to the @spacemandev proposal of having the optimistic UI items in the post-response, the post-response should have 3 options:

  1. success optimistic ui - to be rendered after the previous transaction was successful

  2. failed optimistic ui (optional) - to be rendered after the previous transaction failed. [this would help in cases where the rendered blink is the last in the chain]

  3. callback URL - to enable a confirmation of the transaction to the action server. This would remove the overhead on the action server of scanning all the transactions related to that programId and also encourage action developers to use third-party programs which otherwise would have a lot of on-chain transactions (for example, a Jupiter DCA)

This can now open some interesting design space for developers and create better UX (instead of having 5-6 input fields in 1 rendered blink, they can be distributed among 2-3 steps)


Wow, lots of good stuff here. I am learning a lot about blinks chaining


After chatting some more with @spacemandev on this, we are leaning towards the following to both enable support for action chaining and provide the callback functionality. This will both prevent the new feature from being a breaking change and play nice with other open spec change proposals and expected future ones (like message signing).

Note: The interfaces listed below may be simplified versions of their final implementations. To improve the type safety and DX, the specific names and types may be adjusted but will accomplish the same functionality.

Proposal

Update the ActionPostResponse to allow passing a url to discover the next action in the chain (via callback) or including the metadata for the next action in the response itself (without making a callback):

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
  /** support action chaining */
  links?: {
    /**
     * - when `next` is type=`string` aka url => make a POST to this address with a payload of `NextActionPostRequest` to retrieve the next action
     *    - note: this url is required to be same origin as the POST request returning this response
     * - when `next` is type=`ActionGetResponse` => after transaction is confirmed, render this data
     */
    next: string | ActionGetResponse;
  };
}

export interface NextActionPostRequest extends ActionPostRequest {
  /** signature produced from the previous action (either a transaction id or message signature) */
  signature: string;
}

Explanation

When any ActionPostResponse includes the links.next attribute, an action chain is created/continued.

The links.next can either be a string url or the blink metadata (ActionGetResponse) that will trigger the chaining after the provided transaction is confirmed.

If the links.next is a string url, it is required to be from the same origin that creates the ActionPostResponse. After the transaction is confirmed, the blink-client should make a POST request to the provided links.next url with the user’s wallet address (account) and confirmed transaction id (signature) in the POST body as JSON.

Think: on transaction confirmed => perform callback (with the transaction id) to get the next action in the chain

If the links.next is an object of ActionGetResponse, after the transaction is confirmed, this blink metadata should be rendered to the user (effectively making it appear as if it were a new blink) and no callback to the Actions API is made.

Think: on transaction confirmed => render this blink metadata, I don’t need to confirm the transaction on my backend

If an Action provider needs to track any state between their chained actions, they must handle/validate that themselves. For example, using a query param to track the user’s current “step in the action chain”.

When an Action provider wants to stop the action chain, they can return the final action as one with a disabled=true, which will stop the user from continuing via the blink UI.

An action chain can effectively be any length as long as the user continues to sign and confirm transactions and the action api continues to return signable transactions.
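To make the two links.next branches concrete, here is a minimal client-side sketch of what to do once the transaction is confirmed. `NextStep` and `planNextStep` are hypothetical names, the interfaces are trimmed, `ActionPostRequest` is assumed to carry only `account`, and resolving a relative links.next against the POST url is also an assumption of this sketch.

```typescript
interface ActionGetResponse {
  icon: string;
  title: string;
  description: string;
  label: string;
  disabled?: boolean;
}

interface ActionPostRequest { account: string; }

interface NextActionPostRequest extends ActionPostRequest {
  /** signature produced from the previous action */
  signature: string;
}

interface ActionPostResponse {
  transaction: string;
  message?: string;
  links?: { next: string | ActionGetResponse };
}

// Hypothetical description of what the client does after confirmation.
type NextStep =
  | { kind: "done" }
  | { kind: "render"; action: ActionGetResponse }
  | { kind: "callback"; url: string; body: NextActionPostRequest };

function planNextStep(
  postUrl: string, // absolute url of the POST that returned `res`
  res: ActionPostResponse,
  account: string,
  signature: string,
): NextStep {
  const next = res.links?.next;
  if (next === undefined) return { kind: "done" }; // no chain
  if (typeof next !== "string") return { kind: "render", action: next }; // inline metadata, no callback
  const target = new URL(next, postUrl);
  if (target.origin !== new URL(postUrl).origin) {
    throw new Error("links.next must be same origin as the POST request");
  }
  return { kind: "callback", url: target.href, body: { account, signature } };
}
```

The caller would then either render `action` directly or POST `body` as JSON to `url` to retrieve the next action.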


@nickfrosty, this is a great update! I like the idea of links.next, it looks consistent and explicit.

This will definitely work and allow to stop the chain. There is one case that came to my mind related to existing client behaviour. Currently blinks have a Completed state indicating the success of execution.

As a developer I would like to preserve this behaviour and be able to clearly indicate the Completed state at the end of the chain. I believe setting disabled=true is not sufficient to cover this case.

As an idea, we could indicate the Completed state explicitly. I see at least 2 options:
a) Support 2 types in Chained POST Response to explicitly indicate terminal state, for example

export type NextActionResponse = Action | CompletedAction;

/** A response indicating next action to be displayed to a user */
export interface Action extends ActionGetResponse {
  type: 'action';
}

/** A response indicating that terminal action state reached */
export interface CompletedAction {
  type: 'completed-action';
}

b) Somehow indicate Completed state in the GET Response, could be something like

  • completed=true
  • variant='completed'
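For option (a), the `type` field acts as a discriminant, so a client could branch on it without inspecting any other field. A small sketch under that assumption; `describeNext` is a hypothetical helper and `ActionGetResponse` is trimmed to a few fields.

```typescript
interface ActionGetResponse {
  icon: string;
  title: string;
  description: string;
  label: string;
}

type NextActionResponse = Action | CompletedAction;

/** A response indicating next action to be displayed to a user */
interface Action extends ActionGetResponse {
  type: "action";
}

/** A response indicating that terminal action state reached */
interface CompletedAction {
  type: "completed-action";
}

// The compiler narrows `next` inside each case, so `title` is only reachable for "action".
function describeNext(next: NextActionResponse): string {
  switch (next.type) {
    case "action":
      return `render next action: ${next.title}`;
    case "completed-action":
      return "render the client's built-in Completed state";
  }
}
```

Note that `CompletedAction` carries no metadata in this shape, so the client has nothing custom to show at the end of the chain.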

I think we can simplify a bit by keeping a single option for chaining - using the POST request. Rationale:

  • Chaining via POST request already covers all functional cases and provides broader functionality compared to chaining via ActionGetResponse.
  • Having fewer options is better for end developer experience.

In my opinion the only potential benefit of chaining using ActionGetResponse is saving one network call, but the performance benefits are not clear to me in this case. So, not sure if ActionGetResponse case needs to be specially covered.

Currently blinks have a Completed state indicating the success of execution.

This is all in the blink-client though right? This same flow would be preserved if the final action in the chain simply does not return links.next, effectively just like now.

If we consider the current actions to have an “action chain length of 1”, the “completed” state that the client displays shows after the transaction is confirmed and there is no links.next (since the spec does not exist yet). If there are more actions in the chain, the user can continue to execute them until there are no longer any “next actions” set; then the blink-client can render this completed state.

Support 2 types in Chained POST Response to explicitly indicate terminal state,

I do think having a way for action api’s to explicitly declare the final action in the chain is a good idea. Adding a type field to the GetResponse (which we should likely rename now lol) I think is the best way to declare this. Add in type with a default of action for backwards compatibility.

Your proposed CompletedAction has a drawback of not allowing the action api to update the metadata at the very end of the chain which will be very useful (and I suspect will be very commonly used).

I think we should make sure to support this case, and be able to smartly handle both cases of the final actions wanting to update the metadata AND final actions not wanting to update the metadata.

In my opinion the only potential benefit of chaining using ActionGetResponse is saving one network call, but the performance benefits are not clear to me in this case. So, not sure if ActionGetResponse case needs to be specially covered.

Saving one network call is still useful imo. But the other benefit that I see is for showing the final “completed” state’s metadata at the end of the chain. After the final action is confirmed, the client already has the metadata to display and update the UI with whatever “completed metadata” the action api wants the user to see.

For example, if a blink is used to mint an nft, the api can return the nft artwork and other metadata when they send the transaction to the user. After the transaction is confirmed, show the nft the user just minted


Yes, it’s about current blink-client behaviour, that is possible because client assumes that there’s a single action.

I think there is a scenario where this criterion is not sufficient: when the final action in the chain requires submission of a signature. Let’s say as an action developer I would like to submit a tx or message signature to the actions API as a final step, and no further user activity is required afterwards, so I expect the Completed state to be rendered. Based on the current proposal, links.next is a part of the POST response, but linking always returns ActionGetResponse, so this response will be rendered instead of the Completed state.

Yes, agreed here. So, option (a), as formulated above, doesn’t fit in this framing. But I still believe we need an explicit Completed state indication somewhere to achieve this.

This sounds close to option (b) from my message above. Can you elaborate on adding type to ActionGetResponse? Do you expect to have separate data structures for each type, or type should serve as a Completed state indicator?

Yes, def should rename, lol.

From my perspective, optimizing to save a network call at this stage might be a bit premature. Using POST as a single mechanic can achieve the same UX; for example, the same approach has been successfully implemented and proven to work in Farcaster frames. Generally, I think it’s simpler and more consistent to have a single option if we don’t sacrifice flexibility or feature completeness.


This sounds close to option (b) from my message above. Can you elaborate on adding type to ActionGetResponse? Do you expect to have separate data structures for each type, or type should serve as a Completed state indicator?

Let’s say we update the spec types/interfaces to this:

/**
 * A single Solana Action
 */
export interface Action {
  /**
   * @default `action`
   */
  type?: "action" | "completed";
  /** image url that represents the source of the action request */
  icon: string;
  /** describes the source of the action request */
  title: string;
  /** brief summary of the action to be performed */
  description: string;
  /** button text rendered to the user */
  label: string;
  /** UI state for the button being rendered to the user */
  disabled?: boolean;
  /** optional links to related Actions */
  links?: {
    /** list of related Actions a user could perform */
    actions: LinkedAction[];
  };
  /** non-fatal error message to be displayed to the user */
  error?: ActionError;
}

/**
 * Response body payload returned from the initial Action GET Request
 */
export type ActionGetResponse = Action;

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
  /** support action chaining */
  links?: {
    /**
     * - when `next` is type=`string` aka url => make a POST to this address with a payload of `NextActionPostRequest` to retrieve the next action
     *    - note: this url is required to be same origin as the POST request returning this response
     * - when `next` is type=`Action` => after transaction is confirmed, render this data
     */
    next: string | Action;
  };
}

export interface NextActionPostRequest extends ActionPostRequest {
  /** signature produced from the previous action (either a transaction id or message signature) */
  signature: string;
}

The initial GET response returns the metadata to render the blink. When the user clicks a button to begin interacting with an action, it makes the POST request (just as it does now).

If an action developer does NOT want to chain an action, they do NOT return links.next. After the transaction is confirmed, the blink-client can render the “completed” state, just as it does now.

Any time ActionPostResponse does NOT return links.next, the blink-client knows this is the final transaction in the chain and can render the completed state after the transaction is confirmed. Just as they do now.

If an action developer wants to chain an action:

  • the POST response returns a transaction and includes the links.next which is the callback url (same origin only) to give the api server the tx id and fetch the next action
    • since the links.next was declared, you know the action chain is not yet in the final “completed” state
  • after the transaction is confirmed, the blink-client makes a POST request to this links.next url which returns the next action (aka Action aka ActionGetResponse) which declares a type value to determine what the UI should do, including handling the “completed” state
    • type=action (the default value) => regular action chain. not the terminal/completed state
    • type=completed => terminal state of the action. this allows developers to return updated metadata for the UI to render after the transaction is confirmed. (aka the completed state, but with custom updated metadata)
  • if this action includes either type=action or does not declare type since it defaults to action, then this response is the next action in the chain and the user can interact with it as if a “fresh action”. it goes through the lifecycle of requests to ultimately get the ActionPostResponse with another transaction and maybe another chained action via links.next
  • this loop will continue until either:
    • no links.next is returned via the ActionPostResponse (implicitly declaring “this is the final action”). the blink-client can then render a standard “completed” ui just like it does now. no additional UI metadata gets updated (like the image, title, description, etc)
    • or the action api returns type=completed, the blink-client now explicitly knows this is the final action in the chain and can render the “completed” state and also update the metadata displayed to the user (since this payload just gave it to you)
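The end-of-chain rules in the list above can be condensed into one decision function. This is a sketch only: `UiState` and `uiAfterConfirmation` are hypothetical names, `Action` is trimmed, and the `resolve` parameter stands in for the POST to the links.next callback url.

```typescript
interface Action {
  /** @default `action` */
  type?: "action" | "completed";
  icon: string;
  title: string;
  description: string;
  label: string;
}

interface ActionPostResponse {
  transaction: string;
  links?: { next: string | Action };
}

// Hypothetical UI outcomes once the transaction is confirmed.
type UiState =
  | { view: "completed-default" } // no links.next: standard completed UI, metadata unchanged
  | { view: "completed-custom"; metadata: Action } // type=completed: terminal state with updated metadata
  | { view: "next-action"; metadata: Action }; // type=action (or omitted): keep chaining

function uiAfterConfirmation(
  res: ActionPostResponse,
  resolve: (url: string) => Action, // stand-in for POSTing to the callback url
): UiState {
  const next = res.links?.next;
  if (next === undefined) return { view: "completed-default" };
  const action = typeof next === "string" ? resolve(next) : next;
  return (action.type ?? "action") === "completed"
    ? { view: "completed-custom", metadata: action }
    : { view: "next-action", metadata: action };
}
```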

I also think the type value of completed is maybe not the best word to use. Something closer to the desire to “update the UI with this metadata” would fit better.

The type interface for Action should likely be updated to not allow including links when type=completed since this is effectively the terminal state of metadata to render to the user, and no further action exists in the chain.
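One possible way to express that restriction in the types is a discriminated union where the completed variant forbids links. This is a sketch of the idea, not the interface that ended up in the spec; `ActionBase`, `ChainableAction`, `CompletedAction`, and `isTerminal` are hypothetical names.

```typescript
interface LinkedAction { href: string; label: string; }

// Fields shared by both variants.
interface ActionBase {
  icon: string;
  title: string;
  description: string;
  label: string;
  disabled?: boolean;
}

// Regular action: `type` may be omitted (defaults to "action") and links are allowed.
interface ChainableAction extends ActionBase {
  type?: "action";
  links?: { actions: LinkedAction[] };
}

// Terminal state: metadata only, `links` can never be set.
interface CompletedAction extends ActionBase {
  type: "completed";
  links?: never;
}

type Action = ChainableAction | CompletedAction;

// Type guard so clients can branch on the terminal state.
function isTerminal(action: Action): action is CompletedAction {
  return action.type === "completed";
}

// The compiler rejects a completed action that tries to declare links:
// const bad: Action = { type: "completed", icon: "", title: "", description: "", label: "", links: { actions: [] } };
```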

In my opinion the only potential benefit of chaining using ActionGetResponse is saving one network call, but the performance benefits are not clear to me in this case. So, not sure if ActionGetResponse case needs to be specially covered.

I think it will be very useful for people and improve the developer experience of building blinks. It can also be handled by blink-clients with basically a single if statement check.
If you are really hung up on it and so opposed to having it, we can just cut it so we can finalize this sRFC and get action chaining shipped to the public.


@nickfrosty, this looks conceptually complete and should work from the request/response flow and information model perspective.

I think we can figure out the exact semantics for type and polish other aspects such as type safety at later stages during the PR in the spec repo, wdyt?

Not strongly against conceptually, we can keep it if you still think it’s useful. Let’s just ensure it’s well typed and extensible during implementation.


Agreed on getting the types all working well in the PR.
I will work on opening a PR tomorrow then!


I don’t know how I missed this discussion.

The idea of links.next, along with including the signature and POST URL, is a banger.
