sRFC 32: Optional Transactions in Action Chaining

Summary

Transactions in Action Chaining are currently mandatory, so the user has to sign a completely different transaction for each action in the chain. That works for a limited set of use cases, but for the majority of use cases, and for a better UX, the user should sign only once whenever possible, which would mostly be at the last action in the chain.

This is how any form works: it collects information from the user and shows the submit button only at the end. This could open up some interesting design spaces, such as a game played inside the blink where the winner claims the prize when the game ends, or generating NFTs with the help of an AI directly from the blink and then deploying the collection. These are just a few examples in support.

Implementation:

Currently, when the client makes the POST request, an ActionPostResponse with the following structure is returned.

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse<T extends ActionType = ActionType> {
  /** base64 encoded serialized transaction */
  transaction: string;
  /** describes the nature of the transaction */
  message?: string;
  links?: {
    /**
     * The next action in a successive chain of actions to be obtained after
     * the previous was successful.
     */
    next: NextActionLink;
  };
}

If we modify the ActionPostResponse to make the transaction an optional field:

/**
 * Response body payload returned from the Action POST Request
 */
export interface ActionPostResponse<T extends ActionType = ActionType> {
  /** base64 encoded serialized transaction */
  transaction?: string;
  /** if transaction is present, describes the nature of the transaction  */
  message?: string;
  links?: {
    /**
     * The next action in a successive chain of actions to be obtained after
     * the previous was successful.
     */
    next: NextActionLink;
  };
}

When transaction is absent, the client can skip the wallet pop-up asking the user to sign and render the blink according to the NextActionLink.
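
As a minimal sketch of that client-side check (the `OptionalTxResponse`, `ClientStep`, and `nextClientStep` names are hypothetical and not part of the spec, and treating an empty string as "no transaction" is an assumption):

```typescript
// Trimmed-down shape of the proposed response; only the fields used here.
interface OptionalTxResponse {
  transaction?: string; // base64; absent when there is nothing to sign
  message?: string;
}

// What the client should do next (hypothetical names, not from the spec).
type ClientStep =
  | { kind: "sign"; transaction: string }
  | { kind: "continue" };

// Ask the wallet to sign only when the server actually sent a transaction;
// otherwise go straight to rendering links.next.
// Assumption: an empty string is treated the same as a missing field.
function nextClientStep(res: OptionalTxResponse): ClientStep {
  if (typeof res.transaction === "string" && res.transaction.length > 0) {
    return { kind: "sign", transaction: res.transaction };
  }
  return { kind: "continue" };
}
```

Existing responses that always carry a transaction would still take the "sign" branch, so the current flow stays unchanged under this sketch.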

export type NextActionLink = PostNextActionLink | InlineNextActionLink;

/** @see {NextActionPostRequest} */
export interface PostNextActionLink {
  /** Indicates the type of the link. */
  type: "post";
  /** Relative or same origin URL to which the POST request should be made. */
  href: string;
}

/**
 * Represents an inline next action embedded within the current context.
 */
export interface InlineNextActionLink {
  /** Indicates the type of the link. */
  type: "inline";
  /** The next action to be performed */
  action: NextAction;
}

If the NextActionLink is of type InlineNextActionLink then the client can directly render the metadata and no callback would be required.

If the NextActionLink is of type PostNextActionLink, then the action server can return metadata that depends on the previous inputs from the user's address.

It should be up to the action server to serve an InlineNextActionLink or a PostNextActionLink, whichever makes for a better UX in the specific use case.
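
A rough sketch of how a client could resolve the two link kinds. `NextActionSketch`, `Resolution`, and `resolveNext` are hypothetical names, and the request body shape (`account`, plus `signature` only when the previous step was signed) is an assumption about how the callback would look once transactions are optional:

```typescript
// Stand-in for the spec's NextAction; only what this sketch needs.
interface NextActionSketch {
  type: string;
  title: string;
}

type NextActionLinkSketch =
  | { type: "post"; href: string }
  | { type: "inline"; action: NextActionSketch };

// Either render metadata we already have, or describe the callback to make.
type Resolution =
  | { kind: "render"; action: NextActionSketch }
  | { kind: "fetch"; href: string; body: { account: string; signature?: string } };

function resolveNext(
  link: NextActionLinkSketch,
  account: string,
  signature?: string, // undefined when the previous step had no transaction
): Resolution {
  if (link.type === "inline") {
    // Inline: the metadata is embedded, so no callback is required.
    return { kind: "render", action: link.action };
  }
  // Post: the server decides the next metadata based on who is asking.
  const body: { account: string; signature?: string } = { account };
  if (signature !== undefined) body.signature = signature;
  return { kind: "fetch", href: link.href, body };
}
```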

This way, the action server can also track information across a series of actions happening on the client. The idea of a JWT also crosses the mind, but imo having each action linked to an address, with only that address able to sign the transaction on the client, removes a lot of concerns.

2 Likes

I like the idea, allowing user actions without requiring direct wallet interaction can significantly improve the UX for certain use cases, opening up new design and application possibilities.

As we explore this further, adding more action types makes sense, especially with potential future needs like signing messages or navigating to external resources. To support this, I suggest making the implementation stricter and more explicit. Below is a rough idea of how the interfaces could look.

// Can be extended to  "tx" | "post" | "sign-message" | "external-link" in the future
export type LinkedActionType = "tx" | "post";

export interface LinkedAction {
  type?: LinkedActionType; // Can be used to tailor UX/UI depending on action to be performed
  href: string;
  label: string;
  parameters?: Array<TypedActionParameter>;
}

As an idea, we can also define a corresponding response type for each; it could simplify the flow a bit.

export type ActionPostResponse = TxResponse | PostResponse;

export interface TxResponse {
  type?: "tx";
  transaction: string;
  message?: string;
  links?: {
    next: NextActionLink;
  };
}

export interface PostResponse extends Action {}
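
Since `type` is optional in the rough interfaces above, a client cannot rely on it alone to tell the two variants apart. One possible way to narrow the union is a type guard on the `transaction` field, sketched below. `ActionSketch` is a stand-in for the spec's `Action` with a few illustrative fields, and `isTxResponse` is a hypothetical helper:

```typescript
// Stand-in for the spec's Action; the fields listed are illustrative only.
interface ActionSketch {
  title: string;
  description: string;
  label: string;
  icon: string;
}

interface TxResponse {
  type?: "tx";
  transaction: string;
  message?: string;
}

interface PostResponse extends ActionSketch {}

type ActionPostResponse = TxResponse | PostResponse;

// Narrow on the presence of `transaction`, because `type` may be omitted.
function isTxResponse(res: ActionPostResponse): res is TxResponse {
  return "transaction" in res && typeof res.transaction === "string";
}
```

Making `type` required on every variant would let a plain `switch` do the same job, which is one argument for the stricter version discussed later in the thread.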

Anyway, I think we can discuss the final interfaces during the PR in actions-spec.

3 Likes

Agreed on having more action types. I just saw an sRFC requesting external-link, so this implementation will make more sense in the long term.

// Can be extended to  "tx" | "post" | "sign-message" | "external-link" in the future
export type LinkedActionType = "tx" | "post";

I like the idea of defining multiple response types; it will enable more explicit types. I’ll make a PR in actions-spec.

3 Likes

Instead of making the transaction optional, what if we pass the tx base64 back to the server?

Meaning, instead of exposing the tx directly to the client for signing at each step, we pass the tx along to the chained action, so that once the chain of actions is complete, the user signs only once, before the last step. This also ensures every action has a transaction and can be found onchain.

Each chained action should check for a previous / incoming transaction (from the params / body) and append it to the current transaction, constructing the combined transaction before the completed action is reached.

The code for merging the txs might look like this:

import { Transaction } from '@solana/web3.js';

function mergeTransactions(tx1Base64: string, tx2Base64: string): string {
  // Decode base64 transactions
  const tx1Buffer = Buffer.from(tx1Base64, 'base64');
  const tx2Buffer = Buffer.from(tx2Base64, 'base64');

  // Deserialize transactions
  const tx1 = Transaction.from(tx1Buffer);
  const tx2 = Transaction.from(tx2Buffer);

  // Create a new transaction and combine instructions, tx1's first
  const mergedTx = new Transaction();
  mergedTx.add(...tx1.instructions, ...tx2.instructions);

  // Set the feePayer and recentBlockhash from the first transaction
  mergedTx.feePayer = tx1.feePayer;
  mergedTx.recentBlockhash = tx1.recentBlockhash;

  // The merged transaction is still unsigned, so skip the signature checks
  // that serialize() performs by default (it would throw otherwise)
  const mergedBuffer = mergedTx.serialize({
    requireAllSignatures: false,
    verifySignatures: false,
  });

  // Encode the merged transaction back to base64
  return mergedBuffer.toString('base64');
}

2 Likes

I really like this idea and concept @0xaryan: not all actions must result in a transaction. The current “a transaction is required” rule stemmed from the general design of blinks/actions, “actions return transactions”, but there is no reason we need to keep shoehorning everything into that design space anymore (I think).

I also prefer @tsmbl’s recommendation of clarifying the types to make them more extensible. Multiple “response types” will already be needed for the Sign Message functionality, so this is also in line with that.

I think the proposed PostResponse simply extending Action is interesting, since it would also enable actions that never prompt for a transaction or other signable event. The API server could keep returning “action” metadata until it eventually returns type=completed, and the user would never be prompted to sign anything. I don’t see this as something we need to prohibit, but I do think it may require more thought on the implications.

@tsmbl what is the benefit of adding a type into LinkedAction? What “tailor UX/UI” do you foresee?

Should the ActionPostResponse have a single type? My thought is that on the blink/client side, when checking what to do with the POST response, you would now have to check every variant of both the ActionPostResponse’s type and the ActionType.

3 Likes

this would then result in the transaction being alterable on the client side before it is sent back to the server, and that should seriously be avoided!

with @0xaryan’s general idea here, and the existence of action chaining now, you can already build a blink with multiple actions that builds up a complex transaction like the one you are proposing to create by merging the instructions together, except without the security risk of the transaction being alterable on the client side.

it looks like the missing piece for your described flow is what Arun is proposing: not all actions require the user to sign a transaction, effectively just allowing a form submit. so you ultimately build one single transaction for the user to sign, based on all the inputs and action chaining steps you collected
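
A minimal server-side sketch of that flow, where intermediate steps only store form inputs and the final step builds the single transaction. The `ChainSession` class, the in-memory `Map`, and the `buildTransaction` callback are all hypothetical; a real action server would likely persist state elsewhere and build the transaction with its own program logic:

```typescript
type Inputs = Record<string, string>;

// Hypothetical in-memory store keyed by the user's account (an assumption;
// a real action server would likely persist this elsewhere).
class ChainSession {
  private readonly byAccount = new Map<string, Inputs>();

  // Merge this step's form inputs into what the account already submitted.
  record(account: string, inputs: Inputs): void {
    const previous = this.byAccount.get(account) ?? {};
    this.byAccount.set(account, { ...previous, ...inputs });
  }

  // Return everything collected for the account and clear its entry.
  finish(account: string): Inputs {
    const all = this.byAccount.get(account) ?? {};
    this.byAccount.delete(account);
    return all;
  }
}

// Intermediate steps return no transaction; only the last step builds one,
// entirely on the server, from all inputs gathered along the chain.
function handleStep(
  session: ChainSession,
  account: string,
  inputs: Inputs,
  isLastStep: boolean,
  buildTransaction: (account: string, all: Inputs) => string, // hypothetical builder
): { transaction?: string } {
  session.record(account, inputs);
  if (!isLastStep) return {};
  return { transaction: buildTransaction(account, session.finish(account)) };
}
```

Because the transaction never leaves the server until the final response, the client has no earlier copy that it could alter.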

1 Like

I have made a PR which can clarify the whole flow.

It took me time to wrap my head around the whole flow, but it finally made complete sense with Alexey’s idea.

IMO, having type in the LinkedAction can help the blink client look for only specific data fields when the action endpoint returns the data.

for example,

  • in the case of type post, the blink client would know not to look for transaction in the response
  • in the case of type sign-message, the blink client would only look for data: SignMessageData

In a general sense, this creates more explicit types on both the blink client and action sides.

I think enabling this would simply make blinks a metadata standard for URLs that anyone can use to collect any form of data from the user. The ability to sign transactions/messages is just the cherry on top.

1 Like

I saw the PR and I am skimming it now :)

having type in the PostResponse makes sense to me, just not in LinkedAction for this. it seems irrelevant to be in LinkedAction.

for example, the initial GET request to display the first action to a user:

  • the LinkedActions are displayed as form elements the user can interact with
  • no matter if there is a type representing “transaction” or “post” requests, the client will still make a POST request to the href endpoint. this POST request will be made either if type=post or type=transaction
  • the RESPONSE of this POST request should have a type noting what data is being returned for the user to interact with:
    • if type == “transaction” => asking the user to sign the transaction (the current flow of all actions)
    • if type == “sign-message” => ask the user to sign a message
    • if type == “post” => nothing for the user to sign, we just wanted to save the user input fields in our database
    • now we process the links.next action

this same flow applies to chained actions too, not just the initial GET request

I don’t see any need to add a type into LinkedAction, just the PostResponse to determine how the action-client should handle the response
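
The response handling described in this post could be sketched as a switch over a required `type` on the POST response. The shapes below are stand-ins: the `data` field for sign-message follows the `data: SignMessageData` mention earlier in the thread but is simplified to a string here, and `promptFor` is a hypothetical helper:

```typescript
// Stand-in response shapes; the sign-message payload is an assumption.
type TypedPostResponse =
  | { type: "transaction"; transaction: string }
  | { type: "sign-message"; data: string }
  | { type: "post" };

type ClientPrompt = "sign-transaction" | "sign-message" | "none";

// Decide what, if anything, the user is asked to sign for this response.
// In every case the client would then go on to process links.next.
function promptFor(res: TypedPostResponse): ClientPrompt {
  switch (res.type) {
    case "transaction":
      return "sign-transaction";
    case "sign-message":
      return "sign-message";
    case "post":
      return "none"; // the inputs were only saved server-side
    default: {
      // Compile-time exhaustiveness check for future response types.
      const unreachable: never = res;
      throw new Error(`unknown response type: ${JSON.stringify(unreachable)}`);
    }
  }
}
```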

2 Likes

ah, got it. makes sense. i’ll update the PR for the same.

2 Likes

The only potential benefit I see is customization of the button press behaviour & button UI. One example is External Linking: in this case we can change the button to visually display an external link icon somewhere on it, and change the client behaviour to open a new tab instead of making a POST request.

I’ve noticed that you are also proposing to have type in LinkedAction in the Sign Message proposal. Is that a bit outdated based on what was discussed in the scope of this sRFC, or do you have different considerations for having it in LinkedAction in your proposal?

@nickfrosty, the flow you’ve described above looks right to me if we don’t consider External Linking.

2 Likes

> The only potential benefit I see is customization of the button press behaviour & button UI. One example is External Linking: in this case we can change the button to visually display an external link icon somewhere on it, and change the client behaviour to open a new tab instead of making a POST request.

I only think it is needed in the POST response. I don’t think a type on LinkedActions really matters, at least right now. Right now, no matter what, the linked action will always make a POST request to the href. Adding a type field in the POST response is what would help determine how the response is handled (i.e. is this a transaction, sign message request, external link, etc). The client would handle this then.

> I’ve noticed that you are also proposing to have type in LinkedAction in the Sign Message proposal. Is that a bit outdated based on what was discussed in the scope of this sRFC, or do you have different considerations for having it in LinkedAction in your proposal?

Good catch, yes that is outdated then. I do not think LinkedActions should have a type at all with any of the current sRFCs or any specific features I foresee in the future.

I’m happy to have my mind changed, I just do not currently see a need for it.

> @nickfrosty, the flow you’ve described above looks right to me if we don’t consider External Linking.

what special consideration do you see for “external linking” as an action?
my thought was:

  • if type == “external-link” => nothing for the user to sign, no post request, just display a button/link to open the provided href as a full link in a new tab (likely requiring this be an absolute url and displaying the domain to the user below the button/link)

aside from good UX design from designers (like you mentioned in the other sRFC), do you foresee something different?
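
A small sketch of the external-link check suggested above: accept only absolute URLs and surface the domain so the client can show it under the button. `describeExternalLink` is a hypothetical helper, and requiring `https:` is an added assumption rather than something stated in this thread:

```typescript
// Returns the full link and its domain for display, or null when the href
// is not an absolute https URL.
function describeExternalLink(href: string): { url: string; domain: string } | null {
  let parsed: URL;
  try {
    // Without a base argument, the URL constructor throws on relative URLs.
    parsed = new URL(href);
  } catch {
    return null;
  }
  // Assumption: only https links are considered safe to present.
  if (parsed.protocol !== "https:") return null;
  return { url: parsed.href, domain: parsed.hostname };
}
```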

Side notes:

  • I generally think “external links” should NOT be allowed in the initial GET metadata, ONLY in follow-on chained actions. If external links were supported there, people might simply put an external link and no other action. This feels like an anti-pattern for what actions are.

  • I feel fairly strongly that external linking should be considered a terminal action, like the current completed state. After the user has completed all interactions with the blink, the provider can provide an external link to be displayed to the user. I think this because external links inherently take the user away from the blink they are interacting with.

    • so this is contributing to me thinking there is no need to add a type field into LinkedAction

2 Likes

My consideration was based on the External Linking proposal and previous experience with other tools that clearly indicate the action type on the button in certain cases. I was assuming that External Linking is proposed to be implemented by adding type to LinkedAction, which allows processing an external link without making an extra POST call. I also didn’t notice that you’re hesitant about the External Linking proposal.

I fully agree that having type only in the POST response is sufficient to deterministically cover the general flow. That said, having type in LinkedAction looks like a healthy thing in the scope of External Linking, for the reasons below:

  1. Implementing an extra POST endpoint & making an extra network call to implement an external link feels a bit redundant to me
  2. Visually displaying an external link icon on the button can slightly improve the UX

Please reflect your thoughts in Blinks CTA: External Linking, we should probably discuss it there, lol.

Mostly agree, however there might still be cases where “external links” are useful in any GET metadata, including the initial one. For instance, developers might need an external link to a text document that doesn’t fit in the blink description, e.g. a governance proposal text or ToS. So, imo, what you are describing sounds like common sense and best practice to me, but it should not be a strong constraint in the specification.

2 Likes

my post here in that proposal was meant to funnel the conversation into this one, since they are very closely related (at least for implementation)

> I also didn’t notice that you’re hesitant about the External Linking proposal.

Not hesitant, I’m generally for it. Just trying to work out the finer details for it. I think the spec supporting external linking in some way is very useful.

> 1. Implementing an extra POST endpoint & making an extra network call to implement an external link feels a bit redundant to me

why would there be an extra POST endpoint?

> Mostly agree, however there might still be cases where “external links” are useful in any GET metadata, including the initial one. For instance, developers might need an external link to a text document that doesn’t fit in the blink description, e.g. a governance proposal text or ToS. So, imo, what you are describing sounds like common sense and best practice to me, but it should not be a strong constraint in the specification.

I think a url like this for a governance proposal doc could and should be linked in the description (with the blink client making it clickable), not as a primary action button with the rest of the actions. Unless it is a terminal/completed action of some sort.

Having an “action button” that is an external link that fits into the form ui feels wrong and out of place to me. It takes the user away from the blink rather than having them interact directly with it.
The only time it feels right is for completed states: instead of a generic “completed button” the blink would render an “external link” in place of the completed button.

2 Likes

The External Linking sRFC proposes adding a type attribute to LinkedAction; the author proposes that the client use it to determine whether to open the link or make a POST network call.

At the same time, based on your comments

Following this, if I understand correctly, your proposal is to keep LinkedAction untouched. Meaning, the client will make a POST request that returns a response of external-link type. This means that as an action developer I need to implement an extra endpoint, rather than just indicating the type in LinkedAction.

Ideally you should be able to do both. Sometimes links are very long, and buttons provide better UX in many cases. Consider Telegram bots, where you’re free to decide how to encode a link.

Hm, not sure. Imo, blinks are not limited to form UI. Developers may build a variety of use cases, and even in a form UI you may need to include links to external resources. I would advocate being more open and flexible, closer to the Frames / Telegram bots experience, where developers are free to decide which buttons they need and where, rather than adding constraints.

Again, I agree with your thoughts, but it feels more like best practice & common sense to me and should not be a spec-level constraint.

2 Likes

Hey all, thanks for moving this forward. Sorry if I’m a bit late to the discussion.

Frankly speaking, I don’t see any disadvantages in adding type to LinkedAction; it looks like a very simple and clear solution that is commonly used in programming in general. Knowing it in advance would also allow blinks to render different buttons differently, for example adding a link icon to buttons that are supposed to redirect the user.

In terms of limitations, it’s def better not to limit the external link button to the terminal state only. I believe developers can decide on their own where and when they want to redirect their users. I would really feel bad about not allowing it, because it seems like a baseline to me.

I would really recommend looking into the Telegram bots API. I think it’s well designed and well tested on a hundred thousand use cases (which are not yet possible in blinks btw, but I would really love to make blinks as powerful).

4 Likes

Hey @tsmbl / @nickfrosty, what can we do to move this sRFC forward from here?

1 Like

sounds like I am the only one that had a concern or other desires from your proposal @0xaryan, so I can void those right now and I think we can move forward with saying this sRFC is “approved”

to clarify what changes are being proposed after all the conversation above:

  • make the transaction optional in the type (blink clients need to handle this flow)
  • add a type to LinkedAction with the following to start:
    • transaction - if the linked action should include a transaction
    • post - if the LinkedAction is effectively going to just make a post request, with nothing for the user to sign
    • external-link - originally from this sRFC (we could add this in via the same implementation)
    • sign-message - in the future with the sign message sRFC
  • update ActionPostResponse to add and handle the same type described above
  • no restrictions on where these external links can be presented to the user
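
Read together, the bullets above might translate into types along these lines. This is only a sketch of the summary, not the final actions-spec interfaces: the `Sketch` suffixes mark hypothetical names, `type` being required is an assumption, and `checkResponse` is an illustrative helper showing the consistency a blink client could enforce:

```typescript
type LinkedActionType = "transaction" | "post" | "external-link" | "sign-message";

interface LinkedActionSketch {
  type: LinkedActionType; // assumed required here; the thread leaves this open
  href: string;
  label: string;
}

interface ActionPostResponseSketch {
  type: LinkedActionType;
  transaction?: string; // now optional
  message?: string;
}

// Returns a list of problems; an empty list means the response is
// consistent with the linked action that triggered it.
function checkResponse(
  action: LinkedActionSketch,
  res: ActionPostResponseSketch,
): string[] {
  const problems: string[] = [];
  if (res.type !== action.type) {
    problems.push(`expected a "${action.type}" response, got "${res.type}"`);
  }
  if (res.type === "transaction" && res.transaction === undefined) {
    problems.push("transaction response is missing the transaction");
  }
  // Assumption: a "post" response should carry nothing to sign.
  if (res.type === "post" && res.transaction !== undefined) {
    problems.push("post response should not include a transaction");
  }
  return problems;
}
```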

did I miss anything?

cc @tsmbl @nze

3 Likes

Thanks for the summary, @nickfrosty. This looks good to me, I believe we can work out the remaining details during implementation, if necessary

3 Likes