sRFC 00011: A smart contract that allows for easy storage and retrieval of data on-chain

On-chain Data Storage

Summary

A smart contract that allows for easy storage and retrieval of data on-chain

Motivation

Currently, Solana has no standard for on-chain data storage. By data, I mean anything that can be stored as bytes, such as a text file, an HTML document, a PNG image, NFT JSON metadata, etc. This RFC proposes a solution for storing data on-chain and retrieving it easily.

Implementation

To solve this issue, I have developed the Solana Data Program. This is a Solana smart contract that allows users to initialize a Data Account that stores any data as bytes, with its metadata stored in a separate PDA.

Flow Diagram

Here’s a flow diagram to describe how the accounts are linked:
[Image: flow diagram of how the Data Account and Metadata PDA Account are linked]

Accounts

Metadata Account:
In the Metadata PDA Account, we store the metadata regarding the data in the Data Account. A few important fields stored in it are:

  • authority: The authority needs to be a signer in any instruction that involves the Data Account - updating the data and/or data type, finalizing the data, transferring the authority, and/or closing the account

  • data_type and serialization_status: Currently the supported data types are:
    i. CUSTOM: To store custom data or a currently unsupported data type
    ii. IMG: To store image data as raw bytes
    iii. HTML: To store an HTML file as raw bytes
    iv. JSON: To store minified JSON as raw bytes

    The motivation behind having set data_types is that it helps the client application determine how to display the data. It also opens the door to data verification (denoted by serialization_status). Currently, when JSON data is being updated, the user can optionally have it verified to ensure it is valid JSON.

  • dynamic: A dynamic Data Account can realloc (up or down), while a static account always stays a fixed size. A static account is useful when users want to pay a fixed amount for the storage space in the Data Account, rather than paying only for what is needed (dynamic).
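
To make the data_type and dynamic fields a bit more concrete, here is a minimal client-side sketch of how they could be modelled. This is not the on-chain account layout: the names MetadataModel, passesVerification and sizeAfterWrite are hypothetical, and the resize behaviour (a dynamic account shrinks or grows to exactly fit, a static account rejects data that is too large) is my assumption for illustration.

```typescript
// Hypothetical client-side model; field and function names are illustrative only.
type DataType = "CUSTOM" | "IMG" | "HTML" | "JSON";

interface MetadataModel {
  authority: string; // base58 public key of the authority
  dataType: DataType;
  dynamic: boolean; // true: account may realloc, false: fixed size
}

// Only JSON has a verification step described in this RFC, so every other
// data type passes trivially.
function passesVerification(dataType: DataType, text: string): boolean {
  if (dataType !== "JSON") return true;
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Assumption: a dynamic account is resized to fit the new data exactly, and a
// static account keeps its size and rejects data that does not fit.
function sizeAfterWrite(
  meta: MetadataModel,
  currentSize: number,
  newLength: number
): number {
  if (meta.dynamic) return newLength;
  if (newLength > currentSize) {
    throw new Error("static Data Account is too small for this write");
  }
  return currentSize;
}
```

For example, passesVerification("JSON", "{a:1}") is false because the key is unquoted, while the same text stored as CUSTOM is accepted as-is.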

Data Account:
A Data Account is any Solana account that is owned by the Data Program and has an associated Metadata PDA Account. The data is stored directly as raw bytes so one could easily retrieve it using a single getAccountInfo RPC call as such:

const data = (await connection.getAccountInfo(dataKey, commitment)).data;

The Data Account need not be created by the Data Program. You can also pass a previously created account to the InitializeDataAccount instruction, and it will be assigned to the Data Program.
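
One thing to watch with the one-liner above: getAccountInfo resolves to null when the account does not exist, so a client probably wants a small guard around it. Below is a minimal sketch of such a guard. AccountInfoLike, dataOrThrow and the expectedOwner parameter are hypothetical names of mine, and the owner is modelled as a base58 string so the snippet has no dependencies.

```typescript
// Hypothetical, dependency-free shape of the parts of an account we care about.
interface AccountInfoLike {
  data: Uint8Array;
  owner: string; // base58 program id that owns the account
}

// Returns the raw bytes, or throws if the account is missing or is not owned
// by the expected program (for a Data Account, the Data Program).
function dataOrThrow(
  info: AccountInfoLike | null,
  expectedOwner: string
): Uint8Array {
  if (info === null) {
    throw new Error("account not found");
  }
  if (info.owner !== expectedOwner) {
    throw new Error("account is not owned by the expected program");
  }
  return info.data;
}

// Sketch of usage with @solana/web3.js (DATA_PROGRAM_ID is a placeholder):
//   const info = await connection.getAccountInfo(dataKey, commitment);
//   const data = dataOrThrow(
//     info && { data: info.data, owner: info.owner.toBase58() },
//     DATA_PROGRAM_ID
//   );
```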

Instructions

  • InitializeDataAccount: This instruction creates and initializes the Metadata Account and optionally creates a Data Account.
  • UpdateDataAccount: This instruction updates the data_type field in the Metadata Account and the data in the Data Account.
  • UpdateDataAccountAuthority: This instruction updates the authority of the Data Account by updating the value in the Metadata Account. It requires both the old and new authority to be signers to prevent accidental transfers.
  • FinalizeDataAccount: This instruction finalizes the data in the Data Account by setting the data_status in the Metadata Account to be FINALIZED. Finalized data can no longer be updated.
  • CloseDataAccount: This instruction closes the Data Account and the Metadata Account and transfers the lamports to the authority.
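
The rules these instructions enforce can be summarized in a small off-chain model. This is only a sketch of the checks described above (authority must sign, both authorities must sign a transfer, finalized data cannot be updated) and not the program's actual code. MetadataState, the "UPDATABLE" status name and the function names are hypothetical.

```typescript
// Hypothetical off-chain model of the signer and finalization rules.
type DataStatus = "UPDATABLE" | "FINALIZED"; // "UPDATABLE" is an assumed name

interface MetadataState {
  authority: string; // base58 public key
  dataStatus: DataStatus;
}

function requireSigner(signers: readonly string[], key: string): void {
  if (signers.indexOf(key) === -1) {
    throw new Error(`missing signature from ${key}`);
  }
}

// UpdateDataAccount: authority must sign, and the data must not be finalized.
function updateData(meta: MetadataState, signers: readonly string[]): MetadataState {
  requireSigner(signers, meta.authority);
  if (meta.dataStatus === "FINALIZED") {
    throw new Error("finalized data can no longer be updated");
  }
  return meta;
}

// UpdateDataAccountAuthority: both the old and the new authority must sign.
function transferAuthority(
  meta: MetadataState,
  signers: readonly string[],
  newAuthority: string
): MetadataState {
  requireSigner(signers, meta.authority);
  requireSigner(signers, newAuthority);
  return { ...meta, authority: newAuthority };
}

// FinalizeDataAccount: authority must sign, after which updates are rejected.
function finalize(meta: MetadataState, signers: readonly string[]): MetadataState {
  requireSigner(signers, meta.authority);
  return { ...meta, dataStatus: "FINALIZED" };
}
```

In this model, calling updateData on the result of finalize always throws, which mirrors the "Finalized data can no longer be updated" rule.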

Usage

SolD Website

SolD is a website that acts as an editor for the Data Program:

  • It allows users to view the metadata and data associated with a Data Account via the /<datakey>?cluster=<cluster> route
  • It allows users to connect with their wallet and upload files directly to the Data Program by going to the /upload route
  • If the user is logged in as the authority, it allows the user to edit the data, finalize it and/or close the metadata and data accounts.
  • Users can also view all data accounts they “own” via the /authority/<authority> route and perform group actions on them

Typescript SDK

solana-data-program is a Typescript SDK that exposes APIs to interact with the Data Program, along with helper methods to parse the data, metadata, etc. of a Data Account.

NFTs

One potential use case of the Data Program is fully on-chain dynamic NFTs. By this I mean an NFT with:

  • JSON metadata stored on-chain
  • Image data stored on-chain
  • JSON metadata that can be updated via an on-chain crank
  • Image data that can be updated via an on-chain crank

Here’s a link to such an NFT: Quine NFT
On clicking View original you will see the original HTML file that was pulled from on-chain. You can also inspect the Metadata and Image data on SolD separately:
Image: HoyEJgwKhQG1TPRB2ziBU9uGziwy4f25kDcnKDNgsCkg
Metadata JSON: Hb9vkWax5AeLWvCtYSjSvWrN6gTw324gKMa28kcBsgT3

P.S. The NFT image is a quine. The code on the surface of the rotating sphere is the code used to generate the sphere with code on its surface

Composability

Considerable care went into making sure that the Data Program is easy to use and composable. To demonstrate that, I have also written example smart contracts (two of which mint NFTs, including the Quine NFT) that CPI into the Data Program: solana-data-program/examples at main in the nvsriram/solana-data-program GitHub repository

URI Standard for Data Retrieval

Currently, to pull the data from on-chain, I have an API route /data/<dataKey>?cluster=<cluster>&ext=<ext> that just returns the data as is (or in the extension format specified by ext). A URI standard would be handier. It might look something like:

sol://<datakey>?ext=<ext>

to get the data stored in datakey in the format specified by ext

OR

sol://meta/<datakey>

to get the metadata associated with the datakey
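
To show how little a client would need to support the two URI forms above, here is a minimal parsing sketch. The sol:// scheme is only a proposal in this RFC, so everything here is an assumption: the SolUri type, the parseSolUri name, and the choice to validate the datakey as a 32 to 44 character base58 string are mine.

```typescript
// Hypothetical parser for the proposed sol:// URI forms.
type SolUri =
  | { kind: "data"; dataKey: string; ext?: string }
  | { kind: "meta"; dataKey: string };

// Base58 alphabet (no 0, O, I, l); Solana public keys are 32 to 44 characters.
const BASE58_KEY = /^[1-9A-HJ-NP-Za-km-z]{32,44}$/;

function parseSolUri(uri: string): SolUri {
  const scheme = "sol://";
  if (!uri.startsWith(scheme)) throw new Error("not a sol:// URI");

  const rest = uri.slice(scheme.length);
  const q = rest.indexOf("?");
  const path = q === -1 ? rest : rest.slice(0, q);
  const query = q === -1 ? "" : rest.slice(q + 1);

  // sol://meta/<datakey>
  if (path.startsWith("meta/")) {
    const metaKey = path.slice("meta/".length);
    if (!BASE58_KEY.test(metaKey)) throw new Error("invalid datakey");
    return { kind: "meta", dataKey: metaKey };
  }

  // sol://<datakey>?ext=<ext>
  if (!BASE58_KEY.test(path)) throw new Error("invalid datakey");
  let ext: string | undefined;
  for (const pair of query.split("&")) {
    const eq = pair.indexOf("=");
    if (eq !== -1 && pair.slice(0, eq) === "ext") {
      ext = decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return ext === undefined
    ? { kind: "data", dataKey: path }
    : { kind: "data", dataKey: path, ext };
}
```

The parser does plain string handling on purpose: base58 keys are case-sensitive, so nothing here normalizes the key in any way.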

Conclusion

This RFC discusses the features of the Data Program and how it can be used to store data on-chain. It presents the SolD website editor and Typescript SDK that make it easy to interact with the Data Program. It showcases potential use cases in fully on-chain dynamic NFTs and proposes a URI standard for data retrieval.

References

Data Program Smart Contract: nvsriram/solana-data-program on GitHub (Solana smart contract that handles on-chain data storage)
SolD Website Editor: https://sold-website.vercel.app/
Typescript SDK: solana-data-program on npm
Quine NFT: Solscan
Quine NFT Image: https://sold-website.vercel.app/HoyEJgwKhQG1TPRB2ziBU9uGziwy4f25kDcnKDNgsCkg?cluster=Devnet
Quine NFT Metadata: https://sold-website.vercel.app/Hb9vkWax5AeLWvCtYSjSvWrN6gTw324gKMa28kcBsgT3?cluster=Devnet
Examples: solana-data-program/examples at main in nvsriram/solana-data-program on GitHub

Hey, this is actually really cool!

I have some qualms over the potential for this to be widely adopted as a standard. Instead, I could see this being either a protocol itself for data storage or a program library that can add a layer of abstraction over the management of the accounts/metadata.

How do you envision this being used as a standard?

Thank you, I am glad you like it! :))

I do agree that its role as a protocol or program library is more straightforward given how it currently works, and that it is easier to adopt that way.

However, I do think that a standard for general data retrieval would be quite useful. The idea I had for this was to use the URL format to make any data stored on Solana easily accessible. This is the same idea as the Solana Pay URL, and it could also be an extension of the same solana URI scheme (which would save the effort of having to register a new URI scheme).

But the end result of this would be that any user could just use this URL format and paste it in a browser to get the data associated with that data account. All the dApps would have to do to be compliant with this is parse the datakey part of the URL (sol://<datakey>) and return the data using a simple line of code like so:

return (await connection.getAccountInfo(dataKey, commitment)).data;

If this URL format were to be adopted as a standard, any user could “upload” their data directly into a data account and have it be easily retrieved via a simple URL. And because the data can be verified when uploaded, it has exciting interactions, say with validating NFT JSON metadata.

This is really cool. I think it's important to have a website to pull account data, and it has a lot of uses for on-chain NFTs.

I have tested it out with my own on-chain nfts and I can pull the raw data but it does not parse correctly due to usage of anchor (hence the raw data will have bits in front). Any way for the implementation to detect data:image/URLs or “{” (used in json) and start parsing from that byte?

Thank you!

I didn’t really test it out with Anchor, so I didn’t run into the issue of extra bytes in front of the JSON. Currently it just removes all the whitespace bytes (which can be introduced when the account is resized, etc.) and tries to parse the result as JSON. But like you mentioned, changing the implementation to parse the data between the first ‘{’ and the last ‘}’ should be fairly straightforward and could be an easy fix.
As of right now, it should still display the invalid JSON in the error or under the HTML/CUSTOM data type (although not as nicely).
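
Here is a rough sketch of the "first ‘{’ to last ‘}’" idea, in case it helps. It is not what the website currently does, and extractJson is a hypothetical name. Note that it is a heuristic: if the bytes in front happen to contain 0x7b (the ‘{’ byte), the parse will start too early and fail.

```typescript
// Hypothetical helper: parse the JSON object embedded in raw account bytes,
// ignoring any bytes before the first '{' and after the last '}'.
function extractJson(bytes: Uint8Array): unknown {
  const start = bytes.indexOf(0x7b); // '{'
  const end = bytes.lastIndexOf(0x7d); // '}'
  if (start === -1 || end < start) {
    throw new Error("no JSON object found in account data");
  }
  // subarray is a view, so no bytes are copied before decoding.
  const text = new TextDecoder("utf-8").decode(bytes.subarray(start, end + 1));
  return JSON.parse(text); // throws a SyntaxError if the slice is not valid JSON
}
```

This only handles a top-level JSON object. Data whose top level is an array would need the same treatment with ‘[’ and ‘]’.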

As for reading data:image/URLs, that would be a bit tricky. Currently, if the data type is said to be of type image, it returns the raw bytes along with the appropriate content type. The website (under HTML datatype) would pass the URL returning the raw bytes to an iframe as its src. For it to read it as intended, it would need to pass the raw bytes as the URL instead.
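
For what it's worth, detecting a stored data:image/ URL could look roughly like the sketch below: decode the bytes as text, and if a data:image/ prefix is found, use that text as the src directly, otherwise fall back to the URL that returns the raw bytes. resolveImageSrc and its rawBytesUrl parameter are hypothetical names, and stripping trailing zero bytes is my assumption about how padding would show up.

```typescript
// Hypothetical helper: choose what to use as the src for an image-like account.
function resolveImageSrc(bytes: Uint8Array, rawBytesUrl: string): string {
  // Non-UTF-8 bytes decode to replacement characters, which is fine here
  // because we only search for an ASCII marker.
  const text = new TextDecoder("utf-8").decode(bytes);
  const start = text.indexOf("data:image/");
  if (start === -1) {
    return rawBytesUrl; // raw image bytes: let the API route serve them
  }
  // Drop any bytes in front of the marker and any trailing zero padding.
  return text.slice(start).replace(/\0+$/, "");
}
```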