Clemlaflemme

Posted on Mar 24, 2022

How I deployed an on-chain 10k pfp project for less than 0.1 ETH

Yes, as little as 0.1 ETH, or more precisely, as you can see on the Etherscan contract transaction page, 0.096212736214 ETH. Most of that went into the contract itself (0.075760070358 ETH), i.e. all the general decoding functions that could be embedded once and for all in a library. In other words, the image part of the cost is only about 0.02 ETH!

Of course the gas price at the time of deployment was low (approximately 20 gwei), but even with a fairly high price (say ten times higher) the image part would have cost only about 0.2 ETH.
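For what it is worth, the split between the contract part and the image part is a one-line subtraction (the figures are the Etherscan numbers quoted above, not fetched from the chain):

```python
# Deployment cost breakdown, using the figures quoted above.
total_eth = 0.096212736214      # full deployment cost at ~20 gwei
contract_eth = 0.075760070358   # generic decoding functions (reusable as a library)

image_eth = total_eth - contract_eth
print(f"image part: {image_eth:.4f} ETH")                        # ~0.0205 ETH
print(f"image part at 10x gas price: {image_eth * 10:.2f} ETH")  # ~0.20 ETH
```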

This article is a deep dive into the storage and rendering mechanism used to achieve this result.

Context

The project I was working on is called the co-bots and each of the co-bots looks like this:

Co-Bots #0222

At first sight, it looks like a standard pixel-art NFT project with a rather simple design. It can be compared to the Chain Runners, a community I'm a proud member of (Runner #9036) and IMHO the state-of-the-art pixel-art NFT project so far. Because I had already spent some time studying their rendering strategy in depth, and because I had also already launched my first on-chain NFT project, based on the runners' narrative but with a completely different storage strategy (you can learn more about it on the chain-dreamers website), I initially declined to work on the co-bots.

Web3 is about democratization, as they say. Really? When first-class projects cost thousands of US$ up front? Well, I needed to try something, and I realized that the co-bots were the perfect opportunity to deliver an accessible and cheap encoding scheme allowing anyone to go for on-chain storage. Cheers @dom!

https://twitter.com/dhof/status/1410060181849919489

The storage strategy

The usual palette approach

When it comes to storing image data, there are two main strategies:

  • pixel-based approaches (like .png or .jpg files)
  • vector-based approaches (like .svg files) that could also be called shape-and-layering approaches (as I will explain in the next section).

Usually, on-chain art leverages the palette representation of a pixel-based image: each pixel stores a color index, i.e. an index into a palette giving the color to display at that position in the raster image. You can learn more about this representation in the Pillow Python package, for example.

The storage cost of a color index depends on the size of the palette: when using n bits for storing such an index, there will be at most 2^n colors in the image.

With this representation, for a square image of say 32x32 pixels (the standard size for most projects so far), the storage cost is about 32 * 32 * n bits. For a standard palette of 8 colors (so n = 3), this is 3 * 32 * 32 = 3072 bits = 384 bytes per trait. Hence, for a collection of a few hundred traits (the co-bots have 92, the chain-runners 330), the required storage is about 40kB. Given this stackoverflow response, the EVM burns approximately 20k gas per 32-byte storage slot, and consequently about 25,000,000 gas for 40kB.
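To make the arithmetic explicit, here is a small Python sketch reproducing these estimates (the 20k gas per slot and the gas prices are the same approximations as above, not measured values):

```python
# Back-of-the-envelope cost of the naive palette-based storage.
width, height = 32, 32          # pixels per trait image
bits_per_pixel = 3              # 2**3 = 8 colors in the palette
n_traits = 100                  # order of magnitude (co-bots: 92, chain-runners: 330)

bytes_per_trait = width * height * bits_per_pixel // 8    # 384 bytes
total_bytes = n_traits * bytes_per_trait                   # ~38.4 kB

gas_per_slot = 20_000           # approximate SSTORE cost for a 32-byte slot
total_gas = (total_bytes // 32) * gas_per_slot             # ~24,000,000 gas

for gwei in (20, 200):          # deployment-time price vs. 10x higher
    eth = total_gas * gwei * 1e-9
    print(f"{gwei} gwei -> {eth:.2f} ETH")                 # ~0.48 ETH and ~4.8 ETH
```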

With the above-mentioned gas prices, this would have resulted in a storage cost of about 0.5 to 5 ETH for the image part only, not even mentioning storing the palette itself. That is 25 times more than what I achieved. And each Co-Bot can use not just 8 colors, but up to 256!

The rect-based approach

The pixel-based approach is very general and the de facto standard in computer graphics. However, it has a storage cost directly proportional to the number of pixels. On the other hand, a vector-based approach completely ignores the notion of pixels and instead uses mathematical formulas to encode shapes that can then be drawn at any scale. The chain-dreamers article gives an in-depth description of how to use this approach to store any kind of image on-chain.

The Co-Bots, though, are a bit different. They are not made of arbitrary shapes but rather of layered rectangles of different sizes and colors. Conveniently, the .svg file format defines a <rect> element that can be used to draw a rectangle with a given height, width, position and some other parameters, as described in the documentation.
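As a quick illustration of what a rect-only image looks like, here is a tiny Python snippet that builds such an SVG as a plain string (the coordinates and colors below are made up for the example, not taken from the actual Co-Bots data):

```python
# Minimal rect-only SVG, built as a plain string (hypothetical colors and coordinates).
rects = [
    (0, 0, 45, 45, "#1d1d1d"),    # background
    (10, 12, 25, 20, "#e84142"),  # body
    (14, 16, 5, 5, "#ffffff"),    # left eye
    (26, 16, 5, 5, "#ffffff"),    # right eye
]
svg = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 45 45">' + "".join(
    f'<rect x="{x}" y="{y}" width="{w}" height="{h}" fill="{fill}"/>'
    for x, y, w, h, fill in rects
) + "</svg>"
print(svg)
```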

Given these attributes and the target style, I made the following project-dependent decisions:

  • because @smlg (the designer) worked on a 45x45 grid (see the viewBox attribute), I decided to use 6 bits for each coordinate (x, y, width, height). (Note that 2^6 = 64, so there is some wasted granularity: the grid could have been 63x63 at the same cost.) Using 6 bits for each of the 4 coordinates is convenient as it packs the whole rectangle geometry into 3 bytes (4 * 6 = 24 = 3 * 8).
  • because the EVM works with slots of 32 bytes (see the doc) and because I found it easier to have a round number of bytes per rectangle, I decided to allocate a full byte for the color index. Consequently, the palette for the whole Co-Bots collection can hold up to 256 colors (even though the designer only used 33). A small packing sketch is given a bit further below.

In the end:

  • each rectangle is defined by 5 numbers (x, y, width, height and fill) and is stored in a single bytes4 object.
  • each co-bot is a combination of 6 traits, but each trait is made of a variable number of layered rectangles. In order to avoid a for loop in the rendering function (which would require iteratively concatenating bytes, a gas-heavy operation; I made that mistake in the chain-dreamers renderer contract), I did a quick data analysis of the traits. I then realized that:
    • no more than 160 rectangles are ever required to generate any co-bot
    • an empty bytes4 is a valid rectangle (with width=0 and height=0)

In other words,

  • each rectangle is encoded into 3 + 1 = 4 bytes
  • one EVM slot of 32 bytes holds 8 rectangles
  • 160 rectangles = 20 * 8 rectangles, so 20 bytes32 slots are enough for up to 160 rectangles
  • using a constant-size buffer of 20 bytes32 is enough to render any co-bot.

When rendering a given co-bot, the first rectangles are the real ones while the last ones are just empty, non-visible rectangles (but still there! see the tokenURI output of any minted token on Etherscan).
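To make the encoding concrete, here is a minimal Python sketch of the scheme described above. This is purely illustrative on my side: the function names and the exact bit ordering inside the 3 geometry bytes are my own choices, not the project's actual code.

```python
# Pack one rectangle (x, y, width, height on 6 bits each, fill index on 8 bits)
# into 4 bytes, then pad the rectangle list to a constant-size 640-byte buffer
# (20 slots of 32 bytes, i.e. 160 rectangles).

def pack_rect(x: int, y: int, w: int, h: int, fill: int) -> bytes:
    assert all(0 <= v < 64 for v in (x, y, w, h)), "coordinates must fit in 6 bits"
    assert 0 <= fill < 256, "color index must fit in 1 byte"
    packed = (x << 18) | (y << 12) | (w << 6) | h       # 24 bits of geometry (bit order is my choice)
    return packed.to_bytes(3, "big") + bytes([fill])    # 3 + 1 = 4 bytes

def pack_cobot(rects: list[tuple[int, int, int, int, int]]) -> bytes:
    assert len(rects) <= 160, "the fixed buffer holds at most 160 rectangles"
    buffer = b"".join(pack_rect(*r) for r in rects)
    return buffer.ljust(20 * 32, b"\x00")               # pad with empty (invisible) rectangles

# Example: a 45x45 background rectangle with color index 1
print(pack_cobot([(0, 0, 45, 45, 1)]).hex()[:16], "...")
```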

The SStore2 library

A final step to dramatically lower the storage cost of the traits is to use the SSTORE2 library.

This library writes a bytes array directly as contract code at a dedicated address in the EVM and returns a pointer to it for later use (with various read/slice helpers). It has been shown that the bigger the bytes array, the more savings it brings: it is better to store one long bytes than two small ones. Hence, encoding the traits into an easy-to-retrieve number of bytes each makes it painless to concatenate everything into one blob and save on gas.
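In practice, the off-chain preparation boils down to concatenating all the encoded traits into one long bytes blob and remembering, for each trait, its offset and length inside the blob. Here is a possible sketch of that bookkeeping in Python (names and structure are mine, not the project's actual tooling):

```python
# Concatenate all encoded traits into one blob and keep (offset, length) per trait.
# The blob is what would be written once via SSTORE2; each trait is later read back
# as a slice of it.

def build_blob(encoded_traits: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
    blob, index = b"", []
    for trait in encoded_traits:
        index.append((len(blob), len(trait)))
        blob += trait
    return blob, index

traits = [bytes.fromhex("000b4101"), bytes.fromhex("0c31860200251103")]  # dummy data
blob, index = build_blob(traits)
print(len(blob), index)   # 12 [(0, 4), (4, 8)]
```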

Conclusions

To sum up, here is a high-level description of the storage and rendering mechanism:

  • for the designer:
    • work on a grid of size up to 64x64
    • draw rectangles of any size
    • use a color palette of size 256
    • create each trait so that each generated NFT will not require more than 160 rectangles (this constraint can easily be relaxed by increasing the buffer size)
  • for the developer:
    • concatenate all the traits into one long bytes
    • use the SSTORE2 library to store and read portions of it
    • (to be deployed soon) use the shared library embedding the rendering functions to save the 0.07 ETH!

I strongly hope this can help the community go more and more on-chain, i.e. democratize the use of fully decentralized assets. The Co-Bots code base is open source here.

Stay tuned for the forthcoming release of the library, and please don't hesitate to reach out to me on Twitter or Discord should you have any questions or suggestions!