Upshot

Posted on Dec 08, 2021

Taking NFT Pricing to the Next Level with Machine Learning

The second installment in our blog series exploring the use of machine learning to accurately price NFTs. Over the next few weeks, we’ll be diving deeper into how we use machine learning and crowd intelligence to generate automated and up-to-date NFT pricing, why machine learning models are necessary to achieve reliable valuations at scale, and much more.

You can read the first entry in the series on how NFT pricing can increase NFT adoption here.

Comparing NFT pricing using last price, moving averages, and machine learning

At Upshot, we leverage machine learning models that ingest historical sales data and NFT metadata, constructing features from this information to generate accurate, reliable pricings. But is all of this effort necessary to arrive at reasonable NFT valuations? Just what is the best way to price NFTs?

An alternative approach to pricing NFTs would be to simply rely on the most recent sale price of an NFT as its valuation. This approach is as common in practice as it is easy to implement. However, it may not be all that accurate.

Last price

First, many NFTs on the market are selling for the first time ever and lack a sales history, making it impossible to price them using their last sale price. Second, the last sale price of an NFT may reflect idiosyncrasies in the behavior of the buyer or the seller. For example, a buyer may choose to purchase a particular NFT for 3x the price of a similar NFT because they are especially drawn to its aesthetics. While this NFT may have a very high value for this particular buyer, its last purchase price would not accurately reflect how the market, which consists of multiple diverse buyers, would value it.

Moving a step beyond simply relying on the last sale price to appraise an NFT would entail using the entire price history of an NFT to generate a reasonable valuation. This fails to address NFTs being sold for the first time, so NFTs with no sales history still can’t be priced using this approach. However, this approach may help reduce the impact of idiosyncrasies by averaging out divergent buyer or seller behaviors. 

Exponentially-weighted moving average (EWMA)

One way of considering an NFT’s entire price history is the exponentially-weighted moving average (EWMA). EWMAs are weighted averages of past sales prices for a particular NFT, where more weight is given to more recent sale events. The plots below show how an EWMA of past sales prices correlates with actual future sales prices for a sample of nine CryptoPunks.
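To make this concrete, here is a minimal sketch of how an EWMA over an NFT's sale history can be computed. The sale prices and smoothing factor below are made up for illustration; they are not our production inputs or tuning.

```python
# Minimal sketch of an exponentially-weighted moving average (EWMA) over an
# NFT's sale history. Illustrative only: the data and alpha value are made up.

def ewma_price(sale_prices, alpha=0.5):
    """Return the EWMA of a chronologically ordered list of sale prices.

    alpha controls how much weight recent sales receive (0 < alpha <= 1);
    a higher alpha discounts older sales more aggressively.
    """
    if not sale_prices:
        return None  # no sales history -> an EWMA is undefined
    ewma = sale_prices[0]
    for price in sale_prices[1:]:
        ewma = alpha * price + (1 - alpha) * ewma
    return ewma

# Example: four past sales of a hypothetical punk, in USD
history = [12_000, 15_500, 14_000, 21_000]
print(ewma_price(history, alpha=0.6))  # EWMA-based guess at the next sale price
```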

The horizontal axis depicts time and the vertical axis shows prices in USD (converted from ETH). The black lines show the actual sale price histories, whereas the red lines show the price predictions based on the EWMAs. Each plot corresponds to one of the nine random CryptoPunks. The two lines appear to move in the same direction in all nine cases, suggesting that EWMAs may be a strong predictor of future prices.

So, why do we need machine learning?

Machine learning (ML) allows us to incorporate data that EWMA does not take into consideration. For example, an ML algorithm can pool information from the sales histories of every CryptoPunk to arrive at a prediction for a single punk.

This would be especially valuable if, say, the CryptoPunk floor price plunged and a particular punk had not been recently exchanged. Information on the evolution of the sales prices for other CryptoPunks could be used by the model to infer a reasonable price for the punk that has not yet sold. 

ML also allows us to incorporate additional types of data such as NFT metadata (e.g. CryptoPunk accessories or Axie parts). Critically, ML can weigh the importance of these many features. It may be the case that a CryptoPunk’s trait rarity or sale price EWMA are far more or less predictive of its next sale price than we may initially suspect. These characteristics of ML lead to significant performance improvements over simpler methods. 
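As an illustration only, and not our actual model, the sketch below fits a simple scikit-learn tree ensemble on a few hand-built features (a hypothetical sale price EWMA, trait rarity score, and collection floor) and lets the model learn how much weight each feature deserves:

```python
# Illustrative sketch (not Upshot's production model): fit a regression tree
# ensemble on hand-built features and inspect which ones it weights most.
# Feature names and all values here are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

feature_names = ["sale_price_ewma", "trait_rarity_score", "collection_floor"]

# Toy training set: one row of features per historical sale, target = sale price (USD).
X = np.array([
    [14_000, 0.92, 11_000],
    [10_500, 0.40, 11_000],
    [55_000, 0.99, 12_500],
    [ 9_800, 0.35, 10_000],
    [30_000, 0.88, 12_000],
])
y = np.array([21_000, 11_200, 80_000, 9_500, 34_000])

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# The fitted model assigns each feature a learned importance,
# rather than us fixing the weights up front.
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.2f}")

# Predicting a punk that has never sold: it has no EWMA of its own, so
# collection-level information does the heavy lifting (here we impute a value).
print(model.predict([[11_000, 0.75, 12_000]]))
```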

For CryptoPunks, many of which have a dense sales history, using an optimized EWMA alone generates a Median Relative Error (MRE) of 41%, meaning that the predictions are usually off by about 41% when compared to actual sale prices. When we move to a machine learning model, the MRE falls to 14%, nearly a 3x reduction in error!
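For readers curious how an MRE figure like this is computed, here is a small snippet with made-up numbers: the MRE is the median, across evaluated sales, of the absolute prediction error divided by the actual sale price.

```python
# Median Relative Error (MRE): median of |predicted - actual| / actual
# across evaluated sales. The prices below are made up for illustration.
import numpy as np

actual    = np.array([20_000, 35_000, 9_000, 60_000])
predicted = np.array([23_000, 30_000, 8_200, 75_000])

relative_errors = np.abs(predicted - actual) / actual
mre = np.median(relative_errors)
print(f"MRE: {mre:.0%}")
```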

The improvements can be even greater for projects that feature a large number of NFTs, such as Axie Infinity or CryptoKitties, most of which have sparse sales histories. Many of the assets in these projects lack a sales history entirely, limiting the predictive efficacy of an asset’s last sale price and its sale price EWMA. As a result, ML approaches excel at predicting the prices of NFTs where others fall short, which is why we rely on them to generate our NFT appraisals.

Error bounds and why they matter

While our pricing API uses machine learning to generate NFT pricing, it also outputs error bounds around the predicted pricing. What’s the purpose of these error bounds and why do they matter?

Let’s again look at the case of CryptoPunks. The plot below shows the predicted vs actual median sales prices for all CryptoPunks based on our machine learning model, together with error bounds. 

The vertical axis shows the actual price in USD, corresponding to the observed historical sales prices for CryptoPunks, and the horizontal axis shows the predictions of our machine learning model. The 45° red line shows the performance of a “perfect model”, meaning it would predict the sales price exactly on all occasions. However, such a model only exists in theory. We can never explain all of the idiosyncrasies in the CryptoPunks market (or any NFT market) using historical data alone.

The blue line and shaded region show the performance of our ML model. For each prediction from the model on the horizontal axis, we plot the median of the actual sales prices observed in the data on the vertical axis. From the graph, you can see that the blue line representing the performance of our ML model is well-aligned with the line of the “perfect model”, indicating that our model performs well at distinguishing cheap punks from expensive ones.

The blue shaded region shows an error bound around each predicted price level. This bound is calculated by taking the difference between the predicted and actual sale price for each punk and calculating quantiles of the resulting distribution at each predicted price point. While it may sound complex, this error bound simply measures the extent to which our predictions deviate from what actually happened in the market.
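Here is a minimal sketch of how such a band could be computed, assuming only a set of predicted and actual sale prices. The binning scheme, quantile levels, and synthetic data are illustrative assumptions, not our exact procedure.

```python
# Sketch: bin sales by predicted price, then summarize the actual prices and
# prediction errors within each bin. Illustrative only; parameters are assumptions.
import numpy as np

def error_band(predicted, actual, n_bins=10, q_lo=0.25, q_hi=0.75):
    """For each predicted-price bin, return (bin center, median actual price,
    lower band edge, upper band edge). The band comes from quantiles of the
    prediction errors (actual minus predicted) within the bin."""
    errors = actual - predicted
    edges = np.quantile(predicted, np.linspace(0, 1, n_bins + 1))
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (predicted >= lo) & (predicted <= hi)
        if not in_bin.any():
            continue
        center = (lo + hi) / 2
        rows.append((
            center,
            np.median(actual[in_bin]),                    # the "blue line": median actual price
            center + np.quantile(errors[in_bin], q_lo),   # lower edge of the shaded region
            center + np.quantile(errors[in_bin], q_hi),   # upper edge of the shaded region
        ))
    return rows

# Toy data: 500 synthetic sales where the actual price scatters around the prediction
rng = np.random.default_rng(0)
predicted = rng.uniform(5_000, 100_000, size=500)
actual = predicted * rng.lognormal(mean=0.0, sigma=0.3, size=500)

for center, med, lo, hi in error_band(predicted, actual):
    print(f"predicted ~${center:,.0f}: median actual ${med:,.0f}, band ${lo:,.0f} to ${hi:,.0f}")
```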

But why do these errors arise in the first place? One cause is model uncertainty. We invest a considerable amount of research and development effort into improving our models and reducing their errors, but this is always an ongoing task.

Errors may also arise because of unpredictable randomness in the behavior of buyers and sellers. For example, suppose that a new NFT buyer pays $40,000 for an NFT that’s actually worth $10,000 according to most other buyers in the market. This error on the buyer’s side would be reflected in the error bounds of our predictions as well.

As a result, it’s best to interpret the bounds as capturing both model uncertainty (which we can always work to reduce) and market randomness (which we can bound but never fully eliminate).

How should you use the error bounds if, say, you’re interested in selling a punk? Keep in mind that even if a punk is valued at 100 ETH, it may sell for more because of an optimistic buyer. It may also sell for less if the market suddenly experiences a downturn after the listing is made.

Our error bounds can help you understand a reasonable price range and make a decision based on your own expectations and risk tolerance. If you’re risk averse, you may consider numbers closer to the lower bound to maximize the chances of a sale. Whereas if you’re willing to take on some extra risk, you may set a price that’s closer to the upper bound.

Similar logic extends to other use cases, such as buying a punk, valuing it as collateral for a loan, or pawning it. Understanding error bounds can help market participants make better decisions.

To recap, machine learning allows us to incorporate data that last price and EWMA do not take into consideration: it can pool the sales histories of NFTs to arrive at a prediction for a single NFT, draw on a range of NFT metadata (such as CryptoPunk accessories), and weigh the importance of these many features. As a result, ML approaches excel at predicting the prices of NFTs where others fall short, which is why we rely on them to generate our NFT appraisals.

Our pricing API also outputs error bounds around the predicted pricings to capture model uncertainty and market randomness. These error bounds provide valuation information that can be used by market participants to make better decisions.

Stay tuned - next week we will dive deeper into which variables matter most when pricing an NFT, such as the impact of trait rarity and last sale price on NFT valuation.

If you are interested in building new products enabled by near real-time NFT appraisals, please reach out on Discord or at [email protected].

Stay in the loop!

+ Join us on Discord

+ Follow us on Twitter

+ Subscribe to our newsletter
