Goals
Messari’s Market Data Service has three main goals. In order of importance, they are:
- Provide the most accurate price and volume data. Market data is an essential piece of understanding this industry, and we want to provide the best and most reliable tools for users to do so.
- Ensure wide coverage across the breadth of assets in the industry. If there’s an asset of interest in crypto, we want to have price and volume data on it - preferably from genesis.
- Minimize latencies. In addition to an extremely deep set of historical market data, Messari also offers live prices and price alerts. Timely updates to our pricing data are essential to keep these features useful.
Coverage: Spot and Derivatives
Messari’s Market Data considers Spot price and volume data across major Centralized, as well as Decentralized, Exchanges. Our methodology for creating prices is detailed below. We also support the following Derivatives datasets:
- Open Interest
  - Aggregated by Asset or Exchange
- Volume
  - Aggregated by Asset or Exchange
- Funding Rates (weighted by Open Interest as well as by Volume)
  - Aggregated by Asset
- Binance
- Bitget
- Bullish
- Bybit
- Coinbase
- CrossTower
- Crypto.com
- FTX
- OKX
Definitions
OHLCV object: A data structure that is comprised of five datapoints (Open, High, Low, Close, Volume) for a specific asset over a given timeframe, all of which are stored as floating point numbers.
DEX: A Decentralized Exchange operated autonomously on a blockchain, typically via smart contracts and liquidity pools. Examples: Uniswap, Raydium, Curve, etc.
CEX: Exchanges operated as centralized entities, such as corporations. Typically traded through traditional order books. Examples: Coinbase, Binance, etc.
Market Level: A price based on particular quote pair at a specific exchange. For example, BTC/USD on Binance would be a separate market from BTC/USD on Kraken.
Asset Level: An aggregated price for an asset that is inclusive of all the markets that it is traded in. At Messari, all Asset Level prices are denominated in USD. In the example above, the BTC price would be an Asset Level price and would be comprised of many Market Level prices.
Graph Structure: A non-linear data structure consisting of a set of vertices and a set of edges (also called nodes and lines). Vertices are the fundamental units of a graph, and edges are drawn to connect nodes.
VWAP: Volume Weighted Average Price. At Messari, the VWAP calculation process is a simple weighted arithmetic mean calculated once each for the Open, High, Low, and Close values.
Market Cap (Market Capitalization): Total value of all tokens in circulation, calculated by multiplying the current token price by its circulating supply. It provides a snapshot of current valuation and is widely used to perform relative analysis.
FDV (Fully Diluted Value): The theoretical market cap if all tokens were in circulation at the current token price. Calculated by multiplying the price by its total supply. In cases of missing total supply, max supply is used instead.
Circulating Supply: Number of tokens currently available and actively circulating in the market. This number excludes locked tokens, tokens held by teams on vesting schedules, or tokens that have been burned. It reflects the tradable supply and provides an accurate picture of a token’s current availability.
Max Supply: The maximum number of tokens that will ever exist for a token, typically set at launch. Many tokens do not have a maximum supply.
Total Supply: The total number of tokens that currently exist, including both circulating and non-circulating tokens. This number excludes burned tokens. It reflects the total supply and provides a view into a token’s future availability.
Spot Datasets and Outputs
Messari produces three price and volume datasets as OHLCV timeseries that are continuously updated to live instances of https://messari.io, as well as over the API. These are:
- Market Level denominated in the quote asset (ex: AAVE/WETH on Uniswap v2 on ETH L1)
- Market Level denominated in USD (ex: AAVE/WETH → AAVE/USD on Uniswap v2 on ETH L1)
- Asset Level denominated in USD (ex: AAVE/USD)
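To illustrate the step from the first dataset to the second, the sketch below re-denominates a quote-asset candle in USD by multiplying each price field by a USD price for the quote asset. The `Candle` class, the `to_usd` function, the sample numbers, and the choice to treat volume as base-asset units are assumptions made for illustration; they are not taken from Messari's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One OHLCV object; all five fields are floats."""
    open: float
    high: float
    low: float
    close: float
    volume: float  # assumed to be in units of the base asset


def to_usd(candle: Candle, quote_usd_price: float) -> Candle:
    """Re-denominate a quote-asset candle in USD.

    Assumption: one USD price for the quote asset is applied to the
    whole interval. Volume is left in base-asset units.
    """
    return Candle(
        open=candle.open * quote_usd_price,
        high=candle.high * quote_usd_price,
        low=candle.low * quote_usd_price,
        close=candle.close * quote_usd_price,
        volume=candle.volume,
    )


# Hypothetical AAVE/WETH candle and a hypothetical WETH/USD price.
aave_weth = Candle(open=0.050, high=0.052, low=0.049, close=0.051, volume=120.0)
aave_usd = to_usd(aave_weth, quote_usd_price=2000.0)
```

Applying a single quote price per interval keeps the example small; a production pipeline would need a USD price for the quote asset that matches each candle's timestamp.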
Methodology
At a high level, Messari gives one volume-weighted price denominated in USD for every token that we support. This price is inclusive of every trusted market that it trades in, regardless of the quoted asset for the pair. For example, to calculate the price for ETH, we consider the price of ETH on markets where it is the Base Asset (ex: ETH/USD, ETH/BTC), as well as markets where it is the quote (ex: RPL/ETH).
In order to do this in a useful way, we ingest raw trade data from a variety of sources on a continuous (24/7) basis. We then transform this raw data into time-series data sets that accurately reflect the USD-denominated value of each asset at any given time. The full methodology is explained below.
There are many challenges to doing this accurately:
- Preponderance of non-fiat pairs. One fundamental problem in denominating crypto markets in USD is that many markets do not trade directly with USD or any fiat currency, particularly on decentralized exchanges. Since the volume and price data from these pairs must still be part of the final price emission, we need a way to convert these to USD on-the-fly in an accurate and reliable way.
- Multiple venue types. Decentralized exchanges can trade differently and pose a unique set of challenges compared to traditional order book exchanges. For example, the same exchange can exist on different chains (Ethereum, Optimism, Arbitrum, etc.) and must be handled and mapped separately. Swaps from DEXes are also different from trades in the traditional sense as there is no concept of ‘side’ upon trade ingestion. Combining all of these markets into one price that represents all of these pairs is a uniquely difficult task without a playbook from the TradFi world.
- Data correctness and outliers. Due to the decentralized nature of crypto, anyone can spin up liquidity for a token on any number of decentralized exchanges. The liquidity for any given asset could be splintered into very small amounts (sometimes under $1,000) that must either be part of the final price, or filtered out. Every market that any asset trades on requires mapping and curation.
- Bad data. Messari often has overlap of coverage between our sources. This means we get duplicated raw trades from the same venues from different sources, and in some cases these sources might disagree with each other. At other times, we might only have one source for a particular market, but have reason to believe that the data is incorrect. As a result, we need a comprehensive method of detecting and excluding bad data.
- Duplicated tickers, redenominated tokens, migrated contracts. The large number of assets and exchanges in the space necessitates an accurate map with unique IDs. Because each provider maintains its own map, work must also be done to relate multiple maps to each other.
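The volume-weighted USD price described at the start of this section can be sketched as follows. The example takes USD-denominated Market Level candles for one asset over the same interval and computes a weighted arithmetic mean once each for Open, High, Low, and Close, as the VWAP definition states. The function name `vwap_candle`, the dictionary layout, and the sample values are hypothetical, and the sketch assumes that untrusted markets and bad data have already been filtered out.

```python
def vwap_candle(market_candles):
    """Combine Market Level candles into one Asset Level candle.

    Each input is a dict with float fields open, high, low, close, volume,
    already denominated in USD and covering the same interval (assumption).
    Returns None when there is no volume to weight by.
    """
    total_volume = sum(c["volume"] for c in market_candles)
    if total_volume == 0:
        return None
    asset = {
        field: sum(c[field] * c["volume"] for c in market_candles) / total_volume
        for field in ("open", "high", "low", "close")
    }
    asset["volume"] = total_volume
    return asset


# Two hypothetical markets for the same asset and interval.
markets = [
    {"open": 99.0, "high": 101.0, "low": 98.0, "close": 100.0, "volume": 3.0},
    {"open": 103.0, "high": 105.0, "low": 102.0, "close": 104.0, "volume": 1.0},
]
asset_level = vwap_candle(markets)
# The higher-volume market pulls each field toward its own values,
# e.g. close = (100 * 3 + 104 * 1) / 4 = 101.0
```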
Data Ingestion
Messari maintains relationships with partners that provide us with continuous feeds of trade data. All of these feeds are converted into 1-minute OHLCV objects, and are then transformed/downsampled into the datasets described above. Messari strives for both breadth of coverage and accuracy in its market data service. We solve for each in the following way:
- Breadth of coverage
  - To ensure that our coverage is as broad as possible, Messari ingests raw trades and swaps from a variety of venues in the space. To supplement this coverage, we also contract with several third-party providers to supply us with trade data from venues we don’t support internally.
- Correctness and mapping
  - We ingest raw trade data that we turn into candle objects. This allows us to employ outlier detection at the trade level and more carefully examine our data.
  - Because we aggregate our market data from various providers as well as internal sources, Messari maintains extensive mapping to ensure that we are connecting the right markets.
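As a rough illustration of the ingestion path described in this section, the sketch below buckets raw trades into 1-minute OHLCV objects and then downsamples those candles into a coarser interval. The trade tuple layout `(timestamp_seconds, price, amount)`, the function names, and the sample trades are assumptions for the example. Outlier detection and market mapping are omitted.

```python
from collections import defaultdict


def trades_to_candles(trades, interval_s=60):
    """Bucket (timestamp_seconds, price, amount) trades into OHLCV dicts.

    Returns a dict keyed by the interval's start timestamp.
    """
    buckets = defaultdict(list)
    # Stable sort by timestamp keeps arrival order for same-second trades.
    for ts, price, amount in sorted(trades, key=lambda t: t[0]):
        buckets[ts - ts % interval_s].append((price, amount))
    candles = {}
    for start, fills in sorted(buckets.items()):
        prices = [p for p, _ in fills]
        candles[start] = {
            "open": prices[0],
            "high": max(prices),
            "low": min(prices),
            "close": prices[-1],
            "volume": sum(a for _, a in fills),
        }
    return candles


def downsample(candles, interval_s):
    """Merge fine-grained candles into a coarser interval."""
    groups = defaultdict(list)
    for start in sorted(candles):
        groups[start - start % interval_s].append(candles[start])
    return {
        start: {
            "open": group[0]["open"],            # first candle's open
            "high": max(c["high"] for c in group),
            "low": min(c["low"] for c in group),
            "close": group[-1]["close"],         # last candle's close
            "volume": sum(c["volume"] for c in group),
        }
        for start, group in groups.items()
    }


# Hypothetical trades spanning two minutes.
trades = [
    (0, 100.0, 1.0),
    (30, 102.0, 2.0),
    (59, 101.0, 1.0),
    (60, 103.0, 0.5),
    (119, 99.0, 1.5),
]
one_minute = trades_to_candles(trades)
five_minute = downsample(one_minute, interval_s=300)
```

Building candles from individual trades, instead of ingesting pre-built candles, is what makes it possible to inspect or drop a single suspicious trade before it affects the Open, High, Low, or Close of an interval.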