Content Identifiers

In the archives, each Epoch, Block and Transaction gets a computed content identifier (CID). These CIDs uniquely identify any of these elements when retrieving them from a locally stored CAR file, from IPFS or from Filecoin.

To retrieve objects by Solana slot number, transaction signature or epoch number, indexes are used to map these Solana-specific identifiers onto CIDs.
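The lookup described above can be pictured as a set of key-to-CID maps. The following is only a minimal in-memory sketch of that idea, not the index format Old Faithful actually uses: the class `CidIndexes`, its method names, and the placeholder CID strings are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CidIndexes:
    """Hypothetical in-memory stand-in for the identifier-to-CID indexes."""

    slot_to_cid: Dict[int, str] = field(default_factory=dict)
    signature_to_cid: Dict[str, str] = field(default_factory=dict)
    epoch_to_cid: Dict[int, str] = field(default_factory=dict)

    def cid_for_slot(self, slot: int) -> Optional[str]:
        # Returns None when the slot is not indexed.
        return self.slot_to_cid.get(slot)

    def cid_for_signature(self, signature: str) -> Optional[str]:
        # Returns None when the transaction signature is not indexed.
        return self.signature_to_cid.get(signature)

    def cid_for_epoch(self, epoch: int) -> Optional[str]:
        # Returns None when the epoch is not indexed.
        return self.epoch_to_cid.get(epoch)


# Placeholder values only; these are not real CIDs, slots or signatures.
indexes = CidIndexes()
indexes.epoch_to_cid[0] = "placeholder-epoch-cid"
indexes.slot_to_cid[1000] = "placeholder-block-cid"
indexes.signature_to_cid["placeholder-signature"] = "placeholder-tx-cid"

block_cid = indexes.cid_for_slot(1000)
missing_cid = indexes.cid_for_slot(999)
```

Once a CID has been looked up this way, it is the CID, not the slot number or signature, that is used to fetch the object from a CAR file, IPFS or Filecoin.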

To learn more about CIDs, see the IPFS documentation on content addressing.

From https://docs.ipfs.tech/concepts/content-addressing/#what-is-a-cid:

A content identifier, or CID, is a label used to point to material in IPFS. It doesn’t indicate where the content is stored, but it forms a kind of address based on the content itself. CIDs are short, regardless of the size of their underlying content. CIDs are based on the content’s cryptographic hash. That means:

  • Any difference in the content will produce a different CID
  • The same content added to two different IPFS nodes using the same settings will produce the same CID.
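The two properties quoted above can be checked with a small sketch that builds a CIDv1 by hand using only the Python standard library. It assumes a single block of bytes, the `raw` multicodec and a SHA2-256 multihash; these settings are chosen for illustration and are not necessarily the ones Old Faithful uses for its own objects. The function name `cid_v1_raw` is hypothetical.

```python
import base64
import hashlib

# Assumed settings for this sketch (all fit in a single varint byte).
CID_VERSION = 0x01   # CIDv1
RAW_CODEC = 0x55     # multicodec code for "raw" bytes
SHA2_256 = 0x12      # multihash code for SHA2-256
DIGEST_LEN = 0x20    # SHA2-256 digest length: 32 bytes


def cid_v1_raw(content: bytes) -> str:
    """Return a base32 CIDv1 string for one raw block of content."""
    digest = hashlib.sha256(content).digest()
    # Binary CID layout: <version><codec><hash code><digest length><digest>
    cid_bytes = bytes([CID_VERSION, RAW_CODEC, SHA2_256, DIGEST_LEN]) + digest
    # Multibase base32: lowercase, no padding, prefixed with "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


cid_a = cid_v1_raw(b"hello old faithful")
cid_b = cid_v1_raw(b"hello old faithful")   # same content, same settings
cid_c = cid_v1_raw(b"hello old faithful!")  # one byte of difference

same_content_same_cid = cid_a == cid_b
different_content_different_cid = cid_a != cid_c
```

With these settings the CID has the same length whether the input is a few bytes or many megabytes, because only the fixed-size hash of the content is encoded.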

The CAR files and CIDs generated by Old Faithful are reproducible: they are the same regardless of which RocksDB ledger archive they are generated from. This means that entities that run archive nodes and store RocksDB ledger archives can easily generate their own CIDs for the same blocks and transactions and compare them to the ones generated by the Old Faithful project. In theory, you can also choose to run a node that just generates and stores the CIDs, but not the durable data, and then compare those CIDs to the ones provided by Old Faithful. Using these CIDs, you can retrieve the actual data at any time without trusting anyone but your own data-generating nodes.
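The comparison described above could look roughly like the sketch below. It assumes you already have two slot-to-CID mappings, one generated locally and one published by the project; how those mappings are obtained is outside the scope of this sketch. The function `find_cid_mismatches` and the placeholder CID strings are hypothetical.

```python
from typing import Dict, List


def find_cid_mismatches(local: Dict[int, str], published: Dict[int, str]) -> List[int]:
    """Return the sorted slots whose CIDs differ or exist on only one side."""
    mismatched = []
    for slot in sorted(set(local) | set(published)):
        # .get() returns None for a missing slot, which also counts as a mismatch.
        if local.get(slot) != published.get(slot):
            mismatched.append(slot)
    return mismatched


# Placeholder values only; these are not real slots or CIDs.
local_cids = {100: "cid-a", 101: "cid-b", 102: "cid-c"}
published_cids = {100: "cid-a", 101: "cid-x", 103: "cid-d"}

mismatches = find_cid_mismatches(local_cids, published_cids)
```

An empty result means every locally generated CID agrees with the published one, so data retrieved by those CIDs can be trusted on the strength of your own node's output.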