January 8, 2021

How storage and retrieval deals work on Filecoin

This blog post explains how Filecoin deals work from the perspective of storage providers (a.k.a. miners) and of clients who want to store data on Filecoin.

Introduction

The Filecoin network achieves economies of scale by allowing anyone to participate as a storage provider. Currently the network is made up of hundreds of storage providers spread across the globe. Content addressing and cryptographic storage proofs verify that data is being stored correctly and securely over time on miners’ hardware, which creates a robust and reliable service.

In this blog post, we will go over the basic stages of the two types of deals in Filecoin, namely storage deals and retrieval deals, and detail their lifecycle. We will also cover the cryptographic proofs used to verify that participants in the system are performing their responsibilities.

Data on Filecoin

Before looking into storage and retrieval deals within Filecoin, we must first understand the main unit of negotiation for data.

The Filecoin Piece is the main unit of negotiation for data that users store on the Filecoin network. A Piece is not a unit of storage and has no fixed size, but it is upper-bounded by the size of the Sector, which is governed by the parameters of the network. Data of any size can be stored, but if it is larger than the Sector size that the miner supports, it has to be split into several Pieces so that each Piece fits into a Sector.
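
As a rough illustration of the splitting rule above, the following Python sketch cuts a payload into chunk sizes that each fit into a sector. The function name `split_into_pieces` is hypothetical, and the sketch assumes that the whole sector is usable for client data, ignoring any padding overhead applied by the network.

```python
from typing import List

GIB = 1 << 30  # number of bytes in one GiB


def split_into_pieces(data_size: int, sector_size: int) -> List[int]:
    """Return chunk sizes in bytes, each no larger than sector_size."""
    if data_size < 0 or sector_size <= 0:
        raise ValueError("data_size must be >= 0 and sector_size must be > 0")
    full_chunks, remainder = divmod(data_size, sector_size)
    sizes = [sector_size] * full_chunks
    if remainder:
        # The last chunk holds whatever does not fill a whole sector.
        sizes.append(remainder)
    return sizes
```

For example, `split_into_pieces(70 * GIB, 32 * GIB)` yields two chunks of 32 GiB and one of 6 GiB, so the data would be negotiated as three separate Pieces.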

Now that we know that we are going to store data as Pieces into Filecoin, and negotiate prices on Pieces, how are our files and directories converted to a Piece and what are the contents of a Piece?

Filecoin Piece

A Filecoin Piece is a CAR file produced from an IPLD DAG, with its own payload CID and piece CID.

CAR stands for Content Addressable aRchive. A CAR file is a serialized representation of an IPLD DAG: the concatenation of its blocks, plus a header that describes the graph in the file (including the root CID).

When a client wants to store a file in the Filecoin network, they start by producing the IPLD DAG of the file with UnixFS. The hash that represents the root node of the DAG is an IPFS-style CID, called the payload CID.

UnixFS is a protocol buffers-based format for describing files, directories, and symlinks in IPFS. UnixFS is used in Filecoin as a file formatting guideline for files submitted to the Filecoin network.

The resulting CAR file is padded with extra zero bits so that its data forms a full binary Merkle tree. The root of that tree is what the piece CID identifies.
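
To make the padding step more concrete, here is a minimal Python sketch that zero-pads a byte string to a power-of-two number of leaves and folds them into a binary Merkle root. It is an illustration only: the 32-byte leaf size, the use of plain SHA-256, and the function names are assumptions of this sketch, and the real piece CID computation involves additional encoding steps that are not reproduced here.

```python
import hashlib

NODE_SIZE = 32  # bytes per leaf; an assumption of this sketch


def pad_to_full_tree(data: bytes) -> bytes:
    """Zero-pad data so it fills a power-of-two number of leaves."""
    leaves = max(1, -(-len(data) // NODE_SIZE))  # ceiling division
    full = 1
    while full < leaves:
        full *= 2
    return data.ljust(full * NODE_SIZE, b"\x00")


def merkle_root(data: bytes) -> bytes:
    """Compute the root of a binary Merkle tree over the padded data."""
    padded = pad_to_full_tree(data)
    level = [padded[i:i + NODE_SIZE] for i in range(0, len(padded), NODE_SIZE)]
    while len(level) > 1:
        # Hash each pair of sibling nodes to build the next level up.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Without the padding, the number of leaves would not always be a power of two, and some nodes in the tree would be left without a sibling to hash with.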

The Storage Deal Flow

Users can store data in and retrieve data from the Filecoin network via deals. Participants in the network, miners (supply-side) and clients (demand-side), interact with each other via storage deals and retrieval deals.

The lifecycle of a storage deal is as follows:

1. Discovery

The client identifies miners and determines their current asks, i.e. the price per GiB per epoch (one epoch is 30 seconds), in attoFIL, that miners want to receive in order to accept a deal. Currently a deal in Filecoin has a minimum duration of 180 days.
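
The units above can be turned into a quick cost estimate. The sketch below converts a duration in days into epochs and multiplies it by a price per GiB per epoch. The helper names are hypothetical, and rounding down to whole attoFIL is an assumption of this sketch rather than a statement about how any node implementation rounds.

```python
ATTOFIL_PER_FIL = 10**18  # 1 FIL = 10^18 attoFIL
EPOCH_SECONDS = 30
GIB = 1 << 30


def epochs_in_days(days: int) -> int:
    """Number of 30-second epochs in the given number of days."""
    return days * 24 * 60 * 60 // EPOCH_SECONDS


def deal_cost_attofil(price_per_gib_epoch: int, piece_size_bytes: int,
                      duration_epochs: int) -> int:
    """Total deal price in attoFIL, rounded down (assumption)."""
    return price_per_gib_epoch * piece_size_bytes * duration_epochs // GIB


min_duration = epochs_in_days(180)  # minimum deal duration in epochs
example_cost = deal_cost_attofil(10**11, GIB, min_duration)  # example price
print(min_duration, example_cost / ATTOFIL_PER_FIL)
```

With an example price of 100000000000 attoFIL per GiB per epoch, storing 1 GiB for the minimum 180 days (518400 epochs) comes to 0.05184 FIL.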

You can list all currently active miners by querying the JSON RPC API of a synced node (for test purposes we will be using the https://api.node.glif.io public endpoint), with the Filecoin.StateListMiners method:

curl -X POST \
  -H "Content-Type: application/json" \
  --data '{ "jsonrpc": "2.0", "method": "Filecoin.StateListMiners", "params": [ null ], "id": 1 }' \
  'https://api.node.glif.io' | jq
{
  "jsonrpc": "2.0",
  "result": [
    "f011303",
    "f011092",
    "f011417",
    ...
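
The same request can be issued from a script. The following Python sketch builds the JSON-RPC body used in the curl command and posts it with the standard library only. The helper names `build_request` and `rpc_call` are hypothetical, and the endpoint is passed in as a parameter because the availability of any particular public node is an assumption.

```python
import json
import urllib.request


def build_request(method: str, params: list, request_id: int = 1) -> bytes:
    """Serialize a JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(body).encode("utf-8")


def rpc_call(endpoint: str, method: str, params: list):
    """POST a JSON-RPC request and return its "result" field."""
    request = urllib.request.Request(
        endpoint,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

Calling `rpc_call("https://api.node.glif.io", "Filecoin.StateListMiners", [None])` would mirror the curl example, with Python's `None` serialized as the JSON `null` parameter.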

You might want to decide on a specific provider based on their reputation or power in the network. Reputation metrics for miners are not part of the Filecoin protocol, and out of scope for this post.

Once you have decided on a specific provider/miner, you need to fetch its PeerID, for example with the Filecoin.StateMinerInfo JSON RPC API, so that you can cryptographically verify their StorageAsk and make sure that nobody is impersonating them:

curl -X POST \
  -H "Content-Type: application/json" \
  --data '{ "jsonrpc": "2.0", "method": "Filecoin.StateMinerInfo", "params": [ "f03274", null ], "id": 1 }' \
  'https://api.node.glif.io' | jq
{
  "jsonrpc": "2.0",
  "result": {
    "Owner": "f03261",
    "Worker": "f03261",
    ...
    "PeerId": "12D3KooWP5D9TmqC45i6L2e2qQHYcuxaUwPdYo6CzqUMVmFEH3N9",
    ...

You can then query for a signed StorageAsk with the Filecoin.ClientQueryAsk JSON RPC API:

curl -X POST \
  -H "Content-Type: application/json" \
  --data '{ "jsonrpc": "2.0", "method": "Filecoin.ClientQueryAsk", "params": [ "12D3KooWP5D9TmqC45i6L2e2qQHYcuxaUwPdYo6CzqUMVmFEH3N9", "f03274" ], "id": 1 }' \
  'https://api.node.glif.io' | jq
{
  "jsonrpc": "2.0",
  "result": {
    "Price": "100000000000",
    "VerifiedPrice": "100000000000",
    "MinPieceSize": 256,
    "MaxPieceSize": 34359738368,
    "Miner": "f03274",
    "Timestamp": 148031,
    "Expiry": 1199231,
    "SeqNo": 14
  },
  "id": 1
}

The result includes details about the deals that this miner is willing to accept, such as the accepted range of piece sizes and the price per GiB per epoch. Note that a storage deal proposal which matches the miner’s storage ask is a necessary precondition, but it is not sufficient to ensure that the deal is accepted - the storage provider may run its own decision logic later on.
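
A client can run a basic sanity check against the ask before sending a proposal. The sketch below uses the fields from the sample response above. The function name is hypothetical, and the exact rules (offered price at least the ask price, ask not yet expired at the current epoch) are assumptions of this sketch. Passing the check does not guarantee acceptance.

```python
def proposal_matches_ask(ask: dict, piece_size: int,
                         price_per_gib_epoch: int, current_epoch: int) -> bool:
    """Check a planned proposal against a miner's StorageAsk fields."""
    size_ok = ask["MinPieceSize"] <= piece_size <= ask["MaxPieceSize"]
    # Prices arrive as decimal strings because attoFIL amounts are large.
    price_ok = price_per_gib_epoch >= int(ask["Price"])
    not_expired = current_epoch < ask["Expiry"]
    return size_ok and price_ok and not_expired


# Fields copied from the sample ClientQueryAsk response.
sample_ask = {
    "Price": "100000000000",
    "VerifiedPrice": "100000000000",
    "MinPieceSize": 256,
    "MaxPieceSize": 34359738368,
    "Miner": "f03274",
    "Timestamp": 148031,
    "Expiry": 1199231,
    "SeqNo": 14,
}
```

For instance, a 1 GiB piece offered at the ask price while the chain is at epoch 200000 passes the check, whereas a 128-byte piece falls below MinPieceSize and does not.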

2. Negotiation

3. Publishing

The deal is published on-chain, making the storage provider (miner) publicly accountable for the deal.

4. Handoff

Once the deal is published on-chain, it is handed off to the Storage Mining subsystem, which places the deal in a sector that the miner then seals.

The Storage Mining subsystem

The Storage Mining subsystem ensures miners can effectively commit storage to the Filecoin network and:

  1. Participate in the Filecoin Storage Market by taking on client data and participating in storage deals.

  2. Participate in Filecoin Storage Power Consensus, verifying and generating blocks to grow the Filecoin blockchain and earning block rewards and fees for doing so. This is out of scope for this post.

It oversees the following processes:

In order to register a sector in Filecoin, a miner must seal the sector. Sealing is a computation-heavy process that produces a unique representation of the data in the form of a proof, called Proof-of-Replication or PoRep. Once the proof has been generated, the miner compresses it and submits the result to the blockchain. This is a certification that the miner has indeed replicated a copy of the data they agreed to store.

Every storage miner must continuously submit proofs on-chain to prove that they continue to store their sectors.

Failing to submit the proofs mentioned above for a given sector will result in a fault, and the miner will be subject to penalties.

Storage miner and client considerations

Storage deals are published on-chain before they are active and sealed. This is important because publishing a deal locks clients’ funds on-chain, so the miner has assurance that if they do the work of sealing data in a sector, they will get paid for it.

It helps to think of publishing a deal on-chain as signing a contract, and of sealing and activating a deal as essentially starting to do the work the miner signed the contract for.

From the perspective of a client who wants to store data on Filecoin, the deal passes roughly through the following stages:

From the perspective of a miner who wants to provide a service to the client and store their data, the deal passes roughly through the following stages:

The Retrieval Deal Flow

Retrieval deals, unlike storage deals, happen mostly off-chain and are facilitated by payment channels. Redeeming vouchers from payment channels is the only part of the process which involves interacting with the Filecoin blockchain.

  1. Discovery - the client identifies miners who have the data that it needs and requests information from them - price per byte, unseal price, payment interval

  2. Payment channel setup - the client sets up a payment channel between them and the miner (if one doesn’t already exist)

  3. Data transfer with payment - the miner sends data to the client until payment is required. Payment processing is requested when a certain threshold is reached, and data transfer continues once the payment has been made. Depending on whether the miner has the data in their block store, they might first need to unseal it - an expensive and slow operation, which is the inverse of the sealing described in the section about storage deals.
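
The pay-as-you-go transfer in step 3 can be sketched as a simple schedule of payment requests. This is a simplified model under stated assumptions: the function name is hypothetical, the unseal price is requested once before any data is sent, and the payment interval stays fixed for the whole transfer.

```python
from typing import List, Tuple


def payment_schedule(total_bytes: int, price_per_byte: int,
                     payment_interval: int,
                     unseal_price: int = 0) -> List[Tuple[int, int]]:
    """Return (bytes_sent_so_far, amount_due) for each payment request."""
    if payment_interval <= 0:
        raise ValueError("payment_interval must be positive")
    requests = []
    if unseal_price:
        # Assumption: unsealing is paid for before any data flows.
        requests.append((0, unseal_price))
    sent = 0
    while sent < total_bytes:
        chunk = min(payment_interval, total_bytes - sent)
        sent += chunk
        # Transfer pauses here until the client pays for this chunk.
        requests.append((sent, chunk * price_per_byte))
    return requests
```

Each tuple marks a point at which the miner would stop sending and wait for a voucher, so the client never pays for much more data than it has received.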

Proof-of-Spacetime

In the sections above we glossed over many details that make Filecoin unique and provide probabilistic guarantees on data to users. In this section we will cover some of the proofs that Filecoin utilises and explain how they fit into the protocol and what problems they address.

Proof-of-Spacetime (PoSt) is a procedure by which a storage miner can prove to the Filecoin network they continue to store a unique copy of some data on behalf of the network.

Proof-of-Spacetime manifests in two distinct varieties in Filecoin today:

  1. Window Proof-of-Spacetime (WindowPoSt) and

  2. Winning Proof-of-Spacetime (WinningPoSt)

In this blog post we will only cover the former.

Window Proof-of-Spacetime

Window Proof-of-Spacetime (WindowPoSt) is the mechanism by which the commitments made by storage miners are audited by the Filecoin blockchain.

Every storage miner should maintain their pledged sectors. These sectors either contain deals made with clients or are empty sectors, called committed capacity: miners can make capacity commitments by filling a sector with arbitrary data rather than with client data. Maintaining these sectors allows the storage miner to provably demonstrate that they are reserving space on behalf of the network.

Every day is broken down into an array of windows, currently 48 of them, each with a duration of 30 minutes (60 epochs, since 1 epoch is equal to 30 seconds).

Each storage miner’s set of pledged sectors is partitioned into subsets, one subset for each window.

Within a given window (30 minutes), each storage miner must submit a Proof-of-Spacetime for each sector in their respective subset. This requires ready access to each of the challenged sectors, and results in a zk-SNARK proof published to the Filecoin blockchain as a message in a block. In this way, every sector of pledged storage is audited at least once in any 24-hour period, and a permanent, verifiable, and public record attesting to each storage miner’s continued commitment is kept.

[Figure: WindowPoSt sample deadlines]

In the diagram above, we see that a sample miner should submit WindowPoSt proofs in deadline 0 (> 16TB), deadline 1 (< 8TB) and deadline 2 (< 8TB), with the majority of their sectors falling in deadline 0. Deadlines for each miner are randomised, and for this specific miner they start in epochs 1635, 1695, and 1755 respectively. You can inspect these deadlines and more details about miners on SpaceGap.
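
The window arithmetic can be checked with a few lines of Python. The sketch below derives the number of epochs per window from the figures in the text and maps an epoch to a deadline index. The function names are hypothetical, and the proving period start of 1635 is simply the first sample epoch quoted above.

```python
EPOCH_SECONDS = 30
EPOCHS_PER_DAY = 24 * 60 * 60 // EPOCH_SECONDS          # 2880 epochs
WINDOWS_PER_DAY = 48
EPOCHS_PER_WINDOW = EPOCHS_PER_DAY // WINDOWS_PER_DAY   # 60 epochs


def deadline_open_epoch(index: int, period_start: int) -> int:
    """Epoch at which the given deadline opens in the first proving period."""
    return period_start + index * EPOCHS_PER_WINDOW


def deadline_index(epoch: int, period_start: int) -> int:
    """Index (0..47) of the deadline window that contains the epoch."""
    return ((epoch - period_start) % EPOCHS_PER_DAY) // EPOCHS_PER_WINDOW


sample_start = 1635  # first deadline of the sample miner
print([deadline_open_epoch(i, sample_start) for i in range(3)])
```

Under these assumptions, deadlines 0, 1 and 2 open at epochs 1635, 1695 and 1755, which matches the sample miner, and the same windows recur every 2880 epochs.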

The Filecoin network expects constant availability of stored data. Failing to submit WindowPoSt for a sector will result in a fault, and the storage miner supplying the sector will be slashed. This incentivises healthy behaviour from storage miners.

Faults

A fault occurs when a proof is not included in the Filecoin blockchain within the proving period, as a result of loss of network connectivity, storage malfunction, or malicious behaviour.

When a fault is registered for a sector, the Filecoin network will slash the storage miner that is supposed to be storing the sector; that is, it will assess penalties to the miner (to be paid out of the collateral fronted by the miner) for their failure to uphold their pledge of storage.

There are three types of sector fault fees:

  1. Sector fault fee - This fee is paid per sector per day while the sector is in a faulty state. The size of the fee is slightly more than the amount the sector is expected to earn per day in block rewards. If a sector remains faulty for more than 2 consecutive weeks, a termination fee is charged for the sector and it is removed from the chain state.

  2. Sector fault detection fee - This is a one-time fee paid in the event of a failure if the miner does not report it honestly and the unreported failure is instead caught by the blockchain. Given the probabilistic nature of PoSt checks, this is set to a few days' worth of the block reward that a particular sector would be expected to earn.

  3. Sector termination fee - A sector can be terminated before its expiration through automatic faults or miner decisions. A termination fee is charged that is, in principle, equivalent to how much a sector has earned so far, up to a limit in order to avoid discouraging long sector lifetimes.
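
To see how the three fees combine, consider the following Python sketch. All numeric factors are hypothetical placeholders passed in by the caller, since the text only describes them qualitatively ("slightly more than", "a few days' worth", "up to a limit"). Only the two-week termination threshold is taken from the description above.

```python
from dataclasses import dataclass

MAX_FAULTY_DAYS = 14  # termination after more than 2 consecutive weeks


@dataclass
class FeeParams:
    fault_fee_factor: float    # hypothetical: daily fee as a multiple of daily reward
    detection_fee_days: float  # hypothetical: days of reward charged if undeclared
    termination_cap: float     # hypothetical: upper limit on the termination fee


def total_penalty(days_faulty: int, daily_reward: float, earned_so_far: float,
                  declared: bool, params: FeeParams) -> float:
    """Sum the fault, detection and termination fees for one sector."""
    charged_days = min(days_faulty, MAX_FAULTY_DAYS)
    fee = charged_days * daily_reward * params.fault_fee_factor
    if days_faulty > 0 and not declared:
        # One-time fee when the chain catches an unreported fault.
        fee += params.detection_fee_days * daily_reward
    if days_faulty > MAX_FAULTY_DAYS:
        # Termination fee tracks past earnings, but only up to a cap.
        fee += min(earned_so_far, params.termination_cap)
    return fee
```

The structure makes the incentives visible: declaring a fault avoids the detection fee, and recovering a sector within two weeks avoids the termination fee.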

Read more about faults and the economics around them on the Filecoin Spec website.

Conclusion

In this post we have covered some of the concepts related to storing and retrieving data on Filecoin, the protocols that clients and miners engage in to make that happen, and the different proofs and guarantees involved in the process. We discussed the flow for storage and retrieval deals from the perspective of clients and miners, as well as the penalties that the Filecoin protocol enforces when one of the parties misbehaves. Ultimately, we covered some of the foundations of the Filecoin protocol that govern the Filecoin network and make it a reliable and trustless decentralised storage network.
