# FIP-0019

Title | Snap Deals |
---|---|

Author | @Kubuxu, @lucaniz, @nicola, @rosariogennaro, @irenegia |

Discussions-To | https://github.com/filecoin-project/FIPs/issues/145 |

Status | Final |

Type | Core |

Created | 2021-08-24 |

## Spec Sections

# FIP-0019: Snap Deals

## Simple Summary

A one-message protocol for updating any sector with new data without re-sealing.

## Abstract

Storage provider embeds deal data into existing sector replica with some unpredictable randomness by “xor”-ing.

The miner generates one message which proves that sector was updated with new data.

## Change Motivation

Since 90+% of sectors in the Filecoin Network are CC sectors, having a protocol that allows for updating CC sectors to store real data without incurring in a full re-sealing would massively improve our network in terms of the amount of real data stored through it.

- It would allow
**decoupling sealing latency from deal-making speed**- offering storage clients an improved experience for how quickly their data can land in on-chain deals - It would
**unlock the 8EiB of storage already committed to the Filecoin network**to be quickly used for deals - enabling a 100PiB+ client to make deals for their entire dataset with a single miner like f0127595 which already has 120PiB of committed capacity. - It makes
**utilizing existing committed capacity much cheaper for miners**(since they’ve already invested the sealing cost), increasing the chances they pay clients to add FIL+ data to these sectors.

What this FIP does not support (see Future Work section):

- Deal updates or updating sectors with existing deals
- Moving deals across sectors

## Specification

### Nomenclature

**SectorKeyCID** - on-chain commitment of CC sector replica

**UnsealedSectorCID** - commitment of deal data generated from PieceCIDs

**SealedSectorCID** - commitment of sector replica with deals embedded in it

**EncodingRands** - collection of **HashShards** randomness values used for encoding process

**Poseidon-128** - SNARK friendly hash function

### Parameters

**ChallengesNumber** = 1376 - number of challenges to be proven

**HashShards** = 512 - number of hashes used for randomness in encoding process

### One-messages Update Protocol

**Storage Provider accepts and publishes storage deals (using**`PublishStorageDeals`

)**If necessary storage provider extends sector expiration so that sector is active long enough to contain deals with**`miner.ExtendSectorExpiration`

**Storage Provider updates an existing replica with deals**:- Pre-generate HashShards possible number of
`EncodingRand(ComputeUnsealedCID(P1, P2, P3, ...), i)`

where`P1, P2, P3, ...`

are pieces corresponding to deals that will included in this sector. - Encode the deal data into existing replica using
`newReplica[i] = Enc(sectorKey[i], data[i], EncodingRand(o))`

function:`newReplica`

is the vector of`Fr`

elements of the new replica`sectorKey`

is the vector of`Fr`

elements of the CC sector replica data`data`

is the vector of`Fr`

elements of the deals data ordered and padded as done in the sealing subroutine`NodeSectorSize`

is size of the sector in nodes (32 byte chunks)

- Compute SealedSectorCID of the
`newReplica`

.

- Pre-generate HashShards possible number of
**Storage Provider produces a proof of sector update.**Generate a SNARK that proves:- Generation of ChallengesNumber challenges in batches of 8:
- For
`j=0..ChallengesNumber//8`

,`[_, c[8*j+7], c[8*j+6], ..., , c[8*j] = Split-31bit(Poseidon-128(Poseidon-128(UnsealedSectorCID, SealedSectorCID), j)`

. The`Split-31bit`

splits the 255bit output of Poseidon-128 hash into 8 31 bit chunks and discards 7 high bits. `c[i] = c[i] mod NodeSectorSize`

- For
- For each challenge
`i=0..ChallengesNumber`

,`c = c[i]`

- Encoding: the following we correcty computed:
`newReplica[c] = Enc(sectorKey[c], data[c], Poseidon-128(UnsealedSectorCID, c*HashNumber//NodeSectorSize)`

- Inclusion proofs:
`newReplica[c]`

is the opening of`UpdatingSectorCID`

at position`c`

`sectorKey[c]`

is the opening of`CommRLast`

from`SealedSectorCID`

at position`c`

`data[c]`

is the opening of`UnsealedSectorCID`

at position`c`

- Encoding: the following we correcty computed:

- Generation of ChallengesNumber challenges in batches of 8:
**Storage Provider publishes**: 1. Validate input parameters`Miner.ReplicaUpdate(SectorID, NewSealedSectorCID, deals []DealID, Proof)`

- Activate Deals
- Generate
`NewUnsealedSectorCID`

(CommD) based on DealIDs and market actor state - Verify the proof of correct data update:
- Verify that
`proof`

is valid for using inputs:`NewSealedSectorCID`

,`NewUnsealedSectorCID`

,`SectorKey`

.

- Verify that
- Update SectorInfo

### Algorithms

#### Encoding Randomness

`EncodingRand(UnsealedSectorCID, i) = Poseidon-128(UnsealedSectorCID, i*HashShards//NodeSectorSize)`

Intuitively the sector nodes (32byte chunks) are split into **HashShards** number of continuous segments with
the same encoding randomness.

#### Encoding

The current encoding algorithm is the following:

`Enc(sectorKey[i], data[i]) = sectorKey[i] + data[i] * EncodingRand(i)`

The Encoding function is cheap and allows for parallel encoding.

#### Decoding

`Dec(sectorKey[i], replica[i]) = (replica[i] - sectorKey[i]) * (EncodingRand(i)<sup>-1</sup>)`

The modular inverse of `EncodingRand`

can be computed only **HashShards** times per whole operation.

### Retrieving updated data from encoded replica

- From the
`replica`

:- Regenerate
`sectorKey`

by re-sealing the sector - For each
`i`

:`Dec(sectorKey[i], replica[i], EncodingRand)`

- Regenerate

### Deprecation of current CC Upgrade

`PreCommitSector`

will no longer be allowed to schedule current cc upgrades during prove commit. The data on chain tracking cc upgrades and the miner actor logic doing current cc upgrades will be removed. PreCommit message parameters will no longer include fields for specifying cc upgrades.

### ProveReplicaUpdates Actor Method

#### Params

```
type SnapDealsUpdate struct {
SectorID SectorID
Deadline int
Partition int
Deals []DealID
Proof ReplicaUpdateProofs
}
```

`Miner.ProveReplicaUpdates(Updates []SnapDealsUpdate)`

#### Spec

Unless stated assume that each step operates on all inputs updates in a loop

**Param validation**- Assert that
`SectorID`

has no deals - Assert proof size is not above limits
- Assert batch size bounds are not above limits
- Assert that sector’s power is active. Fail if sector is in unproven, faulty or terminated states

- Assert that
**Market Activation and CommD generation**- Call
`market.ActivateDeals`

on all deals in the update - Call
`market.ComputeDataCommittment`

with batched inputs to generate all CommDs for all updates

- Call
**Proof verification**- Extend runtime with VerifyReplicaUpdate method

**Update Miner State**`SectorNumber`

i.e. the sector ID,`Expiration`

and`SealProof`

fields remain unchanged`SealedCID`

,`DealIDs`

,`DealWeight`

,`VerifiedDealWeight`

are all modified to match the upgraded sector with deals`InitialPledge`

is the max of CC sector’s InitialPledge and the value computed with the upgrade power at the upgrade epoch`ReplacedDayReward`

is assigned the value of`ExpectedDayReward`

and`ExpectedDayReward`

is recalculated, as in current CC upgrade`ReplacedSectorAge`

is set to`upgradeEpoch - Activation`

as in current CC upgrade`ExpectedStoragePledge`

is recalculated- If
`SectorKeyCID`

is null, set it to`SealedCID`

of the CC sector being replaced `Activation`

is set to current epoch- Call
`ReplaceSectors`

on all partitions with updates

**Update Power**`partition.ReplaceSectors`

will update miner deadline state to correct power- Call
`power.UpdateClaimedPower`

to update power actor state.

#### Batching and Error Handling

This method accepts a batch of updates to be processed in one message call. As much as possible error handling should ignore updates that do not pass validation and only fail if all updates fail validation or if the entire operation fails.

### Upgrade and State Migration

- Migration removing precommit info about cc upgrades in the PreCommit map
- PreCommit params will lose the fours cc upgrade fields
- Add
`SectorKey`

a new field to the sector info. It should be a pointer to a cid serializing to cbor null when no snap deals upgrade has occurred and becoming the original`SealedCID`

after the first snap deals update

## Design Rationale

### Encoding Function

The aim of the encoding function is to allow encoding data into an existing sector key without breaking the compressibility assumption of PoRep. …

### Parameters choice

`ChallengeNumber`

: 1376 - to facilitate required security factor with Fiat-Shamir (TODO expand in Security Considerations).`HashShards`

: 512 - to reach required spacegap fraction without neadlessly increasing the cost of conducting protocol (TODO explanation).

### Limitations

The update protocol is limited to CC sectors as the `SectorKey`

commitment is only avaliable for sectors with no data. The SectorKey commitment is currently included as `CommRLast`

commitment inside the on-chain `SealedSectorCID`

when the sector is CC. When the sector contains deals natively the `SealedSectorCID`

includes only `CommRLast+Data`

commitment.

## Backwards Compatibility

This change is not backwards compatible and will require a new actors version and a network upgrade.

## Test Cases

TODO

## Security Considerations

### Loosing epsilon-Replication

This section is a summary of more detailed report.

**Assumption 1**: We work under the assumption that for each bit grinded on the `{\rhoi}i=1,...,N`

an adversary can flip one bit of her choice on the original `R`

.

We consider `\lambda = 80`

as the security parameter. This means that we give the adversary a bit flip budget of 80 bits on `R`

(by crafting `R*`

such that these bits are flipped, keeping all the rest untouched) in order to make R* more compressible than `R`

.

Observation: As we will see in what follows, in the case of our actual encoding (where a single `\rhoi`

is shared among *K* distinct nodes Dj), we’ll be either more conservative than we should, according to Kolmogorov bounds.

In order to stay on the safe side, we have two independent security analysis, following two different paths that give us similar results.

The first analysis is a probabilistic one, and we refer to it as the “security against the flipping adversary”. The second one is using an information theoretical argument based on Kolmogorov complexity.

#### Analysis 1: Security against “2 bytes for 1 bit” Adversary

**Preliminaries:**
Given a predetermined string of L bits, we take for granted that a random string has probability 2^{-L} to contain it.
We will consider bit strings of K_consecutive `0`

s (or `1`

s). Note that it is the same when considering patterns like `01`

or `10`

, the only difference is that those sequences would allow for less compression.
**Assumption 2**: An adversary can compress ~2 consecutive bytes (we would consider 2 bytes per node, but it can happen anywhere) by flipping a single bit.
This means having substrings s that are of the form `s = 0000000100000000`

(or its negation).
Note that the probability of `s`

(or negation of `s`

) to happen is 2^{-15}.
Since the adversary can adaptively choose the bits to flip, we are just assuming he finds 80 of these strings in a 32GiB string. 2 bytes out of 32 represent the 6.25% of a single node

Note 1: Remember that each randomness `\rhoi`

is shared among K nodes, have 2^{30}/K different hashes.
As a consequence, we consider each compression performed on a node D to propagate on all the K nodes that share the same randomness with D.

**Assumption 3**: An adversary A is able to find a string of the form of `s = 000000010000000`

(or equivalent) in 80 different nodes, each of those uses a different `\rhoi`

Note 2: Assumption 3 maximizes the propagation of the compression made by the adversary since all the local compression propagates to K nodes each. This results in a total compression of 80*0.0625*K nodes out of 2^{30}

With H = 2^{9} and K = 2^{21} we obtain a total compression under 1%.

Assume we have 2^{9} different hashes. This means that we have a total compression, according to Assumption 2 and Assumption 3 of ~1%.

#### Analysis 2: Based on Kolmogorov Complexity (Information Theoretic Argument)

**Information Theoretic Argument**: Given a string `S`

of size N with incompressibility V, for any adversary that modifies `S`

into `S'`

by flipping T bits of his choice, we have that the incompressibility of `S'`

is at least *V - T*(log(N) + log(log(N))+1)*.
**Proof**: Assume by contradiction that there exists a string `S''`

which is more compressible than `S'`

.
Then it means that it is possible to compress `S`

into a string `S*`

of length *| S’’ | + T*(log(N) + 2*log(log(N))+1) < V* .

Note: *T*(log(N) + 2*log(log(N))+1)* are the bits needed in order to be able to store the T bit positions where the adversary flipped the bits, using self delimiting codes (References: this paper, page 9; this book, Section 2.2).

Given the following:
We assume `R`

to be incompressible, and we then set the incompressibility of `R`

to `N`

.
We assume an adversary can flip 80 bits of his choice in `R`

.
To be conservative, given that we are considering 2^{30}/K different hashes, we apply the compressibility lower bound on a vector of size N’ = 2^{30}/K nodes

According to the lower bound, we get that an adversary that can flip 80 bits of his choice on `R`

can not compress `R`

to a string that has compressibility smaller than *N’ - 80(log(N’) + 2*log(log(N’)) + 1)*.

In order to be even more conservative, we are using `N`

instead of `N'`

in the compressibility reduction factor. We are then considering that R can not be compressed to a string which is less compressible than 2^{30}/K - 80(log(2^{30}) + log(log(2^{30})) + 1) = 2^{30}/ K - 80(30+11) .

As a consequence, we have that an adversary can compress a total of 80(30+11)*K bits out of 2^{30}.

This means that, for K = 2^{21} and 2^{9} = 512 different hashes, we have that the additional spacegap is < 2.51%. Given in our case the distribution of the last bit of each node is not uniform, we also considered the case of nodes of 255 bits rather than 256. Spacegap in this case is < 2.5123%.
If we allow for 1024 hashes, additional spacegap is bounded by < 1.26%. If as above we consider nodes of bit length 255 instead of 256 we still obtain that an additional spacegap bounded by < 1.26%

### Poseidon-128

Poseidon-128 is zk-SNARK friendly hash function which is already widely used in Filecoin protocol. Both uses of Poseidon-128 include Domain Separation Tags such that the usage here does not clash with possible future use cases.

Domain Separation Tags used:

- Encoding Randomness Generation:
`2^40`

- Challenge Generation:
`2^40`

Domain Separate Tags for both are the same as their inputs are different.

## Incentive Considerations

SnapDeals does not allow storage providers to release currently held collateral for the sector but it allows to reuse that already held collateral if collateral required is greater, for example due to verified deal.

## Product Considerations

Snap Deals protocol significantly reduces the time needed for data to be included in a sector and confirmed on-chian. The process was designed with cost for storage providers in mind.

## Implementation

## Future work

- DeclareDeals to support deal transfer: allow moving deals from sector to sector
- CapacityDeals: allow purchasing a capacity of the storage of a storage provider instead of the storage of a specific deal
- Update protocol that does not require to perform an operation on a full sector.
- DealUpdates: a license to terminate a deal in place for a new one (e.g. a user wants to update a portion of their files)

## Copyright

Copyright and related rights waived via CC0.

## Citation

Please cite this document as:

@Kubuxu, @lucaniz, @nicola, @rosariogennaro, @irenegia, "FIP-0019: Snap Deals," *Filecoin Improvement Proposals*, no. 0019, August 2021. [Online serial]. Available: https://fips.filecoin.io/fips/fip-0019.