FIP: 0076

Title: Direct data onboarding
Author: Alex North, @zenground0
Discussions-To: https://github.com/filecoin-project/FIPs/discussions/730
Status: Accepted
Type: Technical
Category: Core
Created: 2023-08-24

FIP-0076: Direct data onboarding

Simple Summary

Adds new ProveCommitSectors3 (method 34) and ProveReplicaUpdates3 (method 35) methods to the miner actor which support committing data to sectors and claiming verified allocations without requiring a built-in market (f05) deal. Adds a provider→sector→deal mapping in the built-in market actor and deprecates the storage of deal IDs in sector metadata in the miner actor. Removes the legacy PreCommitSector (method 6) and PreCommitSectorBatch (method 25) in favour of the existing PreCommitSectorBatch2 (method 28) which requires the sector unsealed CID to be specified while pre-committing. Removes the legacy and unused ProveReplicaUpdates2 (method 29).

Existing onboarding flows that use the built-in market actor remain fully supported, but optional.

Abstract

The only mechanism today by which storage providers (SPs) are permitted to commit data in sectors is to complete a deal with the built-in storage market actor. The built-in market actor is expensive in gas, and offers only basic functionality. Requiring its use raises costs for those who don’t need its features and limits the utility and value of sectors, and the applications that can be built on Filecoin.

This cost and inconvenience is unnecessary in the common cases of verified deals with no client payments, and storage arrangements that do not require any on-chain deal. The verified registry actor already records DataCap allocation “deal” terms on-chain; duplicating this information in a built-in market deal is an unnecessary cost.

This proposal adds new onboarding methods which support direct commitment of data into sectors, without necessary reference to any deal. This provides gas-cheap data onboarding for many use cases. It also introduces a new scheme for deal activation which can support deals (and other data-application transactions) brokered by user-programmed smart contracts. This new scheme is initially limited to the built-in market actor, but can be extended to other actors in the future.

Change Motivation

The built-in market actor is expensive to use, but provides very little utility to the majority of data onboarding use cases. A vast majority of deals today are simple Fil+ verified deals with no on-chain client payment. Since the verified registry actor already records all relevant information about DataCap allocation terms (the client and provider, piece commitment, and duration), the built-in market actor’s replication of this data is unnecessary. The built-in market plays no necessary role in allocating or accounting for QA power. Publishing deals is the largest on-chain cost of onboarding data and consensus power, currently consuming a large fraction (about half) of total chain bandwidth. If the built-in storage market actor were bypassed for the common cases of un-paid or off-chain settled deals, both verified and unverified data could be onboarded at a significant reduction in gas cost.

The built-in market actor is also very limited. It supports only a single, simple deal schema and policies, and does not support a range of desirable market features. User-programmed smart contracts are restricted in the functionality that they can provide by the necessity of using the built-in market actor as an intermediary for any data commitments. They are also restricted by the lack of any hooks into the built-in miner actor to be notified of sector data commitments.

This proposal aims to:

  • Support data onboarding, including Fil+ verified data, with no intermediary actor;
  • Provide a new scheme for data activation notifications from storage miner actors, which can later support user-programmed smart contracts to function as data storage applications (including as markets);
  • Continue supporting existing onboarding methods to give participants time to migrate their workflows.

Specification

Overview

Direct data onboarding comprises changes to the built-in miner and market actors to provide new onboarding methods that do not require a built-in market deal. These new methods accept additional parameters from the storage provider which specify the pieces of data activated in the sector, any verified allocations being claimed, and the address of any actors to notify about the successful data commitment. Deals with the built-in storage market actor are possible, but not necessary.

In the direct onboarding flow:

  • [for verified data only] a client (or authorized delegate) makes a verified allocation directly with the verified registry by transferring DataCap tokens to it (this is already possible today),
  • [for on-chain paid deals only] an SP publishes storage deals to the built-in market actor (also already possible today),
  • at pre-commit, an SP must specify a sector’s data commitment (unsealed CID), but does not need to specify the structure of that data nor any deals or verified allocations,
  • at prove-commit or replica-update, an SP specifies the pieces of data comprising a sector, and for each piece may nominate:
    • a verified allocation to claim, and/or
    • an actor to notify of the commitment (e.g. to activate a deal).

The miner actor verifies that the pieces of data claimed by the SP correspond to the data commitment proven. The verified registry is involved only if verified data is being claimed. The built-in market actor need only be involved if an on-chain-settled deal is required (i.e. if payment is non-zero). Otherwise, the SP simply commits data directly to their sector with no unnecessary overhead.

Existing onboarding methods are retained and remain fully supported, but optional.
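
As a rough illustration of the direct onboarding flow described in this overview, the following sketch lists which built-in actors take part when a single piece is committed. The function name and the string labels are hypothetical and are not part of the specification.

```python
from typing import List


def actors_involved(claims_allocation: bool, needs_on_chain_deal: bool) -> List[str]:
    """List the built-in actors touched when one piece is committed (illustrative only)."""
    # The miner actor always checks the pieces against the proven data commitment.
    actors = ["miner"]
    if claims_allocation:
        # Only verified data involves the verified registry.
        actors.append("verified_registry")
    if needs_on_chain_deal:
        # Only on-chain-settled deals involve the built-in market actor (f05).
        actors.append("market")
    return actors
```

Under these assumptions, unverified data with no on-chain payment touches only the miner actor.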

Storage miner actor

No changes to miner state schemas are necessary, but the deal IDs stored with each sector’s metadata are no longer used. Any deal-related information that remains in the miner actor state will be incomplete and should not be used. A sector→deal association will be stored in the built-in market actor instead (see below).

The interpretation of some per-sector metadata fields changes slightly.

State

SectorPreCommitInfo

The DealIDs field remains in use to support existing onboarding methods. It is required to be empty by the new ProveCommitSectors3 method.

Pre-commit deal IDs may be removed in a future FIP when deprecating the old methods.

SectorOnChainInfo

The per-sector DealIDs field is deprecated and to be ignored. The miner actor will not write new deals to it.

The DealWeight field is interpreted to carry the weight of any non-zero, unverified data in a sector, not only the weight of deals made via the built-in market actor. Similarly, the VerifiedDealWeight field carries the weight of any verified data, regardless of whether a “deal” was made via the built-in market actor.
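
A minimal sketch of how the two weight fields could be derived from a sector's pieces under the new interpretation. It assumes that weight is space-time (padded piece size multiplied by the sector's remaining duration in epochs), which this section does not spell out; the `Piece` type and `sector_weights` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Piece:
    size: int       # padded piece size in bytes (non-zero data only)
    verified: bool  # True if the piece claims a verified allocation


def sector_weights(pieces: List[Piece], duration_epochs: int) -> Tuple[int, int]:
    """Return (DealWeight, VerifiedDealWeight) for one sector.

    Assumption: weight = size * duration. The implicit zero piece that fills
    remaining capacity is not listed in `pieces`, so it contributes nothing.
    """
    deal_weight = 0
    verified_deal_weight = 0
    for piece in pieces:
        spacetime = piece.size * duration_epochs
        if piece.verified:
            verified_deal_weight += spacetime
        else:
            deal_weight += spacetime
    return deal_weight, verified_deal_weight
```

Note that neither value depends on whether a built-in market deal exists for the piece.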

Per-sector deal IDs may be removed from state in a future migration.

Sector activation

The miner actor exports new methods for sector activation: ProveCommitSectors3 (method 34) and ProveReplicaUpdates3 (method 35). These new methods both support either batched or aggregated proofs, though aggregated proofs for replica updates are not yet possible (see https://github.com/filecoin-project/FIPs/discussions/752).

ProveCommitSectors3

This method rejects sectors with deal IDs specified at pre-commit to avoid operator confusion between the two distinct ways of specifying deals (one of which would have to be ignored). Such sectors must be activated with the existing ProveCommitSector or ProveCommitAggregate methods. This method does not fetch deal information from the built-in market actor. Instead, the pieces of data that comprise a sector are declared by the sender as a manifest.

Each piece in a manifest may declare a verified data allocation ID that it satisfies. The method will attempt to claim that allocation directly from the verified registry actor and, if successful, calculate quality-adjusted power according to the piece’s size. If unsuccessful, the containing sector will not be activated.

Each piece in a manifest may specify zero or more notifications, each comprising the address of an actor and a payload, to be delivered synchronously when the sector is activated. After successful activation, the method will invoke the SectorContentChanged method on the target actor(s) with the piece CID and payload. This functions as a synchronous notification that the piece has been committed, e.g. to a marketplace. SectorContentChanged will be invoked just once per recipient actor, with a message body describing all pieces to be notified to that actor.

The miner actor will reject attempts to notify any actor other than the built-in storage market actor (f05). This restriction may be lifted in a future FIP once the security considerations of calls into untrusted code are better understood and mitigated.

Excepting behaviour specified below, ProveCommitSectors3 performs the same logic and state transitions as the existing sector activation methods. It computes the same weight, pledge and power values, creates the same on-chain state, and charges the same fees. In brief, in execution of ProveCommitSectors3, the miner actor:

  1. validates that the pre-committed sectors are eligible for activation, rejecting any that specified deal IDs;
  2. verifies the sector seal proofs or aggregate proof, and hence the sealed and unsealed CIDs;
  3. computes an unsealed CID from the piece manifests and verifies it matches the proven one;
  4. claims any verified allocations specified in the piece manifest;
  5. computes weights, pledge, power, fees etc and activates the new sectors in state;
  6. notifies any actors specified in the piece manifests.

struct ProveCommitSectors3Params {
    // Activation manifest for each sector being proven.
    SectorActivations: []SectorActivationManifest,
    // Proofs for each sector, parallel to activation manifests.
    // Exactly one of SectorProofs or AggregateProof must be non-empty.
    SectorProofs: [][]byte,
    // Aggregate proof for all sectors.
    // Exactly one of SectorProofs or AggregateProof must be non-empty.
    AggregateProof: []byte,
    // The proof type for the aggregate proof (must be absent if no aggregate proof).
    AggregateProofType: Option<RegisteredAggregateProof>,
    // Whether to abort if any sector activation fails.
    RequireActivationSuccess: bool,
    // Whether to abort if any notification returns a non-zero exit code.
    RequireNotificationSuccess: bool,
}
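
The constraints stated in the comments of ProveCommitSectors3Params can be sketched as a validation routine. This is only an illustration: the `Params` type is a simplified, hypothetical stand-in (sector numbers replace full manifests), and a real implementation would perform further checks.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Params:
    sector_activations: List[int]  # sector numbers stand in for full manifests
    sector_proofs: List[bytes] = field(default_factory=list)
    aggregate_proof: bytes = b""
    aggregate_proof_type: Optional[int] = None


def validate_params(p: Params) -> None:
    """Raise ValueError if the proof fields violate the documented constraints."""
    has_individual = len(p.sector_proofs) > 0
    has_aggregate = len(p.aggregate_proof) > 0
    # Exactly one of SectorProofs or AggregateProof must be non-empty.
    if has_individual == has_aggregate:
        raise ValueError("exactly one of SectorProofs or AggregateProof must be non-empty")
    # Individual proofs are parallel to the activation manifests.
    if has_individual and len(p.sector_proofs) != len(p.sector_activations):
        raise ValueError("SectorProofs must be parallel to SectorActivations")
    # The aggregate proof type must be absent if there is no aggregate proof.
    if not has_aggregate and p.aggregate_proof_type is not None:
        raise ValueError("AggregateProofType must be absent without an aggregate proof")
```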

// Data to activate a commitment to one sector and its data.
// All pieces of data must be specified, whether or not claiming a verified allocation or being
// notified to a data consumer.
// An implicit zero piece fills any remaining sector capacity.
struct SectorActivationManifest {
    // Sector to be activated.
    Sector: SectorNumber,
    // Pieces comprising the sector content, in order.
    Pieces: []PieceActivationManifest,
}

struct PieceActivationManifest {
    // Piece data commitment.
    CID: Cid,
    // Piece size.
    Size: PaddedPieceSize,
    // Identifies a verified allocation to be claimed.
    VerifiedAllocationKey: Option<VerifiedAllocationKey>,
    // Synchronous notifications to be sent to other actors after activation.
    Notify: []DataActivationNotification,
}

struct VerifiedAllocationKey {
    Client: ActorID,
    ID: AllocationID,
}

struct DataActivationNotification {
    // Actor to be notified.
    Address: Address,
    // Data to send in the notification.
    Payload: []byte,
}

// Note transparent serialization of single-element struct.
struct ProveCommitSectors3Return {
    // Success/fail of each input in order.
    ActivationResults: BatchReturn
}

// This BatchReturn type is an existing structure used for batched results in other built-in actors.
struct BatchReturn {
    // Total successes in batch
    SuccessCount: u32,
    // Failure code and index for each failure in batch
    FailCodes: []FailCode,
}
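
To show how a caller might read a BatchReturn, here is a small sketch that builds one from per-sector exit codes and looks up the result for a given input index. The helper names are hypothetical; failures are held as (Idx, Code) pairs, and exit code 0 is taken to mean success.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

OK = 0  # exit code for success


@dataclass
class BatchReturn:
    success_count: int = 0
    # (Idx, Code) pairs, one per failed input, in input order.
    fail_codes: List[Tuple[int, int]] = field(default_factory=list)


def summarize(exit_codes: List[int]) -> BatchReturn:
    """Build a BatchReturn from one exit code per input."""
    ret = BatchReturn()
    for idx, code in enumerate(exit_codes):
        if code == OK:
            ret.success_count += 1
        else:
            ret.fail_codes.append((idx, code))
    return ret


def code_at(ret: BatchReturn, idx: int) -> int:
    """Return the exit code recorded for input idx, or OK if it did not fail."""
    for fail_idx, code in ret.fail_codes:
        if fail_idx == idx:
            return code
    return OK
```

Only failures are listed explicitly, so a batch in which every input succeeds carries an empty failure list.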

struct FailCode {
    Idx: u32,
    Code: ExitCode,
}

FVM syscalls

The gas cost for the batch_verify_seals syscall is changed to 42M gas per proof. This value was previously a placeholder because the syscall was only ever invoked from call paths originating in the cron actor, which has unlimited gas budget. The new value has been determined empirically following a similar scheme to other gas prices.
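
As a quick worked example of the new price, the charge for a batch grows linearly with the number of proofs. The constant and function names below are illustrative only.

```python
BATCH_VERIFY_SEALS_GAS_PER_PROOF = 42_000_000  # 42M gas per proof, as specified above


def batch_verify_seals_gas(num_proofs: int) -> int:
    """Gas charged for one batch_verify_seals call over num_proofs proofs."""
    if num_proofs < 0:
        raise ValueError("num_proofs must be non-negative")
    return BATCH_VERIFY_SEALS_GAS_PER_PROOF * num_proofs
```

For instance, a batch of 10 proofs would be charged 420M gas for this syscall, excluding all other message costs.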

ProveReplicaUpdates3

The name ProveReplicaUpdates3 follows on from the now-removed ProveReplicaUpdates2 (method 29). Like ProveCommitSectors3, this method does not fetch deal information from the built-in market actor. Instead, the pieces of data that comprise the updated sector content are declared as a manifest. This manifest is similar to the manifest for ProveCommitSectors3, except that it identifies an existing sector to be updated rather than a new one.

The specification and semantics of pieces, including verified allocation IDs and notifications, is the same as for ProveCommitSectors3.

Excepting behaviour specified below, ProveReplicaUpdates3 performs the same logic and state transitions as the existing sector update methods. It computes the same weight, pledge and power values, creates the same on-chain state, and charges the same fees. In brief, in execution of ProveReplicaUpdates3, the miner actor:

  1. validates that the existing sectors are eligible for update;
  2. computes an unsealed CID from the piece manifests;
  3. verifies the sector update proofs or aggregate proof, and hence the new sealed and unsealed CIDs match the values declared and computed respectively;
  4. claims any verified allocations specified in the piece manifest;
  5. computes weights, pledge, power, fees etc and activates the new sectors in state;
  6. notifies any actors specified in the piece manifests.

struct ProveReplicaUpdates3Params {
    SectorUpdates: []SectorUpdateManifest,
    // Proofs for each sector, parallel to update manifests.
    // Exactly one of SectorProofs or AggregateProof must be non-empty.
    SectorProofs: [][]byte,
    // Aggregate proof for all sectors.
    // Exactly one of SectorProofs or AggregateProof must be non-empty.
    AggregateProof: []byte,
    // The proof type for all sector update proofs, individually or before aggregation.
    UpdateProofsType: RegisteredUpdateProof,
    // The proof type for the aggregate proof (must be absent if no aggregate proof).
    AggregateProofType: Option<RegisteredAggregateProof>,
    // Whether to abort if any sector update activation fails.
    RequireActivationSuccess: bool,
    // Whether to abort if any notification returns a non-zero exit code.
    RequireNotificationSuccess: bool,
}

struct SectorUpdateManifest {
    Sector: SectorNumber,
    Deadline: u64,
    Partition: u64,
    NewSealedCid: Cid, // CommR
    // Declaration of all pieces that make up the new sector data, in order.
    // Until we support re-snap, pieces must all be new because the sector was previously empty.
    // Implicit "zero" piece fills any remaining capacity.
    // These pieces imply the new unsealed sector CID.
    Pieces: []PieceActivationManifest,
}

type ProveReplicaUpdates3Return = ProveCommitSectors3Return;

SectorContentChanged

When a piece manifest specifies one or more notification receivers, the storage miner invokes these receivers after activating the sector or replica update. The receiving actor must accept the SectorContentChanged method number (2034386435) and parameter schema. SectorContentChanged is an FRC-0042 method, intended to be implemented by user-programmed actors. The miner actor will invoke each receiver address only once, with a batch of notification payloads.

// Notification of change committed to one or more sectors.
// The relevant state must be already committed so the receiver can observe any impacts
// at the sending miner actor.
// Note transparent serialization of single-element struct.
struct SectorContentChangedParams {
    // Distinct sectors with changed content.
    Sectors: []SectorChanges,
}

// Description of changes to one sector's content.
struct SectorChanges {
    // Identifier of sector being updated.
    Sector: SectorNumber,
    // Minimum epoch until which the data is committed to the sector.
    // Note the sector may later be extended without necessarily another notification.
    MinimumCommitmentEpoch: ChainEpoch,
    // Information about some pieces added to (or retained in) the sector.
    // This may be only a subset of sector content.
    // Inclusion here does not mean the piece was definitely absent previously.
    // Exclusion here does not mean a piece has been removed since a prior notification.
    Added: []PieceChange,
}

// Description of a piece of data committed to a sector.
struct PieceChange {
    Data: Cid,
    Size: PaddedPieceSize,
    // A receiver-specific identifier.
    // E.g. an encoded deal ID which the provider claims this piece satisfies.
    Payload: []byte,
}

// For each piece in each sector, the notifee returns an exit code and
// (possibly-empty) result data.
// The miner actor will pass through results to its caller.
// Note transparent serialization of single-element struct.
struct SectorContentChangedReturn {
    // A result for each sector that was notified, in the same order.
    Sectors: []SectorReturn,
}

// Note transparent serialization of single-element struct.
struct SectorReturn {
    // A result for each piece for the sector that was notified, in the same order.
    Added: []PieceReturn,
}

// Note transparent serialization of single-element struct.
struct PieceReturn {
    // Indicates whether the receiver accepted the notification.
    // The caller (miner) is free to ignore this, but may choose to abort and roll back.
    Accepted: bool,
}
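
The batching rule (one SectorContentChanged call per receiver, covering all of its pieces) might be sketched as follows. Plain dictionaries stand in for the CBOR structures, the type and function names are hypothetical, and using the sector expiration as MinimumCommitmentEpoch is an assumption. The check against f05 reflects the initial restriction on notification receivers.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MARKET_ADDRESS = "f05"  # the only receiver permitted initially


@dataclass
class Notification:
    address: str
    payload: bytes


@dataclass
class Piece:
    cid: str
    size: int
    notify: List[Notification] = field(default_factory=list)


@dataclass
class Sector:
    number: int
    expiration: int
    pieces: List[Piece]


def group_notifications(sectors: List[Sector]) -> Dict[str, List[dict]]:
    """Map each receiver address to the list of SectorChanges it should be sent."""
    by_receiver: Dict[str, List[dict]] = {}
    for sector in sectors:
        added_by_receiver: Dict[str, List[dict]] = {}
        for piece in sector.pieces:
            for n in piece.notify:
                if n.address != MARKET_ADDRESS:
                    raise ValueError("only the built-in market actor may be notified")
                added_by_receiver.setdefault(n.address, []).append(
                    {"Data": piece.cid, "Size": piece.size, "Payload": n.payload}
                )
        for address, added in added_by_receiver.items():
            by_receiver.setdefault(address, []).append(
                {
                    "Sector": sector.number,
                    # Assumption: committed until the sector's expiration epoch.
                    "MinimumCommitmentEpoch": sector.expiration,
                    "Added": added,
                }
            )
    return by_receiver
```

Pieces without notifications are simply omitted, consistent with the note that Added may be only a subset of sector content.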

Failure handling

Each batched sector activation or update comprises multiple sectors, each with multiple data pieces. Each data piece can have a single verified claim and multiple notifications. Any of these items might fail, but it is practical to handle individual failures within a group only to a limited extent.

The activation of each sector is independent. A failed sector activation will not cause a top-level method to abort, unless all activations fail. An SP can specify RequireActivationSuccess=true to instead require every sector activation to succeed, aborting the operation if one fails.

Sector activation includes claiming of any verified allocations. If a claim fails for one piece in a sector, no claims will be made for any piece in the sector and the sector will not be activated. The caller cannot choose to have activation proceed despite an invalid claim; they can instead resubmit the failed sector with only the remaining valid claims.

Sector activation does not include notifications. Notifications are sent strictly after activation, and only for successfully activated sectors. If a notification call returns a non-zero exit code, sector activation will be committed regardless; notifications are sent on a best-effort basis. If a notification to the built-in storage market actor fails, the associated deal will not be started, but the sector can activate anyway. An SP can specify RequireNotificationSuccess=true to instead require every notification to succeed, aborting the entire operation if one fails. There is no ability to roll back the activation of only some sectors in response to failed notifications.

This is a change in sequencing and semantics from the existing onboarding methods flow, where deal activation happens first and failure leads to an individual sector failing, and verified claims happen second and must all succeed.
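
The failure-handling rules above can be condensed into a small decision function. This is a simplified sketch with hypothetical names: it takes the per-sector activation outcomes and the outcomes of the notifications that were sent, and reports whether the whole message commits or aborts.

```python
from typing import List


def resolve_batch(
    activation_ok: List[bool],
    notification_ok: List[bool],
    require_activation_success: bool,
    require_notification_success: bool,
) -> str:
    """Return "commit" or "abort" for a batched activation or update.

    activation_ok holds one outcome per sector. notification_ok holds one outcome
    per notification, which are sent only for successfully activated sectors.
    """
    # The method aborts if every sector activation fails.
    if not any(activation_ok):
        return "abort"
    # Optionally require every sector activation to succeed.
    if require_activation_success and not all(activation_ok):
        return "abort"
    # Notifications are best-effort unless the caller requires them to succeed.
    if require_notification_success and not all(notification_ok):
        return "abort"
    return "commit"
```

With both flags false, a batch commits as long as at least one sector activates, regardless of notification results.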

Sector termination

When a sector is terminated, the built-in market actor is notified of the sector termination if the sector has non-zero deal weight or verified deal weight. This is a change from the previous logic of notifying the market actor explicitly of the deals contained in a sector (if any).

The miner actor no longer carries reliable information about which sectors carry deals.

Note that this notification on termination is a privilege of the built-in market actor that may not be extended to other actors in the future (because termination can be invoked by cron).
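
A minimal sketch of the selection rule for termination notifications, using a hypothetical `SectorInfo` record that carries only the fields relevant here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SectorInfo:
    number: int
    deal_weight: int
    verified_deal_weight: int


def sectors_to_notify(terminated: List[SectorInfo]) -> List[int]:
    """Sector numbers to report to the market actor on termination."""
    # The miner no longer knows which sectors hold deals, so it reports every
    # terminated sector that carries any non-zero data weight.
    return sorted(
        s.number
        for s in terminated
        if s.deal_weight > 0 or s.verified_deal_weight > 0
    )
```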

ProveCommitSector

The existing ProveCommitSector (method 7) and ProveCommitAggregate (method 26) remain, and continue to activate deals in the built-in market actor and claim verified allocations using the current algorithm. These methods remain available as a grace period for participants to migrate their workflows. These methods no longer write deal IDs into per-sector chain state.

These methods cannot be used for direct data onboarding and will always incur the cost of a built-in market deal.

ProveReplicaUpdates

The existing ProveReplicaUpdates (method 27) remains and continues to activate deals and claim verified allocations using the current algorithm. This method remains available as a grace period for participants to migrate their workflows. This method no longer writes deal IDs into per-sector chain state.

This method cannot be used for direct data onboarding and will always incur the cost of a built-in market deal.

Deprecation of legacy methods

The following methods are removed. Invoking them will result in a USR_UNHANDLED_METHOD exit code (22).

  • PreCommitSector (method 6)
  • PreCommitSectorBatch (method 25)
  • ProveReplicaUpdates2 (method 29)

Storage market actor

The built-in storage market actor retains all existing state and functionality, but gains a new collection mapping provider addresses to sector numbers and deal IDs. The per-deal state structure is extended to include the sector number in which the deal is stored, thus providing a reverse mapping from deal ID to sector number. It also implements the new SectorContentChanged method to activate deals in response to notifications from the miner actor’s new activation methods.

State

A new field maps provider addresses, to sector numbers, to the deal IDs that the market has been notified are stored in those sectors. The reverse mapping from deal to sector number is provided by a new field in the deal state structure. The unused verified claim ID field is removed from the deal state structure.

// New structure storing per-sector deal information.

// Transparent serialization of single-element struct, rendering SectorDealIDs as []DealID
struct SectorDealIDs {
    Deals: []DealID
}

// Existing deal state structure gets a new field with sector number.
struct DealState {
    // 0 if not yet included in proven sector (0 is also a valid sector number).
    SectorNumber: SectorNumber,

    // Other existing fields as today.
    // ...
    
    // REMOVED
    // VerifiedClaim: AllocationID,

}

struct State {
    // All existing state as today.
    // ...

    // Existing mapping of deal state by ID.
    // Since deal state now includes sector number, this gives deal->sector index.
    // AMT[DealID]DealState
    States: Cid

    // New mapping of sector IDs to deal IDs, grouped by storage provider.
    // HAMT[ActorID]HAMT[SectorNumber]SectorDealIDs
    ProviderSectors: Cid 
}

BatchActivateDeals

When an SP activates a piece with the existing activation methods, deals are activated with the BatchActivateDeals method (method 6). This method returns the necessary verified allocation IDs to the miner actor. The BatchActivateDeals method parameters are expanded to include the sector ID for each deal. When activating a deal, the market actor writes it into the new ProviderSectors mapping.

struct BatchActivateDealsParams {
    /// Deals to activate, grouped by sector.
    /// A failed deal activation will cause other deals in the same sector group to also fail,
    /// but allow other sectors to proceed.
    Sectors: []SectorDeals,
    /// Requests computation of an unsealed CID for each sector from the provided deals.
    ComputeCid: bool,
}

struct SectorDeals {
    SectorNumber: SectorNumber,
    SectorType: RegisteredSealProof,
    SectorExpiry: ChainEpoch,
    DealIDs: []DealID,
}
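
The bookkeeping performed on activation could look roughly like the sketch below, in which nested dictionaries stand in for the HAMT-based ProviderSectors mapping and a flat dictionary stands in for the sector number held in each DealState. The function name is hypothetical.

```python
from typing import Dict, List

# provider actor ID -> sector number -> deal IDs (stand-in for the nested HAMT)
ProviderSectors = Dict[int, Dict[int, List[int]]]


def record_activation(
    provider_sectors: ProviderSectors,
    deal_sector: Dict[int, int],
    provider: int,
    sector_number: int,
    deal_ids: List[int],
) -> None:
    """Record that deal_ids were activated in the given provider's sector."""
    sector_deals = provider_sectors.setdefault(provider, {}).setdefault(sector_number, [])
    for deal_id in deal_ids:
        sector_deals.append(deal_id)
        # Reverse index: the deal state remembers the sector holding its data.
        deal_sector[deal_id] = sector_number
```

Keeping both directions lets the market actor answer sector-to-deal and deal-to-sector queries without consulting miner state.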

SectorContentChanged

When an SP activates a piece with the new onboarding methods, any deals are activated by the SectorContentChanged method instead of BatchActivateDeals. The notification payload must be a CBOR-serialized deal ID.

The implementation checks that the deal ID nominated in the notification payload is valid for activation, and that committed piece CID and size match the deal proposal. If a deal ID is invalid, ineligible for activation, or doesn’t match the piece CID and size, no deal is activated and the method returns Accepted=false for the corresponding piece. Otherwise, the deal is activated. Deals succeed or fail independently, including within the same sector group.
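
A sketch of the market actor's handling of one notification batch, under simplifying assumptions: the deal ID is decoded as a plain CBOR unsigned integer (canonical encoding is not enforced here), and "ineligible for activation" is modeled only as "already activated". The `Proposal` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def decode_cbor_uint(data: bytes) -> Optional[int]:
    """Decode a CBOR unsigned integer (major type 0), or return None if malformed."""
    if not data:
        return None
    first = data[0]
    if first >> 5 != 0:  # not major type 0
        return None
    info = first & 0x1F
    if info < 24:
        length, value = 0, info          # value is embedded in the first byte
    elif info <= 27:
        length, value = 1 << (info - 24), 0  # 1, 2, 4 or 8 following bytes
    else:
        return None
    if len(data) != 1 + length:
        return None
    if length:
        value = int.from_bytes(data[1:], "big")
    return value


@dataclass
class Proposal:
    piece_cid: str
    piece_size: int
    activated: bool = False


def on_sector_content_changed(
    proposals: Dict[int, Proposal], pieces: List[Tuple[str, int, bytes]]
) -> List[bool]:
    """Return the Accepted flag for each (cid, size, payload) piece, in order."""
    results = []
    for cid, size, payload in pieces:
        deal_id = decode_cbor_uint(payload)
        proposal = proposals.get(deal_id) if deal_id is not None else None
        accepted = (
            proposal is not None
            and not proposal.activated
            and proposal.piece_cid == cid
            and proposal.piece_size == size
        )
        if accepted:
            proposal.activated = True
        # Each piece succeeds or fails independently of the others.
        results.append(accepted)
    return results
```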

OnMinerSectorsTerminate

The built-in market actor receives synchronous notification from the miner actor when a sector is terminated. The miner actor no longer knows which sectors have deals, so instead notifies the market actor of all sectors with non-empty data.

struct OnMinerSectorsTerminateParams {
    Epoch: ChainEpoch,
    Sectors: BitField,
}

The sector is removed from the ProviderSectors mapping, if present. Any deals mapped to that sector are marked as terminated, and subsequent processing deferred to cron.
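
The termination handling described above might be sketched as follows, with nested dictionaries standing in for ProviderSectors and a dictionary of terminated epochs (-1 meaning not terminated) standing in for deal state. Names are hypothetical, and the deferred cron processing is not modeled.

```python
from typing import Dict, List


def on_miner_sectors_terminate(
    provider_sectors: Dict[int, Dict[int, List[int]]],
    deal_terminated_epoch: Dict[int, int],
    provider: int,
    epoch: int,
    sectors: List[int],
) -> List[int]:
    """Remove terminated sectors from the mapping and mark their deals terminated."""
    terminated_deals: List[int] = []
    by_sector = provider_sectors.get(provider, {})
    for sector_number in sectors:
        # A sector with data but no market deals has no entry: nothing to do.
        deal_ids = by_sector.pop(sector_number, None)
        if deal_ids is None:
            continue
        for deal_id in deal_ids:
            deal_terminated_epoch[deal_id] = epoch
            terminated_deals.append(deal_id)
    return terminated_deals
```

Because the miner reports every sector with non-empty data, the handler must tolerate sector numbers that are absent from the mapping.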

GetDealSector

A new FRC-0042 exported method “GetDealSector” (method 2611213344) returns the sector number in which a deal’s data is stored, while a deal is active. If a deal is published but not yet activated, aborts with EX_DEAL_NOT_ACTIVATED = 33. If a deal is not found, aborts with the same exit codes as “GetDealActivation”.

// "Transparent" CBOR-encoding, the singleton field is encoded directly.
struct GetDealSectorParams {
    DealID: DealID,
}

// "Transparent" CBOR-encoding, the singleton field is encoded directly.
struct GetDealSectorReturn {
    SectorNumber: SectorNumber,
}
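
A sketch of the GetDealSector lookup. The `ActorError` class and the data layout are hypothetical, and because the exit codes shared with GetDealActivation are not listed here, the not-found case is represented by a plain LookupError rather than a specific code.

```python
from typing import Dict, Set

EX_DEAL_NOT_ACTIVATED = 33


class ActorError(Exception):
    """Stand-in for an actor abort carrying an exit code."""

    def __init__(self, exit_code: int, message: str) -> None:
        super().__init__(message)
        self.exit_code = exit_code


def get_deal_sector(published: Set[int], deal_sector: Dict[int, int], deal_id: int) -> int:
    """Return the sector number holding an active deal's data."""
    if deal_id not in published:
        # The real method aborts with the same exit codes as GetDealActivation.
        raise LookupError("deal not found")
    if deal_id not in deal_sector:
        raise ActorError(EX_DEAL_NOT_ACTIVATED, "deal published but not yet activated")
    return deal_sector[deal_id]
```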

Migration

The built-in market actor’s ProviderSectors mapping is initialised from the existing deal state and miner actor state per-sector deal IDs.

  • For each deal state object in the market actor state that has a terminated epoch set to -1:
    • find the corresponding deal proposal object and extract the provider’s actor ID;
    • in the provider’s miner state, find the ID of the sector with the corresponding deal ID in sector metadata;
      • if such a sector cannot be found, assert that the deal’s end epoch has passed and use sector ID 0 [1];
    • set the new deal state object’s sector number to the sector ID found;
    • add the deal ID to the ProviderSectors mapping for the provider’s actor ID and sector number.
  • For each deal state object in the market actor state that has a terminated epoch set to any other value:
    • set the deal state object’s sector number to 0.

[1] It may be impossible to find the sector for a deal that has completed successfully but not yet been cleaned up in market actor state, if the corresponding sector has since expired and been compacted out of state.

The result includes a value in the ProviderSectors mapping for each activated and not yet terminated or expired deal. The built-in market actor’s implementation of deal expiration clean-up must be robust to the provider sector mapping missing a value for a terminated or expired deal.
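
The migration steps can be sketched over in-memory dictionaries. All names are hypothetical. The sketch follows the reading of the paragraph above that a completed deal whose sector can no longer be found receives sector number 0 and no ProviderSectors entry, and it assumes "end epoch has passed" means strictly earlier than the migration epoch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DealState:
    terminated_epoch: int  # -1 if not terminated
    sector_number: int = 0


def migrate(
    deal_states: Dict[int, DealState],
    deal_provider: Dict[int, int],                    # deal ID -> provider actor ID
    deal_end_epoch: Dict[int, int],                   # deal ID -> proposal end epoch
    miner_sector_deals: Dict[int, Dict[int, List[int]]],  # provider -> sector -> deal IDs
    current_epoch: int,
) -> Dict[int, Dict[int, List[int]]]:
    """Set each deal's sector number and build the ProviderSectors mapping."""
    provider_sectors: Dict[int, Dict[int, List[int]]] = {}
    for deal_id, state in deal_states.items():
        if state.terminated_epoch != -1:
            state.sector_number = 0
            continue
        provider = deal_provider[deal_id]
        found: Optional[int] = None
        for sector_number, deal_ids in miner_sector_deals.get(provider, {}).items():
            if deal_id in deal_ids:
                found = sector_number
                break
        if found is None:
            # Only a completed deal may be missing its sector.
            if deal_end_epoch[deal_id] >= current_epoch:
                raise ValueError("no sector found for unexpired deal %d" % deal_id)
            state.sector_number = 0
            continue
        state.sector_number = found
        provider_sectors.setdefault(provider, {}).setdefault(found, []).append(deal_id)
    return provider_sectors
```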

Design Rationale

This proposal makes deals optional in order to reduce the cost of data onboarding when deal payments are not required. It separates the concerns of verified data allocations and client deals, making each one an optional component of data activation. A scheme for notifying actors of sector data commitments is introduced in order to support future user-programmed actors implementing markets and other data applications on the same level as the built-in storage market. Such actors will be able to implement deals (and other arrangements) with both greater efficiency and flexibility than the built-in market actor when they are not forced to use it as an intermediary. After this proposal, the only privilege of the built-in market actor that could not be made available to user-programmed actors is the synchronous notification of sector terminations.

CommD required at pre-commit

The existing PreCommitSector[Batch] methods are deprecated in order to enforce the specification of the unsealed sector CID (CommD) during pre-commitment. This is necessary in order to commit to the data to be proven when this can no longer be implied from deals looked up in the built-in market actor. PreCommitSectorBatch2 was introduced in FIP-0041 precisely to support this deprecation.

CommD not required at replica update

FIP-0041 introduced a ProveReplicaUpdates2 (method 29) in order to add an explicit CommD parameter with similar reasoning to that motivating PreCommitSectorBatch2. However, the specification of piece manifests renders this unnecessary. Method 29 is not used by the widely used “Lotus miner” software and has never been invoked on mainnet. It is deprecated in order to reduce the number of actively-supported methods. Note that there is no ProveCommitSectors2. The new methods take a consistent 3 suffix to indicate their shared behaviour.

Sector deal information moved to market actor

Deal-related metadata is removed from the miner actor, with a mapping from sector to deal ID added to the market actor. This is a pattern that can be extended to other data application actors in the future, without requiring further changes to the built-in actors or adding cost or complexity in the miner actor.

The per-sector deal IDs are no longer used or maintained by the miner actor and may be removed in a future FIP.

Implementation of SectorContentChanged by the built-in market

The built-in market actor implements SectorContentChanged in order to demonstrate this pattern of deal activation. SectorContentChanged is an untrusted method: the miner actor does not rely on its result for any critical computation. This pattern can thus be extended to other actors in the future without significant modification.

Sector content notifications are initially restricted to the built-in market actor because it may be possible for user-programmed actors to disrupt the miner actor by exhausting resources like gas or call stack depth. A future FIP can lift this restriction after implementing appropriate mitigations.

Client-initiated workflow for verified data

This proposal changes the high level workflow for verified data onboarding to require an initial on-chain message from the client to allocate DataCap. This differs from the current scheme, where the storage provider initiates the on-chain allocation of DataCap as a side effect of publishing a deal. The reason for this change is that verified allocations and deals are independent: a client can allocate DataCap without necessarily involving the built-in market actor, and save the SP significant gas costs by doing so.

Backwards Compatibility

This proposal deprecates the miner methods PreCommitSector (method 6), PreCommitSectorBatch (method 25), and ProveReplicaUpdates2 (method 29). All three have alternatives already available on mainnet that should be used instead.

This proposal requires a state migration to the market actor to add the new ProviderSectors mapping, and to add a sector number to, and remove the allocation ID from, each DealState. Computing this mapping requires reading all sector metadata from the miner actor.

This proposal requires a network upgrade to deploy the new built-in actor code.

Test Cases

To be provided with implementation.

Security Considerations

This proposal has no impact on consensus or blockchain security.

Incentive Considerations

This proposal decreases the cost of, and hence disincentive to, committing data to Filecoin sectors (see Changes to gas costs). The resulting cost depends on the functionality desired by the participants, who can avoid the costs associated with on-chain deals if they are not required. An absolute cost reduction for onboarding data might increase the profitability of doing so, and hence the uptake.

This proposal changes the relative cost of onboarding data with different features, which is fairly uniform at present. After this proposal:

  • unverified data without on-chain payments will be the cheapest to onboard;
  • verified data without on-chain payments will be slightly more expensive;
  • unverified data with on-chain payments will be significantly more expensive;
  • verified data with on-chain payments will be the most expensive (but still a little cheaper than current).

The variations in cost account for the variations in computational work necessary to provide incremental functionality. These variations might increase the relative profitability of onboarding data that is unverified and/or requires no on-chain payments, and hence the uptake of these. When SectorContentChanged notifications support user-programmed actors, they may be able to implement on-chain payments more efficiently than the built-in storage market actor.

Data that is onboarded without a built-in market deal is not subject to the minimum deal collateral required by the built-in market actor. This deal collateral is intended to provide an assurance mechanism to benefit clients, and represents a cost and risk to storage providers. Because the deal collateral is calculated on raw byte power, it represents a proportionally 10x larger assurance and cost/risk for unverified data than verified data, as compared with the QA power and rewards attributable to the data. This proposal reduces the cost of onboarding data by removing the need for deal collateral. Parties can still use the built-in market actor for arrangements where deal collateral is desired. Future user-programmed actors will also be able to implement more flexible collateral schemes.
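The 10x ratio above follows from simple arithmetic, illustrated in the sketch below. The collateral rate is a made-up placeholder (the real minimum deal collateral is computed by the built-in market actor), and the function name is hypothetical; only the 10x quality multiplier for verified data is taken from the text.

```python
from fractions import Fraction

# Quality multiplier for verified data, per the 10x figure in the text.
VERIFIED_QUALITY_MULTIPLIER = 10


def collateral_per_qa_byte(raw_bytes, collateral_per_raw_byte, verified):
    """Deal collateral divided by the QA power the data earns.

    collateral_per_raw_byte is a hypothetical rate: collateral is charged
    on raw bytes, regardless of whether the data is verified.
    """
    collateral = raw_bytes * Fraction(collateral_per_raw_byte)
    qa_bytes = raw_bytes * (VERIFIED_QUALITY_MULTIPLIER if verified else 1)
    return collateral / qa_bytes


size = 32 * 2**30            # one 32 GiB sector of data
rate = Fraction(1, 1000)     # placeholder collateral rate per raw byte

unverified = collateral_per_qa_byte(size, rate, verified=False)
verified = collateral_per_qa_byte(size, rate, verified=True)
print(unverified / verified)  # ratio of collateral burden per unit of QA power
```

Because the same raw-byte collateral is spread over ten times the QA power for verified data, the ratio is independent of the rate and the size chosen.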

Product Considerations

Direct Fil+ supports maximum term

This proposal enables allocation and claiming verified data without a built-in market deal. This means that a client can make an allocation for the maximum permitted duration (5 years) instead of being limited by the built-in market actor’s maximum deal term (1.5 years). While a storage provider can still only commit to a sector for 1.5 years at a time, they will be able to extend a sector claiming such an allocation out to the 5-year maximum while retaining the full power multiplier with no further client interaction.

Direct data onboarding reduces collateral costs

This proposal enables an SP to commit to sector data without a built-in market deal. This means the SP need not deposit the built-in market actor’s required deal collateral when there is no on-chain deal for that collateral to protect.

Changes to gas costs

This proposal greatly reduces the total gas cost of common onboarding workflows by limiting the on-chain computation and state only to the necessary components for the features desired by participants.

As compared with current workflow costs:

  • For direct onboarding with no verified claim, neither publishing a deal nor transferring DataCap is necessary. Thus, the full cost of PublishStorageDeals is avoided. The cost of sector pre-commitment is reduced slightly by avoiding interaction with the built-in market actor. The cost of sector activation is reduced by avoiding the cost of deal activation.
  • For direct onboarding with a verified claim, most of the cost of PublishStorageDeals is avoided. The cost of transferring DataCap is still incurred, but is only a fraction of the cost of publishing a deal. The cost of sector pre-commitment is reduced slightly by avoiding interaction with the built-in market actor. The cost of sector activation is reduced by avoiding the cost of deal activation.
  • For direct onboarding with a verified claim and a built-in storage market deal, the cost of PreCommitSectorBatch is reduced slightly by avoiding interaction with the built-in market actor.

As an illustrative example, an analysis of the gas cost of selected onboarding messages with a batch size of 16 in epoch 2844239 indicates:

| Message | Observed with Fil+ | Direct with Fil+ | Observed ex Fil+ | Direct ex Fil+ |
| PublishStorageDeals | 1,911M | 283M (14.8%) | 1,624M | 0 (0%) |
| PreCommitSectorBatch | 111M | 99M (89%) | 111M | 99M (89%) |
| ProveCommitAggregate | 2,509M | 1,591M (63%) | 1,342M | 348M (25%) |
| Total | 4,531M | 1,973M (43%) | 3,077M | 447M (14.5%) |

The “direct” gas costs are approximated by analysing an execution trace and ignoring the gas costs of internal messages sent to the built-in storage market actor. The “ex Fil+” gas costs are similarly approximated by ignoring the gas costs of internal messages to the DataCap and verified registry actors.
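The totals and overall ratios in the table above can be reproduced with a few lines of arithmetic. The figures below are copied from the table (in millions of gas units); the dictionary keys and helper names are chosen only for this illustration.

```python
# Gas figures in millions of gas units, copied from the table above.
GAS = {
    "PublishStorageDeals": {
        "observed_filplus": 1911, "direct_filplus": 283,
        "observed_ex": 1624, "direct_ex": 0,
    },
    "PreCommitSectorBatch": {
        "observed_filplus": 111, "direct_filplus": 99,
        "observed_ex": 111, "direct_ex": 99,
    },
    "ProveCommitAggregate": {
        "observed_filplus": 2509, "direct_filplus": 1591,
        "observed_ex": 1342, "direct_ex": 348,
    },
}


def column_total(column):
    """Sum one column of the table across all three messages."""
    return sum(row[column] for row in GAS.values())


def percent_of(direct, observed):
    """Direct cost as a percentage of the observed cost, to one decimal."""
    return round(100 * direct / observed, 1)


with_filplus = percent_of(column_total("direct_filplus"),
                          column_total("observed_filplus"))
ex_filplus = percent_of(column_total("direct_ex"),
                        column_total("observed_ex"))
print(with_filplus, ex_filplus)
```

The two printed ratios correspond to the 43% and 14.5% entries in the Total row.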

Note that since the epoch at which the traces were gathered, optimisations to the built-in actors have reduced the costs associated with both verified claims and deal activation. These optimisations are not yet deployed to the network, so it is not possible to gather baseline data that takes them into account. Thus, the relative gas savings of this proposal are likely to be smaller than the above figures indicate, when based on this optimised code. However, the absolute gas costs of both changes combined will be even lower than the above figures indicate.

In the case of a single sector activation, the new onboarding methods may cost more gas than using ProveCommitSector. This is because ProveCommitSector defers sector activation (including verified claims and deals) to be performed in cron, thus subsidising the SP’s gas consumption. This is considered to be a protocol bug, expected to be resolved in a future FIP by forcing (now much reduced) sector activation costs to be paid by the SP. After this proposal, one way to do this would be to simply deprecate ProveCommitSector in favour of ProveCommitSectors3. See https://github.com/filecoin-project/FIPs/discussions/689.

Client-initiated workflow for verified data

This proposal requires a client to initiate the on-chain allocation of DataCap for verified data, as discussed in Design Rationale. This requires a client to hold sufficient tokens to pay the associated gas fee, and have some method of submitting messages to the network.

A future change could restore a high-level workflow initiated by the SP by specifying a mechanism for verified allocation vouchers. A voucher would be a message signed by the client, but submitted to the verified registry actor by the SP, much like publishing a deal. Such functionality is omitted from this proposal in the interest of simplicity.

Storage provider metadata for activation

This proposal avoids the storage of deal-related information on chain between sector pre-commitment and activation. Rather than looking up piece CIDs and allocation IDs in the built-in storage market actor during activation, an SP must provide them explicitly as parameters. This movement of data and computation off-chain enables gas savings, but requires the storage provider to maintain this metadata during the onboarding process.
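As a rough illustration of the bookkeeping this implies, the sketch below shows a storage provider recording piece metadata at pre-commit time and reading it back when building activation parameters. All class, method, and field names here are hypothetical and are not part of the actor interface; a real implementation would also need durable storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PieceRecord:
    """Metadata the SP must remember for one piece (hypothetical shape)."""
    piece_cid: str
    allocation_id: Optional[int] = None  # None when no verified claim is made


class OnboardingStore:
    """In-memory record of pieces for sectors awaiting activation."""

    def __init__(self) -> None:
        self._pending: Dict[int, List[PieceRecord]] = {}

    def record_precommit(self, sector_number: int, pieces: List[PieceRecord]) -> None:
        # Nothing about these pieces is stored on chain at pre-commit,
        # so the SP keeps them locally until activation.
        self._pending[sector_number] = list(pieces)

    def activation_params(self, sector_number: int) -> dict:
        # Raises KeyError if the metadata was lost: the chain can no
        # longer supply it on the SP's behalf.
        pieces = self._pending[sector_number]
        return {
            "sector_number": sector_number,
            "pieces": [
                {"cid": p.piece_cid, "allocation_id": p.allocation_id}
                for p in pieces
            ],
        }

    def mark_activated(self, sector_number: int) -> None:
        # Once the sector is activated the local record can be dropped.
        self._pending.pop(sector_number, None)
```

The failure mode is the point of the sketch: if the local record is missing at activation time, the parameters cannot be reconstructed from chain state.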

Source of truth for data commitments and verified claims

This proposal decouples both the ability to commit to data, and the allocation of DataCap, from on-chain deals with the built-in market actor. The motivation for this is to avoid the high costs of the built-in market actor when they are not necessary.

This decoupling breaks potential assumptions by network observers that (a) all data in any sector is reflected in an on-chain deal, and (b) all verified data is reflected in an on-chain deal. Observers relying on these assumptions could previously inspect only the built-in market actor’s state in order to infer the total amount and composition of data and verified data in the network.

After this proposal, information representing data commitments, verified claims, and deals is maintained separately.

  • The source of truth for sector data commitments is the piece activation manifests in the message history.
  • The source of truth for verified data claims is the verified registry actor state.
  • The built-in market actor state is a source of truth only for on-chain deal accounting.

The built-in market actor may have a partial view of both piece commitments and verified claims, but this is no longer a reliable or complete source of information about either. Clients, explorers and analytic tools should inspect the source of truth for each type of information, and avoid reliance on the presence of deals with the built-in market actor.

Reduced scope of on-chain visibility into terms for sector data

This proposal has potential to reduce the scope of on-chain visibility into the data committed to sectors. At present, all sector data is represented by a record stored on-chain in the built-in market actor, which includes a client identifier and deal terms, and all verified data is represented by a similar record stored on-chain in the verified registry actor. After this proposal, it becomes possible for data to be committed to a sector without a corresponding record in either the built-in market actor or the verified registry actor.

The commitment to such data still appears in the chain message history, visible from outside the chain, but is not verifiable from an actor (unless a future change adds visibility into change history). The distinction between provably-empty sectors and those containing some data remains in chain state, but no client or terms for such data is necessarily present.

Note that even today there is no guarantee that deal metadata is meaningful. A storage provider can commit arbitrary data through the built-in market without a meaningful client or terms. This proposal makes that behaviour cheaper by not storing the meaningless metadata on chain.

Implementation

Implementation of the protocol changes is being developed in the integration/direct-onboarding branch of the built-in actors repository.

Copyright and related rights waived via CC0.

Citation

Please cite this document as:

Alex North, @zenground0, "FIP-0076: Direct data onboarding," Filecoin Improvement Proposals, no. 0076, August 2023. [Online serial]. Available: https://fips.filecoin.io/fips/fip-0076.