Hypercore Split Resolution DEP #43

Open
wants to merge 2 commits into
base: master
73 changes: 73 additions & 0 deletions proposals/0000-hypercore-split-resolution.md
@@ -0,0 +1,73 @@

Title: **DEP-0000: Hypercore Split Resolution**

Short Name: `0000-hypercore-split-resolution`

Type: Standard

Status: Draft (as of 2018-10-02)

GitHub PR: (add HTTPS link here after PR is opened)

Authors: Paul Frazee


# Summary
[summary]: #summary

The Hypercore data-structure is an append-only log which depends on maintaining a linear and unbranching history in order to encode the state-changes of its data content. A "split" event occurs when the log branches, losing its linear history. This spec provides a process for resolving "splits" in a Hypercore log.


# Motivation
[motivation]: #motivation

Hypercore requires a linear history in order to function correctly. A "split" event is a fatal corruption: peers which encounter a split will stop replication, causing the Hypercore to lose utility.

In cases with a strict security requirement this might be useful, but the append-only invariant can be difficult to maintain for users who migrate their dats between devices or restore from backups. For most users, it would be more desirable to risk losing some writes than to risk losing an entire dat due to a split.

This DEP specifies a process for recovering from splits so that users can safely back up or transfer their dats.

Contributor

Is "safely" a guarantee here? The preceding paragraph seems to imply that there are some risks involved (e.g. data loss), which would not guarantee safety.

Contributor Author

It's a trade of costs. Currently, a split event results in total hypercore corruption: no more forward progress can be made in the hypercore, and it must be replaced with a new keypair and history. This proposal provides a solution for resolving splits, so it re-enables forward progress on the hypercore. However, it does not create a process for restoring any data in the "discarded branch" and so it leaves the potential for data loss.

The reason that this proposal doesn't include some form of data restoration from the dead-branch is that hypercore's semantics are too generic to create a universal solution. A restoration process is possible, but it would depend on the data-structure encoded on the hypercore. For instance, a hyperdrive might handle restoration by showing the user a list of the orphaned files, and prompting them to either "restore" or "discard" each file. This would cause each file to be rewritten onto the newly live branch.



# Usage Documentation
[usage-documentation]: #usage-documentation

Hypercore's APIs will provide a method for registering a "split handler." The split handler will be implemented by the application using the Hypercore. It may have a standard definition provided by a higher-level data structure such as HyperDB or Hyperdrive.

The "split handler" function will receive the `seq` number, `blockA`, and `blockB`. It will be expected to return `-1` to accept `blockA` or `1` to accept `blockB`. If `0` is returned, no resolution occurs and the new block is rejected (the current behavior). When a split is resolved, all messages after the rejected block will also be rejected.

Contributor

Using numbers might make a lot of sense for the JS implementation, but less so in Rust (an Enum seems like a better fit). Could the language be adjusted to reflect that better?

Contributor Author

Sure


It's expected that the split handler will read the contents of the blocks in order to make decisions about a split. In the case of Hyperdrive, for instance, a split might be resolved using the timestamp of the write.
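
As an illustration of the handler contract described in this section, here is a minimal sketch of a timestamp-based split handler. The block layout (UTF-8 JSON with a numeric `timestamp` field) and the function name `timestampSplitHandler` are assumptions made for the example; this DEP does not define them, and a real Hyperdrive handler would decode its own block format.

```javascript
// Hypothetical block layout: UTF-8 JSON with a numeric `timestamp` field.
// Neither the layout nor the function name is defined by this DEP.
function timestampSplitHandler (seq, blockA, blockB) {
  const a = JSON.parse(blockA.toString('utf8'))
  const b = JSON.parse(blockB.toString('utf8'))

  // Prefer the block with the later write timestamp.
  if (a.timestamp > b.timestamp) return -1 // accept blockA
  if (b.timestamp > a.timestamp) return 1 // accept blockB

  // Equal timestamps: fall back to a byte-wise comparison so that every
  // peer picks the byte-wise smaller block, whichever one it saw first.
  const order = Buffer.compare(blockA, blockB)
  if (order === 0) return 0 // identical blocks, nothing to resolve
  return order < 0 ? -1 : 1
}
```

The byte-wise tie-break is one possible choice; what matters is that the result depends only on the two blocks and not on which peer evaluates them.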


# Reference Documentation
[reference-documentation]: #reference-documentation

A "split" is defined as a Hypercore which has two or more valid messages with the same sequence number.

"Split resolution" will occur during replication when a "split" is detected due to a received block. Split resolution should not occur during local writes. (Conflicting local writes should be rejected outright).

The split handler must be constructed in such a way that all peers will come to the same decisions independently. Peers should not make a decision based on information such as "time that the block was received," since it is not global knowledge. By only using global information, a split can be reliably resolved across the network.
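
The replication-time behavior described above can be sketched with a toy in-memory model. Everything here is an assumption for illustration: the log is a plain array of Buffers indexed by `seq`, signature validation is omitted, and `receiveBlock` and its return strings are hypothetical names rather than part of Hypercore's API.

```javascript
// Toy in-memory model, not Hypercore's real storage or replication code.
// `log` is an array of Buffers indexed by seq. `handler` follows the
// (seq, blockA, blockB) => -1 | 0 | 1 contract from this DEP, with the
// locally stored block passed as blockA and the received block as blockB.
function receiveBlock (log, seq, incoming, handler) {
  const existing = log[seq]

  if (existing === undefined) {
    // No block at this seq yet: only a contiguous append is accepted here.
    if (seq !== log.length) return 'out-of-order'
    log.push(incoming)
    return 'appended'
  }

  if (existing.equals(incoming)) return 'duplicate'

  // Two different blocks share one seq: this is a split.
  const decision = handler(seq, existing, incoming)
  if (decision === -1) return 'kept-local'
  if (decision !== 1) return 'unresolved' // 0: reject the new block

  // Accept the incoming block and drop everything after the rejected one.
  log.length = seq
  log.push(incoming)
  return 'replaced'
}
```

Because the handler is the only source of the decision and it sees only `seq` and the two blocks, two peers that hold the branches in opposite roles converge on the same surviving block as long as the handler itself is symmetric.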

Contributor

Could the code example from the opening post be mentioned here? It helped me create a better model of what is being described here, and I imagine it would do the same for people reading this in the future.


# Drawbacks
[drawbacks]: #drawbacks

"Split resolution" will cause data loss as some part of the history must be discarded. If the managing software is not careful, this can result in massive data loss (e.g. if the split occurs at the first message during recovery). To limit this potential, the managing software can query the network for the latest history in cases where a split is likely (such as a backup recovery process).

"Split resolution" will cause the append-only invariant of Hypercore logs to be optional. This means that file history and versions will not be immutable.
Contributor

What are the implications of changing the file history and versions to not be immutable? Which assumptions will be changed by this? It feels like this would be a pretty big change to the core premise of Hypercore.

Contributor Author

The largest implication is that specifying a revision-number no longer provides a strong guarantee of content. Explaining in more detail:

With immutable history, it's enough to provide the pair (pubkey, revision) to specify a set of content with a strong guarantee of the content. That is, for any (pubkey, revision) pair, there is only one dataset which may exist in the world. (That's assuming that the history's immutability is actually maintained, which depends on the network gossiping effectively about any known splits.)

With mutable history, to provide a strong guarantee of content, you must include a content hash. Therefore you must provide a triple (pubkey, revision, hash) to specify a strong guarantee.

We had already planned to create the triple-form as something called "strong links" for two reasons:

  1. When DNS is involved, you're actually specifying the pair (domain-name, revision) which means that the immutable history guarantee is not upheld. DNS is common enough that this is a concern.
  2. It's not physically impossible to maintain a split in the network, at least for some time, making the immutable history guarantee somewhat weak.
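
To illustrate the (pubkey, revision, hash) triple discussed in this comment, here is a rough sketch. The hashing scheme is a stand-in chosen for the example (SHA-256 over length-prefixed blocks via Node's `crypto` module); it is not Hypercore's actual hashing, and `contentHash`, `makeStrongLink`, and `matchesStrongLink` are hypothetical names.

```javascript
const crypto = require('crypto')

// Stand-in content hash: SHA-256 over the first `revision` blocks, each
// prefixed by its 4-byte length. Hypercore's real hashing scheme differs;
// this only illustrates why the third element of the triple is needed.
function contentHash (blocks, revision) {
  const h = crypto.createHash('sha256')
  for (const block of blocks.slice(0, revision)) {
    const len = Buffer.alloc(4)
    len.writeUInt32BE(block.length, 0)
    h.update(len)
    h.update(block)
  }
  return h.digest('hex')
}

// Build the triple for the history a peer currently holds.
function makeStrongLink (pubkey, blocks, revision) {
  return { pubkey, revision, hash: contentHash(blocks, revision) }
}

// Check whether a given history is the one the link refers to.
function matchesStrongLink (link, pubkey, blocks) {
  if (pubkey !== link.pubkey) return false
  if (blocks.length < link.revision) return false
  return contentHash(blocks, link.revision) === link.hash
}
```

With only (pubkey, revision), two branches of a split are indistinguishable at the same revision; with the hash included, a link made against one branch fails to match the other.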

Contributor Author

Another implication has to do with processing guarantees. For instance, if we assert immutable history, then a process which ingests hypercores and produces computed views can assume that previously-processed revisions will never change. This is a nice optimization which might even enable the process to discard revisions after processing them.

If the immutable history guarantee is lost, that optimization is no longer possible, and the processor has to watch for splits and recompute its views accordingly.
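
A small sketch of what that recomputation could look like, under heavy simplification: the log is modeled as an array of numbers, the computed view is their running sum, and the class `SumView` with its method names is hypothetical. The point is only that the processor must keep enough per-revision state to roll back to the split point.

```javascript
// Toy computed view over a log of numbers (a stand-in for real blocks).
// totals[i] holds the view state after processing seq 0..i, so the view
// can be rolled back when history changes at some seq.
class SumView {
  constructor () {
    this.totals = []
  }

  get processed () {
    return this.totals.length
  }

  get value () {
    return this.totals.length === 0 ? 0 : this.totals[this.totals.length - 1]
  }

  // Process any blocks that have not been seen yet.
  ingest (log) {
    for (let seq = this.processed; seq < log.length; seq++) {
      this.totals.push(this.value + log[seq])
    }
  }

  // Call after a split at `seq` has been resolved in favor of a new branch:
  // discard view state from `seq` onward and recompute from the new log.
  onSplitResolved (seq, log) {
    if (seq < this.totals.length) this.totals.length = seq
    this.ingest(log)
  }
}
```

Keeping `totals` for every revision is exactly the cost the immutable-history optimization would have avoided.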

Contributor Author

In my opinion, it should be optional for a data structure built on hypercore to include a split-resolution algorithm. If it does not, then it can assume immutable history.



# Rationale and alternatives
[alternatives]: #alternatives

- Previously we've discussed a "major version pointer" which made it possible to recover from a split by publishing a new history ([discussion](https://github.com/datprotocol/DEPs/issues/31)). The "major version pointer" is a less efficient solution as it requires a wholly new history to recover, had its own edge-cases which were difficult to recover from, and was rejected for being too complex.
pfrazee marked this conversation as resolved.


# Unresolved questions
[unresolved]: #unresolved-questions

Contributor

On line 34 the following statement is made:

> It may have a standard definition provided by a higher-level data structure such as HyperDB or Hyperdrive.

Perhaps this would be worth mentioning as part of the unresolved questions?

Contributor Author

It's not really an unresolved question; it's more a prompt for followup DEPs providing the standard definitions for the various data structures.

Contributor

Gotcha.



# Changelog
[changelog]: #changelog

- 2018-10-02: First complete draft submitted for review