Proposal: Extension Message to notify about hypercore keys #38
Comments
@mafintosh and I had talked a while back about supporting a zero-knowledge proxy that would fill a similar role to Tor relays or TURN servers... meaning you could run one to donate your bandwidth, but as the operator you would not know what dat keys are being shared, since the entire connection would be e2e encrypted and the metadata obscured from you. Is this use case in scope here?
We were talking about similar stuff on IRC, actually. What I'm proposing here would make use of the existing hypercore-protocol, so the gateway would need to have access to the feed contents. This would still work with the second approach to zero-knowledge below.

There are two approaches for gateways:

The first is what you described with zero-knowledge relays. @fsteff and I have been working on it with discovery-swarm-stream. It acts as a proxy for discovery-swarm, so it only knows about discovery keys and doesn't learn the actual keys or what the contents are. It can only do the same level of MITM as an ISP could.

The other approach is to encrypt the contents of the hypercore and use the existing replication protocol with peers, making use of a hypercore proxy. Basically, you give something like hashbase the URL for your archive so it can replicate its contents, but the actual content of the hypercores is encrypted, so it can't do anything with it. Peers fetching the data will have the encryption key as part of the Dat URL, so they will have full read access. Thus you can use this proposal with the existing hypercore-protocol while preventing gateways from knowing what's really in the dats.

This of course only works for encrypted dat archives, and the first approach is more "simple".
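A minimal sketch of that second (encrypted-content) approach, assuming plain Node crypto on top of hypercore's standard append/get API; the AES-GCM framing and the way the key would be carried in the Dat URL are placeholder choices, not something this thread specifies:

```js
// Sketch: encrypt blocks before they hit the hypercore, so a gateway
// replicating the feed only ever sees ciphertext.
const crypto = require('crypto')
const hypercore = require('hypercore')

const feed = hypercore('./encrypted-feed')
// Placeholder: in practice this key would be derived from / carried in the Dat URL.
const contentKey = crypto.randomBytes(32)

function appendEncrypted (feed, plaintext, cb) {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', contentKey, iv)
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()])
  // Store IV + auth tag alongside the ciphertext so readers can decrypt.
  feed.append(Buffer.concat([iv, cipher.getAuthTag(), body]), cb)
}

function getDecrypted (feed, index, cb) {
  feed.get(index, (err, block) => {
    if (err) return cb(err)
    const iv = block.slice(0, 12)
    const tag = block.slice(12, 28)
    const decipher = crypto.createDecipheriv('aes-256-gcm', contentKey, iv)
    decipher.setAuthTag(tag)
    cb(null, Buffer.concat([decipher.update(block.slice(28)), decipher.final()]))
  })
}

// Round trip: the gateway replicating this feed never sees "hello world".
appendEncrypted(feed, Buffer.from('hello world'), () => {
  getDecrypted(feed, 0, (err, buf) => console.log(err || buf.toString()))
})
```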
If I were to run this proxy service, I'd probably want authentication so that I could meter the bandwidth usage. Do you have any thoughts on that?
I think that authentication and the like could be tacked on at the transport level. If the standard for these services makes use of websockets (which it really should), it'd be trivial to add BasicAuth on top of them. If you have a service with metering, you could have people connect using a WS URL that contains their credentials.

I like basic auth because it's the least effort for developers to implement if they have an existing WS implementation. They can pass in a URL and not worry about setting headers or anything else. This also decouples the generation of any sort of token from the use of the token, which lets services define whatever authentication and token generation they want.

Having websockets would also make it easy to load-balance between services using something like nginx. Websockets are super important here, too, in order to support the web (which is going to be a big use case for gateways).

Another reason it'd be useful to have it in the WS URL is that we don't need to modify the hypercore protocol further. A service can decide whether it wants to accept a connection without having to bother with handshaking or any other processing.
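A minimal sketch of what that could look like from the client side, assuming websocket-stream and the replicate API hypercore had at the time; the gateway URL, the user:token credentials, and the feed key are placeholders:

```js
// Sketch: credentials ride in the websocket URL, so a gateway can meter or
// reject the connection before any hypercore-protocol traffic happens.
const websocket = require('websocket-stream')
const hypercore = require('hypercore')

// Placeholder: the 32-byte public key of a feed the gateway already knows.
const someKnownKey = Buffer.alloc(32)

// Hypothetical gateway; the user:token@ part is typically sent as a
// Basic Authorization header on the websocket upgrade request.
const stream = websocket('wss://user:token@gateway.example.com')

const feed = hypercore('./storage', someKnownKey)
// Pipe the hypercore replication stream over the authenticated websocket.
stream.pipe(feed.replicate({ live: true })).pipe(stream)
```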
Would it be reasonable to (ab)use the pinning service API for this? https://www.datprotocol.com/deps/0003-http-pinning-service-api/ To reduce the number of pinning API messages sent, a "client" could use pinning to establish a first hypercore session with the proxy, then announce additional feed public keys via that "write" mechanism.
I think that having the information be part of the hypercore-protocol would be more useful for supporting different environments. Having something out-of-band in addition to the hypercore-protocol seems a little messy and wouldn't work for cases where you don't have an HTTP pinning service handy.

On a different note, could somebody point me to anything describing how I'd use an extension message with hyperdrive so I could give this a go?
@RangerMauve https://github.com/beakerbrowser/dat-ephemeral-ext-msg might be a good reference
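For reference, a rough sketch of wiring up an extension message on a hypercore, using the registerExtension API that later hypercore versions expose (older hypercores handled extensions differently); the 'key-notify' name and the JSON payload are placeholders, and for a hyperdrive you would register on its metadata feed:

```js
// Sketch: a named extension on a hypercore feed. Both sides must register the
// same extension name; unknown extension names are ignored by the other peer.
const hypercore = require('hypercore')

const feed = hypercore('./storage')
const ext = feed.registerExtension('key-notify', {
  encoding: 'json', // payload framing is up to the extension
  onmessage (message, peer) {
    // e.g. { type: 'REQUEST', key: '<hex feed key>' }
    console.log('extension message from peer', message)
  },
  onerror (err) {
    console.error('bad extension message', err)
  }
})

// Broadcast to every connected peer, or target one with ext.send(msg, peer).
ext.broadcast({ type: 'REQUEST', key: 'deadbeef...' })
```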
By the way, the reason I didn't progress on this is that I was getting skeptical of giving knowledge of public keys to gateways. That's why I took the approach of proxying entire connections through the gateway instead of using hypercore-protocol in discovery-swarm-stream. This is mostly a concern for public gateways that you don't necessarily want to trust with your data, so it might not be relevant to other use cases.
I'm gonna try to do something regarding pinning next week. Might have time to look at this.
So now that pinning is ready, I'd like to revisit this. Ideally I'd like to have the following properties:
Here's how I'm thinking of approaching it:
Any comments on gaps in this plan or any better ideas on how to achieve this?
My main motivation for this is to standardize storage further so that storage providers can be more generic, so that application developers can do crazy things with hypercores, and so that users will always be able to choose their storage providers without worrying about who supports which data structure.
A potentially huge problem here is spam. If an app does load parts of the received feeds, this can easily turn into a DoS attack, especially for services like hashbase (!).
Could you elaborate on that? I'd imagine you'd have the same bandwidth / storage quotas that you'd see with a large hyperdrive. Plus, the scenario you describe could still happen with a regular multiwriter hyperdrive (once they're out); it's just that Hashbase would need to parse the contents of the hyperdrive rather than simply replicating hypercores without looking at what's inside.
I think https://github.com/kappa-db/multifeed is doing this, and there are more and more applications using this standard.
Recently, there's been work around hypercore-protocol-proxy which is making use of hypercore-protocol to replicate feeds from a gateway and for multicasting data.
It works great for feeds that gateways already know about, but the protocol is limited in that you can't start replicating keys that both peers don't already know about.
One use-case that I'm really interested in is public gateways that allow peers to connect to them and replicate any key, with the gateway automatically connecting to the discovery-swarm to do the replication.
I propose a hypercore extension message that will notify the other party about a "related" key.
This will be used by clients to notify the gateway about a key before attempting to send the "feed" message.
I think we'll need to bikeshed a bunch about the actual details, but I think it would look something like:
- `relateFeed(key, cb)`: lets clients relate feeds to the gateway by their key.
- `REQUEST`: the action sent for the key to the gateway.
- `REFUSE`: call the CB with an error.
- `READY`: call the CB without an error.

I think that having a standard DEP for this will make it easier to have people deploy gateways that can be reused between applications.
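To make the shape of this concrete, here is a rough client-side sketch of how relateFeed could sit on top of an extension message; the extension name, the JSON payload, and the REQUEST/REFUSE/READY handling are all placeholders pending the bikeshedding mentioned above, and it assumes the registerExtension API from newer hypercores:

```js
// Rough sketch of the client side of relateFeed(key, cb) as an extension
// message. The gateway would answer each REQUEST with READY or REFUSE.
const hypercore = require('hypercore')

const feed = hypercore('./storage') // the feed whose replication stream reaches the gateway
const pending = new Map() // hex key -> callback waiting on the gateway's answer

const ext = feed.registerExtension('relate-feed', {
  encoding: 'json',
  onmessage (message, peer) {
    const cb = pending.get(message.key)
    if (!cb) return
    pending.delete(message.key)
    if (message.type === 'READY') cb(null) // safe to send the "feed" message now
    else if (message.type === 'REFUSE') cb(new Error('gateway refused ' + message.key))
  }
})

function relateFeed (key, cb) {
  const hexKey = Buffer.isBuffer(key) ? key.toString('hex') : key
  pending.set(hexKey, cb)
  ext.broadcast({ type: 'REQUEST', key: hexKey })
}
```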