-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC4256: RFC 9420 MLS mode Matrix #4256
base: main
Are you sure you want to change the base?
Conversation
|
||
![Invite flow][invite-flow] | ||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about display name and avatar url of members? Are they stored in the MLS group context too? And if so, how to prevent malicious users from changing them.
9. If the `can_propose` has entries not in `servers`, reject | ||
10. Otherwise, allow | ||
10. If the event type is `m.mls.pending_commit`: | ||
1. If the epoch of this event is not exactly the epoch of the previous `m.mls.commit` event + 1, reject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the current m.mls.commit
event" might be clearer here since m.mls.pending_commit
doesn't directly replace that event. This could maybe also be applied to 9.2.
7. If the `powers` has entries not in `servers`, reject | ||
8. If the `powers` start not with exactly the same entries in the same order as the subset of | ||
entries in the previous `powers` above the `origin` , reject | ||
9. If the `can_propose` has entries not in `servers`, reject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we need to enforce that entries from powers
also appear in can_propose
to enable the process described in "Commit handling with multiple servers" below?
Currently redactions are not supported in the outer protocol layer. This has the benefit of making | ||
redactions invisible to the server, but also prevents any server side aggregation. This also means | ||
content can’t be deleted. As we require all events to be encrypted, this might not be a major | ||
problem, but policy makers might disagree. Redactions could be supported in various ways and we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redactions are not only relevant in the context of data privacy but also for moderation. Not being able to delete malicious or illegal content seems like a significant problem. 😕
the expense of some of the security benefits of MLS. At its core having a coherent set of members in | ||
a room and the ability to change membership at any time without the ability to broadcast these | ||
changes to other servers in the room are at conflict. If any server can modify historical membership | ||
of a room, this compromises Post-Compromise Security (PFS). In an earlier proposal, decentralised |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The acronym here is a little confusing because it doesn't fit the written-out form:
Is it suppposed to be "Perfect Forward Secrecy (PFS)" or "Post-Compromise Security (PCS)"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is Forward secrecy which is compromised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only read through part of this, but here are some (mostly nitpicky) comments so far. Generally, there's inconsistency between using "mls" and "MLS". It would be nice to use "MLS" everywhere.
|
||
Every valid state event in this room version is required to have an empty state key. | ||
|
||
The `m.room.create` event has a mandatory `algorithm` field now to set the encryption algorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bikeshedding: I'd suggest using encryption_algorithm
as the property name, as defined in MSC2425, to make it clear that it's encryption-related.
encrypted events with only an `algorithm` and `ciphertext` field. It contains the current MLS | ||
commit as a MLS private message. | ||
|
||
The MLS commit contains a CBOR encoded field in its unencrypted authenticated data of the MLS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give some rationale here for why you're using CBOR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this CBOR-encoded data the only thing in the authenticated data, or are there other things that are in there alongside it?
4. Considering the events auth events: | ||
1. If there is not exactly one `m.room.create` event (with implicitly an empty state key), reject | ||
2. If the event is not the `m.room.create` event (we never get here) or an `m.mls.commit` event | ||
with epoch == 0 (we do get here), then if there is no `m.mls.commit` event in the auth events, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anywhere in this MSC that mentions where the epoch comes from. I think it may be helpful to mention something about that somewhere, since this MSC will be read by people not completely familiar with MLS. (It comes from the ciphertext
, which despite being called "ciphertext", contains some cleartext bits, including the epoch. Which means that servers will be expected to be able to parse the MLS ciphertext, at least enough to read the epoch, group ID, and anything else that's needed.)
5. If the `content` of the `m.room.create` event in the room state has the property `m.federate` set | ||
to `false`, and the `origin` of the event does not match the `origin` domain of the create event, | ||
reject. | ||
6. If the `origin` is not listed in the `servers` of the previous `m.mls.commit` event (unless it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to explain what "previous" means here (presumably, the m.mls.commit
event that is listed in the event's auth_events
).
6. If the `powers` has entries not in `servers`, reject | ||
7. If the `can_propose` has entries not in `servers`, reject | ||
8. Otherwise, allow | ||
11. If the epoch of the `ciphertext` is not the epoch of the current `m.mls.commit` event + 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the epoch of the ciphertext be equal to the epoch of the current m.mls.commit
event (which I assume means the m.mls.commit
event that is referenced in the event's auth_events)? Otherwise you're encrypting with an epoch that hasn't been created yet?
should ignore the `m.mls.proposed_commit`. | ||
|
||
Clients can know their `m.mls.proposed_commit` failed to be applied if they haven’t received a | ||
`m.mls.commit` event matching their `m.mls.proposed_commit` after "len(powers)*5" minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can clients also infer that their commit will fail to be applied if they receive a different m.mls.commit
with the same epoch as their commit?
|
||
State resolution might result in some clients losing access to messages as they threw their private | ||
state away immediately to provide PCS and forward secrecy. In that case the clients dropped by the | ||
state event will have to be sent another welcome to rejoin them into the room. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way for clients to ask to be re-added? How do others know that the client needs to be re-added?
Having conflicting commits will lead to some undecryptable messages. For short durations that can be | ||
mitigated by keeping the old state around for seconds to minutes. This is the common case when | ||
people invite new users at the same time or do otherwise racing modifications on different servers. | ||
For long durations no automatic recovery is possible and manual communication is necessary. We |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this "manual communication" would entail? Are users going to have to talk to each other out-of-band and poke at the cryptographic state in their clients?
|
||
As no member events are used in this room version, the direct /state endpoint can not be used to | ||
change membership in a room | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are user's new devices/deleted devices handled? Who is in charge of adding/removing them from the group? How do people leave the group?
not provide the security guarantees we are looking for and is incompatible with MLS. If state | ||
resolution allows an arbitrary long amount of time between one server changing membership and other | ||
servers being made aware of that change, secret keys to resolve those commits have to be kept around | ||
for the same amount of time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be noted that the plan for MSC2883 was to try to apply https://eprint.iacr.org/2023/394 to address the issue of needing to keep private keys, so that epochs could not be derived a second time. However, we did not manage to work on it, so it is absolutely fair to say that the MSC as it currently stands has that weakness. (And, of course, applying that would be a further deviation from standard MLS.)
|
||
### Room state | ||
|
||
Currently all state is stored in one GroupContext extension. This has obvious size limitations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, I believe that this means that having the state in the GroupContext means that the entire state will be encoded in every single commit.
|
||
There are several options for how a client can send a new commit into a room. | ||
|
||
* Users on the first server listed in the "powers" list can send commit state events directly. They |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if multiple people on that first server send a commit state event at the same time? Does he server just select one to accept and return an error to the others? If so, what error code does it use?
|
||
![Invite flow][invite-flow] | ||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the possible delay in applying a commit due to "powers" servers being offline is a significant potential issue that should be listed here. For example, if the first server listed in "powers" is offline, then when you invite someone, it will take 5 minutes before they can actually be in the room. (Could they end up being halfway in the room, if the inviter closes their client after they try to invite, but before they see the commit event?)
This, combined with encrypted state, may also cause problems for MatrixRTC, which IIRC uses various state events. If it takes over 5 minutes to set up a call, or for someone to join a call, that could be problematic.
people invite new users at the same time or do otherwise racing modifications on different servers. | ||
For long durations no automatic recovery is possible and manual communication is necessary. We | ||
expect those cases to be rare even in larger federations. Long term server outages usually don’t | ||
include membership modifications at the same time as usually one server is completely offline. Even |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Membership modifications aren't the only reasons for sending commits, though. Group members are expected to periodically send commits (or Updates which are committed by others), which will probably be more frequent than membership changes.
* Servers not listed as the first entry in the "powers" list should send a "m.mls.pending_commit" | ||
as a normal message. This will then be applied according to the sorting rules of that server, or | ||
according to the rules in the following section, and a commit state event will be sent by one of the | ||
homeservers listed in the "powers" list. Only users on servers in the "can_propose" list can send |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This nearly tripped me the first time I read it, so a note for future readers: in the current existing room versions, if a server wants to send an event to a room, it needs to use a user ID of a room member that is on that server to send the event. However, in rooms covered by this MSC, there is no sender
field, so servers can send events on their own, without needing to invoke a user ID.
After this the commit in the room is updated. Once this commit is accepted by the delivery service, | ||
the inviting user should send out the welcome message as a to_device message and send an invite | ||
using the /invite endpoint. To distribute the ratchet tree the ratchet tree MLS extension is | ||
currently required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In large groups, the Welcome
message can get quite large, so it may not be appropriate to send the whole thing as a to-device message. In my MSC2883 work, I put large Welcome
messages in the media repository, and sent a reference to the media repository as a to-device message. This has some advantages (e.g. the client can download the large message when it wants to, rather than have it clog up the /sync
, and the message doesn't need to be base64-encoded). But it has some other problems (e.g. we don't have a way to delete the media when it's no longer needed, the sender's server could be down when the recipient wants to download it). We may need to look into some better way of sending large blobs between devices, as this is an issue that affects all MLS-related MSCs, as well as possibly some other things that we're working on.
user to the MLS group. The key packages should be validated and should be cross-signed as configured | ||
in the additional authenticated data of the current MLS commit. | ||
After this the commit in the room is updated. Once this commit is accepted by the delivery service, | ||
the inviting user should send out the welcome message as a to_device message and send an invite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any definition of how the Welcome message is sent as a to-device event (e.g. what event type it uses, what properties the event has). Can you add that?
user to the MLS group. The key packages should be validated and should be cross-signed as configured | ||
in the additional authenticated data of the current MLS commit. | ||
After this the commit in the room is updated. Once this commit is accepted by the delivery service, | ||
the inviting user should send out the welcome message as a to_device message and send an invite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the impact if the invite was not sent, for example due to a buggy or malicious client, or due to a network error? It seems like the MLS membership list and Network membership list would get out of sync, which could be problematic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relatedly, is there any risk of a buggy client sending an invite without having sent a welcome message?
The invited user will then receive the invite and welcome message. If the user wants to join the | ||
room, the client can send a request to the /join endpoint. | ||
|
||
As no member events are used in this room version, the direct /state endpoint can not be used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that member events are not used, how does network membership (invites, joins, etc) propagate between the homeservers that are participating in the room?
and should clamp any larger values to that maximum. | ||
`security.required_verification` specifies the verification required to add new users or clients | ||
to the MLS state. The sender of such a commit needs to have at least this level of verification with | ||
the other participant to add them. Possible values are m.direct, m.cross, m.tofu and m.none. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can guess what most of the values mean, but what does m.direct mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is this setting enforced by the protocol somehow, or is it intended as an indication to clients about how they SHOULD behave?
"powers" list. | ||
|
||
Servers have a 5 minute conflict window to apply `m.mls.proposed_commit` they receive. The | ||
highest power server can apply the commit in the first 3 minutes after the `origin_server_ts` of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not subject to manipulation by a server wishing to put itself in charge of applying commits, by falsifying origin_server_ts?
Clients can know their `m.mls.proposed_commit` failed to be applied if they haven’t received a | ||
`m.mls.commit` event matching their `m.mls.proposed_commit` after "len(powers)*5" minutes. | ||
|
||
State resolution might result in some clients losing access to messages as they threw their private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of this section is that a netsplit of any non-trivial duration can lead to different commits being applied on each side of the split; when the split is eventually resolved, this will result in one side "losing" and any clients on the losing side will be effectively excluded from the room, since they will have discarded the necessary MLS state.
If that's correct, it feels akin to the "state reset" problems that current Matrix can suffer from, but much worse.
This seems very dangerous to me. IMHO it would be better not to allow MLS commits outside the current highest power server at all.
#### Membership handling | ||
|
||
This proposal does not use membership events and gets rid of the "sender" field in events. This | ||
essentially makes membership opaque to the homeserver and gets Matrix a lot closer to portable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unclear how membership can be opaque to the server. Doesn't the server still need to know a membership list to distribute events to clients?
Rendered
Server/Synapse implementation:
Client implementation:
Related to: