Proposals to fix BLS RFC v4's message binding security property #38
Cross-referencing #35.
I sent the following email to [email protected]. I'd like to escalate this issue to the CFRG chairs as a last resort. Following the responsible disclosure process, I reported the bugs privately well before I posted them publicly at #38. I did everything within my capability: reported the bugs, wrote a proof-of-concept attack, and wrote a proof-of-concept fix. I'm curious what the time commitment of the RFC's authors is in resolving the following deadlock:
The standard's authors are in an extremely powerful position: they dictate what every library should do. Does that position come with a responsibility to respond in a timely manner to security bugs deployed in production? Note that I'm not saying my proposed fix is correct and that the RFC's authors should follow it. What I'm asking is: what is the BLS RFC authors' time commitment to resolving the security issues deployed in production? Thanks,
Hi Quan, I really appreciate your reporting the bug and all of the thought you have clearly put into this. The short answer to your question, "what is the time commitment of the authors?" is: we're doing this on a volunteer basis, and therefore real life takes priority. For my part: I have not had a moment's spare time in the last 6 months. I hope to have time in the next several weeks to think more about this, but I do not have any more specific information than that. I can't speak for any of the other authors. Regarding deployment in production: I really do understand the concern here, and I share it. But this document is in draft status, which means the risk of bugs or incomplete features is understood to be nonzero, including by people who choose to deploy the draft specification. I am not pointing this out to be legalistic or to justify inaction, only to push back on the idea that somehow production deployment changes the authors' volunteer status or time commitment. I very much plan to deal with this issue once real life lets up a bit, but I do not know how soon that is. That is really the best commitment I can give you. Thanks for understanding, -=rsw
@cryptosubtlety perhaps you could send a PR with your suggestions?
@chris-wood Are you suggesting that I make a PR before the BLS RFC's authors agree that my proposed fix is correct and safe, and before they agree to accept it?
Yeah, that's precisely what I'm suggesting :-) If the fix is agreeable, having a PR ready to go and review will surely save time. If it's not, then we close it, and no harm is done. (Plus, seeing the change might make evaluating the fix easier.)
"If it's not, then we close it, and no harm is done" --> it's a waste of time.
As noted above, this is not an IETF standard; it is not even submitted in a way that puts it under consideration to be an IETF standard. At this time it is a work product of the IRTF (a research group) and marked as "Intended status: Informational". For more clarity, hear it straight from the renowned Brian E. Carpenter: "It typically takes years and years to get to IETF Standard status, and few RFCs achieve that. It takes years just to get to a Proposed or Draft Standard. Read about the process on the IETF site, or at Wikipedia: Internet Standard." And read another internet-draft on the topic: The people who should take responsibility for quickly resolving this are the people who have decided to ask others to trust them by deploying this draft research work, not the people who have volunteered to do research on the subject.
I'm not sure why everyone is enthusiastic about correcting me, one reader among the draft's several readers. Why don't you try to correct the authors themselves, who created the confusion? 1/ https://github.com/cfrg/draft-irtf-cfrg-bls-signature says "BLS Standard Draft. The repo is maintained by a working group aiming to standardize BLS signature scheme."
@cryptosubtlety That's a fair point. My initial goal was simply to link this conversation to some clarity on how the IETF itself fits in, since I learned about this on the IETF mailing list. A top priority would seem to be for the authors of this site to update the language on this site (e.g. in the README, vs the language in the I-D which is much clearer) to clarify which goals are short-term vs long-term, which venues standardization might or might not take place in, caveats about possible flaws and bugs and expectations for responses, etc. I appreciate your efforts to identify flaws, to encourage this discussion, and now for helping to identify those who have all of the above reasons (including deployment) to quickly resolve this. As others have noted, an honored way to continue contributing is to propose "working code" (a PR) for at least one possible approach, but you are free to leave it to those who have even better-aligned incentives.
I definitely agree on priority and a short-term vs. long-term fix. Note that the implementers already fixed the bugs for single signatures and for FastAggregateVerify, because the draft is correct for those cases. The only missing piece is to fix the same bug for AggregateVerify. It's worth stressing that my proposal doesn't propose any new algorithm; it only makes all APIs consistent with each other. I think fixing this bug is the highest priority for now. I've learned that people from different backgrounds and environments have different views on what "quick" means. One of the authors was contacted privately in Nov 2020, i.e., 6 months ago. For industrial security, 6 months is a very long time to fix a security bug.
The document has 5 authors who are security experts. It's hard to imagine that all 5 are too busy to review a security fix that is not a new algorithm. Therefore, waiting 6 months without even a concrete time commitment to a fix is unacceptable to me. It's your document; you decide what to do with it. I give up.
We plan to address this in the next update. Apologies that our schedules didn't match yours. As I've noted elsewhere, we're just volunteers. Apparently it is entirely possible for none of us to have time in a given 6-month period, since that just happened. I'm hopeful that my schedule will improve in the next few weeks.
Hey, thanks @kwantam for all the heavy lifting. Just checking in here. I was not aware of this issue until just now, and that is on me. I am happy to contribute in any way toward a fix.
@zhenfeizhang @kwantam opened a PR with a proposed fix in #40
Let's get real for a moment. Do researchers actually assert that it's safe to use AggregateVerify with non-unique messages with PoP at hand? FastAggregateVerify rests on a proof for the same-message case, but that doesn't actually give you a pass to use AggregateVerify with non-unique messages. Is there a proof for the latter case? And if there is (where?), then which assurances does it actually provide? In other words, the correct course of action is to first spell out the scheme's applicability boundaries, then discuss fixes.
Besides, the suggested mitigation works only when the colluding parties choose a message different from all the others. But if they choose to share a message with one honest user and get it across, they can later claim they were sharing another with a different user. The only way I see to preclude
@dot-asm Your questions are great. Let's wait for the IRTF draft's authors to shed some light on them.
From Dan Boneh and Victor Shoup's book https://toc.cryptobook.us/book.pdf, sections "15.5.3.2 Method 2: proof of possession of the secret key" and "15.5.3.3 Proving security of both aggregation methods" allow arbitrary messages, i.e., non-distinct messages are allowed.
Come on. As for "which assurances does it actually provide?": the thing is that, given the parameters of the corresponding attack game, the suggested fix is not needed, because you can't introduce a message different from the one signed by the victim that would be paired with the victim's public key. In other words, the referred proof stands, but it doesn't apply to the described scenario. Just in case, for reference. I also view
From a security perspective, restricting AggregateVerify for PoP to distinct messages is the safest option. However, I think it would make AggregateVerify practically useless, especially in a distributed environment like a blockchain. From the verifier's perspective, it's easy to check for distinct messages; however, making sure that distributed, independent signers sign distinct messages is a challenging task.
Looks like the PoP-less, rogue-key-attack-safe version of aggregation doesn't have the property that @cryptosubtlety noted, or does it? (The BDN18 version, and section 2 in https://crypto.stanford.edu/~dabo/pubs/papers/BLSmultisig.html.) The IETF draft contains the comment "[BDN18] prove the security of another rogue key defense; this defense is not standardized in this document." I wonder if it has been considered and rejected (simplicity, performance?) or just not discussed yet.
Does it matter in the context of the discussion? I mean, it's specifically PoP that is discussed here, so how would discussion about other schemes advance this one? Well, one can say that it would illuminate PoP limitations, but then it becomes rather about making suitable choices for specific applications. In other words, it sounds like the question is rather "what are these-n-those schemes out there, how do they compare?" as well as "why aren't they discussed in this draft?" These are good questions, but they would be a distraction here. Well, unless the suggestion is to scrap PoP altogether as non-viable for any application, or as there-always-is-a-better-alternative. Is it?
Didn't mean to derail the discussion; of course my implicit point was "why not kill two birds and switch to the improved construction", but I reckon it's too late to make such a change, isn't it?
Thanks @veorq for bringing up https://crypto.stanford.edu/~dabo/pubs/papers/BLSmultisig.html. We can look at it to see whether we can learn any ideas from it to improve PoP's security. From a security perspective, the attacker can't manipulate both the public keys (PK1, PK2, PK3) and the coefficients (t1, t2, t3) of the aggregate signature at the same time, so it prevents all attacks that I'm aware of. From an applicability perspective, BLSmultisig.html has a drawback: the aggregator must know all the public keys (PK1, PK2, PK3) in advance to compute sig = t1 * sig1 + t2 * sig2 + t3 * sig3, i.e., it does not allow gradual aggregation. In PoP, sig = sig1 + sig2 + sig3 can be computed gradually over time by different entities: sig12 = sig1 + sig2 --> sig = sig12 + sig3.
Good points, thanks. Other problems are that it would be incompatible with the fast verification for same-message signing, and (most likely) with the optimization implemented by most wallets for verifying multiple aggregate sigs (in https://ethresear.ch/t/fast-verification-of-multiple-bls-signatures/5407).
Strawman patch for issue #38 plus fixing serialization format link
Fixed with #40
Problems
There are 2 problems with the current BLS RFC v4's AggregateVerify and FastAggregateVerify in proof-of-possession schemes (see https://github.com/cryptosubtlety/zero/blob/main/0.pdf). The AggregateVerify and FastAggregateVerify interfaces:
However, as we'll see below, one fix suffices for both of them.
1. Message binding.
In its simplest form, if PK_1 + PK_2 = 0 and msg_1 = msg_2 = msg, then AggregateVerify((PK_1, PK_2), (msg, msg), 0) = True, i.e., the zero signature verifies even though no one signed msg.
2. Inconsistency between AggregateVerify and FastAggregateVerify.
While AggregateVerify and FastAggregateVerify are 2 different interfaces, their use cases overlap. Naturally, in their common use cases, we expect them to return the same result. However, under the current BLS RFC v4, they return different results for the following common test: when PK_1 + PK_2 = 0 and msg_1 = msg_2 = msg, we have AggregateVerify((PK_1, PK_2), (msg, msg), 0) = True but FastAggregateVerify((PK_1, PK_2), msg, 0) = False.
The reason is that FastAggregateVerify first aggregates the public keys, PK = PK_1 + PK_2, and then calls KeyValidate(PK). KeyValidate checks the validity of the public key, including whether PK = 0. As PK = 0, KeyValidate(PK) returns False, which leads FastAggregateVerify((PK_1, PK_2), msg, 0) to return False. AggregateVerify, on the other hand, does not aggregate the public keys PK_1, PK_2, hence never checks PK_1 + PK_2, and therefore returns True by the pairing equations.
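The divergence between the two key-validation paths can be sketched with a toy model. This is only an illustration, not real BLS code: integers modulo a small prime Q stand in for G1 points, the group identity is 0, and all function names here are hypothetical.

```python
# Toy model of the two key-validation paths only (not the pairing checks).
# Integers mod the prime Q stand in for G1 points; all names are hypothetical.
Q = 7919  # toy prime modulus standing in for the group order

def key_validate(pk):
    # Real KeyValidate also checks encoding and subgroup membership;
    # here we model only the identity-point check.
    return pk % Q != 0

def fast_aggregate_verify_key_check(pks):
    # FastAggregateVerify aggregates the keys first, then validates the sum.
    return key_validate(sum(pks) % Q)

def aggregate_verify_key_check(pks):
    # The current AggregateVerify never checks the aggregate of the keys.
    return all(key_validate(pk) for pk in pks)

pk1, pk2 = 42, Q - 42  # PK_1 + PK_2 = 0 (mod Q)
print(fast_aggregate_verify_key_check([pk1, pk2]))  # False: sum is the identity
print(aggregate_verify_key_check([pk1, pk2]))       # True: each key looks valid
```

With PK_1 + PK_2 = 0, the first path rejects while the second accepts, which is exactly the inconsistency described above.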
Proposed fix
Aggregate the public keys of the same message in AggregateVerify and call KeyValidate on those aggregated public keys. This is only an extra check, so it is safe security-wise: it cannot decrease security by any means.
Let's assume that we'd like to AggregateVerify the following public keys and messages: (PK_1, msg_1), (PK_2, msg_2), (PK_3, msg_3), (PK_4, msg_4).
Note that msg_1 = msg_3 and msg_2 = msg_4. Before executing AggregateVerify as defined in the RFC, we aggregate the public keys of the same message as follows: PK_13 = PK_1 + PK_3, PK_24 = PK_2 + PK_4,
and we call KeyValidate(PK_13) and KeyValidate(PK_24). If PK_13 = 0 or PK_24 = 0, then KeyValidate returns False and hence AggregateVerify returns False.
This prevents all known message-binding attacks, because the attacks require using the same message with an aggregate public key equal to 0. Note that rogue public key attacks also require the same message. In other words, I believe this proposal fixes the root cause.
This proposal also fixes the inconsistency between AggregateVerify and FastAggregateVerify, because AggregateVerify((PK_1, PK_2), (msg, msg), 0) = False = FastAggregateVerify((PK_1, PK_2), msg, 0). Furthermore, with the proposal, the behaviors of AggregateVerify and FastAggregateVerify are exactly the same in the case msg_1 = msg_2 = … = msg_n, because both aggregate the public keys and validate the result.
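The extra check itself can be sketched as follows. Again this is a toy model, not real BLS code: integers mod a prime stand in for G1 public keys, 0 is the identity, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy sketch of the proposed fix: group public keys by message, aggregate
# each group, and reject if any per-message aggregate is the identity.
# Integers mod the prime Q stand in for G1 points; names are hypothetical.
Q = 7919  # toy prime modulus standing in for the group order

def aggregate_pks_by_message(pairs):
    """Group public keys by message and sum each group (mod Q)."""
    groups = defaultdict(int)
    for msg, pk in pairs:
        groups[msg] = (groups[msg] + pk) % Q
    return groups

def message_binding_check(pairs):
    """Return False if any per-message aggregate public key is the identity."""
    return all(agg != 0 for agg in aggregate_pks_by_message(pairs).values())

# PK_1 + PK_2 = 0 with msg_1 = msg_2: the check must reject.
print(message_binding_check([(b"msg", 42), (b"msg", Q - 42)]))  # False
# Same keys with distinct messages: the check passes.
print(message_binding_check([(b"a", 42), (b"b", Q - 42)]))      # True
```

In real code this check would run before the pairing computation of AggregateVerify, with the modular sums replaced by G1 point additions and KeyValidate calls.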
Optional proposal to improve performance
Note that in proof-of-possession, the current AggregateVerify is the same as CoreAggregateVerify (https://tools.ietf.org/html/draft-irtf-cfrg-bls-signature-04#section-2.9). In CoreAggregateVerify’s pseudocode
In our example, unrolling the loop, C1 is computed as follows: C1 = e(H(msg_1), PK_1) * e(H(msg_2), PK_2) * e(H(msg_3), PK_3) * e(H(msg_4), PK_4).
I.e., it computes 4 pairings.
As msg_1 = msg_3 and msg_2 = msg_4, by bilinearity we can rewrite C1 as follows: C1 = e(H(msg_1), PK_1 + PK_3) * e(H(msg_2), PK_2 + PK_4).
As we have already aggregated the public keys of the same messages in the previous step, PK_1 + PK_3 = PK_13 and PK_2 + PK_4 = PK_24, the computation of C1 requires only 2 pairings, while the old code requires 4 pairings.
This can be a significant performance improvement. Consider a use case where the first 2000 messages are identical and the last 2000 messages are identical: the old code would need 4000 pairings, while the new code needs only 2.
Note that this performance improvement does not change any existing behavior; we only compute the same quantity in a different way, so it does not affect security.
Proof of concept implementation in py_ecc
This PoC contains only the fix, not the optional performance improvement. The idea is to sort the (message, public key) pairs by message so that identical messages are adjacent. After that, we can aggregate the public keys of identical messages in a single loop. An alternative implementation could use a hash map.
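The sort-then-aggregate pass can be sketched like this (a toy model, not the actual py_ecc PoC: integers stand in for public keys and addition stands in for G1 point addition):

```python
def aggregate_by_sorting(pairs):
    """Sort (message, pk) pairs so equal messages are adjacent, then
    aggregate public keys over each run of equal messages in one pass.
    Toy model: pks are integers and aggregation is integer addition."""
    pairs = sorted(pairs, key=lambda p: p[0])
    result = []
    for msg, pk in pairs:
        if result and result[-1][0] == msg:
            # Same message as the previous pair: fold this key into the run.
            result[-1] = (msg, result[-1][1] + pk)
        else:
            # New message: start a new run.
            result.append((msg, pk))
    return result

print(aggregate_by_sorting([(b"b", 2), (b"a", 1), (b"b", 3), (b"a", 4)]))
# [(b'a', 5), (b'b', 5)]
```

The hash-map alternative mentioned above would replace the sort with a dictionary keyed by message, trading O(n log n) sorting for O(n) expected-time grouping.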
@kwantam @JustinDrake