Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposals to fix BLS RFC v4's message binding security property #38

Closed
cryptosubtlety opened this issue Mar 11, 2021 · 27 comments
Closed

Comments

@cryptosubtlety
Copy link

cryptosubtlety commented Mar 11, 2021

Problems

There are 2 problems with current BLS RFC v4’s AggregateVerify and FastAggregateVerify in proof-of-possession schemes (see https://github.com/cryptosubtlety/zero/blob/main/0.pdf). AggregateVerify and FastAggregateVerify interfaces:

AggregateVerify((PK_1, ..., PK_n), (msg_1, ..., msg_n), signature)
FastAggregateVerify((PK_1, ..., PK_n), msg, signature)

However, as we’ll see below, 1 fix is enough to fix both of them.

1. Message binding.

In its simplest form, if PK_1 + PK_2 = 0 and msg_1 = msg_2 = msg then

 AggregateVerify((PK_1, PK_2), (msg, msg), 0) = True for all msg.

2. Inconsistency between AggregateVerify and FastAggregateVerify.

While AggregateVerify and FastAggregateVerify are 2 different interfaces, their use cases overlap. Naturally, in their common use cases, we expect them to return to the same result. However, by the current BLS RFC v4, they return different results for the following common test. When PK_1 + PK_2 = 0, msg_1 = msg_2 = msg, we have:

 AggregateVerify((PK_1, PK_2), (msg, msg), 0) = True
 FastAggregateVerify((PK_1, PK_2), msg, 0) = False.

The reason is that FastAggregateVerify first aggregates the public keys PK = PK_1 + PK_2 and then calls KeyValidate(PK). KeyValidate checks the validity of the public key including checking whether PK = 0. As PK = 0, KeyValidate(PK) returns False which leads to returning False for FastAggregateVerify((PK_1, PK_2), msg, 0). AggregateVerify on the other hand, does not aggregate the public keys PK_1, PK_2 and hence does not check PK_1 + PK_2 and therefore, it returns True by pairing equations.

Proposed fix

Aggregate the public keys of the same message in AggregateVerify and call KeyValidate of those aggregate public keys. This is only an extra check, so it’s safe security-wise as it can’t decrease security by any means.
Let’s assume that we'd like to AggregateVerify the following public keys and messages:

 (PK1_1, PK_2, PK_3, PK_4), 
 (msg_1 = "msg 1", msg_2 = "msg 2", msg_3 = "msg 1", msg_4 = "msg 2")

Note that msg_1 = msg_3, msg_2 = msg_4. Before executing AggregateVerify as defined in the RFC, we aggregate the public keys of the same message as follow:

 PK_1 + PK_3 = PK_13
 PK_2 + PK_4 = PK_24

and we call KeyValidate(PK_13), KeyValidate(PK_24). If PK_13 = 0 or PK_24 = 0 then KeyValidate will return False and hence AggregateVerify will return False.

This will prevent all known message binding attacks because the attacks require using the same message and the aggregate public keys is 0. Note that rogue public key attacks also require the same message. In other words, I believe this proposal fixes the root cause.

This proposal also fixes the inconsistency between AggregateVerify and FastAggregateVerify because AggregateVerify((PK_1, PK_2), (msg, msg), 0) = False = FastAggregateVerify((PK_1, PK_2), msg, 0). Furthermore, with the proposal, the behavior of AggregateVerify and FastAggregateVerify are now exactly the same in case msg_1 = msg_2 = … = msg_n because they both aggregate the public keys and check it.

Optional proposal to improve performance

Note that in proof-of-possession, the current AggregateVerify is the same as CoreAggregateVerify (https://tools.ietf.org/html/draft-irtf-cfrg-bls-signature-04#section-2.9). In CoreAggregateVerify’s pseudocode

5.  for i in 1, ..., n:
6.     If KeyValidate(PK_i) is INVALID, return INVALID
7.     xP = pubkey_to_point(PK_i)
8.     Q = hash_to_point(msg_i)
9.     C1 = C1 * pairing(Q, xP)

In our example, unroll the loop, C1 is computed as follow:

C1 = pairing(hash_to_point(msg_1), PK_1) * pairing(hash_to_point(msg_2), PK_2) 
   * pairing(hash_to_point(msg_3), PK_3) * pairing(hash_to_point(msg_4), PK_4)

I.e., it computes 4 pairings.

As msg_1 = msg_3, msg_2 = msg_4, we can rewrite C1 as follow:

C1         = pairing(hash_to_point(msg_1), PK_1) * pairing(hash_to_point(msg_2), PK_2) 
           * pairing(hash_to_point(msg_1), PK_3) * pairing(hash_to_point(msg_2), PK_4)
(reorder)  = pairing(hash_to_point(msg_1), PK_1)  * pairing(hash_to_point(msg_1), PK_3)
           * pairing(hash_to_point(msg_2), PK_2) * pairing(hash_to_point(msg_2), PK_4)
(bilinear) = pairing(hash_to_point(msg_1), PK_1 + PK_3) * pairing(hash_to_point(msg_2), PK_2 + PK_4)
           = pairing(hash_to_point(msg_1), PK_13) * pairing(hash_to_point(msg_2), PK_24)

As we already aggregate public keys of the same messages in the previous steps : PK_1 + PK_3 = PK_13, PK_2 + PK_4 = PK_24, we see that the computation of C1 only requires 2 pairings while the old code requires 4 pairings.

This is strictly significant performance improvement. Think about the use case where the first 2000 messages are the same and the last 2000 messages are the same. The old code would need 4000 pairings while the new code only needs 2 pairings.

Note that this performance improvement does not change any existing behaviour, we only compute things in a different way, so it does not affect security.

Proof of concept implementation in py_ecc

This PoC only contains the fix, not the optional performance improvement. The idea is we sort the (message, public key) pairs by message so that all the same messages stay close to each other. After that we can aggregate the public key of the same messages in 1 for loop. An alternative implementation can use a hash map.

--- a/py_ecc/bls/ciphersuites.py
+++ b/py_ecc/bls/ciphersuites.py
@@ -286,6 +286,20 @@ class G2ProofOfPossession(BaseG2Ciphersuite):
     @classmethod
     def AggregateVerify(cls, PKs: Sequence[BLSPubkey],
                         messages: Sequence[bytes], signature: BLSSignature) -> bool:
+        pk_message = [(messages[i], PKs[i]) for i in range(len(PKs))]
+        pk_message.sort(key = lambda x : x[0])
+        same_pks = [pk_message[0][1]]
+        for i in range(1, len(PKs), 1):
+            if pk_message[i][0] == pk_message[i - 1][0]:
+                same_pks.append(pk_message[i][1])
+            else:
+                agg_pubkey = cls._AggregatePKs(same_pks)
+                if not cls.KeyValidate(agg_pubkey):
+                    return False
+                same_pks = [pk_message[i][1]]
+        agg_pubkey = cls._AggregatePKs(same_pks)
+        if not cls.KeyValidate(agg_pubkey):
+            return False

@kwantam @JustinDrake

@dot-asm
Copy link

dot-asm commented Mar 14, 2021

Cross-referencing #35.

@cryptosubtlety
Copy link
Author

I sent the following email to [email protected]
"Hi,

I'd like to escalate this issue to the CFRG chairs as a last resort. By responsibility disclosure mechanism, I reported the bugs privately far before I posted it publicly at #38. I did everything in my capability: reported the bugs, wrote proof-of-concept attack, wrote proof-of-concept fix.

I'm curious what is the time commitment of the RFC's authors in resolving the following deadlock:

  • Libraries code (ethereum/py ecc, supranational/blst, herumi/bls,sigp/milagro bls) are deployed in production. They're not academic nor experimental code.
  • Libraries' authors can't fix the code because they have to follow the standard.
  • BLS RFC v4's authors don't move an inch in fixing it nor have any time commitment.

The standard authors are in an extremely powerful position where they dictate what every library should do. Does it go with responsibility for responding in a timely manner for security bugs deployed in production?
Even if they don't want to fix the message binding bug, should they at least fix a very obvious bug? AggregateVerify((PK_1, PK_2), (msg, msg), 0) = True, FastAggregateVerify((PK_1, PK_2), msg, 0) = False.

Note that I'm not saying my proposed fix is correct and RFC's authors should follow it. What I'm asking is the BLS RFC authors' time commitments in resolving the security issues deployed in production?

Thanks,
Quan

@kwantam
Copy link
Collaborator

kwantam commented Apr 23, 2021

Hi Quan,

I really appreciate your reporting the bug and all of the thought you have clearly put into this.

The short answer to your question, "what is the time commitment of the authors?" is: we're doing this on a volunteer basis, and therefore real life takes priority. For my part: I have not had a moment's spare time in the last 6 months. I hope to have time in the next several weeks to think more about this, but I do not have any more specific information than that. I can't speak for any of the other authors.

Regarding deployment in production: I really do understand the concern here, and I share it. But this document is in draft status, which means the risk of bugs or incomplete features is understood to be nonzero, including by people who choose to deploy the draft specification. I am not pointing this out to be legalistic or to justify inaction, only to push back on the idea that somehow production deployment changes the authors' volunteer status or time commitment.

I very much plan to deal with this issue once real life lets up a bit, but I do not know how soon that is. That is really the best commitment I can give you.

Thanks for understanding,

-=rsw

@chris-wood
Copy link

@cryptosubtlety perhaps you could send a PR with your suggestions?

@cryptosubtlety
Copy link
Author

@chris-wood Are you suggesting that I make a PR before BLS RFC's authors agree that my proposed fix is correct, safe and they will accept my proposed fix?
If the authors give me a green light to my proposed fix, I'm happy to submit a PR.

@chris-wood
Copy link

Yeah, that's precisely what I'm suggesting :-) if the fix is agreeable, having a PR ready to go and review will surely save time. If it's not, then we close it, and no harm is done. (Plus, seeing the change might make evaluating the fix easier.)

@cryptosubtlety
Copy link
Author

"If it's not, then we close it, and no harm is done" --> it's a waste of time.
FWIW, I wrote my proof-of-concept fix in py_ecc because it's testable, so that all my tests will pass after the proposed fix.

@nealmcb
Copy link

nealmcb commented May 21, 2021

As noted above, this is not an IETF standard, it is not even submitted in a way that puts it under consideration to be an IETF standard. At this time It is a work product of the IRTF (a research group) and marked as "Intended status: Informational". For more clarity hear it straight from the renowned Brian E Carpenter
Re: Escalation: time commitment to fix production security bugs for BLS RFC v4?

It typically takes years and years to get to IETF Standard status, and few RFCs achieve that. It takes years just to get to a Proposed or Draft Standard. Read about the process on the IETF site, or at Wikipedia: Internet Standard. And read another internet-draft on the topic:
Just because it's an ID doesn't mean anything... at all... draft-wkumari-not-a-draft-11.

The people who should take responsibility for quickly resolving this are the people who have decided to ask others to trust them, by deploying this draft research work. Not the people who have volunteered to do research on the subject.

@cryptosubtlety
Copy link
Author

cryptosubtlety commented May 21, 2021

I'm not sure why everyone is enthusiastic in correcting me, a reader among several readers of the draft. Why don’t you try to correct the authors themselves who created the confusion?

1/ https://github.com/cfrg/draft-irtf-cfrg-bls-signature says “BLS Standard Draft. The repo is maintained by a working group aiming to standardize BLS signature scheme.”
2/ One of the draft's authors used the phrase "the standard" 4 times, see https://www.algorand.com/resources/blog/first-release-bls-library, not informational as you said.
3/ https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-bls-signature-04#section-6 mentioned multiple names and implementations including Algorand, Chia, Dfinity, Ethereum 2.0 (I also don't understand the reasons). Two draft's authors worked for Algorand. So, it’s not like the group of people who do research are independent of groups of people who ask others to trust them.

@nealmcb
Copy link

nealmcb commented May 21, 2021

@cryptosubtlety That's a fair point. My initial goal was simply to link this conversation to some clarity on how the IETF itself fits in, since I learned about this on the IETF mailing list.

A top priority would seem to be for the authors of this site to update the language on this site (e.g. in the README, vs the language in the I-D which is much clearer) to clarify which goals are short-term vs long-term, which venues standardization might or might not take place in, caveats about possible flaws and bugs and expectations for responses, etc.

I appreciate your efforts to identify flaws, to encourage this discussion, and now for helping to identify those who have all of the above reasons (including deployment) to quickly resolve this. As others have noted, an honored way to continue contributing is to propose "working code" (a PR) for at least one possible approach, but you are free to leave it to those who have even better-aligned incentives.

@cryptosubtlety
Copy link
Author

I definitely agree on priority and short term vs long term fix. Note that the implementers already fixed the bugs for single signature and for FastAggregateVerify because the draft is correct. The only missing piece is to fix the same bug for AggregateVerify. It’s worth stressing that my proposal doesn't propose any new algorithm, it only makes all APIs consistent with each other. I think fixing this bug is the highest priority for now.

I’ve learned that people from different backgrounds and environments have different views on what “quick” means. One of the authors were contacted privately since Nov 2020, i.e., 6 months ago. For industrial security, 6 months is a very long time to fix a security bug.

@cryptosubtlety
Copy link
Author

The document has 5 authors who are security experts. It's hard to imagine that all 5 authors are too busy to review a security fix, not new algorithm. Therefore, waiting 6 months without even a concrete time commitment to fix is unacceptable to me. It's your document, you decide what to do with it. I give up.

asanso pushed a commit to asanso/draft-irtf-cfrg-bls-signature that referenced this issue Jun 3, 2021
@kwantam
Copy link
Collaborator

kwantam commented Jun 3, 2021

We plan to address this in the next update. Apologies that our schedules didn't match yours. As I've noted elsewhere, we're just volunteers. Apparently it is entirely possible for none of us to have time in a given 6-month period, since that just happened. I'm hopeful that my schedule will improve in the next few weeks.

@kwantam kwantam reopened this Jun 3, 2021
asanso pushed a commit to asanso/draft-irtf-cfrg-bls-signature that referenced this issue Jun 3, 2021
@zhenfeizhang
Copy link
Collaborator

Hey

Thanks @kwantam for all the heavy-lifting.

Just checking in here. I was not aware of this issue until just now, and that was on me. I am happy to contribute in any means towards a fix.

@asanso
Copy link

asanso commented Jun 8, 2021

@zhenfeizhang @kwantam opened a PR with a proposed fix in #40

@dot-asm
Copy link

dot-asm commented Jun 14, 2021

Let's get real for a moment. Do researchers actually assert that it's safe to use AggregateVerify with non-unique messages with PoP at hand? FastAggregateVerify rests on proof for same-message case, but it doesn't actually give you pass to use AggregateVerify with non-unique messages. Is there proof for the latter case? And if there is (where?), then which assurances does it actually provide? Or in other words, correct course of action is to first spell out the scheme's applicability boundaries, then discuss fixes.

@dot-asm
Copy link

dot-asm commented Jun 14, 2021

Besides, suggested mitigation works only when colluding parties choose a message different from all the others. But if they choose to share a message with one honest user and get it across, they can later claim they were sharing another with a different user. The only way I see to preclude the this kind of forgery is to discover colluding parties by checking all possible sums of involved public keys. Or require messages to be unique.

@cryptosubtlety
Copy link
Author

@dot-asm Your questions are great. Let's wait for the IRTF's authors to shed some lights on them.
Re checking all possible sums of public keys. My rough estimation (don't count on it) is that it's a hard problem because it's equivalent to finding a_i \in {0, 1} such that a_1 *PK_1 + a_2 * PK_2 + ... + a_n * PK_n = 0. This looks harder than https://en.wikipedia.org/wiki/Short_integer_solution_problem because SIS is at least linear while elliptic curve point addition is not linear.

@cryptosubtlety
Copy link
Author

From Dan Boneh and Victor Shoup's book https://toc.cryptobook.us/book.pdf, section "15.5.3.2 Method 2: proof of possession of the secret key" and "15.5.3.3 Proving security of both aggregation methods" allow arbitrary messages, i.e., non-distinct messages are allowed.

@dot-asm
Copy link

dot-asm commented Jun 20, 2021

Come on, and "which assurances does it actually provide?" Thing is that given the parameters of the corresponding attack game, the suggested fix is not needed. Because you can't introduce a message different from one signed by the victim, which would be paired with the victim's public key. In other words, the referred proof stands, but it doesn't apply to the described scenario.

Just in case for reference. I also view this the matter of discussion as problematic, but I'm not convinced that it requires more than mention and clarification/disambiguation.

@cryptosubtlety
Copy link
Author

From a security perspective, restricting AggregateVerify for PoP to distinct messages is the safest one. However, I think it will make using AggregateVerify practically useless, especially in the distributed environment like blockchain. From the verifier’s perspective, it's easy to check distinct messages, however making sure that distributed independent signers sign distinct messages is a challenging task.
P/S: For FastAggregateVerify, the use case is clear in the sense that it asks distributed signer to collectively certify a single goal/message, while AggregateVerify says that “hey signers, sign whatever you want, but make sure they’re distinct”.

@veorq
Copy link
Contributor

veorq commented Jun 21, 2021

Looks like the PoP-less rogue-key attack-safe version of aggregation doesn't have the property that @cryptosubtlety noted, or does it? (the BDN18 version, and section 2 in https://crypto.stanford.edu/~dabo/pubs/papers/BLSmultisig.html)

The IETF draft contains the comment "[BDN18] prove the security of another rogue key defense; this defense is not standardized in this document." I wonder if it has been considered and rejected (simplicity, performance?) or just not discussed yet.

@dot-asm
Copy link

dot-asm commented Jun 21, 2021

Looks like the PoP-less rogue-key attack-safe version of aggregation doesn't have the property that @cryptosubtlety noted, or does it?

Does it matter in the context of the discussion? I mean it's specifically PoP that is discussed here, so how would discussion about other schemes advance this one? Well, one can say that it would illuminate PoP limitations, but then it becomes rather about making suitable choices for specific applications. In other words, it sounds like the question is rather "what are these-n-those schemes out there, how do they compare?" As well as "why aren't they discussed in this draft?" These are good questions, but they would be a distraction here. Well, unless the suggestion is to scrap PoP altogether as non-viable for any application, or as there-always-is-better-alternative. Is it?

@veorq
Copy link
Contributor

veorq commented Jun 21, 2021

Didn't mean to derail the discussion, of course my implicit point was "why not killing two birds and switching to the improved construction", but I reckon it's too late to make such a change, isn't it?

@cryptosubtlety
Copy link
Author

Thanks @veorq for bringing https://crypto.stanford.edu/~dabo/pubs/papers/BLSmultisig.html up. We can look at it to see whether we can learn any ideas from it to improve PoP's security.
(t1, t2, t3) = H(PK1, PK2, PK3)
sig = t1 * sig1 + t2 * sig2 + t3 * sig3

From security's perspective, the attacker can't manipulate both public keys (PK1, PK2, PK3) and coefficients (t1, t2, t3) of aggregate signature at the same time, so it prevent all attacks that I'm aware of.
Another idea we can learn from BLSmultisig is that it "commits/binds" the public keys (PK1, PK2, PK3) to the signature sig. We can apply this idea to improve security at the application for proof-of-possession (PoP) by binding/committing/storing the Merkle hash of public keys together with aggregate signature. This doesn't prevent message binding attack in this thread, but it prevents key binding attack (in its simplest form if Verify(PK,m, sig) = true then FastAggregateVerify((PK1, -PK1, PK), m, sig) = true: afaik there is no effective way to fix key binding attack for PoP at cryptographic layer).

From applicability perspective, BLSmultisig.html has a drawback. The aggregator must know all the public keys (PK1, PK2, PK3) in advance to compute sig = t1 * sig1 + t2 * sig2 + t3 * sig3, i.e., it does not allow gradual aggregation. In PoP, sig = sig1 + sig2 + sig3 can be computed gradually over time by different entities: sig12 = sig1 + sig2 --> sig = sig12 + sig3.

@veorq
Copy link
Contributor

veorq commented Jun 23, 2021

Good points, thanks. Other problems are that it would be incompatible with the fast verification for same-message signing, and (most likely) with the optimization implemented by most wallets for verifying multiple aggregate sigs (in https://ethresear.ch/t/fast-verification-of-multiple-bls-signatures/5407).

zhenfeizhang added a commit that referenced this issue Jul 22, 2021
Strawman patch for issue #38 plus fixing serialization format link
@zhenfeizhang
Copy link
Collaborator

fixed with #40

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants