-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verified Bedrock2 code for Number-Theoretic Transform #1997
base: master
Are you sure you want to change the base?
Conversation
In principle, I am interested in theory of polynomials and NTTs, as well as a Bedrock2 implementation. The main reason I have not undertaken it myself is simply the amount of work I expect it to take. Already polynomial CRT is impressive, let alone the rest of the PR. FYI, there is another exploration of NTT in Bedrock2 by @lukaszobernig at master...lukaszobernig:fiat-crypto:ntt#diff-2ba131d0bae59c3f1d7bf36175bb93917df54b968bb82545a3ad793cbecee00f As for what to merge, and in what shape, I think the main considerations are that (1) I want to only merge things I can see myself recommending, either for use or study and (2) it would be really cool to have something on NTTs, and fiat-crypto seems like a good place for this work to live. In that spirit, let me raise some high-level questions
Overall, this is an ambitious and impressive project. I would like to get something like this in, but it would require a fair bit of work and more tight collaboration between us. Please let me know what you think; we could also do a call if that sounds helpful to you. |
My math is quite rusty, so apologies if I'm missing something. If the aim is to prove that the NTT is a ring isomorphism using evaluation at roots of unity, then the proofs works because there is unicity of the interpolated polynomial given enough distinct values as far as I remember. With an "incomplete" NTT, there are not enough values for that. That was what I was trying to solve with the cyclotomic decomposition, how to give a type to something like (taken from MLKEM specification/FIPS203) Another small concern, Lukas assumes a primitive root of unity. So when instantiating the theorem, one will need to prove that the root is primitive, I don't know if that is a difficult to check? In my formalization, I only need to know that
I only formalized enough polynomial theory to prove the NTT, so there are a lot of things lacking. I did not even define polynomial evaluation for instance. I don't think I want to develop a full-fledged polynomial library, so I would only add things if specific needs arise. Regarding representation of polynomials, I also like how much of it is blind to the representation, though I ended up having to duplicate quite a bit of code to operate on list of coefficients rather than general polynomials, which is partly why the cyclotomic decomposition file is so big, and I wonder if I could have done better to avoid that and/or if representing directly polynomials as list of coefficients could have avoided that.
There is no problem with proving bedrock2 code directly, I just wanted to finally manage to have something working with Rupicola.
I really like the idea of automatically producing verified code from Gallina code, so I tried making it work. Though, from a usability point of view, I often couldn't make sense of why it wouldn't work, so ended up directly proving the bedrock2 code after a while. I think there is also a lack of lemmas useful for cryptography in the library, like multi-word arithmetic for instance. But that's a bit of chicken-and-egg issue, I could also have contributed those lemmas. Though if it's not going to be worked on, it's probably not worthwhile to keep using it imo. Happy to discuss how to proceed further with this. |
That's a choice I made since it was easier to state the spec in terms of lists, but the underlying NTT algorithm should be easily adapted to output a polynomial instead (and the spec will have to talk about coefficients instead of list indices).
From what I remember showing that |
@atrieu your explanation of cyclotomic decomposition is persuasive, and am coming around the the prespective that polynomial evaluation matching remainders of linear-polynomial divisors is a coincidence for the purposes of this work. So, yes, for the codomain of a partial NTT, I think it would indeed be a list of polynomials -- your cyclotomic decomposition. However, what I find appealing about @lukaszobernig's approach is that the NTT itself serves as the definition of the decomposition into remainders by (linear/cyclotomic) polynomials. What I am hoping for is a development structured similarly around the NTT recursion, but proven to materialize remainders with cyclotomics of degree 2^r for any r passed as an argument, rather than just r=0 where this matches polynomial evaluation. It's possible that I am naively optimistic here, but it seems worth thinking about whether the appropriate partial-NTT generalizations from @atrieu's work could be integrated into the recursion structure of @lukaszobernig's work. I imagine that would require finding all uses of relationships between polynomial remainder and polynomial evaluation, and replacing them with more general lemmas about remainders of the same polynomial with different divisors. Maybe that's what all the theory I haven't worked through in @atrieu's work is about, but it also seems plausible to me that using carefully-chosen recursion to avoid materializing intermediate layers just avoids a lot of incidental complexity? I'm curious about what you both think. |
Here is how I think the forward NTT could look like using cyclotomic decomposition and @lukaszobernig's recursion structure below. The Some remaining worries, I'm not sure how one would write the inverse NTT in the same style, and whether the isomorphism proof is going to be also easier. mathcomp_ntt.vFrom mathcomp Require Import all_ssreflect all_algebra.
Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.
Import GRing.Theory.
Local Open Scope ring_scope.
Section NTT.
Variable F : fieldType.
Variable w : F.
Definition modulus (k l : nat) :=
'X^(2 ^ k) - (w ^+ l)%:P.
Section WithMN.
(* Modulus is X^{2^n} + 1 = modulus n (2^m) *)
Context (m n : nat).
Context (w_neg1 : w ^+ (2 ^ m) = -1).
Section InductiveCase.
Context(char_F_neq2 : 2 != 0 :> F).
(* Recurse on r *)
Context
(rec_ntt : nat->nat->nat->{poly F}->seq {poly F})
(rec_moduli : nat->nat->seq nat)
(r' k l i : nat)
(r := r'.+1)
(p : {poly F}).
Let m1 := modulus (k - 1) (l%/2)%nat.
Let m2 := modulus (k - 1) ((2^m) + (l%/2))%nat.
Let p1 := p %% m1.
Let p2 := p %% m2.
Let ntt_lhs := rec_ntt r' (k - 1) (l%/2) p1.
Let ntt_rhs := rec_ntt r' (k - 1) ((2^m) + (l%/2)) p2.
Definition ntt_body' := ntt_lhs ++ ntt_rhs.
Let moduli_lhs := rec_moduli r' (l%/2).
Let moduli_rhs := rec_moduli r' ((2^m) + (l%/2)).
Definition moduli_body' := moduli_lhs ++ moduli_rhs.
Context (r_leq_k : (r <= k)%nat).
Context (r_leq_l : (l %% (2^r) = 0)%nat).
Context (p_ok : p %% (modulus k l) == p).
(* Maybe missing hypotheses but you get the idea *)
Context (h_rec_ntt :
forall (k l: nat) (p : {poly F}),
(r' <= k)%nat ->
(l %% (2^r') = 0)%nat ->
p %% (modulus k l) == p ->
all2 (fun q l => q == p %% (modulus (k - r') l)) (rec_ntt r' k l p) (rec_moduli r' l)).
Lemma ntt_body_spec':
all2 (fun q l => q == p %% (modulus (k - r) l)) ntt_body' moduli_body'.
Proof. Admitted.
(* Other lemmas that are probably needed about
map (fun l => (modulus (k - r) l)) moduli_body':
- The polynomials are pairwise coprime
- The product of the polynomials is equal to (modulus k l)
*)
End InductiveCase.
Definition ntt_body rec_ntt (r k l : nat) (p : {poly F}) : seq {poly F} :=
if r is r'.+1 then
ntt_body' rec_ntt r' k l p
else
[:: p].
Definition moduli_body rec_moduli (r l :nat) : seq nat :=
if r is r'.+1 then
moduli_body' rec_moduli r' l
else
[:: l].
Fixpoint ntt' r k l p := ntt_body ntt' r k l p.
Definition ntt p := ntt' (min m n) n (2^m) p.
Fixpoint moduli' r l := moduli_body moduli' r l.
Definition moduli := moduli' (min m n) (2^m).
Lemma ntt_spec p:
p == p %% ('X^(2 ^ n) + 1%:P) ->
all2 (fun q l => q == p %% (modulus (n - (min n m)) l)) (ntt p) moduli.
Proof. Admitted.
End WithMN.
Section SanityCheck.
(* The FIPS standards specify that the output ring of the ntt is something like
\Prod_{i = 0}^{2^{min(n,m)} - 1} Z_q[X]/(X^{n - min(n, m)} - w^{2bitrev_m(i) + 1})
We check below that we have the same order for the powers of w.
*)
Local Definition bitrev (n: nat) (i: nat): nat :=
let fix aux k := match k with
| O => if Nat.testbit i 0%nat then PeanoNat.Nat.setbit 0%nat (n - 1)%nat else 0%nat
| S k' => if Nat.testbit i k then PeanoNat.Nat.setbit (aux k') (n - 1 - k)%nat else aux k'
end in
aux (n - 1)%nat.
Local Notation bitrev8 := (bitrev 8%nat). (* Dilithium *)
Local Notation bitrev7 := (bitrev 7%nat). (* Kyber *)
(* Dilithium, X^{256} + 1 = X^{2^8} - w^{2^8} *)
Goal moduli 8 8 = List.map (fun k => (2 * (bitrev8 k) + 1)%nat) (List.seq 0 256%nat). reflexivity. Qed.
(* Kyber, X^{256} + 1 = X^{2^8} - w^{2^7} *)
Goal moduli 7 8 = List.map (fun k => (2 * (bitrev7 k) + 1)%nat) (List.seq 0 128%nat). reflexivity. Qed.
End SanityCheck.
End NTT. |
I like to think of NTT/inverse NTT as a constructive version of the Chinese remainder theorem for rings. Take a principal ideal domain This generalises to the situation
The "going up" version I expect to look very similar, a pair-wise Bezout-reconstruction that uses the computational CRT to go up one level each recursive step. I have proved a version of Bezout-reconstruction in |
I finished another version of the decomposition at https://github.com/atrieu/fiat-crypto/blob/ntt2/src/NTT/CyclotomicDecomposition2.v https://github.com/atrieu/fiat-crypto/blob/ntt2/src/NTT/CyclotomicDecomposition2.v#L441-L450 define the split/recombine with I then prove that they are ring homomorphisms: Lemma ntt_isomorphism2:
@Hierarchy.ring _ (EQ2 m1 m2) (ZERO2 m1 m2) (ONE2 m1 m2) (OPP2 m1 m2) (ADD2 m1 m2) (SUB2 m1 m2) (MUL2 m1 m2)
/\ @Ring.is_homomorphism _ eq1 one add mul _ (EQ2 m1 m2) (ONE2 m1 m2) (ADD2 m1 m2) (MUL2 m1 m2) ntt2
/\ @Ring.is_homomorphism _ (EQ2 m1 m2) (ONE2 m1 m2) (ADD2 m1 m2) (MUL2 m1 m2) _ eq1 one add mul intt2. The recursive calls are defined here as function compositions https://github.com/atrieu/fiat-crypto/blob/ntt2/src/NTT/CyclotomicDecomposition2.v#L477-L481 Definition ntt_body' (p: Pquot m0): Pquotl decomposition_body' :=
Pquotl_convert decomposition_body_spec' (Pquotl_app (Ring.apply_unop_pair (rec_ntt r' (k - 1) (Nat.div l 2) r_leq_k' r_leq_m' r_leq_l_lhs) (rec_ntt r' (k - 1) (Nat.pow 2 m + Nat.div l 2) r_leq_k' r_leq_m' r_leq_l_rhs) (ntt2 p))).
Definition intt_body' (pl: Pquotl decomposition_body'): Pquot m0 :=
intt2 (Ring.apply_unop_pair (rec_intt r' (k - 1) (Nat.div l 2) r_leq_k' r_leq_m' r_leq_l_lhs) (rec_intt r' (k - 1) (Nat.pow 2 m + Nat.div l 2) r_leq_k' r_leq_m' r_leq_l_rhs) (Pquotl_split (Pquotl_convert decomposition_body_spec'' pl))). These then form ring homomorphisms as compositions of homomorphisms. The final theorem is https://github.com/atrieu/fiat-crypto/blob/ntt2/src/NTT/CyclotomicDecomposition2.v#L607-L610 Lemma ntt_isomorphism:
forall n,
@Ring.is_homomorphism _ eq1 one add mul _ eql onel addl mull (ntt n)
/\ @Ring.is_homomorphism _ eql onel addl mull _ eq1 one add mul (intt n). Not sure if I like all the dependent types though. |
Yes, I like this! For the dependent types, do you need them just so you can state theorems using For the polynomial library, do you think there is an induction principle that would fit your use cases, maybe something along the lines of "degree induction" or maybe even of_list induction or induction on canonical minimal decompositions? I'm trying to figure out whether I like the abstracting over treatment of polynomial_ops. I am thinking of this as opposed to giving one implementation, proving the current assumed properties as lemmas, and treating it opaque thereafter -- there ought not to be two different notions of polynomials over the same coefficient ring anyway. Are posicyclic and negacyclic genuinelt different, or would it be a simplification to treat one as the other with negated input? |
It's only to use the definitions, which makes things quite short to write. My only grief is that we have definitions like
As far as I remember, I only had to use a "degree induction" to prove specifications of the euclidean division and such. The
I don't have a strong opinion on this. I can't see either having a strong advantage over the other, even an implementation of polynomials where
I haven't thought about it before, but I think you are right that it might simplify things to define one as the other with negated input. |
The PR is still in a bit of a rough state, but I'm opening it to see if there is interest in adding it.
The Number-Theoretic Transform is a technique to accelerate polynomial multiplications used in recent lattice-based cryptography for PQC.
This PR defines:
Polynomial.v
CyclotomicDecomposition.v
which defines an homomorphism from a typePquotl (cyclotomic_decomposition n 0)
toPquotl (cyclotomic_decomposition n k)
wherePquotl ql
is defined asPquotl (ql: list P): Type := { pl: list P | List.Forall2 (fun p q => Peq p (Pmod p q)) pl ql }
, andcyclotomic_decomposition n i
is the i-th layer decomposition ofX^n + 1
. It also defines various optimizations for the NTT.RupicolaNTT.v
BedrockNTT.v
, I initially tried to automatically synthesize the code using Rupicola, but ended up doing the proof manuallyRupicolaBarrettReduction.v
andRupicolaMontgomeryArithmetic.v
MLKEM.v
andMLDSA.v
.I believe the C code should look like what someone would write after reading the NIST standards with no other reference. In terms of performance, this is slower than the handwritten C reference implementations for Kyber/Dilithium which use a so-called centered signed representation for field elements, and delay reduction of the coefficients to the end of the NTT instead of systematically doing it at each step like the synthesised code.