Skip to content

Full specification of PRNGs / (un)aligned reads #655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dhardy opened this issue Nov 27, 2018 · 6 comments
Closed

Full specification of PRNGs / (un)aligned reads #655

dhardy opened this issue Nov 27, 2018 · 6 comments

Comments

@dhardy
Copy link
Member

dhardy commented Nov 27, 2018

@burdges (here):

We discussed this before, but even PRNGs should lend themselves to specification, meaning if you write typical code using rand then you can easily later write a specification, including any the corner cases of next_*, etc. It's okay that next_* pulls out aligned data. It's not really okay if fill_bytes provided by BlockRng throws away bytes, but individual PRNGs could specify that behavior.

There are two points here:

  1. All PRNGs should specify alignment behaviour. I'd have to check, but I believe we already make some effort here.
  2. Should fill_bytes drop extra bytes to align future reads?

I believe currently all our PRNGs align all reads to their word boundary (u32 or u64, or possibly mixed depending on the read).

The question is less about reproducibility (which can be achieved with any rigorous specification), but about use in ciphers where the text is split into arbitrary chunks.

Proposal: status quo + unaligned wrapper

We could keep the current code as-is, but add e.g. ChaChaUnaligned or ChaChaCipher.

Our current CSPRNGs are built using a BlockRng or BlockRng64 struct around an implementation of BlockRngCore. We could add a BlockRngUnaligned (or maybe BlockCipher) struct to use a different implementation of next_* and fill_bytes around the same core, but with different alignment behaviour (i.e. still implement RngCore but potentially not drop any unused bytes, forcing unaligned reads).

In theory this struct could also have an xor_bytes function (see here).

@burdges
Copy link
Contributor

burdges commented Nov 27, 2018

I believe currently all our PRNGs align all reads to their word boundary (u32 or u64, or possibly mixed depending on the read).

I think that sounds fine for say Isaac where output is specified in words, but I think ChaCha's output is specified as byte aligned and little endian, even though the internals are word aligned. A priori I'd think ChaChaRng::fill_bytes should really be ChaCha, but it sounds fine if ChaChaRng::next_* take the next word boundary.

I do however also see advantage in using some standard BlockRng machinery for ChaCha. Would a BlockRng8 make sense? Could we rename BlockRng to BlockRng32 with the plans to introduce a generic BlockRng type with specialization?

Are alignment issues actually impacting performance? If so, then maybe BlockRngCore could take Item as a type parameter instead of as an associated type, so the same code could support multiple alignments. And some macro could delegate RngCore to the preferred alignment?

@burdges
Copy link
Contributor

burdges commented Nov 27, 2018

It could even look vaguely like:

pub struct ChaChaBlocks { ... }
impl Blocks<u32> for ChaChaBlocks { ... }
impl Blocks<u8> for ChaChaBlocks { ... }
rand_delegation!(ChaChaRng, BlockRng<u8, ChaChaBlocks> );

Afaik, you need a macro instead of type ChaChaRng = BlockRng<u8, ChaChaBlocks>; so that you can write inherent methods.

I hope I'm not making any messes or causing too many distractions! :)

@dhardy
Copy link
Member Author

dhardy commented Nov 27, 2018

But our ChaCha is called ChaChaRng, emphasis on RNG. We could also have ChaChaStream or ChaChaCipher or something.

Are alignment issues actually impacting performance?

I haven't done proper testing, and I think it would be arch-specific. There is bound to be at least a little overhead in the case that a new block must be generated in the middle of a word read.

If so, then maybe BlockRngCore could take Item as a type parameter instead of as an associated type

But read alignment isn't a property of the core, it's a property of the wrapper. The core just generates a whole block at a time. In fact, we might be able to drop the Item type altogether and just transmute the pointer within BlockRngCore::generate — except that we probably do want to ensure the block is aligned. I.e. why should <ChaChaBlocks as Blocks<u32>>::generate be any different than <ChaChaBlocks as Blocks<u8>>::generate?

@burdges
Copy link
Contributor

burdges commented Nov 27, 2018

It appears ChaChaRng does not currently use BlockRng, right?

Another question: Is it actually hard to write a specification for code using BlockRng? It's maybe not too bad since you're only dealing with BlockRng for one word size, so you can say the reads are aligned to that word size, which sounds okay.

Also, if the T: BlockRngCode is exposed by the individual PRNG crates, then users can implement whatever transformations they like, which sounds optimal.

Also, there are PRNGs that allocate generates return buffer themselves, like ChaCha. Are there many that depend on an outside buffer? If not, then what about borrowing that buffer? We cannot do this today, but maybe eventually something like :

pub trait Blocks<Item> {
    type Results: Unsize<[Item]>;  // Unsize and CoerceUnsized are still unstable and give us no methods anyways,  but..
    fn generate(&mut self) -> &Results;
}

except this leaves Results with endianness not statically known, if say R: Blocks<u8>+Blocks<u32>, so maybe instead

pub trait Blocks<Item> {
    type Results: Unsize<[Item]>;
    fn generate<T, F: FnOnce(&Results) -> T>(&mut self, f: F) -> T;
}

@dhardy
Copy link
Member Author

dhardy commented Nov 28, 2018

No, actually it does:

pub struct ChaChaRng(BlockRng<ChaChaCore>);

We don't use a type-def because we want inherent methods and don't use a derive macro because at the time pitdicker wasn't a fan of macros (though I wrote one).

What's wrong with the current generate? A wrapper could omit the buffer altogether. It does have one restriction over the above though: the exact size of the buffer is defined. Cut from chacha.rs:

impl BlockRngCore for ChaChaCore {
    type Item = u32;
    type Results = [u32; STATE_WORDS];  // STATE_WORDS = 16

    fn generate(&mut self, results: &mut Self::Results) {

@dhardy
Copy link
Member Author

dhardy commented Apr 19, 2019

I don't have much interest in adding an unaligned version of block RNGs; we'd have significant complications of code in next_u32 and next_u64 and likely some perf. impact, and I doubt there'd be enough usage of the unaligned versions to justify both (it would need multiple versions of the block rng code). IMO Rand should mostly focus on random numbers, not on cipher streams.

With the current block RNG code:

  • BlockRng always aligns to 32-bit boundaries
  • BlockRng64 always aligns to 64-bit boundaries, excepting next_u32 which aligns to 32-bit boundaries

This is all documented, so I think we can just close this issue since the original point was mostly about specification.

@dhardy dhardy closed this as completed Apr 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants