Commit 79c193e

Rewrite PRNG module documentation
1 parent 2f2e429 commit 79c193e

3 files changed

+238
-76
lines changed

src/prng/isaac.rs

Lines changed: 1 addition & 1 deletion
@@ -163,7 +163,7 @@ impl BlockRngCore for IsaacCore {
163163
type Results = IsaacArray<Self::Item>;
164164

165165
/// Refills the output buffer, `results`. See also the pseudocode description
166-
/// of the algorithm in the [`Isaac64Rng`] documentation.
166+
/// of the algorithm in the [`IsaacRng`] documentation.
167167
///
168168
/// Optimisations used (similar to the reference implementation):
169169
///

src/prng/mod.rs

Lines changed: 232 additions & 75 deletions
@@ -8,81 +8,238 @@
88
// option. This file may not be copied, modified, or distributed
99
// except according to those terms.
1010

11-
//! Pseudo random number generators are algorithms to produce *apparently
12-
//! random* numbers deterministically, and usually fairly quickly.
13-
//!
14-
//! So long as the algorithm is computationally secure, is initialised with
15-
//! sufficient entropy (i.e. unknown by an attacker), and its internal state is
16-
//! also protected (unknown to an attacker), the output will also be
17-
//! *computationally secure*. Computationally Secure Pseudo Random Number
18-
//! Generators (CSPRNGs) are thus suitable sources of random numbers for
19-
//! cryptography. There are a couple of gotchas here, however. First, the seed
20-
//! used for initialisation must be unknown. Usually this should be provided by
21-
//! the operating system and should usually be secure, however this may not
22-
//! always be the case (especially soon after startup). Second, user-space
23-
//! memory may be vulnerable, for example when written to swap space, and after
24-
//! forking a child process should reinitialise any user-space PRNGs. For this
25-
//! reason it may be preferable to source random numbers directly from the OS
26-
//! for cryptographic applications.
27-
//!
28-
//! PRNGs are also widely used for non-cryptographic uses: randomised
29-
//! algorithms, simulations, games. In these applications it is usually not
30-
//! important for numbers to be cryptographically *unguessable*, but even
31-
//! distribution and independence from other samples (from the point of view
32-
//! of someone unaware of the algorithm used, at least) may still be important.
33-
//! Good PRNGs should satisfy these properties, but do not take them for
34-
//! granted; Wikipedia's article on
35-
//! [Pseudorandom number generators](https://en.wikipedia.org/wiki/Pseudorandom_number_generator)
36-
//! provides some background on this topic.
37-
//!
38-
//! Care should be taken when seeding (initialising) PRNGs. Some PRNGs have
39-
//! short periods for some seeds. If one PRNG is seeded from another using the
40-
//! same algorithm, it is possible that both will yield the same sequence of
41-
//! values (with some lag).
42-
//!
43-
//! ## Cryptographic security
44-
//!
45-
//! First, let's recap some terminology:
46-
//!
47-
//! - **PRNG:** *Pseudo-Random-Number-Generator* is another name for an
48-
//! *algorithmic generator*
49-
//! - **CSPRNG:** a *Cryptographically Secure* PRNG
50-
//!
51-
//! Security analysis requires a threat model and expert review; we can provide
52-
//! neither, but we can provide a few hints. We assume that the goal is to
53-
//! produce secret apparently-random data. Therefore, we need:
54-
//!
55-
//! - A good source of entropy. A known algorithm given known input data is
56-
//! trivial to predict, and likewise if there's a non-negligible chance that
57-
//! the input to a PRNG is guessable then there's a chance its output is too.
58-
//! We recommend seeding CSPRNGs with [`EntropyRng`] or [`OsRng`] which
59-
//! provide fresh "random" values from an external source.
60-
//! One can also seed from another CSPRNG, e.g. [`thread_rng`], which is faster,
61-
//! but adds another component which must be trusted.
62-
//! - A strong algorithmic generator. It is possible to use a good entropy
63-
//! source like [`OsRng`] directly, and in some cases this is the best option,
64-
//! but for better performance (or if requiring reproducible values generated
65-
//! from a fixed seed) it is common to use a local CSPRNG. The basic security
66-
//! that CSPRNGs must provide is making it infeasible to predict future output
67-
//! given a sample of past output. A further security that *some* CSPRNGs
68-
//! provide is *forward secrecy*; this ensures that in the event that the
69-
//! algorithm's state is revealed, it is infeasible to reconstruct past
70-
//! output. See the [`CryptoRng`] trait and notes on individual algorithms.
71-
//! - To be careful not to leak secrets like keys and CSPRNG's internal state
72-
//! and robust against "side channel attacks". This goes well beyond the scope
73-
//! of random number generation, but this crate takes some precautions:
74-
//! - to avoid printing CSPRNG state in log files, implementations have a
75-
//! custom `Debug` implementation which omits all internal state
76-
//! - [`thread_rng`] uses [`ReseedingRng`] to periodically refresh its state
77-
//! - in the future we plan to add some protection against fork attacks
78-
//! (where the process is forked and each clone generates the same "random"
79-
//! numbers); this is not yet implemented (see issues #314, #370)
80-
//!
81-
//! [`CryptoRng`]: ../trait.CryptoRng.html
82-
//! [`EntropyRng`]: ../rngs/struct.EntropyRng.html
83-
//! [`OsRng`]: ../rngs/struct.OsRng.html
84-
//! [`ReseedingRng`]: ../rngs/struct.ReseedingRng.html
85-
//! [`thread_rng`]: ../fn.thread_rng.html
11+
//! Pseudo-random number generators.
12+
//!
13+
//! Pseudo-random number generators are algorithms to produce apparently random
14+
//! numbers deterministically, and usually fairly quickly. See the documentation
15+
//! of the [`rngs` module] for some introduction to PRNGs.
16+
//!
17+
//! As mentioned there, PRNGs fall in two broad categories:
18+
//!
19+
//! - [normal PRNGs], primarily designed for simulations.
20+
//! - [CSPRNGs], primarily designed for cryptography.
21+
//!
22+
//! This module provides a few different PRNG implementations, and the
23+
//! documentation aims to provide enough background to make a reasonably
24+
//! informed choice.
25+
//!
26+
//!
27+
//! # Normal pseudo-random number generators (PRNGs)
28+
//!
29+
//! The goal of normal PRNGs is usually to find a good balance between
30+
//! simplicity, quality and performance. They are generally developed for doing
31+
//! simulations, but the good performance and simplicity make them suitable for
32+
//! many common programming problems.
33+
//!
34+
//! Mathematical theory is involved to ensure certain properties of the PRNG,
35+
//! most notably to prove that it doesn't cycle before generating all values in
36+
//! its period once.
37+
//!
38+
//! Usually there are three properties of a PRNG to consider:
39+
//!
40+
//! - performance
41+
//! - quality
42+
//! - extra features
43+
//!
44+
//! Currently Rand provides only one PRNG, and not a very good one at that:
45+
//! [`XorShiftRng`].
46+
//!
47+
//!
48+
//! ## Performance
49+
//!
50+
//! First it has to be said most PRNGs are really incredibly fast, and will
51+
//! rarely be a performance bottleneck. Performance of PRNGs is however a bit of
52+
//! a subtle thing. It depends much on the CPU architecture (32 vs. 64 bits),
53+
//! inlining, but also on the number of available registers, which makes
54+
//! performance dependent on the surrounding code.
55+
//!
56+
//! Some PRNGs are plain and simple faster than others, thanks to using cheaper
57+
//! instructions, shuffling around less state etc. But when absolute performance
58+
//! is a goal, benchmarking a few different PRNGs with your specific use case is
59+
//! always recommended.
60+
//!
61+
//!
62+
//! ## Quality
63+
//!
64+
//! Many PRNGs are not much more than a couple of bitwise and arithmetic
65+
//! operations. Their simplicity gives good performance, but also means there
66+
//! are small regularities hidden in the generated random number stream.
67+
//!
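To make the "couple of bitwise and arithmetic operations" concrete, here is a minimal sketch of a xorshift64 generator using Marsaglia's 13/7/17 shift triple. The type name `XorShift64` and the zero-seed fallback constant are assumptions made for this illustration; this is not claimed to be the algorithm or API of Rand's [`XorShiftRng`].

```rust
/// Minimal xorshift64 generator (illustrative sketch only).
/// The name `XorShift64` is hypothetical and not part of Rand.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// Creates a generator. An all-zero state would stay zero forever,
    /// so a zero seed is replaced with an arbitrary non-zero constant
    /// (an assumption of this sketch).
    pub fn new(seed: u64) -> Self {
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        XorShift64 { state }
    }

    /// Advances the state with three shift-and-xor steps and returns it.
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}
```

Calling `XorShift64::new(1).next_u64()` yields `1082269761`. The whole step function is three shifts and three xors, which is why such generators are fast and also why their output carries detectable linear structure.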
68+
//! How much do those hidden regularities matter? That is hard to say, and
69+
//! depends on how the RNG gets used. If there happen to be correlations between
70+
//! the random numbers and the algorithm they are used in, the results can be
71+
//! wrong or misleading.
72+
//!
73+
//! A random number generator can be considered good if it gives the correct
74+
//! results in as many applications as possible. There are test suites designed
75+
//! to test how well a PRNG performs on a wide range of possible uses, the
76+
//! latest and most complete of which are [TestU01] and [PractRand].
77+
//!
78+
//! It is easy to measure whether the PRNG is an actual performance bottleneck,
79+
//! but hard to measure that generated values are as random as you expect them
80+
//! to be. Then the safe choice is to use an RNG that performs well on the
81+
//! empirical RNG tests.
82+
//!
83+
//! In other situations it is clear that some small hidden regularities are of
84+
//! no concern. In that case, just choose a fast PRNG.
85+
//!
86+
//!
87+
//! ## Extra features
88+
//!
89+
//! Some PRNGs may provide extra features, which can be a reason to prefer that
90+
//! algorithm over others. Examples include:
91+
//!
92+
//! - Support for multiple streams, which helps with parallel tasks.
93+
//! - The ability to jump or seek around in the random number stream.
94+
//! - The algorithm uses some chaotic process, which will usually produce much
95+
//! more random-looking numbers, at the cost of losing the ability to reason
96+
//! about things like its period.
97+
//! - Given previous values, the next value may be predictable, but not
98+
//! trivially.
99+
//! - Having a huge period.
100+
//!
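As a rough illustration of "multiple streams", the sketch below uses a 64-bit linear congruential generator in which the additive increment selects the stream. The type `StreamLcg` is hypothetical; the multiplier is the well-known constant from Knuth's MMIX generator. With an odd increment the state sequence has the full period of 2<sup>64</sup>, but streams built this way are distinct sequences, not guaranteed to be statistically independent, and the raw state is returned without any output scrambling.

```rust
/// 64-bit LCG whose increment selects a stream (illustrative sketch only).
/// `StreamLcg` is a hypothetical name, not a type provided by Rand.
pub struct StreamLcg {
    state: u64,
    increment: u64,
}

impl StreamLcg {
    /// Multiplier from Knuth's MMIX linear congruential generator.
    const MULTIPLIER: u64 = 6364136223846793005;

    /// Creates a generator for the given seed and stream id.
    /// The increment must be odd for the full period of 2^64, so the
    /// stream id is shifted left and the lowest bit is set; the top bit
    /// of `stream` is therefore ignored.
    pub fn new(seed: u64, stream: u64) -> Self {
        StreamLcg {
            state: seed,
            increment: (stream << 1) | 1,
        }
    }

    /// Advances the state and returns it unmodified.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self
            .state
            .wrapping_mul(Self::MULTIPLIER)
            .wrapping_add(self.increment);
        self.state
    }
}
```

Two parallel tasks could each call `StreamLcg::new(seed, task_id)` with the same seed and different stream ids, which gives each task its own sequence without coordinating seeds.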
101+
//! ## Period
102+
//!
103+
//! The period of a PRNG is the number of values after which it starts repeating
104+
//! the same random number stream. While an important property, it will usually
105+
//! be of little concern, as the periods of modern PRNGs are long enough.
106+
//!
107+
//! On today's hardware, even a fast RNG with a period of only 2<sup>64</sup>
108+
//! can be used for centuries before wrapping around. Yet we recommend a period
109+
//! of 2<sup>128</sup> or more, which most modern PRNGs satisfy, or a shorter
110+
//! period combined with support for multiple streams. There are two reasons:
111+
//!
112+
//! If we see the entire period of an RNG as one long random number stream,
113+
//! every independently seeded RNG returns a slice of that stream. When multiple
114+
//! RNGs are seeded randomly, there is an increasingly large chance to end up
115+
//! with a partially overlapping slice of the stream.
116+
//!
117+
//! If the period of the RNG is 2<sup>128</sup>, and we take 2<sup>48</sup> as
118+
//! the number of values usually consumed, it then takes about 2<sup>32</sup>
119+
//! random initializations to have a chance of 1 in a million to repeat part of
120+
//! an already used stream. This seems good enough for common use. As
121+
//! an estimate, `chance ~= 1-e^((-initializations^2)/(2*period/used))`.
122+
//!
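The estimate above can be evaluated directly. The sketch below is a hypothetical helper (`overlap_chance` is not part of Rand) that takes all three quantities as base-2 logarithms, so that values like 2<sup>128</sup> do not overflow, and computes `1 - e^(-x)` with `exp_m1` to keep precision when `x` is tiny.

```rust
/// Approximate chance that randomly seeded generators end up with
/// overlapping slices of the same stream, following the estimate
/// `1 - e^(-initializations^2 / (2 * period / used))`.
///
/// All arguments are base-2 logarithms (e.g. 128.0 for a period of 2^128).
/// `overlap_chance` is a hypothetical helper written for this illustration.
pub fn overlap_chance(log2_period: f64, log2_used: f64, log2_initializations: f64) -> f64 {
    // log2 of x = initializations^2 / (2 * period / used)
    let log2_x = 2.0 * log2_initializations - (1.0 + log2_period - log2_used);
    let x = log2_x.exp2();
    // 1 - e^(-x), written as -(e^(-x) - 1) so that tiny x stays accurate
    -(-x).exp_m1()
}
```

For the numbers used in the text, `overlap_chance(128.0, 48.0, 32.0)` evaluates to roughly `7.6e-6` (the exponent is 2<sup>-17</sup>), which is somewhat higher than the "1 in a million" figure quoted, so both are best read as order-of-magnitude estimates.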
123+
//! Also, the entire period of an RNG should not be used. The RNG produces every
124+
//! possible value exactly the same number of times. This is not a property of
125+
//! true randomness; it is natural to expect some variation in how often values
126+
//! appear. This is known as the generalized birthday problem; see the
127+
//! [PCG paper] for a good explanation. It becomes noticeable after generating
128+
//! more values than the root of the period (after 2<sup>64</sup> values for a
129+
//! period of 2<sup>128</sup>).
130+
//!
131+
//!
132+
//! # Cryptographically secure pseudo-random number generators (CSPRNGs)
133+
//!
134+
//! CSPRNGs have much higher requirements than normal PRNGs. The primary
135+
//! consideration is security. Performance and simplicity are also important,
136+
//! but in general CSPRNGs are more complex and slower than normal PRNGs.
137+
//! Quality is no longer a concern, as it is a requirement for a
138+
//! CSPRNG that the output is basically indistinguishable from true randomness,
139+
//! otherwise information could be leaked.
140+
//!
141+
//! There is a relation between CSPRNGs and cryptographic ciphers.
142+
//! One way to create a CSPRNG is to take a block cipher and to run it in
143+
//! counter mode, i.e. to encrypt a simple counter. Another option is to take a
144+
//! stream cipher, but to just leave out the part where it is combined (usually
145+
//! XOR-ed) with the plaintext.
146+
//!
147+
//! Rand currently provides two CSPRNGs:
148+
//!
149+
//! - [`ChaChaRng`]. A reasonably fast RNG using little memory, which works
150+
//! by encrypting a counter. Based on the ChaCha20 stream cipher.
151+
//! - [`Hc128Rng`]. A very fast array-based RNG that requires a lot of memory,
152+
//! 4 KiB. Based on the HC-128 stream cipher.
153+
//!
154+
//! Since the beginning of randomness support in Rust an RNG based on the ISAAC
155+
//! algorithm ([`IsaacRng`]) has been available. This is an example of an algorithm
156+
//! advertised as secure (which it very well may be), but which since its design
157+
//! in 1996 never really attracted the attention of cryptography experts.
158+
//!
159+
//!
160+
//! ## Security
161+
//!
162+
//! The basic security that CSPRNGs must provide is making it infeasible to
163+
//! predict future output given a sample of past output. A further security that
164+
//! some CSPRNGs provide is *forward secrecy*; this ensures that in the event
165+
//! that the algorithm's state is revealed, it is infeasible to reconstruct past
166+
//! output.
167+
//!
168+
//! As an outsider it is hard to get a good idea about the security of an
169+
//! algorithm. People in the field of cryptography spend a lot of effort
170+
//! analyzing existing designs, and what was once considered good may now turn
171+
//! out to be weaker. Generally it is best to use algorithms well-analyzed by
172+
//! experts: not the very latest designs, and also not older algorithms that
173+
//! gained little attention. In practice it is best to just use algorithms
174+
//! recommended by reputable organizations, such as the ciphers selected by the
175+
//! eSTREAM contest, or some of those recommended by NIST.
176+
//!
177+
//! It is important to use a good source of entropy to get the seed for a
178+
//! CSPRNG. When a known algorithm is given known input data, it is trivial to
179+
//! predict. Likewise if there's a non-negligible chance that the input to a
180+
//! PRNG is guessable, then there's a chance its output is too. We recommend
181+
//! seeding CSPRNGs using the [`FromEntropy`] trait, which uses fresh random
182+
//! values from an external source, usually the OS. You can also seed from
183+
//! another CSPRNG, like [`ThreadRng`], which is faster, but adds another
184+
//! component which must be trusted.
185+
//!
186+
//! ## Not a crypto library
187+
//!
188+
//! When using a CSPRNG for cryptographic purposes, more is required than
189+
//! choosing a good algorithm.
190+
//!
191+
//! It is important not to leak secrets such as the seed or the RNG's internal
192+
//! state, and to prevent other kinds of "side channel attacks". This means
193+
//! among other things that memory needs to be zeroed on move, and should not
194+
//! be written to swap space on disk. Another problem is fork attacks, where
195+
//! the process is forked and each clone generates the same random numbers.
196+
//!
197+
//! This all goes beyond the scope of random number generation; use a good
198+
//! crypto library instead.
199+
//!
200+
//! Rand does take a few precautions however. To avoid printing a CSPRNG's state
201+
//! in log files, implementations have a custom `Debug` implementation which
202+
//! omits internal state. In the future we plan to add some protection against
203+
//! fork attacks.
204+
//!
205+
//! ## Performance
206+
//!
207+
//! Most algorithms don't generate values one at a time, as simple PRNGs do, but in
208+
//! blocks. There will be a longer pause when such a block needs to be
209+
//! generated, while a random number can be returned almost instantly when there
210+
//! are unused values in the buffer.
211+
//!
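The block-wise behaviour described above can be sketched as follows. The type `BlockRng`, the block length of 16, and the mixing function are all assumptions made for this illustration: the mixer is the SplitMix64 finalizer applied to a counter, which stands in for a real cipher and is NOT cryptographically secure. Only the buffering pattern is the point here.

```rust
/// Number of values produced per refill (arbitrary choice for this sketch).
const BLOCK_LEN: usize = 16;

/// Toy block-based generator (illustrative sketch only, not secure).
/// `BlockRng` is a hypothetical name, not a type provided by Rand.
pub struct BlockRng {
    counter: u64,
    buffer: [u64; BLOCK_LEN],
    index: usize,
}

impl BlockRng {
    /// Starts with an exhausted buffer so the first call triggers a refill.
    pub fn new(seed: u64) -> Self {
        BlockRng {
            counter: seed,
            buffer: [0; BLOCK_LEN],
            index: BLOCK_LEN,
        }
    }

    /// The slow path: generates a whole block of values at once.
    fn refill(&mut self) {
        for slot in self.buffer.iter_mut() {
            self.counter = self.counter.wrapping_add(1);
            *slot = mix(self.counter);
        }
        self.index = 0;
    }

    /// The fast path: returns a buffered value, refilling only when empty.
    pub fn next_u64(&mut self) -> u64 {
        if self.index == BLOCK_LEN {
            self.refill();
        }
        let value = self.buffer[self.index];
        self.index += 1;
        value
    }
}

/// SplitMix64 finalizer, used here as a stand-in for a cipher.
/// It scrambles the counter but offers no cryptographic security.
fn mix(mut z: u64) -> u64 {
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}
```

In this sketch every 16th call to `next_u64` pays for a full refill, while the other calls are a bounds check and an array read, which mirrors the "longer pause" versus "almost instantly" behaviour described above.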
212+
//! Performance may be different depending on the architecture; but in contrast
213+
//! to PRNGs that generate one value at a time, the performance of CSPRNGs
214+
//! rarely if ever depends on the surrounding code.
215+
//!
216+
//! Because generating blocks of values lends itself well to loop unrolling, the
217+
//! code size of CSPRNGs can be significant.
218+
//!
219+
//!
220+
//! # Further reading
221+
//!
222+
//! There is quite a lot that can be said about PRNGs. The [PCG paper] is a
223+
//! very approachable paper explaining more concepts.
224+
//!
225+
//! A good paper about RNG quality is
226+
//! ["Good random number generators are (not so) easy to find"](
227+
//! http://random.mat.sbg.ac.at/results/peter/A19final.pdf) by P. Hellekalek.
228+
//!
229+
//!
230+
//! [`rngs` module]: ../rngs/index.html
231+
//! [normal PRNGs]: #normal-pseudo-random-number-generators-prngs
232+
//! [CSPRNGs]: #cryptographically-secure-pseudo-random-number-generators-csprngs
233+
//! [`XorShiftRng`]: struct.XorShiftRng.html
234+
//! [`ChaChaRng`]: chacha/struct.ChaChaRng.html
235+
//! [`Hc128Rng`]: hc128/struct.Hc128Rng.html
236+
//! [`IsaacRng`]: isaac/struct.IsaacRng.html
237+
//! [`ThreadRng`]: ../rngs/struct.ThreadRng.html
238+
//! [`FromEntropy`]: ../trait.FromEntropy.html
239+
//! [TestU01]: http://simul.iro.umontreal.ca/testu01/tu01.html
240+
//! [PractRand]: http://pracrand.sourceforge.net/
241+
//! [PCG paper]: http://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf
242+
86243

87244
pub mod chacha;
88245
pub mod hc128;

src/rngs/mod.rs

Lines changed: 5 additions & 0 deletions
@@ -15,6 +15,11 @@
1515
//! [`SmallRng`]. Also there is [`ThreadRng`], an implementation of [`StdRng`]
1616
//! put in thread-local storage.
1717
//!
18+
//! When to prefer [`StdRng`] over [`SmallRng`]? Whenever there might be an
19+
//! adversary who can affect the working of an algorithm by tweaking input.
20+
//! Examples include picking random filenames, hashmap seeding, or QuickSort
21+
//! using Random Pivoting.
22+
//!
1823
//! To get a seed for those PRNGs at runtime, [`EntropyRng`], [`OsRng`] and
1924
//! [`JitterRng`] are sources of external randomness. There might be situations
2025
//! where you may want to use one of them directly, but generally they are used to
