
Commit f41f78b

Add Functional Programming - Stateful Types
1 parent fb57f21 commit f41f78b

2 files changed: +228 -0 lines changed

SUMMARY.md

+1
@@ -44,6 +44,7 @@
 
 - [Functional Programming](./functional/index.md)
 - [Programming paradigms](./functional/paradigms.md)
+- [Stateful Types: Generics as Type Classes](./functional/stateful-types.md)
 
 - [Additional Resources](./additional_resources/index.md)
 - [Design principles](./additional_resources/design-principles.md)

functional/stateful-types.md

+227
@@ -0,0 +1,227 @@
# Stateful Types: Generics as Type Classes

## Description

Rust's functional roots make its type system more expressive than that of many other languages, which lets it turn many kinds of programming problems into "static typing" problems. A key part of this idea is the way generic types work.

In C++ and Java, for example, generic types are a meta-programming construct for the compiler. A `vector<int>` and a `vector<char>` in C++ are just two different copies of the same boilerplate code for a `vector` type, with two different types filled in.

In Rust, the generic type parameter creates what is known as a "type class", and each parameter filled in by an end user *actually changes the type of each instantiation*. In other words, `Vec<usize>` and `Vec<char>` *are two different types*.

This is why `impl` blocks must specify generic parameters: different instantiations of a generic type can have different `impl` blocks.
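
As a minimal sketch of that last point, the snippet below gives two instantiations of one generic type different methods. `Wrapper`, `double`, and `shout` are hypothetical names chosen for illustration; they are not part of the interpreter example.

```rust
// Hypothetical generic type, used only for illustration.
struct Wrapper<T> {
    value: T,
}

// Methods in this block exist for every `Wrapper<T>`.
impl<T> Wrapper<T> {
    fn new(value: T) -> Self {
        Wrapper { value }
    }
}

// This method exists only on `Wrapper<u32>`.
impl Wrapper<u32> {
    fn double(&self) -> u32 {
        self.value * 2
    }
}

// This method exists only on `Wrapper<String>`.
impl Wrapper<String> {
    fn shout(&self) -> String {
        self.value.to_uppercase()
    }
}

fn usage() {
    let n = Wrapper::new(21u32);
    let s = Wrapper::new(String::from("hi"));
    println!("{} {}", n.double(), s.shout());
    // n.shout(); // does not compile: `Wrapper<u32>` has no method `shout`
}
```

Calling `usage()` from `main` exercises both methods. Uncommenting the last line is rejected by the compiler, because `shout` is defined only for `Wrapper<String>`.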

In Rust, generic types can be used to enforce invariants at compile time. A good example is a state machine.

## Example

Suppose you are designing a runtime for an interpreted language, which needs a JIT compiler to process scripts. Many Rust novices coming from other languages would implement it this way:

```rust
#[derive(Debug)]
pub enum CompileError {
    // error type, std::error::Error impl skipped for brevity
}

#[derive(Debug)]
pub enum ExecError {
    // error type, std::error::Error impl skipped for brevity
}

pub type ExecutionResult = Result<(), ExecError>;

pub type CompileResult = Result<(), CompileError>;

pub enum Param {
    GCSize(usize),
    MaxStack(usize),
    // more parameters...
}

#[derive(Default)]
pub struct Interpreter {
    // state and execution data
}

impl Interpreter {
    /// Set a system parameter.
    ///
    /// # Panics
    /// Will panic if called after a script is executed.
    pub fn set_param(&mut self, p: Param) {
        /* update the state based on the parameter */
    }

    /// Set up a new execution environment.
    ///
    /// # Panics
    /// Will panic if called more than once or after a script has been compiled.
    pub fn init(&mut self) {
        /* Create the heap, prepare for use... */
    }

    /// Load and compile a script. Returns any errors encountered.
    pub fn compile_script(&mut self, script: &str) -> CompileResult {
        /* actually do the compile... */
        Ok(())
    }

    /// Execute all scripts added. Returns any errors encountered.
    ///
    /// # Panics
    /// Will panic if zero scripts have been compiled.
    pub fn exec(&mut self) -> ExecutionResult {
        /* actually run the script... */
        Ok(())
    }
}

fn main() {
    let mut interp = Interpreter::default();
    interp.set_param(Param::GCSize(1024 * 1024 * 1024));
    interp.init();
    interp.compile_script("print('2 + 2')").unwrap();
    interp.exec().unwrap();
}
```

Why those chances to panic? Because there are *state invariants* here:

1. `set_param` can only be called in an "initial" state.
1. `init` must be called before any scripts are compiled.
1. `exec` can only be called in a "loaded" or "ready" state.
1. `init` must be called *exactly once*.

It would be possible to report these violations through the `Error` types instead of panicking. However, this solution is suboptimal. Not only would users who call the code correctly still have to `unwrap()` a result every time, but the requirement to call `init` exactly once would still be unenforceable across an entire program without additional state (like `called = true` in the struct).

What would really be neat is if misuse produced a compile-time error. After all, an invalid call order is already visible in the logic of the calling program itself.

In Rust, this is actually possible! The solution is to *change the type* in order to enforce the invariants. How? With a private generic parameter.

Here is what that looks like:

```rust,ignore
// A private module with crate-visible items, so that users outside this
// crate cannot implement `State` for their own types.
mod state_trait {
    // `Param` is defined above; `CompiledScript` is assumed to be defined
    // elsewhere in the crate.
    use super::{CompiledScript, Param};

    pub(crate) trait State {}

    pub(crate) struct Init {
        pub(crate) params: Vec<Param>,
    }
    impl State for Init {}

    pub(crate) struct Loaded {
        pub(crate) scripts: Vec<CompiledScript>,
    }
    impl State for Loaded {}

    pub(crate) struct Ready(pub(crate) Vec<CompiledScript>);
    impl State for Ready {}
}
use state_trait::*;

struct Interpreter<S: State> {
    // same fields for execution, but now add:
    state_data: S,
}

impl Default for Interpreter<Init> {
    /* impl does the same thing the old default one did */
}

impl Interpreter<Init> {
    /// Set a system parameter.
    pub fn set_param(&mut self, p: Param) {
        /* update the state based on the parameter */
    }

    /// Initialize this interpreter, disallowing more parameter sets.
    pub fn init(self) -> Interpreter<Loaded> {
        /* copy all of the parameters into self's members... */

        // return our new initialized interpreter
        Interpreter {
            state_data: Loaded { scripts: vec![] },
            // copy other fields...
        }
    }
}

impl Interpreter<Loaded> {
    /// Load and compile a script. Returns any errors encountered.
    pub fn compile_script(&mut self, script: &str) -> CompileResult {
        /* actually do the compile... */
        Ok(())
    }

    /// Indicates we are done compiling scripts.
    pub fn ready(self) -> Interpreter<Ready> {
        /* prepare to actually execute scripts... */

        Interpreter {
            // move the compiled scripts into the new state
            state_data: Ready(self.state_data.scripts),
            // copy other fields...
        }
    }
}

impl Interpreter<Ready> {
    /// Execute all scripts added. Returns any errors encountered.
    ///
    /// # Panics
    /// Will panic if no scripts have been compiled.
    pub fn exec(&mut self) -> ExecutionResult {
        /* actually run the script... */
        Ok(())
    }

    /// Returns to a state where more scripts can be loaded.
    pub fn reset(self) -> Interpreter<Loaded> {
        /* clean up any state as needed... */

        Interpreter {
            // hand the compiled scripts back to the `Loaded` state
            state_data: Loaded { scripts: self.state_data.0 },
            // copy other fields...
        }
    }
}
```

With this approach, if the user were to make a mistake and set a parameter after `init`:

```rust,ignore
fn main() {
    let mut interp = Interpreter::<Init>::default().init();
    interp.set_param(Param::GCSize(1024 * 1024 * 1024));
}
```

They would get a compile-time error rather than a runtime panic: the type `Interpreter<Loaded>` does not implement `set_param`; only the type `Interpreter<Init>` does.
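
For readers who want something that compiles on its own, here is a condensed sketch of the same idea. It is a simplification made for illustration, not the design above: the states are unit structs, the scripts are stored as plain `String`s, and `set_gc_size`, `script_count`, `gc_size`, and `usage` are hypothetical names.

```rust
// Sealed state trait: the module is private, so other crates cannot name
// `State` and therefore cannot add states of their own.
mod state {
    pub trait State {}
    pub struct Init;
    pub struct Loaded;
    impl State for Init {}
    impl State for Loaded {}
}
use state::{Init, Loaded, State};

pub struct Interpreter<S: State> {
    gc_size: usize,
    scripts: Vec<String>,
    _state: S,
}

impl Interpreter<Init> {
    pub fn new() -> Self {
        Interpreter { gc_size: 0, scripts: Vec::new(), _state: Init }
    }

    /// Only available before `init`.
    pub fn set_gc_size(&mut self, bytes: usize) {
        self.gc_size = bytes;
    }

    /// Consumes the `Init` interpreter, so `init` cannot be called twice.
    pub fn init(self) -> Interpreter<Loaded> {
        Interpreter { gc_size: self.gc_size, scripts: self.scripts, _state: Loaded }
    }
}

impl Interpreter<Loaded> {
    /// Only available after `init`. "Compiling" here just stores the source.
    pub fn compile_script(&mut self, script: &str) {
        self.scripts.push(script.to_owned());
    }

    pub fn script_count(&self) -> usize {
        self.scripts.len()
    }

    pub fn gc_size(&self) -> usize {
        self.gc_size
    }
}

fn usage() -> Interpreter<Loaded> {
    let mut interp = Interpreter::<Init>::new();
    interp.set_gc_size(1024);
    let mut interp = interp.init();
    interp.compile_script("print('2 + 2')");
    // interp.set_gc_size(2048); // rejected: no such method on `Interpreter<Loaded>`
    // interp.init();            // rejected: no such method on `Interpreter<Loaded>`
    interp
}
```

Because `init` takes `self` by value, the old `Interpreter<Init>` binding is moved and cannot be reused, which is what enforces "exactly once" without any runtime flag.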

## Disadvantages

This approach requires a lot of code. Depending on how much the state transitions actually change, an `InvalidState` variant in an error enum might be simpler.
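
For comparison, a runtime-checked version might look roughly like the sketch below. The names `InterpError`, `Phase`, and `set_gc_size` are assumptions made for illustration. The check happens on every call, and misuse only shows up when the program runs.

```rust
#[derive(Debug, PartialEq)]
pub enum InterpError {
    /// The method was called in a state that does not allow it.
    InvalidState,
}

// The extra runtime state needed to track where we are.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Phase {
    Init,
    Loaded,
}

pub struct Interpreter {
    phase: Phase,
    gc_size: usize,
}

impl Interpreter {
    pub fn new() -> Self {
        Interpreter { phase: Phase::Init, gc_size: 0 }
    }

    /// Fails with `InvalidState` once `init` has been called.
    pub fn set_gc_size(&mut self, bytes: usize) -> Result<(), InterpError> {
        if self.phase != Phase::Init {
            return Err(InterpError::InvalidState);
        }
        self.gc_size = bytes;
        Ok(())
    }

    /// Fails with `InvalidState` if called more than once.
    pub fn init(&mut self) -> Result<(), InterpError> {
        if self.phase != Phase::Init {
            return Err(InterpError::InvalidState);
        }
        self.phase = Phase::Loaded;
        Ok(())
    }
}
```

This is shorter than the generic version, but every caller has to handle a `Result` even when the call order is correct.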

## Alternatives

Some simpler kinds of state machines, however, have their own patterns:

1. If the state transition happens during construction or finalization of an object, see [Builder Pattern](../patterns/creational/builder.md).
1. If the state transitions don't change invariants much, see [Strategy Pattern](../patterns/behavioural/strategy.md).

## See also

FIXME: I looked through several papers on functional languages, but they are all too abstract and none of them describe this specifically. Any suggestions?

FIXME: I remember seeing a Rust talk which described this in more detail, but I can't find it again.
