This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit ab24ffe

committed Nov 27, 2014
Copied all the grammar productions from reference.md to grammar.md
1 parent ffc5f1c commit ab24ffe

1 file changed: +18 -755 lines changed


‎src/doc/grammar.md

Lines changed: 18 additions & 755 deletions
Original file line number | Diff line number | Diff line change
@@ -683,254 +683,53 @@ return_expr : "return" expr ? ;
683683

684684
# Type system
685685

686-
## Types
687-
688-
Every slot, item and value in a Rust program has a type. The _type_ of a
689-
*value* defines the interpretation of the memory holding it.
686+
**FIXME:** is this entire chapter relevant here? Or should it all have been covered by some production already?
690687

691-
Built-in types and type-constructors are tightly integrated into the language,
692-
in nontrivial ways that are not possible to emulate in user-defined types.
693-
User-defined types have limited capabilities.
688+
## Types
694689

695690
### Primitive types
696691

697-
The primitive types are the following:
698-
699-
* The "unit" type `()`, having the single "unit" value `()` (occasionally called
700-
"nil"). [^unittype]
701-
* The boolean type `bool` with values `true` and `false`.
702-
* The machine types.
703-
* The machine-dependent integer and floating-point types.
704-
705-
[^unittype]: The "unit" value `()` is *not* a sentinel "null pointer" value for
706-
reference slots; the "unit" type is the implicit return type from functions
707-
otherwise lacking a return type, and can be used in other contexts (such as
708-
message-sending or type-parametric code) as a zero-size type.]
692+
**FIXME:** grammar?
709693

710694
#### Machine types
711695

712-
The machine types are the following:
713-
714-
* The unsigned word types `u8`, `u16`, `u32` and `u64`, with values drawn from
715-
the integer intervals [0, 2^8 - 1], [0, 2^16 - 1], [0, 2^32 - 1] and
716-
[0, 2^64 - 1] respectively.
717-
718-
* The signed two's complement word types `i8`, `i16`, `i32` and `i64`, with
719-
values drawn from the integer intervals [-(2^(7)), 2^7 - 1],
720-
[-(2^(15)), 2^15 - 1], [-(2^(31)), 2^31 - 1], [-(2^(63)), 2^63 - 1]
721-
respectively.
722-
723-
* The IEEE 754-2008 `binary32` and `binary64` floating-point types: `f32` and
724-
`f64`, respectively.
696+
**FIXME:** grammar?
725697

726698
#### Machine-dependent integer types
727699

728-
The `uint` type is an unsigned integer type with the same number of bits as the
729-
platform's pointer type. It can represent every memory address in the process.
730-
731-
The `int` type is a signed integer type with the same number of bits as the
732-
platform's pointer type. The theoretical upper bound on object and array size
733-
is the maximum `int` value. This ensures that `int` can be used to calculate
734-
differences between pointers into an object or array and can address every byte
735-
within an object along with one byte past the end.
700+
**FIXME:** grammar?
736701

737702
### Textual types
738703

739-
The types `char` and `str` hold textual data.
740-
741-
A value of type `char` is a [Unicode scalar value](
742-
http://www.unicode.org/glossary/#unicode_scalar_value) (ie. a code point that
743-
is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
744-
0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` array is effectively an UCS-4 /
745-
UTF-32 string.
746-
747-
A value of type `str` is a Unicode string, represented as an array of 8-bit
748-
unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of
749-
unknown size, it is not a _first class_ type, but can only be instantiated
750-
through a pointer type, such as `&str` or `String`.
704+
**FIXME:** grammar?
751705

752706
### Tuple types
753707

754-
A tuple *type* is a heterogeneous product of other types, called the *elements*
755-
of the tuple. It has no nominal name and is instead structurally typed.
756-
757-
Tuple types and values are denoted by listing the types or values of their
758-
elements, respectively, in a parenthesized, comma-separated list.
759-
760-
Because tuple elements don't have a name, they can only be accessed by
761-
pattern-matching.
762-
763-
The members of a tuple are laid out in memory contiguously, in order specified
764-
by the tuple type.
765-
766-
An example of a tuple type and its use:
767-
768-
```
769-
type Pair<'a> = (int, &'a str);
770-
let p: Pair<'static> = (10, "hello");
771-
let (a, b) = p;
772-
assert!(b != "world");
773-
```
708+
**FIXME:** grammar?
774709

775710
### Array, and Slice types
776711

777-
Rust has two different types for a list of items:
778-
779-
* `[T ..N]`, an 'array'
780-
* `&[T]`, a 'slice'.
781-
782-
An array has a fixed size, and can be allocated on either the stack or the
783-
heap.
784-
785-
A slice is a 'view' into an array. It doesn't own the data it points
786-
to, it borrows it.
787-
788-
An example of each kind:
789-
790-
```{rust}
791-
let vec: Vec<int> = vec![1, 2, 3];
792-
let arr: [int, ..3] = [1, 2, 3];
793-
let s: &[int] = vec.as_slice();
794-
```
795-
796-
As you can see, the `vec!` macro allows you to create a `Vec<T>` easily. The
797-
`vec!` macro is also part of the standard library, rather than the language.
798-
799-
All in-bounds elements of arrays, and slices are always initialized, and access
800-
to an array or slice is always bounds-checked.
712+
**FIXME:** grammar?
801713

802714
### Structure types
803715

804-
A `struct` *type* is a heterogeneous product of other types, called the
805-
*fields* of the type.[^structtype]
806-
807-
[^structtype]: `struct` types are analogous `struct` types in C,
808-
the *record* types of the ML family,
809-
or the *structure* types of the Lisp family.
810-
811-
New instances of a `struct` can be constructed with a [struct
812-
expression](#structure-expressions).
813-
814-
The memory layout of a `struct` is undefined by default to allow for compiler
815-
optimizations like field reordering, but it can be fixed with the
816-
`#[repr(...)]` attribute. In either case, fields may be given in any order in
817-
a corresponding struct *expression*; the resulting `struct` value will always
818-
have the same memory layout.
819-
820-
The fields of a `struct` may be qualified by [visibility
821-
modifiers](#re-exporting-and-visibility), to allow access to data in a
822-
structure outside a module.
823-
824-
A _tuple struct_ type is just like a structure type, except that the fields are
825-
anonymous.
826-
827-
A _unit-like struct_ type is like a structure type, except that it has no
828-
fields. The one value constructed by the associated [structure
829-
expression](#structure-expressions) is the only value that inhabits such a
830-
type.
716+
**FIXME:** grammar?
831717

832718
### Enumerated types
833719

834-
An *enumerated type* is a nominal, heterogeneous disjoint union type, denoted
835-
by the name of an [`enum` item](#enumerations). [^enumtype]
836-
837-
[^enumtype]: The `enum` type is analogous to a `data` constructor declaration in
838-
ML, or a *pick ADT* in Limbo.
839-
840-
An [`enum` item](#enumerations) declares both the type and a number of *variant
841-
constructors*, each of which is independently named and takes an optional tuple
842-
of arguments.
843-
844-
New instances of an `enum` can be constructed by calling one of the variant
845-
constructors, in a [call expression](#call-expressions).
846-
847-
Any `enum` value consumes as much memory as the largest variant constructor for
848-
its corresponding `enum` type.
849-
850-
Enum types cannot be denoted *structurally* as types, but must be denoted by
851-
named reference to an [`enum` item](#enumerations).
852-
853-
### Recursive types
854-
855-
Nominal types &mdash; [enumerations](#enumerated-types) and
856-
[structures](#structure-types) &mdash; may be recursive. That is, each `enum`
857-
constructor or `struct` field may refer, directly or indirectly, to the
858-
enclosing `enum` or `struct` type itself. Such recursion has restrictions:
859-
860-
* Recursive types must include a nominal type in the recursion
861-
(not mere [type definitions](#type-definitions),
862-
or other structural types such as [arrays](#array,-and-slice-types) or [tuples](#tuple-types)).
863-
* A recursive `enum` item must have at least one non-recursive constructor
864-
(in order to give the recursion a basis case).
865-
* The size of a recursive type must be finite;
866-
in other words the recursive fields of the type must be [pointer types](#pointer-types).
867-
* Recursive type definitions can cross module boundaries, but not module *visibility* boundaries,
868-
or crate boundaries (in order to simplify the module system and type checker).
869-
870-
An example of a *recursive* type and its use:
871-
872-
```
873-
enum List<T> {
874-
Nil,
875-
Cons(T, Box<List<T>>)
876-
}
877-
878-
let a: List<int> = List::Cons(7, box List::Cons(13, box List::Nil));
879-
```
720+
**FIXME:** grammar?
880721

881722
### Pointer types
882723

883-
All pointers in Rust are explicit first-class values. They can be copied,
884-
stored into data structures, and returned from functions. There are two
885-
varieties of pointer in Rust:
886-
887-
* References (`&`)
888-
: These point to memory _owned by some other value_.
889-
A reference type is written `&type` for some lifetime-variable `f`,
890-
or just `&'a type` when you need an explicit lifetime.
891-
Copying a reference is a "shallow" operation:
892-
it involves only copying the pointer itself.
893-
Releasing a reference typically has no effect on the value it points to,
894-
with the exception of temporary values, which are released when the last
895-
reference to them is released.
896-
897-
* Raw pointers (`*`)
898-
: Raw pointers are pointers without safety or liveness guarantees.
899-
Raw pointers are written as `*const T` or `*mut T`,
900-
for example `*const int` means a raw pointer to an integer.
901-
Copying or dropping a raw pointer has no effect on the lifecycle of any
902-
other value. Dereferencing a raw pointer or converting it to any other
903-
pointer type is an [`unsafe` operation](#unsafe-functions).
904-
Raw pointers are generally discouraged in Rust code;
905-
they exist to support interoperability with foreign code,
906-
and writing performance-critical or low-level functions.
907-
908-
The standard library contains additional 'smart pointer' types beyond references
909-
and raw pointers.
724+
**FIXME:** grammar?
910725

911726
### Function types
912727

913-
The function type constructor `fn` forms new function types. A function type
914-
consists of a possibly-empty set of function-type modifiers (such as `unsafe`
915-
or `extern`), a sequence of input types and an output type.
916-
917-
An example of a `fn` type:
918-
919-
```
920-
fn add(x: int, y: int) -> int {
921-
return x + y;
922-
}
923-
924-
let mut x = add(5,7);
925-
926-
type Binop<'a> = |int,int|: 'a -> int;
927-
let bo: Binop = add;
928-
x = bo(5,7);
929-
```
728+
**FIXME:** grammar?
930729

931730
### Closure types
932731

933-
```{.ebnf .notation}
732+
```antlr
934733
closure_type := [ 'unsafe' ] [ '<' lifetime-list '>' ] '|' arg-list '|'
935734
[ ':' bound-list ] [ '->' type ]
936735
procedure_type := 'proc' [ '<' lifetime-list '>' ] '(' arg-list ')'
@@ -941,574 +740,38 @@ bound-list := bound | bound '+' bound-list
941740
bound := path | lifetime
942741
```
943742

944-
The type of a closure mapping an input of type `A` to an output of type `B` is
945-
`|A| -> B`. A closure with no arguments or return values has type `||`.
946-
Similarly, a procedure mapping `A` to `B` is `proc(A) -> B` and a no-argument
947-
and no-return value closure has type `proc()`.
948-
949-
An example of creating and calling a closure:
950-
951-
```rust
952-
let captured_var = 10i;
953-
954-
let closure_no_args = || println!("captured_var={}", captured_var);
955-
956-
let closure_args = |arg: int| -> int {
957-
println!("captured_var={}, arg={}", captured_var, arg);
958-
arg // Note lack of semicolon after 'arg'
959-
};
960-
961-
fn call_closure(c1: ||, c2: |int| -> int) {
962-
c1();
963-
c2(2);
964-
}
965-
966-
call_closure(closure_no_args, closure_args);
967-
968-
```
969-
970-
Unlike closures, procedures may only be invoked once, but own their
971-
environment, and are allowed to move out of their environment. Procedures are
972-
allocated on the heap (unlike closures). An example of creating and calling a
973-
procedure:
974-
975-
```rust
976-
let string = "Hello".to_string();
977-
978-
// Creates a new procedure, passing it to the `spawn` function.
979-
spawn(proc() {
980-
println!("{} world!", string);
981-
});
982-
983-
// the variable `string` has been moved into the previous procedure, so it is
984-
// no longer usable.
985-
986-
987-
// Create an invoke a procedure. Note that the procedure is *moved* when
988-
// invoked, so it cannot be invoked again.
989-
let f = proc(n: int) { n + 22 };
990-
println!("answer: {}", f(20));
991-
992-
```
993-
994743
### Object types
995744

996-
Every trait item (see [traits](#traits)) defines a type with the same name as
997-
the trait. This type is called the _object type_ of the trait. Object types
998-
permit "late binding" of methods, dispatched using _virtual method tables_
999-
("vtables"). Whereas most calls to trait methods are "early bound" (statically
1000-
resolved) to specific implementations at compile time, a call to a method on an
1001-
object type is only resolved to a vtable entry at compile time. The actual
1002-
implementation for each vtable entry can vary on an object-by-object basis.
1003-
1004-
Given a pointer-typed expression `E` of type `&T` or `Box<T>`, where `T`
1005-
implements trait `R`, casting `E` to the corresponding pointer type `&R` or
1006-
`Box<R>` results in a value of the _object type_ `R`. This result is
1007-
represented as a pair of pointers: the vtable pointer for the `T`
1008-
implementation of `R`, and the pointer value of `E`.
1009-
1010-
An example of an object type:
1011-
1012-
```
1013-
trait Printable {
1014-
fn stringify(&self) -> String;
1015-
}
1016-
1017-
impl Printable for int {
1018-
fn stringify(&self) -> String { self.to_string() }
1019-
}
1020-
1021-
fn print(a: Box<Printable>) {
1022-
println!("{}", a.stringify());
1023-
}
1024-
1025-
fn main() {
1026-
print(box 10i as Box<Printable>);
1027-
}
1028-
```
1029-
1030-
In this example, the trait `Printable` occurs as an object type in both the
1031-
type signature of `print`, and the cast expression in `main`.
745+
**FIXME:** grammar?
1032746

1033747
### Type parameters
1034748

1035-
Within the body of an item that has type parameter declarations, the names of
1036-
its type parameters are types:
1037-
1038-
```ignore
1039-
fn map<A: Clone, B: Clone>(f: |A| -> B, xs: &[A]) -> Vec<B> {
1040-
if xs.len() == 0 {
1041-
return vec![];
1042-
}
1043-
let first: B = f(xs[0].clone());
1044-
let mut rest: Vec<B> = map(f, xs.slice(1, xs.len()));
1045-
rest.insert(0, first);
1046-
return rest;
1047-
}
1048-
```
1049-
1050-
Here, `first` has type `B`, referring to `map`'s `B` type parameter; and `rest`
1051-
has type `Vec<B>`, a vector type with element type `B`.
749+
**FIXME:** grammar?
1052750

1053751
### Self types
1054752

1055-
The special type `self` has a meaning within methods inside an impl item. It
1056-
refers to the type of the implicit `self` argument. For example, in:
1057-
1058-
```
1059-
trait Printable {
1060-
fn make_string(&self) -> String;
1061-
}
1062-
1063-
impl Printable for String {
1064-
fn make_string(&self) -> String {
1065-
(*self).clone()
1066-
}
1067-
}
1068-
```
1069-
1070-
`self` refers to the value of type `String` that is the receiver for a call to
1071-
the method `make_string`.
753+
**FIXME:** grammar?
1072754

1073755
## Type kinds
1074756

1075-
Types in Rust are categorized into kinds, based on various properties of the
1076-
components of the type. The kinds are:
1077-
1078-
* `Send`
1079-
: Types of this kind can be safely sent between tasks.
1080-
This kind includes scalars, boxes, procs, and
1081-
structural types containing only other owned types.
1082-
All `Send` types are `'static`.
1083-
* `Copy`
1084-
: Types of this kind consist of "Plain Old Data"
1085-
which can be copied by simply moving bits.
1086-
All values of this kind can be implicitly copied.
1087-
This kind includes scalars and immutable references,
1088-
as well as structural types containing other `Copy` types.
1089-
* `'static`
1090-
: Types of this kind do not contain any references (except for
1091-
references with the `static` lifetime, which are allowed).
1092-
This can be a useful guarantee for code
1093-
that breaks borrowing assumptions
1094-
using [`unsafe` operations](#unsafe-functions).
1095-
* `Drop`
1096-
: This is not strictly a kind,
1097-
but its presence interacts with kinds:
1098-
the `Drop` trait provides a single method `drop`
1099-
that takes no parameters,
1100-
and is run when values of the type are dropped.
1101-
Such a method is called a "destructor",
1102-
and are always executed in "top-down" order:
1103-
a value is completely destroyed
1104-
before any of the values it owns run their destructors.
1105-
Only `Send` types can implement `Drop`.
1106-
1107-
* _Default_
1108-
: Types with destructors, closure environments,
1109-
and various other _non-first-class_ types,
1110-
are not copyable at all.
1111-
Such types can usually only be accessed through pointers,
1112-
or in some cases, moved between mutable locations.
1113-
1114-
Kinds can be supplied as _bounds_ on type parameters, like traits, in which
1115-
case the parameter is constrained to types satisfying that kind.
1116-
1117-
By default, type parameters do not carry any assumed kind-bounds at all. When
1118-
instantiating a type parameter, the kind bounds on the parameter are checked to
1119-
be the same or narrower than the kind of the type that it is instantiated with.
1120-
1121-
Sending operations are not part of the Rust language, but are implemented in
1122-
the library. Generic functions that send values bound the kind of these values
1123-
to sendable.
757+
**FIXME:** this this probably not relevant to the grammar...
1124758

1125759
# Memory and concurrency models
1126760

1127-
Rust has a memory model centered around concurrently-executing _tasks_. Thus
1128-
its memory model and its concurrency model are best discussed simultaneously,
1129-
as parts of each only make sense when considered from the perspective of the
1130-
other.
1131-
1132-
When reading about the memory model, keep in mind that it is partitioned in
1133-
order to support tasks; and when reading about tasks, keep in mind that their
1134-
isolation and communication mechanisms are only possible due to the ownership
1135-
and lifetime semantics of the memory model.
761+
**FIXME:** is this entire chapter relevant here? Or should it all have been covered by some production already?
1136762

1137763
## Memory model
1138764

1139-
A Rust program's memory consists of a static set of *items*, a set of
1140-
[tasks](#tasks) each with its own *stack*, and a *heap*. Immutable portions of
1141-
the heap may be shared between tasks, mutable portions may not.
1142-
1143-
Allocations in the stack consist of *slots*, and allocations in the heap
1144-
consist of *boxes*.
1145-
1146765
### Memory allocation and lifetime
1147766

1148-
The _items_ of a program are those functions, modules and types that have their
1149-
value calculated at compile-time and stored uniquely in the memory image of the
1150-
rust process. Items are neither dynamically allocated nor freed.
1151-
1152-
A task's _stack_ consists of activation frames automatically allocated on entry
1153-
to each function as the task executes. A stack allocation is reclaimed when
1154-
control leaves the frame containing it.
1155-
1156-
The _heap_ is a general term that describes boxes. The lifetime of an
1157-
allocation in the heap depends on the lifetime of the box values pointing to
1158-
it. Since box values may themselves be passed in and out of frames, or stored
1159-
in the heap, heap allocations may outlive the frame they are allocated within.
1160-
1161767
### Memory ownership
1162768

1163-
A task owns all memory it can *safely* reach through local variables, as well
1164-
as boxes and references.
1165-
1166-
When a task sends a value that has the `Send` trait to another task, it loses
1167-
ownership of the value sent and can no longer refer to it. This is statically
1168-
guaranteed by the combined use of "move semantics", and the compiler-checked
1169-
_meaning_ of the `Send` trait: it is only instantiated for (transitively)
1170-
sendable kinds of data constructor and pointers, never including references.
1171-
1172-
When a stack frame is exited, its local allocations are all released, and its
1173-
references to boxes are dropped.
1174-
1175-
When a task finishes, its stack is necessarily empty and it therefore has no
1176-
references to any boxes; the remainder of its heap is immediately freed.
1177-
1178769
### Memory slots
1179770

1180-
A task's stack contains slots.
1181-
1182-
A _slot_ is a component of a stack frame, either a function parameter, a
1183-
[temporary](#lvalues,-rvalues-and-temporaries), or a local variable.
1184-
1185-
A _local variable_ (or *stack-local* allocation) holds a value directly,
1186-
allocated within the stack's memory. The value is a part of the stack frame.
1187-
1188-
Local variables are immutable unless declared otherwise like: `let mut x = ...`.
1189-
1190-
Function parameters are immutable unless declared with `mut`. The `mut` keyword
1191-
applies only to the following parameter (so `|mut x, y|` and `fn f(mut x:
1192-
Box<int>, y: Box<int>)` declare one mutable variable `x` and one immutable
1193-
variable `y`).
1194-
1195-
Methods that take either `self` or `Box<Self>` can optionally place them in a
1196-
mutable slot by prefixing them with `mut` (similar to regular arguments):
1197-
1198-
```
1199-
trait Changer {
1200-
fn change(mut self) -> Self;
1201-
fn modify(mut self: Box<Self>) -> Box<Self>;
1202-
}
1203-
```
1204-
1205-
Local variables are not initialized when allocated; the entire frame worth of
1206-
local variables are allocated at once, on frame-entry, in an uninitialized
1207-
state. Subsequent statements within a function may or may not initialize the
1208-
local variables. Local variables can be used only after they have been
1209-
initialized; this is enforced by the compiler.
1210-
1211771
### Boxes
1212772

1213-
A _box_ is a reference to a heap allocation holding another value, which is
1214-
constructed by the prefix operator `box`. When the standard library is in use,
1215-
the type of a box is `std::owned::Box<T>`.
1216-
1217-
An example of a box type and value:
1218-
1219-
```
1220-
let x: Box<int> = box 10;
1221-
```
1222-
1223-
Box values exist in 1:1 correspondence with their heap allocation, copying a
1224-
box value makes a shallow copy of the pointer. Rust will consider a shallow
1225-
copy of a box to move ownership of the value. After a value has been moved,
1226-
the source location cannot be used unless it is reinitialized.
1227-
1228-
```
1229-
let x: Box<int> = box 10;
1230-
let y = x;
1231-
// attempting to use `x` will result in an error here
1232-
```
1233-
1234773
## Tasks
1235774

1236-
An executing Rust program consists of a tree of tasks. A Rust _task_ consists
1237-
of an entry function, a stack, a set of outgoing communication channels and
1238-
incoming communication ports, and ownership of some portion of the heap of a
1239-
single operating-system process.
1240-
1241775
### Communication between tasks
1242776

1243-
Rust tasks are isolated and generally unable to interfere with one another's
1244-
memory directly, except through [`unsafe` code](#unsafe-functions). All
1245-
contact between tasks is mediated by safe forms of ownership transfer, and data
1246-
races on memory are prohibited by the type system.
1247-
1248-
When you wish to send data between tasks, the values are restricted to the
1249-
[`Send` type-kind](#type-kinds). Restricting communication interfaces to this
1250-
kind ensures that no references move between tasks. Thus access to an entire
1251-
data structure can be mediated through its owning "root" value; no further
1252-
locking or copying is required to avoid data races within the substructure of
1253-
such a value.
1254-
1255777
### Task lifecycle
1256-
1257-
The _lifecycle_ of a task consists of a finite set of states and events that
1258-
cause transitions between the states. The lifecycle states of a task are:
1259-
1260-
* running
1261-
* blocked
1262-
* panicked
1263-
* dead
1264-
1265-
A task begins its lifecycle &mdash; once it has been spawned &mdash; in the
1266-
*running* state. In this state it executes the statements of its entry
1267-
function, and any functions called by the entry function.
1268-
1269-
A task may transition from the *running* state to the *blocked* state any time
1270-
it makes a blocking communication call. When the call can be completed &mdash;
1271-
when a message arrives at a sender, or a buffer opens to receive a message
1272-
&mdash; then the blocked task will unblock and transition back to *running*.
1273-
1274-
A task may transition to the *panicked* state at any time, due being killed by
1275-
some external event or internally, from the evaluation of a `panic!()` macro.
1276-
Once *panicking*, a task unwinds its stack and transitions to the *dead* state.
1277-
Unwinding the stack of a task is done by the task itself, on its own control
1278-
stack. If a value with a destructor is freed during unwinding, the code for the
1279-
destructor is run, also on the task's control stack. Running the destructor
1280-
code causes a temporary transition to a *running* state, and allows the
1281-
destructor code to cause any subsequent state transitions. The original task
1282-
of unwinding and panicking thereby may suspend temporarily, and may involve
1283-
(recursive) unwinding of the stack of a failed destructor. Nonetheless, the
1284-
outermost unwinding activity will continue until the stack is unwound and the
1285-
task transitions to the *dead* state. There is no way to "recover" from task
1286-
panics. Once a task has temporarily suspended its unwinding in the *panicking*
1287-
state, a panic occurring from within this destructor results in *hard* panic.
1288-
A hard panic currently results in the process aborting.
1289-
1290-
A task in the *dead* state cannot transition to other states; it exists only to
1291-
have its termination status inspected by other tasks, and/or to await
1292-
reclamation when the last reference to it drops.
1293-
1294-
# Runtime services, linkage and debugging
1295-
1296-
The Rust _runtime_ is a relatively compact collection of Rust code that
1297-
provides fundamental services and datatypes to all Rust tasks at run-time. It
1298-
is smaller and simpler than many modern language runtimes. It is tightly
1299-
integrated into the language's execution model of memory, tasks, communication
1300-
and logging.
1301-
1302-
### Memory allocation
1303-
1304-
The runtime memory-management system is based on a _service-provider
1305-
interface_, through which the runtime requests blocks of memory from its
1306-
environment and releases them back to its environment when they are no longer
1307-
needed. The default implementation of the service-provider interface consists
1308-
of the C runtime functions `malloc` and `free`.
1309-
1310-
The runtime memory-management system, in turn, supplies Rust tasks with
1311-
facilities for allocating releasing stacks, as well as allocating and freeing
1312-
heap data.
1313-
1314-
### Built in types
1315-
1316-
The runtime provides C and Rust code to assist with various built-in types,
1317-
such as arrays, strings, and the low level communication system (ports,
1318-
channels, tasks).
1319-
1320-
Support for other built-in types such as simple types, tuples and enums is
1321-
open-coded by the Rust compiler.
1322-
1323-
### Task scheduling and communication
1324-
1325-
The runtime provides code to manage inter-task communication. This includes
1326-
the system of task-lifecycle state transitions depending on the contents of
1327-
queues, as well as code to copy values between queues and their recipients and
1328-
to serialize values for transmission over operating-system inter-process
1329-
communication facilities.
1330-
1331-
### Linkage
1332-
1333-
The Rust compiler supports various methods to link crates together both
1334-
statically and dynamically. This section will explore the various methods to
1335-
link Rust crates together, and more information about native libraries can be
1336-
found in the [ffi guide][ffi].
1337-
1338-
In one session of compilation, the compiler can generate multiple artifacts
1339-
through the usage of either command line flags or the `crate_type` attribute.
1340-
If one or more command line flag is specified, all `crate_type` attributes will
1341-
be ignored in favor of only building the artifacts specified by command line.
1342-
1343-
* `--crate-type=bin`, `#[crate_type = "bin"]` - A runnable executable will be
1344-
produced. This requires that there is a `main` function in the crate which
1345-
will be run when the program begins executing. This will link in all Rust and
1346-
native dependencies, producing a distributable binary.
1347-
1348-
* `--crate-type=lib`, `#[crate_type = "lib"]` - A Rust library will be produced.
1349-
This is an ambiguous concept as to what exactly is produced because a library
1350-
can manifest itself in several forms. The purpose of this generic `lib` option
1351-
is to generate the "compiler recommended" style of library. The output library
1352-
will always be usable by rustc, but the actual type of library may change from
1353-
time-to-time. The remaining output types are all different flavors of
1354-
libraries, and the `lib` type can be seen as an alias for one of them (but the
1355-
actual one is compiler-defined).
1356-
1357-
* `--crate-type=dylib`, `#[crate_type = "dylib"]` - A dynamic Rust library will
1358-
be produced. This is different from the `lib` output type in that this forces
1359-
dynamic library generation. The resulting dynamic library can be used as a
1360-
dependency for other libraries and/or executables. This output type will
1361-
create `*.so` files on linux, `*.dylib` files on osx, and `*.dll` files on
1362-
windows.
1363-
1364-
* `--crate-type=staticlib`, `#[crate_type = "staticlib"]` - A static system
1365-
library will be produced. This is different from other library outputs in that
1366-
the Rust compiler will never attempt to link to `staticlib` outputs. The
1367-
purpose of this output type is to create a static library containing all of
1368-
the local crate's code along with all upstream dependencies. The static
1369-
library is actually a `*.a` archive on linux and osx and a `*.lib` file on
1370-
windows. This format is recommended for use in situations such as linking
1371-
Rust code into an existing non-Rust application because it will not have
1372-
dynamic dependencies on other Rust code.
1373-
1374-
* `--crate-type=rlib`, `#[crate_type = "rlib"]` - A "Rust library" file will be
1375-
produced. This is used as an intermediate artifact and can be thought of as a
1376-
"static Rust library". These `rlib` files, unlike `staticlib` files, are
1377-
interpreted by the Rust compiler in future linkage. This essentially means
1378-
that `rustc` will look for metadata in `rlib` files like it looks for metadata
1379-
in dynamic libraries. This form of output is used to produce statically linked
1380-
executables as well as `staticlib` outputs.
1381-
Note that these outputs are stackable in the sense that if multiple are
specified, then the compiler will produce each form of output at once without
having to recompile. However, this only applies for outputs specified by the
same method. If only `crate_type` attributes are specified, then they will all
be built, but if one or more `--crate-type` command line flags are specified,
then only those outputs will be built.

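The precedence rule in the paragraph above can be sketched as a small function. This is only a model of the described behaviour, not compiler code: `CrateType` and `outputs_to_build` are hypothetical names, and the case where neither attributes nor flags are given is deliberately not modelled.

```rust
// Hypothetical model of the crate types that can be requested.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum CrateType {
    Bin,
    Lib,
    Dylib,
    StaticLib,
    Rlib,
}

// If any `--crate-type` flags are given, only those outputs are built;
// otherwise every output requested by a `crate_type` attribute is built.
fn outputs_to_build(attributes: &[CrateType], flags: &[CrateType]) -> Vec<CrateType> {
    if flags.is_empty() {
        attributes.to_vec()
    } else {
        flags.to_vec()
    }
}

fn main() {
    let attributes = [CrateType::Rlib, CrateType::Dylib];
    // No flags: both attribute-requested outputs are built in one compilation.
    println!("{:?}", outputs_to_build(&attributes, &[]));
    // One flag: the attributes are ignored and only the flagged output is built.
    println!("{:?}", outputs_to_build(&attributes, &[CrateType::StaticLib]));
}
```
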
With all these different kinds of outputs, if crate A depends on crate B, then
the compiler could find B in various different forms throughout the system. The
only forms looked for by the compiler, however, are the `rlib` format and the
dynamic library format. With these two options for a dependent library, the
compiler must at some point make a choice between these two formats. With this
in mind, the compiler follows these rules when determining what format of
dependencies will be used:

1. If a static library is being produced, all upstream dependencies are
   required to be available in `rlib` formats. This requirement stems from the
   fact that a dynamic library cannot be converted into a static format.

   Note that it is impossible to link in native dynamic dependencies to a static
   library, and in this case warnings will be printed about all unlinked native
   dynamic dependencies.

2. If an `rlib` file is being produced, then there are no restrictions on what
   format the upstream dependencies are available in. It is simply required that
   all upstream dependencies be available for reading metadata from.

   The reason for this is that `rlib` files do not contain any of their upstream
   dependencies. It wouldn't be very efficient for all `rlib` files to contain a
   copy of `libstd.rlib`!

3. If an executable is being produced and the `-C prefer-dynamic` flag is not
   specified, then the compiler first attempts to find dependencies in the
   `rlib` format. If some dependencies are not available in the `rlib` format,
   then dynamic linking is attempted (see below).

4. If a dynamic library or an executable that is being dynamically linked is
   being produced, then the compiler will attempt to reconcile the available
   dependencies in either the rlib or dylib format to create a final product.

   A major goal of the compiler is to ensure that a library never appears more
   than once in any artifact. For example, if dynamic libraries B and C were
   each statically linked to library A, then a crate could not link to B and C
   together because there would be two copies of A. The compiler allows mixing
   the rlib and dylib formats, but this restriction must be satisfied.

   The compiler currently implements no method of hinting what format a library
   should be linked with. When dynamically linking, the compiler will attempt to
   maximize dynamic dependencies while still allowing some dependencies to be
   linked in via an rlib.

   For most situations, having all libraries available as a dylib is recommended
   if dynamically linking. For other situations, the compiler will emit a
   warning if it is unable to determine which formats to link each library with.

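The four rules above can be approximated with a small decision function. This is a simplified sketch under stated assumptions, not the compiler's actual algorithm: all type and function names (`Output`, `Format`, `Available`, `Plan`, `plan`) are hypothetical, each dependency is described only by which formats exist for it, and the restriction that a library may appear only once in an artifact is not modelled.

```rust
// Hypothetical description of what is being produced.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Output {
    StaticLib,
    Rlib,
    Executable { prefer_dynamic: bool },
    Dylib,
}

// The two dependency formats the compiler looks for.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Format {
    Rlib,
    Dylib,
}

// Which formats exist on disk for one upstream dependency.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Available {
    rlib: bool,
    dylib: bool,
}

#[derive(Debug, PartialEq)]
enum Plan {
    // Rule 2: an rlib output only reads metadata from its dependencies.
    MetadataOnly,
    // One chosen format per dependency, in input order.
    Link(Vec<Format>),
}

// All-static linking is possible only if every dependency has an rlib.
fn all_static(deps: &[Available]) -> Option<Vec<Format>> {
    if deps.iter().all(|d| d.rlib) {
        Some(vec![Format::Rlib; deps.len()])
    } else {
        None
    }
}

// Rule 4 (simplified): prefer dylibs, fall back to an rlib where needed.
fn dynamic(deps: &[Available]) -> Result<Vec<Format>, String> {
    deps.iter()
        .enumerate()
        .map(|(i, d)| {
            if d.dylib {
                Ok(Format::Dylib)
            } else if d.rlib {
                Ok(Format::Rlib)
            } else {
                Err(format!("dependency {} is not available in any format", i))
            }
        })
        .collect()
}

fn plan(output: Output, deps: &[Available]) -> Result<Plan, String> {
    match output {
        // Rule 1: a static library needs every dependency as an rlib.
        Output::StaticLib => all_static(deps)
            .map(Plan::Link)
            .ok_or_else(|| "a static library requires all dependencies as rlibs".to_string()),
        // Rule 2: any format will do, as long as metadata can be read.
        Output::Rlib => {
            if deps.iter().all(|d| d.rlib || d.dylib) {
                Ok(Plan::MetadataOnly)
            } else {
                Err("some dependency is not available for reading metadata".to_string())
            }
        }
        // Rule 3: try rlibs first, then fall back to dynamic linking.
        Output::Executable { prefer_dynamic: false } => match all_static(deps) {
            Some(formats) => Ok(Plan::Link(formats)),
            None => dynamic(deps).map(Plan::Link),
        },
        // Rule 4: dynamic libraries and dynamically linked executables.
        Output::Executable { prefer_dynamic: true } | Output::Dylib => {
            dynamic(deps).map(Plan::Link)
        }
    }
}

fn main() {
    let both = Available { rlib: true, dylib: true };
    let dylib_only = Available { rlib: false, dylib: true };
    println!("{:?}", plan(Output::Executable { prefer_dynamic: false }, &[both, both]));
    println!("{:?}", plan(Output::Executable { prefer_dynamic: false }, &[both, dylib_only]));
    println!("{:?}", plan(Output::StaticLib, &[both, dylib_only]));
}
```

In this model, an executable built without `-C prefer-dynamic` links everything statically when it can, and otherwise switches to the dynamic strategy for all dependencies, which mirrors the "see below" fallback in rule 3.
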
In general, `--crate-type=bin` or `--crate-type=lib` should be sufficient for
all compilation needs, and the other options are just available if more
fine-grained control is desired over the output format of a Rust crate.

# Appendix: Rationales and design tradeoffs

*TODO*.

# Appendix: Influences and further references

## Influences

> The essential problem that must be solved in making a fault-tolerant
> software system is therefore that of fault-isolation. Different programmers
> will write different modules, some modules will be correct, others will have
> errors. We do not want the errors in one module to adversely affect the
> behaviour of a module which does not have any errors.
>
> &mdash; Joe Armstrong

> In our approach, all data is private to some process, and processes can
> only communicate through communications channels. *Security*, as used
> in this paper, is the property which guarantees that processes in a system
> cannot affect each other except by explicit communication.
>
> When security is absent, nothing which can be proven about a single module
> in isolation can be guaranteed to hold when that module is embedded in a
> system [...]
>
> &mdash; Robert Strom and Shaula Yemini

> Concurrent and applicative programming complement each other. The
> ability to send messages on channels provides I/O without side effects,
> while the avoidance of shared data helps keep concurrent processes from
> colliding.
>
> &mdash; Rob Pike

Rust is not a particularly original language. It may however appear unusual by
contemporary standards, as its design elements are drawn from a number of
"historical" languages that have, with a few exceptions, fallen out of favour.
Five prominent lineages contribute the most, though their influences have come
and gone during the course of Rust's development:

* The NIL (1981) and Hermes (1990) family. These languages were developed by
  Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM
  Watson Research Center (Yorktown Heights, NY, USA).

* The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes
  Wikstr&ouml;m, Mike Williams and others in their group at the Ericsson Computer
  Science Laboratory (&Auml;lvsj&ouml;, Stockholm, Sweden).

* The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim,
  Heinz Schmidt and others in their group at The International Computer
  Science Institute of the University of California, Berkeley (Berkeley, CA,
  USA).

* The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These
  languages were developed by Rob Pike, Phil Winterbottom, Sean Dorward and
  others in their group at Bell Labs Computing Sciences Research Center
  (Murray Hill, NJ, USA).

* The Napier (1985) and Napier88 (1988) family. These languages were
  developed by Malcolm Atkinson, Ron Morrison and others in their group at
  the University of St. Andrews (St. Andrews, Fife, UK).

Additional specific influences can be seen from the following languages:

* The structural algebraic types and compilation manager of SML.
* The attribute and assembly systems of C#.
* The references and deterministic destructor system of C++.
* The memory region systems of the ML Kit and Cyclone.
* The typeclass system of Haskell.
* The lexical identifier rule of Python.
* The block syntax of Ruby.

[ffi]: guide-ffi.html
[plugin]: guide-plugin.html
