Add doc pages for SIMD/small-numbers (#3936)
1 parent e8058d5 commit d95e9c9

2 files changed: +203 -0 lines changed

@@ -0,0 +1,116 @@
1+
---
2+
layout: documentation-page
3+
collectionName: Miscellaneous extensions
4+
title: SIMD
5+
---
6+
7+
# SIMD
8+
9+
The OxCaml compiler provides built-in 128-bit SIMD vector types, as well as
10+
intrinsics for amd64 SIMD instructions up to and including SSE4.2.
11+
12+
<!-- CR mslater: link to simd libraries -->
13+
To get started with SIMD, add the `ocaml_simd_sse` library to your dependencies.
14+
You may also want to use `ppx_simd`, which provides convenient syntax for
15+
defining constants like blend and shuffle masks.
16+
17+
## Types
18+
19+
When SIMD is enabled, the following 128-bit SIMD vector types are available:
20+
21+
```
int8x16
int8x16#
int16x8
int16x8#
int32x4
int32x4#
int64x2
int64x2#
float32x4
float32x4#
float64x2
float64x2#
```
35+
36+
The types ending with `#` are unboxed: they are passed between functions in XMM
37+
registers, stored in structures as flat data, and may be stored in flat arrays.
38+
The operations provided by `Ocaml_simd_sse` operate on unboxed vectors. For
39+
more detail on unboxed types, see the [unboxed types documentation](../unboxed-types/intro).
40+
41+
The types without `#` are boxed: when passed to a non-inlined function, they
42+
will be copied to a heap-allocated (abstract) block. Boxed vectors are not
43+
necessarily 16-byte aligned, so the compiler accesses them with unaligned load/store instructions.
44+
45+
Within a function, all SIMD vectors live in floating-point registers or 16-byte
46+
aligned stack slots.
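
To make the alignment point concrete, here is a minimal C sketch using the standard SSE2 intrinsics from `<emmintrin.h>`. It is not OxCaml code, and the function names are hypothetical; it only illustrates the difference between an unaligned load, which is what a boxed vector needs, and an aligned load, which is valid for 16-byte aligned data such as the stack slots described above.

```c
#include <emmintrin.h> /* SSE2 intrinsics, baseline on amd64 */
#include <stdint.h>

/* Sum the four 32-bit lanes at p. The unaligned load is valid for any
   address (typically compiled to movdqu). */
int32_t sum_lanes_unaligned(const void *p) {
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, v);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

/* Same sum, but p must be 16-byte aligned (typically compiled to movdqa).
   Passing a misaligned address is an error and usually faults. */
int32_t sum_lanes_aligned(const void *p) {
    __m128i v = _mm_load_si128((const __m128i *)p);
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, v);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Calling `sum_lanes_unaligned` on a buffer offset by one byte is fine; `sum_lanes_aligned` should only be given storage declared with `_Alignas(16)` or an equivalent guarantee.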
47+
48+
## Intrinsics
49+
50+
SIMD vectors are opaque: no operations on them are built into the
51+
language. Instead, the compiler translates certain "builtin" externals directly
52+
to SIMD instructions. Your code should use the `ocaml_simd_sse` library, which
53+
exposes an OxCaml API for these intrinsics.
54+
55+
```ocaml
module Float32x4 = Ocaml_simd_sse.Float32x4

let v = Float32x4.set 1.0 2.0 3.0 4.0
let v = Float32x4.sqrt v
let x, y, z, w = Float32x4.splat v
```
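
For readers more familiar with C intrinsics, the sketch below shows roughly what a set followed by a lane-wise square root looks like at the instruction level. It assumes, without confirmation from this page, that the OxCaml operations correspond to the SSE `sqrtps` family; the helper name `sqrt4` is hypothetical.

```c
#include <xmmintrin.h> /* SSE intrinsics */

/* Compute the square root of four floats at once and store the
   results in out[0..3], in the same order as the arguments. */
void sqrt4(float a, float b, float c, float d, float out[4]) {
    /* _mm_set_ps takes its arguments from the highest lane to the
       lowest, so the values are passed in reverse. */
    __m128 v = _mm_set_ps(d, c, b, a);
    v = _mm_sqrt_ps(v);    /* one sqrtps covers all four lanes */
    _mm_storeu_ps(out, v); /* unaligned store to the caller's array */
}
```

Note the reversed argument order of `_mm_set_ps`; whether `Float32x4.set` uses the same convention is not stated here and should be checked against the library.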
62+
63+
SIMD vectors may be loaded from / stored to strings, bytes, bigstrings, and
64+
arrays of the corresponding unboxed type. Load and store operations are also
65+
provided by `ocaml_simd_sse`, rather than Base or Core.
66+
67+
```ocaml
module Int8x16 = Ocaml_simd_sse.Int8x16
module Float64x2 = Ocaml_simd_sse.Float64x2
module Int64x2 = Ocaml_simd_sse.Int64x2

let text = "abcdefghijklmnopqrstuvwxyz"
let floats = [| 1.0; 2.0 |]
let ints = [| 1; 2 |]

let _ = Int8x16.String.get text ~byte:0
let _ = Float64x2.Float_array.get floats ~idx:0 (* Float array optimization required *)
let _ = Int64x2.Immediate_array.get_tagged ints ~idx:0
```
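
As an illustration of what loading a vector from string data enables, the following C sketch loads 16 bytes at a byte offset and searches them with a single comparison. It uses standard SSE2 intrinsics rather than the OxCaml API, and `find_in_window` is a hypothetical helper; bounds checking is the caller's responsibility here, which may differ from the behavior of `ocaml_simd_sse`.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>

/* Return the index (0..15) of the first occurrence of c in the 16 bytes
   starting at s[byte], or -1 if c does not occur in that window. */
int find_in_window(const char *s, size_t len, size_t byte, char c) {
    assert(byte + 16 <= len); /* the full 16-byte window must be in bounds */
    __m128i v = _mm_loadu_si128((const __m128i *)(s + byte));
    __m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8(c)); /* 0xFF where equal */
    int mask = _mm_movemask_epi8(eq); /* one bit per byte lane */
    for (int i = 0; i < 16; i++) {
        if (mask & (1 << i)) {
            return i;
        }
    }
    return -1;
}
```

With the 26-letter string from the example above, searching for `'c'` at offset 0 yields 2, while `'z'` lies outside the first 16-byte window and yields -1.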
78+
79+
Some operations require the user to choose a specific behavior at compile
80+
time. To do so, you must provide a compile-time constant generated by
81+
`ppx_simd`. Refer to the `ppx_simd` documentation for more details.
82+
83+
```ocaml
module Int32x4 = Ocaml_simd_sse.Int32x4

let x = Int32x4.set 0 2 4 6
let y = Int32x4.set 1 3 5 7
let z = Int32x4.blend [%blend 0, 1, 0, 1] x y
```
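
The mask must be a compile-time constant because SSE blend instructions encode it as an immediate operand. The scalar C sketch below models the lane selection such an instruction performs: bit `i` of the mask picks lane `i` from the second source, otherwise from the first. The function `blend4` is a hypothetical reference model, and the assumption that `[%blend 0, 1, 0, 1]` lists lanes from lowest to highest is not confirmed by this page.

```c
#include <stdint.h>

/* Scalar reference model of a four-lane blend: for each lane i, take
   y[i] if bit i of mask is set, otherwise take x[i]. */
void blend4(unsigned mask, const int32_t x[4], const int32_t y[4],
            int32_t out[4]) {
    for (int i = 0; i < 4; i++) {
        out[i] = ((mask >> i) & 1u) ? y[i] : x[i];
    }
}
```

Under that assumption, blending `{0, 2, 4, 6}` with `{1, 3, 5, 7}` using mask `0xA` (lanes 1 and 3 from the second vector) gives `{0, 3, 4, 7}`.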
90+
91+
## C ABI
92+
93+
As with floats, both boxed and unboxed SIMD vectors may be passed to C stubs. The
94+
OxCaml runtime provides several helper functions for working with SIMD vectors.
95+
96+
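```ocaml
(* The first name is the bytecode stub, which takes boxed values; the second
   is the native-code stub, which takes unboxed values. *)
external simd_stub : (int8x16[@unboxed]) -> (int8x16[@unboxed]) =
  "boxed_integer_simd_stub" "unboxed_integer_simd_stub"

(* ... *)
```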
102+
```c
#include <caml/simd.h>

/* Native-code entry point: the vector is passed and returned unboxed. */
__m128i unboxed_integer_simd_stub(__m128i v) {
    return v;
}

/* Bytecode entry point: unbox the argument, then box the result. */
value boxed_integer_simd_stub(value v) {
    return caml_copy_vec128i(unboxed_integer_simd_stub(Vec128_vali(v)));
}
```
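
The stub above only returns its argument. As a sketch of an unboxed stub body that does some work, the following C function adds 1 to every byte lane using a standard SSE2 intrinsic. The names `unboxed_add_one_stub` and `lane_of` are hypothetical, and the OxCaml runtime header is deliberately left out so the snippet compiles on its own; a real stub would pair this with a boxed wrapper like the one shown above.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical native-code stub body: add 1 to each of the 16 signed
   byte lanes, wrapping on overflow. The vector is passed and returned
   by value, as in the unboxed stub above. */
__m128i unboxed_add_one_stub(__m128i v) {
    return _mm_add_epi8(v, _mm_set1_epi8(1));
}

/* Helper for inspecting a single lane (0..15) of a byte vector. */
int8_t lane_of(__m128i v, int i) {
    int8_t lanes[16];
    _mm_storeu_si128((__m128i *)lanes, v);
    return lanes[i];
}
```

Because `_mm_add_epi8` wraps rather than saturates, a lane holding 127 becomes -128.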
113+
114+
## Future Work
115+
116+
Support for wider vectors and NEON/AVX2/AVX512 intrinsics is coming soon.
@@ -0,0 +1,87 @@
1+
---
2+
layout: documentation-page
3+
collectionName: Miscellaneous extensions
4+
title: Small Numbers
5+
---
6+
7+
# Small Numbers
8+
9+
The small numbers extension adds `float32`, `int16`, and `int8` types to OxCaml.
10+
Currently, only `float32` (single-precision IEEE float) is implemented.
11+
12+
## Float32
13+
14+
When small numbers are enabled, the following float32 types are available:
15+
16+
```
float32
float32#
float32 array
float32# array
```
22+
23+
Literals use the `s` suffix:
24+
25+
```
1.0s : float32
#1.0s : float32#
```
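
Single precision carries a 24-bit significand, so many values that are exact as a 64-bit float are rounded when written as a 32-bit float. The C sketch below illustrates this with C's own `float` type, which is also IEEE single precision on amd64; `fits_in_float32` is a hypothetical helper, not part of any OxCaml library.

```c
/* Return 1 if x is exactly representable in IEEE single precision,
   i.e. it survives a round trip through float unchanged. */
int fits_in_float32(double x) {
    volatile float f = (float)x; /* volatile forces the value to be rounded to float */
    return (double)f == x;
}
```

For example, `1.0` and `16777216.0` (2^24) fit exactly, while `16777217.0` and `0.1` do not.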
29+
30+
Pattern matching on `float32`s is not supported.
31+
32+
### Operations
33+
34+
Operations on 32-bit floats are available via the `Stdlib_stable.Float32` and
35+
`Stdlib_stable.Float32_u` modules, which provide `Base`-like APIs.
36+
37+
### Representation
38+
39+
The boxed `float32` type is encoded as a custom block with similar semantics to
40+
`int32`. Similarly, `float32 array` is a typical OxCaml array containing boxed
41+
elements.
42+
43+
The `float32#` type is unboxed:
44+
45+
- Function arguments and returns of type `float32#` are passed using
46+
floating-point registers.
47+
48+
- Record fields of type `float32#` are not boxed, but each takes up one word of
49+
space. Records with `float32#` fields require the mixed blocks extension, which is
50+
also enabled by default.
51+
52+
- Arrays of type `float32# array` contain tightly packed unboxed float32
53+
elements. The array itself is a custom block with similar semantics to
54+
`int32# array`.
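
The difference between a word-sized record field and a tightly packed array element can be checked with simple arithmetic. The C sketch below computes payload sizes only, ignoring block headers, and assumes a 64-bit target where one word is 8 bytes; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Payload bytes for n tightly packed 32-bit float elements. */
size_t packed_float32_bytes(size_t n) {
    return n * sizeof(float);
}

/* Payload bytes for n fields that each occupy one 64-bit word. */
size_t word_slot_bytes(size_t n) {
    return n * sizeof(uint64_t);
}
```

Four packed elements need 16 bytes of payload, while four word-sized fields need 32.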
55+
56+
As with floats, compiler optimizations allow boxed float32s to remain unboxed while
57+
being manipulated within the scope of a function.
58+
59+
### C ABI
60+
61+
Both boxed and unboxed float32s may be passed to C stubs. The OxCaml runtime
62+
provides helper functions for working with float32s.
63+
64+
```ocaml
external float32_stub : (float32[@unboxed]) -> (float32[@unboxed]) =
  "boxed_float32_stub" "unboxed_float32_stub"

external float32_hash_stub : float32# -> float32# =
  "boxed_float32_stub" "unboxed_float32_stub"

(* ... *)
```
73+
```c
#include <caml/float32.h>

/* Native-code entry point: the float32 is passed and returned unboxed. */
float unboxed_float32_stub(float v) {
    return v;
}

/* Bytecode entry point: unbox the argument, then box the result. */
value boxed_float32_stub(value v) {
    return caml_copy_float32(unboxed_float32_stub(Float32_val(v)));
}
```
84+
85+
## Int8 / Int16
86+
87+
Coming soon.
