Commit 24c602f

author
jared
committed
Added chapter on syntactic forms of LambdaBuffers files.
1 parent e3a40b6 commit 24c602f

3 files changed

+259
-2
lines changed

_typos.toml

Lines changed: 3 additions & 2 deletions
@@ -1,11 +1,12 @@
11
[default.extend-words]
22
substituters = "substituters"
3-
hask= "hask"
3+
hask = "hask"
4+
Nd = "Nd"
45

56
[type.pdf]
67
extend-glob = ["*.pdf"]
78
check-file = false
89

910
[type.png]
1011
extend-glob = ["*.png"]
11-
check-file = false
12+
check-file = false

docs/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@
66
- [LambdaBuffers to Purescript](purescript.md)
77
- [Design](design.md)
88
- [API](api.md)
9+
- [LambdaBuffers file](syntax.md)
910
- [Compiler](compiler.md)
1011
- [Codegen](codegen.md)
1112
- [Command line interface](command-line-interface.md)

docs/syntax.md

Lines changed: 255 additions & 0 deletions
@@ -0,0 +1,255 @@
1+
# LambdaBuffers file
2+
3+
The input to LambdaBuffers is a text file which contains a module that defines
4+
a specification of the types you want to generate.
5+
This section gives the exact syntax of a LambdaBuffers file and informally describes the meaning of its syntactic constructs.
6+
7+
The name of a LambdaBuffers file must end with `.lbf`.
8+
9+
## Notation
10+
In the following description of a LambdaBuffers file's syntax, we use
11+
a BNF syntax similar to that of [Section 10.1 of the Haskell Report](https://www.haskell.org/onlinereport/haskell2010/).
12+
So, the following notational conventions are used for presenting syntax.
13+
14+
| Syntax | Description |
15+
| ------------- | --------------------------------------------------------------------------- |
16+
| `[pattern]` | optional |
17+
| `{pattern}` | zero or more repetitions |
18+
| `(pattern)` | grouping |
19+
| `pat1⎮pat2` | choice |
20+
| `pat1\pat2` | difference -- elements generated by `pat1` except those generated by `pat2` |
21+
| `'terminal'` | terminal syntax surrounded by single quotes |
22+
23+
<!-- Apparently, `mdbook`'s markdown can't escape the vertical bar in codeblocks in a table....
24+
So, we're using code point U+23AE to look like a vertical bar when it really isn't...
25+
26+
| `pat1|pat2` | choice |
27+
-->
28+
29+
Note that the terminal syntax permits C-style escape sequences, e.g.,
30+
`'\n'` denotes line feed (newline), and `'\r'` denotes carriage return.
31+
32+
Productions will be of the form
33+
34+
```text
35+
nonterm -> alt1 | ... | altn
36+
```
37+
38+
## Input file representation
39+
The input file is Unicode text whose encoding is determined by the system locale.
40+
We will often use the unqualified term *character* to refer to a Unicode code point in the input file.
41+
42+
## Characters
43+
The following terms are used to denote specific Unicode character categories:
44+
45+
- `upper` denotes a Unicode code point categorized as an uppercase letter or titlecase letter (i.e., with General Category value Lt or Lu).
46+
47+
- `lower` denotes a Unicode code point categorized as a lower-case letter (i.e., with General Category value Ll).
48+
49+
- `alphanum` denotes either `upper` or `lower`; or a Unicode code point categorized as a modifier letter, other letter, decimal digit number, letter number, or other number (i.e., with General Category value Lt, Lu, Ll, Lm, Lo, Nd, Nl or No).
50+
51+
- `space` denotes a Unicode code point categorized as a separator space (i.e., with General Category value Zs), or any of the control characters `'\t'`, `'\n'`, `'\r'`, `'\f'`, or `'\v'`.
52+
53+
Interested readers may find details of Unicode character categories in [Section 4.5 of The Unicode Standard 15.1.0](https://www.unicode.org/versions/Unicode15.1.0/), and the [Unicode Character Database](https://unicode.org/ucd/).
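
As an illustration only, the four character classes above can be sketched in Python with the standard `unicodedata` module. The function names are hypothetical and are not part of LambdaBuffers. The Unicode version bundled with a given Python release may also differ from 15.1.0, so edge cases may be classified differently.

```python
import unicodedata

# Control characters that count as `space` in addition to category Zs.
CONTROL_SPACES = {"\t", "\n", "\r", "\f", "\v"}

# General Category values accepted by `alphanum`.
ALPHANUM_CATEGORIES = {"Lt", "Lu", "Ll", "Lm", "Lo", "Nd", "Nl", "No"}


def is_upper(ch: str) -> bool:
    """`upper`: uppercase letter (Lu) or titlecase letter (Lt)."""
    return unicodedata.category(ch) in {"Lu", "Lt"}


def is_lower(ch: str) -> bool:
    """`lower`: lowercase letter (Ll)."""
    return unicodedata.category(ch) == "Ll"


def is_alphanum(ch: str) -> bool:
    """`alphanum`: `upper`, `lower`, or another letter or number category."""
    return unicodedata.category(ch) in ALPHANUM_CATEGORIES


def is_space(ch: str) -> bool:
    """`space`: separator space (Zs) or one of the listed control characters."""
    return unicodedata.category(ch) == "Zs" or ch in CONTROL_SPACES
```

For example, `is_upper("A")` and `is_space("\t")` hold, while `is_alphanum("_")` does not, because the underscore has General Category Pc.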
54+
55+
## Lexical syntax
56+
57+
Tokens form the vocabulary of LambdaBuffers files.
58+
The classes of tokens are defined as follows.
59+
60+
```text
61+
keyword -> 'module' | 'sum' | 'prod' | 'record'
62+
| 'opaque' | 'class' | 'instance' | 'import'
63+
| 'qualified' | 'as'
64+
modulename -> uppercamelcase
65+
longmodulename -> long modulename
66+
tyname -> uppercamelcase
67+
fieldname -> lowercamelcase\keyword
68+
longtyname -> long tyname
69+
varname -> lowers\keyword
70+
punctuation -> '<=' | ',' | '(' | ')' | '{' | '}'
71+
| ':' | ':-' | '=' | '|'
72+
classname -> uppercamelcase
73+
longclassname -> long uppercamelcase
74+
```
75+
76+
where
77+
78+
```text
79+
uppercamelcase -> upper { alphanum }
80+
lowercamelcase -> lower { alphanum }
81+
long -> { uppercamelcase '.' }
82+
lowers -> lower { lower }
83+
```
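
To make the difference notation (`pat1\pat2`) concrete, the following Python sketch checks a few of the name classes above. It is a reading of the productions and not the compiler's implementation: the function names are hypothetical, and the keyword set is copied from the `keyword` production as written.

```python
import unicodedata

# Copied from the `keyword` production above.
KEYWORDS = {
    "module", "sum", "prod", "record", "opaque",
    "class", "instance", "import", "qualified", "as",
}
ALPHANUM = {"Lt", "Lu", "Ll", "Lm", "Lo", "Nd", "Nl", "No"}


def _cat(ch: str) -> str:
    return unicodedata.category(ch)


def is_uppercamelcase(s: str) -> bool:
    """uppercamelcase -> upper { alphanum }"""
    return bool(s) and _cat(s[0]) in {"Lu", "Lt"} and all(_cat(c) in ALPHANUM for c in s[1:])


def is_lowercamelcase(s: str) -> bool:
    """lowercamelcase -> lower { alphanum }"""
    return bool(s) and _cat(s[0]) == "Ll" and all(_cat(c) in ALPHANUM for c in s[1:])


def is_lowers(s: str) -> bool:
    """lowers -> lower { lower }"""
    return bool(s) and all(_cat(c) == "Ll" for c in s)


def is_fieldname(s: str) -> bool:
    """fieldname -> lowercamelcase\\keyword (difference: anything but a keyword)."""
    return is_lowercamelcase(s) and s not in KEYWORDS


def is_varname(s: str) -> bool:
    """varname -> lowers\\keyword"""
    return is_lowers(s) and s not in KEYWORDS


def is_longtyname(s: str) -> bool:
    """longtyname -> { uppercamelcase '.' } uppercamelcase"""
    return all(is_uppercamelcase(part) for part in s.split("."))
```

Under this reading, `a` is a `varname` but `as` is not, `fooBar` is a `fieldname` but not a `varname`, and both `Maybe` and `Prelude.Maybe` are `longtyname`s.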
84+
85+
Input files are broken into *tokens* using the *maximal munch* rule, i.e.,
86+
at each point, the next token is the longest sequence of characters that
87+
form a valid token.
88+
`space`s and line comments are ignored except where they separate tokens that
89+
would otherwise combine into a single token.
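
The maximal munch rule can be sketched with a small Python tokenizer. This is a simplified illustration under stated assumptions: names are restricted to ASCII, `str.isspace` stands in for the `space` class, line comments are not handled, and `tokenize` is a hypothetical helper, not a LambdaBuffers API.

```python
import re

# Copied from the `punctuation` production.
PUNCTUATION = ["<=", ",", "(", ")", "{", "}", ":", ":-", "=", "|"]

# ASCII-only simplification: an optionally qualified upper-case name,
# or a name starting with a lower-case letter.
WORD = re.compile(r"(?:[A-Z][A-Za-z0-9]*\.)*[A-Z][A-Za-z0-9]*|[a-z][A-Za-z0-9]*")


def tokenize(src: str) -> list:
    """Split `src` into tokens, always taking the longest candidate."""
    tokens = []
    i = 0
    while i < len(src):
        if src[i].isspace():
            i += 1  # whitespace only separates tokens
            continue
        # Collect every token that could start at position i ...
        candidates = [p for p in PUNCTUATION if src.startswith(p, i)]
        match = WORD.match(src, i)
        if match:
            candidates.append(match.group())
        if not candidates:
            raise ValueError(f"unexpected character {src[i]!r} at offset {i}")
        # ... and keep the longest one (maximal munch).
        token = max(candidates, key=len)
        tokens.append(token)
        i += len(token)
    return tokens
```

With this sketch, `:-` is read as one token and not as `:` followed by a stray `-`, and `assoc` is read as a single name and not as the keyword `as` followed by `soc`.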
90+
91+
### Line comments
92+
A *line comment* starts with the terminal `'--'`, followed by zero or more printable Unicode characters, and stops at the first end of line (`'\n'` or `'\r\n'`).
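
A minimal Python sketch of removing line comments before tokenizing, assuming a regular expression is acceptable for the job. The helper name `strip_line_comments` is hypothetical, and the pattern is slightly more permissive than the definition above because it accepts any character other than `'\r'` and `'\n'` inside a comment, printable or not.

```python
import re

# '--' followed by everything up to, but not including, the end of line.
LINE_COMMENT = re.compile(r"--[^\r\n]*")


def strip_line_comments(src: str) -> str:
    """Remove every line comment, leaving the line terminators in place."""
    return LINE_COMMENT.sub("", src)
```

Line terminators are kept so that line numbers reported later still match the original file.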
93+
94+
## Syntax of LambdaBuffers files
95+
A LambdaBuffers file defines a module that is a collection of data types, classes, instance clauses, and derive clauses.
96+
97+
The overall layout of a LambdaBuffers file is:
98+
99+
```text
100+
module -> 'module' longmodulename { import } { statement }
101+
```
102+
103+
The file must specify the module's `longmodulename`, whose `modulename` must match the file's name without the `.lbf` extension.
104+
After that, the file may contain a sequence of `import`s followed by a sequence of `statement`s.
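
The file name rule can be sketched in Python as follows. The helper names are hypothetical. The sketch only compares the final `modulename` component with the file's base name, and it does not check whether the directory layout mirrors the qualifying prefix, since that is not specified here.

```python
from pathlib import PurePosixPath


def modulename_of(longmodulename: str) -> str:
    """Return the last component of a dotted module name, e.g. 'Bar' for 'Foo.Bar'."""
    return longmodulename.rsplit(".", 1)[-1]


def matches_file(longmodulename: str, path: str) -> bool:
    """True if `path` ends in '.lbf' and its base name equals the modulename."""
    p = PurePosixPath(path)
    return p.suffix == ".lbf" and p.stem == modulename_of(longmodulename)
```

For instance, a module declared as `Foo.Bar` would be accepted in a file named `Bar.lbf` but not in `Baz.lbf`.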
105+
106+
### Import
107+
Imports bring *entities* (types and classes) of other modules into scope.
108+
109+
```text
110+
import -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ]
111+
importspec -> '(' [ { tyname ',' } tyname [','] ] ')'
112+
```
113+
114+
If `importspec` is omitted, then all entities specified in the module are imported; otherwise only the specified entities are imported.
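
The effect of `importspec` can be sketched in Python. This is an illustration of the rule above and not the compiler's behaviour: `imported_entities` is a hypothetical helper, and names listed in the `importspec` that the module does not define are silently dropped here, whereas a real compiler may well report them as errors.

```python
from typing import Iterable, Optional, Set


def imported_entities(defined: Set[str], importspec: Optional[Iterable[str]]) -> Set[str]:
    """Return the entities an import brings into scope.

    `defined` is the set of entities of the imported module;
    `importspec` is None when the import has no importspec.
    """
    if importspec is None:
        return set(defined)  # no importspec: everything is imported
    return defined & set(importspec)  # otherwise only the listed entities
```

Note that an empty `importspec`, written `()`, is allowed by the grammar and under this reading imports nothing.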
115+
116+
### Statement
117+
118+
Statements define types, classes, instance clauses, and derive clauses.
119+
120+
```text
121+
statement -> typedef
122+
| classdef
123+
| instanceclause
124+
| deriveclause
125+
```
126+
127+
#### Type definitions
128+
Types may be either sum types, product types, record types, or opaque types.
129+
130+
```text
131+
typedef -> prodtypedef | sumtypedef | recordtypedef | opaquetypedef
132+
```
133+
134+
##### Product type definition
135+
A product type definition defines a new product type.
136+
137+
```text
138+
prodtypedef -> 'prod' tyname { varname } '=' prod
139+
prod -> { tyexpr }
140+
tyexpr -> varname
141+
| longtyname
142+
| '(' prod ')'
143+
```
144+
145+
Product type definitions instruct the code generator to generate a product type for the target language.
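
The `prod` and `tyexpr` productions are mutually recursive, which a short recursive-descent parser in Python can illustrate. This is a sketch under assumptions: the input is already split into tokens, names are classified by the case of their first character using `str.isupper` and `str.islower` (which do not treat titlecase letters as `upper`), and both the tuple-based tree shape and the name `parse_prod` are hypothetical.

```python
# Copied from the `keyword` production; a keyword ends the current prod.
KEYWORDS = {
    "module", "sum", "prod", "record", "opaque",
    "class", "instance", "import", "qualified", "as",
}


def parse_prod(tokens, pos=0):
    """prod -> { tyexpr }

    Parse zero or more tyexprs starting at `pos`.
    Returns (items, next_pos), where each item is one of
    ("var", name), ("ty", name) or ("prod", [items...]).
    """
    items = []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "(":
            # tyexpr -> '(' prod ')'
            inner, pos = parse_prod(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("expected ')'")
            items.append(("prod", inner))
            pos += 1
        elif tok in KEYWORDS:
            break  # start of the next statement
        elif tok[0].isupper():
            items.append(("ty", tok))  # tyexpr -> longtyname
            pos += 1
        elif tok[0].islower():
            items.append(("var", tok))  # tyexpr -> varname
            pos += 1
        else:
            break  # ')' or other punctuation ends this prod
    return items, pos
```

For example, the tokens of `Maybe a (Either a b)` parse into a type name, a variable, and a nested product holding `Either a b`.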
146+
147+
##### Sum type definition
148+
A sum type definition defines a new sum type.
149+
150+
```text
151+
sumtypedef -> 'sum' tyname { varname } '=' sum
152+
sum -> sumconstructor { '|' sumconstructor }
153+
sumconstructor -> tyname prod
154+
```
155+
156+
Sum type definitions instruct the code generator to generate a sum type for the target language.
157+
158+
##### Record type definition
159+
A record type definition defines a new record type.
160+
161+
```text
162+
recordtypedef -> 'record' tyname { varname } '=' record
163+
record -> '{' [ field { ',' field } ] '}'
164+
field -> fieldname ':' prod
165+
```
166+
167+
Record type definitions instruct the code generator to generate a record type for the target language.
168+
169+
##### Opaque type
170+
An opaque type definition defines a new opaque type.
171+
172+
```text
173+
opaquetypedef -> 'opaque' tyname { varname }
174+
```
175+
176+
Opaque type definitions do not instruct the code generator to generate code; an opaque type must instead be implemented in the target language.
177+
178+
#### Class definition
179+
A class definition introduces a new class.
180+
181+
```text
182+
classdef -> 'class' [ constraintexps '<=' ] classname { varname }
183+
constraintexp -> classref { varname }
184+
| '(' constraintexps ')'
185+
constraintexps -> [ constraintexp { ',' constraintexp } ]
186+
```
187+
188+
Class definitions do not instruct the code generator to generate code, but
189+
instead provide a means to communicate to the code generator the
190+
instances one would like to generate (via a derive clause).
191+
192+
#### Instance clause
193+
An instance clause specifies that a type is an instance of a class.
194+
195+
```text
196+
instanceclause -> 'instance' constraint [ ':-' constraintexps ]
197+
constraint -> classref { tyexpr }
198+
```
199+
200+
Instance clauses do not instruct the code generator to generate code, but
201+
instead instruct the compiler (semantic checking) that the target language
202+
provides instances for the given type provided that the given `constraintexps`
203+
have instances.
204+
205+
#### Derive clause
206+
Derive clauses instruct the code generator to generate code for a type so that it is an instance of a class.
207+
208+
```text
209+
deriveclause -> 'derive' constraint
210+
```
211+
212+
Note that the code generation of a type for a class is implemented via built-in derivation rules (which developers may extend).
213+
214+
### Syntax reference
215+
The summarized productions of a LambdaBuffers file are as follows.
216+
217+
```text
218+
module -> 'module' longmodulename { import } { statement }
219+
220+
import -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ]
221+
importspec -> '(' [ { tyname ',' } tyname [','] ] ')'
222+
223+
statement -> typedef
224+
| classdef
225+
| instanceclause
226+
| deriveclause
227+
228+
typedef -> prodtypedef | sumtypedef | recordtypedef | opaquetypedef
229+
230+
prodtypedef -> 'prod' tyname { varname } '=' prod
231+
prod -> { tyexpr }
232+
tyexpr -> varname
233+
| longtyname
234+
| '(' prod ')'
235+
236+
sumtypedef -> 'sum' tyname { varname } '=' sum
237+
sum -> sumconstructor { '|' sumconstructor }
238+
sumconstructor -> tyname prod
239+
240+
recordtypedef -> 'record' tyname { varname } '=' record
241+
record -> '{' [ field { ',' field } ] '}'
242+
field -> fieldname ':' prod
243+
244+
opaquetypedef -> 'opaque' tyname { varname }
245+
246+
classdef -> 'class' [ constraintexps '<=' ] classname { varname }
247+
constraintexp -> classref { varname }
248+
| '(' constraintexps ')'
249+
constraintexps -> [ constraintexp { ',' constraintexp } ]
250+
251+
instanceclause -> 'instance' constraint [ ':-' constraintexps ]
252+
constraint -> classref { tyexpr }
253+
254+
deriveclause -> 'derive' constraint
255+
```
