|
| 1 | +# LambdaBuffers file |
| 2 | + |
| 3 | +The input to LambdaBuffers is a text file which contains a module that defines |
| 4 | + a specification of the types you want to generate. |
| 5 | +This section gives the exact syntax of a LambdaBuffers file and informally describes the meaning of its syntactic constructs. |
| 6 | + |
| 7 | +The name of a LambdaBuffers file must end with `.lbf`. |
| 8 | + |
| 9 | +## Notation |
| 10 | +In the following description of a LambdaBuffers file's syntax, we use |
| 11 | + a BNF syntax similar to that of [Section 10.1 of the Haskell Report](https://www.haskell.org/onlinereport/haskell2010/). |
| 12 | +The following notational conventions are used for presenting syntax. |
| 13 | + |
| 14 | +| Syntax | Description | |
| 15 | +| ------------- | --------------------------------------------------------------------------- | |
| 16 | +| `[pattern]` | optional | |
| 17 | +| `{pattern}` | zero or more repetitions | |
| 18 | +| `(pattern)` | grouping | |
| 19 | +| `pat1⎮pat2` | choice | |
| 20 | +| `pat1\pat2` | difference -- elements generated by `pat1` except those generated by `pat2` | |
| 21 | +| `'terminal'` | terminal syntax surrounded by single quotes | |
| 22 | + |
| 23 | +<!-- Apparently, `mdbook`'s markdown can't escape the vertical bar in codeblocks in a table.... |
| 24 | + So, we're using code point U+23AE to look like a vertical bar when it really isn't... |
| 25 | +
|
| 26 | +| `pat1|pat2` | choice | |
| 27 | +--> |
| 28 | + |
| 29 | +Note that the terminal syntax permits C-style escape sequences, e.g., |
| 30 | + `'\n'` denotes line feed (newline), and `'\r'` denotes carriage return. |
| 31 | + |
| 32 | +Productions will be of the form |
| 33 | + |
| 34 | +```text |
| 35 | +nonterm -> alt1 | ... | altn |
| 36 | +``` |
| 37 | + |
| 38 | +## Input file representation |
| 39 | +The input file is Unicode text where the encoding is subject to the system locale. |
| 40 | +We will often use the unqualified term *character* to refer to a Unicode code point in the input file. |
| 41 | + |
| 42 | +## Characters |
| 43 | +The following terms are used to denote specific Unicode character categories: |
| 44 | + |
| 45 | +- `upper` denotes a Unicode code point categorized as an uppercase letter or titlecase letter (i.e., with General Category value Lt or Lu). |
| 46 | + |
| 47 | +- `lower` denotes a Unicode code point categorized as a lowercase letter (i.e., with General Category value Ll). |
| 48 | + |
| 49 | +- `alphanum` denotes either `upper` or `lower`; or a Unicode code point categorized as a modifier letter, other letter, decimal digit number, letter number, or other number (i.e., with General Category value Lt, Lu, Ll, Lm, Lo, Nd, Nl or No). |
| 50 | + |
| 51 | +- `space` denotes a Unicode code point categorized as a separator space (i.e., with General Category value Zs), or any of the control characters `'\t'`, `'\n'`, `'\r'`, `'\f'`, or `'\v'`. |
| 52 | + |
| 53 | +Interested readers may find details of Unicode character categories in [Section 4.5 of The Unicode Standard 15.1.0](https://www.unicode.org/versions/Unicode15.1.0/), and the [Unicode Character Database](https://unicode.org/ucd/). |
| 54 | + |
| 55 | +## Lexical syntax |
| 56 | + |
| 57 | +Tokens form the vocabulary of LambdaBuffers files. |
| 58 | +The classes of tokens are defined as follows. |
| 59 | + |
| 60 | +```text |
| 61 | +keyword -> 'module' | 'sum' | 'prod' | 'record' |
| 62 | + | 'opaque' | 'class' | 'instance' | 'import' |
| 63 | + | 'qualified' | 'as' |
| 64 | +modulename -> uppercamelcase |
| 65 | +longmodulename -> long modulename |
| 66 | +tyname -> uppercamelcase |
| 67 | +fieldname -> lowercamelcase\keyword |
| 68 | +longtyname -> long tyname |
| 69 | +varname -> lowers\keyword |
| 70 | +punctuation -> '<=' | ',' | '(' | ')' | '{' | '}' |
| 71 | + | ':' | ':-' | '=' | '|' |
| 72 | +classname -> uppercamelcase |
| 73 | +longclassname -> long uppercamelcase |
| 74 | +``` |
| 75 | + |
| 76 | +where |
| 77 | + |
| 78 | +```text |
| 79 | +uppercamelcase -> upper { alphanum } |
| 80 | +lowercamelcase -> lower { alphanum } |
| 81 | +long -> { uppercamelcase '.' } |
| 82 | +lowers -> lower { lower } |
| 83 | +``` |
| 84 | + |
| 85 | +Input files are broken into *tokens* using the *maximal munch* rule, i.e., |
| 86 | + at each point, the next token is the longest sequence of characters that |
| 87 | + forms a valid token. |
| 88 | +`space`s and line comments are ignored except where they separate tokens that |
| 89 | + would otherwise combine into a single token. |
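
To make the token classes concrete, here are a few hypothetical lexemes (not taken from any real module), each annotated with the class it would fall into under the productions above. This is only an illustration of the rules as written.

```text
Maybe          uppercamelcase: usable as a tyname, classname, or modulename
Prelude.Maybe  the long prefix 'Prelude.' followed by 'Maybe', e.g. a longtyname
a              lowers: a varname
firstName      lowercamelcase: a fieldname, but not a varname because 'N' is not lower
module         a keyword, so neither a fieldname nor a varname
:-             a single punctuation token; maximal munch prefers it over the shorter ':'
```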
| 90 | + |
| 91 | +### Line comments |
| 92 | +A *line comment* starts with the terminal `'--'` followed by zero or more printable Unicode characters stopping at the first end of line (`'\n'` or `'\r\n'`). |
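
As a small sketch of the rule above, both comments below would be skipped by the lexer. The `opaque Integer` statement is only a placeholder to show a comment trailing other tokens.

```text
-- A whole-line comment.
opaque Integer -- a trailing comment, ignored up to the end of the line
```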
| 93 | + |
| 94 | +## Syntax of LambdaBuffers files |
| 95 | +A LambdaBuffers file defines a module that is a collection of data types, classes, instance clauses, and derive clauses. |
| 96 | + |
| 97 | +The overall layout of a LambdaBuffers file is: |
| 98 | + |
| 99 | +```text |
| 100 | +module -> 'module' longmodulename { import } { statement } |
| 101 | +``` |
| 102 | + |
| 103 | +The file must begin by declaring the module's `longmodulename`, whose final `modulename` component must match the file's name without the `.lbf` extension. |
| 104 | +After that, the file may contain a sequence of `import`s followed by a sequence of `statement`s. |
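
For instance, a hypothetical file named `Shapes.lbf` could open with the header below. The `Example.` prefix is purely illustrative; this section does not say how the `long` prefix relates to the file's location.

```text
module Example.Shapes
```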
| 105 | + |
| 106 | +### Import |
| 107 | +Imports bring *entities* (types and classes) of other modules into scope. |
| 108 | + |
| 109 | +```text |
| 110 | +import -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ] |
| 111 | +importspec -> '(' [ { tyname ',' } tyname [','] ] ')' |
| 112 | +``` |
| 113 | + |
| 114 | +If `importspec` is omitted, then all entities specified in the module are imported; otherwise only the specified entities are imported. |
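
The following lines are a sketch of imports that the `import` production accepts. The module and entity names (`Prelude`, `Data.Geometry`, `Point`, and so on) are assumptions made up for illustration.

```text
import Prelude
import Prelude (Eq, Integer)
import qualified Data.Geometry as Geo
import qualified Data.Geometry as Geo (Point, Line,)
```

The first line imports every entity of the module, while the lines with an `importspec` import only the listed entities. The last line shows the optional trailing comma that the `importspec` production permits.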
| 115 | + |
| 116 | +### Statement |
| 117 | + |
| 118 | +Statements define types, classes, instance clauses, and derive clauses. |
| 119 | + |
| 120 | +```text |
| 121 | +statement -> typedef |
| 122 | + | classdef |
| 123 | + | instanceclause |
| 124 | + | deriveclause |
| 125 | +``` |
| 126 | + |
| 127 | +#### Type definitions |
| 128 | +Types may be either sum types, product types, record types, or opaque types. |
| 129 | + |
| 130 | +```text |
| 131 | +typedef -> prodtypedef | sumtypedef | recordtypedef | opaquetypedef |
| 132 | +``` |
| 133 | + |
| 134 | +##### Product type definition |
| 135 | +A product type definition defines a new product type. |
| 136 | + |
| 137 | +```text |
| 138 | +prodtypedef -> 'prod' tyname { varname } '=' prod |
| 139 | +prod -> { tyexpr } |
| 140 | +tyexpr -> varname |
| 141 | + | longtyname |
| 142 | + | '(' prod ')' |
| 143 | +``` |
| 144 | + |
| 145 | +Product type definitions instruct the code generator to generate a product type for the target language. |
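
As an illustration, the definitions below follow the `prodtypedef` production. The type names are hypothetical, and `Integer` and `Maybe` are assumed to be in scope.

```text
prod Pair a b = a b
prod Tagged a = Integer (Maybe a)
```

In the second definition, the parentheses group `Maybe a` into a single `tyexpr`, so `Tagged` has two components rather than three.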
| 146 | + |
| 147 | +##### Sum type definition |
| 148 | +A sum type definition defines a new sum type. |
| 149 | + |
| 150 | +```text |
| 151 | +sumtypedef -> 'sum' tyname { varname } '=' sum |
| 152 | +sum -> sumconstructor { '|' sumconstructor } |
| 153 | +sumconstructor -> tyname prod |
| 154 | +``` |
| 155 | + |
| 156 | +Sum type definitions instruct the code generator to generate a sum type for the target language. |
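
A sketch of sum type definitions that fit the `sumtypedef` production follows. The names are hypothetical and `Integer` is assumed to be in scope.

```text
sum Maybe a = Just a | Nothing
sum Shape = Circle Integer | Rectangle Integer Integer
```

Each constructor is a `tyname` followed by a possibly empty `prod`, so `Nothing` carries no components while `Rectangle` carries two.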
| 157 | + |
| 158 | +##### Record type definition |
| 159 | +A record type definition defines a new record type. |
| 160 | + |
| 161 | +```text |
| 162 | +recordtypedef -> 'record' tyname { varname } '=' record |
| 163 | +record -> '{' [ field { ',' field } ] '}' |
| 164 | +field -> fieldname ':' prod |
| 165 | +``` |
| 166 | +
|
| 167 | +Record type definitions instruct the code generator to generate a record type for the target language. |
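
For example, the following hypothetical records follow the `recordtypedef` production. `Text`, `Integer`, and `Maybe` are assumed to be in scope.

```text
record Person = { name : Text, age : Integer }
record Box a = { contents : Maybe a }
```

Because a field's type is a `prod`, `Maybe a` needs no surrounding parentheses here.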
| 168 | +
|
| 169 | +##### Opaque type definition |
| 170 | +An opaque type definition defines a new opaque type. |
| 171 | +
|
| 172 | +```text |
| 173 | +opaquetypedef -> 'opaque' tyname { varname } |
| 174 | +``` |
| 175 | + |
| 176 | +Opaque type definitions do not instruct the code generator to generate code; an opaque type must instead be implemented in the target language. |
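
A minimal sketch of opaque type definitions, using hypothetical names:

```text
opaque Integer
opaque Map k v
```

Note that an opaque type has no right-hand side: only its name and type variables are declared.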
| 177 | + |
| 178 | +#### Class definition |
| 179 | +A class definition introduces a new class. |
| 180 | + |
| 181 | +```text |
| 182 | +classdef -> 'class' [ constraintexps '<=' ] classname { varname } |
| 183 | +constraintexp -> classref { varname } |
| 184 | + | '(' constraintexps ')' |
| 185 | +constraintexps -> [ constraintexp { ',' constraintexp } ] |
| 186 | +``` |
| 187 | + |
| 188 | +Class definitions do not instruct the code generator to generate code, but |
| 189 | + instead provide a means of telling the code generator which |
| 190 | + instances one would like to generate (via a derive clause). |
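
The class definitions below are a sketch based on the `classdef` production. The class names are hypothetical, and since this section does not define `classref`, the example assumes it is a (possibly qualified) class name.

```text
class Eq a
class Eq a <= Ord a
class (Eq a, Show a) <= Pretty a
```

The constraints to the left of `<=` are the optional `constraintexps`, and the class being introduced appears to its right.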
| 191 | + |
| 192 | +#### Instance clause |
| 193 | +An instance clause specifies that a type is an instance of a class. |
| 194 | + |
| 195 | +```text |
| 196 | +instanceclause -> 'instance' constraint [ ':-' constraintexps ] |
| 197 | +constraint -> classref { tyexpr } |
| 198 | +``` |
| 199 | + |
| 200 | +Instance clauses do not instruct the code generator to generate code, but |
| 201 | + instead inform the compiler (during semantic checking) that the target language |
| 202 | + provides instances for the given type, provided that the given `constraintexps` |
| 203 | + have instances. |
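
As a sketch, the instance clauses below follow the `instanceclause` production. The class and type names are hypothetical, and `classref` is again assumed to be a class name.

```text
instance Eq Integer
instance Eq (Maybe a) :- Eq a
```

The second clause can be read as: `Maybe a` has an `Eq` instance provided that `a` has one.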
| 204 | + |
| 205 | +#### Derive clause |
| 206 | +Derive clauses instruct the code generator to generate code for a type so that it is an instance of a class. |
| 207 | + |
| 208 | +```text |
| 209 | +deriveclause -> 'derive' constraint |
| 210 | +``` |
| 211 | + |
| 212 | +Note that code generation of a class instance for a type is implemented via built-in derivation rules (which developers may extend). |
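
For illustration, the derive clauses below follow the `deriveclause` production. They assume that a class `Eq` and the types `Shape` and `Maybe` are in scope; all three names are hypothetical.

```text
derive Eq Shape
derive Eq (Maybe a)
```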
| 213 | + |
| 214 | +### Syntax reference |
| 215 | +The productions of a LambdaBuffers file are summarized as follows. |
| 216 | + |
| 217 | +```text |
| 218 | +module -> 'module' longmodulename { import } { statement } |
| 219 | +
|
| 220 | +import -> 'import' [ 'qualified' ] longmodulename [ 'as' longmodulename ] [ importspec ] |
| 221 | +importspec -> '(' [ { tyname ',' } tyname [','] ] ')' |
| 222 | +
|
| 223 | +statement -> typedef |
| 224 | + | classdef |
| 225 | + | instanceclause |
| 226 | + | deriveclause |
| 227 | +
|
| 228 | +typedef -> prodtypedef | sumtypedef | recordtypedef | opaquetypedef |
| 229 | +
|
| 230 | +prodtypedef -> 'prod' tyname { varname } '=' prod |
| 231 | +prod -> { tyexpr } |
| 232 | +tyexpr -> varname |
| 233 | + | longtyname |
| 234 | + | '(' prod ')' |
| 235 | +
|
| 236 | +sumtypedef -> 'sum' tyname { varname } '=' sum |
| 237 | +sum -> sumconstructor { '|' sumconstructor } |
| 238 | +sumconstructor -> tyname prod |
| 239 | +
|
| 240 | +recordtypedef -> 'record' tyname { varname } '=' record |
| 241 | +record -> '{' [ field { ',' field } ] '}' |
| 242 | +field -> fieldname ':' prod |
| 243 | +
|
| 244 | +opaquetypedef -> 'opaque' tyname { varname } |
| 245 | +
|
| 246 | +classdef -> 'class' [ constraintexps '<=' ] classname { varname } |
| 247 | +constraintexp -> classref { varname } |
| 248 | + | '(' constraintexps ')' |
| 249 | +constraintexps -> [ constraintexp { ',' constraintexp } ] |
| 250 | +
|
| 251 | +instanceclause -> 'instance' constraint [ ':-' constraintexps ] |
| 252 | +constraint -> classref { tyexpr } |
| 253 | +
|
| 254 | +deriveclause -> 'derive' constraint |
| 255 | +``` |