|
| 1 | +--- |
| 2 | +sidebar_position: 5 |
| 3 | +--- |
| 4 | + |
| 5 | +# Compiler |
| 6 | + |
| 7 | +This page contains a high-level breakdown of the |
| 8 | +steps needed to compile Mew code. |
| 9 | + |
| 10 | +```mermaid |
| 11 | +flowchart LR; |
| 12 | + subgraph Frontend |
| 13 | + AST-->HIR |
| 14 | + HIR-->MIR |
| 15 | + MIR-->LIR |
| 16 | + end |
| 17 | + subgraph Backend |
| 18 | + LIR-->C# |
| 19 | + C#-->CIL |
| 20 | + LIR-. Future .->CIL |
| 21 | + end |
| 22 | +``` |
| 23 | + |
| 24 | +## 1. AST Parsing |
| 25 | + |
| 26 | +The parsing step iterates through all source files and |
| 27 | +builds a syntax tree for each of them. |
| 28 | +The syntax tree represents the code as it was written, |
| 29 | +preserving trivia such as whitespace and comments. |
| 30 | + |
| 31 | +Each node in the AST has a reference to both its parent |
| 32 | +and its children. |
| 33 | + |
| 34 | +Apart from being the basis for `HIR` generation, the AST |
| 35 | +is also used to interact with the source code programmatically, |
| 36 | +e.g. from the LSP server. |
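As a rough illustration of a tree that keeps trivia and parent links, here is a minimal Python sketch. It is not the Mew compiler's actual data structure: `SyntaxNode`, `leading_trivia`, `add`, and `to_source` are hypothetical names, and the tokens are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class SyntaxNode:
    """Hypothetical full-fidelity syntax node (names are assumptions)."""
    kind: str
    text: str = ""
    leading_trivia: str = ""  # whitespace and comments that precede the text
    parent: Optional["SyntaxNode"] = field(default=None, repr=False)
    children: List["SyntaxNode"] = field(default_factory=list)

    def add(self, child: "SyntaxNode") -> "SyntaxNode":
        """Attach a child and set its parent reference."""
        child.parent = self
        self.children.append(child)
        return child

    def to_source(self) -> str:
        """Reproduce the original text, trivia included."""
        inner = "".join(c.to_source() for c in self.children)
        return self.leading_trivia + self.text + inner


# Made-up token stream; not real Mew syntax.
tree = SyntaxNode("assignment")
tree.add(SyntaxNode("identifier", "x"))
tree.add(SyntaxNode("equals", "=", leading_trivia=" "))
tree.add(SyntaxNode("number", "1", leading_trivia=" "))
tree.add(SyntaxNode("end_of_line", "", leading_trivia="  // comment"))
```

Because every node stores its trivia, `to_source()` can reproduce the written text exactly. Tooling such as an LSP server depends on this property when it edits or formats code.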
| 37 | + |
| 38 | +## 2. `HIR` generation |
| 39 | + |
| 40 | +HIR, short for _High-level Intermediate Representation_, |
| 41 | +represents a bound tree, where all types are known. |
| 42 | + |
| 43 | +The HIR references resolved _symbols_ for the different parts |
| 44 | +of Mew (namespaces, types, functions, parameters, variables, etc.). |
| 45 | +For example, two code blocks that call the same function will hold |
| 46 | +the same symbol reference to that function. |
| 47 | + |
| 48 | +1. Build symbol table |
| 49 | + 1. Namespaces |
| 50 | + 1. Types |
| 51 | + 1. Free functions |
| 52 | + 1. Type members |
| 53 | +1. Binding |
| 54 | + 1. Types |
| 55 | + 1. Free functions |
| 56 | + 1. Top-level statements |
| 57 | + |
| 58 | +:::info |
| 59 | +HIR might contain errors, represented as error symbols. |
| 60 | +::: |
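The two phases listed above (declare every symbol first, then bind) can be sketched as follows. This is a simplified Python illustration under assumed names (`Symbol`, `ERROR`, `build_symbol_table`, `bind_calls`), not the compiler's real API. It shows two call sites sharing one symbol, and an unresolved name binding to an error symbol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """Hypothetical resolved symbol."""
    kind: str
    name: str


# Assumed sentinel standing in for an "error symbol".
ERROR = Symbol("error", "<error>")


def build_symbol_table(declarations):
    """Phase 1: declare every name before anything is bound."""
    return {name: Symbol(kind, name) for kind, name in declarations}


def bind_calls(call_sites, table):
    """Phase 2: resolve each called name to its symbol, or to ERROR."""
    return [table.get(name, ERROR) for name in call_sites]


table = build_symbol_table([("function", "greet"), ("function", "main")])
bound = bind_calls(["greet", "greet", "missing"], table)
```

Here `bound[0]` and `bound[1]` are the very same object, and `bound[2]` is the error symbol, so binding can continue past a failed lookup.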
| 61 | + |
| 62 | +## 3. `MIR` generation |
| 63 | + |
| 64 | +MIR, short for _Medium-level Intermediate Representation_, |
| 65 | +is a lowered HIR, without constructs such as `while`/`loop`/`if`. |
| 66 | + |
| 67 | +All higher-level constructs, such as loops and conditions, |
| 68 | +have been lowered into labels and branches. |
| 69 | + |
| 70 | +Control flow analysis and some optimizations |
| 71 | +are done here as well. |
| 72 | + |
| 73 | +:::info |
| 74 | +MIR might contain errors, represented as error symbols. |
| 75 | +::: |
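To make "lowered into labels and branches" concrete, here is a small Python sketch of how a `while` loop could be flattened. The instruction names (`label`, `branch_if_false`, `goto`) and the label naming scheme are assumptions chosen for illustration; the actual MIR instruction set may differ.

```python
from itertools import count


def lower_while(condition, body, labels):
    """Flatten `while condition { body }` into labels and branches.

    `labels` is an iterator of unique integers used to name the labels.
    """
    n = next(labels)
    start, end = f"while_start_{n}", f"while_end_{n}"
    return [
        ("label", start),
        ("branch_if_false", condition, end),  # leave the loop when false
        *body,                                # already-lowered body
        ("goto", start),                      # re-evaluate the condition
        ("label", end),
    ]


lowered = lower_while("i < 3", [("stmt", "i = i + 1")], count())
```

The result is a flat instruction list with no nested constructs, a shape that control flow analysis can work on directly.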
| 76 | + |
| 77 | +## 4. `LIR` generation |
| 78 | + |
| 79 | +LIR, short for _Low-level Intermediate Representation_, |
| 80 | +is a lowered MIR, resembling the final bytecode that will |
| 81 | +be emitted. |
| 82 | + |
| 83 | +:::warning |
| 84 | +LIR **MUST NOT** contain any errors. |
| 85 | +::: |
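One way to uphold this invariant is to refuse to lower any MIR that still contains error nodes. The Python sketch below assumes a hypothetical `("error", ...)` instruction shape and a hypothetical `LoweringError`; how the compiler actually enforces the rule is not described on this page.

```python
class LoweringError(Exception):
    """Raised when MIR that still contains errors reaches LIR lowering."""


def lower_to_lir(mir):
    """Return LIR for error-free MIR; raise if any error node remains."""
    errors = [ins for ins in mir if ins[0] == "error"]
    if errors:
        raise LoweringError(f"{len(errors)} error node(s) left in MIR")
    # The real lowering would happen here; this sketch passes instructions through.
    return list(mir)


clean_lir = lower_to_lir([("label", "entry"), ("return",)])
```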
| 86 | + |
| 87 | +## 5. Emitting |
| 88 | + |
| 89 | +Finally, the LIR is transpiled into C# and compiled to |
| 90 | +native code using NativeAOT, producing an executable binary. |
| 91 | + |
| 92 | +:::note |
| 93 | +There are plans to emit CIL directly in the future. |
| 94 | +This was done in the early prototype, but turned out to |
| 95 | +be too cumbersome while the language was in active development. |
| 96 | +::: |