|
1 |
| -"""Update build result by incrementally processing changed modules. |
| 1 | +"""Update build by processing changes using fine-grained dependencies. |
2 | 2 |
|
3 | 3 | Use fine-grained dependencies to update targets in other modules that
|
4 | 4 | may be affected by externally-visible changes in the changed modules.
|
5 | 5 |
|
6 |
| -Terms: |
| 6 | +This forms the core of the fine-grained incremental daemon mode. This |
| 7 | +module is not used at all by the 'classic' (non-daemon) incremental |
| 8 | +mode. |
7 | 9 |
|
8 |
| -* A 'target' is a function definition or the top level of a module. We |
9 |
| - refer to targets using their fully qualified name (e.g. 'mod.Cls.attr'). |
10 |
| - Targets are the smallest units of processing during fine-grained |
11 |
| - incremental checking. |
12 |
| -* A 'trigger' represents the properties of a part of a program, and it |
13 |
| - gets triggered/activated when these properties change. For example, |
14 |
| - '<mod.func>' refers to a module-level function, and it gets triggered |
15 |
| - if the signature of the function changes, or if if the function is |
16 |
| - removed. |
| 10 | +Here is some motivation for this mode: |
17 | 11 |
|
18 |
| -Some program state is maintained across multiple build increments: |
| 12 | +* By keeping program state in memory between incremental runs, we |
| 13 | + only have to process changed modules, not their dependencies. The |
| 14 | + classic incremental mode has to deserialize the symbol tables of |
| 15 | + all dependencies of changed modules, which can be slow for large |
| 16 | + programs. |
19 | 17 |
|
20 |
| -* The full ASTs of all modules in memory all the time (+ type map). |
21 |
| -* Maintain a fine-grained dependency map, which is from triggers to |
22 |
| - targets/triggers. The latter determine what other parts of a program |
23 |
| - need to be processed again due to an externally visible change to a |
24 |
| - module. |
| 18 | +* Fine-grained dependencies allow processing only the relevant parts |
| 19 | + of modules indirectly affected by a change. Say, if only one function |
| 20 | + in a large module is affected by a change in another module, only this |
| 21 | + function is processed. The classic incremental mode always processes |
| 22 | + an entire file as a unit, which is typically much slower. |
25 | 23 |
|
26 |
| -We perform a fine-grained incremental program update like this: |
| 24 | +* It's possible to independently process individual modules within an |
| 25 | + import cycle (SCC). Small incremental changes can be fast independent |
| 26 | + of the size of the related SCC. In classic incremental mode, any change |
| 27 | + within an SCC requires the entire SCC to be processed, which can slow |
| 28 | + things down considerably. |
| 29 | +
|
| 30 | +Some terms: |
| 31 | +
|
| 32 | +* A *target* is a function/method definition or the top level of a module. |
| 33 | + We refer to targets using their fully qualified name (e.g. |
| 34 | + 'mod.Cls.method'). Targets are the smallest units of processing during |
| 35 | + fine-grained incremental checking. |
| 36 | +
|
| 37 | +* A *trigger* represents the properties of a part of a program, and it |
| 38 | + gets triggered/fired when these properties change. For example, |
| 39 | + '<mod.func>' refers to a module-level function. It gets triggered if |
| 40 | + the signature of the function changes, or if the function is removed, |
| 41 | + for example. |
| 42 | +
|
| 43 | +Some program state is maintained across multiple build increments in |
| 44 | +memory: |
| 45 | +
|
| 46 | +* The full ASTs of all modules are stored in memory all the time (this |
| 47 | + includes the type map). |
| 48 | +
|
| 49 | +* A fine-grained dependency map is maintained, which maps triggers to |
| 50 | + affected program locations (these can be targets, triggers, or |
| 51 | + classes). The latter determine what other parts of a program need to |
| 52 | + be processed again due to a fired trigger. |
| 53 | +
|
| 54 | +Here's a summary of how a fine-grained incremental program update happens: |
27 | 55 |
|
28 | 56 | * Determine which modules have changes in their source code since the
|
29 |
| - previous build. |
30 |
| -* Fully process these modules, creating new ASTs and symbol tables |
31 |
| - for them. Retain the existing ASTs and symbol tables of modules that |
32 |
| - have no changes in their source code. |
33 |
| -* Determine which parts of the changed modules have changed. The result |
34 |
| - is a set of triggered triggers. |
35 |
| -* Using the dependency map, decide which other targets have become |
36 |
| - stale and need to be reprocessed. |
37 |
| -* Replace old ASTs of the modules that we reprocessed earlier with |
38 |
| - the new ones, but try to retain the identities of original externally |
39 |
| - visible AST nodes so that we don't (always) need to patch references |
40 |
| - in the rest of the program. |
41 |
| -* Semantically analyze and type check the stale targets. |
42 |
| -* Repeat the previous steps until nothing externally visible has changed. |
| 57 | + previous update. |
| 58 | +
|
| 59 | +* Process changed modules one at a time. Perform a separate full update |
| 60 | + for each changed module, but only report the errors after all modules |
| 61 | + have been processed, since the intermediate states can generate bogus |
| 62 | + errors due to only seeing a partial set of changes. |
| 63 | +
|
| 64 | +* Each changed module is processed in full. We parse the module, and |
| 65 | + run semantic analysis to create a new AST and symbol table for the |
| 66 | + module. Reuse the existing ASTs and symbol tables of modules that |
| 67 | + have no changes in their source code. At the end of this stage, we have |
| 68 | + two ASTs and symbol tables for the changed module (the old and the new |
| 69 | + versions). The latter AST has not yet been type checked. |
| 70 | +
|
| 71 | +* Take a snapshot of the old symbol table. This is used later to determine |
| 72 | + which properties of the module have changed and which triggers to fire. |
| 73 | +
|
| 74 | +* Merge the old AST with the new AST, preserving the identities of |
| 75 | + externally visible AST nodes for which we can find a corresponding node |
| 76 | + in the new AST. (Look at mypy.server.astmerge for the details.) This |
| 77 | + way all external references to AST nodes in the changed module will |
| 78 | + continue to point to the right nodes (assuming they still have a valid |
| 79 | + target). |
| 80 | +
|
| 81 | +* Type check the new module. |
| 82 | +
|
| 83 | +* Take another snapshot of the symbol table of the changed module. |
| 84 | + Look at the differences between the old and new snapshots to determine |
| 85 | + which parts of the changed modules have changed. The result is a set of |
| 86 | + fired triggers. |
| 87 | +
|
| 88 | +* Using the dependency map and the fired triggers, decide which other |
| 89 | + targets have become stale and need to be reprocessed. |
| 90 | +
|
| 91 | +* Create new fine-grained dependencies for the changed module. We don't |
| 92 | + garbage collect old dependencies, since extra dependencies are relatively |
| 93 | + harmless (they take some memory and can theoretically slow things down |
| 94 | + a bit by causing redundant work). This is implemented in |
| 95 | + mypy.server.deps. |
| 96 | +
|
| 97 | +* Strip the stale AST nodes that we found above. This returns them to a |
| 98 | + state resembling the end of semantic analysis pass 1. We'll run semantic |
| 99 | + analysis again on the existing AST nodes, and since semantic analysis |
| 100 | + is not idempotent, we need to revert some changes made during semantic |
| 101 | + analysis. This is implemented in mypy.server.aststrip. |
| 102 | +
|
| 103 | +* Run semantic analyzer passes 2 and 3 on the stale AST nodes, and type |
| 104 | + check them. We also need to do the symbol table snapshot comparison |
| 105 | + dance to find any changes, and we need to merge ASTs to preserve AST node |
| 106 | + identities. |
| 107 | +
|
| 108 | +* If some triggers have been fired, continue processing and repeat the |
| 109 | + previous steps until no triggers are fired. |
| 110 | +
|
| 111 | +This module is tested using end-to-end fine-grained incremental mode |
| 112 | +test cases (test-data/unit/fine-grained*.test). |
43 | 113 |
|
44 | 114 | Major todo items:
|
45 | 115 |
|
46 |
| -- Support multiple type checking passes |
| 116 | +- Fully support multiple type checking passes |
47 | 117 | """
|
48 | 118 |
|
49 | 119 | import os.path
|
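
The new docstring describes two steps that are easy to lose in prose: comparing symbol table snapshots to find fired triggers, and following the dependency map from those triggers to stale targets. The following is a minimal, illustrative sketch of that idea only, not the code in this commit. The names `make_trigger`, `fired_triggers`, and `find_stale_targets` are hypothetical, snapshots are assumed to be plain dicts from fully qualified names to signature strings, and the class locations mentioned in the docstring are left out for brevity.

```python
from collections import deque
from typing import Dict, Set


def make_trigger(name: str) -> str:
    """Hypothetical helper: wrap a fully qualified name as '<name>'."""
    return f"<{name}>"


def fired_triggers(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Compare two snapshots (name -> signature) and return fired triggers.

    A trigger fires when a name was added, removed, or changed signature.
    """
    fired = set()
    for name in old.keys() | new.keys():
        if old.get(name) != new.get(name):
            fired.add(make_trigger(name))
    return fired


def find_stale_targets(fired: Set[str], deps: Dict[str, Set[str]]) -> Set[str]:
    """Follow the dependency map from fired triggers to stale targets.

    Assumption: a location starting with '<' is another trigger and is
    followed transitively; anything else is treated as a target.
    """
    stale: Set[str] = set()
    seen = set(fired)
    queue = deque(fired)
    while queue:
        trigger = queue.popleft()
        for location in deps.get(trigger, ()):
            if location.startswith("<"):
                if location not in seen:
                    seen.add(location)
                    queue.append(location)
            else:
                stale.add(location)
    return stale


# Made-up example data: only the signature of mod.f changes.
old_snapshot = {"mod.f": "def (int) -> int", "mod.g": "def () -> str"}
new_snapshot = {"mod.f": "def (str) -> int", "mod.g": "def () -> str"}
deps = {
    "<mod.f>": {"other.caller", "<other.wrapper>"},
    "<other.wrapper>": {"third.main"},
    "<mod.g>": {"other.unrelated"},
}

fired = fired_triggers(old_snapshot, new_snapshot)
stale = find_stale_targets(fired, deps)
```

In this sketch only `other.caller` and `third.main` end up stale, while `other.unrelated` is left alone because `<mod.g>` never fires. The update described in the docstring goes further: it reprocesses the stale targets, takes new snapshots, and repeats until no triggers fire.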
|