Commit b108ce2

Updates
1 parent 4ba59b5 commit b108ce2

2 files changed

+151
-0
lines changed

ReadyToInterpret.md

Lines changed: 151 additions & 0 deletions
# Investigation on making interpreter work with ReadyToRun

## Status

This document is preliminary: it covers only the most basic case and does not yet cover very common ones (e.g. virtual method calls).

I am doing an overnight hackathon trying to get something working, not designing something for the long term yet.

## Goals

- Figure out how the relevant parts of ReadyToRun work.
- Figure out how to hack it so that we can get into the CoreCLR interpreter.

## Non Goals

- Deliver a working prototype (I just don't have the time, and the CoreCLR interpreter is not the right target).
- Come up with an optimal design (same reason: I just don't have the time).

## High-level observations

We already have a mechanism to call an arbitrary managed method from the native runtime, and this mechanism can be used to call a ReadyToRun compiled method. So in general, interpreter -> ReadyToRun is not an issue.

The key challenge is getting ReadyToRun code to call into the interpreter.

## Understanding what happens when we are about to make an outgoing call from ReadyToRun

When ReadyToRun code makes a call to a static function, it:

- places the arguments in registers and on the stack as per the calling convention,
- calls into a redirection cell, and
- gets into the runtime.

Inside the runtime, I eventually get to `ExternalMethodFixupWorker`, defined in `prestub.cpp`.

At this point, I have:
- `transitionBlock` - no idea what it is
- `pIndirection` - the address for storing the callee address
- `sectionIndex` - a number, pushed by the thunk, and
- `pModule` - a pointer to the module containing the call instruction

Since the call comes from a ReadyToRun image, `pModule` must have a ReadyToRun image.

We can easily calculate the RVA of `pIndirection`.

If the call provided the `sectionIndex`, we just use it; otherwise we can still calculate the section index from the RVA.

The calculation is a sequential scan of the import sections: each section describes its own address range, so we can check whether the RVA falls inside it.

The import section has an array of signatures. Using the offset (the RVA minus the beginning RVA of the section), we can index into the signature array to find the signature.

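The scan and the signature lookup described above can be sketched as follows. This is a simplified, self-contained model and not the runtime's code: `ImportSection`, `FixupLocation`, and `FindFixup` are hypothetical names, and the fixed `entrySize` with one signature per cell is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of an import section (not the real layout).
struct ImportSection {
    uint32_t beginRva;                 // RVA of the first indirection cell
    uint32_t sizeInBytes;              // total size of all cells in the section
    uint32_t entrySize;                // assumed fixed size of one cell
    std::vector<uint32_t> signatures;  // assumed: one signature RVA per cell
};

// Hypothetical result type: where the cell lives and which signature it maps to.
struct FixupLocation {
    size_t sectionIndex;
    size_t cellIndex;
    uint32_t signatureRva;
};

// Scans the sections sequentially; each section describes its own range.
// Returns false if the RVA is not a valid cell in any section.
bool FindFixup(const std::vector<ImportSection>& sections,
               uint32_t cellRva, FixupLocation* out) {
    for (size_t i = 0; i < sections.size(); ++i) {
        const ImportSection& s = sections[i];
        if (cellRva < s.beginRva || cellRva - s.beginRva >= s.sizeInBytes) {
            continue;  // not inside this section's address range
        }
        uint32_t offset = cellRva - s.beginRva;
        if (s.entrySize == 0 || offset % s.entrySize != 0) {
            return false;  // inside the section but not on a cell boundary
        }
        size_t cellIndex = offset / s.entrySize;
        if (cellIndex >= s.signatures.size()) {
            return false;  // no signature recorded for this cell
        }
        out->sectionIndex = i;
        out->cellIndex = cellIndex;
        out->signatureRva = s.signatures[cellIndex];
        return true;
    }
    return false;
}
```

When the thunk already supplies `sectionIndex`, the loop is unnecessary and only the offset arithmetic inside the matching section remains.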
The signature is then parsed into a `MethodDesc`, from which method preparation continues as usual.

Last but not least, the `pIndirection` is eventually patched with the resulting entry point, and the call proceeds using the arguments already on the stack and in the restored registers.

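The patch-then-proceed behavior can be illustrated with a small model. This is only a sketch under stated assumptions: `g_indirectionCell`, `FixupThunk`, `ResolveCallee`, and `RealCallee` are hypothetical names, and a plain C++ function pointer stands in for the real cell and thunk machinery.

```cpp
#include <atomic>
#include <cassert>

using Target = int (*)(int, int);

int FixupThunk(int a, int b);

// Hypothetical indirection cell: it initially routes calls into the "runtime".
std::atomic<Target> g_indirectionCell{&FixupThunk};

// Counts how often resolution runs, to show that it happens only once.
int g_resolveCount = 0;

// The method the call site actually wants to reach.
int RealCallee(int a, int b) { return a + b; }

// Stands in for signature parsing and method preparation.
Target ResolveCallee() {
    ++g_resolveCount;
    return &RealCallee;
}

// First call lands here: resolve, patch the cell, then forward the original arguments.
int FixupThunk(int a, int b) {
    Target callee = ResolveCallee();
    g_indirectionCell.store(callee, std::memory_order_release);
    return callee(a, b);
}

// The call site always calls through the cell.
int CallThroughCell(int a, int b) {
    return g_indirectionCell.load(std::memory_order_acquire)(a, b);
}
```

After the first call, later calls through the cell reach `RealCallee` directly and never re-enter the resolution path.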
## What the potential hack looks like

We keep everything the same up to the method preparation part.

We know it is possible to produce an `InterpreterMethodInfo` from a `MethodDesc` when the system is ready to JIT, so we should be able to produce the `InterpreterMethodInfo` there.

The arguments are already in registers, but we can't dynamically generate the `InterpreterStub`; the only reasonable option is to pre-generate the stubs in the ReadyToRun image itself.

> A stub per signature is necessary because each signature needs a different way to populate the arguments (and the interpreter method info). On the other hand, a stub per signature is sufficient because if we knew how to prepare the registers to begin with, we must know exactly what steps are needed to put them into a format that `InterpretMethodBody` likes. As people have pointed out, this is going to be a large volume of stubs, and it is by no means optimal.

The stub generation code can 'mostly' be exactly the same as `GenerateInterpreterStub`, with two twists:

- We need to use an indirection to get to the `InterpreterMethodInfo` object. That involves having a slot that the `InterpreterMethodInfo` construction process needs to patch.
- What if the call signature involves a struct of unknown size (e.g. a method in A.dll takes a struct defined in B.dll, where B.dll is not in the same version bubble)?

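To make the first twist concrete, here is a minimal sketch of one pre-generated stub that reaches its `InterpreterMethodInfo` through a patchable slot. Everything here is a stand-in for illustration: the `InterpreterMethodInfo` struct, `InterpretMethodBodyModel`, `g_methodInfoSlot`, `Stub_Int32_Int32`, and `AddInterpreted` are hypothetical and do not mirror the runtime's real types or signatures. A real stub would be machine code that spills registers rather than a C++ function.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's InterpreterMethodInfo.
struct InterpreterMethodInfo {
    int64_t (*body)(const int64_t* args, int argCount);
};

// Hypothetical stand-in for the interpreter entry point: it takes the
// arguments in one uniform array instead of in registers.
int64_t InterpretMethodBodyModel(const InterpreterMethodInfo* info,
                                 const int64_t* args, int argCount) {
    return info->body(args, argCount);
}

// Slot that would live next to the stub in the image. It is empty at
// compile time and patched once the InterpreterMethodInfo exists.
std::atomic<const InterpreterMethodInfo*> g_methodInfoSlot{nullptr};

// Stub for the signature int32(int32, int32): it knows how the arguments
// arrive, so it knows how to repack them for the interpreter.
int32_t Stub_Int32_Int32(int32_t a, int32_t b) {
    const InterpreterMethodInfo* info =
        g_methodInfoSlot.load(std::memory_order_acquire);
    int64_t args[2] = {a, b};
    return static_cast<int32_t>(InterpretMethodBodyModel(info, args, 2));
}

// Example "interpreted" method body used to exercise the stub.
int64_t AddInterpreted(const int64_t* args, int argCount) {
    return argCount == 2 ? args[0] + args[1] : 0;
}
```

The stub body never names a concrete `InterpreterMethodInfo`; it only names the slot, which is why the stub can be produced ahead of time while the object is created at runtime.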
Next, we need a data structure that gets us to the address of the stub as well as the address of the cell storing the `InterpreterMethodInfo`. What we have is `pIndirection` and therefore the `MethodDesc`.

To do that, we might want to mimic how the runtime locates ReadyToRun code.

Here is a stack showing what the ReadyToRun code discovery looks like:

```
coreclr!ReadyToRunInfo::GetEntryPoint+0x238 [C:\dev\runtime\src\coreclr\vm\readytoruninfo.cpp @ 1148]
coreclr!MethodDesc::GetPrecompiledR2RCode+0x24e [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 507]
coreclr!MethodDesc::GetPrecompiledCode+0x30 [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 443]
coreclr!MethodDesc::PrepareILBasedCode+0x5e6 [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 412]
coreclr!MethodDesc::PrepareCode+0x20f [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 319]
coreclr!CodeVersionManager::PublishVersionableCodeIfNecessary+0x5a1 [C:\dev\runtime\src\coreclr\vm\codeversion.cpp @ 1739]
coreclr!MethodDesc::DoPrestub+0x72d [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 2869]
coreclr!PreStubWorker+0x46d [C:\dev\runtime\src\coreclr\vm\prestub.cpp @ 2698]
coreclr!ThePreStub+0x55 [C:\dev\runtime\src\coreclr\vm\amd64\ThePreStubAMD64.asm @ 21]
coreclr!CallDescrWorkerInternal+0x83 [C:\dev\runtime\src\coreclr\vm\amd64\CallDescrWorkerAMD64.asm @ 74]
coreclr!CallDescrWorkerWithHandler+0x12b [C:\dev\runtime\src\coreclr\vm\callhelpers.cpp @ 66]
coreclr!MethodDescCallSite::CallTargetWorker+0xb79 [C:\dev\runtime\src\coreclr\vm\callhelpers.cpp @ 595]
coreclr!MethodDescCallSite::Call+0x24 [C:\dev\runtime\src\coreclr\vm\callhelpers.h @ 465]
```

The interesting part, of course, is how `GetEntryPoint` works. It turns out it is just a `NativeHashtable` lookup given a `VersionResilientMethodHashCode`, so we should be able to encode the same kind of hash table for the stubs as well.

Note that `GetEntryPoint` has the concept of fixups; maybe we can use the same concept to patch the slot for the `InterpreterMethodInfo`.

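A stub table along those lines could look roughly like the sketch below. It is an in-memory model only: `StubEntry`, `StubTable`, and the string `methodKey` used to tell colliding entries apart are hypothetical, and an `std::unordered_multimap` stands in for the on-disk `NativeHashtable` format.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical table entry: where the stub is, and where its patchable slot is.
struct StubEntry {
    std::string methodKey;  // assumed fuller identity, used to resolve hash collisions
    uint32_t stubRva;       // address of the pre-generated stub
    uint32_t slotRva;       // address of the cell holding the InterpreterMethodInfo pointer
};

class StubTable {
public:
    // Several methods may share a hash, so the table keeps every entry per hash.
    void Add(uint32_t methodHash, StubEntry entry) {
        table_.emplace(methodHash, std::move(entry));
    }

    // Walks all entries with the given hash and returns the one whose key matches.
    const StubEntry* Lookup(uint32_t methodHash, const std::string& methodKey) const {
        auto range = table_.equal_range(methodHash);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second.methodKey == methodKey) {
                return &it->second;
            }
        }
        return nullptr;
    }

private:
    std::unordered_multimap<uint32_t, StubEntry> table_;
};
```

Storing `slotRva` beside `stubRva` gives the runtime one lookup that yields both the code to call and the cell to patch.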
## How to implement the potential hack

From the compiler side:

### When do we need to generate the stubs?
When the ReadyToRun compiler generates a call, the JIT calls back into crossgen2 to create a slot for it. At that point, we should know what we need, and we can make sure a stub is available for it by working with the dependency tracking engine.

### Actually generate the stubs

The stub generation should mostly work the same as in `GenerateInterpreterStub` today, with a couple of twists:
- We don't need to generate the `InterpreterMethodInfo`; that work is left until runtime.
- If the stub involves types of unknown size, we need to generate the right stub code for it (e.g. A.dll calls a function that involves a struct defined in `B.dll`, where they are not in the same version bubble).
- The stub needs an instance of `InterpreterMethodInfo`. It cannot be hardcoded; the pointer to it must be read from somewhere else.
- Whenever we generate a stub, we need to store it somewhere so that we can follow the same logic as in `MethodEntryPointTableNode`.

From the runtime side:

### Locating the stub
- When we reach `ExternalMethodFixupWorker`, we need to use the table to get back to the generated stubs.

### Preparing the data
- We need to create the `InterpreterMethodInfo` and make sure the stub code is able to read it.

## Alternative designs
Following the thought on the earlier prototype for tagged pointers, we could envision a solution that ditches all those stubs, e.g.

1. Changing the calling convention for every method so that it is the same as what the interpreter method likes.

Pros:
- Consistency, easy to understand
- No need for stubs, efficient for interpreter calls

Cons:
- Lots of work to have a different calling convention
- Inefficient for non-interpreter calls

2. Changing the call site so that it detects tagged pointers and calls differently

Pros:
- Similar to what we have in the tagged pointer prototype
- No need for stubs, efficient for interpreter calls

Cons:
- Every call involves dual call code

3. The approach described in this document (i.e. using stubs)

Pros:
- Probably the cheapest to implement

Cons:
- Lots of stubs
- Inefficient for interpreter calls (involves stack rewriting)
- Unclear how it could work with virtual or interface calls

I haven't put more thought into these alternative solutions, but I am aware they exist.
File renamed without changes.
