Commit 4ba59b5

Tagged Pointer Prototype
1 parent eb65806 commit 4ba59b5

8 files changed, with 251 additions and 5 deletions.

# Tagged Function Prototype.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
# Tagged Function Prototype

The goal of this prototype is to investigate whether the tagged function concept is practically feasible in the CoreCLR code base.

## How does the CoreCLR interpreter work today?

This section covers a small portion of how the interpreter integrates with the runtime. It does NOT attempt to explain the full interpreter execution process.

The interpreter works by presenting itself as jitted code. As such, it needs to:

1. Convert the incoming arguments from the registers/stack into something C++ understands,
2. Pass control to `InterpretMethodBody`, which interprets the byte code,
3. Call any callees as if they were jitted code as well, and
4. Put the results back on the stack as if they were produced by jitted code.

Step 1 requires specially generated code. Right now, that code is produced by `GenerateInterpreterStub`. It is meant to be a tiny routine that takes the arguments from the stack and rewrites the stack so that the values can be consumed by C++.

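As a rough illustration of what "something C++ understands" could look like, the sketch below packs arguments into a flat byte buffer, one 8-byte slot each, and lets an ordinary C++ routine read them back. The slot layout and the names `StoreArg`, `LoadArg`, and `InterpretAdd` are assumptions made up for this example. The real stub has to follow the platform calling convention, which this sketch ignores entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout for this example only: every argument gets one 8-byte slot.
constexpr std::size_t kSlotSize = 8;

// Write one argument into its slot (the job a stub would do for incoming values).
template <typename T>
void StoreArg(std::uint8_t* args, std::size_t index, T value) {
    static_assert(sizeof(T) <= kSlotSize, "argument must fit in one slot");
    assert(args != nullptr);
    std::memset(args + index * kSlotSize, 0, kSlotSize);
    std::memcpy(args + index * kSlotSize, &value, sizeof(T));
}

// Read one argument back out of its slot from plain C++.
template <typename T>
T LoadArg(const std::uint8_t* args, std::size_t index) {
    static_assert(sizeof(T) <= kSlotSize, "argument must fit in one slot");
    assert(args != nullptr);
    T value;
    std::memcpy(&value, args + index * kSlotSize, sizeof(T));
    return value;
}

// Hypothetical stand-in for an interpreted method taking (int32, int64).
std::int64_t InterpretAdd(const std::uint8_t* args) {
    return static_cast<std::int64_t>(LoadArg<std::int32_t>(args, 0)) +
           LoadArg<std::int64_t>(args, 1);
}
```

A caller would fill a `std::uint8_t` buffer of `2 * kSlotSize` bytes with `StoreArg` and hand it to `InterpretAdd`. Using `memcpy` rather than pointer casts keeps the reads free of alignment and aliasing problems.
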
## What do we want?

We want to get rid of the concept of the interpreter stub and instead have the caller call the actual `InterpretMethodBody` directly.

`InterpretMethodBody` requires an `InterpreterMethodInfo` object, which is basically a representation of the method that gives easy access to its signature and its byte code.

So the problem is reduced to:

1. Identify a caller that is currently calling using the standard calling convention,
2. Give that caller access to an `InterpreterMethodInfo` object, and
3. Make it call `InterpretMethodBody` instead.

## Wrong attempts

I tried 3 different approaches and only the last one succeeded. The wrong attempts are documented here so that we don't try the same ideas again.

### Idea 1

- Make `GenerateInterpreterStub` return a tagged pointer instead.

This approach failed because `GenerateInterpreterStub` is called as part of `ThePreStub`. `ThePreStub` works by leaving the call arguments on the stack, so by the time we get there the incoming call arguments are already on the stack, and we would still need at least some code to get them back.

### Idea 2

Now we know we must call `InterpretMethodBody` earlier than `ThePreStub`, which means `ThePreStub` must be replaced by something else. In fact, how does `ThePreStub` know which `MethodDesc` to interpret? Upon investigation, I learned about the concept of `Precode`.

Basically, every method has a `Precode`, which is a simple `jmp` instruction that goes somewhere else. This is the first instruction that gets executed. To begin with, that instruction jumps to `ThePreStub`, and that instruction is itself generated code. Given the precode, we can get to the `MethodDesc`.

What that means is that we need to get rid of the code generation during `Precode` generation, which means we will no longer have the `jmp` instruction. Instead, we will put something there that allows us to get to the `InterpreterMethodInfo`.

A reasonable choice is to put a pointer to the `InterpreterMethodInfo` object right there. We tag the least significant bit of it so that we know it is not a normal function entry point.

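A minimal sketch of the tagging idea, assuming the object is at least 2-byte aligned so that its low bit is always zero. `DemoMethodInfo`, `TagPointer`, `IsTagged`, and `Untag` are names invented for this example and do not exist in the runtime.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime's InterpreterMethodInfo.
struct alignas(8) DemoMethodInfo {
    int ilSize;
};

constexpr std::uintptr_t kInterpreterTag = 0x1;

// Set the low bit so the value can be told apart from a normal entry point.
inline std::uintptr_t TagPointer(const DemoMethodInfo* p) {
    auto raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & kInterpreterTag) == 0 && "pointer must have a free low bit");
    return raw | kInterpreterTag;
}

// True when the slot holds a tagged pointer rather than a code address.
inline bool IsTagged(std::uintptr_t slot) {
    return (slot & kInterpreterTag) != 0;
}

// Clear the tag bit to recover the original pointer.
inline DemoMethodInfo* Untag(std::uintptr_t slot) {
    return reinterpret_cast<DemoMethodInfo*>(slot & ~kInterpreterTag);
}
```

The scheme only works if real entry points never have the low bit set, which is an assumption of this sketch rather than something the example can check.
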
To be more concrete, the precode is generated during `MethodDesc::EnsureTemporaryEntryPointCore`. We would modify that code so that it translates the `MethodDesc` into an `InterpreterMethodInfo`, tags it, and puts it into the method table.

The reason why this approach fails is more subtle. It turns out that the `InterpreterMethodInfo` construction process leverages the code that supports the JIT to extract the IL, and that code assumes the method tables are already properly constructed, which is not true at the time `MethodDesc::EnsureTemporaryEntryPointCore` is called. So we must delay the construction of the `InterpreterMethodInfo` object.

## Working approach

### Idea 3

To get around the cyclic dependency issue above, I tagged the `MethodDesc` pointer instead. Only when we are about to call the function do we construct the `InterpreterMethodInfo`. This worked.

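The lazy scheme can be sketched in isolation as follows. This is a simplified model, not the runtime code: `DemoMethodDesc`, `DemoInterpreterMethodInfo`, `Convert`, `MakeInitialSlot`, and `Resolve` are hypothetical names, while the tag values (`0x3` for a tagged `MethodDesc`, `0x1` for a tagged `InterpreterMethodInfo`) follow the ones used in this branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; the real runtime types are far richer.
struct alignas(8) DemoMethodDesc { int token; };
struct alignas(8) DemoInterpreterMethodInfo { int token; };

constexpr std::uintptr_t kTagMask = 0x3;
constexpr std::uintptr_t kMethodDescTag = 0x3;       // slot holds a tagged MethodDesc
constexpr std::uintptr_t kInterpreterInfoTag = 0x1;  // slot holds a tagged InterpreterMethodInfo

int g_conversions = 0;  // counts how often the expensive conversion runs

// Stand-in for the late, expensive conversion. The caller owns the result.
DemoInterpreterMethodInfo* Convert(const DemoMethodDesc* md) {
    ++g_conversions;
    return new DemoInterpreterMethodInfo{md->token};
}

// What the method table slot holds right after the MethodDesc is created.
std::uintptr_t MakeInitialSlot(DemoMethodDesc* md) {
    auto raw = reinterpret_cast<std::uintptr_t>(md);
    assert((raw & kTagMask) == 0 && "pointer must have two free low bits");
    return raw | kMethodDescTag;
}

// Resolve a slot at call time. Returns nullptr for an ordinary (untagged) entry point.
DemoInterpreterMethodInfo* Resolve(std::uintptr_t* slot) {
    if ((*slot & kTagMask) == kMethodDescTag) {
        // First call: convert now and cache the result back into the slot.
        auto* md = reinterpret_cast<DemoMethodDesc*>(*slot & ~kTagMask);
        *slot = reinterpret_cast<std::uintptr_t>(Convert(md)) | kInterpreterInfoTag;
    }
    if ((*slot & kTagMask) == kInterpreterInfoTag) {
        return reinterpret_cast<DemoInterpreterMethodInfo*>(*slot & ~kTagMask);
    }
    return nullptr;
}
```

Calling `Resolve` twice on the same slot runs `Convert` only once, because the first call overwrites the slot with the `0x1`-tagged result. The sketch ignores thread safety, which a real slot update would need to address.
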
The downside of this approach, obviously, is that the pointer in the method table is no longer a valid entry point, so anything else that tries to call it will hit an access violation. This works in a purely interpreted scenario, where the interpreter is the only thing that runs in the process.

If we also want to let other code (e.g. ReadyToRun code) run, that won't work unless we also change the ReadyToRun callers.

The code in this branch demonstrates this concept. It will execute some code under the interpreter (and fail pretty quickly because I haven't implemented everything yet).

### Lowlights

This code still uses dynamic code generation for a couple of things. We are still generating code for the GC write barrier, and we are still generating some glue code for P/Invoke. Lastly, the calls made by the interpreter are not yet converted to use the new calling convention. These seem to be solvable problems.

src/coreclr/minipal/Windows/doublemapping.cpp

Lines changed: 7 additions & 0 deletions
@@ -184,8 +184,15 @@ void* VMToOSInterface::ReserveDoubleMappedMemory(void *mapperHandle, size_t offs
     return pResult;
 }

+extern void andrew_debug();
+
 void *VMToOSInterface::CommitDoubleMappedMemory(void* pStart, size_t size, bool isExecutable)
 {
+    if (isExecutable)
+    {
+        // Whenever this is called, we are generating code.
+        andrew_debug();
+    }
     return VirtualAlloc(pStart, size, MEM_COMMIT, isExecutable ? PAGE_EXECUTE_READ : PAGE_READWRITE);
 }

src/coreclr/vm/callhelpers.cpp

Lines changed: 43 additions & 0 deletions
@@ -18,6 +18,8 @@
 #include "invokeutil.h"
 #include "argdestination.h"

+void andrew_debug();
+
 #if defined(FEATURE_MULTICOREJIT) && defined(_DEBUG)

 // Allow system module for Appx
@@ -33,6 +35,10 @@ void AssertMulticoreJitAllowedModule(PCODE pTarget)

 #endif

+void* ToInterpreterMethodInfo(MethodDesc* pMd);
+
+void CallInterpretMethod(void* interpreterMethodInfo, BYTE* ilArgs);
+
 // For X86, INSTALL_COMPLUS_EXCEPTION_HANDLER grants us sufficient protection to call into
 // managed code.
 //
@@ -60,7 +66,44 @@ void CallDescrWorkerWithHandler(

     BEGIN_CALL_TO_MANAGEDEX(fCriticalCall ? EEToManagedCriticalCall : EEToManagedDefault);

+#ifdef FEATURE_INTERPRETER
+    uint64_t pCallTarget = (uint64_t)(pCallDescrData->pTarget);
+    if ((pCallTarget & 0x3) == 0x3)
+    {
+        //
+        // Experiment comment:
+        // Step 4: When we call a method, we simply redirect it to use CallInterpretMethod instead
+        // of calling a stub and then redirecting back to call InterpretMethod anyway.
+        //
+        // That involves first converting the MethodDesc to an InterpreterMethodInfo. We store
+        // that in the MethodTable slot so we do not repeat the conversion.
+        //
+        MethodDesc* pMD = (MethodDesc*)(pCallTarget & (~0x3));
+
+        if (pMD->IsIL() && !pMD->IsUnboxingStub())
+        {
+            void* translated = ToInterpreterMethodInfo(pMD);
+            *(pMD->GetAddrOfSlot()) = pCallTarget = (PCODE)translated;
+        }
+    }
+    if ((pCallTarget & 0x3) == 0x1)
+    {
+        //
+        // Experiment comment:
+        // Step 5: Now we have an InterpreterMethodInfo, simply call CallInterpretMethod
+        //
+        // The tag bit is stripped to recover the InterpreterMethodInfo pointer, and the IL
+        // arguments are passed straight through from pCallDescrData->pSrc.
+        //
+        CallInterpretMethod((void*)(pCallTarget & (~0x1)), (BYTE*)pCallDescrData->pSrc);
+    }
+    else
+    {
+        CallDescrWorker(pCallDescrData);
+    }
+#else
     CallDescrWorker(pCallDescrData);
+#endif

     END_CALL_TO_MANAGED();
 }

src/coreclr/vm/ecall.cpp

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -88,7 +88,16 @@ void ECall::PopulateManagedStringConstructors()
8888
MethodDesc* pMD = CoreLibBinder::GetMethod((BinderMethodID)(METHOD__STRING__CTORF_FIRST + i));
8989
_ASSERTE(pMD != NULL);
9090

91+
#ifdef FEATURE_INTERPRETER
92+
//
93+
// Experiment Comment:
94+
// Step 3: GetMultiCallableAddrOfCode will eventually try to interpret the entry point
95+
// as a Precode. For that, we simply ignore that work.
96+
//
97+
PCODE pDest = (PCODE)((uint64_t)pMD & 0x03);
98+
#else
9199
PCODE pDest = pMD->GetMultiCallableAddrOfCode();
100+
#endif
92101

93102
ECall::DynamicallyAssignFCallImpl(pDest, ECallCtor_First + i);
94103
}

src/coreclr/vm/interpreter.cpp

Lines changed: 88 additions & 4 deletions
@@ -1692,11 +1692,20 @@ CorJitResult Interpreter::GenerateInterpreterStub(CEEInfo* comp,
 #else
 #error unsupported platform
 #endif
-        stub = sl.Link(SystemDomain::GetGlobalLoaderAllocator()->GetStubHeap());
+        if (false)
+        {
+            stub = sl.Link(SystemDomain::GetGlobalLoaderAllocator()->GetStubHeap());

-        *nativeSizeOfCode = static_cast<ULONG>(stub->GetNumCodeBytes());
-        // TODO: manage reference count of interpreter stubs. Look for examples...
-        *nativeEntry = dac_cast<BYTE*>(stub->GetEntryPoint());
+            *nativeSizeOfCode = static_cast<ULONG>(stub->GetNumCodeBytes());
+            // TODO: manage reference count of interpreter stubs. Look for examples...
+            *nativeEntry = dac_cast<BYTE*>(stub->GetEntryPoint());
+        }
+        else
+        {
+            // Intentionally avoid generating any code - but it is required
+            // for us to have a unique nativeEntry since it will be part of some hash table
+            *nativeEntry = (BYTE*)(((uint64_t)interpMethInfo) | 0x1);
+        }
     }

     // Initialize the arg offset information.
@@ -1773,6 +1782,81 @@ ARG_SLOT Interpreter::ExecuteMethodWrapper(struct InterpreterMethodInfo* interpM
     return retVal;
 }

+COR_ILMETHOD_DECODER* CopiedGetAndVerifyMetadataILHeader(MethodDesc* pMD, PrepareCodeConfig* pConfig, COR_ILMETHOD_DECODER* pDecoderMemory)
+{
+    // STANDARD_VM_CONTRACT;
+    _ASSERTE(pMD != NULL);
+    _ASSERTE(!pMD->IsNoMetadata());
+    _ASSERTE(pConfig != NULL);
+    _ASSERTE(pDecoderMemory != NULL);
+
+    COR_ILMETHOD_DECODER* pHeader = NULL;
+    COR_ILMETHOD* ilHeader = pConfig->GetILHeader();
+    if (ilHeader == NULL)
+        return NULL;
+
+    COR_ILMETHOD_DECODER::DecoderStatus status = COR_ILMETHOD_DECODER::FORMAT_ERROR;
+    {
+        // Decoder ctor can AV on a malformed method header
+        AVInRuntimeImplOkayHolder AVOkay;
+        pHeader = new (pDecoderMemory) COR_ILMETHOD_DECODER(ilHeader, pMD->GetMDImport(), &status);
+    }
+
+    if (status == COR_ILMETHOD_DECODER::FORMAT_ERROR)
+        COMPlusThrowHR(COR_E_BADIMAGEFORMAT, BFA_BAD_IL);
+
+    return pHeader;
+}
+
+COR_ILMETHOD_DECODER* CopiedGetAndVerifyILHeader(MethodDesc* pMD, PrepareCodeConfig* pConfig, COR_ILMETHOD_DECODER* pIlDecoderMemory)
+{
+    // STANDARD_VM_CONTRACT;
+    _ASSERTE(pMD != NULL);
+    if (pMD->IsIL())
+    {
+        return CopiedGetAndVerifyMetadataILHeader(pMD, pConfig, pIlDecoderMemory);
+    }
+    else if (pMD->IsILStub())
+    {
+        ILStubResolver* pResolver = pMD->AsDynamicMethodDesc()->GetILStubResolver();
+        return pResolver->GetILHeader();
+    }
+
+    _ASSERTE(pMD->IsNoMetadata());
+    return NULL;
+}
+
+void ToMethodInfo(MethodDesc* ftn, COR_ILMETHOD_DECODER* ILHeader, CORINFO_METHOD_INFO* pMethodInfo);
+
+void* ToInterpreterMethodInfo(MethodDesc* pMd)
+{
+    // Given a MethodDesc, convert it into a tagged InterpreterMethodInfo
+    GCX_PREEMP();
+    NativeCodeVersion activeCodeVersion(pMd);
+    PrepareCodeConfigBuffer pccb(activeCodeVersion);
+    PrepareCodeConfig* pConfig = pccb.GetConfig();
+    COR_ILMETHOD_DECODER ilDecoderTemp;
+    COR_ILMETHOD_DECODER* pilHeader = CopiedGetAndVerifyILHeader(pMd, pConfig, &ilDecoderTemp);
+    CORINFO_METHOD_INFO methodInfo;
+    ToMethodInfo(pMd, pilHeader, &methodInfo);
+    CEEInfo* jitInfo = new CEEInfo(pMd, true);
+
+    InterpreterMethodInfo* interpreterMethodInfo;
+    BYTE* unusedEntry;
+    uint32_t unusedSize;
+
+    // The method is modified so that it won't generate code anymore
+    Interpreter::GenerateInterpreterStub(jitInfo, &methodInfo, &unusedEntry, &unusedSize, &interpreterMethodInfo, false);
+    return (void*)(((uint64_t)interpreterMethodInfo) | 0x1);
+}
+
+void CallInterpretMethod(void* interpreterMethodInfo, BYTE* ilArgs)
+{
+    struct InterpreterMethodInfo* interpMethInfo = (struct InterpreterMethodInfo*)interpreterMethodInfo;
+    // TODO: return the result to the caller!
+    Interpreter::InterpretMethodBody(interpMethInfo, true, ilArgs, nullptr);
+}
+
 // TODO: Add GSCookie checks

 // static

src/coreclr/vm/interpreter.h

Lines changed: 2 additions & 0 deletions
@@ -746,8 +746,10 @@ class Interpreter
     friend float F_CALL_CONV InterpretMethodFloat(InterpreterMethodInfo* methInfo, BYTE* ilArgs, void* stubContext);
     friend double F_CALL_CONV InterpretMethodDouble(InterpreterMethodInfo* methInfo, BYTE* ilArgs, void* stubContext);

+public:
     // This will be inlined into the bodies of the methods above
     static inline ARG_SLOT InterpretMethodBody(InterpreterMethodInfo* interpMethInfo, bool directCall, BYTE* ilArgs, void* stubContext);
+private:

     // The local frame size of the method being interpreted.
     static size_t GetFrameSize(InterpreterMethodInfo* interpMethInfo);

src/coreclr/vm/jitinterface.cpp

Lines changed: 5 additions & 0 deletions
@@ -12885,6 +12885,11 @@ void ThrowExceptionForJit(HRESULT res)
 BOOL g_fAllowRel32 = TRUE;
 #endif

+void ToMethodInfo(MethodDesc* ftn, COR_ILMETHOD_DECODER* ILHeader, CORINFO_METHOD_INFO* pMethodInfo)
+{
+    MethodInfoHelperContext cxt{ ftn, ILHeader };
+    getMethodInfoHelper(cxt, pMethodInfo);
+}

 // ********************************************************************
 // README!!

src/coreclr/vm/method.cpp

Lines changed: 28 additions & 1 deletion
@@ -2660,6 +2660,12 @@ MethodDesc* MethodDesc::GetMethodDescFromStubAddr(PCODE addr, BOOL fSpeculative

     // Otherwise this must be some kind of precode
     //
+#ifdef FEATURE_INTERPRETER
+    if ((addr & 0x03) == 0x03)
+    {
+        return (MethodDesc*)((addr) & (~0x03));
+    }
+#endif
     PTR_Precode pPrecode = Precode::GetPrecodeFromEntryPoint(addr, fSpeculative);
     PREFIX_ASSUME(fSpeculative || (pPrecode != NULL));
     if (pPrecode != NULL)
@@ -2758,6 +2764,16 @@ void MethodDesc::EnsureTemporaryEntryPointCore(AllocMemTracker *pamTracker)
     GetMethodDescChunk()->DetermineAndSetIsEligibleForTieredCompilation();
     PTR_PCODE pSlot = GetAddrOfSlot();

+#ifdef FEATURE_INTERPRETER
+    //
+    // Experiment comment:
+    // Step 1: Whenever we create a MethodDesc, we used to generate a precode for it.
+    // This change avoids that and puts a tagged MethodDesc there instead.
+    //
+    IfFailThrow(EnsureCodeDataExists(pamTracker));
+    // TODO: Interlocked?
+    PCODE tempEntryPoint = m_codeData->TemporaryEntryPoint = (PCODE)((uint64_t)this | 0x3);
+#else
     AllocMemTracker amt;
     AllocMemTracker *pamTrackerPrecode = pamTracker != NULL ? pamTracker : &amt;
     Precode* pPrecode = Precode::Allocate(GetPrecodeType(), this, GetLoaderAllocator(), pamTrackerPrecode);
@@ -2769,6 +2785,7 @@ void MethodDesc::EnsureTemporaryEntryPointCore(AllocMemTracker *pamTracker)

     PCODE tempEntryPoint = m_codeData->TemporaryEntryPoint;
     _ASSERTE(tempEntryPoint != (PCODE)NULL);
+#endif

     if (*pSlot == (PCODE)NULL)
     {
@@ -2866,16 +2883,26 @@ Precode* MethodDesc::GetOrCreatePrecode()

 void MethodDesc::MarkPrecodeAsStableEntrypoint()
 {
+#ifndef FEATURE_INTERPRETER
 #if _DEBUG
     PCODE tempEntry = GetTemporaryEntryPointIfExists();
     _ASSERTE(tempEntry != (PCODE)NULL);
     PrecodeType requiredType = GetPrecodeType();
+    //
+    // Experiment comment:
+    // Step 2: Later on, for various reasons, we might want to know more about the function.
+    // Currently, that is done by Precode::GetPrecodeFromEntryPoint, which will obviously not
+    // work because the entry point is no longer a Precode.
+    //
+    // Depending on what we actually want, this needs to be fixed differently. For example,
+    // this code is just doing some assertions, so we can simply ignore it.
+    //
     PrecodeType availableType = Precode::GetPrecodeFromEntryPoint(tempEntry)->GetType();
     _ASSERTE(requiredType == availableType);
 #endif
     _ASSERTE(!HasPrecode());
     _ASSERTE(RequiresStableEntryPoint());
-
+#endif
     InterlockedUpdateFlags3(enum_flag3_HasStableEntryPoint | enum_flag3_HasPrecode, TRUE);
 }
