Add support for string constructors to the interpreter #115914

Open
wants to merge 3 commits into
base: main

Conversation

kg
Member

@kg kg commented May 22, 2025

  • Updates the compiler to identify NEWOBJ opcodes that construct strings or multidimensional arrays, and to generate a specialized newobj opcode for them.
  • Updates the callstub generator to know how to generate the appropriate type of stub for those constructors.
  • Adds a specialized newobj opcode for strings and md arrays (md arrays are not actually implemented in this PR).
  • Modifies InvokeCompiledMethod to accept the code address from outside.

@Copilot Copilot AI review requested due to automatic review settings May 22, 2025 22:26
@kg kg requested review from BrzVlad and janvorli as code owners May 22, 2025 22:26
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for string constructors in the interpreter and updates the associated call stub generation.

  • Added a new test case (TestStringCtor) to verify string constructor functionality
  • Updated interpreter execution logic to correctly handle fcalls for string constructors
  • Adjusted call stub generation to account for special string constructors

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/tests/JIT/interpreter/Interpreter.cs Added test method for string constructor support
src/coreclr/vm/interpexec.cpp Updated interpreter method call to support special string constructors
src/coreclr/vm/callstubgenerator.cpp Modified call stub generation for special constructor handling

@@ -1186,15 +1186,46 @@ void InterpExecMethod(InterpreterFrame *pInterpreterFrame, InterpMethodContextFr
callArgsOffset = ip[2];
methodSlot = ip[3];

OBJECTREF objRef = AllocateObject((MethodTable*)pMethod->pDataItems[ip[4]]);
MethodTable *pClass = (MethodTable*)pMethod->pDataItems[ip[4]];
// FIXME: Duplicated code from CALL_INTERP_SLOT

Copilot AI May 22, 2025

Consider refactoring the duplicated code block for handling string constructor invocations to improve maintainability.

Suggested change
// FIXME: Duplicated code from CALL_INTERP_SLOT

@kg
Member Author

kg commented May 23, 2025

Anyone know what's up with this crossdac failure on CI?

  [458/464] Linking CXX static library unwinder\unwinder_dac.lib
  [459/464] Building RC object dlls\mscordbi\CMakeFiles\mscordbi.dir\Native.rc.res
  [460/464] Building C object D:\a\_work\1\s\artifacts\obj\external\libunwind\CMakeFiles\libunwind_xdac.dir\D_\a\_work\1\s\src\native\external\libunwind\src\dwarf\Gparser.c.obj
  FAILED: D:/a/_work/1/s/artifacts/obj/external/libunwind/CMakeFiles/libunwind_xdac.dir/D_/a/_work/1/s/src/native/external/libunwind/src/dwarf/Gparser.c.obj 
  C:\PROGRA~1\MICROS~1\2022\ENTERP~1\VC\Tools\MSVC\1443~1.348\bin\Hostx64\x64\cl.exe  /nologo -DBUILDENV_CHECKED=1 -DCROSS_COMPILE -DDEBUG -DDISABLE_CONTRACTS -DHAVE_CONFIG_H=1 -DHAVE_DL_ITERATE_PHDR=1 -DHAVE_UNW_GET_ACCESSORS -DHAVE___THREAD=0 -DHOST_64BIT -DHOST_AMD64 -DHOST_WINDOWS -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_STRING=\"\" -DTARGET_64BIT -DTARGET_AMD64 -DTARGET_LINUX -DTARGET_UNIX -DUNW_REMOTE_ONLY -DURTBLDENV_FRIENDLY=Checked -D_CRT_DECLARE_NONSTDC_NAMES -D_CRT_SECURE_NO_WARNINGS -D_DBG -D_DEBUG -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_TIME_BITS=64 -D_Thread_local="" -D__amd64__ -D__linux__ -D__x86_64__ -ID:\a\_work\1\s\artifacts\obj\external\libunwind -ID:\a\_work\1\s\src\native\external\libunwind_extras -ID:\a\_work\1\s\src\native -ID:\a\_work\1\s\src\native\inc -ID:\a\_work\1\s\src\native\external\libunwind\include\tdep -ID:\a\_work\1\s\src\native\external\libunwind\include -ID:\a\_work\1\s\artifacts\obj\external\libunwind\include\tdep -ID:\a\_work\1\s\artifacts\obj\external\libunwind\include -ID:\a\_work\1\s\src\native\external\libunwind\include\remote -ID:\a\_work\1\s\src\native\external\libunwind\include\remote\win -ID:\a\_work\1\s\src\native\external\libunwind\src /DWIN32 /D_WINDOWS -std:c11 -MTd /O2 /nologo /W4 /WX /Oi /Oy- /Gm- /Zp8 /Gy /GS /fp:precise /FC /MP /Zm200 /Zc:strictStrings /Zc:wchar_t /Zc:inline /Zc:forScope /wd4065 /wd4100 /wd4127 /wd4131 /wd4189 /wd4200 /wd4201 /wd4206 /wd4239 /wd4245 /wd4291 /wd4310 /wd4324 /wd4366 /wd4456 /wd4457 /wd4458 /wd4459 /wd4463 /wd4505 /wd4702 /wd4706 /wd4733 /wd4815 /wd4838 /wd4918 /wd4960 /wd4961 /wd5105 /wd5205 /we4007 /we4013 /we4102 /we4551 /we4640 /we4806 /we4055 /we4146 /we4242 /we4244 /we4267 /we4302 /we4308 /we4509 /we4510 /we4532 /we4533 /we4610 /we4611 /we4700 /we4701 /we4703 /we4789 /we4995 /we4996 /w34092 /w34121 /w34125 /w34130 /w34132 /w34212 /w34530 /w35038 /w44177 /Zi /ZH:SHA_256 /source-charset:utf-8 /guard:cf /guard:ehcont /permissive- -wd4068 -wd4334 -wd4311 -wd4475 -wd4477 /TC 
/showIncludes /FoD:\a\_work\1\s\artifacts\obj\external\libunwind\CMakeFiles\libunwind_xdac.dir\D_\a\_work\1\s\src\native\external\libunwind\src\dwarf\Gparser.c.obj /FdD:\a\_work\1\s\artifacts\obj\external\libunwind\CMakeFiles\libunwind_xdac.dir\ /FS -c D:\a\_work\1\s\src\native\external\libunwind\src\dwarf\Gparser.c
  D:\a\_work\1\s\src\native\external\libunwind\src\dwarf\Gparser.c(1181): fatal error C1090: PDB API call failed, error code '23': (0x00000005)
  [461/464] Building CXX object dlls\mscordbi\CMakeFiles\mscordbi.dir\mscordbi.cpp.obj
  ninja: build stopped: subcommand failed.
##[error]BUILD: Error: native component build failed. Refer to the build log files for details.

EDIT: Looks like https://developercommunity.visualstudio.com/t/C1090-PDB-API-call-failed-error-code-2/48897

@jkotas
Member

jkotas commented May 23, 2025

Also tracked here: #48070. Build analysis should flag it for you.

// fcall that is basically a static method that returns the new instance.
if (pMD && pClass->HasComponentSize())
{
// The compiler didn't know about this so it reserved space for a this-reference. We need to skip
Member

This sounds like a temporary workaround. The compiler can know about this (by checking CORINFO_FLG_VAROBJSIZE flag). What needs to happen to move this logic to the compiler?

Member Author

m_compHnd didn't appear to expose the things I needed to determine this. I can take another look.

Member

That's the flag CORINFO_FLG_VAROBJSIZE that @jkotas mentioned offline yesterday:

if (opcode == CEE_NEWOBJ)
{
    if (clsFlags & CORINFO_FLG_VAROBJSIZE)
    {
        assert(!(clsFlags & CORINFO_FLG_ARRAY)); // arrays handled separately
        // This is a 'new' of a variable sized object, where
        // the constructor is to return the object. In this case
        // the constructor claims to return VOID but we know it
        // actually returns the new object
        assert(callRetTyp == TYP_VOID);
        callRetTyp = TYP_REF;
        call->gtType = TYP_REF;
        impSpillSpecialSideEff();
        impPushOnStack(call, typeInfo(clsHnd));
    }
    else

You can get it from getClassAttribs, or from CORINFO_CALL_INFO::classFlags after calling getCallInfo.
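
As an illustration only, the flag check quoted above can be reduced to a small self-contained sketch. The flag constants below are stand-in values (the real ones are defined in the runtime's JIT interface headers), and `EffectiveCtorReturnType` and `RetType` are hypothetical names, not part of the interpreter compiler:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in values for illustration; NOT the real CORINFO_FLG_* constants.
constexpr uint32_t kFlgVarObjSize = 0x1; // stands in for CORINFO_FLG_VAROBJSIZE
constexpr uint32_t kFlgArray      = 0x2; // stands in for CORINFO_FLG_ARRAY

enum class RetType { Void, Ref };

// Hypothetical helper mirroring the quoted JIT logic: the constructor of a
// variable-sized, non-array class (such as string) is declared as returning
// void, but it actually returns the newly created object.
RetType EffectiveCtorReturnType(uint32_t clsFlags, RetType declared)
{
    if ((clsFlags & kFlgVarObjSize) && !(clsFlags & kFlgArray))
    {
        assert(declared == RetType::Void); // ctor signatures always claim void
        return RetType::Ref;
    }
    return declared; // regular classes and arrays are handled elsewhere
}
```

With a check like this in the compiler, no this-reference slot would need to be reserved for string constructors, which is the workaround being questioned in this thread.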

Member Author

Fixing this on the compiler side complicates it a lot. Could we just keep allocating the extra slot and ignore it?

Member Author

Updated. 0298d2e
Not sure how I feel about it.

callArgsOffset = ip[2];
methodSlot = ip[3];

// FIXME: Duplicated code from CALL_INTERP_SLOT
Member

I don't really understand why this opcode is not a normal call like the others. Could we avoid having this code duplication here?

Member Author

generating the call and getting everything to work right given the way we do the tagged method pointer and then cache the call target looked like a pain.

right now we rely on being able to cache the call target and then use CodeInfo to figure out whether it is interp code or jit code. the helpers for this are a third category so we would need another different tag for them or would need to generate a generic helper opcode.

if we're not ok with a special opcode for arrays and strings i can figure something out, but i don't know how long it will take
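
For readers unfamiliar with the pattern being described, here is a minimal sketch of a tagged slot that caches its resolved call target. It assumes the tag lives in the low bit of the slot; `kMethodHandleTag`, `FakeMethod`, and `ResolveAndCache` are hypothetical stand-ins, not the interpreter's actual definitions:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tag bit; the real INTERP_METHOD_HANDLE_TAG is defined by the
// interpreter and may differ.
constexpr size_t kMethodHandleTag = 0x1;

// Stand-in for MethodDesc: all this sketch needs is a code address.
struct FakeMethod
{
    void* code;
};

// If the slot still holds a tagged method handle, strip the tag, look up the
// code address, and overwrite the slot so later calls skip the lookup.
// Relies on method and code pointers being at least 2-byte aligned.
void* ResolveAndCache(size_t& slot)
{
    if (slot & kMethodHandleTag)
    {
        auto* method = reinterpret_cast<FakeMethod*>(slot & ~kMethodHandleTag);
        assert(method != nullptr);
        slot = reinterpret_cast<size_t>(method->code); // cache the call target
    }
    return reinterpret_cast<void*>(slot);
}
```

A single tag bit distinguishes only "unresolved handle" from "resolved address", which is why a third category of target (the helpers mentioned above) would need either another tag or a separate opcode.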

Member Author

md arrays make it worse because you have to do a weird thing to adapt the array of dimensions to its actual call signature, so that one would need additional setup opcodes before a regular call. you can see that in the mdarray draft, look for an array called dims
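
A rough sketch of the kind of adaptation this refers to, under stated assumptions: each dimension argument is assumed to occupy its own fixed-size interpreter stack slot, while the allocation helper is assumed to want a contiguous int32 array. `kSlotSize` and `GatherDims` are hypothetical, and the mdarray draft may do this differently:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed size of one interpreter stack slot, for illustration only.
constexpr size_t kSlotSize = 8;

// Copy `rank` int32 dimension arguments, one per stack slot, into a
// contiguous array that could be handed to an allocation helper.
std::vector<int32_t> GatherDims(const int8_t* args, size_t rank)
{
    std::vector<int32_t> dims(rank);
    for (size_t i = 0; i < rank; i++)
    {
        // memcpy avoids alignment and aliasing problems when reading the slot.
        std::memcpy(&dims[i], args + i * kSlotSize, sizeof(int32_t));
    }
    return dims;
}
```

This extra packing step is what a regular call opcode has no room for without additional setup opcodes in front of it.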

// FIXME: Duplicated code from CALL_INTERP_SLOT
size_t targetMethod = (size_t)pMethod->pDataItems[methodSlot];
MethodDesc *pMD = nullptr;
if (targetMethod & INTERP_METHOD_HANDLE_TAG)
Member

Should this be an assert instead? If this is not set, pMD is null and the TryGetMultiCallableAddrOfCode call below will crash.

kg added 3 commits May 29, 2025 13:08
Update src/coreclr/vm/callstubgenerator.cpp

Co-authored-by: Aaron Robinson

Update isSpecialConstructor to match other parts of the runtime

Migrate some string/array ctor smarts from interpexec to compiler

Separate newobj opcode for string and mdarray
@kg kg force-pushed the interp-stringctor branch from e1e5efd to c450924 Compare May 29, 2025 22:55
@@ -1213,6 +1213,28 @@ void InterpExecMethod(InterpreterFrame *pInterpreterFrame, InterpMethodContextFr
ip += 5;
goto CALL_INTERP_SLOT;
}
case INTOP_NEWOBJ_VAROBJSIZE:
Member

What's the difference between this and INTOP_CALL?

In other words - if the JIT produced a regular INTOP_CALL targetMethod instead of this special opcode, where would it break?

Member Author

Right now, nowhere, but mdarrays are going to use this opcode and have special behavior. I'm open to generating call for this and reserving the opcode only for mdarray.

Member

generating call for this and reserving the opcode only for mdarray.

I think it would make more sense.

Member Author

I'll test generating CALL and see if anything breaks.

5 participants