Fix test execution jobs in the coreclr-release-outerloop-nightly pipeline #85278
Tagging subscribers to this area: @hoyosjs
How much does this pipeline get used? It's been broken and never looked at, so is it worth spending the resources? How does it differ from the outer loop pipeline?
@hoyosjs - I have raised exactly the same concerns on the original issue thread; I believe we should consider consolidating it with the
Having said that, I think the additional testing does provide some value; if I understand it correctly, it runs the tests in release mode in contrast to
LGTM
Just FYI, I have rerun the failed jobs because apparently many of them failed due to transient network failures; the updated results should be available in an hour or so.
I did see some related to crossgen2 and Avx512 which seem to need investigation.
So did I. One of the annoying limitations of AzDO is the inability to rerun just a single job (or I just don't know how to do it).
OK, most of the runs have finished and now I believe the failures are "real". One stable bug is the "coreroot_determinism" failure; I'll look into that as I suspect I might have an idea what is going on there. For some of the AVX 512 failures, I'm seeing the relatively uncommon exit code C000001D, meaning invalid instruction, so I'm wondering whether we're sure the Helix HW we're using supports these new processor features.
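For readers triaging similar logs: the exit code mentioned above is a Windows NTSTATUS value, and test harnesses sometimes print it as a signed 32-bit integer instead of hex. The following is a minimal sketch of normalizing and naming such a code. The helper name `describe_exit_code` and the small lookup table are hypothetical and cover only a few well-known values; they are not part of the runtime test infrastructure.

```python
# Hypothetical helper for illustration; the table lists only a few
# well-known NTSTATUS values and is not exhaustive.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_code(exit_code: int) -> str:
    """Format a process exit code as unsigned 32-bit hex plus a name if known."""
    # Mask to 32 bits so that signed and unsigned spellings compare equal.
    unsigned = exit_code & 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


# The same code as it may appear in signed and unsigned form.
print(describe_exit_code(-1073741795))
print(describe_exit_code(0xC000001D))
```

Both calls describe the same status, which helps when matching a Helix log entry against a locally observed crash.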
It's more likely that there is some encoding bug for a new instruction and outerloop hits the right edge case to trigger it. Is there a repro?
@tannergooding - I am able to repro the issue locally when running the HW intrinsic tests in Crossgen2 mode using basically the following command sequence:

    build clr+libs -c Release
    src\tests\build release test JIT\HardwareIntrinsics\HardwareIntrinsics_X86_Avx512_r.csproj
    src\tests\run release crossgen2

In the release mode I see a failure that looks similar to the one in the lab:

    17:07:08.042 Failed test: _Avx512DQ_VL_Vector128_r::JIT.HardwareIntrinsics.X86._Avx512DQ_VL_Vector128.Program.BroadcastPairScalarToVector128UInt32()
    17:07:08.053 Running test: _Avx512DQ_VL_Vector128_r::JIT.HardwareIntrinsics.X86._Avx512DQ_VL_Vector128.Program.BroadcastPairScalarToVector128Single()
    Beginning scenario: RunBasicScenario_UnsafeRead
    Fatal error. System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
       at JIT.HardwareIntrinsics.X86._Avx512DQ_VL_Vector128.SimpleUnaryOpTest__BroadcastPairScalarToVector128Single.RunBasicScenario_UnsafeRead()
       at JIT.HardwareIntrinsics.X86._Avx512DQ_VL_Vector128.Program.BroadcastPairScalarToVector128Single()
       at Program.<$>g__TestExecutor3|0_2(System.IO.StreamWriter, System.IO.StreamWriter, <>c__DisplayClass0_0 ByRef)
       at Program.$(System.String[])

Sadly the call stack involves about ten runtime JITted methods without symbol information so I haven't yet figured out how to drill deeper into the failure.
In debug mode, I'm hitting a JIT assertion instead:

    build clr -c Debug
    src\tests\build test JIT\HardwareIntrinsics\HardwareIntrinsics_X86_Avx512_r.csproj
    src\tests\run crossgen2

    16:56:42.023 Running test: _Avx512BW_r::JIT.HardwareIntrinsics.X86._Avx512BW.Program.ShiftLeftLogicalVariableInt16()
    Assert failure(PID 32348 [0x00007e5c], Thread: 13636 [0x3544]): Assertion failed 'inputSize == 4 || inputSize == 8' in 'JIT.HardwareIntrinsics.X86._Avx512BW.SimpleBinaryOpTest__ShiftLeftLogicalVariableInt16:RunBasicScenario_UnsafeRead():this' during 'Generate code' (IL size 114; hash 0x77e786fa; MinOpts)
    File: C:\git\runtime7\src\coreclr\jit\emitxarch.cpp Line: 15481
    Image: c:\git\runtime7\artifacts\tests\coreclr\windows.x64.debug\tests\core_root\corerun.exe

with the following call stack:

      clrjit.dll!assertAbort(const char * why, const char * file, unsigned int line) Line 304 C++
    > clrjit.dll!emitter::TryEvexCompressDisp8Byte(emitter::instrDesc * id, __int64 dsp, bool * dspInByte) Line 15481 C++
      clrjit.dll!emitter::emitInsSizeSVCalcDisp(emitter::instrDesc * id, unsigned __int64 code, int var, int dsp) Line 3751 C++
      clrjit.dll!emitter::emitInsSizeSV(emitter::instrDesc * id, unsigned __int64 code, int var, int dsp) Line 3828 C++
      clrjit.dll!emitter::emitIns_R_R_S(instruction ins, emitAttr attr, _regNumber_enum reg1, _regNumber_enum reg2, int varx, int offs) Line 6844 C++
      clrjit.dll!emitter::emitIns_SIMD_R_R_S(instruction ins, emitAttr attr, _regNumber_enum targetReg, _regNumber_enum op1Reg, int varx, int offs) Line 8279 C++
      clrjit.dll!CodeGen::inst_RV_RV_TT(instruction ins, emitAttr size, _regNumber_enum targetReg, _regNumber_enum op1Reg, GenTree * op2, bool isRMW) Line 1130 C++
      clrjit.dll!CodeGen::genHWIntrinsic_R_R_RM(GenTreeHWIntrinsic * node, instruction ins, emitAttr attr, _regNumber_enum targetReg, _regNumber_enum op1Reg, GenTree * op2) Line 665 C++
      clrjit.dll!CodeGen::genHWIntrinsic_R_R_RM(GenTreeHWIntrinsic * node, instruction ins, emitAttr attr) Line 637 C++
      clrjit.dll!CodeGen::genHWIntrinsic(GenTreeHWIntrinsic * node) Line 261 C++
      clrjit.dll!CodeGen::genCodeForTreeNode(GenTree * treeNode) Line 1898 C++
      clrjit.dll!CodeGen::genCodeForBBlist() Line 469 C++
      clrjit.dll!CodeGen::genGenerateMachineCode() Line 1915 C++
      clrjit.dll!CodeGenPhase::DoPhase() Line 1650 C++
      clrjit.dll!Phase::Run() Line 61 C++
      clrjit.dll!DoPhase(CodeGen * _codeGen, Phases _phase, void(CodeGen::*)() _action) Line 1664 C++
      clrjit.dll!CodeGen::genGenerateCode(void * * codePtr, unsigned int * nativeSizeOfCode) Line 1674 C++

when building the method JIT.HardwareIntrinsics.X86._Avx512BW.SimpleBinaryOpTest__ShiftLeftLogicalVariableInt16.RunBasicScenario_UnsafeRead() where the proximate cause is

Hope that helps
Tomas
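As background for the assertion above: judging by its name, `TryEvexCompressDisp8Byte` deals with the EVEX compressed displacement (disp8*N) encoding, in which a memory displacement can be stored in a single byte when it is an exact multiple of a scale factor N and the quotient fits in a signed 8-bit value. The sketch below illustrates only that general rule. It is not the JIT's implementation; the function name `try_compress_disp8` is hypothetical, and N is taken as a plain parameter, whereas a real encoder derives it from the instruction's tuple type, vector length, and input size.

```python
from typing import Optional


def try_compress_disp8(dsp: int, n: int) -> Optional[int]:
    """Return the scaled 8-bit displacement, or None if disp32 is required.

    dsp is the byte displacement; n is the disp8*N scale factor
    (assumed to be supplied by the caller in this sketch).
    """
    # The displacement must be an exact multiple of the scale factor.
    if n <= 0 or dsp % n != 0:
        return None
    scaled = dsp // n
    # The scaled value must fit in a signed byte.
    if -128 <= scaled <= 127:
        return scaled
    return None


print(try_compress_disp8(64, 16))    # multiple of 16, small quotient
print(try_compress_disp8(8, 16))     # not a multiple of 16
print(try_compress_disp8(2048, 16))  # quotient 128 does not fit in a signed byte
```

A wrong or unhandled input size would yield a wrong N in such a scheme, which is why an encoder may assert on the sizes it expects.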
As described in the issue #85263, the coreclr-release-outerloop-nightly pipeline has been malfunctioning for quite a while due to not publishing the native test components. This simple change fixes that - while it doesn't make the pipeline completely green, at least we're now sending the tests to Helix and observing just a couple of remaining test failures.

Thanks
Tomas

/cc @dotnet/runtime-infrastructure