Crash when using MPS backend on macOS 14 #6746

Open
@zcbenz

🐛 Describe the bug

I'm working on a Node.js port of ExecuTorch and hit a crash that only happens on the macOS 14 hosted runner of GitHub Actions. The macOS 15 runner is fine, and I could not reproduce the crash on any of my local machines.

It is probably just a macOS 14 bug that got fixed in 15, but in case it is a hidden ExecuTorch bug I'm pasting the stack trace here.

Process 4164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x580)
    frame #0: 0x0000000196dfca3c MPSCore`MPSDevice::GetMPSLibrary_DoNotUse(MPSLibraryInfo const*) + 60
MPSCore`MPSDevice::GetMPSLibrary_DoNotUse:
->  0x196dfca3c <+60>: ldapr  x21, [x8]
    0x196dfca40 <+64>: cbnz   x21, 0x196dfcc64          ; <+612>
    0x196dfca44 <+68>: mov    x20, x1
    0x196dfca48 <+72>: mov    x19, x0
Target 0: (node) stopped.
Process 4164 launched: '/Users/runner/hostedtoolcache/node/22.10.0/arm64/bin/node' (arm64)
(lldb) thread backtrace all
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x580)
  * frame #0: 0x0000000196dfca3c MPSCore`MPSDevice::GetMPSLibrary_DoNotUse(MPSLibraryInfo const*) + 60
    frame #1: 0x0000000196dffe44 MPSCore`___lldb_unnamed_symbol1299 + 1008
    frame #2: 0x0000000196e003a8 MPSCore`___lldb_unnamed_symbol1300 + 68
    frame #3: 0x00000001ea0d908c MetalPerformanceShadersGraph`___lldb_unnamed_symbol55543 + 524
    frame #4: 0x0000000118e19a44 executorch.node`executorch::backends::mps::delegate::MPSExecutor::set_inputs_outputs(this=0x0000600003c1cc00, inputs=size=1, outputs=size=1) at MPSExecutor.mm:59:39
    frame #5: 0x0000000118e108b4 executorch.node`executorch::backends::MPSBackend::execute(this=0x0000000119444aa0, context=0x000000016fdfc7c8, handle=0x0000600003c1cc00, args=0x0000600018a14090) const at MPSBackend.mm:113:21
    frame #6: 0x0000000118de6098 executorch.node`executorch::runtime::BackendDelegate::Execute(this=0x0000600000712be0, backend_execution_context=0x000000016fdfc7c8, args=0x0000600018a14090) const at method.cpp:127:22
    frame #7: 0x0000000118de59f8 executorch.node`executorch::runtime::Method::execute_instruction(this=0x0000600002274000) at method.cpp:1102:38
    frame #8: 0x0000000118de6b1c executorch.node`executorch::runtime::Method::execute(this=0x0000600002274000) at method.cpp:1295:21
    frame #9: 0x0000000118dfb8d8 executorch.node`executorch::extension::Module::execute(this=0x000060000270a300, method_name="forward", input_values=size=1) at module.cpp:189:3
    frame #10: 0x0000000118d75fb8 executorch.node`(anonymous namespace)::Execute(mod=0x000060000270a300, env=0x0000600003224340, name="forward", args=size=1) at module.cc:70:15
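
Until the cause is known, one possible stopgap is to skip MPS-backed tests on macOS 14 hosts. The following is a minimal sketch for a Node.js test setup, not part of ExecuTorch or the port described above: `darwinMajor` and `shouldRunMpsTests` are hypothetical helper names, and the cutoff rests only on the observation in this report that the macOS 14 runner crashes while the macOS 15 runner does not. It relies on `os.release()` returning the Darwin kernel version, where Darwin 23.x corresponds to macOS 14 and Darwin 24.x to macOS 15.

```javascript
const os = require('node:os');

// Extract the Darwin kernel major version from a release string such as "23.6.0".
// Returns null when the string does not start with an integer.
function darwinMajor(release) {
  const major = Number.parseInt(String(release).split('.')[0], 10);
  return Number.isInteger(major) ? major : null;
}

// Hypothetical helper: decide whether MPS-backed tests should run on this host.
// Assumption taken from this report: Darwin 23 (macOS 14) crashes,
// Darwin 24 (macOS 15) and later do not.
function shouldRunMpsTests(platform = process.platform, release = os.release()) {
  if (platform !== 'darwin') return false;
  const major = darwinMajor(release);
  return major !== null && major >= 24;
}

if (!shouldRunMpsTests()) {
  console.log('Skipping MPS tests on this host');
}
```

Passing `platform` and `release` as parameters keeps the check testable on any machine. The guard only hides the crash in CI; it does not explain the `EXC_BAD_ACCESS` in `MPSDevice::GetMPSLibrary_DoNotUse`.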

/cc @DenisVieriu97

Versions

I'm building with the latest ExecuTorch (5b51bb8).

Metadata

Labels

module: mps (Issues related to Apple's MPS delegation and code under backends/apple/mps/)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
