Skip to content

Conversation

tgymnich
Copy link
Member

@tgymnich tgymnich commented Sep 16, 2024

PR status as of 2025-06-19:

Implement @mtlprintf and friends using os_log

TODO:

  • Printing of floats does not work since they will be converted to doubles due to the vararg calling convention which will be caught by our IR checker
  • Version check the @mtlprintf macro
  • Add tests
  • Capturing and logging are mutually exclusive

depends on: JuliaGPU/GPUCompiler.jl#630
notify: #226

@tgymnich tgymnich self-assigned this Sep 16, 2024
@tgymnich tgymnich marked this pull request as ready for review September 16, 2024 12:12
@tgymnich
Copy link
Member Author

@maleadt Any idea how we can implement the version check for the @mtlprintf macro? I know we could check the air version inside the kernel but I'd like to avoid that.

Also can we get rid of the double check in check_ir?

@christiangnrd
Copy link
Member

Would it be worth benchmarking the performance difference between having logging active vs not?

@tgymnich
Copy link
Member Author

tgymnich commented Sep 17, 2024

Would it be worth benchmarking the performance difference between having logging active vs not?

@christiangnrd Sure. I don't expect there to be much overhead besides allocation of the log buffer and checking it for logs after running a kernel. But we might want to look into only conditionally adding MTLLogState since logging also prevents GPU frame capture.

@christiangnrd christiangnrd mentioned this pull request Sep 17, 2024
2 tasks
@maleadt
Copy link
Member

maleadt commented Sep 17, 2024

@maleadt Any idea how we can implement the version check for the @mtlprintf macro? I know we could check the air version inside the kernel but I'd like to avoid that.

Given that the macro expands way to early, I don't think there's anything we can do but checking in the kernel. Why are you opposed to that? GPUCompiler.jl has infrastructure to optimize those checks away, see e.g. how CUDA.jl exposes the device capability and PTX ISA version to the kernel.

@tgymnich
Copy link
Member Author

We could also wrap the macro and accompanying functions in if Metal.macos_version() >= v"15".

@christiangnrd
Copy link
Member

christiangnrd commented Sep 18, 2024

If we do that we should have definitions in both cases and give an informative error if Metal.macos_version() < v"15".

@maleadt
Copy link
Member

maleadt commented Sep 18, 2024

Actually, looks like I provided the run-time queries already:

@device_function @inline metal_version() = SimpleVersion(metal_major(), metal_minor())
@device_function @inline air_version() = SimpleVersion(air_major(), air_minor())

So we can just use that in the generated code, generating an error when emitting code for an older platform. That of course depends on #416 for meaningful reporting, but we'll get there.

I'd rather not simply check based on the macOS version during macro expansion, since we might want to target older Metal versions than the system supports.

Copy link
Member

@christiangnrd christiangnrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! However do you know what's causing the tests to hang?

@tgymnich
Copy link
Member Author

tgymnich commented Sep 19, 2024

Looks good! However do you know what's causing the tests to hang?

@christiangnrd The hangs are caused by this one line:

@print_and_throw "@mtlprintf requires Metal 3.2 (macOS 15) or higher"

@christiangnrd
Copy link
Member

christiangnrd commented Sep 21, 2024

@maleadt Could we have one of the Apple Silicon runners upgraded to Sequoia so the output tests don't get ignored? Edit: All the runners are running 13.3.1. Should we also have one on macOS 14?

I would also like to see #420 merged first (with benchmarks run on macOS 15) to see how big the impact of enabling logging is.

@tgymnich
Copy link
Member Author

@christiangnrd I recently made changes so that logging (e.g. MTLLogState and friends) is only enabled whenever we actually use the feature.

@christiangnrd
Copy link
Member

Just pushed a whitespace-only formatting commit

@christiangnrd
Copy link
Member

@christiangnrd I recently made changes so that logging (e.g. MTLLogState and friends) is only enabled whenever we actually use the feature.

In that case I still think we should be able to test on macOS 15, but I think we should merge this as soon as it's ready.

@maleadt
Copy link
Member

maleadt commented Sep 23, 2024

I've upgraded one of the workers to macOS 15:
image

See the macos_version tag which can be used to select on this.

@maleadt
Copy link
Member

maleadt commented Feb 6, 2025

How is this not blocked by #433 anymore?

@tgymnich
Copy link
Member Author

tgymnich commented Feb 7, 2025

@maleadt #433 is still an issue on some older devices.
However the main issue with this PR (mentioned here: #418 (comment)) can be resolved by using Core.print instead of Base.print in the command buffer callback.

@maleadt
Copy link
Member

maleadt commented Feb 10, 2025

However the main issue with this PR (mentioned here: #418 (comment)) can be resolved by using Core.print instead of Base.print in the command buffer callback.

I'm having a hard time understanding that. If switching to a non-yielding print fixes the issue, it implies we were running into #532. But that doesn't match you mentioning 100% GPU usage in #433?

@christiangnrd
Copy link
Member

FWIW I can replicate the hang but not the 100% GPU usage mentioned in #433.

Comment on lines +281 to +283
@static if !is_macos(v"15.0.0")
error("Logging is only supported on macOS 15 or higher")
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macOS 13/14 output test fails because compilation fails before reaching this check. I tested it before rebasing and the availability PR doesn't seem to affect it.

@christiangnrd

This comment was marked as off-topic.

Copy link

codecov bot commented Jun 18, 2025

Codecov Report

Attention: Patch coverage is 70.45455% with 13 lines in your changes missing coverage. Please review.

Project coverage is 80.61%. Comparing base (962e9e1) to head (3e75f7f).

Files with missing lines Patch % Lines
lib/mtl/command_queue.jl 0.00% 11 Missing ⚠️
src/compiler/execution.jl 88.88% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #418      +/-   ##
==========================================
- Coverage   80.82%   80.61%   -0.22%     
==========================================
  Files          61       62       +1     
  Lines        2681     2718      +37     
==========================================
+ Hits         2167     2191      +24     
- Misses        514      527      +13     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@christiangnrd
Copy link
Member

@tgymnich @maleadt I updated the PR description with the current status of this PR. Did I miss or get anything wrong? Would be nice to get some form of this out soon if possible.

Obviously we can't release something that causes hangs, but maybe we could have it be a no-op in unsupported OS versions until someone has the resources to make it depend on target version instead of OS version.

@tgymnich
Copy link
Member Author

We may still need to fix #532 before

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernels Things about kernels and how they are compiled.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants