Add metadata to dist/spmd components #1037

Open
clumsy opened this issue Mar 28, 2025 · 6 comments

@clumsy
Contributor

clumsy commented Mar 28, 2025

Description

TorchX already supports app metadata. Unfortunately, unlike env, there is no way to pass metadata via .torchxconfig or the CLI.

Motivation/Background

The implementation is scheduler-specific, and not all schedulers handle metadata. In AWS Batch we already translate metadata to Batch job tags: #775. Metadata is also supported in the AWS SageMaker scheduler.

If we add metadata to dist/spmd, TorchX users will have one less reason to create custom components that copy dist/spmd almost verbatim.
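
To illustrate what "translate metadata to job tags" can mean on the scheduler side, here is a minimal sketch. The helper name `metadata_to_tags`, the sample keys, and the merge order (metadata overriding scheduler defaults) are assumptions made for illustration; this is not the code from #775.

```python
from typing import Dict


def metadata_to_tags(
    metadata: Dict[str, str], base_tags: Dict[str, str]
) -> Dict[str, str]:
    """Hypothetical helper: merge app metadata into scheduler job tags."""
    # Start from a copy so the scheduler's own defaults are not mutated.
    tags = dict(base_tags)
    # Assumption for this sketch: metadata entries win over defaults.
    tags.update(metadata)
    return tags


job_tags = metadata_to_tags(
    {"team": "research", "cost-center": "1234"},  # app metadata (sample values)
    {"managed-by": "torchx"},  # scheduler defaults (sample values)
)
```

The tags end up on the job itself, so automation can find the job by tag without anything being injected into the app's environment.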

Detailed Proposal

Add a new metadata parameter to the dist/spmd component, just like env.
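
A rough sketch of the shape this could take is below. `Role` and `AppDef` here are trimmed-down stand-ins written for this example, not the real torchx.specs classes, and the `spmd` signature, default image, and sample values are hypothetical. The linked PR is the reference for the actual change.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Role:
    """Stand-in for a TorchX role (hypothetical, trimmed down)."""
    name: str
    image: str
    env: Dict[str, str] = field(default_factory=dict)


@dataclass
class AppDef:
    """Stand-in for a TorchX app definition (hypothetical, trimmed down)."""
    name: str
    roles: List[Role] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def spmd(
    script: str,
    image: str = "example/image:latest",
    env: Optional[Dict[str, str]] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> AppDef:
    """Hypothetical component that accepts metadata next to env."""
    # script is unused in this trimmed sketch; a real component would
    # build the role's entrypoint and arguments from it.
    role = Role(name="spmd", image=image, env=dict(env or {}))
    # env lands on the role (visible to the app); metadata stays on the
    # app definition so only the scheduler sees it.
    return AppDef(name="spmd", roles=[role], metadata=dict(metadata or {}))


app = spmd("train.py", env={"LOG_LEVEL": "INFO"}, metadata={"team": "research"})
```

Keeping metadata on the app definition rather than on the role matches the intent discussed in this thread: the job can be marked for automation without the values leaking into the app.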

Alternatives

Otherwise, users are forced to implement their own components and deviate from vanilla TorchX.

Additional context/links

#775

@clumsy
Contributor Author

clumsy commented Mar 28, 2025

If everyone is in favor, I'm happy to contribute as always @d4l3k @kiukchung @tonykao8080

@d4l3k
Member

d4l3k commented Mar 28, 2025

@clumsy sounds reasonable to me! I wonder if this is generic enough that we should support it for all components via run/.torchxconfig

@clumsy
Contributor Author

clumsy commented Mar 28, 2025

I think many schedulers have some notion of metadata/tags, so we can mark the job itself for automation without leaking this into the app.

We have already added this to AWS Batch. I can add it to AWS SageMaker, and maybe Kubernetes if it's missing there.

@clumsy
Contributor Author

clumsy commented Mar 28, 2025

The implementation is trivial (see the linked PR) @d4l3k

We used it in a custom component, but don't want to maintain that component internally long term just for this.

@clumsy
Contributor Author

clumsy commented Mar 29, 2025

I forgot that app.metadata support has been in the AWS SageMaker scheduler since day 1.

@clumsy
Contributor Author

clumsy commented Mar 29, 2025

I can add metadata to other components as well, @d4l3k. For example, I think it will work well in utils. But I won't be able to test them all.
