Add a Heat aware DistributedSampler for torch usage. #1807

Open
wants to merge 6 commits into main

Conversation

@Berkant03 (Collaborator) commented Feb 24, 2025

Due Diligence

  • General:
  • Implementation:
    • unit tests: all split configurations tested
    • unit tests: multiple dtypes tested
    • benchmarks: created for new functionality
    • benchmarks: performance improved or maintained
    • documentation updated where needed

Description

Issue/s resolved: #1789

Changes proposed:

Add a Heat-aware DistributedSampler for PyTorch use cases.
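
For context, here is a minimal sketch of the core idea behind such a sampler: every rank draws the same shuffled index order and keeps only its own slice. The class name, constructor signature, and contiguous-chunk partitioning below are illustrative assumptions, not the API added in this PR.

```python
# Illustrative sketch only -- class name and signature are assumptions,
# not the implementation proposed in this PR.
import torch
from torch.utils.data import Sampler


class NaiveDistributedSampler(Sampler):
    """Hand each of `world_size` ranks an equally sized, disjoint
    slice of the dataset indices, reshuffled per epoch."""

    def __init__(self, dataset, world_size, rank, shuffle=True, seed=0):
        self.dataset = dataset
        self.world_size = world_size
        self.rank = rank
        self.shuffle = shuffle
        self.seed = seed
        self.epoch = 0
        # ceil division: every rank reports the same length
        self.num_samples = (len(dataset) + world_size - 1) // world_size

    def set_epoch(self, epoch):
        # called by the training loop so each epoch reshuffles differently
        self.epoch = epoch

    def __iter__(self):
        if self.shuffle:
            g = torch.Generator()
            g.manual_seed(self.seed + self.epoch)  # identical order on all ranks
            indices = torch.randperm(len(self.dataset), generator=g).tolist()
        else:
            indices = list(range(len(self.dataset)))
        # pad with leading indices so the list divides evenly across ranks
        indices += indices[: self.num_samples * self.world_size - len(indices)]
        # contiguous chunk for this rank
        start = self.rank * self.num_samples
        return iter(indices[start : start + self.num_samples])

    def __len__(self):
        return self.num_samples
```

Per rank, an instance would then be passed to torch.utils.data.DataLoader via its sampler argument, with set_epoch called once per epoch so the shuffle order varies across epochs.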

Type of change

  • New feature (non-breaking change which adds functionality)

Does this change modify the behaviour of other functions? If so, which?

no

Contributor

Thank you for the PR!


codecov bot commented Feb 24, 2025

Codecov Report

Attention: Patch coverage is 18.51852% with 88 lines in your changes missing coverage. Please review.

Project coverage is 91.62%. Comparing base (23e373b) to head (875643d).
Report is 58 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| heat/utils/data/datatools.py | 18.51% | 88 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1807      +/-   ##
==========================================
- Coverage   92.26%   91.62%   -0.64%     
==========================================
  Files          84       84              
  Lines       12447    12554     +107     
==========================================
+ Hits        11484    11503      +19     
- Misses        963     1051      +88     
| Flag | Coverage Δ |
|---|---|
| unit | 91.62% <18.51%> (-0.64%) ⬇️ |


@Berkant03 added the enhancement (New feature or request) and memory footprint labels, then removed the memory footprint label, on Feb 24, 2025
@ClaudiaComito added this to the 1.6 milestone on Feb 24, 2025

@Berkant03 (Collaborator, Author) commented Mar 17, 2025

When using the normal comm.Bcast, the broadcast only works the first time; the second time it no longer does.

❯ mpirun -np 2 python test.py
0 tensor([2, 4, 3, 0, 1], dtype=torch.int32)
1 tensor([2, 4, 3, 0, 1], dtype=torch.int32)
...
1 tensor([455,   0,   0,   0,  32], dtype=torch.int32)
0 tensor([1, 4, 3, 2, 0], dtype=torch.int32)
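
The original test.py is not included in the thread; a minimal reproduction along these lines (the file contents, the loop count, and the use of Heat's MPI_WORLD wrapper are assumptions) shows the pattern:

```python
# test.py -- assumed reproduction; run with: mpirun -np 2 python test.py
import torch
from heat.core.communication import MPI_WORLD

rank = MPI_WORLD.rank
for _ in range(2):
    if rank == 0:
        perm = torch.randperm(5, dtype=torch.int32)  # fresh permutation each round
    else:
        perm = torch.empty(5, dtype=torch.int32)     # receive buffer
    MPI_WORLD.Bcast(perm, root=0)
    # observed above: round 1 matches on both ranks; in round 2, rank 1
    # prints garbage while rank 0 still holds a valid permutation
    print(rank, perm)
```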

Project status: In Progress
Development

Successfully merging this pull request may close these issues.

Implementation of a Heat Aware DistributedSampler for Interoperability with Pytorch Ecosystem
2 participants