[Verl] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss #551

hongpeng-guo wants to merge 34 commits into main

Conversation
Please add a unit test with …
Update: hit a numerical instability issue; investigating.
Thanks for the efforts! Let's test thoroughly on both accuracy (numerical stability) and speed (including the old code paths) before checking in. Considering that we may fuse more and more losses, such as the existing z-loss and the newly added entropy loss, the API outputs have started to diverge, and the loss kernel is getting quite heavy, with multiple branches coupled together (label smoothing, target weights, etc.). We probably need to refactor a bit to make it cleaner to develop against later. cc @ByronHsu @shivam15s @Tcc0403 @shimizust
```python
# = max(X) + log(sum(e^(X_i - max(X)))) = m + log d
lse = m + tl.log(d)

# 3.5 Calculate the entropy loss
```
Can probably put an equation in the PR description, and also a simple one in a comment here, to demonstrate how the entropy_loss is calculated (especially the reuse of m and d computed in the first-pass online softmax).
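For reference, here is one way the equation could be stated, as a minimal NumPy sketch (illustrative only, not the kernel code): with m = max(X) and d = Σ exp(X_i − m) from the online-softmax pass, the probabilities are p_i = exp(X_i − lse), so H(p) = −Σ p_i log p_i = lse − Σ p_i X_i, i.e., the entropy falls out of lse = m + log d plus one extra dot product.

```python
import numpy as np

def entropy_from_online_softmax(x: np.ndarray) -> float:
    """Illustrative sketch: entropy of softmax(x), reusing the online-softmax
    statistics m and d that the kernel's first pass already computes."""
    m = np.max(x)              # running max from the first pass
    d = np.sum(np.exp(x - m))  # running denominator from the first pass
    lse = m + np.log(d)        # log-sum-exp, exactly `lse = m + tl.log(d)` above
    p = np.exp(x - lse)        # softmax probabilities
    # H(p) = -sum(p_i * log p_i) = lse - sum(p_i * x_i)
    return float(lse - np.sum(p * x))
```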
```python
# TODO: Implement weighted z_loss. Currently, z_loss is not scaled by weight.
z_loss = z_loss / n_non_ignore
# TODO: Implement weighted entropy loss. Currently, entropy loss is not scaled by weight.
entropy_loss = entropy_loss / n_non_ignore
```
It seems you had already implemented the weight-provided case above: `dX_entropy_block = dX_entropy_block / sum_non_ignore_weight`. Did I misunderstand anything? If that is not the right equation for the weighted case, please use `dX_entropy_block = dX_entropy_block / n_non_ignore` there instead, and also leave a TODO comment above.
Thanks for catching this. It was a bug in my program, and I just fixed it. But the numerical problem is still there; maybe we need to take a deeper look.
BTW, it seems CI has stopped running for this PR for some reason.
Closing this PR for now.
Summary

In RLHF workflows such as verl, the actor forward pass usually produces two losses: a cross_entropy_loss (-log_probs) and an entropy_loss, the latter used to encourage the policy not to become over-deterministic. There is a real need for a kernel that generates both losses without materializing the huge logits tensor. Liger-Kernel's fused_linear_cross_entropy_loss already works well for generating the cross_entropy_loss, but it is missing the second part, i.e., the entropy loss.

This PR adds the entropy loss option to the existing FLCE loss and is one important step toward supporting verl:

- In cross_entropy.py::liger_cross_entropy_kernel, both the entropy loss and its gradient with respect to the input are calculated and stored;
- In fused_linear_cross_entropy.py, the corresponding changes expose the new entropy loss through the FLCE interface.
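As a point of reference, a plain PyTorch sketch of the two quantities the fused kernel is meant to produce (a hypothetical helper, not this PR's code; it materializes the full logits tensor, which is exactly what the FLCE path avoids):

```python
import torch
import torch.nn.functional as F

def ce_and_entropy_reference(hidden, lm_head_weight, targets):
    """Hypothetical unfused reference for the two losses."""
    logits = hidden @ lm_head_weight.T                  # (N, vocab): the huge tensor FLCE never materializes
    log_probs = F.log_softmax(logits, dim=-1)
    ce_loss = F.nll_loss(log_probs, targets)            # cross_entropy_loss = -log_probs[target]
    probs = log_probs.exp()
    entropy_loss = -(probs * log_probs).sum(-1).mean()  # discourages an over-deterministic policy
    return ce_loss, entropy_loss
```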
Testing Done

Made the existing unit tests pass; a new unit test for the entropy loss is WIP.

- [ ] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence