huggingface / trl Public

generated from fastai/nbdev_template

Fork 2.3k
Star 16.2k

Pull requests: huggingface/trl

87 Open 2,213 Closed

Pull requests list

[Bug Fix] OnlineDPOTrainer with vLLM Server Mode

#4500 opened Nov 7, 2025 by YangKai0616

Consistency regarding relative imports

#4498 opened Nov 6, 2025 by qgallouedec

Add vLLM quantization option for colocate

#4496 opened Nov 6, 2025 by sergiopaniego • Draft

7 tasks

adding [SimPER](https://arxiv.org/abs/2502.00883)

#4486 opened Nov 6, 2025 by leeparkuky

2 of 5 tasks

Move RLOOTrainer to trl.experimental.rloo

#4484 opened Nov 6, 2025 by behroozazarkhalili

Move PRMTrainer to trl.experimental.prm

#4483 opened Nov 6, 2025 by behroozazarkhalili

Move PPOTrainer to trl.experimental.ppo

#4482 opened Nov 6, 2025 by behroozazarkhalili

[ORPO] Move ORPOTrainer to experimental

#4480 opened Nov 6, 2025 by behroozazarkhalili

refactor: Move NashMDTrainer to experimental module

#4477 opened Nov 5, 2025 by behroozazarkhalili

Move KTOTrainer to experimental module

#4475 opened Nov 5, 2025 by behroozazarkhalili

Move GKDTrainer to experimental module

#4474 opened Nov 5, 2025 by behroozazarkhalili

7 tasks done

Move OnlineDPOTrainer to experimental module

#4473 opened Nov 5, 2025 by behroozazarkhalili

refactor: Move CPOTrainer to experimental module

#4470 opened Nov 5, 2025 by behroozazarkhalili

Add attention_mask to signature_columns

#4459 opened Nov 5, 2025 by shubhamjain0594

5 tasks

Add num_generations_eval parameter for efficient evaluation

#4458 opened Nov 5, 2025 by mingxuetian

Prevent upcasting layers in prepare_model_for_kbit_training

#4457 opened Nov 5, 2025 by sergiopaniego

5 tasks

added 10 papers (+trainer cross-links) for #4407

#4441 opened Nov 3, 2025 by SSusantAchary

4 tasks done

docs: add KTO (2402.01306) to Paper Index + link ref to KTOTrainer

#4440 opened Nov 3, 2025 by SSusantAchary

refactor: Move Mergekit integration to experimental submodule

#4438 opened Nov 3, 2025 by behroozazarkhalili

docs: Unify model examples to use trl-lib namespace

#4431 opened Nov 2, 2025 by behroozazarkhalili

docs: Expand speeding up training guide with acceleration methods

#4428 opened Nov 2, 2025 by behroozazarkhalili

docs: Expand training customization examples

#4427 opened Nov 2, 2025 by behroozazarkhalili

4 tasks done

Replace flash attention2 with kernels-community/flash-attn2

#4426 opened Nov 2, 2025 by tamoghnokandar

4 of 5 tasks

docs: Extend CLI basic usage examples to all supported CLIs

#4425 opened Nov 2, 2025 by behroozazarkhalili

docs: Rewrite PEFT integration guide with comprehensive examples

#4421 opened Nov 2, 2025 by behroozazarkhalili
