First stable version of R1-V.
What's Changed
- Update image link by @HaozheZhao in #23
- fix a bug for image input when computing logp by @Yifan-Song793 in #32
- Fix format reward function #19 by @JamesHujy in #34
- support Qwen2.5-VL by @yuezih in #50
- Fix pixel_value repeat error by @wuhuikai in #49
- Add vLLM Trainer by @TobiasLee in #44
- update sft code by @HaozheZhao in #55
- Fix batch sampling bug by @yuezih in #54
- remove the extra import code in SFT script by @HaozheZhao in #77
- save processor after sft training by @HaozheZhao in #86
- fix #74, SFT codes stuck problem by @LiuRicky in #84
- Adapt Qwen updates in transformers by @yuezih in #92
- V0.1.0 by @chenllliang in #97
New Contributors
- @HaozheZhao made their first contribution in #23
- @Yifan-Song793 made their first contribution in #32
- @JamesHujy made their first contribution in #34
- @yuezih made their first contribution in #50
- @wuhuikai made their first contribution in #49
- @TobiasLee made their first contribution in #44
- @LiuRicky made their first contribution in #84
- @chenllliang made their first contribution in #97
Full Changelog: https://github.com/Deep-Agent/R1-V/commits/v0.1.0