OpenCompass v0.1.5
Dive into our newly improved features, bug fixes, and most notably our enhanced dataset support, coming together to refine your experience.
🆕 Highlights:
- Boosted Dataset Integrations: This release adds support for several datasets, including ds1000, promptbench, anthropics evals, kaoshi, and more, making OpenCompass more versatile than ever.
- More Evaluation Types: We have started integrating subjective and agent-aided LLM evaluation into OpenCompass. Stay tuned!
Explore the detailed changes:
🌟 New Features:
- 📦 New Datasets and Features:
📖 Documentation:
- News updates and introduction figure in README (#375, #413)
- Updated get_started.md and fixed naming issues (#377, #380)
- New FAQ section added (#384)
- README addition in longeval (#389)
- Multimodal documentation introduced (#334)
🛠️ Bug Fixes:
- Addressed a potential OOM issue (#387)
- Added has_image fix to scienceqa (#391)
- Resolved performance issues of visualglm (#424)
- Debug logger fix for summarizer (#417)
- Addressed errors in keep keys (#431)
⚙ Enhancements and Refactors:
- Refined docs and code for better user guidance (#409)
- Custom summarizer argument added in CLI mode (#411)
- mlugowl and llamaadapter introduced (#405)
- Enhanced mm models support on public datasets (#412)
- Customized config path support (#423)
🎉 New Contributors:
A heartfelt welcome to our first-time contributors:
@wangxidong06 (First PR)
@so2liu (First PR)
@HoBeedzc (First PR)
@CuteyThyme (First PR)
@chenbohua3 (First PR)
To all contributors, old and new, thank you for continually enhancing OpenCompass! Your efforts are deeply valued. 🙌 🎉
If you love OpenCompass, don't forget to star 🌟 our GitHub repository! Your feedback, reviews, and contributions immensely help in shaping the product.
Changelog
- [Doc] Update News by @tonysy in #375
- Update get_started.md by @liushz in #377
- [CI] Publish to Pypi by @gaotongxiao in #366
- [Docs] Fix incorrect name in get_started by @gaotongxiao in #380
- fix potential OOM issue by @cdpath in #387
- [Docs] Add FAQ by @gaotongxiao in #384
- Add CMB by @wangxidong06 in #376
- [Fix]: Add has_image to scienceqa by @YuanLiuuuuuu in #391
- [Feat] support ds1000 dataset by @yingfhu in #395
- [Feat] implementation for support promptbench by @yingfhu in #239
- [Feat] refine docs and codes for more user guides by @yingfhu in #409
- [Docs] Readme in longeval by @philipwangOvO in #389
- feat: add custom summarizer argument in CLI run mode by @so2liu in #411
- Yhzhang/add mlugowl llamaadapter by @ZhangYuanhan-AI in #405
- [Feat] Support mm models on public dataset and fix several issues. by @yyk-wew in #412
- [Docs] Add intro figure to README by @gaotongxiao in #413
- [fix] summarizer debug logger by @HoBeedzc in #417
- [Doc] Update news by @Leymore in #420
- [Feature] Use local accuracy from hf implements by @Leymore in #416
- [Feat] support antropics evals dataset by @yingfhu in #422
- [Fix] Fix performance issue of visualglm. by @yyk-wew in #424
- [Feature] Log gold answer in prediction output by @gaotongxiao in #419
- Support GSM8k evaluation with tools by Lagent and LangChain by @mzr1996 in #277
- [Sync] Initial support of subjective evaluation by @gaotongxiao in #421
- [Fix] P0: errors in keep keys by @gaotongxiao in #431
- add evaluation of scibench by @CuteyThyme in #393
- [Feature] Add kaoshi dataset by @liushz in #392
- [Docs] Add multimodal docs by @fangyixiao18 in #334
- support customize config path by @chenbohua3 in #423
Full Changelog: 0.1.4...0.1.5