OpenCompass v0.1.9
🌟 Highlights
- 🚀 New API Integrations: Added support for several new model APIs, including Baidu, Moonshot, and Sensetime, among others.
- 🔵 Circular Evaluation Feature: Circular Eval is now available as a new evaluation mode (#610).
- 🤖 Turbomind Inference Integration: Turbomind inference is now integrated through its Python API (#484).
🚀 New Features & Enhancements
- Model & API Development: Added support for DataCanvas Alaya LM, the Lightllm API, and 360API, and integrated the Turbomind Python API (#612, #613, #601, #484).
- Circular Evaluation Implementation: Added the Circular Eval feature (#610).
- Rich Dataset Additions: Added the FinanceIQ, SVAMP, and GSM_Hard datasets, and updated Mathbench (#596, #604, #619, #580, #607).
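For readers new to circular evaluation, here is a rough sketch of the general idea as it is commonly described: each multiple-choice question is asked once per rotation of its options, and it counts as correct only if the model picks the right option every time. This is an illustration under that assumption, not the OpenCompass implementation from #610; `circular_variants` and `circular_correct` are hypothetical names, and `predict` stands in for any model call.

```python
from typing import Callable, List, Tuple


def circular_variants(options: List[str], answer_idx: int) -> List[Tuple[List[str], int]]:
    """Return every rotation of the options, paired with the shifted answer index.

    Hypothetical helper for illustration; not part of the OpenCompass API.
    """
    n = len(options)
    variants = []
    for shift in range(n):
        rotated = options[shift:] + options[:shift]
        # The option originally at answer_idx moves to (answer_idx - shift) mod n.
        variants.append((rotated, (answer_idx - shift) % n))
    return variants


def circular_correct(
    predict: Callable[[List[str]], int],
    options: List[str],
    answer_idx: int,
) -> bool:
    """Count a question as correct only if every rotation is answered correctly."""
    return all(
        predict(opts) == idx
        for opts, idx in circular_variants(options, answer_idx)
    )


if __name__ == "__main__":
    options = ["London", "Paris", "Rome", "Berlin"]

    # A predictor that actually identifies the answer passes every rotation.
    print(circular_correct(lambda opts: opts.index("Paris"), options, 1))

    # A position-biased predictor that always picks the first option does not.
    print(circular_correct(lambda opts: 0, options, 1))
```

The point of the rotation is that a model which favors a fixed position can get lucky on a single ordering but cannot pass all of them.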
🛠 Improvements & Fixes
- Subjective Evaluation Bug Fixes: Fixed bugs in subjective evaluation (#589).
- Dataset and Feature Fixes: Fixed the CMB dataset, changed the `save_every` default to 1, fixed the gen inferencer, and fixed ICL evaluation with nested lists (#587, #592, #615, #632).
📚 Documentation Updates
- README & FAQ Enhancements: Updated the README and FAQ for clarity (#582, #622, #628, #629).
- Typo and Spelling Corrections: Fixed typos and spelling errors (#594, #637).
🎊 New Contributors
Welcoming new contributors to the OpenCompass family!
- @rahidzeynal, @Sniper970119, @ZhangRaymond, @HunterKruger, @helloyongyang, and @Yggdrasill7D6. Your contributions are greatly appreciated!
What's Changed
- Add author as: author='OpenCompass Contributors' by @rahidzeynal in #578
- [Doc] Update README by @tonysy in #582
- [Feature] Update mathbench by @tonysy in #580
- Fix bugs in subjective evaluation by @frankweijue in #589
- [Fix] fix cmb dataset by @Leymore in #587
- [Fix] change save_every defaults to 1 by @yingfhu in #592
- update word spell by @Sniper970119 in #594
- Add FinanceIQ dataset by @ZhangRaymond in #596
- [Feat] support humaneval and mbpp pass@k by @yingfhu in #598
- [Feature] Add multi-prompt generation demo by @jingmingzhuo in #568
- Mathbench update postprocess by @liushz in #600
- [Feature] Add arithmetic to mathbench by @liushz in #607
- Add support for DataCanvas Alaya LM by @HunterKruger in #612
- [Feature] Support Lightllm api by @helloyongyang in #613
- [Feature] Support 360API and FixKRetriever for CSQA dataset by @tonysy in #601
- Integrate turbomind python api by @lvhan028 in #484
- [Bug] Update api with generation_kargs by @tonysy in #614
- [Fix] Fix gen inferencer by @Leymore in #615
- [Docs] update ds1000 code eval docs by @yingfhu in #618
- [Feature] Add SVAMP dataset by @liushz in #604
- [Feature] support download from modelscope by @KevinNuNu in #534
- [Doc] Update README and requirements. by @tonysy in #622
- [Sync] Fix cmnli, fix vicuna meta template, fix longbench postprocess and other minor fixes by @Leymore in #625
- [API] Update API by @tonysy in #624
- [Feature] Add circular eval by @Leymore in #610
- [Doc] Update FAQ by @Leymore in #628
- [Doc] Update README by @tonysy in #629
- [Bug] fix icl eval with nested list by @yingfhu in #632
- Fix LightllmAPI list bug by @helloyongyang in #635
- fix typo in README by @Yggdrasill7D6 in #637
- [Sync] update codes by @Leymore in #641
- [Feature] Add GSM_Hard dataset by @liushz in #619
- [Feat] support zhipu post process by @yingfhu in #642
- [Sync] Bump version to 0.1.9 by @Leymore in #644
Explore the detailed changes in the full changelog.
Thank you to all the contributors for this release. Your work continues to improve OpenCompass for the community. Let's dive into OpenCompass v0.1.9! 🎉🧮💻