
Commit 2378d8c

Albertorizzolisimedw authored and committed
Update README.md
Added twitter, discord and other links plus a bunch of emojis.
1 parent f462fe7 commit 2378d8c

File tree

1 file changed: +34 −20 lines changed


README.md

Lines changed: 34 additions & 20 deletions
@@ -1,38 +1,52 @@
-# BenchLLM
+# 🏋️‍♂️ BenchLLM 🏋️‍♀️

-BenchLLM is a Python-based open-source library that streamlines the testing process for Large Language Models (LLMs) and AI-powered applications. It offers an intuitive and robust way to validate and score the output of your code with minimal boilerplate or configuration.
+🦾 Continuous Integration for LLM powered applications 🦙🦅🤖

-BenchLLM is actively used at [V7](https://www.v7labs.com) for improving our LLM applications and now Open Sourced under MIT License to share with the wider community
+[![GitHub Repo stars](https://img.shields.io/github/stars/v7labs/BenchLLM?style=social)](https://github.com/v7labs/BenchLLM/stargazers)
+[![Twitter Follow](https://img.shields.io/twitter/follow/V7Labs?style=social)](https://twitter.com/V7Labs)
+[![Discord Follow](https://dcbadge.vercel.app/api/server/x7ExfHb3bG?style=flat)](https://discord.gg/x7ExfHb3bG)
+
+BenchLLM is a Python-based open-source library that streamlines the testing of Large Language Models (LLMs) and AI-powered applications. It measures the accuracy of your model, agents, or chains by validating responses on any number of tests via LLMs.
+
+BenchLLM is actively used at [V7](https://www.v7labs.com) for improving our LLM applications and is now Open Sourced under MIT License to share with the wider community
+
+
+## 💡 Get help on [Discord](https://discord.gg/x7ExfHb3bG) or [Tweet at us](https://twitter.com/V7Labs)
+
+<hr/>

 Use BenchLLM to:

-- Easily set up a comprehensive testing suite for your LLMs.
-- Continous integration for your langchain/agents/models.
-- Elimiate flaky chains and create confidence in your code.
+- Test the responses of your LLM across any number of prompts.
+- Continuous integration for chains like [Langchain](https://github.com/hwchase17/langchain), agents like [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT), or LLM models like [Llama](https://github.com/facebookresearch/llama) or GPT-4.
+- Eliminate flaky chains and create confidence in your code.
+- Spot inaccurate responses and hallucinations in your application at every version.

-> **NOTE:** BenchLLM is in the early stage of development and will be subject to rapid changes.
+<hr/>

-For bug reporting, feature requests, or contributions, please open an issue or submit a pull request (PR) on our GitHub page.
+> ⚠️ **NOTE:** BenchLLM is in the early stage of development and will be subject to rapid changes.
+>
+>For bug reporting, feature requests, or contributions, please open an issue or submit a pull request (PR) on our GitHub page.

-## BenchLLM Testing Methodology
+## 🧪 BenchLLM Testing Methodology

 BenchLLM implements a distinct two-step methodology for validating your machine learning models:

-1. **Testing**: This stage involves running your code against various tests and capturing the predictions produced by your model without immediate judgment or comparison.
+1. **Testing**: This stage involves running your code against any number of expected responses and capturing the predictions produced by your model without immediate judgment or comparison.

-2. **Evaluation**: During this phase, the recorded predictions are compared against the expected output. Detailed comparison reports, including pass/fail status and other metrics, are generated.
+2. **Evaluation**: The recorded predictions are compared against the expected output using LLMs to verify factual similarity (or optionally manually). Detailed comparison reports, including pass/fail status and other metrics, are generated.

 This methodical separation offers a comprehensive view of your model's performance and allows for better control and refinement of each step.
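The record-then-evaluate separation described here can be illustrated with a small, self-contained sketch in plain Python; the function names and the simple string match below are illustrative stand-ins, not BenchLLM's actual API:

```python
# Illustrative sketch of the two-step methodology (not BenchLLM's API).

def run_tests(model, tests):
    """Step 1 (Testing): capture predictions without judging them."""
    return [
        {"input": t["input"], "expected": t["expected"], "prediction": model(t["input"])}
        for t in tests
    ]

def evaluate(predictions):
    """Step 2 (Evaluation): compare recorded predictions to expectations.
    A plain string match stands in for BenchLLM's LLM-based similarity check."""
    return [
        {**p, "passed": p["prediction"].strip().lower()
         in [e.strip().lower() for e in p["expected"]]}
        for p in predictions
    ]

tests = [{"input": "What is 1 + 1?", "expected": ["2", "two"]}]
predictions = run_tests(lambda prompt: "2", tests)
report = evaluate(predictions)
print(report[0]["passed"])  # True
```

In BenchLLM the evaluation step can also delegate the comparison to an LLM, which is what allows checking factual similarity rather than exact strings.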

-## Install
+## 🚀 Install

 To install BenchLLM we use pip

 ```
 pip install benchllm
 ```

-## Usage
+## 💻 Usage

 Start by importing the library and use the @benchllm.test decorator to mark the function you'd like to test:
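Tests themselves can be described declaratively in YML/JSON files (as the API section notes). A minimal sketch of such a test file, with field names given only as assumptions to be checked against the project documentation:

```yml
# Hypothetical test case; field names are assumptions, not confirmed API.
input: "What is 1 + 1? Answer with a single number."
expected:
  - "2"
  - "two"
```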

@@ -102,9 +116,9 @@ The non interactive evaluators also supports `--workers N` to run in the evaluat
 $ bench run --evaluator string-match --workers 5
 ```

-### Eval
+### 🧮 Eval

-While bench run runs each test function and then evaluates their output, it can often be beneficial to separate these into two steps. For example, if you want a person to manually do the evaluation or if you want to try multiple evaluation methods on the same function.
+While _bench run_ runs each test function and then evaluates their output, it can often be beneficial to separate these into two steps. For example, if you want a person to manually do the evaluation or if you want to try multiple evaluation methods on the same function.

 ```bash
 $ bench run --no-eval

@@ -117,7 +131,7 @@ Then later you can evaluate them with
 $ bench eval output/latest/predictions
 ```

-## API
+## 🔌 API

 For more detailed control, BenchLLM provides an API.
 You are not required to add YML/JSON tests to be able to evaluate your model.
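As a rough mental model of a tester/evaluator API of this shape, here is a plain-Python sketch; every class and method name below is an illustrative assumption, not BenchLLM's documented interface:

```python
# Conceptual sketch only: a minimal tester/evaluator pair mirroring the
# record-then-evaluate flow. Names are illustrative, not BenchLLM's API.

class Tester:
    def __init__(self, fn):
        self.fn = fn          # the function under test
        self.tests = []

    def add_test(self, input, expected):
        self.tests.append({"input": input, "expected": expected})

    def run(self):
        # Record predictions only; judging happens in a separate object.
        return [
            {"input": t["input"], "expected": t["expected"],
             "prediction": self.fn(t["input"])}
            for t in self.tests
        ]

class StringMatchEvaluator:
    def __init__(self):
        self.predictions = []

    def load(self, predictions):
        self.predictions = predictions

    def run(self):
        # One pass/fail verdict per recorded prediction.
        return [p["prediction"] in p["expected"] for p in self.predictions]

tester = Tester(lambda q: "Paris")
tester.add_test("What is the capital of France?", ["Paris"])
evaluator = StringMatchEvaluator()
evaluator.load(tester.run())
results = evaluator.run()
print(results)  # [True]
```

Splitting the tester from the evaluator is what makes it possible to swap evaluation strategies (string match, LLM-based, or manual) over the same recorded predictions.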
@@ -149,14 +163,14 @@ results = evaluator.run()
 print(results)
 ```

-## Commands
+## ☕️ Commands

 - `bench add`: Add a new test to a suite.
 - `bench tests`: List all tests in a suite.
 - `bench run`: Run all or target test suites.
 - `bench eval`: Runs the evaluation of an existing test run.

-## Contribute
+## 🙌 Contribute

 BenchLLM is developed for Python 3.10, although it may work with other Python versions as well. We recommend using a Python 3.10 environment. You can use conda or any other environment manager to set up the environment:

@@ -180,6 +194,6 @@ Contribution steps:
 4. Test your changes.
 5. Submit a pull request.

-We adhere to PEP8 style guide. Please follow this guide when contributing.
+We adhere to the PEP8 style guide. Please follow this guide when contributing.

-For further information and advanced usage, please refer to the comprehensive BenchLLM documentation. If you need any support, feel free to open an issue on our GitHub page.
+If you need any support, feel free to open an issue on our GitHub page.
