README.md (+55 -1)
@@ -109,4 +109,58 @@ Solar 10.7b Instruct | :x: | :x: | Produced a chat conversation
109
109
StarCoder2 3b | | |
110
110
StarCoder2 7b | | |
111
111
StarCoder2 15b | | |
112
-
Yi-34b Chat | :large_orange_diamond: | :large_orange_diamond: | Close to a valid drawing, outdated libraries
112
+
Yi-34b Chat | :large_orange_diamond: | :large_orange_diamond: | Close to a valid drawing, outdated libraries
113
+
114
+
---
115
+
---
116
+
117
+
### Non-coding tests
118
+
119
+
#### Termination word
120
+
Background: Tests the ability of an LLM to incorporate a termination word into its response.
121
+
122
+
Scenario: Uses a Group Chat with a Story_writer and a Product_manager. The Story_writer writes some story ideas; the Product_manager reviews them and, when satisfied, ends the chat by outputting a specific word (e.g. "TERMINATE", "BAZINGA", etc.).
123
+
124
+
Story_writer's system message: **An ideas person, loves coming up with ideas for kids books.**
125
+
126
+
Product_manager's system message: **Great in evaluating story ideas from your writers and determining whether they would be unique and interesting for kids. Reply with suggested improvements if they aren't good enough, otherwise reply `{termination_word}` at the end when you're satisfied there's one good story idea.**
127
+
128
+
Prompt for the chat manager: **Come up with 3 story ideas for Grade 3 kids.**
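
As a rough illustration of the setup described above, the sketch below fills the `{termination_word}` placeholder in the Product_manager's system message and builds a predicate that detects the word in a reply. The helper names (`build_system_message`, `make_termination_check`) are hypothetical and not taken from this repository. The case-sensitive substring match is also an assumption about how termination might be detected. In AutoGen, a predicate of this shape is typically passed as an agent's `is_termination_msg` argument, but check that against the version you use.

```python
# Hypothetical helpers for illustration only; not the repository's actual test code.

PRODUCT_MANAGER_TEMPLATE = (
    "Great in evaluating story ideas from your writers and determining whether "
    "they would be unique and interesting for kids. Reply with suggested "
    "improvements if they aren't good enough, otherwise reply `{termination_word}` "
    "at the end when you're satisfied there's one good story idea."
)


def build_system_message(termination_word: str) -> str:
    """Fill the termination word into the Product_manager's system message."""
    return PRODUCT_MANAGER_TEMPLATE.format(termination_word=termination_word)


def make_termination_check(termination_word: str):
    """Return a predicate that reports whether a chat message contains the word.

    Assumption: a message is a dict with a "content" key, and matching is a
    case-sensitive substring test.
    """

    def is_termination_msg(message: dict) -> bool:
        content = message.get("content") or ""  # treat missing/None content as empty
        return termination_word in content

    return is_termination_msg


if __name__ == "__main__":
    check = make_termination_check("BAZINGA")
    print(check({"content": "Idea 2 is unique and fun. BAZINGA"}))  # True
    print(check({"content": "Please improve idea 1."}))  # False
```

Building the predicate from the word keeps the test parameterised, so the same chat can be re-run with "TERMINATE", "BAZINGA", or any other word without touching the agent setup.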
129
+
130
+
See the [results](results) folder for code outputs.
131
+
132
+
Note 1: `TERMINATE` is the termination word conventionally used by AutoGen.
133
+
Note 2: Some LLMs included the termination word even though the quality of their full response was not perfect.
134
+
135
+
|| Key |
136
+
| --- | --- |
137
+
|:white_check_mark:| Output termination word correctly |
138
+
|:x:| Performed task, didn't output termination word |
139
+
|:thumbsdown:| Didn't understand/participate in task |
Nexus Raven | :thumbsdown::thumbsdown: | :thumbsdown::thumbsdown: | :thumbsdown::thumbsdown: | :thumbsdown::thumbsdown: | :thumbsdown::thumbsdown: | Tried to call a python function to create the stories |