
Commit 653306c

Authored by Nagkumar Arkalgud (nagkumar91)
Task/fix non adv sim query response (Azure#37941)
* Update task_query_response.prompty remove required keys
* Update task_simulate.prompty
* Update task_query_response.prompty
* Update task_simulate.prompty
* Fix the api_key needed
* Update for release
* Black fix for file
* Fix the gpt-4o issue
* Fix typo
* fix unnecessary else
* Fix the black error

Co-authored-by: Nagkumar Arkalgud <[email protected]>
Co-authored-by: Nagkumar Arkalgud <[email protected]>
1 parent cbbd1c6 commit 653306c

File tree: 4 files changed, +116 −3 lines

sdk/evaluation/azure-ai-evaluation/CHANGELOG.md (+12)

```diff
@@ -1,5 +1,17 @@
 # Release History
 
+
+## 1.0.0b5 (Unreleased)
+
+### Features Added
+
+### Breaking Changes
+
+### Bugs Fixed
+- Non adversarial simulator works with `gpt-4o` models using the `json_schema` response format
+
+### Other Changes
+
 ## 1.0.0b4 (2024-10-16)
 
 ### Breaking Changes
```

sdk/evaluation/azure-ai-evaluation/README.md (+95 −2)

````diff
@@ -139,6 +139,95 @@ Output with a string that continues the conversation, responding to the latest message
 {{ conversation_history }}
 
 ```
+
+Query Response generating prompty for gpt-4o with `json_schema` support.
+Use this file as an override.
+```yaml
+---
+name: TaskSimulatorQueryResponseGPT4o
+description: Gets queries and responses from a blob of text
+model:
+  api: chat
+  parameters:
+    temperature: 0.0
+    top_p: 1.0
+    presence_penalty: 0
+    frequency_penalty: 0
+    response_format:
+      type: json_schema
+      json_schema:
+        name: QRJsonSchema
+        schema:
+          type: object
+          properties:
+            items:
+              type: array
+              items:
+                type: object
+                properties:
+                  q:
+                    type: string
+                  r:
+                    type: string
+                required:
+                  - q
+                  - r
+
+inputs:
+  text:
+    type: string
+  num_queries:
+    type: integer
+
+
+---
+system:
+You're an AI that helps in preparing a Question/Answer quiz from Text for "Who wants to be a millionaire" tv show
+Both Questions and Answers MUST BE extracted from given Text
+Frame Question in a way so that Answer is RELEVANT SHORT BITE-SIZED info from Text
+RELEVANT info could be: NUMBER, DATE, STATISTIC, MONEY, NAME
+A sentence should contribute multiple QnAs if it has more info in it
+Answer must not be more than 5 words
+Answer must be picked from Text as is
+Question should be as descriptive as possible and must include as much context as possible from Text
+Output must always have the provided number of QnAs
+Output must be in JSON format.
+Output must have {{num_queries}} objects in the format specified below. Any other count is unacceptable.
+Text:
+<|text_start|>
+On January 24, 1984, former Apple CEO Steve Jobs introduced the first Macintosh. In late 2003, Apple had 2.06 percent of the desktop share in the United States.
+Some years later, research firms IDC and Gartner reported that Apple's market share in the U.S. had increased to about 6%.
+<|text_end|>
+Output with 5 QnAs:
+[
+  {
+    "q": "When did the former Apple CEO Steve Jobs introduced the first Macintosh?",
+    "r": "January 24, 1984"
+  },
+  {
+    "q": "Who was the former Apple CEO that introduced the first Macintosh on January 24, 1984?",
+    "r": "Steve Jobs"
+  },
+  {
+    "q": "What percent of the desktop share did Apple have in the United States in late 2003?",
+    "r": "2.06 percent"
+  },
+  {
+    "q": "What were the research firms that reported on Apple's market share in the U.S.?",
+    "r": "IDC and Gartner"
+  },
+  {
+    "q": "What was the percentage increase of Apple's market share in the U.S., as reported by research firms IDC and Gartner?",
+    "r": "6%"
+  }
+]
+Text:
+<|text_start|>
+{{ text }}
+<|text_end|>
+Output with {{ num_queries }} QnAs:
+```
+
 Application code:
 
 ```python
@@ -156,6 +245,7 @@ model_config = {
     "azure_deployment": os.environ.get("AZURE_DEPLOYMENT"),
     # not providing key would make the SDK pick up `DefaultAzureCredential`
     # use "api_key": "<your API key>"
+    "api_version": "2024-08-01-preview" # keep this for gpt-4o
 }
 
 # Use Wikipedia to get some text for the simulation
@@ -208,11 +298,14 @@ async def callback(
 
 async def main():
     simulator = Simulator(model_config=model_config)
+    current_dir = os.path.dirname(__file__)
+    query_response_override_for_latest_gpt_4o = os.path.join(current_dir, "TaskSimulatorQueryResponseGPT4o.prompty")
     outputs = await simulator(
         target=callback,
         text=text,
+        query_response_generating_prompty=query_response_override_for_latest_gpt_4o, # use this only with latest gpt-4o
         num_queries=2,
-        max_conversation_turns=4,
+        max_conversation_turns=1,
         user_persona=[
             f"I am a student and I want to learn more about {wiki_search_term}",
             f"I am a teacher and I want to teach my students about {wiki_search_term}"
@@ -234,7 +327,7 @@ if __name__ == "__main__":
 #### Adversarial Simulator
 
 ```python
-from from azure.ai.evaluation.simulator import AdversarialSimulator, AdversarialScenario
+from azure.ai.evaluation.simulator import AdversarialSimulator, AdversarialScenario
 from azure.identity import DefaultAzureCredential
 from typing import Any, Dict, List, Optional
 import asyncio
````
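The `json_schema` response format declared in the prompty above constrains gpt-4o to return a JSON object whose `items` array holds `{q, r}` string pairs. As a minimal, self-contained sketch (the `RESPONSE_FORMAT` dict and `is_valid_qr_payload` helper below are illustrative, not part of the SDK), this shows the same schema expressed as a plain `response_format` payload, plus a quick conformance check on a model reply:

```python
import json

# Mirrors the schema in the prompty above: QRJsonSchema with an
# "items" array of objects requiring string fields "q" and "r".
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "QRJsonSchema",
        "schema": {
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "q": {"type": "string"},
                            "r": {"type": "string"},
                        },
                        "required": ["q", "r"],
                    },
                }
            },
        },
    },
}


def is_valid_qr_payload(raw: str) -> bool:
    """Minimal check that a raw model reply matches the q/r schema above."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    items = data.get("items") if isinstance(data, dict) else None
    if not isinstance(items, list):
        return False
    return all(
        isinstance(it, dict)
        and isinstance(it.get("q"), str)
        and isinstance(it.get("r"), str)
        for it in items
    )


sample = '{"items": [{"q": "When was the first Macintosh introduced?", "r": "January 24, 1984"}]}'
print(is_valid_qr_payload(sample))  # True
```

A hypothetical validator like this is useful when experimenting with prompty overrides: if the deployment ignores `response_format` (e.g. an older API version), the check fails fast instead of the simulator raising deeper in the pipeline.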

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_version.py (+1 −1)

```diff
@@ -2,4 +2,4 @@
 # Copyright (c) Microsoft Corporation. All rights reserved.
 # ---------------------------------------------------------
 
-VERSION = "1.0.0b4"
+VERSION = "1.0.0b5"
```

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/simulator/_simulator.py (+8)

```diff
@@ -432,6 +432,14 @@ async def _generate_query_responses(
             if isinstance(query_responses, dict):
                 keys = list(query_responses.keys())
                 return query_responses[keys[0]]
+            if isinstance(query_responses, str):
+                query_responses = json.loads(query_responses)
+                if isinstance(query_responses, dict):
+                    if len(query_responses.keys()) == 1:
+                        return query_responses[list(query_responses.keys())[0]]
+                    return query_responses  # type: ignore
+                if isinstance(query_responses, list):
+                    return query_responses
             return json.loads(query_responses)
         except Exception as e:
             raise RuntimeError("Error generating query responses") from e
```
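The branch added above makes the helper tolerant of the shapes an LLM reply can take: an already-parsed dict, a JSON string (the usual shape under the `json_schema` response format), or a plain list. A standalone sketch of that normalization logic (the function name is illustrative; in the SDK this lives inside `_generate_query_responses`):

```python
import json


def normalize_query_responses(query_responses):
    """Normalize a query-response payload to the inner list of QnA items.

    Handles three input shapes:
    - dict: unwrap the first key (e.g. {"items": [...]} -> [...])
    - str: parse as JSON, then unwrap a single-key dict
    - list: already parsed, return as-is
    """
    if isinstance(query_responses, dict):
        # Unwrap the wrapper object produced by the json_schema format.
        keys = list(query_responses.keys())
        return query_responses[keys[0]]
    if isinstance(query_responses, str):
        parsed = json.loads(query_responses)
        if isinstance(parsed, dict) and len(parsed) == 1:
            return parsed[next(iter(parsed))]
        return parsed
    # Already a parsed list (or other structure): pass through.
    return query_responses
```

The key behavior the fix introduces is the string branch: with `json_schema` output, gpt-4o returns a JSON *string* like `'{"items": [...]}'` rather than a dict, which the earlier code fed straight to `json.loads` and then returned without unwrapping.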

0 commit comments