Skip to content

Commit 34412d4

Browse files
authored
Merge pull request #8 from madebygps/feature/azure-ai-evaluation
Add travel planning agent evaluation scripts in English and Spanish
2 parents 980943f + 0a221ee commit 34412d4

File tree

7 files changed

+1361
-8
lines changed

7 files changed

+1361
-8
lines changed

.env.sample

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,4 +9,6 @@ OPENAI_MODEL=gpt-3.5-turbo
99
# Configure for GitHub models: (GITHUB_TOKEN already exists inside Codespaces)
1010
GITHUB_MODEL=gpt-5-mini
1111
GITHUB_TOKEN=YOUR-GITHUB-PERSONAL-ACCESS-TOKEN
12-
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
12+
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
13+
# Optional: Set to log evaluation results to Azure AI Foundry for rich visualization
14+
AZURE_AI_PROJECT=https://YOUR-ACCOUNT.services.ai.azure.com/api/projects/YOUR-PROJECT

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -174,6 +174,7 @@ You can run the examples in this repository by executing the scripts in the `exa
174174
| [openai_tool_calling.py](examples/openai_tool_calling.py) | Tool calling with the low-level OpenAI SDK, showing manual tool dispatch. |
175175
| [workflow_basic.py](examples/workflow_basic.py) | A workflow-based agent. |
176176
| [agent_otel_aspire.py](examples/agent_otel_aspire.py) | An agent with OpenTelemetry tracing, metrics, and structured logs exported to the [Aspire Dashboard](https://aspire.dev/dashboard/standalone/). |
177+
| [agent_evaluation.py](examples/agent_evaluation.py) | Evaluate a travel planner agent using [Azure AI Evaluation](https://learn.microsoft.com/azure/ai-foundry/concepts/evaluation-evaluators/agent-evaluators) agent evaluators (IntentResolution, ToolCallAccuracy, TaskAdherence, ResponseCompleteness). Optionally set `AZURE_AI_PROJECT` in `.env` to log results to [Azure AI Foundry](https://learn.microsoft.com/azure/ai-foundry/how-to/develop/agent-evaluate-sdk). |
177178
178179
## Using the Aspire Dashboard for telemetry
179180

0 commit comments

Comments
 (0)