From a742c106461ffe1956c7f89df333253c6c923f5d Mon Sep 17 00:00:00 2001
From: Eric Zhu
Date: Sat, 25 Jan 2025 00:51:34 -0800
Subject: [PATCH] Update model client documentation to reflect the latest recommendations.

---
 .../tutorial/models.ipynb          | 539 ++++++++++++------
 .../components/model-clients.ipynb | 202 ++++++-
 2 files changed, 539 insertions(+), 202 deletions(-)

diff --git a/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb b/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
index 6b81147bb504..095c09d5caff 100644
--- a/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
+++ b/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
@@ -1,191 +1,350 @@
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Models\n",
-    "\n",
-    "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/components/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
-    "\n",
-    "```{note}\n",
-    "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## OpenAI\n",
-    "\n",
-    "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "vscode": {
-     "languageId": "shellscript"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "pip install \"autogen-ext[openai]\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
-    "\n",
-    "openai_model_client = OpenAIChatCompletionClient(\n",
-    "    model=\"gpt-4o-2024-08-06\",\n",
-    "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To test the model client, you can use the following code:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n"
-     ]
-    }
-   ],
-   "source": [
-    "from autogen_core.models import UserMessage\n",
-    "\n",
-    "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
-    "print(result)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "```{note}\n",
-    "You can use this client with models hosted on OpenAI-compatible endpoints, however, we have not tested this functionality.\n",
-    "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Azure OpenAI\n",
-    "\n",
-    "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "vscode": {
-     "languageId": "shellscript"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "pip install \"autogen-ext[openai,azure]\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To use the client, you need to provide your deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
-    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
-    "\n",
-    "The following code snippet shows how to use AAD authentication.\n",
-    "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
-    "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
-    "\n",
-    "# Create the token provider\n",
-    "token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
-    "\n",
-    "az_model_client = AzureOpenAIChatCompletionClient(\n",
-    "    azure_deployment=\"{your-azure-deployment}\",\n",
-    "    model=\"{model-name, such as gpt-4o}\",\n",
-    "    api_version=\"2024-06-01\",\n",
-    "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
-    "    azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
-    "    # api_key=\"sk-...\", # For key-based authentication.\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly or for more information."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Local Models\n",
-    "\n",
-    "See [this guide](../../core-user-guide/faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) for how to override a model's default capabilities definitions in autogen.\n",
-    "\n",
-    "More to come. Stay tuned!"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": ".venv",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.7"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
\ No newline at end of file
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Models\n",
+    "\n",
+    "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/components/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
+    "\n",
+    "```{note}\n",
+    "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## OpenAI\n",
+    "\n",
+    "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[openai]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "openai_model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gpt-4o-2024-08-06\",\n",
+    "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To test the model client, you can use the following code:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "\n",
+    "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{note}\n",
+    "You can use this client with models hosted on OpenAI-compatible endpoints; however, we have not tested this functionality.\n",
+    "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n",
+    "```"
+   ]
+  },
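+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As mentioned in the note at the top of this page, you can wrap the client in a {py:class}`~autogen_ext.models.cache.ChatCompletionCache` to reuse responses for repeated requests.\n",
+    "Below is a minimal sketch that assumes a disk-backed cache store from the `diskcache` extra (`pip install \"autogen-ext[openai,diskcache]\"`); check the {py:class}`~autogen_ext.models.cache.ChatCompletionCache` reference for the exact constructor signature in your version."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Minimal caching sketch: the exact store/constructor API may differ across autogen-ext versions.\n",
+    "import tempfile\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.cache_store.diskcache import DiskCacheStore\n",
+    "from autogen_ext.models.cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache\n",
+    "from diskcache import Cache\n",
+    "\n",
+    "with tempfile.TemporaryDirectory() as tmpdir:\n",
+    "    # Wrap the client created above; cached responses are stored on disk.\n",
+    "    cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdir))\n",
+    "    cache_client = ChatCompletionCache(openai_model_client, cache_store)\n",
+    "\n",
+    "    result = await cache_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "    print(result.cached)  # False on the first call; True if the same request is repeated."
+   ]
+  },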
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Azure OpenAI\n",
+    "\n",
+    "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[openai,azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To use the client, you need to provide your deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities.\n",
+    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
+    "\n",
+    "The following code snippet shows how to use AAD authentication.\n",
+    "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
+    "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
+    "\n",
+    "# Create the token provider\n",
+    "token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
+    "\n",
+    "az_model_client = AzureOpenAIChatCompletionClient(\n",
+    "    azure_deployment=\"{your-azure-deployment}\",\n",
+    "    model=\"{model-name, such as gpt-4o}\",\n",
+    "    api_version=\"2024-06-01\",\n",
+    "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
+    "    azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
+    "    # api_key=\"sk-...\", # For key-based authentication.\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for more information on how to use the Azure client directly."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Azure AI Foundry\n",
+    "\n",
+    "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
+    "To use these models, use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
+    "\n",
+    "You need to install the `azure` extra to use this client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
+    "from azure.core.credentials import AzureKeyCredential\n",
+    "\n",
+    "client = AzureAIChatCompletionClient(\n",
+    "    model=\"Phi-4\",\n",
+    "    endpoint=\"https://models.inference.ai.azure.com\",\n",
+    "    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
+    "    # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
+    "    credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
+    "    model_info={\n",
+    "        \"json_output\": False,\n",
+    "        \"function_calling\": False,\n",
+    "        \"vision\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Ollama (Local)\n",
+    "\n",
+    "[Ollama](https://ollama.com/) is a local model server that can run models locally on your machine.\n",
+    "\n",
+    "Currently, we recommend using the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
+    "to interact with an Ollama server.\n",
+    "\n",
+    "```{note}\n",
+    "Small local models are typically not as capable as larger models on the cloud.\n",
+    "For some tasks they may not perform as well and the output may be surprising.\n",
+    "```"
+   ]
+  },
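+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Before running the example below, make sure the Ollama server is running and the model has been downloaded, for example (using the `llama3.2` tag assumed below):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Download the model, then start the server with `ollama serve` if it is not already running.\n",
+    "ollama pull llama3.2"
+   ]
+  },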
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"llama3.2:latest\",\n",
+    "    base_url=\"http://localhost:11434/v1\",\n",
+    "    api_key=\"placeholder\",\n",
+    "    model_info={\n",
+    "        \"vision\": False,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Gemini (experimental)\n",
+    "\n",
+    "Gemini currently offers [an OpenAI-compatible API (beta)](https://ai.google.dev/gemini-api/docs/openai).\n",
+    "So you can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` with the Gemini API.\n",
+    "\n",
+    "```{note}\n",
+    "While some model providers may offer OpenAI-compatible APIs, they may still have minor differences.\n",
+    "For example, the `finish_reason` field may be different in the response.\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='Paris\\n' usage=RequestUsage(prompt_tokens=8, completion_tokens=2) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gemini-1.5-flash\",\n",
+    "    base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n",
+    "    api_key=os.getenv(\"GEMINI_API_KEY\"),\n",
+    "    model_info={\n",
+    "        \"vision\": True,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": True,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb b/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
index 4447838b6fa5..1e96c7de041e 100644
--- a/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
+++ b/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
@@ -6,8 +6,7 @@
    "source": [
     "# Model Clients\n",
     "\n",
-    "AutoGen provides the {py:mod}`autogen_core.models` module with a suite of built-in\n",
-    "model clients for using ChatCompletion API.\n",
+    "AutoGen provides a suite of built-in model clients for using the ChatCompletion API.\n",
     "All model clients implement the {py:class}`~autogen_core.models.ChatCompletionClient` protocol class."
    ]
   },
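+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because all clients implement the same protocol, application code can be written once against {py:class}`~autogen_core.models.ChatCompletionClient` and used with any of the built-in clients. Here is a minimal sketch; the helper name `ask` is just for illustration:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_core.models import ChatCompletionClient, UserMessage\n",
+    "\n",
+    "\n",
+    "# Works with any of the built-in clients, since they all implement ChatCompletionClient.\n",
+    "async def ask(client: ChatCompletionClient, question: str) -> str:\n",
+    "    result = await client.create([UserMessage(content=question, source=\"user\")])\n",
+    "    assert isinstance(result.content, str)  # The content can also be a list of function calls.\n",
+    "    return result.content"
+   ]
+  },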
@@ -17,12 +16,35 @@
    "source": [
     "## Built-in Model Clients\n",
     "\n",
-    "Currently there are two built-in model clients:\n",
-    "{py:class}`~autogen_ext.models.OpenAIChatCompletionClient` and\n",
-    "{py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`.\n",
-    "Both clients are asynchronous.\n",
+    "Currently there are three built-in model clients:\n",
+    "* {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
+    "* {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`\n",
+    "* {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`\n",
     "\n",
-    "To use the {py:class}`~autogen_ext.models.OpenAIChatCompletionClient`, you need to provide the API key\n",
+    "\n",
+    "### OpenAI\n",
+    "\n",
+    "To use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`, you need to install the `openai` extra."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# pip install \"autogen-ext[openai]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You also need to provide the API key\n",
     "either through the environment variable `OPENAI_API_KEY` or through the `api_key` argument."
    ]
   },
@@ -96,7 +118,90 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Default [Model Capabilities](../faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) may be overridden should the need arise.\n"
+    "### OpenAI-Compatible API\n",
+    "\n",
+    "You can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` to interact with OpenAI-compatible APIs such as Ollama and Gemini (beta).\n",
+    "\n",
+    "#### Ollama (local)\n",
+    "\n",
+    "The example below shows how to use a local model running on an [Ollama](https://ollama.com) server."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"llama3.2:latest\",\n",
+    "    base_url=\"http://localhost:11434/v1\",\n",
+    "    api_key=\"placeholder\",\n",
+    "    model_info={\n",
+    "        \"vision\": False,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  },
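+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{note}\n",
+    "For models that the client does not recognize, such as the local model above, you must supply the `model_info` argument describing the model's capabilities: whether it supports vision, function calling, and JSON output, and which model family it belongs to.\n",
+    "```"
+   ]
+  },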
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Gemini (beta)\n",
+    "\n",
+    "The example below shows how to use a Gemini model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='Paris\\n' usage=RequestUsage(prompt_tokens=8, completion_tokens=2) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gemini-1.5-flash\",\n",
+    "    base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n",
+    "    api_key=os.getenv(\"GEMINI_API_KEY\"),\n",
+    "    model_info={\n",
+    "        \"vision\": True,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": True,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -265,8 +370,7 @@
     "\n",
     "To use the {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`, you need to provide\n",
     "the deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
-    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
-    "To use AAD authentication, you need to first install the `azure-identity` package."
+    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential."
    ]
   },
@@ -279,7 +383,7 @@
   },
    "outputs": [],
    "source": [
-    "# pip install azure-identity"
+    "# pip install \"autogen-ext[openai,azure]\""
    ]
   },
@@ -321,6 +425,76 @@
     "```"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Azure AI Foundry\n",
+    "\n",
+    "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
+    "To use these models, use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
+    "\n",
+    "You need to install the `azure` extra to use this client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# pip install \"autogen-ext[azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
+    "from azure.core.credentials import AzureKeyCredential\n",
+    "\n",
+    "client = AzureAIChatCompletionClient(\n",
+    "    model=\"Phi-4\",\n",
+    "    endpoint=\"https://models.inference.ai.azure.com\",\n",
+    "    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
+    "    # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
+    "    credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
+    "    model_info={\n",
+    "        \"json_output\": False,\n",
+    "        \"function_calling\": False,\n",
+    "        \"vision\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -337,7 +511,11 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
    "outputs": [],
    "source": [
     "# pip install -U \"autogen-ext[openai, diskcache]\""