From a742c106461ffe1956c7f89df333253c6c923f5d Mon Sep 17 00:00:00 2001
From: Eric Zhu
Date: Sat, 25 Jan 2025 00:51:34 -0800
Subject: [PATCH] Update model client documentation to reflect the latest recommendations.

---
 .../tutorial/models.ipynb          | 539 ++++++++++++------
 .../components/model-clients.ipynb | 202 ++++++-
 2 files changed, 539 insertions(+), 202 deletions(-)

diff --git a/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb b/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
index 6b81147bb504..095c09d5caff 100644
--- a/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
+++ b/python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb
@@ -1,191 +1,350 @@
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Models\n",
-    "\n",
-    "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/components/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
-    "\n",
-    "```{note}\n",
-    "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## OpenAI\n",
-    "\n",
-    "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "vscode": {
-     "languageId": "shellscript"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "pip install \"autogen-ext[openai]\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
-    "\n",
-    "openai_model_client = OpenAIChatCompletionClient(\n",
-    "    model=\"gpt-4o-2024-08-06\",\n",
-    "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To test the model client, you can use the following code:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n"
-     ]
-    }
-   ],
-   "source": [
-    "from autogen_core.models import UserMessage\n",
-    "\n",
-    "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
-    "print(result)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "```{note}\n",
-    "You can use this client with models hosted on OpenAI-compatible endpoints, however, we have not tested this functionality.\n",
-    "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Azure OpenAI\n",
-    "\n",
-    "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "vscode": {
-     "languageId": "shellscript"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "pip install \"autogen-ext[openai,azure]\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To use the client, you need to provide your deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
-    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
-    "\n",
-    "The following code snippet shows how to use AAD authentication.\n",
-    "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
-    "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
-    "\n",
-    "# Create the token provider\n",
-    "token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
-    "\n",
-    "az_model_client = AzureOpenAIChatCompletionClient(\n",
-    "    azure_deployment=\"{your-azure-deployment}\",\n",
-    "    model=\"{model-name, such as gpt-4o}\",\n",
-    "    api_version=\"2024-06-01\",\n",
-    "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
-    "    azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
-    "    # api_key=\"sk-...\", # For key-based authentication.\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for how to use the Azure client directly or for more information."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Local Models\n",
-    "\n",
-    "See [this guide](../../core-user-guide/faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) for how to override a model's default capabilities definitions in autogen.\n",
-    "\n",
-    "More to come. Stay tuned!"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": ".venv",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.12.7"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
\ No newline at end of file
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Models\n",
+    "\n",
+    "In many cases, agents need access to LLM model services such as OpenAI, Azure OpenAI, or local models. Since there are many different providers with different APIs, `autogen-core` implements a protocol for [model clients](../../core-user-guide/components/model-clients.ipynb) and `autogen-ext` implements a set of model clients for popular model services. AgentChat can use these model clients to interact with model services. \n",
+    "\n",
+    "```{note}\n",
+    "See {py:class}`~autogen_ext.models.cache.ChatCompletionCache` for a caching wrapper to use with the following clients.\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## OpenAI\n",
+    "\n",
+    "To access OpenAI models, install the `openai` extension, which allows you to use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[openai]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You will also need to obtain an [API key](https://platform.openai.com/account/api-keys) from OpenAI."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "openai_model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gpt-4o-2024-08-06\",\n",
+    "    # api_key=\"sk-...\", # Optional if you have an OPENAI_API_KEY environment variable set.\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To test the model client, you can use the following code:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CreateResult(finish_reason='stop', content='The capital of France is Paris.', usage=RequestUsage(prompt_tokens=15, completion_tokens=7), cached=False, logprobs=None)\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "\n",
+    "result = await openai_model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{note}\n",
+    "You can use this client with models hosted on OpenAI-compatible endpoints; however, we have not tested this functionality.\n",
+    "See {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` for more information.\n",
+    "```"
+   ]
+  },
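+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As mentioned in the note at the top of this page, you can wrap the client in a {py:class}`~autogen_ext.models.cache.ChatCompletionCache` to reuse responses for repeated requests.\n",
+    "Below is a minimal sketch that assumes a disk-backed cache store from the `diskcache` extra (`pip install \"autogen-ext[openai,diskcache]\"`); check the {py:class}`~autogen_ext.models.cache.ChatCompletionCache` reference for the exact constructor signature in your version."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Minimal caching sketch: the exact store/constructor API may differ across autogen-ext versions.\n",
+    "import tempfile\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.cache_store.diskcache import DiskCacheStore\n",
+    "from autogen_ext.models.cache import CHAT_CACHE_VALUE_TYPE, ChatCompletionCache\n",
+    "from diskcache import Cache\n",
+    "\n",
+    "with tempfile.TemporaryDirectory() as tmpdir:\n",
+    "    # Wrap the client created above; cached responses are stored on disk.\n",
+    "    cache_store = DiskCacheStore[CHAT_CACHE_VALUE_TYPE](Cache(tmpdir))\n",
+    "    cache_client = ChatCompletionCache(openai_model_client, cache_store)\n",
+    "\n",
+    "    result = await cache_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "    print(result.cached)  # False on the first call; True if the same request is repeated."
+   ]
+  },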
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Azure OpenAI\n",
+    "\n",
+    "Similarly, install the `azure` and `openai` extensions to use the {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[openai,azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To use the client, you need to provide your deployment ID, Azure Cognitive Services endpoint, API version, and model capabilities.\n",
+    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
+    "\n",
+    "The following code snippet shows how to use AAD authentication.\n",
+    "The identity used must be assigned the [Cognitive Services OpenAI User](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-user) role."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_ext.models.openai import AzureOpenAIChatCompletionClient\n",
+    "from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
+    "\n",
+    "# Create the token provider\n",
+    "token_provider = get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\")\n",
+    "\n",
+    "az_model_client = AzureOpenAIChatCompletionClient(\n",
+    "    azure_deployment=\"{your-azure-deployment}\",\n",
+    "    model=\"{model-name, such as gpt-4o}\",\n",
+    "    api_version=\"2024-06-01\",\n",
+    "    azure_endpoint=\"https://{your-custom-endpoint}.openai.azure.com/\",\n",
+    "    azure_ad_token_provider=token_provider, # Optional if you choose key-based authentication.\n",
+    "    # api_key=\"sk-...\", # For key-based authentication.\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "See [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/managed-identity#chat-completions) for more information on how to use the Azure client directly."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Azure AI Foundry\n",
+    "\n",
+    "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
+    "To use these models, use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
+    "\n",
+    "You need to install the `azure` extra to use this client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "pip install \"autogen-ext[azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
+    "from azure.core.credentials import AzureKeyCredential\n",
+    "\n",
+    "client = AzureAIChatCompletionClient(\n",
+    "    model=\"Phi-4\",\n",
+    "    endpoint=\"https://models.inference.ai.azure.com\",\n",
+    "    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
+    "    # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
+    "    credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
+    "    model_info={\n",
+    "        \"json_output\": False,\n",
+    "        \"function_calling\": False,\n",
+    "        \"vision\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Ollama (Local)\n",
+    "\n",
+    "[Ollama](https://ollama.com/) is a local model server that can run models locally on your machine.\n",
+    "\n",
+    "Currently, we recommend using the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
+    "to interact with an Ollama server.\n",
+    "\n",
+    "```{note}\n",
+    "Small local models are typically not as capable as larger models on the cloud.\n",
+    "For some tasks they may not perform as well and the output may be surprising.\n",
+    "```"
+   ]
+  },
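+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Before running the example below, make sure the Ollama server is running and the model has been downloaded, for example (using the `llama3.2` tag assumed below):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# Download the model, then start the server with `ollama serve` if it is not already running.\n",
+    "ollama pull llama3.2"
+   ]
+  },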
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"llama3.2:latest\",\n",
+    "    base_url=\"http://localhost:11434/v1\",\n",
+    "    api_key=\"placeholder\",\n",
+    "    model_info={\n",
+    "        \"vision\": False,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Gemini (experimental)\n",
+    "\n",
+    "Gemini currently offers [an OpenAI-compatible API (beta)](https://ai.google.dev/gemini-api/docs/openai).\n",
+    "So you can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` with the Gemini API.\n",
+    "\n",
+    "```{note}\n",
+    "While some model providers may offer OpenAI-compatible APIs, they may still have minor differences.\n",
+    "For example, the `finish_reason` field may be different in the response.\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='Paris\\n' usage=RequestUsage(prompt_tokens=8, completion_tokens=2) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gemini-1.5-flash\",\n",
+    "    base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n",
+    "    api_key=os.getenv(\"GEMINI_API_KEY\"),\n",
+    "    model_info={\n",
+    "        \"vision\": True,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": True,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb b/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
index 4447838b6fa5..1e96c7de041e 100644
--- a/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
+++ b/python/packages/autogen-core/docs/src/user-guide/core-user-guide/components/model-clients.ipynb
@@ -6,8 +6,7 @@
    "source": [
     "# Model Clients\n",
     "\n",
-    "AutoGen provides the {py:mod}`autogen_core.models` module with a suite of built-in\n",
-    "model clients for using ChatCompletion API.\n",
+    "AutoGen provides a suite of built-in model clients for using the ChatCompletion API.\n",
     "All model clients implement the {py:class}`~autogen_core.models.ChatCompletionClient` protocol class."
    ]
   },
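+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because all clients implement the same protocol, application code can be written once against {py:class}`~autogen_core.models.ChatCompletionClient` and used with any of the built-in clients. Here is a minimal sketch; the helper name `ask` is just for illustration:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from autogen_core.models import ChatCompletionClient, UserMessage\n",
+    "\n",
+    "\n",
+    "# Works with any of the built-in clients, since they all implement ChatCompletionClient.\n",
+    "async def ask(client: ChatCompletionClient, question: str) -> str:\n",
+    "    result = await client.create([UserMessage(content=question, source=\"user\")])\n",
+    "    assert isinstance(result.content, str)  # The content can also be a list of function calls.\n",
+    "    return result.content"
+   ]
+  },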
@@ -17,12 +16,35 @@
    "source": [
     "## Built-in Model Clients\n",
     "\n",
-    "Currently there are two built-in model clients:\n",
-    "{py:class}`~autogen_ext.models.OpenAIChatCompletionClient` and\n",
-    "{py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`.\n",
-    "Both clients are asynchronous.\n",
+    "Currently there are three built-in model clients:\n",
+    "* {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`\n",
+    "* {py:class}`~autogen_ext.models.openai.AzureOpenAIChatCompletionClient`\n",
+    "* {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`\n",
     "\n",
-    "To use the {py:class}`~autogen_ext.models.OpenAIChatCompletionClient`, you need to provide the API key\n",
+    "\n",
+    "### OpenAI\n",
+    "\n",
+    "To use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient`, you need to install the `openai` extra."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# pip install \"autogen-ext[openai]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You also need to provide the API key\n",
     "either through the environment variable `OPENAI_API_KEY` or through the `api_key` argument."
    ]
   },
@@ -96,7 +118,90 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Default [Model Capabilities](../faqs.md#what-are-model-capabilities-and-how-do-i-specify-them) may be overridden should the need arise.\n"
+    "### OpenAI-Compatible API\n",
+    "\n",
+    "You can use the {py:class}`~autogen_ext.models.openai.OpenAIChatCompletionClient` to interact with OpenAI-compatible APIs such as Ollama and Gemini (beta).\n",
+    "\n",
+    "#### Ollama (local)\n",
+    "\n",
+    "The example below shows how to use a local model running on an [Ollama](https://ollama.com) server."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=32, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"llama3.2:latest\",\n",
+    "    base_url=\"http://localhost:11434/v1\",\n",
+    "    api_key=\"placeholder\",\n",
+    "    model_info={\n",
+    "        \"vision\": False,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
+   ]
+  },
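+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```{note}\n",
+    "For models that the client does not recognize, such as the local model above, you must supply the `model_info` argument describing the model's capabilities: whether it supports vision, function calling, and JSON output, and which model family it belongs to.\n",
+    "```"
+   ]
+  },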
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Gemini (beta)\n",
+    "\n",
+    "The example below shows how to use a Gemini model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='unknown' content='Paris\\n' usage=RequestUsage(prompt_tokens=8, completion_tokens=2) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.openai import OpenAIChatCompletionClient\n",
+    "\n",
+    "model_client = OpenAIChatCompletionClient(\n",
+    "    model=\"gemini-1.5-flash\",\n",
+    "    base_url=\"https://generativelanguage.googleapis.com/v1beta/openai/\",\n",
+    "    api_key=os.getenv(\"GEMINI_API_KEY\"),\n",
+    "    model_info={\n",
+    "        \"vision\": True,\n",
+    "        \"function_calling\": True,\n",
+    "        \"json_output\": True,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "response = await model_client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(response)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -265,8 +370,7 @@
     "\n",
     "To use the {py:class}`~autogen_ext.models.AzureOpenAIChatCompletionClient`, you need to provide\n",
     "the deployment id, Azure Cognitive Services endpoint, api version, and model capabilities.\n",
-    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential.\n",
-    "To use AAD authentication, you need to first install the `azure-identity` package."
+    "For authentication, you can either provide an API key or an Azure Active Directory (AAD) token credential."
    ]
   },
@@ -279,7 +383,7 @@
   },
    "outputs": [],
    "source": [
-    "# pip install azure-identity"
+    "# pip install \"autogen-ext[openai,azure]\""
    ]
   },
@@ -321,6 +425,76 @@
     "```"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Azure AI Foundry\n",
+    "\n",
+    "[Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/) (previously known as Azure AI Studio) offers models hosted on Azure.\n",
+    "To use these models, use the {py:class}`~autogen_ext.models.azure.AzureAIChatCompletionClient`.\n",
+    "\n",
+    "You need to install the `azure` extra to use this client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# pip install \"autogen-ext[azure]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Below is an example of using this client with the Phi-4 model from [GitHub Marketplace](https://github.com/marketplace/models)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "finish_reason='stop' content='The capital of France is Paris.' usage=RequestUsage(prompt_tokens=14, completion_tokens=8) cached=False logprobs=None\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "\n",
+    "from autogen_core.models import UserMessage\n",
+    "from autogen_ext.models.azure import AzureAIChatCompletionClient\n",
+    "from azure.core.credentials import AzureKeyCredential\n",
+    "\n",
+    "client = AzureAIChatCompletionClient(\n",
+    "    model=\"Phi-4\",\n",
+    "    endpoint=\"https://models.inference.ai.azure.com\",\n",
+    "    # To authenticate with the model you will need to generate a personal access token (PAT) in your GitHub settings.\n",
+    "    # Create your PAT token by following instructions here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens\n",
+    "    credential=AzureKeyCredential(os.environ[\"GITHUB_TOKEN\"]),\n",
+    "    model_info={\n",
+    "        \"json_output\": False,\n",
+    "        \"function_calling\": False,\n",
+    "        \"vision\": False,\n",
+    "        \"family\": \"unknown\",\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "result = await client.create([UserMessage(content=\"What is the capital of France?\", source=\"user\")])\n",
+    "print(result)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -337,7 +511,11 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "vscode": {
+     "languageId": "shellscript"
+    }
+   },
    "outputs": [],
    "source": [
     "# pip install -U \"autogen-ext[openai, diskcache]\""