From 18cb8d0e907a1ab19d556202213cadd9effd97cb Mon Sep 17 00:00:00 2001
From: Rui Ji
Date: Fri, 21 Jul 2023 15:35:07 +0800
Subject: [PATCH] DOC: Adding Examples to documentation (#196)

---
 doc/source/getting_started/index.rst          |  2 +-
 doc/source/reference/index.rst                |  3 +-
 doc/source/user_guide/examples/AI_Podcast.rst | 77 +++++++++++++++++++
 doc/source/user_guide/examples/chat.rst       | 31 ++++++++
 doc/source/user_guide/faq.rst                 |  3 +
 doc/source/user_guide/index.rst               | 21 ++++-
 6 files changed, 134 insertions(+), 3 deletions(-)
 create mode 100644 doc/source/user_guide/examples/AI_Podcast.rst
 create mode 100644 doc/source/user_guide/examples/chat.rst
 create mode 100644 doc/source/user_guide/faq.rst

diff --git a/doc/source/getting_started/index.rst b/doc/source/getting_started/index.rst
index 032ccb04e1..cf0d570146 100644
--- a/doc/source/getting_started/index.rst
+++ b/doc/source/getting_started/index.rst
@@ -1,6 +1,6 @@
 .. _getting_started_index:
 
 ===============
-Getting Started
+Getting Started 🚀
 ===============
 
diff --git a/doc/source/reference/index.rst b/doc/source/reference/index.rst
index 0ca640ae2b..77f0658654 100644
--- a/doc/source/reference/index.rst
+++ b/doc/source/reference/index.rst
@@ -1,5 +1,6 @@
 .. _reference_index:
 
 =============
-API Reference
+API Reference 📋
 =============
+
diff --git a/doc/source/user_guide/examples/AI_Podcast.rst b/doc/source/user_guide/examples/AI_Podcast.rst
new file mode 100644
index 0000000000..f19ae76931
--- /dev/null
+++ b/doc/source/user_guide/examples/AI_Podcast.rst
@@ -0,0 +1,77 @@
+=================
+**AI_Podcast** 🎙
+=================
+
+🌟 **Description**:
+
+🎙️AI Podcast - Voice Conversations with Multiple Agents on M2 Max 💻
+
+🌟 **Supported Languages**:
+
+English (AI_Podcast.py)
+
+Chinese (AI_Podcast_ZH.py)
+
+🌟 **Technology Used (EN version)**:
+
+  @ `OpenAI `_ 's `whisper `_
+
+  @ `ggerganov `_ 's `ggml `_
+
+  @ `WizardLM_AI `_ 's `wizardlm v1.0 `_
+
+  @ `lmsysorg `_ 's `vicuna v1.3 `_
+
+  @ `Xorbitsio inference `_ as a launcher
+
+🌟 **Detailed Explanation of the Demo Functionality**:
+
+1. Launch the WizardLM and Vicuna models with Xorbits Inference when the program starts.
+   Initiate the chatroom by giving the two chatbots their names and telling them that there is a human user
+   called "username", where "username" comes from the user's input. Initialize an empty chat history for the
+   chatroom.
+
+2. Use the audio device to record speech into a file, and transcribe the file with OpenAI's Whisper to obtain
+   the input as a human-readable string.
+
+3. Based on the input string, determine which agent the user wants to talk to. Call the target agent, passing
+   in the input string and the chat history for the model to generate from.
+
+4. When a response is ready, use macOS's ``say`` command to produce audio through the speaker. Each agent has
+   its own voice while speaking.
+
+5. Store the user input and the agent response in the chat history, and loop the program until the user
+   explicitly says words like "see you".
+
+🌟 **Highlight Features with Xinference**:
+
+1. With Xinference's distributed system, we can easily deploy two different models in the same session and in the
+   same "chatroom".
+   With enough resources, the framework can deploy as many models as you like at the same time.
+
+2. With Xinference, you can deploy a model by adding just a few lines of code.
+   For example, to launch the Vicuna model in the demo::
+
+       args = parser.parse_args()
+       endpoint = args.endpoint
+       client = Client(endpoint)
+
+       model_a = "vicuna-v1.3"
+       model_a_uid = client.launch_model(
+           model_name=model_a,
+           model_format="ggmlv3",
+           model_size_in_billions=7,
+           quantization="q4_0",
+           n_ctx=2048,
+       )
+       model_a_ref = client.get_model(model_a_uid)
+
+   Then the Xinference client will handle downloading and caching the target model, setting up the environment
+   and process for the model, and running the service at the selected endpoint. You are now ready to play with
+   your LLM.
+
+🌟 **Original Demo Video**:
+
+  * `🎙️AI Podcast - Voice Conversations with Multiple Agents on M2 Max💻🔥🤖 `_
+
+🌟 **Source Code**:
+
+  * `AI_Podcast `_ (English Version)
+
+  * AI_Podcast_ZH (Chinese Version)
\ No newline at end of file
diff --git a/doc/source/user_guide/examples/chat.rst b/doc/source/user_guide/examples/chat.rst
new file mode 100644
index 0000000000..d9c4ca3327
--- /dev/null
+++ b/doc/source/user_guide/examples/chat.rst
@@ -0,0 +1,31 @@
+============
+**chat** 🤖️
+============
+
+🌟 **Description**:
+
+Demonstrates how to interact with Xinference to use the LLM chat functionality with an AI agent 💻
+
+🌟 **Technology Used**:
+
+  @ `ggerganov `_ 's `ggml `_
+
+  @ `Xorbitsio inference `_ as a launcher
+
+  @ All LLaMA and ChatGLM models supported by `Xorbitsio inference `_
+
+🌟 **Detailed Explanation of the Demo Functionality**:
+
+1. Take the user's command-line input in the terminal and grab the parameters required for launching the model.
+
+2. Launch the Xinference framework and automatically deploy the requested model into the cluster.
+
+3. Initialize an empty chat history to store all the context of the chatroom.
+
+4. Repeatedly ask for the user's input as a prompt and let the model generate a response based on the prompt and
+   the chat history. Show the response in the terminal.
+
+5. Store the user's input and the agent's response in the chat history as context for the upcoming rounds.
+
+🌟 **Source Code**:
+
+  * `chat `_
\ No newline at end of file
diff --git a/doc/source/user_guide/faq.rst b/doc/source/user_guide/faq.rst
new file mode 100644
index 0000000000..b6381fc527
--- /dev/null
+++ b/doc/source/user_guide/faq.rst
@@ -0,0 +1,3 @@
+==========
+**FAQ** 📚
+==========
\ No newline at end of file
diff --git a/doc/source/user_guide/index.rst b/doc/source/user_guide/index.rst
index ad6e9a0eb6..50b063a33f 100644
--- a/doc/source/user_guide/index.rst
+++ b/doc/source/user_guide/index.rst
@@ -1,6 +1,25 @@
 .. _user_guide_index:
 
 ==========
-User Guide
+User Guide 📒
 ==========
 
+With Xinference, you can unlock the full potential of your data and leverage its capabilities in diverse scenarios.
+Whether you are working with complex datasets, conducting research, or developing innovative projects,
+Xinference provides the flexibility and versatility to meet your unique requirements.
+
+In this comprehensive guide, we will walk you through the process of utilizing Xinference effectively.
+You will discover the various features and functionalities it offers, allowing you to harness the
+power of advanced inference techniques and make informed decisions based on reliable and accurate results.
+
+Additionally, we understand that you may have questions along the way. We have compiled a list of
+frequently asked questions to address common queries and provide you with the necessary insights to
+maximize your experience with Xinference.
+
+.. toctree::
+   :maxdepth: 2
+   :hidden:
+
+   examples/AI_Podcast
+   examples/chat
+   faq
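The five numbered steps in the ``chat.rst`` page added by this patch can be sketched as a small Python program. This is a minimal illustration, not the demo's actual source: the ``Client``/``launch_model``/``get_model`` calls mirror the snippet quoted in ``AI_Podcast.rst``, while the ``model.chat(...)`` signature, the response shape, and the ``build_history`` helper are assumptions for the sake of the sketch.

```python
import argparse


def build_history(history, user_input, assistant_reply):
    """Step 5: append one user/assistant round to the chat history.

    Hypothetical helper; the real demo may store history differently.
    """
    return history + [
        {"role": "user", "content": user_input},
        {"role": "assistant", "content": assistant_reply},
    ]


def main():
    # Assumed import path, matching the Client used in the patch's snippet.
    from xinference.client import Client

    # Step 1: grab the launch parameters from the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("--endpoint", required=True)
    parser.add_argument("--model-name", default="vicuna-v1.3")
    args = parser.parse_args()

    # Step 2: deploy the requested model into the cluster.
    client = Client(args.endpoint)
    uid = client.launch_model(
        model_name=args.model_name,
        model_format="ggmlv3",
        model_size_in_billions=7,
        quantization="q4_0",
    )
    model = client.get_model(uid)

    # Step 3: an empty chat history holds the chatroom context.
    history = []

    # Step 4: repeatedly prompt the user and show the model's response.
    while True:
        prompt = input("you> ")
        if prompt.strip().lower() in {"exit", "quit"}:
            break
        # Assumed chat signature and OpenAI-style response shape.
        response = model.chat(prompt, chat_history=history)
        reply = response["choices"][0]["message"]["content"]
        print("agent>", reply)

        # Step 5: store the round as context for the upcoming turns.
        history = build_history(history, prompt, reply)


if __name__ == "__main__":
    main()
```

Keeping the history update in a pure helper makes the loop easy to follow: each round only appends one user message and one assistant message, so the full chatroom context grows linearly and can be passed to the model unchanged on every turn.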