Commit 7c33db5 (parent 028b94c)

Add gemma3 (#3272)

* Add gemma3 text model
* Add gemma vl
* update doc
* add tp
* fix doc
* readmes

File tree: 18 files changed, +1112 −15 lines

README.md (+1)

```diff
@@ -171,6 +171,7 @@ LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

README_ja.md (+1)

```diff
@@ -169,6 +169,7 @@ LMDeploy TurboMindエンジンは卓越した推論能力を持ち、さまざ
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

README_zh-CN.md (+1)

```diff
@@ -173,6 +173,7 @@ LMDeploy TurboMind 引擎拥有卓越的推理能力,在各种规模的模型
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

docs/en/multi_modal/gemma3.md (new file, +30)

# Gemma3

## Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained and instruction-tuned variants. Gemma 3 has a large 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well suited to a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

## Quick Start

Install LMDeploy by following the [installation guide](../get_started/installation.md).

### Prepare

When deploying the **Gemma3** model with LMDeploy, please install the latest version of transformers.

### Offline inference pipeline

The following sample code shows the basic usage of the VLM pipeline. For more examples, please refer to [VLM Offline Inference Pipeline](./vl_pipeline.md).

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

if __name__ == "__main__":
    pipe = pipeline('google/gemma-3-12b-it')

    image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
    response = pipe(('describe this image', image))
    print(response)
```
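Beyond offline inference, the same model can typically be served behind LMDeploy's OpenAI-compatible API server. A minimal sketch, assuming lmdeploy is installed and sufficient GPU memory is available; the `--tp 2` value is an assumption for a two-GPU setup, not part of this commit:

```shell
# Sketch: serve Gemma3 behind LMDeploy's OpenAI-compatible API server.
# "--tp 2" (tensor parallelism across two GPUs) is an assumption; drop it
# for a single-GPU deployment.
lmdeploy serve api_server google/gemma-3-12b-it --tp 2
```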

docs/en/multi_modal/index.rst (+1)

```diff
@@ -16,3 +16,4 @@ Vision-Language Models
    qwen2_vl.md
    qwen2_5_vl.md
    molmo.md
+   gemma3.md
```

docs/en/supported_models/supported_models.md (+1)

```diff
@@ -99,6 +99,7 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
 | Mono-InternVL<sup>\[1\]</sup> | 2B | MLLM | Yes | Yes | Yes | - | - |
 | ChemVLM | 8B-26B | MLLM | Yes | Yes | No | - | - |
 | Gemma2 | 9B-27B | LLM | Yes | Yes | Yes | - | - |
+| Gemma3 | 1B-27B | MLLM | Yes | Yes | Yes | - | - |
 | GLM4 | 9B | LLM | Yes | Yes | Yes | No | No |
 | GLM-4V | 9B | MLLM | Yes | Yes | Yes | No | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - | - |
```

docs/zh_cn/multi_modal/gemma3.md (new file, +30)

# Gemma3

## Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained and instruction-tuned variants. Gemma3 has a large 128K context window, supports over 140 languages, and is available in more sizes than previous versions. Gemma3 models are well suited to a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in resource-limited environments such as laptops, desktops, or your own cloud infrastructure, giving everyone easy access to state-of-the-art AI models and helping foster innovation.

## Quick Start

Install LMDeploy by following the [installation guide](../get_started/installation.md).

### Prepare

When deploying the **Gemma3** model with LMDeploy, please install the latest version of transformers.

### Offline inference pipeline

The following is an example of offline inference with the pipeline. For more usage, refer to [VLM Offline Inference Pipeline](./vl_pipeline.md).

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

if __name__ == "__main__":
    pipe = pipeline('google/gemma-3-12b-it')

    image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
    response = pipe(('describe this image', image))
    print(response)
```

docs/zh_cn/multi_modal/index.rst (+1)

```diff
@@ -16,3 +16,4 @@
    qwen2_vl.md
    qwen2_5_vl.md
    molmo.md
+   gemma3.md
```

docs/zh_cn/supported_models/supported_models.md (+1)

```diff
@@ -99,6 +99,7 @@
 | Mono-InternVL<sup>\[1\]</sup> | 2B | MLLM | Yes\* | Yes | Yes | - | - |
 | ChemVLM | 8B-26B | MLLM | Yes | Yes | No | - | - |
 | Gemma2 | 9B-27B | LLM | Yes | Yes | Yes | - | - |
+| Gemma3 | 1B-27B | MLLM | Yes | Yes | Yes | - | - |
 | GLM4 | 9B | LLM | Yes | Yes | Yes | No | No |
 | GLM-4V | 9B | MLLM | Yes | Yes | Yes | No | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - | - |
```

lmdeploy/archs.py (+1 −1)

```diff
@@ -120,7 +120,7 @@ def check_vl_llm(config: dict) -> bool:
         'InternVLChatModel', 'MiniGeminiLlamaForCausalLM', 'MGMLlamaForCausalLM', 'MiniCPMV',
         'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'Phi3VForCausalLM',
         'Qwen2VLForConditionalGeneration', 'Qwen2_5_VLForConditionalGeneration', 'MllamaForConditionalGeneration',
-        'MolmoForCausalLM'
+        'MolmoForCausalLM', 'Gemma3ForConditionalGeneration'
     ])
     if arch == 'QWenLMHeadModel' and 'visual' in config:
         return True
```
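The change above adds `Gemma3ForConditionalGeneration` to the set of architectures LMDeploy recognizes as vision-language models. The detection idea can be sketched standalone; the helper name and the reduced architecture set below are illustrative, not LMDeploy's actual API:

```python
# Minimal sketch of architecture-based VLM detection, mirroring the idea in
# lmdeploy/archs.py. The set here is a small illustrative subset.
VL_ARCHS = frozenset({
    'LlavaForConditionalGeneration',
    'MolmoForCausalLM',
    'Gemma3ForConditionalGeneration',  # newly recognized by this commit
})

def is_vl_llm(config: dict) -> bool:
    """Return True if the HF config's first listed architecture is a known VLM."""
    arch = (config.get('architectures') or [None])[0]
    return arch in VL_ARCHS

print(is_vl_llm({'architectures': ['Gemma3ForConditionalGeneration']}))  # True
print(is_vl_llm({'architectures': ['LlamaForCausalLM']}))                # False
```

Dispatching on `config['architectures'][0]` rather than `model_type` lets the check distinguish the multimodal Gemma3 wrapper from its text-only backbone, which shares a model family name.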

lmdeploy/pytorch/configurations/gemma.py (+18 −1)

```diff
@@ -8,11 +8,28 @@ class GemmaModelConfigBuilder(AutoModelConfigBuilder):
     @classmethod
     def condition(cls, hf_config):
         """config."""
-        return hf_config.model_type in ['gemma', 'gemma2']
+        return hf_config.model_type in ['gemma', 'gemma2', 'gemma3_text']
 
     @classmethod
     def build(cls, hf_config, model_path: str = None, **kwargs):
         """build gemma."""
         cfg = DefaultModelConfigBuilder.build(hf_config, model_path, **kwargs)
         cfg.head_dim = hf_config.head_dim
         return cfg
+
+
+class GemmaVLModelConfigBuilder(AutoModelConfigBuilder):
+
+    @classmethod
+    def condition(cls, hf_config):
+        """config."""
+        model_arch = hf_config.architectures[0] if hf_config.architectures else None
+        return model_arch == 'Gemma3ForConditionalGeneration'
+
+    @classmethod
+    def build(cls, hf_config, model_path: str = None, **kwargs):
+        """build gemma."""
+        hf_config.text_config.architectures = ['Gemma3ForCausalLM']
+        cfg = DefaultModelConfigBuilder.build(hf_config.text_config, model_path, **kwargs)
+        cfg.hf_config = hf_config
+        return cfg
```
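The two builders above dispatch on different fields of the HF config: the text builder matches `model_type` (including the new `gemma3_text`), while the new VL builder matches the top-level architecture and then builds from the nested `text_config`. A standalone sketch of that dispatch, using a simplified stand-in config class rather than LMDeploy's or transformers' real types:

```python
# Illustrative sketch of the builder dispatch in
# lmdeploy/pytorch/configurations/gemma.py; FakeHFConfig is a stand-in.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FakeHFConfig:
    """Mirrors only the fields the two condition() methods read."""
    model_type: str = ''
    architectures: Optional[List[str]] = None
    text_config: Optional['FakeHFConfig'] = None

def matches_text_builder(cfg: FakeHFConfig) -> bool:
    # GemmaModelConfigBuilder.condition: dispatch on model_type.
    return cfg.model_type in ['gemma', 'gemma2', 'gemma3_text']

def matches_vl_builder(cfg: FakeHFConfig) -> bool:
    # GemmaVLModelConfigBuilder.condition: dispatch on the top-level architecture.
    arch = cfg.architectures[0] if cfg.architectures else None
    return arch == 'Gemma3ForConditionalGeneration'

text = FakeHFConfig(model_type='gemma3_text')
vl = FakeHFConfig(model_type='gemma3',
                  architectures=['Gemma3ForConditionalGeneration'],
                  text_config=FakeHFConfig(model_type='gemma3_text'))

print(matches_text_builder(text), matches_vl_builder(text))  # True False
print(matches_text_builder(vl), matches_vl_builder(vl))      # False True
```

The VL builder then delegates to the default builder with `hf_config.text_config`, so the language-model portion of the multimodal config is configured exactly like a standalone Gemma3 text model, while the full config is kept on `cfg.hf_config` for the vision side.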
