Commit 7c33db5 (parent 028b94c)

Add gemma3 (#3272)

* Add gemma3 text model
* Add gemma vl
* update doc
* add tp
* fix doc
* readmes

File tree: 18 files changed, +1112 −15 lines

README.md (+1)

```diff
@@ -171,6 +171,7 @@ LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

README_ja.md (+1)

```diff
@@ -169,6 +169,7 @@ LMDeploy TurboMindエンジンは卓越した推論能力を持ち、さまざ
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

README_zh-CN.md (+1)

```diff
@@ -173,6 +173,7 @@ LMDeploy TurboMind 引擎拥有卓越的推理能力,在各种规模的模型
       <li>GLM-4V (9B)</li>
       <li>Llama3.2-vision (11B, 90B)</li>
       <li>Molmo (7B-D,72B)</li>
+      <li>Gemma3 (1B - 27B)</li>
     </ul>
   </td>
 </tr>
```

docs/en/multi_modal/gemma3.md (new file, +30)

# Gemma3

## Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained and instruction-tuned variants. Gemma 3 has a large 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions. Gemma 3 models are well suited to a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as laptops, desktops, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

## Quick Start

Install LMDeploy by following the [installation guide](../get_started/installation.md).

### Prepare

When deploying the **Gemma3** model with LMDeploy, please install the latest version of transformers.

### Offline inference pipeline

The following sample code shows the basic usage of the VLM pipeline. For more examples, please refer to [VLM Offline Inference Pipeline](./vl_pipeline.md).

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

if __name__ == "__main__":
    pipe = pipeline('google/gemma-3-12b-it')

    image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
    response = pipe(('describe this image', image))
    print(response)
```
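Beyond offline inference, the same model can typically be served behind LMDeploy's OpenAI-compatible API server. A minimal sketch, assuming lmdeploy is installed and sufficient GPU memory is available; the `--tp 2` value is an assumption for a two-GPU setup, not part of this commit:

```shell
# Sketch: serve Gemma3 behind LMDeploy's OpenAI-compatible API server.
# "--tp 2" (tensor parallelism across two GPUs) is an assumption; drop it
# for a single-GPU deployment.
lmdeploy serve api_server google/gemma-3-12b-it --tp 2
```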

docs/en/multi_modal/index.rst (+1)

```diff
@@ -16,3 +16,4 @@ Vision-Language Models
    qwen2_vl.md
    qwen2_5_vl.md
    molmo.md
+   gemma3.md
```

docs/en/supported_models/supported_models.md (+1)

```diff
@@ -99,6 +99,7 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
 | Mono-InternVL<sup>\[1\]</sup> | 2B | MLLM | Yes | Yes | Yes | - | - |
 | ChemVLM | 8B-26B | MLLM | Yes | Yes | No | - | - |
 | Gemma2 | 9B-27B | LLM | Yes | Yes | Yes | - | - |
+| Gemma3 | 1B-27B | MLLM | Yes | Yes | Yes | - | - |
 | GLM4 | 9B | LLM | Yes | Yes | Yes | No | No |
 | GLM-4V | 9B | MLLM | Yes | Yes | Yes | No | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - | - |
```

docs/zh_cn/multi_modal/gemma3.md (new file, +30)

# Gemma3

## Introduction

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma3 models are multimodal, handling text and image input and generating text output, with open weights for both pre-trained and instruction-tuned variants. Gemma3 has a large 128K context window, supports over 140 languages, and is available in more sizes than previous versions. Gemma3 models are well suited to a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in resource-limited environments such as laptops, desktops, or your own cloud infrastructure, giving everyone easy access to state-of-the-art AI models and helping foster innovation.

## Quick Start

Install LMDeploy by following the [installation guide](../get_started/installation.md).

### Prepare

When deploying the **Gemma3** model with LMDeploy, please install the latest version of transformers.

### Offline inference pipeline

The following is an example of offline inference with the pipeline. For more usage, refer to [VLM Offline Inference Pipeline](./vl_pipeline.md).

```python
from lmdeploy import pipeline
from lmdeploy.vl import load_image

if __name__ == "__main__":
    pipe = pipeline('google/gemma-3-12b-it')

    image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
    response = pipe(('describe this image', image))
    print(response)
```

docs/zh_cn/multi_modal/index.rst (+1)

```diff
@@ -16,3 +16,4 @@
    qwen2_vl.md
    qwen2_5_vl.md
    molmo.md
+   gemma3.md
```

docs/zh_cn/supported_models/supported_models.md (+1)

```diff
@@ -99,6 +99,7 @@
 | Mono-InternVL<sup>\[1\]</sup> | 2B | MLLM | Yes\* | Yes | Yes | - | - |
 | ChemVLM | 8B-26B | MLLM | Yes | Yes | No | - | - |
 | Gemma2 | 9B-27B | LLM | Yes | Yes | Yes | - | - |
+| Gemma3 | 1B-27B | MLLM | Yes | Yes | Yes | - | - |
 | GLM4 | 9B | LLM | Yes | Yes | Yes | No | No |
 | GLM-4V | 9B | MLLM | Yes | Yes | Yes | No | Yes |
 | CodeGeeX4 | 9B | LLM | Yes | Yes | Yes | - | - |
```

lmdeploy/archs.py (+1 −1)

```diff
@@ -120,7 +120,7 @@ def check_vl_llm(config: dict) -> bool:
         'InternVLChatModel', 'MiniGeminiLlamaForCausalLM', 'MGMLlamaForCausalLM', 'MiniCPMV',
         'LlavaForConditionalGeneration', 'LlavaNextForConditionalGeneration', 'Phi3VForCausalLM',
         'Qwen2VLForConditionalGeneration', 'Qwen2_5_VLForConditionalGeneration', 'MllamaForConditionalGeneration',
-        'MolmoForCausalLM'
+        'MolmoForCausalLM', 'Gemma3ForConditionalGeneration'
     ])
     if arch == 'QWenLMHeadModel' and 'visual' in config:
         return True
```
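The change above adds `Gemma3ForConditionalGeneration` to the set of architectures LMDeploy recognizes as vision-language models. The detection idea can be sketched standalone; the helper name and the reduced architecture set below are illustrative, not LMDeploy's actual API:

```python
# Minimal sketch of architecture-based VLM detection, mirroring the idea in
# lmdeploy/archs.py. The set here is a small illustrative subset.
VL_ARCHS = frozenset({
    'LlavaForConditionalGeneration',
    'MolmoForCausalLM',
    'Gemma3ForConditionalGeneration',  # newly recognized by this commit
})

def is_vl_llm(config: dict) -> bool:
    """Return True if the HF config's first listed architecture is a known VLM."""
    arch = (config.get('architectures') or [None])[0]
    return arch in VL_ARCHS

print(is_vl_llm({'architectures': ['Gemma3ForConditionalGeneration']}))  # True
print(is_vl_llm({'architectures': ['LlamaForCausalLM']}))                # False
```

Dispatching on `config['architectures'][0]` rather than `model_type` lets the check distinguish the multimodal Gemma3 wrapper from its text-only backbone, which shares a model family name.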

lmdeploy/pytorch/configurations/gemma.py (+18 −1)

```diff
@@ -8,11 +8,28 @@ class GemmaModelConfigBuilder(AutoModelConfigBuilder):
     @classmethod
     def condition(cls, hf_config):
         """config."""
-        return hf_config.model_type in ['gemma', 'gemma2']
+        return hf_config.model_type in ['gemma', 'gemma2', 'gemma3_text']
 
     @classmethod
     def build(cls, hf_config, model_path: str = None, **kwargs):
         """build gemma."""
         cfg = DefaultModelConfigBuilder.build(hf_config, model_path, **kwargs)
         cfg.head_dim = hf_config.head_dim
         return cfg
+
+
+class GemmaVLModelConfigBuilder(AutoModelConfigBuilder):
+
+    @classmethod
+    def condition(cls, hf_config):
+        """config."""
+        model_arch = hf_config.architectures[0] if hf_config.architectures else None
+        return model_arch == 'Gemma3ForConditionalGeneration'
+
+    @classmethod
+    def build(cls, hf_config, model_path: str = None, **kwargs):
+        """build gemma."""
+        hf_config.text_config.architectures = ['Gemma3ForCausalLM']
+        cfg = DefaultModelConfigBuilder.build(hf_config.text_config, model_path, **kwargs)
+        cfg.hf_config = hf_config
+        return cfg
```
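The two builders above dispatch on different fields of the HF config: the text builder matches `model_type` (including the new `gemma3_text`), while the new VL builder matches the top-level architecture and then builds from the nested `text_config`. A standalone sketch of that dispatch, using a simplified stand-in config class rather than LMDeploy's or transformers' real types:

```python
# Illustrative sketch of the builder dispatch in
# lmdeploy/pytorch/configurations/gemma.py; FakeHFConfig is a stand-in.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FakeHFConfig:
    """Mirrors only the fields the two condition() methods read."""
    model_type: str = ''
    architectures: Optional[List[str]] = None
    text_config: Optional['FakeHFConfig'] = None

def matches_text_builder(cfg: FakeHFConfig) -> bool:
    # GemmaModelConfigBuilder.condition: dispatch on model_type.
    return cfg.model_type in ['gemma', 'gemma2', 'gemma3_text']

def matches_vl_builder(cfg: FakeHFConfig) -> bool:
    # GemmaVLModelConfigBuilder.condition: dispatch on the top-level architecture.
    arch = cfg.architectures[0] if cfg.architectures else None
    return arch == 'Gemma3ForConditionalGeneration'

text = FakeHFConfig(model_type='gemma3_text')
vl = FakeHFConfig(model_type='gemma3',
                  architectures=['Gemma3ForConditionalGeneration'],
                  text_config=FakeHFConfig(model_type='gemma3_text'))

print(matches_text_builder(text), matches_vl_builder(text))  # True False
print(matches_text_builder(vl), matches_vl_builder(vl))      # False True
```

The VL builder then delegates to the default builder with `hf_config.text_config`, so the language-model portion of the multimodal config is configured exactly like a standalone Gemma3 text model, while the full config is kept on `cfg.hf_config` for the vision side.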
