ENH: optimize error msg for foundation models (#153)
UranusSeven authored Jul 11, 2023
1 parent cfa57d3 commit d510964
Showing 2 changed files with 13 additions and 12 deletions.
19 changes: 11 additions & 8 deletions README.md
@@ -158,17 +158,20 @@ To view the builtin models, run the following command:
$ xinference list --all
```

| Name | Format | Size (in billions) | Quantization |
| -------------------- | ------- | ------------------ |--------------------------------------------------------------------------------------------------------------------------------|
| baichuan | ggmlv3 | [7] | ['q2_K', 'q3_K_L', 'q3_K_M', 'q3_K_S', 'q4_0', 'q4_1', 'q4_K_M', 'q4_K_S', 'q5_0', 'q5_1', 'q5_K_M', 'q5_K_S', 'q6_K', 'q8_0'] |
| wizardlm-v1.0 | ggmlv3 | [7, 13, 33] | ['q2_K', 'q3_K_L', 'q3_K_M', 'q3_K_S', 'q4_0', 'q4_1', 'q4_K_M', 'q4_K_S', 'q5_0', 'q5_1', 'q5_K_M', 'q5_K_S', 'q6_K', 'q8_0'] |
| vicuna-v1.3 | ggmlv3 | [7, 13] | ['q2_K', 'q3_K_L', 'q3_K_M', 'q3_K_S', 'q4_0', 'q4_1', 'q4_K_M', 'q4_K_S', 'q5_0', 'q5_1', 'q5_K_M', 'q5_K_S', 'q6_K', 'q8_0'] |
| orca | ggmlv3 | [3, 7, 13] | ['q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0'] |
| chatglm | ggmlv3 | [6] | ['q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0'] |
| chatglm2 | ggmlv3 | [6] | ['q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0'] |
| Name | Type | Language | Format | Size (in billions) | Quantization |
| -------------------- |------------------|----------|--------|--------------------|----------------------------------------|
| baichuan | Foundation Model | en, zh | ggmlv3 | 7 | 'q2_K', 'q3_K_L', ... , 'q6_K', 'q8_0' |
| chatglm | SFT Model | en, zh | ggmlv3 | 6 | 'q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0' |
| chatglm2 | SFT Model | en, zh | ggmlv3 | 6 | 'q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0' |
| wizardlm-v1.0 | SFT Model | en | ggmlv3 | 7, 13, 33 | 'q2_K', 'q3_K_L', ... , 'q6_K', 'q8_0' |
| vicuna-v1.3 | SFT Model | en | ggmlv3 | 7, 13 | 'q2_K', 'q3_K_L', ... , 'q6_K', 'q8_0' |
| orca | SFT Model | en | ggmlv3 | 3, 7, 13 | 'q4_0', 'q4_1', 'q5_0', 'q5_1', 'q8_0' |
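
To show how the table's columns map onto launch parameters, here is a minimal Python sketch. The client class, import path, and keyword names (`Client`, `launch_model`, `model_size_in_billions`, and so on) are assumptions based on later versions of the API and may differ at this commit.

```python
# Hypothetical sketch: launching a builtin model from the table above.
# The client API and parameter names are assumptions and may differ by version.
from xinference.client import Client

client = Client("http://localhost:9997")  # assumes a locally running Xinference endpoint

# Each keyword mirrors a column of the table: Name, Format,
# Size (in billions), and Quantization.
model_uid = client.launch_model(
    model_name="vicuna-v1.3",
    model_format="ggmlv3",
    model_size_in_billions=7,
    quantization="q4_0",
)
model = client.get_model(model_uid)
```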


**NOTE**:
- Xinference downloads models automatically; by default they are saved under `${USER}/.xinference/cache`.
- Foundation models only provide the `generate` interface (see the sketch below).
- SFT models provide both the `generate` and `chat` interfaces.
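
The note above describes the behavior this commit improves the error message for: calling `chat` on a foundation model fails. A hedged sketch of what that looks like from the client side follows; the handle API is assumed, and depending on the transport the server-side `AttributeError` may surface as a different exception type.

```python
# Hypothetical sketch; client/handle API names are assumptions, and the server-side
# AttributeError may surface as a different exception type over the wire.
from xinference.client import Client

client = Client("http://localhost:9997")
uid = client.launch_model(
    model_name="baichuan",        # a Foundation Model per the table above
    model_format="ggmlv3",
    model_size_in_billions=7,
    quantization="q4_0",
)
model = client.get_model(uid)

print(model.generate("Once upon a time"))  # foundation models support generate

try:
    model.chat("Hello!")                   # chat is not provided by foundation models
except Exception as err:
    # With this commit, the error names the model spec rather than just "chat".
    print(err)
```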

## Roadmap
Xinference is currently under active development. Here's a roadmap outlining our planned
6 changes: 2 additions & 4 deletions xinference/core/model.py
@@ -77,18 +77,16 @@ async def _wrap_generator(self, ret: Any):
        return ret

    async def generate(self, prompt: str, *args, **kwargs):
        logger.warning("Generate, self address: %s", self.address)

        if not hasattr(self._model, "generate"):
            raise AttributeError("generate")
            raise AttributeError(f"Model {self._model.model_spec} is not for generate.")

        return self._wrap_generator(
            getattr(self._model, "generate")(prompt, *args, **kwargs)
        )

    async def chat(self, prompt: str, *args, **kwargs):
        if not hasattr(self._model, "chat"):
            raise AttributeError("chat")
            raise AttributeError(f"Model {self._model.model_spec} is not for chat.")

        return self._wrap_generator(
            getattr(self._model, "chat")(prompt, *args, **kwargs)
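
For illustration, a self-contained sketch of the capability check used in the diff: `hasattr` decides whether the wrapped model exposes an interface, and the new message embeds the model spec so the failing model is obvious. `DummyFoundationModel` and its `model_spec` string are stand-ins invented for this example, not actual Xinference classes.

```python
# Stand-alone illustration of the hasattr-based capability check in the diff.
# DummyFoundationModel and its model_spec are invented stand-ins for this example.
class DummyFoundationModel:
    model_spec = "baichuan-7b-ggmlv3-q4_0"

    def generate(self, prompt: str) -> str:
        return f"completion for: {prompt}"
    # Note: no chat() method, just like a foundation model.


def chat(model, prompt: str) -> str:
    if not hasattr(model, "chat"):
        # Same pattern as the diff: name the model spec in the error message.
        raise AttributeError(f"Model {model.model_spec} is not for chat.")
    return model.chat(prompt)


if __name__ == "__main__":
    m = DummyFoundationModel()
    print(m.generate("Hello"))
    try:
        chat(m, "Hello")
    except AttributeError as err:
        print(err)  # -> Model baichuan-7b-ggmlv3-q4_0 is not for chat.
```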
