Releases: xorbitsai/inference
v0.5.5
What's new in 0.5.5 (2023-10-26)
These are the changes in inference v0.5.5.
Enhancements
- ENH: display language tags by @Minamiyama in #558
- ENH: filter models by type by @Minamiyama in #559
- ENH: disable create embeddings using LLMs by @UranusSeven in #570
- ENH: benchmark latency by @UranusSeven in #576
- ENH: configurable XINFERENCE_HOME env by @ChengjieLi28 in #566
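The configurable XINFERENCE_HOME variable relocates where Xinference keeps its cached models and related data. A minimal sketch, assuming the variable only needs to be exported before the service starts (the path is illustrative):

```shell
# Illustrative path; use any writable directory.
export XINFERENCE_HOME=/data/xinference
# Then start the service as usual, for example:
#   xinference-local --host 0.0.0.0 --port 9997
echo "$XINFERENCE_HOME"
```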
Bug fixes
- BUG: Fix bge-base-zh and bge-large-zh from ModelScope by @ChengjieLi28 in #571
- BUG: When changing the model revision, xinference still uses the previous model by @ChengjieLi28 in #573
- BUG: incorrect vLLM config by @UranusSeven in #579
- BUG: fix llama-2 stop words by @UranusSeven in #580
Documentation
- DOC: Incompatibility Between NVIDIA Driver and PyTorch Version by @onesuper in #551
- DOC: Examples and resources page by @onesuper in #561
Full Changelog: v0.5.4...v0.5.5
v0.5.4
What's new in 0.5.4 (2023-10-20)
These are the changes in inference v0.5.4.
New features
- FEAT: wizardcoder python by @UranusSeven in #539
- FEAT: Support grammar-based sampling for ggml models by @aresnow1 in #525
- FEAT: speculative decoding by @UranusSeven in #509
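Grammar-based sampling (#525) constrains a ggml model's output so that every generated token sequence matches a formal grammar. llama.cpp-style backends commonly express such grammars in GBNF; a minimal sketch of one, forcing the model to answer only "yes" or "no" (the rule below is illustrative, not taken from the PR):

```
root ::= ("yes" | "no")
```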
Enhancements
- ENH: Download embedding models from ModelScope by @ChengjieLi28 in #532
- ENH: lock transformers version by @UranusSeven in #549
- ENH: Support downloading code-llama family models from ModelScope by @ChengjieLi28 in #557
- ENH: Add gguf format of codellama-instruct by @aresnow1 in #567
Bug fixes
- BUG: Fix stream not compatible with openai by @codingl2k1 in #524
- BUG: set trust_remote_code to true by default by @richzw in #555
- BUG: add quantization to valid file name by @richzw in #562
- BUG: remove "generate" ability from Baichuan-2-chat json config by @Minamiyama in #556
Documentation
- DOC: update pot files by @UranusSeven in #538
- DOC: Add Client API reference by @codingl2k1 in #543
- DOC: Add client doc to the user guide by @codingl2k1 in #547
New Contributors
- @richzw made their first contribution in #555
- @Minamiyama made their first contribution in #556
Full Changelog: v0.5.3...v0.5.4
v0.5.3
What's new in 0.5.3 (2023-10-13)
These are the changes in inference v0.5.3.
New features
- FEAT: Add BAAI/BGE v1.5 family models by @ChengjieLi28 in #522
- FEAT: Support Mistral & Mistral-Instruct by @Bojun-Feng in #510
- FEAT: Add --model-uid to launch sub command by @codingl2k1 in #529
- FEAT: Support stable diffusion by @codingl2k1 in #484
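The new --model-uid option for the launch sub command (#529) lets the caller pin a stable UID instead of receiving an auto-generated one, so client code can keep addressing the model by a fixed name across relaunches. A hedged sketch (model name and UID are illustrative, and the launch command is shown commented out since it requires a running server):

```shell
# Choose a stable UID up front (illustrative values):
MODEL_UID="my-llama"
# xinference launch --model-name "llama-2-chat" --model-uid "$MODEL_UID"
echo "$MODEL_UID"
```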
Enhancements
- REF: Use restful client as default client by @aresnow1 in #470
- REF: refactor client codes for xinference-client by @ChengjieLi28 in #528
Tests
- TST: fix tiny llama by @UranusSeven in #513
Documentation
- DOC: hardware specific installations by @UranusSeven in #517
- DOC: update installation by @UranusSeven in #527
Full Changelog: v0.5.2...v0.5.3
v0.5.2
What's new in 0.5.2 (2023-09-27)
These are the changes in inference v0.5.2.
Enhancements
- ENH: validate model URI on register by @UranusSeven in #476
- ENH: Skip download for embedding models by @aresnow1 in #499
- ENH: set trust_remote_code to true by @UranusSeven in #500
Full Changelog: v0.5.1...v0.5.2
v0.5.1
What's new in 0.5.1 (2023-09-26)
These are the changes in inference v0.5.1.
Enhancements
- ENH: Safe iterate stream of ggml model by @codingl2k1 in #449
- ENH: Skip download if model exists by @aresnow1 in #495
Documentation
- DOC: vLLM by @UranusSeven in #491
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's new in 0.5.0 (2023-09-22)
These are the changes in inference v0.5.0.
New features
- FEAT: incorporate vLLM by @UranusSeven in #445
- FEAT: add register model page for dashboard by @Bojun-Feng in #420
- FEAT: internlm 20b by @UranusSeven in #486
- FEAT: support glaive coder by @UranusSeven in #490
- FEAT: Support download models from modelscope by @aresnow1 in #475
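Downloading models from ModelScope (#475) is useful when Hugging Face is slow or unreachable. Assuming the source is selected through the XINFERENCE_MODEL_SRC environment variable (the variable name follows later Xinference documentation and is an assumption here), a sketch:

```shell
# Prefer ModelScope as the weight source (assumed variable name):
export XINFERENCE_MODEL_SRC=modelscope
# Then launch as usual, for example:
#   xinference launch --model-name "baichuan-2-chat"
echo "$XINFERENCE_MODEL_SRC"
```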
Enhancements
- ENH: shorten OpenBuddy's desc by @UranusSeven in #471
- ENH: enable vLLM on Linux with cuda by @UranusSeven in #472
- ENH: vLLM engine supports more models by @UranusSeven in #477
- ENH: remove subpool on failure by @UranusSeven in #478
- ENH: support trust_remote_code when launching a model by @UranusSeven in #479
- ENH: vLLM auto tensor parallel by @UranusSeven in #480
Bug fixes
- BUG: llama-cpp version mismatch by @Bojun-Feng in #473
- BUG: incorrect endpoint on host 0.0.0.0 by @UranusSeven in #474
- BUG: prompt style not set as expected on web UI by @UranusSeven in #489
Full Changelog: v0.4.4...v0.5.0
v0.4.4
What's new in 0.4.4 (2023-09-19)
These are the changes in inference v0.4.4.
Bug fixes
- BUG: stop auto download from self-hosted storage for locale zh_CN by @UranusSeven in #465
Full Changelog: v0.4.3...v0.4.4
v0.4.3
v0.4.2
What's new in 0.4.2 (2023-09-15)
These are the changes in inference v0.4.2.
New features
- FEAT: concurrent generation by @codingl2k1 in #417
- FEAT: Support gguf by @aresnow1 in #446
- FEAT: Support OpenBuddy by @codingl2k1 in #444
Enhancements
- ENH: client support desc model by @UranusSeven in #442
- ENH: caching from self-hosted storage by @UranusSeven in #419
- ENH: Assign worker sub pool at runtime instead of pre-allocated by @ChengjieLi28 in #437
- ENH: add benchmark script by @UranusSeven in #451
Bug fixes
- BUG: Fix restful client for embedding models by @aresnow1 in #439
- BUG: cmdline double line breaker by @UranusSeven in #441
- BUG: no error raised on unsupported fmt by @UranusSeven in #443
- BUG: Xinference list failed if embedding models are launched by @aresnow1 in #452
Tests
- TST: skip self-hosted storage tests by @UranusSeven in #453
Documentation
- DOC: fix baichuan-2 and make naming consistent by @UranusSeven in #432
- DOC: update hot topics by @UranusSeven in #456
Others
- CI: Fix Windows CI by @codingl2k1 in #440
New Contributors
- @ChengjieLi28 made their first contribution in #437
Full Changelog: v0.4.1...v0.4.2
v0.4.1
What's new in 0.4.1 (2023-09-07)
These are the changes in inference v0.4.1.
Bug fixes
- BUG: Searching in UI results in white screen by @Bojun-Feng in #431
- BUG: Include json in MANIFEST.in by @aresnow1 in #435
Full Changelog: v0.4.0...v0.4.1