Allow truncation when embedding #14493


Open
huydt84 wants to merge 4 commits into master
Conversation

huydt84 (Collaborator) commented Jul 2, 2025

It sometimes frustrates me that llama-server automatically stops when slot.n_ctx is smaller than the input token length in an embedding task. I would like it to be able to truncate the input tokens instead, as an option.
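For illustration, here is a minimal, self-contained sketch of the idea: if truncation is allowed, clamp the tokenized input to the slot's context size instead of rejecting the request. The helper name truncate_embedding_input and the standalone setup are hypothetical and not the actual llama-server code in this PR.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Token id type, as used in llama.cpp.
using llama_token = int32_t;

// Hypothetical helper: keep at most n_ctx tokens so the embedding slot can
// process the input instead of aborting when it is too long.
static std::vector<llama_token> truncate_embedding_input(
        const std::vector<llama_token> & tokens,
        size_t n_ctx,
        bool allow_truncation) {
    if (tokens.size() <= n_ctx || !allow_truncation) {
        // Either the input already fits, or truncation is disabled
        // (in which case the server would still report an error).
        return tokens;
    }
    // Keep only the first n_ctx tokens.
    return std::vector<llama_token>(tokens.begin(), tokens.begin() + n_ctx);
}

int main() {
    std::vector<llama_token> tokens(600, 1); // pretend input of 600 tokens
    const size_t n_ctx = 512;                // slot context size

    auto truncated = truncate_embedding_input(tokens, n_ctx, /*allow_truncation=*/true);
    std::printf("input: %zu tokens, after truncation: %zu tokens\n",
                tokens.size(), truncated.size());
    return 0;
}
```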

huydt84 requested a review from ngxson as a code owner on July 2, 2025 at 04:31
huydt84 (Collaborator, Author) commented Jul 4, 2025

@ngxson Please check
