Flag for minimum speculative decoding probability #2271

@Nabokov86

Description

Thank you for adding MTP support! It works great!

Could you also add the --spec-draft-p-min (minimum speculative decoding probability) flag from llama.cpp? For me, setting it above 0.5 speeds up generation quite a bit.
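
For anyone unfamiliar with what this kind of threshold does, here is a minimal sketch of the idea, not the llama.cpp implementation. It assumes greedy drafting where the draft stops as soon as the draft model's top-token probability drops below the minimum. The names `gen_draft`, `draft_step`, `n_max`, `p_min`, and `toy_draft_step` are all hypothetical and exist only for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical draft-model interface: given the tokens so far, return
# (most likely next token, probability the draft model assigns to it).
DraftStep = Callable[[List[int]], Tuple[int, float]]


def gen_draft(context: List[int], draft_step: DraftStep,
              n_max: int, p_min: float) -> List[int]:
    """Greedily draft up to n_max tokens, stopping at the first low-confidence one."""
    draft: List[int] = []
    tokens = list(context)
    for _ in range(n_max):
        token, prob = draft_step(tokens)
        if prob < p_min:
            # The draft model is unsure here, so the target model would
            # likely reject this token; stop drafting instead of wasting work.
            break
        draft.append(token)
        tokens.append(token)
    return draft


def toy_draft_step(tokens: List[int]) -> Tuple[int, float]:
    """Toy stand-in for a draft model with fixed per-position confidences."""
    probs = [0.9, 0.8, 0.4, 0.95]
    i = min(len(tokens) - 1, len(probs) - 1)
    return len(tokens), probs[i]


print(gen_draft([0], toy_draft_step, n_max=4, p_min=0.5))  # third token (p=0.4) cuts the draft
print(gen_draft([0], toy_draft_step, n_max=4, p_min=0.0))  # no threshold: all four tokens drafted
```

With a higher threshold the draft is shorter but more likely to be accepted in full by the target model, which is one plausible reason a value above 0.5 would help.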
