Replies: 1 comment
-
I did not see a way to set truncate_prompt_tokens at server start time. I would consider submitting a feature request, or modifying the code myself for my container.
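For context, here is a minimal sketch of where the parameter does exist today: as a per-request field on SamplingParams in the offline Python API, rather than as a launch flag. This is only an illustration; the model name, tensor_parallel_size and token limit are placeholders, not anything from your setup.

```python
# Sketch (assumption): truncate_prompt_tokens is a per-request SamplingParams
# field in vLLM's offline Python API, not a server start-up option.
from vllm import LLM, SamplingParams

# Placeholder model / parallelism values for illustration only.
llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct", tensor_parallel_size=4)

params = SamplingParams(
    max_tokens=256,
    truncate_prompt_tokens=8000,  # keep only the last 8000 prompt tokens
)

outputs = llm.generate(["A very long prompt ..."], params)
print(outputs[0].outputs[0].text)
```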
-
Hi,
I am using the following docker-compose for deploying my Llama 3.1 70B model with vLLM:
I would like the backend to truncate any incoming prompt that overflows the context window, e.g. by being able to use `truncate_prompt_tokens` from SamplingParams. I am using this with Open-WebUI, so truncating on the front end is not an option for me. Thank you!
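To illustrate the behaviour I am after, here is a rough sketch of sending `truncate_prompt_tokens` per request to the OpenAI-compatible endpoint via `extra_body`. This assumes the server accepts that field on chat completion requests; the base URL, model name and token limit are placeholders. Since Open-WebUI does not let me attach such fields, I would like this applied on the server side instead.

```python
# Sketch (assumption): passing truncate_prompt_tokens per request through the
# OpenAI-compatible API. Open-WebUI cannot add this field, hence the request
# for a server-side option.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-70B-Instruct",
    messages=[{"role": "user", "content": "A very long prompt ..."}],
    # vLLM-specific extension field; keep only the last 8000 prompt tokens.
    extra_body={"truncate_prompt_tokens": 8000},
)
print(response.choices[0].message.content)
```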