
[BUG] error setting tokenizer with custom generation params for vllm #563

Open
rawsh opened this issue Feb 14, 2025 · 0 comments

Labels
bug Something isn't working

rawsh commented Feb 14, 2025
Describe the bug

TypeError: expected str, bytes or os.PathLike object, not dict

This happens with the config from the README:

```
/root/anaconda3/envs/zero/lib/python3.10/site-packages/lighteval/models/model_loader.py:150
in load_model_with_accelerate_or_default

  147     elif isinstance(config, VLLMModelConfig):
  148         if not is_vllm_available():
  149             raise ImportError(NO_VLLM_ERROR_MSG)
> 150         model = VLLMModel(config=config, env_config=env_config)
  151         return model
  152     else:
  153         model = TransformersModel(config=config, env_config=env_config)

/root/anaconda3/envs/zero/lib/python3.10/site-packages/lighteval/models/vllm/vllm_model.py:116
in __init__

  113         self.data_parallel_size = int(config.data_parallel_size)
  114
  115         self._add_special_tokens = config.add_special_tokens if config.add_special_token
> 116         self._tokenizer = self._create_auto_tokenizer(config, env_config)
  117
  118         self._max_length = int(config.max_model_length) if config.max_model_length is no
  119

/root/anaconda3/envs/zero/lib/python3.10/site-packages/lighteval/models/vllm/vllm_model.py:202
in _create_auto_tokenizer

  199         return model
  200
  201     def _create_auto_tokenizer(self, config: VLLMModelConfig, env_config: EnvConfig):
> 202         tokenizer = get_tokenizer(
  203             config.pretrained,
  204             tokenizer_mode="auto",
  205             trust_remote_code=config.trust_remote_code,

/root/anaconda3/envs/zero/lib/python3.10/site-packages/vllm/transformers_utils/tokenizer.py:120
in get_tokenizer

  117         kwargs["truncation_side"] = "left"
  118
  119     # Separate model folder from file path for GGUF models
> 120     is_gguf = check_gguf_file(tokenizer_name)
  121     if is_gguf:
  122         kwargs["gguf_file"] = Path(tokenizer_name).name
  123         tokenizer_name = Path(tokenizer_name).parent

/root/anaconda3/envs/zero/lib/python3.10/site-packages/vllm/transformers_utils/utils.py:8
in check_gguf_file

    5
    6 def check_gguf_file(model: Union[str, PathLike]) -> bool:
    7     """Check if the file is a GGUF model."""
>   8     model = Path(model)
    9     if not model.is_file():
   10         return False
   11     elif model.suffix == ".gguf":

TypeError: expected str, bytes or os.PathLike object, not dict
```
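The bottom frame shows the root cause: when generation parameters are set, `config.pretrained` reaches vLLM's `get_tokenizer()` as a dict, and `check_gguf_file()` hands it straight to `Path()`, which only accepts str/bytes/PathLike. A minimal sketch of the mismatch (`check_gguf_file` is simplified from the frame above; the dict shape is illustrative):

```python
from pathlib import Path

# Simplified from vllm/transformers_utils/utils.py in the traceback:
# the argument goes straight into Path() with no type check.
def check_gguf_file(model):
    path = Path(model)
    if not path.is_file():
        return False
    return path.suffix == ".gguf"

# A model-name string is fine (not a local file, so not GGUF):
print(check_gguf_file("Qwen/Qwen2.5-7B-Instruct"))  # False

# But a dict in place of the model name raises the TypeError above:
try:
    check_gguf_file({"pretrained": "Qwen/Qwen2.5-7B-Instruct"})
except TypeError as exc:
    print(exc)
```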

To Reproduce

```yaml
model: # Model specific parameters
  base_params:
    model_args: "pretrained=Qwen/Qwen2.5-7B-Instruct,dtype=bfloat16,max_model_length=768,gpu_memory_utilisation=0.7" # Model args that you would pass in the command line
  generation: # Generation specific parameters
    temperature: 1.0
    stop_tokens: null
    truncate_prompt: false
```

Expected behavior

Custom generation parameters can be set without raising an error.
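For context, a guard on the lighteval side that extracts the model name before calling `get_tokenizer()` would avoid the crash. This is a hypothetical sketch, not lighteval code (`normalize_pretrained` and the `"pretrained"` key are assumptions):

```python
def normalize_pretrained(pretrained):
    """Hypothetical guard: always hand get_tokenizer() a string.

    The traceback shows config.pretrained can arrive as a dict when
    generation parameters are set in the YAML config; extract the model
    name in that case (assuming it sits under a "pretrained" key).
    """
    if isinstance(pretrained, dict):
        return pretrained["pretrained"]
    return pretrained


print(normalize_pretrained("Qwen/Qwen2.5-7B-Instruct"))
print(normalize_pretrained({"pretrained": "Qwen/Qwen2.5-7B-Instruct"}))
```

With a guard like this, `get_tokenizer()` would receive a string in both cases.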

Version info

0.70
