
Allow specifying model revision #441

Open

tomtseng opened this issue Feb 8, 2025 · 0 comments

tomtseng commented Feb 8, 2025

Issue

I'd like to specify which revision of a model should be evaluated. My use case: I have a Hugging Face model to which I pushed commits throughout training, and I would like to know its AlpacaEval score at several of these training checkpoints.
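
For concreteness, here's a sketch of the checkpoint enumeration I have in mind, using huggingface_hub's list_repo_commits (the repo id is just the example model from below):

from huggingface_hub import list_repo_commits

# List the commits (training checkpoints) of a model repo so that each
# revision can then be passed to an evaluation run.
for commit in list_repo_commits("Qwen/Qwen2.5-0.5B-Instruct"):
    print(commit.commit_id, commit.title)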

Is there some way of specifying the model revision that I have overlooked?

What I've tried

The obvious thing to try is to add `revision` under `completions_kwargs`, like

Qwen2.5-0.5B-Instruct:
  prompt_template: "Qwen1.5-72B-Chat/prompt.txt"
  fn_completions: "huggingface_local_completions"
  completions_kwargs:
    model_name: "Qwen/Qwen2.5-0.5B-Instruct"
    # just for the sake of this example, I'm picking an arbitrary commit from the history of this model
    revision: "d955144"
  pretty_name: "Qwen2.5 0.5B Instruct"
  link: "https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct"

But this errors with

INFO:root:Kwargs to completion: {'do_sample': True, 'model_kwargs': {'revision': 'main', 'torch_dtype': None, 'device_map': 'auto', 'load_in_8bit': False}, 'batch_size': 1, 'max_new_tokens': 2000, 'temperature': 0.7}
hub_kwargs={'revision': None, 'token': None, 'trust_remote_code': False, '_commit_hash': None}
model_kwargs={'revision': 'main', 'torch_dtype': None, 'device_map': 'auto', 'load_in_8bit': False}
Chunking for generation:   0%|          | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "/Users/t/.pyenv/versions/myenv-3.10.16/bin/alpaca_eval", line 8, in <module>
    sys.exit(main())
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 608, in main
    fire.Fire(ALL_FUNCTIONS)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 343, in evaluate_from_model
    model_outputs = get_completions(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/main.py", line 328, in get_completions
    completions = fn_completions(prompts=prompts, **configs["completions_kwargs"])["completions"]
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/alpaca_eval/decoders/huggingface_local.py", line 129, in huggingface_local_completions
    pipeline = transformers.pipeline(
  File "/Users/t/.pyenv/versions/3.10.16/envs/myenv-3.10.16/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 928, in pipeline
    framework, model = infer_framework_load_model(
TypeError: transformers.pipelines.base.infer_framework_load_model() got multiple values for keyword argument 'revision'
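
As far as I can tell, the collision is that huggingface_local_completions forwards revision inside model_kwargs, while transformers.pipeline also builds its own hub_kwargs containing revision, so infer_framework_load_model() receives the keyword from both dicts (you can see both in the log above). A minimal sketch of the same failure outside alpaca_eval, assuming the transformers behavior shown in the traceback:

import transformers

# `revision` inside model_kwargs collides with the pipeline's own hub
# revision handling: both dicts are splatted into
# infer_framework_load_model(), which then sees the keyword twice.
transformers.pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",
    model_kwargs={"revision": "d955144"},
)
# TypeError: ... got multiple values for keyword argument 'revision'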

The fix for `huggingface_local_completions` is something like

@@ -121,7 +121,9 @@ def huggingface_local_completions(

     default_kwargs = dict(
         do_sample=do_sample,
-        model_kwargs={k: v for k, v in model_kwargs.items() if k != "trust_remote_code"},
+        model_kwargs={
+            k: v for k, v in model_kwargs.items() if k not in ("revision", "trust_remote_code")
+        },
         batch_size=batch_size,
     )
     default_kwargs.update(kwargs)
@@ -131,6 +133,7 @@ def huggingface_local_completions(
         model=model,
         tokenizer=tokenizer,
         **default_kwargs,
+        revision=model_kwargs.get("revision", None),
         trust_remote_code=model_kwargs.get("trust_remote_code", False),
     )

but I have not looked at other decoders.
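
With that patch, `revision` reaches transformers.pipeline as a top-level keyword, which the pipeline API accepts for pinning a model to a specific hub commit. Roughly the shape of the resulting call (a sketch; the real code also passes the tokenizer and generation kwargs):

import transformers

# With the patch applied, `revision` arrives as a top-level pipeline
# kwarg, pinning the model to a specific hub commit.
pipe = transformers.pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",
    revision="d955144",
)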

Versions

Python 3.10.16
alpaca-eval 0.6.6
macOS Sequoia 15.3

tomtseng changed the title from "Feature request: specify model revision" to "Allow specifying model revision" on Feb 8, 2025
tomtseng added a commit to AlignmentResearch/alpaca_eval that referenced this issue on Feb 13, 2025:

This is v0.6.6 but with tatsu-lab#441 addressed: it allows specifying the model revision to be evaluated under `completions_kwargs`.