Fails to run Tiny-llama example #143
Comments
Thanks for pointing out outdated documentation! You can find an example with the new API here: https://github.com/intel/intel-npu-acceleration-library/blob/main/examples/llama.py
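For anyone landing here, a minimal sketch of what the new-API call looks like, based on the linked example. `CompilerConfig` and its import path are from the library's newer releases; double-check against the linked example in case they have changed since:

```python
import torch
from transformers import AutoTokenizer, TextStreamer
from intel_npu_acceleration_library import NPUModelForCausalLM
from intel_npu_acceleration_library.compiler import CompilerConfig

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# The new API takes a CompilerConfig instead of a loose dtype kwarg, which
# is why the old README call fails with "missing ... argument: 'config'".
compiler_conf = CompilerConfig(dtype=torch.int8)
model = NPUModelForCausalLM.from_pretrained(
    model_id, use_cache=True, config=compiler_conf
).eval()

tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)

inputs = tokenizer("What is an NPU?", return_tensors="pt")
_ = model.generate(**inputs, max_new_tokens=64, streamer=streamer)
```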
Reopening until the doc is updated
@alessandropalla The example you gave doesn't work.
You need to use the latest library release (e.g. `pip install --upgrade intel-npu-acceleration-library`).
@alessandropalla I get a different error when attempting to run the updated Llama file you linked above.
I was also having similar issues, but I was eventually able to find workarounds for each of them and get things running. Here is my setup:
I first got the same error that @shira-g reported:
Following the advice of @alessandropalla got me past that one. Then I encountered the same error as @Jaylyn-Barbee:
I ended up writing a function in my script to modify the model object, giving the attention layers the rotary_emb attribute that the NPU library was expecting (see the sketch below).
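A minimal sketch of that kind of patch; the function name is mine, and it assumes the recent Hugging Face Llama layout where a single shared rotary embedding lives at `model.model.rotary_emb` (the original function from this comment wasn't captured):

```python
def add_rotary_emb_to_attention_layers(model):
    # Newer transformers versions keep one shared rotary embedding on the
    # top-level model, while the NPU library still looks for a per-layer
    # `rotary_emb` attribute on each attention module.
    rotary_emb = model.model.rotary_emb
    for layer in model.model.layers:
        if not hasattr(layer.self_attn, "rotary_emb"):
            layer.self_attn.rotary_emb = rotary_emb
    return model
```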
Then I got a new error. My fix for that one was to modify \Lib\site-packages\intel_npu_acceleration_library\nn\llm.py at line 245. That got things running for me. My complete script is pieced together from a few different scripts, so it's a little different from the original Intel example.
I'm new to PyTorch, transformers, and the intel_npu_acceleration_library, so I don't have a specific recommendation on how to fix the library and/or the official example script, but maybe someone will find my workarounds helpful.
I borrowed the idea from @richardanichols and fixed the code as follows:
The output is:
I don't know whether it is correct, but it works for me.
@lcwyylcwyy I tried yours and it gives me this output:
@SuperFico2100 Yes, I found the reason: in /home/xxxxx/anaconda3/envs/intel_npu/lib/python3.12/site-packages/intel_npu_acceleration_library/nn/llm.py, line 245, you need to change the output that the function returns.
Thanks to @SuperFico2100, I applied the same change.
Describe the bug
I ran the following example: https://github.com/intel/intel-npu-acceleration-library?tab=readme-ov-file#run-a-tiny-llama-model-on-the-npu
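For context, the failing script is essentially the README snippet. This is a reconstruction from the traceback below; the model id is an assumption taken from the example's name:

```python
import torch
from intel_npu_acceleration_library import NPUModelForCausalLM

# Model id assumed from the example name; the failing from_pretrained
# call below is copied from the traceback.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = NPUModelForCausalLM.from_pretrained(
    model_id, use_cache=True, dtype=torch.int8
).eval()
```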
and it fails with:
```
Traceback (most recent call last):
  File "C:\Users\sdp\shira\npu_acc\our_bench.py", line 7, in <module>
    model = NPUModelForCausalLM.from_pretrained(model_id, use_cache=True, dtype=torch.int8).eval()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sdp\miniconda3\envs\npu_acc\Lib\functools.py", line 388, in _method
    return self.func(cls_or_self, *self.args, *args, **keywords)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: NPUModel.from_pretrained() missing 1 required positional argument: 'config'
```