Description
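I can export llama2 with -qmode 8da4w without any problem, but the export fails when I switch to -qmode 8da4w-gptq. The failure happens during the "Tracing model for GPTQ" step, with a RuntimeError raised from a view call in reshape_for_broadcast.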
- installed packages
executorch 0.3.0a0+aaa2f2e
torch 2.4.0.dev20240507+cpu
torchao 0.1
torchtune 0.1.1
- command to reproduce
$python -m examples.models.llama2.export_llama --checkpoint ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b/snapshots/69656aac4cb47911a639f5890ff35b41ceb82e98/consolidated.00.pth --params ~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b/snapshots/69656aac4cb47911a639f5890ff35b41ceb82e98/params.json -kv --use_sdpa_with_kv_cache -X -qmode 8da4w-gptq --calibration_limit 128 --calibration_seq_length 2048 --group_size 128 -d fp32 --max_seq_length 4096
- error log
[INFO 2024-05-16 13:12:11,098 builder.py:84] Loading model with checkpoint=~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b/snapshots/69656aac4cb47911a639f5890ff35b41ceb82e98/consolidated.00.pth, params=~/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-hf/snapshots/01c7f73d771dfac7d292323805ebc428287df4f9/params.json, use_kv_cache=True, weight_type=WeightType.LLAMA
[INFO 2024-05-16 13:12:11,173 builder.py:105] Loaded model with dtype=torch.bfloat16
[INFO 2024-05-16 13:12:11,445 config.py:58] PyTorch version 2.4.0.dev20240507+cpu available.
[INFO 2024-05-16 13:12:13,489 huggingface.py:162] Using device 'cpu'
~/anaconda3/envs/executorch/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
[WARNING 2024-05-16 13:12:19,098 task.py:763] [Task: wikitext] metric word_perplexity is defined, but aggregation is not. using default aggregation=weighted_perplexity
[WARNING 2024-05-16 13:12:19,098 task.py:775] [Task: wikitext] metric word_perplexity is defined, but higher_is_better is not. using default higher_is_better=False
[WARNING 2024-05-16 13:12:19,098 task.py:763] [Task: wikitext] metric byte_perplexity is defined, but aggregation is not. using default aggregation=weighted_perplexity
[WARNING 2024-05-16 13:12:19,098 task.py:775] [Task: wikitext] metric byte_perplexity is defined, but higher_is_better is not. using default higher_is_better=False
[WARNING 2024-05-16 13:12:19,098 task.py:763] [Task: wikitext] metric bits_per_byte is defined, but aggregation is not. using default aggregation=bits_per_byte
[WARNING 2024-05-16 13:12:19,098 task.py:775] [Task: wikitext] metric bits_per_byte is defined, but higher_is_better is not. using default higher_is_better=False
Repo card metadata block was not found. Setting CardData to empty.
[WARNING 2024-05-16 13:12:22,935 repocard.py:107] Repo card metadata block was not found. Setting CardData to empty.
Obtaining GPTQ calibration inputs on: ['wikitext']
[INFO 2024-05-16 13:12:23,026 task.py:395] Building contexts for wikitext on rank 0...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 62/62 [00:00<00:00, 544.89it/s]
[INFO 2024-05-16 13:12:23,144 evaluator.py:362] Running loglikelihood_rolling requests
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 62/62 [05:20<00:00, 5.16s/it]
Tracing model for GPTQ
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] failed while attempting to run meta for aten.view.default
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] Traceback (most recent call last):
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1624, in _dispatch_impl
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] r = func(*args, **kwargs)
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_ops.py", line 630, in __call__
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] return self_._op(*args, **kwargs)
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_refs/__init__.py", line 4550, in view
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] return _reshape_view_helper(a, *shape, allow_copy=False)
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_refs/__init__.py", line 3623, in _reshape_view_helper
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] shape = utils.infer_size(shape, a.numel())
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_prims_common/__init__.py", line 900, in infer_size
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] torch._check(
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/__init__.py", line 1149, in _check
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] _check_with(RuntimeError, cond, message)
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/__init__.py", line 1132, in _check_with
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] raise error_type(message_evaluated)
E0516 13:17:43.372245 123454950823744 torch/_subclasses/fake_tensor.py:1628] [0/0] RuntimeError: shape '[1, 2048, 32, 64]' is invalid for input of size 8192
Traceback (most recent call last):
File "~/anaconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "~/anaconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/export_llama.py", line 30, in <module>
main() # pragma: no cover
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/export_llama.py", line 26, in main
export_llama(modelname, args)
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/export_llama_lib.py", line 303, in export_llama
return _export_llama(modelname, args)
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/export_llama_lib.py", line 381, in _export_llama
builder_exported_to_edge = _prepare_for_llama_export(
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/export_llama_lib.py", line 366, in _prepare_for_llama_export
.source_transform(transforms)
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/builder.py", line 203, in source_transform
self.model = transform(self.model)
File "/data0/limingw/workspace/llm/executorch/examples/models/llama2/source_transformation/quantize.py", line 129, in quantize
model = gptq_quantizer.quantize(model, inputs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torchao/quantization/GPTQ.py", line 1368, in quantize
state_dict = self._create_quantized_state_dict(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torchao/quantization/GPTQ.py", line 731, in _create_quantized_state_dict
GPTQ_runner = GenericGPTQRunner(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torchao/quantization/GPTQ.py", line 323, in __init__
exported_model = torch._dynamo.export(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1264, in inner
result_traced = opt_f(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 416, in _fn
return fn(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 981, in catch_errors
return callback(frame, cache_entry, hooks, frame_state, skip=1)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 410, in _convert_frame_assert
return _compile(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_utils_internal.py", line 70, in wrapper_function
return function(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 703, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 273, in time_wrapper
r = func(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 570, in compile_inner
out_code = transform_code_object(code, transform)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1167, in transform_code_object
transformations(instructions, code_options)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 172, in _fn
return fn(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 517, in transform
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2234, in run
super().run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1253, in CALL_FUNCTION
self.call_function(fn, args, {})
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 353, in call_function
return tx.inline_user_function_return(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 743, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2448, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2564, in inline_call_
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1294, in CALL_FUNCTION_EX
self.call_function(fn, argsvars.items, kwargsvars)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 743, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2448, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2564, in inline_call_
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1253, in CALL_FUNCTION
self.call_function(fn, args, {})
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 339, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 743, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2448, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2564, in inline_call_
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1253, in CALL_FUNCTION
self.call_function(fn, args, {})
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 743, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2448, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2564, in inline_call_
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1253, in CALL_FUNCTION
self.call_function(fn, args, {})
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 293, in call_function
return super().call_function(tx, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
return tx.inline_user_function_return(self, [*self.self_args(), *args], kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 743, in inline_user_function_return
return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2448, in inline_call
return cls.inline_call_(parent, func, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2564, in inline_call_
tracer.run()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 884, in run
while self.step():
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 799, in step
self.dispatch_table[inst.opcode](self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 494, in wrapper
return inner_fn(self, inst)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1253, in CALL_FUNCTION
self.call_function(fn, args, {})
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 737, in call_function
self.push(fn.call_function(self, args, kwargs))
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/misc.py", line 655, in call_function
return self.obj.call_method(tx, self.name, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/tensor.py", line 457, in call_method
return wrap_fx_proxy(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/builder.py", line 1519, in wrap_fx_proxy
return wrap_fx_proxy_cls(target_cls=TensorVariable, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/variables/builder.py", line 1604, in wrap_fx_proxy_cls
example_value = get_fake_value(proxy.node, tx, allow_non_graph_fake=True)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1830, in get_fake_value
raise TorchRuntimeError(str(e)).with_traceback(e.__traceback__) from None
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1762, in get_fake_value
ret_val = wrap_fake_exception(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1277, in wrap_fake_exception
return fn()
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1763, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1898, in run_node
raise RuntimeError(make_error_message(e)).with_traceback(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1882, in run_node
return getattr(args[0], node.target)(*args[1:], **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/_stats.py", line 20, in wrapper
return fn(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 956, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1318, in dispatch
return self._cached_dispatch_impl(func, types, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1039, in _cached_dispatch_impl
output = self._dispatch_impl(func, types, args, kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1624, in _dispatch_impl
r = func(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_ops.py", line 630, in __call__
return self_._op(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_refs/__init__.py", line 4550, in view
return _reshape_view_helper(a, *shape, allow_copy=False)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_refs/__init__.py", line 3623, in _reshape_view_helper
shape = utils.infer_size(shape, a.numel())
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_prims_common/__init__.py", line 900, in infer_size
torch._check(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/__init__.py", line 1149, in _check
_check_with(RuntimeError, cond, message)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/__init__.py", line 1132, in _check_with
raise error_type(message_evaluated)
torch._dynamo.exc.TorchRuntimeError: Failed running call_method view(*(FakeTensor(..., size=(1, 128, 64), dtype=torch.bfloat16), [1, 2048, 32, 64]), **{}):
shape '[1, 2048, 32, 64]' is invalid for input of size 8192
from user code:
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 487, in forward
h = layer(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 424, in forward
h = self.attention.forward(
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 321, in forward
q, k = apply_rotary_emb(q, k, freqs_cos, freqs_sin)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 169, in apply_rotary_emb
freqs_cos = reshape_for_broadcast(freqs_cos, xq_r)
File "~/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/examples/models/llama2/llama_transformer.py", line 160, in reshape_for_broadcast
return freqs_cis.view(shape)
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
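The mismatch reported in the final error can be checked with plain arithmetic. The sketch below only compares element counts, using the two shapes copied from the TorchRuntimeError message; it does not call torch, and `can_view` is a hypothetical helper introduced for illustration, not a torch or executorch function.

```python
from math import prod

# Shapes copied from the TorchRuntimeError message in the log.
input_shape = (1, 128, 64)         # FakeTensor that view() was called on
target_shape = (1, 2048, 32, 64)   # shape passed to view() in reshape_for_broadcast


def can_view(src, dst):
    """Hypothetical helper: a view needs both shapes to hold the same number of elements."""
    return prod(src) == prod(dst)


input_numel = prod(input_shape)
target_numel = prod(target_shape)

print(f"input elements:  {input_numel}")
print(f"target elements: {target_numel}")
print(f"view possible:   {can_view(input_shape, target_shape)}")
```

The input holds 8192 elements, which matches the "input of size 8192" in the error, while the requested shape needs 4194304. The 2048 in the requested shape equals the value I passed for --calibration_seq_length, but I have not confirmed where the 128 in the input shape comes from.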
Has anyone gotten 8da4w-gptq export to work? Any pointers would be much appreciated.