Commit 08f9f79

add summary
Signed-off-by: Kyle Sayers <[email protected]>
1 parent 6c71263 commit 08f9f79

1 file changed

+30
-1
lines changed
src/llmcompressor/transformers/tracing/GUIDE.md

@@ -398,4 +398,33 @@ legacy_processing = False
 legacy_processing = (
     (input_ids == self.config.image_token_index).sum(1).max() < self.config.image_seq_length
 ) or (input_ids.shape[-1] == 1 and pixel_values is not None).item()
-```
+```
+
+## Summary ##
+This guide provides a comprehensive overview of tracing concepts as they apply to
+LLM Compressor, enabling effective use of the [Sequential Pipeline](/src/llmcompressor/pipelines/sequential/pipeline.py)
+and modifiers such as [GPTQModifier](/src/llmcompressor/modifiers/quantization/gptq/base.py).
+
+The following key points are covered by this guide:
+
+1. **Importance of Tracing**:
+Tracing is essential for compressing complex models and managing memory efficiently.
+It ensures accurate data flow capture for layer-by-layer processing during
+compression.
+
+2. **Traceability**:
+A model's traceability depends on its ability to define clear input-output
+operations. Using tools like `llmcompressor.trace`, users can identify traceability
+issues and make necessary adjustments, such as modifying sequential targets or adding
+modules to the ignore list.
+
+3. **Model Modifications**:
+Non-traceable models can be modified by addressing common errors, such as conditional
+logic and iteration issues, or by implementing techniques like function wrapping and
+shape inference correction.
+
+This guide empowers users to adapt their models for optimal performance with LLM
+Compressor while maintaining compatibility with its pipeline and modifier tools. The
+outlined steps, examples, and troubleshooting tips ensure that even complex
+architectures can be effectively traced and compressed with minimal memory usage and
+accuracy loss.
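
The conditional-logic and iteration errors named in point 3 of the added summary, and the function-wrapping workaround, can be illustrated with a toy tracer. The sketch below is plain Python written only for illustration; it is not the implementation used by LLM Compressor or `torch.fx`, and `ToyProxy`, `ToyTraceError`, `toy_wrap`, and `toy_trace` are hypothetical names that do not exist in either library. It assumes only the general idea of symbolic tracing: a placeholder object flows through the model code and records each operation applied to it.

```python
class ToyTraceError(Exception):
    """Raised when traced code needs a concrete value the tracer does not have."""


class ToyProxy:
    """Hypothetical stand-in for a tensor; records operations applied to it."""

    def __init__(self, name, graph):
        self.name = name
        self.graph = graph  # shared list of (output, op, inputs) records

    def _record(self, op, *inputs):
        out = ToyProxy(f"v{len(self.graph)}", self.graph)
        names = tuple(i.name if isinstance(i, ToyProxy) else repr(i) for i in inputs)
        self.graph.append((out.name, op, names))
        return out

    def __add__(self, other):
        return self._record("add", self, other)

    def __mul__(self, other):
        return self._record("mul", self, other)

    def __gt__(self, other):
        return self._record("gt", self, other)

    def __bool__(self):
        # `if proxy:` needs a real True/False, which only exists at run time.
        raise ToyTraceError("proxy used in conditional logic")

    def __iter__(self):
        # `for x in proxy:` needs a concrete length, unknown while tracing.
        raise ToyTraceError("proxy used in iteration")


def toy_wrap(fn):
    """Record a call to `fn` as one opaque node instead of tracing into its body."""

    def wrapper(*args):
        proxies = [a for a in args if isinstance(a, ToyProxy)]
        if not proxies:
            return fn(*args)  # concrete inputs: just run the function
        return proxies[0]._record(f"call:{fn.__name__}", *args)

    return wrapper


def toy_trace(fn):
    """Run `fn` on a proxy input and return the recorded graph."""
    graph = []
    fn(ToyProxy("x", graph))
    return graph


def branchy(x):
    # Data-dependent branch: untraceable as written.
    if x > 0:
        return x * 2
    return x + 1


def model_untraceable(x):
    return branchy(x) + 1


# Wrapping hides the branch from the tracer; it still runs normally on real values.
wrapped_branchy = toy_wrap(branchy)


def model_traceable(x):
    return wrapped_branchy(x) + 1
```

Calling `toy_trace(model_untraceable)` raises `ToyTraceError` because the `if` asks the proxy for a truth value, while `toy_trace(model_traceable)` succeeds and records the wrapped call as a single node followed by the addition. In `torch.fx`, the comparable mechanism is `torch.fx.wrap`, which likewise keeps a function as one call node in the traced graph.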
