@@ -398,4 +398,33 @@ legacy_processing = False
legacy_processing = (
    (input_ids == self.config.image_token_index).sum(1).max() < self.config.image_seq_length
) or (input_ids.shape[-1] == 1 and pixel_values is not None).item()
- ```
+ ```
+
+ ## Summary ##
+ This guide provides a comprehensive overview of tracing concepts as they apply to
+ LLM Compressor, enabling effective use of the [Sequential Pipeline](/src/llmcompressor/pipelines/sequential/pipeline.py)
+ and modifiers such as [GPTQModifier](/src/llmcompressor/modifiers/quantization/gptq/base.py).
+
+ The following key points are covered by this guide:
+
+ 1. **Importance of Tracing**:
+ Tracing is essential for compressing complex models and managing memory efficiently.
+ It captures the model's data flow accurately so that layers can be processed one at
+ a time during compression.
+
+ 2. **Traceability**:
+ A model's traceability depends on its ability to define clear input-output
+ operations. Using tools like `llmcompressor.trace`, users can identify traceability
+ issues and make necessary adjustments, such as modifying sequential targets or adding
+ modules to the ignore list.
+
+ 3. **Model Modifications**:
+ Non-traceable models can be made traceable by fixing common errors, such as those
+ caused by conditional logic and iteration, or by applying techniques like function
+ wrapping and shape inference correction.
+
+ With this guide, users can adapt their models to work with LLM Compressor while
+ maintaining compatibility with its pipeline and modifier tools. The outlined steps,
+ examples, and troubleshooting tips help make even complex architectures traceable,
+ so that they can be compressed with low memory usage and little accuracy loss.
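
The conditional-logic failure and the function-wrapping fix named in point 3 can be illustrated with a toy tracer. This is a minimal, standard-library-only sketch and not LLM Compressor or `torch.fx` code: `Proxy`, `TraceError`, `wrap`, `trace`, and the two example functions are hypothetical names invented for this illustration. It assumes only the general idea that a tracer runs the function on symbolic values and records each operation.

```python
# Toy illustration only: every name here is hypothetical, not a library API.


class TraceError(Exception):
    """Raised when traced code needs a concrete value from a symbolic one."""


class Proxy:
    """Symbolic stand-in for a value; records operations instead of computing."""

    def __init__(self, name, graph):
        self.name = name
        self.graph = graph

    def _record(self, op, *args):
        out = Proxy(f"v{len(self.graph)}", self.graph)
        arg_names = tuple(a.name if isinstance(a, Proxy) else repr(a) for a in args)
        self.graph.append((out.name, op, arg_names))
        return out

    def __add__(self, other):
        return self._record("add", self, other)

    def __mul__(self, other):
        return self._record("mul", self, other)

    def __gt__(self, other):
        return self._record("gt", self, other)

    def __bool__(self):
        # Data-dependent control flow: the branch is unknown at trace time.
        raise TraceError(f"cannot use symbolic value {self.name} as a bool")


def wrap(fn):
    """Treat fn as a leaf: record one call node instead of tracing its body."""

    def wrapper(*args):
        proxies = [a for a in args if isinstance(a, Proxy)]
        if not proxies:
            return fn(*args)  # concrete inputs: run the function normally
        return proxies[0]._record(f"call:{fn.__name__}", *args)

    return wrapper


def trace(fn):
    """Run fn on a symbolic input named 'x' and return the recorded graph."""
    graph = []
    fn(Proxy("x", graph))
    return graph


def clamp_then_scale(x):
    if x > 0:  # needs a concrete bool, so tracing fails here
        x = x + 1
    return x * 2


@wrap
def maybe_increment(x):
    return x + 1 if x > 0 else x


def clamp_then_scale_traceable(x):
    # The conditional now lives inside a wrapped (untraced) function.
    return maybe_increment(x) * 2


if __name__ == "__main__":
    print(trace(clamp_then_scale_traceable))
    try:
        trace(clamp_then_scale)
    except TraceError as err:
        print("trace failed:", err)
```

The trade-off in this sketch is that the wrapped function becomes opaque to the tracer: the graph records a single call node for it, and its internal branch is only evaluated when the function later runs on concrete inputs.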