Commit 6cb38c3
Fix conversion of unnormalized BF16->BF16 weights (llama/7843)
* add truncate_bf16
* truncate intermediate fp32 if converting bf16 to bf16
* fix masking in __compute_fp32_to_bf16
* np.int16 no longer used
* missing cast and additional numpy 2.x fix
* ggml-impl : do not flush bf16 subnormals to zero
* ggml : add reference fp32 to bf16 conversion
  The fast version is no longer equivalent for all platforms
  because of the handling of subnormal values.
* gguf-py : remove flush to zero for bf16 subnormals
* gguf-py : remove float32 truncation to bf16
  Rounding achieves the same thing in the cases where this was used.
* missed prototype update in merge
* merge cleanup
---------
Co-authored-by: Francis Couture-Harpin <[email protected]>
1 parent 9cf14eb
3 files changed, 13 additions and 8 deletions
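The behaviour described in the commit message (round instead of truncate, keep bf16 subnormals, and leave values that are already bf16 unchanged) can be illustrated with a minimal pure-Python sketch. This is not the code from the commit: the function names `fp32_to_bf16`, `bf16_to_fp32`, and `truncate_fp32_to_bf16` are hypothetical, and the rounding rule shown (round to nearest, ties to even, NaNs forced quiet) is an assumption about what a reference conversion looks like.

```python
import struct


def fp32_bits(x: float) -> int:
    """Bit pattern of x stored as IEEE 754 binary32 (raises OverflowError if x is out of range)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_fp32(u: int) -> float:
    """Interpret the low 32 bits of u as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", u & 0xFFFFFFFF))[0]


def fp32_to_bf16(x: float) -> int:
    """Hypothetical reference conversion: round to nearest even, no flush to zero."""
    u = fp32_bits(x)
    if (u & 0x7FFFFFFF) > 0x7F800000:
        # NaN: keep the top 16 bits and set the quiet bit so the result stays a NaN.
        return (u >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to the even result.
    # Subnormal inputs take this same path and are not flushed to zero.
    return ((u + 0x7FFF + ((u >> 16) & 1)) >> 16) & 0xFFFF


def truncate_fp32_to_bf16(x: float) -> int:
    """Hypothetical truncating conversion: drop the low 16 bits."""
    return fp32_bits(x) >> 16


def bf16_to_fp32(h: int) -> float:
    """Widen a bf16 bit pattern to fp32 by appending 16 zero bits."""
    return bits_to_fp32((h & 0xFFFF) << 16)


if __name__ == "__main__":
    # A value already representable in bf16 survives the fp32 round trip
    # under both rounding and truncation.
    h = 0x3F81
    print(hex(fp32_to_bf16(bf16_to_fp32(h))), hex(truncate_fp32_to_bf16(bf16_to_fp32(h))))
    # A bf16 subnormal is preserved instead of becoming zero.
    print(hex(fp32_to_bf16(bits_to_fp32(0x00010000))))
```

Because a bf16 value widened to fp32 has all 16 low bits set to zero, the rounding step adds at most 0x8000 and never carries into the kept bits. Under these assumptions, that is why rounding and truncation agree for BF16->BF16 conversion.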