2 files changed: +10 -4 lines changed

@@ -206,8 +206,11 @@ impl f32 {
     /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
     /// error, yielding a more accurate result than an unfused multiply-add.
     ///
-    /// Using `mul_add` can be more performant than an unfused multiply-add if
-    /// the target architecture has a dedicated `fma` CPU instruction.
+    /// Using `mul_add` *can* be more performant than an unfused multiply-add if
+    /// the target architecture has a dedicated `fma` CPU instruction. However,
+    /// this is not always true, and care must be taken not to overload the
+    /// architecture's available FMA units when using many FMA instructions
+    /// in a row, which can cause a stall and performance degradation.
     ///
     /// # Examples
     ///
@@ -206,8 +206,11 @@ impl f64 {
     /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
     /// error, yielding a more accurate result than an unfused multiply-add.
     ///
-    /// Using `mul_add` can be more performant than an unfused multiply-add if
-    /// the target architecture has a dedicated `fma` CPU instruction.
+    /// Using `mul_add` *can* be more performant than an unfused multiply-add if
+    /// the target architecture has a dedicated `fma` CPU instruction. However,
+    /// this is not always true, and care must be taken not to overload the
+    /// architecture's available FMA units when using many FMA instructions
+    /// in a row, which can cause a stall and performance degradation.
     ///
     /// # Examples
     ///
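
The "only one rounding error" wording in the doc comment can be illustrated with a small sketch. The helper names below (`sample_inputs`, `residual_unfused`, `residual_fused`) are hypothetical and not part of this change. The sketch assumes IEEE 754 `f64` arithmetic with no extended-precision intermediates and no floating-point contraction by the compiler. It shows accuracy only and says nothing about performance, which depends on the target.

```rust
/// Hypothetical helper: returns `x = 1 + 2^-30` together with `x * x`
/// as rounded to `f64`. The exact square is `1 + 2^-29 + 2^-60`, and the
/// `2^-60` term is too small to survive rounding to a 53-bit significand.
fn sample_inputs() -> (f64, f64) {
    let x = 1.0 + 1.0 / (1u64 << 30) as f64;
    (x, x * x)
}

/// Unfused: the product is rounded to `f64` first, then the subtraction
/// is rounded again (two roundings in total).
fn residual_unfused(x: f64, approx: f64) -> f64 {
    x * x - approx
}

/// Fused: `mul_add` adds `-approx` to the exact product and rounds once.
fn residual_fused(x: f64, approx: f64) -> f64 {
    x.mul_add(x, -approx)
}
```

Under those assumptions, `residual_unfused` returns `0.0` for the sample inputs because the low-order term was already rounded away, while `residual_fused` recovers it as `2^-60`.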