-First of all, we don't know how to handle 64-bit accumulates yet. We know how to handle 32-bit accumulates - just [initialize the partial sum with the value of the accumulator](#trick). We can use a similar trick for 64-bit ones. First, we can initialize the partial sum with the bottom 33 bits of the 64 bit accumulate. Why 33? I thought the partial sum was 32 bits wide? Well, if we make the width of the partial sum 33 bits, we'd also be able to handle unsigned and signed multiplication by zero / sign extending appropriately. This way, our algorithm itself only needs to be able to perform signed multiplication, and our choice of zero-extension or sign-extension at initialization will handle the rest. More on this in the next section.
0 commit comments