[bitmanip] Remove redundant logic #2296

ESE-2019 · 2025-07-16T12:54:52Z

Remove the redundant logic & ~bitcnt_mask_op, as explained in issue #2294 .

    always_comb begin
      unique case (1'b1)
        zbe_op:      bitcnt_bits = operand_b_i;
        // bitcnt_cz:   bitcnt_bits = bitcnt_bit_mask & ~bitcnt_mask_op; // clz / ctz
        bitcnt_cz:   bitcnt_bits = bitcnt_bit_mask; // clz / ctz
        default:     bitcnt_bits = operand_a_i; // cpop
      endcase
    end

rswarbrick · 2025-07-18T08:52:08Z

I think this is a sensible change, thanks. Would you mind expanding the commit message to explain what's going on? It's always easier to understand the history of the repository if it's not dependant on GitHub. With my "maths hat on", I'm always enthusiastic about giving a minimal "proof" of why a commit is the right thing to do.

Working out how to write such a commit message sometimes makes me adjust the code as well, which I think is probably a good thing. I'd also be tempted to extend this commit slightly to get rid of the slightly silly loops for replicating the bit. Is that easy? (And does it work? I only suggested this from reading the existing code...)

ESE-2019 · 2025-07-21T14:54:37Z

I think that comparing the PPA of different schemes is an important method for selecting among different options. Therefore, I used Design Compiler to synthesize three implementations (including only clz, ctz, and cpop), using the NanGate45 standard cell library and a 1GHz frequency constraint. All three designs achieved the same timing slack.

The original version had an area of 287.546; Scheme 1 (with & ~bitcnt_mask_op removed) had an area of 292.866; and Scheme 2 (new commit in the PR, i.e., (bitcnt_mask_op - 32'h1) & ~bitcnt_mask_op) had an area of 268.926.

Interestingly, despite removing redundant logic, Scheme 1 resulted in a larger area. In contrast, Scheme 2 achieved about a 7% area reduction.

I added comments in the code to explain why the method makes sense.

    // Bitcnt_bits generation for clz and ctz:
    // The operand of ctz can be divided into 3 parts: the don't-care bits in the high-order bits,
    // the lowest-order set bit, and the trailing zeros. Bitcnt_bit_mask is obtained by subtracting 1
    // from bitcnt_mask_op, which clears the lowest-order set bit and sets all the trailing zeros to 1.
    // Then, bitcnt_bit_mask & ~bitcnt_mask_op clears the don't-care bits.
    // At this point, counting the number of 1s gives the number of trailing zeros.
    // In order to create the bit mask for leading zeros, the input operand needs to be reversed.

[bitmanip] Remove redundant logic

77e13f5

ESE-2019 mentioned this pull request Jul 16, 2025

Ask a question related to Ibex alu bitcounting #2294

Open

[bitmanip] Remove redundant logic in converting clz/ctz to cpop

a754dbc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[bitmanip] Remove redundant logic #2296

[bitmanip] Remove redundant logic #2296

Uh oh!

ESE-2019 commented Jul 16, 2025

Uh oh!

rswarbrick commented Jul 18, 2025

Uh oh!

ESE-2019 commented Jul 21, 2025

Uh oh!

Uh oh!

[bitmanip] Remove redundant logic #2296

Are you sure you want to change the base?

[bitmanip] Remove redundant logic #2296

Uh oh!

Conversation

ESE-2019 commented Jul 16, 2025

Uh oh!

rswarbrick commented Jul 18, 2025

Uh oh!

ESE-2019 commented Jul 21, 2025

Uh oh!

Uh oh!