Add new `nibbles(bytes)`, `clz(bytes)` and `clz(uint256)` functions #5725

ernestognw · 2025-06-09T01:08:54Z

Requires #5726

PR Checklist

Tests
Documentation
Changeset entry (run npx changeset add)

changeset-bot · 2025-06-09T01:08:57Z

🦋 Changeset detected

Latest commit: 740a056

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| openzeppelin-solidity | Minor |

Amxx · 2025-06-10T08:08:52Z

contracts/utils/Bytes.sol

+    function clz(uint256 x) internal pure returns (uint256) {
+        if (x == 0) return 32; // All 32 bytes are zero
+        uint256 r = 0;
+        if (x > 0xffffffffffffffffffffffffffffffff) r = 128; // Upper 128 bits
+        if ((x >> r) > 0xffffffffffffffff) r |= 64; // Next 64 bits
+        if ((x >> r) > 0xffffffff) r |= 32; // Next 32 bits
+        if ((x >> r) > 0xffff) r |= 16; // Next 16 bits
+        if ((x >> r) > 0xff) r |= 8; // Next 8 bits
+        return 31 ^ (r >> 3); // Convert to leading zero bytes count
+    }


We could reuse the Math.log256 function here

Suggested change

-    function clz(uint256 x) internal pure returns (uint256) {
-        if (x == 0) return 32; // All 32 bytes are zero
-        uint256 r = 0;
-        if (x > 0xffffffffffffffffffffffffffffffff) r = 128; // Upper 128 bits
-        if ((x >> r) > 0xffffffffffffffff) r |= 64; // Next 64 bits
-        if ((x >> r) > 0xffffffff) r |= 32; // Next 32 bits
-        if ((x >> r) > 0xffff) r |= 16; // Next 16 bits
-        if ((x >> r) > 0xff) r |= 8; // Next 8 bits
-        return 31 ^ (r >> 3); // Convert to leading zero bytes count
-    }
+    function clz(uint256 x) internal pure returns (uint256) {
+        return Math.ternary(x == 0, 32, 31 - Math.log256(x));
+    }

Also, echoing the discussion on the reverseByte PR, I think clz and reverseBytes should be in the same file (possibly a new one).

Updated. Agree that the reverseBit functions and clz should be in the same library. imo it's fine that we put them in Bytes.sol. Any strong reason not to do so?

IMO we should distinguish between bytes as in "arbitrary length buffer", and Bytes32 as in "a type of fixed length that has math properties".

I would argue that clz is one of these "local" math properties and is closer to the Math library than it is to things like indexOf or slice, which are more "global". By that argument, the reverseBytesXX functions are also "local" operations (similar to clz).

If CLZ operates on uint256, I think there is a strong case for putting it in Math.

We could have a variant of CLZ in Bytes that does something like

    function clz(bytes memory buffer) internal pure returns (uint256) {
        uint256 i = 0;
        // Skip the leading zero bytes
        while (i < buffer.length && buffer[i] == bytes1(0)) ++i;
        // Assuming Math.clz counts leading zero bits of a uint256, drop the 248 bits of padding above the byte
        return 8 * i + (i == buffer.length ? 0 : Math.clz(uint256(uint8(buffer[i]))) - 248);
    }

(This should be optimized to read in blocks of 32 bytes instead of byte by byte.)
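The block-wise read suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the library name `BytesClzSketch` is hypothetical, the function counts leading zero bits, and it relies on `Math.log2` from OpenZeppelin's Math library rather than on any `Math.clz` helper.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Math} from "@openzeppelin/contracts/utils/math/Math.sol";

// Hypothetical library, for illustration only (not the code merged in this PR).
library BytesClzSketch {
    /// Counts the leading zero bits of `buffer`, reading 32 bytes per iteration.
    /// Returns `8 * buffer.length` when every byte is zero.
    function clz(bytes memory buffer) internal pure returns (uint256) {
        uint256 length = buffer.length;
        for (uint256 offset = 0; offset < length; offset += 32) {
            uint256 word;
            assembly ("memory-safe") {
                // Load 32 bytes starting at `offset` within the buffer contents.
                word := mload(add(add(buffer, 0x20), offset))
            }
            uint256 remaining = length - offset;
            if (remaining < 32) {
                // The last word may extend past the buffer: keep only its top `remaining` bytes.
                word &= ~(type(uint256).max >> (8 * remaining));
            }
            if (word != 0) {
                // 255 - log2(word) is the number of leading zero bits of a nonzero word.
                return 8 * offset + 255 - Math.log2(word);
            }
        }
        return 8 * length;
    }
}
```

With this sketch, `clz(hex"0000ff")` evaluates to 16 and `clz(hex"000001")` to 23. The mask on the final word matters because memory past the end of the buffer is not guaranteed to be zero.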

test/utils/Bytes.t.sol

0xClandestine · 2025-07-17T22:51:54Z

contracts/utils/Bytes.sol

+    function nibbles(bytes memory value) internal pure returns (bytes memory) {
+        uint256 length = value.length;
+        bytes memory nibbles_ = new bytes(length * 2);
+        for (uint256 i = 0; i < length; i++) {
+            (nibbles_[i * 2], nibbles_[i * 2 + 1]) = (value[i] & 0xf0, value[i] & 0x0f);
+        }
+        return nibbles_;
+    }


This is pretty inefficient, consider using unchecked at minimum.

Also unclear why this is useful:

➜ nibbles(hex"ABCD")
Type: dynamic bytes
├ Hex (Memory):
├─ Length ([0x00:0x20]): 0x0000000000000000000000000000000000000000000000000000000000000004
├─ Contents ([0x20:..]): 0xa00bc00d00000000000000000000000000000000000000000000000000000000
├ Hex (Tuple Encoded):
├─ Pointer ([0x00:0x20]): 0x0000000000000000000000000000000000000000000000000000000000000020
├─ Length ([0x20:0x40]): 0x0000000000000000000000000000000000000000000000000000000000000004
└─ Contents ([0x40:..]): 0xa00bc00d00000000000000000000000000000000000000000000000000000000
➜

Using unchecked would be safe: the index arithmetic cannot overflow, because a bytes memory buffer can never realistically have a length anywhere near 2²⁵⁶. Even a much smaller allocation runs out of gas on memory expansion:

➜ bytes memory b = new bytes(type(uint32).max);
Traces:
  [942682544] 0xBd770416a3345F91E4B34576cb804a576fa48EB1::run()
    └─ ← [MemoryOOG] EvmError: MemoryOOG
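For reference, an unchecked variant along the lines discussed here might look like the sketch below. It is not the code from the PR: the library name `NibblesSketch` is hypothetical, and the sketch assumes the intended output is one nibble per output byte, so the high nibble is shifted down by 4 bits instead of only being masked.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical library, for illustration only.
library NibblesSketch {
    /// Expands each input byte into two output bytes holding its high and low nibble.
    function nibbles(bytes memory value) internal pure returns (bytes memory result) {
        uint256 length = value.length;
        // Checked multiplication; allocation would run out of gas long before it could overflow.
        result = new bytes(length * 2);
        unchecked {
            // Index arithmetic is bounded by `2 * length`, which was just computed without overflow.
            for (uint256 i = 0; i < length; ++i) {
                bytes1 b = value[i];
                result[2 * i] = b >> 4; // high nibble, moved to the low 4 bits (assumption)
                result[2 * i + 1] = b & 0x0f; // low nibble
            }
        }
    }
}
```

Under that assumption, `nibbles(hex"ABCD")` yields `0x0a0b0c0d` rather than the `0xa00bc00d` shown in the output above. Array bounds checks still apply inside `unchecked`; only the arithmetic overflow checks are skipped.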

I'd fix the tests first and then optimize. It's likely that we can leverage other functions of the Bytes library.

The initial motivation for this function is to use it in #5680 for RLP, but I agree that reallocating memory is perhaps not the most efficient thing to do. I suspect RLP may not require the nibbles() function if these are read in place, but I need to experiment a bit with that.

We can add the unchecked, but, is there an alternative you were thinking of? I am more worried about the memory expansion cost

I wonder whether inspecting the nibbles just in time, rather than converting to a separate nibbles array, is worth exploring.
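A minimal sketch of that just-in-time approach, assuming nibbles are indexed from the high nibble of the first byte; the library and function names (`NibbleReaderSketch`, `nibbleAt`) are hypothetical and do not come from the PR:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical library, for illustration only.
library NibbleReaderSketch {
    /// Returns the nibble at position `index` without allocating a nibbles array.
    /// Even indices select the high nibble of a byte, odd indices the low nibble.
    function nibbleAt(bytes memory buffer, uint256 index) internal pure returns (uint8) {
        // Reverts with an out-of-bounds panic when `index >= 2 * buffer.length`.
        bytes1 b = buffer[index / 2];
        return index % 2 == 0 ? uint8(b) >> 4 : uint8(b) & 0x0f;
    }
}
```

For `hex"ABCD"`, indices 0 through 3 return 0x0a, 0x0b, 0x0c and 0x0d. Nothing is copied, so the memory expansion cost of building a second array is avoided.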

In regard to memory expansion, we need calldata variants for all methods related to MPT proofs (both RLP decoding and MPT branch verification) since the input data is almost always provided by the user via calldata. There's rarely a reason to copy the full RLP payload into memory, because typical use cases (like verifying a historical blockhash) only require extracting one or two fields (e.g. stateRoot, txRoot). Operating directly on calldata avoids unnecessary memory allocation and is significantly more gas-efficient. Comparing my personal implementation to RLPReader I'm saving about 40k gas when parsing every block header field (which should never really be needed but serves as a good gas comparison and unit test).

For the nibbles method, it may make sense to first check the size of the input: if it is less than or equal to 32 bytes, the above calculation could be much simpler (no loop needed).

contracts/utils/Bytes.sol

test/utils/Bytes.t.sol

Amxx

The CLZ opcode EIP https://eips.ethereum.org/EIPS/eip-7939 proposes to count the leading zero bits. This proposal counts leading zero bytes.

I think this could lead to a lot of confusion.

I would advise that we count zero bits (like the EIP). Given the number of zero bits, we can very easily get the number of zero bytes by dividing it by 8. The other way around is not possible.
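The relationship between the two counts can be illustrated with a short sketch. The library and function names (`ClzSketch`, `clzBits`, `clzBytes`) are hypothetical, and the bit count is derived from OpenZeppelin's `Math.log2`, which is an implementation choice for this example rather than the code in the PR.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Math} from "@openzeppelin/contracts/utils/math/Math.sol";

// Hypothetical library, for illustration only.
library ClzSketch {
    /// Leading zero bits of `x`; 256 when `x` is zero, as in EIP-7939.
    function clzBits(uint256 x) internal pure returns (uint256) {
        // For x > 0, Math.log2 returns the index of the highest set bit.
        return x == 0 ? 256 : 255 - Math.log2(x);
    }

    /// Leading zero bytes, derived from the bit count by integer division.
    function clzBytes(uint256 x) internal pure returns (uint256) {
        return clzBits(x) / 8;
    }
}
```

For example, `clzBits(0x0100)` is 247 and `clzBytes(0x0100)` is 30, while zero gives 256 bits and 32 bytes. The byte count alone could not recover the 247, which is the asymmetry described above.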

Addressed in c749346

ernestognw added 2 commits June 8, 2025 19:07

Add new equal, nibbles and countLeadingZeroes functions

42c79f1

Rename countLeadingZeroes to clz

5754ab8

This was referenced Jun 9, 2025

Add RLP and TrieProof libraries #5680

Draft

Add equal to Bytes.sol and update pragma to 0.8.24 in String dependencies #5726

Merged

ernestognw added this to the 5.5 milestone Jun 9, 2025

ernestognw changed the title ~~Add new equal, nibbles and clz functions to Bytes.sol~~ Add new nibbles and clz functions to Bytes.sol Jun 9, 2025

Amxx reviewed Jun 10, 2025

Merge branch 'master' into feat/bytes-rlp

4383e01

ernestognw requested a review from a team as a code owner July 9, 2025 17:47

ernestognw added 4 commits July 9, 2025 11:49

up

5a44b11

Document

d6db2d7

Merge branch 'master' into feat/bytes-rlp

9b8af8f

up

2cf0acf

Amxx reviewed Jul 15, 2025

Amxx mentioned this pull request Jul 15, 2025

Fix bug in Bytes.lastIndexOf when array is empty and position is not 2²⁵⁶-1 #5797

Merged

3 tasks

Merge branch 'master' into feat/bytes-rlp

8116913

0xClandestine reviewed Jul 17, 2025

ernestognw added 4 commits July 21, 2025 13:52

Merge branch 'master' into feat/bytes-rlp

c5426dd

up

1dfee41

up

ab00bb1

Add clz(bytes)

e6092ae

ernestognw changed the title ~~Add new nibbles and clz functions to Bytes.sol~~ Add new nibbles(bytes), clz(bytes) and clz(uint256) functions Jul 21, 2025

ernestognw added 3 commits July 21, 2025 15:18

up

ede80e9

Minimize diff

d65d7a5

up

fb91c85

Amxx reviewed Jul 22, 2025

Amxx reviewed Jul 22, 2025

Amxx reviewed Jul 22, 2025

Change CLZ behavior to count zero bits instead of zero bytes

c749346

Amxx added 2 commits July 22, 2025 23:44

Update Bytes.sol

12c87dd

fix Math tests

740a056
