-
Notifications
You must be signed in to change notification settings - Fork 12.1k
Add new nibbles(bytes)
, clz(bytes)
and clz(uint256)
functions
#5725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 740a056 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
equal
, nibbles
and clz
functions to Bytes.solnibbles
and clz
functions to Bytes.sol
contracts/utils/Bytes.sol
Outdated
function clz(uint256 x) internal pure returns (uint256) { | ||
if (x == 0) return 32; // All 32 bytes are zero | ||
uint256 r = 0; | ||
if (x > 0xffffffffffffffffffffffffffffffff) r = 128; // Upper 128 bits | ||
if ((x >> r) > 0xffffffffffffffff) r |= 64; // Next 64 bits | ||
if ((x >> r) > 0xffffffff) r |= 32; // Next 32 bits | ||
if ((x >> r) > 0xffff) r |= 16; // Next 16 bits | ||
if ((x >> r) > 0xff) r |= 8; // Next 8 bits | ||
return 31 ^ (r >> 3); // Convert to leading zero bytes count | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could reuse the Math.log256
function here
function clz(uint256 x) internal pure returns (uint256) { | |
if (x == 0) return 32; // All 32 bytes are zero | |
uint256 r = 0; | |
if (x > 0xffffffffffffffffffffffffffffffff) r = 128; // Upper 128 bits | |
if ((x >> r) > 0xffffffffffffffff) r |= 64; // Next 64 bits | |
if ((x >> r) > 0xffffffff) r |= 32; // Next 32 bits | |
if ((x >> r) > 0xffff) r |= 16; // Next 16 bits | |
if ((x >> r) > 0xff) r |= 8; // Next 8 bits | |
return 31 ^ (r >> 3); // Convert to leading zero bytes count | |
} | |
function clz(uint256 x) internal pure returns (uint256) { | |
return Math.ternary(x == 0, 32, 31 - Math.log256(x)); | |
} |
Also, echoing on the reverseByte
PR, but I think clz and reverseBytes should be in the same file (possibly a new one)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Agree that the reverseBit
functions and clz
should be in the same library. imo it's fine that we put them in Bytes.sol
. Any strong reason not to do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should distinguish between bytes
as in "arbitrary length buffer", and Bytes32
as in "a type of fixed length that has math properties".
I would argue that clz is one of these "local" math property and is closer to the Math library then it is to things like indexOf
or slice
that are more "global". By that argument, the reverseBytesXX are also more of a "local" operation (similar to clz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If CLZ operates on uint256
, I think there is a strong case for putting it in Math.
We could have a variant of CLZ in Bytes that does something like
function clz(bytes memory buffer) internal pure returns (uint256) {
uint256 i = 0;
while (i < buffer.length && buffer[i] != bytes1(0)) ++i;
return 8 * i + (i == buffer.length ? 0 : Math.clz(buffer[i]));
}
(should be optimized for reading in blocks of 32 instead of byte by byte
contracts/utils/Bytes.sol
Outdated
function nibbles(bytes memory value) internal pure returns (bytes memory) { | ||
uint256 length = value.length; | ||
bytes memory nibbles_ = new bytes(length * 2); | ||
for (uint256 i = 0; i < length; i++) { | ||
(nibbles_[i * 2], nibbles_[i * 2 + 1]) = (value[i] & 0xf0, value[i] & 0x0f); | ||
} | ||
return nibbles_; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty inefficient, consider using unchecked
at minimum.
Also unclear why this is useful:
➜ nibbles(hex"ABCD")
Type: dynamic bytes
├ Hex (Memory):
├─ Length ([0x00:0x20]): 0x0000000000000000000000000000000000000000000000000000000000000004
├─ Contents ([0x20:..]): 0xa00bc00d00000000000000000000000000000000000000000000000000000000
├ Hex (Tuple Encoded):
├─ Pointer ([0x00:0x20]): 0x0000000000000000000000000000000000000000000000000000000000000020
├─ Length ([0x20:0x40]): 0x0000000000000000000000000000000000000000000000000000000000000004
└─ Contents ([0x40:..]): 0xa00bc00d00000000000000000000000000000000000000000000000000000000
➜
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using unchecked
would be safe since overflow is impossible given bytes memory
can realistically only have a length smaller than 256 bits.
➜ bytes memory b = new bytes(type(uint32).max);
Traces:
[942682544] 0xBd770416a3345F91E4B34576cb804a576fa48EB1::run()
└─ ← [MemoryOOG] EvmError: MemoryOOG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd fix tests first and then optimize, it's likely that we may leverage other functions of the Bytes library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial motivation for this function is to use it in #5680 for RLP, but I agree that reallocating memory is perhaps not the most efficient thing to do. I suspect RLP may not require the nibbles()
function if these are read in place, but I need to experiment a bit with that.
We can add the unchecked, but, is there an alternative you were thinking of? I am more worried about the memory expansion cost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if inspecting the nibbles JIT rather than converting to separate nibbles array is worthwhile to explore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regard to memory expansion, we need calldata variants for all methods related to MPT proofs (both RLP decoding and MPT branch verification) since the input data is almost always provided by the user via calldata. There's rarely a reason to copy the full RLP payload into memory, because typical use cases (like verifying a historical blockhash) only require extracting one or two fields (e.g. stateRoot, txRoot). Operating directly on calldata avoids unnecessary memory allocation and is significantly more gas-efficient. Comparing my personal implementation to RLPReader
I'm saving about 40k gas when parsing every block header field (which should never really be needed but serves as a good gas comparison and unit test).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the nibbles method it may make sense to first check the size of the input, if less than or equal to 32 bytes the above calculation could be much simpler (no loop needed)
nibbles
and clz
functions to Bytes.solnibbles(bytes)
, clz(bytes)
and clz(uint256)
functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CLZ opcode EIP https://eips.ethereum.org/EIPS/eip-7939 propose to count the leading zero bits. This proposal counts leading zero bytes.
I think this could lead to a lot of confusion.
I would advice we count zero bits (like the EIP). If we get that number of zero bits, we can very easily figure out the number of zero bytes by just dividing it by 8. The other way around it not possible.
Addressed in c749346
Requires #5726
PR Checklist
npx changeset add
)