@zherczeg
Collaborator

This patch moves a JIT optimization into the compiler. The Unicode categories (there are 30 of them) are combined into a 32-bit bitset and stored in the xclass, instead of as a list of properties. These 32 bits, if present, follow the bitset for the first 256 characters. If the categories value is 0, it is not stored in the xclass.
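
To make the idea concrete, here is a minimal sketch of folding a list of category properties into one 32-bit value. It is not the code from the patch: `CATEGORY_COUNT`, `categories_to_bitset` and `bitset_has_category` are hypothetical names, and the mapping of category index to bit position is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: categories are numbered 0..29, one bit each. */
#define CATEGORY_COUNT 30u

/* Fold a list of category indices into a single 32-bit bitset.
   Out-of-range indices are ignored in this sketch. */
static uint32_t categories_to_bitset(const uint8_t *cats, size_t n)
{
    uint32_t bits = 0;
    for (size_t i = 0; i < n; i++) {
        if (cats[i] < CATEGORY_COUNT)
            bits |= UINT32_C(1) << cats[i];
    }
    return bits;
}

/* Membership becomes one shift and mask instead of a walk over a
   property list. */
static int bitset_has_category(uint32_t bits, unsigned cat)
{
    return cat < CATEGORY_COUNT && ((bits >> cat) & 1u) != 0;
}
```

A result of 0 means no category was listed, which corresponds to the case where the value is not stored in the xclass at all.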

The patch works, but the debug (/B) output changes in the 8-bit UCP case, because the bitset is stored in cranges and the 8-bit UCP case has no cranges.

When all 30 category bits are set, the xclass is converted to allany, or to nothing in the negated case.
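
The all-bits case can be sketched as follows. This only illustrates the rule described above and is not the patch's code: `class_kind`, `simplify_class` and `ALL_CATEGORY_BITS` are hypothetical names.

```c
#include <stdint.h>

/* Assumption: the 30 categories occupy the low 30 bits. */
#define ALL_CATEGORY_BITS ((UINT32_C(1) << 30) - 1u)

typedef enum {
    CLASS_KEEP_XCLASS, /* leave the xclass as it is */
    CLASS_ALLANY,      /* every character matches */
    CLASS_NOTHING      /* no character matches */
} class_kind;

/* If every category bit is set, the class covers all characters, so it
   collapses to allany, or to nothing when the class is negated. */
static class_kind simplify_class(uint32_t category_bits, int negated)
{
    if ((category_bits & ALL_CATEGORY_BITS) == ALL_CATEGORY_BITS)
        return negated ? CLASS_NOTHING : CLASS_ALLANY;
    return CLASS_KEEP_XCLASS;
}
```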

What do you think about this optimization?

@zherczeg
Collaborator Author

Note: this patch can wait until after the next release.

@NWilson
Member

NWilson commented Dec 26, 2024

I like the idea. I'll have to think about it a bit, but it seems OK in principle.

Nice idea.
