fix StringUtils.getDigits dropping supplementary digits #1729

Open
alhudz wants to merge 1 commit into apache:master from alhudz:getdigits-supplementary-digits

alhudz commented Jun 24, 2026

Contributor
  1. getDigits scans the input one char at a time and calls Character.isDigit(char), so a supplementary digit such as U+1D7CF (MATHEMATICAL BOLD DIGIT ONE) is dropped: neither surrogate half is a digit on its own, even though Character.isDigit(int) reports the full code point as a digit.
  2. Switched the scan to code points (codePointAt/charCount) and append each matched code point to the result with Character.toChars.
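
For readers who want to see the shape of the change, here is a minimal sketch of a code-point scan. It is not the actual diff: the class name `GetDigitsSketch` is hypothetical, and the null/empty handling is an assumption about how the existing method behaves.

```java
public final class GetDigitsSketch {

    // Hypothetical stand-in for the patched method, not the Commons Lang source.
    public static String getDigits(final String str) {
        // Assumption: null and empty input are returned as-is.
        if (str == null || str.isEmpty()) {
            return str;
        }
        final StringBuilder digits = new StringBuilder(str.length());
        int i = 0;
        while (i < str.length()) {
            // Read a whole code point, which may span two chars (a surrogate pair).
            final int codePoint = str.codePointAt(i);
            if (Character.isDigit(codePoint)) {
                // toChars yields one char for BMP digits and two for supplementary digits.
                digits.append(Character.toChars(codePoint));
            }
            // Advance by 1 or 2 chars depending on the code point's width.
            i += Character.charCount(codePoint);
        }
        return digits.toString();
    }

    public static void main(final String[] args) {
        final String boldOne = new String(Character.toChars(0x1D7CF));
        final String result = getDigits("a" + boldOne + "9");
        // Two code points survive: U+1D7CF and '9'.
        System.out.println(result.codePointCount(0, result.length())); // prints 2
    }
}
```

Advancing by `Character.charCount` is what keeps the index from landing in the middle of a surrogate pair.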

Repro: getDigits("a" + new String(Character.toChars(0x1D7CF)) + "9") is expected to return the supplementary digit followed by 9, but actually returns only 9. BMP input (including the existing Devanagari case) is unchanged.
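
The repro can be checked without the library. The sketch below assumes the old behavior is equivalent to a plain per-char loop; `SurrogateDigitRepro` and `digitsByChar` are hypothetical names used only for illustration.

```java
public final class SurrogateDigitRepro {

    // Hypothetical per-char scan that mirrors the behavior described as the bug.
    public static String digitsByChar(final String str) {
        final StringBuilder digits = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            final char c = str.charAt(i);
            // Each surrogate half is tested on its own and fails the check.
            if (Character.isDigit(c)) {
                digits.append(c);
            }
        }
        return digits.toString();
    }

    public static void main(final String[] args) {
        final char[] halves = Character.toChars(0x1D7CF); // high and low surrogate
        final String input = "a" + new String(halves) + "9";

        System.out.println(Character.isDigit(0x1D7CF));   // prints true
        System.out.println(Character.isDigit(halves[0])); // prints false
        System.out.println(Character.isDigit(halves[1])); // prints false
        System.out.println(digitsByChar(input));          // prints 9
    }
}
```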

  • Read the contribution guidelines for this project.
  • Read the ASF Generative Tooling Guidance if you use Artificial Intelligence (AI).
  • I used AI to create any part of, or all of, this pull request. Which AI tool was used to create this pull request, and to what extent did it contribute?
  • Run a successful build using the default Maven goal with mvn; that's mvn on the command line by itself.
  • Write unit tests that match behavioral changes, where the tests fail if the changes to the runtime are not applied. This may not always be possible, but it is a best practice.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Each commit in the pull request should have a meaningful subject line and body. Note that a maintainer may squash commits during the merge process.

Scan by code point so a supplementary digit such as U+1D7CF is kept instead of dropped when neither surrogate half is itself a digit.