[clang-tidy] Update confusables.txt in misc-confusable-identifiers #148399

Merged
vbvictor merged 1 commit into llvm:main from the update-ucd branch on Jul 14, 2025

Conversation

localspook
Contributor

We're currently on Unicode 14.0.0. This PR updates it to Unicode 16.0.0.

@llvmbot
Member

llvmbot commented Jul 12, 2025

@llvm/pr-subscribers-clang-tools-extra

@llvm/pr-subscribers-clang-tidy

Author: Victor Chernyakin (localspook)

Changes

We're currently on Unicode 14.0.0. This PR updates it to Unicode 16.0.0.


Patch is 39.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/148399.diff

1 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/misc/ConfusableTable/confusables.txt (+71-24)
diff --git a/clang-tools-extra/clang-tidy/misc/ConfusableTable/confusables.txt b/clang-tools-extra/clang-tidy/misc/ConfusableTable/confusables.txt
index 706177e3cf0bf..f88841b7ff0f5 100644
--- a/clang-tools-extra/clang-tidy/misc/ConfusableTable/confusables.txt
+++ b/clang-tools-extra/clang-tidy/misc/ConfusableTable/confusables.txt
@@ -1,13 +1,13 @@
-#### confusables.txt
-# Date: 2021-05-29, 22:09:29 GMT
-# © 2021 Unicode®, Inc.
+# confusables.txt
+# Date: 2024-08-14, 23:39:57 GMT
+# © 2024 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
-# For terms of use, see http://www.unicode.org/terms_of_use.html
+# For terms of use and license, see https://www.unicode.org/terms_of_use.html
 #
 # Unicode Security Mechanisms for UTS #39
-# Version: 14.0.0
+# Version: 16.0.0
 #
-# For documentation and usage, see http://www.unicode.org/reports/tr39
+# For documentation and usage, see https://www.unicode.org/reports/tr39
 #
 05AD ;	0596 ;	MA	# ( ֭ → ֖ ) HEBREW ACCENT DEHI → HEBREW ACCENT TIPEHA	# 
 
@@ -349,8 +349,8 @@ A4FA ;	002E 002E ;	MA	# ( ꓺ → .. ) LISU LETTER TONE MYA CYA → FULL STOP, F
 
 A6F4 ;	A6F3 A6F3 ;	MA	#* ( ꛴ → ꛳꛳ ) BAMUM COLON → BAMUM FULL STOP, BAMUM FULL STOP	# 
 
-30FB ;	00B7 ;	MA	#* ( ・ → · ) KATAKANA MIDDLE DOT → MIDDLE DOT	# →•→
-FF65 ;	00B7 ;	MA	#* ( ・ → · ) HALFWIDTH KATAKANA MIDDLE DOT → MIDDLE DOT	# →•→
+30FB ;	00B7 ;	MA	# ( ・ → · ) KATAKANA MIDDLE DOT → MIDDLE DOT	# →•→
+FF65 ;	00B7 ;	MA	# ( ・ → · ) HALFWIDTH KATAKANA MIDDLE DOT → MIDDLE DOT	# →•→
 16EB ;	00B7 ;	MA	#* ( ᛫ → · ) RUNIC SINGLE PUNCTUATION → MIDDLE DOT	# 
 0387 ;	00B7 ;	MA	# ( · → · ) GREEK ANO TELEIA → MIDDLE DOT	# 
 2E31 ;	00B7 ;	MA	#* ( ⸱ → · ) WORD SEPARATOR MIDDLE DOT → MIDDLE DOT	# 
@@ -577,10 +577,10 @@ FF07 ;	0027 ;	MA	#* ( ' → ' ) FULLWIDTH APOSTROPHE → APOSTROPHE	# →’
 2018 ;	0027 ;	MA	#* ( ‘ → ' ) LEFT SINGLE QUOTATION MARK → APOSTROPHE	# 
 2019 ;	0027 ;	MA	#* ( ’ → ' ) RIGHT SINGLE QUOTATION MARK → APOSTROPHE	# 
 201B ;	0027 ;	MA	#* ( ‛ → ' ) SINGLE HIGH-REVERSED-9 QUOTATION MARK → APOSTROPHE	# →′→
+05F3 ;	0027 ;	MA	#* ( ‎׳‎ → ' ) HEBREW PUNCTUATION GERESH → APOSTROPHE	# 
 2032 ;	0027 ;	MA	#* ( ′ → ' ) PRIME → APOSTROPHE	# 
 2035 ;	0027 ;	MA	#* ( ‵ → ' ) REVERSED PRIME → APOSTROPHE	# →ʽ→→‘→
 055A ;	0027 ;	MA	#* ( ՚ → ' ) ARMENIAN APOSTROPHE → APOSTROPHE	# →’→
-05F3 ;	0027 ;	MA	#* ( ‎׳‎ → ' ) HEBREW PUNCTUATION GERESH → APOSTROPHE	# 
 0060 ;	0027 ;	MA	#* ( ` → ' ) GRAVE ACCENT → APOSTROPHE	# →ˋ→→`→→‘→
 1FEF ;	0027 ;	MA	#* ( ` → ' ) GREEK VARIA → APOSTROPHE	# →ˋ→→`→→‘→
 FF40 ;	0027 ;	MA	#* ( ` → ' ) FULLWIDTH GRAVE ACCENT → APOSTROPHE	# →‘→
@@ -615,10 +615,10 @@ FF02 ;	0027 0027 ;	MA	#* ( " → '' ) FULLWIDTH QUOTATION MARK → APOSTROPHE,
 201C ;	0027 0027 ;	MA	#* ( “ → '' ) LEFT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE	# →"→
 201D ;	0027 0027 ;	MA	#* ( ” → '' ) RIGHT DOUBLE QUOTATION MARK → APOSTROPHE, APOSTROPHE	# →"→
 201F ;	0027 0027 ;	MA	#* ( ‟ → '' ) DOUBLE HIGH-REVERSED-9 QUOTATION MARK → APOSTROPHE, APOSTROPHE	# →“→→"→
+05F4 ;	0027 0027 ;	MA	#* ( ‎״‎ → '' ) HEBREW PUNCTUATION GERSHAYIM → APOSTROPHE, APOSTROPHE	# →"→
 2033 ;	0027 0027 ;	MA	#* ( ″ → '' ) DOUBLE PRIME → APOSTROPHE, APOSTROPHE	# →"→
 2036 ;	0027 0027 ;	MA	#* ( ‶ → '' ) REVERSED DOUBLE PRIME → APOSTROPHE, APOSTROPHE	# →‵‵→
 3003 ;	0027 0027 ;	MA	#* ( 〃 → '' ) DITTO MARK → APOSTROPHE, APOSTROPHE	# →″→→"→
-05F4 ;	0027 0027 ;	MA	#* ( ‎״‎ → '' ) HEBREW PUNCTUATION GERSHAYIM → APOSTROPHE, APOSTROPHE	# →"→
 02DD ;	0027 0027 ;	MA	#* ( ˝ → '' ) DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE	# →"→
 02BA ;	0027 0027 ;	MA	# ( ʺ → '' ) MODIFIER LETTER DOUBLE PRIME → APOSTROPHE, APOSTROPHE	# →"→
 02F6 ;	0027 0027 ;	MA	#* ( ˶ → '' ) MODIFIER LETTER MIDDLE DOUBLE ACUTE ACCENT → APOSTROPHE, APOSTROPHE	# →˝→→"→
@@ -1071,7 +1071,7 @@ A714 ;	02EB ;	MA	#* ( ꜔ → ˫ ) MODIFIER LETTER MID LEFT-STEM TONE BAR → MO
 25CB ;	00B0 ;	MA	#* ( ○ → ° ) WHITE CIRCLE → DEGREE SIGN	# →◦→→∘→
 25E6 ;	00B0 ;	MA	#* ( ◦ → ° ) WHITE BULLET → DEGREE SIGN	# →∘→
 
-235C ;	00B0 0332 ;	MA	#* ( ⍜ → °̲ ) APL FUNCTIONAL SYMBOL CIRCLE UNDERBAR → DEGREE SIGN, COMBINING LOW LINE	# →○̲→→∘̲→
+235C ;	00B0 0332 ;	MA	#* ( ⍜ → °̲ ) APL FUNCTIONAL SYMBOL CIRCLE UNDERBAR → DEGREE SIGN, COMBINING LOW LINE	# →○̲→
 
 2364 ;	00B0 0308 ;	MA	#* ( ⍤ → °̈ ) APL FUNCTIONAL SYMBOL JOT DIAERESIS → DEGREE SIGN, COMBINING DIAERESIS	# →◦̈→→∘̈→
 
@@ -1417,6 +1417,7 @@ A9C6 ;	A9D0 ;	MA	#* ( ꧆ → ꧐ ) JAVANESE PADA WINDU → JAVANESE DIGIT ZERO
 
 23E8 ;	2081 2080 ;	MA	#* ( ⏨ → ₁₀ ) DECIMAL EXPONENT SYMBOL → SUBSCRIPT ONE, SUBSCRIPT ZERO	# 
 
+1CCF2 ;	0032 ;	MA	# ( 𜳲 → 2 ) OUTLINED DIGIT TWO → DIGIT TWO	# 
 1D7D0 ;	0032 ;	MA	# ( 𝟐 → 2 ) MATHEMATICAL BOLD DIGIT TWO → DIGIT TWO	# 
 1D7DA ;	0032 ;	MA	# ( 𝟚 → 2 ) MATHEMATICAL DOUBLE-STRUCK DIGIT TWO → DIGIT TWO	# 
 1D7E4 ;	0032 ;	MA	# ( 𝟤 → 2 ) MATHEMATICAL SANS-SERIF DIGIT TWO → DIGIT TWO	# 
@@ -1490,6 +1491,7 @@ A9CF ;	0662 ;	MA	# ( ꧏ → ‎٢‎ ) JAVANESE PANGRANGKEP → ARABIC-INDIC DI
 335A ;	0032 70B9 ;	MA	#* ( ㍚ → 2点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR TWO → DIGIT TWO, CJK UNIFIED IDEOGRAPH-70B9	# 
 
 1D206 ;	0033 ;	MA	#* ( 𝈆 → 3 ) GREEK VOCAL NOTATION SYMBOL-7 → DIGIT THREE	# 
+1CCF3 ;	0033 ;	MA	# ( 𜳳 → 3 ) OUTLINED DIGIT THREE → DIGIT THREE	# 
 1D7D1 ;	0033 ;	MA	# ( 𝟑 → 3 ) MATHEMATICAL BOLD DIGIT THREE → DIGIT THREE	# 
 1D7DB ;	0033 ;	MA	# ( 𝟛 → 3 ) MATHEMATICAL DOUBLE-STRUCK DIGIT THREE → DIGIT THREE	# 
 1D7E5 ;	0033 ;	MA	# ( 𝟥 → 3 ) MATHEMATICAL SANS-SERIF DIGIT THREE → DIGIT THREE	# 
@@ -1529,6 +1531,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 
 335B ;	0033 70B9 ;	MA	#* ( ㍛ → 3点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR THREE → DIGIT THREE, CJK UNIFIED IDEOGRAPH-70B9	# 
 
+1CCF4 ;	0034 ;	MA	# ( 𜳴 → 4 ) OUTLINED DIGIT FOUR → DIGIT FOUR	# 
 1D7D2 ;	0034 ;	MA	# ( 𝟒 → 4 ) MATHEMATICAL BOLD DIGIT FOUR → DIGIT FOUR	# 
 1D7DC ;	0034 ;	MA	# ( 𝟜 → 4 ) MATHEMATICAL DOUBLE-STRUCK DIGIT FOUR → DIGIT FOUR	# 
 1D7E6 ;	0034 ;	MA	# ( 𝟦 → 4 ) MATHEMATICAL SANS-SERIF DIGIT FOUR → DIGIT FOUR	# 
@@ -1556,6 +1559,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 
 335C ;	0034 70B9 ;	MA	#* ( ㍜ → 4点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FOUR → DIGIT FOUR, CJK UNIFIED IDEOGRAPH-70B9	# 
 
+1CCF5 ;	0035 ;	MA	# ( 𜳵 → 5 ) OUTLINED DIGIT FIVE → DIGIT FIVE	# 
 1D7D3 ;	0035 ;	MA	# ( 𝟓 → 5 ) MATHEMATICAL BOLD DIGIT FIVE → DIGIT FIVE	# 
 1D7DD ;	0035 ;	MA	# ( 𝟝 → 5 ) MATHEMATICAL DOUBLE-STRUCK DIGIT FIVE → DIGIT FIVE	# 
 1D7E7 ;	0035 ;	MA	# ( 𝟧 → 5 ) MATHEMATICAL SANS-SERIF DIGIT FIVE → DIGIT FIVE	# 
@@ -1577,6 +1581,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 
 335D ;	0035 70B9 ;	MA	#* ( ㍝ → 5点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR FIVE → DIGIT FIVE, CJK UNIFIED IDEOGRAPH-70B9	# 
 
+1CCF6 ;	0036 ;	MA	# ( 𜳶 → 6 ) OUTLINED DIGIT SIX → DIGIT SIX	# 
 1D7D4 ;	0036 ;	MA	# ( 𝟔 → 6 ) MATHEMATICAL BOLD DIGIT SIX → DIGIT SIX	# 
 1D7DE ;	0036 ;	MA	# ( 𝟞 → 6 ) MATHEMATICAL DOUBLE-STRUCK DIGIT SIX → DIGIT SIX	# 
 1D7E8 ;	0036 ;	MA	# ( 𝟨 → 6 ) MATHEMATICAL SANS-SERIF DIGIT SIX → DIGIT SIX	# 
@@ -1605,6 +1610,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 335E ;	0036 70B9 ;	MA	#* ( ㍞ → 6点 ) IDEOGRAPHIC TELEGRAPH SYMBOL FOR HOUR SIX → DIGIT SIX, CJK UNIFIED IDEOGRAPH-70B9	# 
 
 1D212 ;	0037 ;	MA	#* ( 𝈒 → 7 ) GREEK VOCAL NOTATION SYMBOL-19 → DIGIT SEVEN	# 
+1CCF7 ;	0037 ;	MA	# ( 𜳷 → 7 ) OUTLINED DIGIT SEVEN → DIGIT SEVEN	# 
 1D7D5 ;	0037 ;	MA	# ( 𝟕 → 7 ) MATHEMATICAL BOLD DIGIT SEVEN → DIGIT SEVEN	# 
 1D7DF ;	0037 ;	MA	# ( 𝟟 → 7 ) MATHEMATICAL DOUBLE-STRUCK DIGIT SEVEN → DIGIT SEVEN	# 
 1D7E9 ;	0037 ;	MA	# ( 𝟩 → 7 ) MATHEMATICAL SANS-SERIF DIGIT SEVEN → DIGIT SEVEN	# 
@@ -1630,6 +1636,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 09EA ;	0038 ;	MA	# ( ৪ → 8 ) BENGALI DIGIT FOUR → DIGIT EIGHT	# 
 0A6A ;	0038 ;	MA	# ( ੪ → 8 ) GURMUKHI DIGIT FOUR → DIGIT EIGHT	# 
 1E8CB ;	0038 ;	MA	#* ( ‎𞣋‎ → 8 ) MENDE KIKAKUI DIGIT FIVE → DIGIT EIGHT	# 
+1CCF8 ;	0038 ;	MA	# ( 𜳸 → 8 ) OUTLINED DIGIT EIGHT → DIGIT EIGHT	# 
 1D7D6 ;	0038 ;	MA	# ( 𝟖 → 8 ) MATHEMATICAL BOLD DIGIT EIGHT → DIGIT EIGHT	# 
 1D7E0 ;	0038 ;	MA	# ( 𝟠 → 8 ) MATHEMATICAL DOUBLE-STRUCK DIGIT EIGHT → DIGIT EIGHT	# 
 1D7EA ;	0038 ;	MA	# ( 𝟪 → 8 ) MATHEMATICAL SANS-SERIF DIGIT EIGHT → DIGIT EIGHT	# 
@@ -1658,6 +1665,7 @@ A76A ;	0033 ;	MA	# ( Ꝫ → 3 ) LATIN CAPITAL LETTER ET → DIGIT THREE	#
 0B68 ;	0039 ;	MA	# ( ୨ → 9 ) ORIYA DIGIT TWO → DIGIT NINE	# 
 09ED ;	0039 ;	MA	# ( ৭ → 9 ) BENGALI DIGIT SEVEN → DIGIT NINE	# 
 0D6D ;	0039 ;	MA	# ( ൭ → 9 ) MALAYALAM DIGIT SEVEN → DIGIT NINE	# 
+1CCF9 ;	0039 ;	MA	# ( 𜳹 → 9 ) OUTLINED DIGIT NINE → DIGIT NINE	# 
 1D7D7 ;	0039 ;	MA	# ( 𝟗 → 9 ) MATHEMATICAL BOLD DIGIT NINE → DIGIT NINE	# 
 1D7E1 ;	0039 ;	MA	# ( 𝟡 → 9 ) MATHEMATICAL DOUBLE-STRUCK DIGIT NINE → DIGIT NINE	# 
 1D7EB ;	0039 ;	MA	# ( 𝟫 → 9 ) MATHEMATICAL SANS-SERIF DIGIT NINE → DIGIT NINE	# 
@@ -1715,6 +1723,7 @@ FF41 ;	0061 ;	MA	# ( a → a ) FULLWIDTH LATIN SMALL LETTER A → LATIN SMALL
 2DF6 ;	0363 ;	MA	# ( ⷶ → ͣ ) COMBINING CYRILLIC LETTER A → COMBINING LATIN SMALL LETTER A	# 
 
 FF21 ;	0041 ;	MA	# ( A → A ) FULLWIDTH LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A	# →А→
+1CCD6 ;	0041 ;	MA	#* ( 𜳖 → A ) OUTLINED LATIN CAPITAL LETTER A → LATIN CAPITAL LETTER A	# 
 1D400 ;	0041 ;	MA	# ( 𝐀 → A ) MATHEMATICAL BOLD CAPITAL A → LATIN CAPITAL LETTER A	# 
 1D434 ;	0041 ;	MA	# ( 𝐴 → A ) MATHEMATICAL ITALIC CAPITAL A → LATIN CAPITAL LETTER A	# 
 1D468 ;	0041 ;	MA	# ( 𝑨 → A ) MATHEMATICAL BOLD ITALIC CAPITAL A → LATIN CAPITAL LETTER A	# 
@@ -1817,6 +1826,7 @@ A4EF ;	2C6F ;	MA	# ( ꓯ → Ɐ ) LISU LETTER AE → LATIN CAPITAL LETTER TURNE
 
 FF22 ;	0042 ;	MA	# ( B → B ) FULLWIDTH LATIN CAPITAL LETTER B → LATIN CAPITAL LETTER B	# →Β→
 212C ;	0042 ;	MA	# ( ℬ → B ) SCRIPT CAPITAL B → LATIN CAPITAL LETTER B	# 
+1CCD7 ;	0042 ;	MA	#* ( 𜳗 → B ) OUTLINED LATIN CAPITAL LETTER B → LATIN CAPITAL LETTER B	# 
 1D401 ;	0042 ;	MA	# ( 𝐁 → B ) MATHEMATICAL BOLD CAPITAL B → LATIN CAPITAL LETTER B	# 
 1D435 ;	0042 ;	MA	# ( 𝐵 → B ) MATHEMATICAL ITALIC CAPITAL B → LATIN CAPITAL LETTER B	# 
 1D469 ;	0042 ;	MA	# ( 𝑩 → B ) MATHEMATICAL BOLD ITALIC CAPITAL B → LATIN CAPITAL LETTER B	# 
@@ -1894,12 +1904,13 @@ ABAF ;	0063 ;	MA	# ( ꮯ → c ) CHEROKEE SMALL LETTER TLI → LATIN SMALL LETTE
 2DED ;	0368 ;	MA	# ( ⷭ → ͨ ) COMBINING CYRILLIC LETTER ES → COMBINING LATIN SMALL LETTER C	# 
 
 1F74C ;	0043 ;	MA	#* ( 🝌 → C ) ALCHEMICAL SYMBOL FOR CALX → LATIN CAPITAL LETTER C	# 
-118F2 ;	0043 ;	MA	#* ( 𑣲 → C ) WARANG CITI NUMBER NINETY → LATIN CAPITAL LETTER C	# 
 118E9 ;	0043 ;	MA	# ( 𑣩 → C ) WARANG CITI DIGIT NINE → LATIN CAPITAL LETTER C	# 
+118F2 ;	0043 ;	MA	#* ( 𑣲 → C ) WARANG CITI NUMBER NINETY → LATIN CAPITAL LETTER C	# 
 FF23 ;	0043 ;	MA	# ( C → C ) FULLWIDTH LATIN CAPITAL LETTER C → LATIN CAPITAL LETTER C	# →С→
 216D ;	0043 ;	MA	# ( Ⅽ → C ) ROMAN NUMERAL ONE HUNDRED → LATIN CAPITAL LETTER C	# 
 2102 ;	0043 ;	MA	# ( ℂ → C ) DOUBLE-STRUCK CAPITAL C → LATIN CAPITAL LETTER C	# 
 212D ;	0043 ;	MA	# ( ℭ → C ) BLACK-LETTER CAPITAL C → LATIN CAPITAL LETTER C	# 
+1CCD8 ;	0043 ;	MA	#* ( 𜳘 → C ) OUTLINED LATIN CAPITAL LETTER C → LATIN CAPITAL LETTER C	# 
 1D402 ;	0043 ;	MA	# ( 𝐂 → C ) MATHEMATICAL BOLD CAPITAL C → LATIN CAPITAL LETTER C	# 
 1D436 ;	0043 ;	MA	# ( 𝐶 → C ) MATHEMATICAL ITALIC CAPITAL C → LATIN CAPITAL LETTER C	# 
 1D46A ;	0043 ;	MA	# ( 𝑪 → C ) MATHEMATICAL BOLD ITALIC CAPITAL C → LATIN CAPITAL LETTER C	# 
@@ -1995,6 +2006,7 @@ A4D2 ;	0064 ;	MA	# ( ꓒ → d ) LISU LETTER PHA → LATIN SMALL LETTER D	#
 
 216E ;	0044 ;	MA	# ( Ⅾ → D ) ROMAN NUMERAL FIVE HUNDRED → LATIN CAPITAL LETTER D	# 
 2145 ;	0044 ;	MA	# ( ⅅ → D ) DOUBLE-STRUCK ITALIC CAPITAL D → LATIN CAPITAL LETTER D	# 
+1CCD9 ;	0044 ;	MA	#* ( 𜳙 → D ) OUTLINED LATIN CAPITAL LETTER D → LATIN CAPITAL LETTER D	# 
 1D403 ;	0044 ;	MA	# ( 𝐃 → D ) MATHEMATICAL BOLD CAPITAL D → LATIN CAPITAL LETTER D	# 
 1D437 ;	0044 ;	MA	# ( 𝐷 → D ) MATHEMATICAL ITALIC CAPITAL D → LATIN CAPITAL LETTER D	# 
 1D46B ;	0044 ;	MA	# ( 𝑫 → D ) MATHEMATICAL BOLD ITALIC CAPITAL D → LATIN CAPITAL LETTER D	# 
@@ -2087,6 +2099,7 @@ AB32 ;	0065 ;	MA	# ( ꬲ → e ) LATIN SMALL LETTER BLACKLETTER E → LATIN SMAL
 22FF ;	0045 ;	MA	#* ( ⋿ → E ) Z NOTATION BAG MEMBERSHIP → LATIN CAPITAL LETTER E	# 
 FF25 ;	0045 ;	MA	# ( E → E ) FULLWIDTH LATIN CAPITAL LETTER E → LATIN CAPITAL LETTER E	# →Ε→
 2130 ;	0045 ;	MA	# ( ℰ → E ) SCRIPT CAPITAL E → LATIN CAPITAL LETTER E	# 
+1CCDA ;	0045 ;	MA	#* ( 𜳚 → E ) OUTLINED LATIN CAPITAL LETTER E → LATIN CAPITAL LETTER E	# 
 1D404 ;	0045 ;	MA	# ( 𝐄 → E ) MATHEMATICAL BOLD CAPITAL E → LATIN CAPITAL LETTER E	# 
 1D438 ;	0045 ;	MA	# ( 𝐸 → E ) MATHEMATICAL ITALIC CAPITAL E → LATIN CAPITAL LETTER E	# 
 1D46C ;	0045 ;	MA	# ( 𝑬 → E ) MATHEMATICAL BOLD ITALIC CAPITAL E → LATIN CAPITAL LETTER E	# 
@@ -2182,6 +2195,7 @@ A799 ;	0066 ;	MA	# ( ꞙ → f ) LATIN SMALL LETTER F WITH STROKE → LATIN SMAL
 
 1D213 ;	0046 ;	MA	#* ( 𝈓 → F ) GREEK VOCAL NOTATION SYMBOL-20 → LATIN CAPITAL LETTER F	# →Ϝ→
 2131 ;	0046 ;	MA	# ( ℱ → F ) SCRIPT CAPITAL F → LATIN CAPITAL LETTER F	# 
+1CCDB ;	0046 ;	MA	#* ( 𜳛 → F ) OUTLINED LATIN CAPITAL LETTER F → LATIN CAPITAL LETTER F	# 
 1D405 ;	0046 ;	MA	# ( 𝐅 → F ) MATHEMATICAL BOLD CAPITAL F → LATIN CAPITAL LETTER F	# 
 1D439 ;	0046 ;	MA	# ( 𝐹 → F ) MATHEMATICAL ITALIC CAPITAL F → LATIN CAPITAL LETTER F	# 
 1D46D ;	0046 ;	MA	# ( 𝑭 → F ) MATHEMATICAL BOLD ITALIC CAPITAL F → LATIN CAPITAL LETTER F	# 
@@ -2250,6 +2264,7 @@ FF47 ;	0067 ;	MA	# ( g → g ) FULLWIDTH LATIN SMALL LETTER G → LATIN SMALL
 018D ;	0067 ;	MA	# ( ƍ → g ) LATIN SMALL LETTER TURNED DELTA → LATIN SMALL LETTER G	# 
 0581 ;	0067 ;	MA	# ( ց → g ) ARMENIAN SMALL LETTER CO → LATIN SMALL LETTER G	# 
 
+1CCDC ;	0047 ;	MA	#* ( 𜳜 → G ) OUTLINED LATIN CAPITAL LETTER G → LATIN CAPITAL LETTER G	# 
 1D406 ;	0047 ;	MA	# ( 𝐆 → G ) MATHEMATICAL BOLD CAPITAL G → LATIN CAPITAL LETTER G	# 
 1D43A ;	0047 ;	MA	# ( 𝐺 → G ) MATHEMATICAL ITALIC CAPITAL G → LATIN CAPITAL LETTER G	# 
 1D46E ;	0047 ;	MA	# ( 𝑮 → G ) MATHEMATICAL BOLD ITALIC CAPITAL G → LATIN CAPITAL LETTER G	# 
@@ -2310,6 +2325,7 @@ FF28 ;	0048 ;	MA	# ( H → H ) FULLWIDTH LATIN CAPITAL LETTER H → LATIN CAPI
 210B ;	0048 ;	MA	# ( ℋ → H ) SCRIPT CAPITAL H → LATIN CAPITAL LETTER H	# 
 210C ;	0048 ;	MA	# ( ℌ → H ) BLACK-LETTER CAPITAL H → LATIN CAPITAL LETTER H	# 
 210D ;	0048 ;	MA	# ( ℍ → H ) DOUBLE-STRUCK CAPITAL H → LATIN CAPITAL LETTER H	# 
+1CCDD ;	0048 ;	MA	#* ( 𜳝 → H ) OUTLINED LATIN CAPITAL LETTER H → LATIN CAPITAL LETTER H	# 
 1D407 ;	0048 ;	MA	# ( 𝐇 → H ) MATHEMATICAL BOLD CAPITAL H → LATIN CAPITAL LETTER H	# 
 1D43B ;	0048 ;	MA	# ( 𝐻 → H ) MATHEMATICAL ITALIC CAPITAL H → LATIN CAPITAL LETTER H	# 
 1D46F ;	0048 ;	MA	# ( 𝑯 → H ) MATHEMATICAL BOLD ITALIC CAPITAL H → LATIN CAPITAL LETTER H	# 
@@ -2371,7 +2387,7 @@ A6B1 ;	2C75 ;	MA	# ( ꚱ → Ⱶ ) BAMUM LETTER NDAA → LATIN CAPITAL LETTER HA
 A795 ;	A727 ;	MA	# ( ꞕ → ꜧ ) LATIN SMALL LETTER H WITH PALATAL HOOK → LATIN SMALL LETTER HENG	# 
 
 02DB ;	0069 ;	MA	#* ( ˛ → i ) OGONEK → LATIN SMALL LETTER I	# →ͺ→→ι→→ι→
-2373 ;	0069 ;	MA	#* ( ⍳ → i ) APL FUNCTIONAL SYMBOL IOTA → LATIN SMALL LETTER I	# →ι→
+2373 ;	0069 ;	MA	#* ( ⍳ → i ) APL FUNCTIONAL SYMBOL IOTA → LATIN SMALL LETTER I	# →ɩ→
 FF49 ;	0069 ;	MA	# ( i → i ) FULLWIDTH LATIN SMALL LETTER I → LATIN SMALL LETTER I	# →і→
 2170 ;	0069 ;	MA	# ( ⅰ → i ) SMALL ROMAN NUMERAL ONE → LATIN SMALL LETTER I	# 
 2139 ;	0069 ;	MA	# ( ℹ → i ) INFORMATION SOURCE → LATIN SMALL LETTER I	# 
@@ -2449,6 +2465,7 @@ FF4A ;	006A ;	MA	# ( j → j ) FULLWIDTH LATIN SMALL LETTER J → LATIN SMALL
 0458 ;	006A ;	MA	# ( ј → j ) CYRILLIC SMALL LETTER JE → LATIN SMALL LETTER J	# 
 
 FF2A ;	004A ;	MA	# ( J → J ) FULLWIDTH LATIN CAPITAL LETTER J → LATIN CAPITAL LETTER J	# →Ј→
+1CCDF ;	004A ;	MA	#* ( 𜳟 → J ) OUTLINED LATIN CAPITAL LETTER J → LATIN CAPITAL LETTER J	# 
 1D409 ;	004A ;	MA	# ( 𝐉 → J ) MATHEMATICAL BOLD CAPITAL J → LATIN CAPITAL LETTER J	# 
 1D43D ;	004A ;	MA	# ( 𝐽 → J ) MATHEMATICAL ITALIC CAPITAL J → LATIN CAPITAL LETTER J	# 
 1D471 ;	004A ;	MA	# ( 𝑱 → J ) MATHEMATICAL BOLD ITALIC CAPITAL J → LATIN CAPITAL LETTER J	# 
@@ -2496,6 +2513,7 @@ AB7B ;	1D0A ;	MA	# ( ꭻ → ᴊ ) CHEROKEE SMALL LETTER GU → LATIN LETTER SMA
 
 212A ;	004B ;	MA	# ( K → K ) KELVIN SIGN → LATIN CAPITAL LETTER K	# 
 FF2B ;	004B ;	MA	# ( K → K ) FULLWIDTH LATIN CAPITAL LETTER K → LATIN CAPITAL LETTER K	# →Κ→
+1CCE0 ;	004B ;	MA	#* ( 𜳠 → K ) OUTLINED LATIN CAPITAL LETTER K → LATIN CAPITAL LETTER K	# 
 1D40A ;	004B ;	MA	# ( 𝐊 → K ) MATHEMATICAL BOLD CAPITAL K → LATIN CAPITAL LETTER K	# 
 1D43E ;	004B ;	MA	# ( 𝐾 → K ) MATHEMATICAL ITALIC CAPITAL K → LATIN CAPITAL LETTER K	# 
 1D472 ;	004B ;	MA	# ( 𝑲 → K ) MATHEMATICAL BOLD ITALIC CAPITAL K → LATIN CAPITAL LETTER K	# 
@@ -2543,6 +2561,7 @@ FFE8 ;	006C ;	MA	#* ( │ → l ) HALFWIDTH FORMS LIGHT VERTICAL → LATIN SMALL
 06F1 ;	006C ;	MA	# ( ۱ → l ) EXTENDED ARABIC-INDIC DIGIT ONE → LATIN SMALL LETTER L	# →1→
 10320 ;	006C ;	MA	#* ( 𐌠 → l ) OLD ITALIC NUMERAL ONE → LATIN SMALL LETTER L	# →𐌉→→I→
 1E8C7 ;	006C ;	MA	#* ( ‎𞣇‎ → l ) MENDE KIKAKUI DIGIT ONE → LATIN SMALL LETTER L	# 
+1CCF1 ;	006C ;	MA	# ( 𜳱 → l ) OUTLINED DIGIT ONE → LATIN SMALL LETTER L	# →1→
 1D7CF ;	006C ;	MA	# ( 𝟏 → l ) MATHEMATICAL BOLD DIGIT ONE → LATIN SMALL LETTER L	# →1→
 1D7D9 ;	006C ;	MA	# ( 𝟙 → l ) MATHEMATICAL DOUBLE-STRUCK DIGIT ONE → LATIN SMALL LETTER L	# →1→
 1D7E3 ;	006C ;	MA	# ( 𝟣 → l ) MATHEMATICAL SANS-SERIF DIGIT ONE → LATIN SMALL LETTER L	# →1→
@@ -2554,6 +2573,7 @@ FF29 ;	006C ;	MA	# ( I → l ) FULLWIDTH LATIN CAPITAL LETTER I → LATIN SMAL
 2160 ;	006C ;	MA	# ( Ⅰ → l ) ROMAN NUMERAL ONE → LATIN SMALL LETTER L	# →Ӏ→
 2110 ;	006C ;	MA	# ( ℐ → l ) SCRIPT CAPITAL I → LATIN SMALL LETTER L	# →I→
 2111 ;	006C ;	MA	# ( ℑ → l ) BLACK-LETTER CAPITAL I → LATIN SMALL LETTER L	# →I→
+1CCDE ;	006C ;	MA	#* ( 𜳞 → l ) OUTLINED LATIN CAPITAL LETTER I → LATIN SMALL LETTER L	# →I→
 1D408 ;	006C ;	MA	# ( 𝐈 → l ) MATHEMATICAL BOLD CAPITAL I → LATIN SMALL LETTER L	# →I→
 1D43C ;	006C ;	MA	# ( 𝐼 → l ) MATHEMATICAL ITALIC CAPITAL I → LATIN SMALL LETTER L	# →I→
 1D470 ;	006C ;	MA	# ( 𝑰 → l ) MATHEMATICAL BOLD ITALIC CAPITAL I → LATIN SMALL LETTER L	# →I→
@@ -2610,6 +2630,7 @@ A4F2 ;	006C ;	MA	# ( ꓲ → l ) LISU LETTER I → LATIN SMALL LETTER L	# →I
 1D22A ;	004C ;	MA	#* ( 𝈪 → L ) GREEK INSTRUMENTAL NOTATION SYMBOL-23 → LATIN CAPITAL LETTER L	# 
 216C ;	004C ;	MA	# ( Ⅼ → L ) ROMAN NUMERAL FIFTY → LATIN CAPITAL LETTER L	# 
 2112 ;	004C ;	MA	# ( ℒ → L ) SCRIPT CAPITAL L → LATIN CAPITAL LETTER L	# 
+1CCE1 ;	004C ;	MA	#* ( 𜳡 → L ) OUTLINED LATIN CAPITAL LETTER L → LATIN CAPITAL LETTER L	# 
 1D40B ;	004C ;	MA	# ( 𝐋 → L ) MATHEMATICAL BOLD CAPITAL L → LATIN CAPITAL LETTER L	# 
 1D43F ;	004C ;	MA	# ( 𝐿 → L ) MATHEMATICAL ITALIC CAPITAL L → LATIN CAPITAL LETTER L	# 
 1D473 ;	004C ;	MA	# ( 𝑳 → L ) MATHEMATICAL BOLD ITALIC CAPITAL L → LATIN CAPITAL LETTER L	# 
@@ -2761,11 +2782,11 @@ FE87 ;	006C 0655 ;	MA	# ( ‎ﺇ‎ → lٕ ) ARABIC LETTER ALEF WITH HAMZA BELO
 
 02AB ;	006C 007A ;	MA	# ( ʫ → lz ) LATIN SMALL LETTER LZ DIGRAPH → LATIN SMALL LETTER L, LATIN SMALL LETTER Z	# 
 
+0675 ;	006C 0674 ;	MA	# ( ‎ٵ‎ → ‎lٴ‎ ) ARABIC LETTER HIGH HAMZA ALEF → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎اٴ‎→
 0623 ;	006C 0674 ;	MA	# ( ‎أ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎ٵ‎→→‎اٴ‎→
 FE84 ;	006C 0674 ;	MA	# ( ‎ﺄ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎أ‎→→‎ٵ‎→→‎اٴ‎→
 FE83 ;	006C 0674 ;	MA	# ( ‎ﺃ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH HAMZA ABOVE ISOLATED FORM → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎ٵ‎→→‎اٴ‎→
 0672 ;	006C 0674 ;	MA	# ( ‎ٲ‎ → ‎lٴ‎ ) ARABIC LETTER ALEF WITH WAVY HAMZA ABOVE → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎أ‎→→‎ٵ‎→→‎اٴ‎→
-0675 ;	006C 0674 ;	MA	# ( ‎ٵ‎ → ‎lٴ‎ ) ARABIC LETTER HIGH HAMZA ALEF → LATIN SMALL LETTER L, ARABIC LETTER HIGH HAMZA	# →‎اٴ‎→
 
 FDF3 ;	006C 0643 0628 0631 ;	MA	# ( ‎ﷳ‎ → ‎lكبر‎ ) ARABIC LIGATURE AKBAR ISOLATED FORM → LATIN SMALL LETTER L, ARABIC LETTER KAF, ARABIC LETTER BEH, ARABIC LETTER REH	# →‎اكبر‎→
 
@@ -2784,6 +2805,7 @@ ABAE ;	029F ;	MA	# ( ꮮ → ʟ ) CHEROKEE SMALL LETTER TLE → LATIN LETTER SMA
 FF2D ;	004D ;	MA	# ( M → M ) FULLWIDTH LATIN CAPITAL LETTER M → LATIN CAPITAL LETTER M	# →Μ→
 216F ;	004D ;	MA	# ( Ⅿ → M ) ROMAN NUMERAL ONE THOUSAND → LATIN CAPITAL LETTER M	# 
 2133 ;	004D ;	MA	# ( ℳ → M ) SCRIPT CAPITAL M → LATIN CAPITAL LETTER M	# 
+1CCE2 ;	004D ;	MA	#* ( 𜳢 → M ) OUTLINED LATIN CAPITAL LETTER M → LATIN CAPITAL LETTER M	# 
 1D40C ;...
[truncated]
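
For readers unfamiliar with the file being updated: each data line in the diff above has the form `source ; target ; type # comment`, with code points written in hexadecimal and the target possibly spanning several code points. The following is a minimal Python sketch of reading one such line. The function name `parse_confusable_line` is hypothetical, the sample line is abridged from the diff, and this is not meant to describe how clang-tidy itself consumes the file.

```python
def parse_confusable_line(line):
    """Parse one line of confusables.txt into (source, [targets]).

    Returns None for blank lines and comment-only lines.
    Hypothetical helper for illustration only.
    """
    # Everything after the first '#' is a human-readable comment.
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    # Fields are separated by ';' and padded with tabs or spaces.
    fields = [field.strip() for field in data.split(";")]
    source = int(fields[0], 16)
    # The target may be a sequence of code points separated by spaces.
    targets = [int(cp, 16) for cp in fields[1].split()]
    return source, targets


# Sample data line, abridged from the diff (comment text shortened).
sample = "1CCF2 ;\t0032 ;\tMA\t# OUTLINED DIGIT TWO -> DIGIT TWO"
source, targets = parse_confusable_line(sample)
print(f"U+{source:04X} ->", " ".join(f"U+{cp:04X}" for cp in targets))
```

Header lines such as `# Version: 16.0.0` start with `#`, so the sketch treats them as comments and returns `None` for them.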

@vbvictor
Contributor

Is this file taken as-is from some place or crafted by hand? Could you share a source?

@localspook
Contributor Author


@vbvictor vbvictor left a comment

LGTM.
You should put the verb first in the commit message so that the whole message reads like a sentence:
"Update 'confusables.txt' in 'misc-confusable-identifiers'"
That is the usual naming convention in LLVM (see other open PRs).

@localspook localspook changed the title from "[clang-tidy] misc-confusable-identifiers: Update confusables.txt" to "[clang-tidy] Update confusables.txt in misc-confusable-identifiers" on Jul 12, 2025
@vbvictor vbvictor merged commit 4328b69 into llvm:main Jul 14, 2025
12 checks passed
@localspook localspook deleted the update-ucd branch July 15, 2025 05:04