Merge pull request #139 from n8willis/unicode14
Update script data to Unicode 14.
n8willis authored Feb 7, 2022
2 parents a0fabe2 + 1c8761b commit 0359447
Showing 7 changed files with 92 additions and 30 deletions.
91 changes: 75 additions & 16 deletions character-tables/character-tables-arabic.md
@@ -8,6 +8,7 @@ This document lists the per-character shaping information needed to
- [Arabic character table](#arabic-character-table)
- [Arabic Supplement character table](#arabic-supplement-character-table)
- [Arabic Extended-A character table](#arabic-extended-a-character-table)
- [Arabic Extended-B character table](#arabic-extended-b-character-table)
- [Rumi Numeral Symbols character table](#rumi-numeral-symbols-character-table)
- [Miscellaneous character table](#miscellaneous-character-table)

@@ -78,7 +79,7 @@ treated differently during the mark-reordering stage.
|`U+061A` | Mark [Mn] | TRANSPARENT | _null_ | 32 | ؚ Small Kasra |
|`U+061B` | Punctuation | NON_JOINING | _null_ | _0_ | ؛ Semicolon |
|`U+061C` | Other | TRANSPARENT | _null_ | _0_ | ؜ Arabic Letter Mark |
|`U+061D` | _unassigned_ | | | | |
|`U+061D` | Punctuation | NON_JOINING | _null_ | _0_ | ؝ End Of Text Mark |
|`U+061E` | Punctuation | NON_JOINING | _null_ | _0_ | ؞ Triple Dot Punctuation Mark |
|`U+061F` | Punctuation | NON_JOINING | _null_ | _0_ | ؟ Question Mark |
| | | | | |
@@ -406,7 +407,7 @@ treated differently during the mark-reordering stage.
|`U+08B2` | Letter | RIGHT | REH | _0_ | ࢲ Reh With Dot And Inverted V Above |
|`U+08B3` | Letter | DUAL | AIN | _0_ | ࢳ Ain With 3 Dots Below |
|`U+08B4` | Letter | DUAL | KAF | _0_ | ࢴ Kaf With Dot Below |
|`U+08B5` | _unassigned_ | | | | |
|`U+08B5` | Letter | DUAL | QAF | _0_ | ࢵ Qaf With Dot Below |
|`U+08B6` | Letter | DUAL | BEH | _0_ | ࢶ Beh With Meem Above |
|`U+08B7` | Letter | DUAL | BEH | _0_ | ࢷ Dotless Beh With 3 Dots Below And Meem Above |
|`U+08B8` | Letter | DUAL | BEH | _0_ | ࢸ Dotless Beh With Teh Above |
@@ -426,18 +427,18 @@ treated differently during the mark-reordering stage.
|`U+08C5` | Letter | DUAL | HAH | _0_ | ࣅ Jeem With 3 Dots Above |
|`U+08C6` | Letter | DUAL | HAH | _0_ | ࣆ Jeem With 3 Dots Below |
|`U+08C7` | Letter | DUAL | LAM | _0_ | ࣇ Lam With Small Arabic Tah Above |
|`U+08C8` | _unassigned_ | | | | |
|`U+08C9` | _unassigned_ | | | | |
|`U+08CA` | _unassigned_ | | | | |
|`U+08CB` | _unassigned_ | | | | |
|`U+08CC` | _unassigned_ | | | | |
|`U+08CD` | _unassigned_ | | | | |
|`U+08CE` | _unassigned_ | | | | |
|`U+08CF` | _unassigned_ | | | | |
| | | | | |
|`U+08D0` | _unassigned_ | | | | |
|`U+08D1` | _unassigned_ | | | | |
|`U+08D2` | _unassigned_ | | | | |
|`U+08C8` | Letter | DUAL | GAF | _0_ | ࣈ Graf |
|`U+08C9` | Letter modifier | TRANSPARENT | _null_ | _0_ | ࣉ Small Farsi Yeh |
|`U+08CA` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣊ Small High Farsi Yeh |
|`U+08CB` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣋ Small High Yeh Barree With Two Dots Below |
|`U+08CC` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣌ Small High Word Sah |
|`U+08CD` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣍ Small High Zah |
|`U+08CE` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣎ Large Round Dot Above |
|`U+08CF` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࣏ Large Round Dot Below |
| | | | | |
|`U+08D0` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࣐ Sukun Below |
|`U+08D1` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࣑ Large Circle Below |
|`U+08D2` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࣒ Large Round Dot Inside Circle Below |
|`U+08D3` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࣓ Small Low Waw |
|`U+08D4` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣔ Small High Word Ar-Rub |
|`U+08D5` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣕ Small High Sad |
@@ -451,7 +452,7 @@ treated differently during the mark-reordering stage.
|`U+08DD` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣝ Small High Word Sakta |
|`U+08DE` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣞ Small High Word Qif |
|`U+08DF` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣟ Small High Word Waqfa |
| | | | | |
| | | | | |
|`U+08E0` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣠ Small High Footnote Marker |
|`U+08E1` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣡ Small High Sign Safha |
|`U+08E2` | Other | NON_JOINING | _null_ | _0_ | ࣢ Disputed End Of Ayah |
@@ -487,6 +488,64 @@ treated differently during the mark-reordering stage.
|`U+08FF` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࣿ Mark Sideways Noon Ghunna |


## Arabic Extended-B character table ##


| Codepoint | Unicode category | Joining type | Joining group | Mark class | Glyph |
|:----------|:-----------------|:-------------|:---------------------|:-----------|-------------------------------------------------------|
|`U+0870` | Letter | RIGHT | ALEF | _0_ | ࡰ Alef With Attached Fatha |
|`U+0871` | Letter | RIGHT | ALEF | _0_ | ࡱ Alef With Attached Top Right Fatha |
|`U+0872` | Letter | RIGHT | ALEF | _0_ | ࡲ Alef With Right Middle Stroke |
|`U+0873` | Letter | RIGHT | ALEF | _0_ | ࡳ Alef With Left Middle Stroke |
|`U+0874` | Letter | RIGHT | ALEF | _0_ | ࡴ Alef With Attached Kasra |
|`U+0875` | Letter | RIGHT | ALEF | _0_ | ࡵ Alef With Attached Bottom Right Kasra |
|`U+0876` | Letter | RIGHT | ALEF | _0_ | ࡶ Alef With Attached Round Dot Above |
|`U+0877` | Letter | RIGHT | ALEF | _0_ | ࡷ Alef With Attached Right Round Dot |
|`U+0878` | Letter | RIGHT | ALEF | _0_ | ࡸ Alef With Attached Left Round Dot |
|`U+0879` | Letter | RIGHT | ALEF | _0_ | ࡹ Alef With Attached Round Dot Below |
|`U+087A` | Letter | RIGHT | ALEF | _0_ | ࡺ Alef With Dot Above |
|`U+087B` | Letter | RIGHT | ALEF | _0_ | ࡻ Alef With Attached Top Right Fatha And Dot Above|
|`U+087C` | Letter | RIGHT | ALEF | _0_ | ࡼ Alef With Right Middle Stroke And Dot Above |
|`U+087D` | Letter | RIGHT | ALEF | _0_ | ࡽ Alef With Attached Bottom Right Kasra And Dot Above|
|`U+087E` | Letter | RIGHT | ALEF | _0_ | ࡾ Alef With Attached Top Right Fatha And Left Ring|
|`U+087F` | Letter | RIGHT | ALEF | _0_ | ࡿ Alef With Right Middle Stroke And Left Ring |
| | | | | |
|`U+0880` | Letter | RIGHT | ALEF | _0_ | ࢀ Alef With Attached Bottom Right Kasra And Left Ring|
|`U+0881` | Letter | RIGHT | ALEF | _0_ | ࢁ Alef With Attached Right Hamza |
|`U+0882` | Letter | RIGHT | ALEF | _0_ | ࢂ Alef With Attached Left Hamza |
|`U+0883` | Letter modifier | JOIN_CAUSING | _null_ | _0_ | ࢃ Tatweel With Overstruck Hamza |
|`U+0884` | Letter modifier | JOIN_CAUSING | _null_ | _0_ | ࢄ Tatweel With Overstruck Waw |
|`U+0885` | Letter modifier | JOIN_CAUSING | _null_ | _0_ | ࢅ Tatweel With Two Dots Below |
|`U+0886` | Letter | DUAL | THIN_YEH | _0_ | ࢆ Thin Yeh |
|`U+0887` | Letter | NON_JOINING | _null_ | _0_ | ࢇ Baseline Round Dot |
|`U+0888` | Symbol | NON_JOINING | _null_ | _0_ | ࢈ Raised Round Dot |
|`U+0889` | Letter | DUAL | NOON | _0_ | ࢉ Noon With Inverted Small V |
|`U+088A` | Letter | DUAL | HAH | _0_ | ࢊ Hah With Inverted Small V Below |
|`U+088B` | Letter | DUAL | TAH | _0_ | ࢋ Tah With Dot Below |
|`U+088C` | Letter | DUAL | TAH | _0_ | ࢌ Tah With Three Dots Below |
|`U+088D` | Letter | DUAL | GAF | _0_ | ࢍ Keheh With Two Dots Vertically Below |
|`U+088E` | Letter | RIGHT | VERTICAL_TAIL | _0_ | ࢎ Vertical Tail |
|`U+088F` | _unassigned_ | | | | |
| | | | | |
|`U+0890` | Symbol | NON_JOINING | _null_ | _0_ | ࢐ Pound Mark Above |
|`U+0891` | Symbol | NON_JOINING | _null_ | _0_ | ࢑ Piastre Mark Above |
|`U+0892` | _unassigned_ | | | | |
|`U+0893` | _unassigned_ | | | | |
|`U+0894` | _unassigned_ | | | | |
|`U+0895` | _unassigned_ | | | | |
|`U+0896` | _unassigned_ | | | | |
|`U+0897` | _unassigned_ | | | | |
|`U+0898` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࢘ Small High Word Al-Juz |
|`U+0899` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࢙ Small Low Word Ishmaam |
|`U+089A` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࢚ Small Low Word Imaala |
|`U+089B` | Mark [Mn] | TRANSPARENT | _null_ | 220 | ࢛ Small Low Word Tasheel |
|`U+089C` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࢜ Madda Waajib |
|`U+089D` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࢝ Superscript Alef Mokhassas |
|`U+089E` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࢞ Doubled Madda |
|`U+089F` | Mark [Mn] | TRANSPARENT | _null_ | 230 | ࢟ Half Madda Over Madda |
| | | | | |


## Rumi Numeral Symbols character table ##

| Codepoint | Unicode category | Joining type | Joining group | Mark class | Glyph |
@@ -523,7 +582,7 @@ treated differently during the mark-reordering stage.
|`U+10E7C` | Number | NON_JOINING | _null_ | _0_ | 𐹼 Fraction One Quarter |
|`U+10E7D` | Number | NON_JOINING | _null_ | _0_ | 𐹽 Fraction One Third |
|`U+10E7E` | Number | NON_JOINING | _null_ | _0_ | 𐹾 Fraction Two Thirds |
|`U+10E7F` | _unasigned_ | | | | |
|`U+10E7F` | _unassigned_ | | | | |


<!---
2 changes: 1 addition & 1 deletion character-tables/character-tables-kannada.md
@@ -137,7 +137,7 @@ specific, script-aware behavior.
|`U+0CDA` | _unassigned_ | | | |
|`U+0CDB` | _unassigned_ | | | |
|`U+0CDC` | _unassigned_ | | | |
|`U+0CDD` | _unassigned_ | | | |
|`U+0CDD` | Letter | CONSONANT_DEAD | _null_ | &#x0CDD; Nakaara Pollu |
|`U+0CDE` | Letter | CONSONANT | _null_ | &#x0CDE; Fa |
|`U+0CDF` | _unassigned_ | | | |
| | | | |
2 changes: 1 addition & 1 deletion character-tables/character-tables-mongolian.md
@@ -61,7 +61,7 @@ treated differently during the mark-reordering stage.
|`U+180C` | Mark [Mn] | TRANSPARENT | _null_ | _0_ | &#x180C; Free Variation Selector Two |
|`U+180D` | Mark [Mn] | TRANSPARENT | _null_ | _0_ | &#x180D; Free Variation Selector Three |
|`U+180E` | Formatting | NON_JOINING | _null_ | _0_ | &#x180E; Mongolian Vowel Separator |
|`U+180F` | _unassigned_ | | | | |
|`U+180F` | Mark [Mn] | TRANSPARENT | _null_ | _0_ | &#x180f; Free Variation Selector Four |
| | | | | |
|`U+1810` | Number | NON_JOINING | _null_ | _0_ | &#x1810; Digit Zero |
|`U+1811` | Number | NON_JOINING | _null_ | _0_ | &#x1811; Digit One |
4 changes: 2 additions & 2 deletions character-tables/character-tables-telugu.md
@@ -102,7 +102,7 @@ specific, script-aware behavior.
|`U+0C39` | Letter | CONSONANT | _null_ | &#x0C39; Ha |
|`U+0C3A` | _unassigned_ | | | |
|`U+0C3B` | _unassigned_ | | | |
|`U+0C3C` | _unassigned_ | | | |
|`U+0C3C` | Mark [Mn] | NUKTA | BOTTOM_POSITION | &#x0C3C; Nukta |
|`U+0C3D` | Letter | AVAGRAHA | _null_ | &#x0C3D; Avagraha |
|`U+0C3E` | Mark [Mn] | VOWEL_DEPENDENT | TOP_POSITION | &#x0C3E; Sign Aa |
|`U+0C3F` | Mark [Mn] | VOWEL_DEPENDENT | TOP_POSITION | &#x0C3F; Sign I |
@@ -137,7 +137,7 @@ specific, script-aware behavior.
|`U+0C5A` | Letter | CONSONANT | _null_ | &#x0C5A; Rrra |
|`U+0C5B` | _unassigned_ | | | |
|`U+0C5C` | _unassigned_ | | | |
|`U+0C5D` | _unassigned_ | | | |
|`U+0C5D` | Letter | CONSONANT_DEAD | _null_ | &#x0C5D; Nakaara Pollu |
|`U+0C5E` | _unassigned_ | | | |
|`U+0C5F` | _unassigned_ | | | |
| | | | |
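
Every table touched by this commit uses the same pipe-delimited row shape: a backquoted `U+XXXX` codepoint in the first cell, and `_unassigned_` in the second cell for empty slots. The following is a minimal sketch, not part of the commit, of how one might parse such rows and list the codepoints that change from unassigned to assigned between two versions of a table. The helper names (`parse_row`, `is_unassigned`, `newly_assigned`) are hypothetical and chosen only for illustration.

```python
import re

# Matches the first cell of a data row, e.g. `U+0C5D` (backquotes included).
CODEPOINT_RE = re.compile(r"`U\+([0-9A-Fa-f]{4,6})`")


def parse_row(line):
    """Return (codepoint, remaining_cells) for a data row, else None.

    Header rows, alignment rows, and the blank separator rows used
    between 16-codepoint groups all yield None.
    """
    if not line.startswith("|"):
        return None
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    match = CODEPOINT_RE.fullmatch(cells[0])
    if match is None:
        return None
    return int(match.group(1), 16), cells[1:]


def is_unassigned(cells):
    """True if the row marks its codepoint as unassigned."""
    return cells[0] == "_unassigned_"


def newly_assigned(old_lines, new_lines):
    """Codepoints assigned in new_lines that were missing or unassigned before."""
    old = dict(filter(None, map(parse_row, old_lines)))
    new = dict(filter(None, map(parse_row, new_lines)))
    return sorted(
        cp
        for cp, cells in new.items()
        if not is_unassigned(cells) and (cp not in old or is_unassigned(old[cp]))
    )


# Rows copied from the Telugu hunk above (old version, then new version).
old_rows = ["|`U+0C5D` | _unassigned_ | | | |"]
new_rows = ["|`U+0C5D` | Letter | CONSONANT_DEAD | _null_ | &#x0C5D; Nakaara Pollu |"]

print([f"U+{cp:04X}" for cp in newly_assigned(old_rows, new_rows)])
# prints ['U+0C5D']
```

The sketch deliberately ignores column meaning beyond the first two cells, since the Arabic and Mongolian tables carry one more column than the Kannada and Telugu tables.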