[cmap] Documentation of cmap table lacks information on normalization #591

@NorbertLindenberg

The documentation of the cmap table says nothing about the Unicode normalization that font consumers should apply when using cmap tables to map characters to glyphs, and that font producers can rely on when constructing cmap tables and lookups in GSUB and GPOS tables.

Unicode normalization is relevant in two ways:

– It defines canonical decompositions of characters, such as U+00E4 “ä” to U+0061 “a” + U+0308 “◌̈”. In the absence of information about normalization, font producers have to provide entries for both precomposed and decomposed forms, and then either handle both in subsequent lookups or apply their own normalization.

– It defines canonical ordering of marks, which brings sequences of certain marks (those with a non-zero combining class) into a defined order. In the absence of information about normalization, font producers have to be prepared to handle such marks in any order in ligature or contextual lookups, and font consumers such as shaping engines have to be prepared to handle such marks in any order in cluster validation (see bug #568 for an example of their failure to do so).
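Both effects can be illustrated with a minimal sketch. It uses Python's standard `unicodedata` module as a stand-in for whatever normalization a font consumer might apply; it is not a description of what any shaping engine actually does. The `codepoints` helper and the choice of U+0307 and U+0323 as example marks are assumptions made for illustration only.

```python
import unicodedata


def codepoints(text):
    """Hypothetical helper: list the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Canonical decomposition: NFD maps precomposed U+00E4 to U+0061 + U+0308.
precomposed = "\u00E4"
decomposed = unicodedata.normalize("NFD", precomposed)
print(codepoints(precomposed), "->", codepoints(decomposed))

# Canonical ordering: marks with non-zero combining classes are sorted
# by class. U+0307 (dot above) and U+0323 (dot below) are example marks
# chosen for this sketch.
dot_above = "\u0307"
dot_below = "\u0323"
print(unicodedata.combining(dot_above), unicodedata.combining(dot_below))

one_order = "a" + dot_above + dot_below
other_order = "a" + dot_below + dot_above

# After NFD, both input orders yield the same code point sequence, so a
# lookup written against normalized text only has to match one order.
normalized_one = unicodedata.normalize("NFD", one_order)
normalized_other = unicodedata.normalize("NFD", other_order)
print(codepoints(normalized_one))
print(normalized_one == normalized_other)
```

Without a documented normalization step, a font would need cmap entries and lookups covering every one of these input variants.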

