[UFO4] support cmap Unicode Variation Sequences #79

Open
moyogo opened this issue Jan 9, 2019 · 7 comments
Labels: considering (Specification change under consideration), proposal (Proposed specification change), ufo4 (UFO 4 issues)

Comments

@moyogo
Collaborator

moyogo commented Jan 9, 2019

See various comments in #77.

In particular #77 (comment):

The UVS data can be represented by a sequence of (unicodeValue, variationSelector, glyphName) tuples, where glyphName is optional. No glyph name means: this is the default variation, and the cmap should be used to find the glyph name for this code point.
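
A minimal Python sketch of that tuple representation, for illustration only. The names `VariationSequence` and `resolve_glyph` are hypothetical, and the second sequence (with `FE01`) is made-up sample data that only demonstrates the "no glyph name" case:

```python
from typing import Dict, NamedTuple, Optional


class VariationSequence(NamedTuple):
    """One (unicodeValue, variationSelector, glyphName) entry."""
    unicode_value: int
    variation_selector: int
    glyph_name: Optional[str] = None  # None marks a default variation


def resolve_glyph(seq: VariationSequence, cmap: Dict[int, str]) -> str:
    """Return the glyph name for a sequence, falling back to the cmap."""
    if seq.glyph_name is not None:
        return seq.glyph_name
    # Default variation: look the code point up in the regular cmap.
    return cmap[seq.unicode_value]


cmap = {0x0030: "zero"}
sequences = [
    VariationSequence(0x0030, 0xFE00, "zero.slash"),
    VariationSequence(0x0030, 0xFE01),  # hypothetical default sequence
]
resolved = [resolve_glyph(seq, cmap) for seq in sequences]
```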

@justvanrossum
Contributor

I see two ways of storing the UVS data:

  1. As a nested structure, a dict at the top level, mapping variationSelector keys to dicts, that map unicodeValue keys to glyphName strings.
  2. A two-dimensional table of rows with three fields each.

Option 1 can be stored in plist format, with the caveat that we need to convert unicode value keys to (hex) strings, as plist dict keys must be strings. The nested data structure closely resembles the internal structure of the OpenType format 14 cmap subtable.

Option 2 could be stored as a tab-separated text file, with the caveat that care has to be taken to respect the "no restrictions in glyph names" UFO policy. The lines in the file represent the Variation Sequences quite literally: 0030 FE00 zero.slash.

Option 1 is more machine-friendly, option 2 is more human-friendly.

@justvanrossum
Contributor

Option 1 with just one sequence:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>FE00</key>
    <dict>
      <key>0030</key>
      <string>zero.slash</string>
    </dict>
  </dict>
</plist>
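
As a rough sketch of how a tool might read this, the plist above can be loaded with Python's standard `plistlib` and the hex string keys converted back to integers. The function name `load_uvs_plist` is hypothetical and not part of any UFO library:

```python
import plistlib

PLIST = b"""<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>FE00</key>
    <dict>
      <key>0030</key>
      <string>zero.slash</string>
    </dict>
  </dict>
</plist>
"""


def load_uvs_plist(data: bytes) -> dict:
    """Parse the nested plist into {selector: {unicodeValue: glyphName}}."""
    raw = plistlib.loads(data)
    # plist dict keys are always strings, so the hex keys are converted here.
    return {
        int(selector, 16): {int(uv, 16): name for uv, name in mapping.items()}
        for selector, mapping in raw.items()
    }


uvs = load_uvs_plist(PLIST)
```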

Option 2 with just one sequence:

0030 FE00 zero.slash
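
A possible reader for this format, sketched in Python. It assumes the fields are separated by real tab characters and that the glyph name is everything after the second tab, so names containing spaces survive; names containing tabs or newlines would still need an escaping rule that is not defined here. `parse_uvs_lines` is a hypothetical name:

```python
def parse_uvs_lines(text: str) -> list:
    """Parse 'unicode<TAB>selector<TAB>glyphName' lines into tuples."""
    sequences = []
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        # Split at most twice so the glyph name keeps any other characters.
        fields = line.split("\t", 2)
        if len(fields) < 2:
            raise ValueError(f"malformed line: {line!r}")
        unicode_value = int(fields[0], 16)
        selector = int(fields[1], 16)
        # A missing or empty third field is read as "no glyph name".
        glyph_name = fields[2] if len(fields) == 3 and fields[2] else None
        sequences.append((unicode_value, selector, glyph_name))
    return sequences


parsed = parse_uvs_lines("0030\tFE00\tzero.slash\n")
```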

@justvanrossum
Contributor

Storing UVS could be combined with the "regular" character mapping, by using an optional third column for the variation selector:

0030 zero
0030 zero.slash FE00

Or maybe we should consider using (a dialect of) csv:

0030;zero;
0030;zero.slash;FE00
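
To make the combined layout concrete, here is a sketch that reads the semicolon-separated variant with Python's standard `csv` module and splits it into a regular character mapping and a UVS mapping. The function name and the returned structures are assumptions for illustration; glyph names containing `;` or quote characters would depend on whatever quoting rules the chosen dialect defines:

```python
import csv
import io


def parse_character_mapping(text: str):
    """Split 'unicode;glyphName;selector' rows into cmap and UVS dicts."""
    cmap = {}  # unicodeValue -> glyphName
    uvs = {}   # selector -> {unicodeValue: glyphName}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row:
            continue  # skip blank lines
        unicode_value = int(row[0], 16)
        glyph_name = row[1]
        selector = row[2] if len(row) > 2 else ""
        if selector:
            uvs.setdefault(int(selector, 16), {})[unicode_value] = glyph_name
        else:
            # An empty third column means a regular cmap entry.
            cmap[unicode_value] = glyph_name
    return cmap, uvs


cmap, uvs = parse_character_mapping("0030;zero;\n0030;zero.slash;FE00\n")
```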

@benkiel
Contributor

benkiel commented Jun 23, 2020

@khaledhosny do you have an opinion on which of the options @justvanrossum proposed would be better to work with, and on any unseen gotchas they may be missing?

@khaledhosny

I don’t have a deep knowledge of the matter, so whatever works with the tools that consume this is fine for me.

@benkiel
Contributor

benkiel commented Jun 24, 2020

From Twitter:

In order to deal with default vs non-default UVSes, which is important for IVSes, I suggest something along the lines of the following (excerpt from the Adobe-Japan1 IVD collection):
8FBB E0100;cid3056
8FBB E0101;cid8267
Which UVS is default depends on which glyph is encoded.

JIS90-savvy Japanese fonts encode CID+3056 from U+8FBB 辻, meaning <8FBB E0100> 辻󠄀 is the default UVS. JIS2004-savvy ones encode CID+8267 from U+8FBB 辻, meaning <8FBB E0101> 辻󠄁 is the default UVS. The other, of course, is non-default, and requires a UVS to display properly.

And, to be clear, both UVSes should be specified so long as the font includes both glyphs, and both UVSes should be present and accounted for in the Format 14 'cmap' subtable.

Which UVS is default needs to be determined at compile time, because interaction with the Format 12 subtable is required to ascertain which glyph that corresponds to a UVS is encoded, and therefore the default one.

If you are looking for an extreme test case, check out the latest version of “IVS Test,” which I deployed a little over a year ago, and whose Format 14 'cmap' subtable includes nearly 40 million UVSes: https://github.com/adobe-fonts/ivs-test

@benkiel
Contributor

benkiel commented Jun 25, 2020

Which, by my reading, means the spec needs to state that the tool making the font decides which UVS is the default. All the designer can do is specify the UVSes for the cmap.

I'm leaning towards option 2, as it seems the easiest for editing this data (yes, spreadsheets).
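
The compile-time decision described above could look roughly like the following Python sketch. It assumes the UVS data is a list of (unicodeValue, variationSelector, glyphName) tuples and that the regular cmap is a plain dict; `split_default_uvs` is a hypothetical helper, not an existing tool API. A sequence is treated as default when its glyph is the one the cmap already encodes for that code point:

```python
def split_default_uvs(cmap: dict, sequences: list):
    """Partition UVS entries into default and non-default lists."""
    default, non_default = [], []
    for unicode_value, selector, glyph_name in sequences:
        if cmap.get(unicode_value) == glyph_name:
            # The cmap already maps this code point to the same glyph.
            default.append((unicode_value, selector))
        else:
            non_default.append((unicode_value, selector, glyph_name))
    return default, non_default


# The two sequences from the Adobe-Japan1 excerpt quoted above.
sequences = [
    (0x8FBB, 0xE0100, "cid3056"),
    (0x8FBB, 0xE0101, "cid8267"),
]
jis90_cmap = {0x8FBB: "cid3056"}    # JIS90-style encoding
jis2004_cmap = {0x8FBB: "cid8267"}  # JIS2004-style encoding

jis90_split = split_default_uvs(jis90_cmap, sequences)
jis2004_split = split_default_uvs(jis2004_cmap, sequences)
```

With the same source data, the JIS90-style cmap makes <8FBB E0100> the default sequence and the JIS2004-style cmap makes <8FBB E0101> the default, which matches the behaviour described in the quoted comment.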
