Support multi-character emoji #511

forresto · 2022-05-02T11:04:29Z

This issue with monochrome Noto Emoji is distinct from the color emoji issue (#193).

#338 added support for non-Basic-Multilingual-Plane (BMP) characters, but uses Array.from, which doesn't account for combined emoji.
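To make the limitation concrete: Array.from iterates by code point, so surrogate pairs survive, but a ZWJ sequence is still split into its parts. A minimal illustration (the escape sequences spell out the family emoji; the variable names are only for this example):

```javascript
// The family emoji is four people joined by three zero-width joiners (U+200D).
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// Array.from iterates by code point: surrogate pairs stay together,
// but each person and each ZWJ becomes its own array element.
const codePoints = Array.from(family);

console.log(family.length);     // number of UTF-16 code units
console.log(codePoints.length); // number of code points
console.log(
  codePoints.map((c) => "U+" + c.codePointAt(0).toString(16).toUpperCase())
);
```

Each of those elements is then looked up as its own glyph, which matches the behavior described under "Steps to Reproduce".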

It seems that Opentype.js has the glyph information needed, but the initial text-to-glyph translation is the issue:

https://opentype.js.org/glyph-inspector.html

Expected Behavior

Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") should render the single combined family glyph.

Current Behavior

Calling notoEmojiFont.draw(context, "👨‍👩‍👧‍👦") renders separate glyphs for 👨, 👩, 👧 and 👦 instead.

Possible Solution

If "ccmp" is not supported yet and would cover this, this issue can be closed as a duplicate of Consider adding support to more GSUB tags? #443.
Intl.Segmenter is a native solution, but isn't supported by Firefox yet.

const splitSegmentArray = (string) => Array.from(new Intl.Segmenter().segment(string)).map(x => x.segment);
console.log(splitSegmentArray("😅👨‍👩‍👧‍👦💖👩‍💻💔👩‍🌾🧡👨🏽‍🌾💜🖖🏾🌈"))

graphemer is a library-based solution. (It is a fairly big library.)
twemoji-parser is focused on parsing emoji sequences, so it's smaller than graphemer.

Steps to Reproduce (for bugs)

Live demo: https://gm69qn.csb.app

Call notoEmojiFont.stringToGlyphs("👨‍👩‍👧‍👦") and get glyphs for "👨👩👧👦" interspersed with the combiner ("uni200D") instead of the one glyph for the combined family.

Same for other combined emoji, like 👩‍💻, 👩‍🌾, 👨🏽‍🌾, 🖖🏾

Context

We're adding support for emoji to Cuttle CAD, which can render various fonts as vectors for laser cutting, etc.

Your Environment

Version used: 1.3.4
Font used: Noto Emoji (ttf)
Browser Name and version: Various tested
Operating System and version (desktop or mobile): Mac OS desktop
Link to your project: https://gm69qn.csb.app

forresto · 2022-05-03T10:25:07Z

It seems like font.tables.gsub has the ligatureSets info needed to combine these. Is that something that I can enable with an option?

notoEmojiFont.substitution.getFeature("ccmp") // Array(3640)

The feature tag is "ccmp" ... I'm not seeing that called with defaults via getFeature or getMultiple, though there are some tests. 🤔

If "ccmp" is not supported yet, this can be closed as a duplicate of #443.

forresto · 2022-05-09T09:11:31Z

Looking at #443 I thought this was worth a try:

notoEmojiFont.substitution.add(
  "ccmp", 
  notoEmojiFont.substitution.getFeature('ccmp')
);

but got:

Error: Ligature: unable to modify coverage table format 2

forresto · 2022-05-11T14:31:05Z

In addition to the ccmp substitutions, variation selectors (https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)) need to be taken into account. For example, "☠" (U+2620) vs "☠️" (U+2620 U+FE0F).
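As a rough illustration of what taking variation selectors into account could mean: one option is to drop the selectors (U+FE00 to U+FE0F) before glyph lookup, so that both forms resolve to the same base glyph. This is only a sketch. stripVariationSelectors is a hypothetical helper, not part of Opentype.js, and it ignores fonts that map variation sequences to distinct glyphs through a cmap format 14 subtable.

```javascript
// Hypothetical helper: remove code points in the Variation Selectors block
// (U+FE00..U+FE0F) so that both forms of a character look up the same glyph.
function stripVariationSelectors(text) {
  return Array.from(text)
    .filter((ch) => {
      const cp = ch.codePointAt(0);
      return cp < 0xfe00 || cp > 0xfe0f;
    })
    .join("");
}

const textStyle = "\u2620";        // skull and crossbones without a selector
const emojiStyle = "\u2620\uFE0F"; // same character followed by VS16
console.log(stripVariationSelectors(emojiStyle) === textStyle);
```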

jamesjoung · 2022-05-16T05:04:48Z

I'm also looking for a workaround for this.
It would be nice to have this supported, or at least to have a workaround.

forresto · 2022-05-16T18:10:38Z

Here's my workaround.

// Opentype.js doesn't actually support these substitutions, so we'll have to
// search them manually
const substitutions = font.substitution.getFeature("ccmp");

function emojiToGlyph(emojiString) {
  const glyphs = font
    .stringToGlyphs(emojiString)
    // Discarding these makes the substitution search work for emoji sequences
    // with variation selectors
    // https://en.wikipedia.org/wiki/Variation_Selectors_(Unicode_block)
    .filter((glyph) => glyph.index <= 1850);
  let glyph;
  if (glyphs.length === 1) {
    glyph = glyphs[0];
  } else if (glyphs.length > 1) {
    const indexes = glyphs.map((glyph) => glyph.index);
    const sub = substitutions.find((substitution) => equals(substitution.sub, indexes));
    if (sub) {
      glyph = font.glyphs.get(sub.by);
    }
  }
  if (glyph) {
    return glyph;
  } else {
    throw new Error(`${emojiString} - couldn't find a glyph :(`);
  }
}

emojiToGlyph("👨‍👩‍👧‍👦");

/** Custom equals function that can also check lists. */
function equals(a, b) {
  if (a === b) {
    return true;
  } else if (Array.isArray(a) && Array.isArray(b)) {
    if (a.length !== b.length) {
      return false;
    }
    for (let i = 0; i < a.length; i += 1) {
      if (!equals(a[i], b[i])) {
        return false;
      }
    }
    return true;
  } else {
    return false;
  }
}

Caveats:

This only works for one emoji. To replace the glyphs in an arbitrary string, we would also need tokenizer logic.

Only tested with Noto Emoji.
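The tokenizer logic mentioned in the first caveat could be sketched roughly as follows. It assumes Intl.Segmenter is available and takes the per-emoji lookup as a parameter, so it can wrap emojiToGlyph from the workaround. The names stringToEmojiGlyphs and lookup are made up for this sketch, and like the workaround it is only meant for emoji-only strings.

```javascript
// Split a string into grapheme clusters and resolve each cluster with `lookup`.
// `lookup` is expected to behave like emojiToGlyph in the workaround:
// one cluster in, one glyph out (or throw if nothing matches).
function stringToEmojiGlyphs(text, lookup) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text), (part) => lookup(part.segment));
}

// Possible usage with the workaround (assumes `font` is already loaded):
// const glyphs = stringToEmojiGlyphs("😅👨‍👩‍👧‍👦💖", emojiToGlyph);
```

Passing the lookup in keeps the splitting step independent of the font, so a different segmentation library could be swapped in where Intl.Segmenter is missing.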

ILOVEPIE · 2022-11-20T01:03:21Z

Here are the different options: https://medium.com/making-faces-and-other-emoji/emoji-fonts-technically-40f3fdc0869e
I'd recommend at least supporting COLR/CPAL as it's probably the most widely supported one and one of the most implemented in fonts. It would also probably be a good idea to implement CBDT/CBLC support as well.

TonyJR · 2024-03-18T09:07:59Z

ccmp looks like a mandatory feature: it isn't shown in the feature list, but it is always applied before the text is processed further.

https://learn.microsoft.com/en-us/typography/script-development/standard
Maybe we can add a preprocessing step in Font.stringToGlyphs()?
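One way to picture such a preprocessing step: after the text has been mapped to glyph indexes, scan the list and replace any run that matches a ligature entry. The sketch below works on plain arrays and assumes entries shaped like the { sub, by } objects used in the workaround above. applyLigatures is a hypothetical name, the glyph indexes are made up, and this is not the code from #688.

```javascript
// Replace every run of glyph indexes that matches a ligature entry.
// `ligatures` is assumed to be an array of { sub: number[], by: number }.
function applyLigatures(indexes, ligatures) {
  // Try longer sequences first so a full sequence wins over a shorter prefix.
  const sorted = [...ligatures].sort((a, b) => b.sub.length - a.sub.length);
  const out = [];
  let i = 0;
  while (i < indexes.length) {
    const match = sorted.find(
      (lig) => lig.sub.length > 0 && lig.sub.every((g, k) => indexes[i + k] === g)
    );
    if (match) {
      out.push(match.by);
      i += match.sub.length;
    } else {
      out.push(indexes[i]);
      i += 1;
    }
  }
  return out;
}

// Made-up glyph indexes: 10 = woman, 3 = ZWJ, 20 = laptop, 99 = woman technologist.
const ligatures = [{ sub: [10, 3, 20], by: 99 }];
console.log(applyLigatures([10, 3, 20, 10], ligatures)); // [ 99, 10 ]
```

Unlike the single-emoji workaround, a pass like this handles a whole string of glyphs at once, because unmatched glyphs are simply copied through.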

TonyJR mentioned this issue on Mar 22, 2024: Feature/ccmp #688 (merged)

Connum linked a pull request on Apr 10, 2024 that will close this issue: Feature/ccmp #688 (merged)

Connum closed this as completed in #688 Apr 10, 2024
