olivierberten/KhmerConverterPHP

KhmerConverterPHP

These scripts transcode strings from legacy Khmer fonts to Unicode and vice versa. You can see them in action at http://www.selapa.net/khmerfonts/

How does it work?

Legacy → Unicode

  1. Search and replace from the database
  2. Recompose characters
  3. Transcode other characters
    • Ligatures get separated into characters
    • Ornaments get enclosed between 0x91 and 0x92
    • Khmer characters missing in Unicode get enclosed between 0x86 and 0x87
    • Characters missing in the legacy font get enclosed between 0x96 and 0x97
  4. Reorder characters according to Unicode order
    This code is a PHP translation of the KhmerOS khmerconverter Python software.
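
The reordering in step 4 can be illustrated with a minimal sketch. It is written in Python for brevity and is not the repository's PHP code or the KhmerOS algorithm; the function name `reorder_to_unicode` is hypothetical. It assumes only the simplest case: legacy fonts store a vowel drawn to the left of the consonant (such as េ) before that consonant, while Unicode stores it after the consonant and its subscripts.

```python
import re

# Vowels drawn to the left of the base consonant (U+17C1..U+17C3).
PRE_BASE_VOWELS = "\u17c1\u17c2\u17c3"
# Khmer consonants U+1780..U+17A2, written as a regex character class.
CONSONANT = "[\u1780-\u17a2]"
# COENG (U+17D2) introduces a subscript consonant.
COENG = "\u17d2"

# Visual order: pre-base vowel, then a consonant with optional subscripts.
_VISUAL = re.compile(
    f"([{PRE_BASE_VOWELS}])({CONSONANT}(?:{COENG}{CONSONANT})*)"
)

def reorder_to_unicode(text):
    """Move each pre-base vowel after the consonant cluster it belongs to.

    Simplified assumption: real converters also handle split vowels,
    the pre-base subscript RO, and other signs.
    """
    return _VISUAL.sub(r"\2\1", text)
```

For example, `reorder_to_unicode("\u17c1\u1780")` swaps the vowel and the consonant so that the consonant comes first. The Unicode → Legacy direction would apply the inverse move.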

Unicode → Legacy

  1. Reorder characters according to visual order
    This code is a PHP translation of the KhmerOS khmerconverter Python software.
  2. Search and replace from the database
  3. Transcode characters
  4. Decompose composite characters if necessary
    • Missing characters get enclosed between 0x96 and 0x97
  5. Apply ligatures if present in the font
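
The marker convention used during transcoding can be sketched as follows. This is a Python illustration under stated assumptions, not the repository's PHP code: the mapping table `TABLE`, the function name `transcode_to_legacy`, and the choice to represent the 0x96 and 0x97 markers as code points in the output string are all hypothetical.

```python
# Markers that flag a character with no equivalent in the legacy font.
MISSING_OPEN = chr(0x96)
MISSING_CLOSE = chr(0x97)

# Hypothetical mapping for one legacy font: Unicode character -> legacy character.
TABLE = {
    "\u1780": "k",  # KHMER LETTER KA
    "\u1781": "x",  # KHMER LETTER KHA
}

def transcode_to_legacy(text, table):
    """Map each character through the table; wrap unmapped ones in markers."""
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        else:
            # Keep the original character so the gap stays visible downstream.
            out.append(MISSING_OPEN + ch + MISSING_CLOSE)
    return "".join(out)
```

Enclosing unmapped characters rather than dropping them keeps the conversion lossless: a later pass, or a human reviewer, can search for the markers and see exactly which characters the target font lacks.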

TODO

  • Refine the database (some font mappings aren't yet correct)
  • Word-breaking
  • Transcode documents with multiple fonts
