muued commented Jan 3, 2023

Currently the JER and XER encoders escape non-ASCII characters.
ISO/IEC 8825-4:2015 clause 8.1.3 and ISO/IEC 8825-8:2015 clause 7.6.2 require that both outputs use UTF-8.

Escaping seems to be allowed for XER, but is forbidden in CXER.
Escaping seems to be allowed for JER.
So, this is probably not a bug.

In my view, escaping makes no sense here and produces less readable output (human readability is one of the key features of these encodings).
This PR prevents this escaping mechanism.
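
To illustrate the difference between escaped and unescaped output, here is a minimal sketch that uses only the Python standard library (`json` and `xml.etree.ElementTree`). It is not the project's encoder code; the sample value and variable names are made up for the example.

```python
import json
import xml.etree.ElementTree as ET

value = "Zoë"  # sample string containing one non-ASCII character

# JSON: the default (ensure_ascii=True) emits a \uXXXX escape,
# ensure_ascii=False keeps the character and we encode it as UTF-8.
jer_escaped = json.dumps({"name": value})
jer_utf8 = json.dumps({"name": value}, ensure_ascii=False).encode("utf-8")

# XML: the default encoding (us-ascii) emits a numeric character
# reference, encoding="utf-8" writes the character as UTF-8 bytes.
elem = ET.Element("name")
elem.text = value
xer_escaped = ET.tostring(elem)
xer_utf8 = ET.tostring(elem, encoding="utf-8")

print(jer_escaped)
print(jer_utf8.decode("utf-8"))
print(xer_escaped.decode("ascii"))
print(xer_utf8.decode("utf-8"))
```

Both forms decode to the same value; the unescaped variants are simply easier for a human to read.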

If the current behaviour is beneficial for some reason, I suggest adding an additional parameter (similar to indent) to control the escaping behaviour.
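
Such a parameter could look roughly like the sketch below. The function `encode_jer` and its `escape_non_ascii` argument are hypothetical names chosen for illustration; they are not part of the library's API, and the sketch assumes the JSON text is produced with the standard `json` module.

```python
import json


def encode_jer(data, indent=None, escape_non_ascii=False):
    """Hypothetical helper: serialize data to UTF-8 encoded JSON bytes.

    escape_non_ascii=True restores the \\uXXXX escaping;
    the default keeps non-ASCII characters as UTF-8.
    """
    text = json.dumps(data, indent=indent, ensure_ascii=escape_non_ascii)
    return text.encode("utf-8")


print(encode_jer({"name": "Zoë"}).decode("utf-8"))
print(encode_jer({"name": "Zoë"}, escape_non_ascii=True).decode("utf-8"))
```

Defaulting the flag to unescaped output matches the intent of this PR while leaving the old behaviour available to callers who need pure ASCII.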

muued changed the title from "Make xer and jer output utf-8" to "Prevent escaping in xer and jer output" on Jan 3, 2023
codecov bot commented Jan 3, 2023

Codecov Report

Merging #158 (12f34ca) into master (349e9a7) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #158   +/-   ##
=======================================
  Coverage   35.18%   35.18%           
=======================================
  Files           7        7           
  Lines        8567     8567           
=======================================
  Hits         3014     3014           
  Misses       5553     5553           


@coveralls

Coverage: 96.47%. Remained the same when pulling 12f34ca on muued:master into 349e9a7 on eerimoq:master.
