Unclear when to merge adjacent Junks #296

@alerque

The EBNF defines Junk as:

/* Junk represents unparsed content.
 *
 * Junk is parsed line-by-line until a line is found which looks like it might
 * be a beginning of a new message, term, or a comment. Any whitespace
 * following a broken Entry is also considered part of Junk.
 */
Junk                ::= junk_line (junk_line - "#" - "-" - [a-zA-Z])*
junk_line           ::= /[^\n]*/ ("\u000A" | EOF)

This makes sense to me in the context of fixtures/comments.ftl:

# Errors
#error
##error
###error

This produces one Comment node and three Junk nodes, because each of the subsequent Junk lines starts with a "#", which indicates the start of a new Junk.

What I can't make sense of, or match in my own implementation, is what the tooling parser does with sequential junk, as in structure/unclosed.ftl:

err03 = {
FUNC(
arg
,
namedArg: "Value"
,
key04 = Value 04

If I'm reading the spec right, this should produce 4 Junks, because the lines starting with "F", "a", and "n" are potentially new Entries and would break the grammar definition of a Junk. The two lines starting with "," would each naturally be part of the preceding Junk.
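
To make my reading concrete, here is a minimal Python sketch of the Junk rule as I understand it from the EBNF. It is not taken from any Fluent implementation; `split_junk`, `ENTRY_START`, and `JUNK_LINE` are names made up for illustration, and the input is assumed to be text that has already failed to parse as an Entry.

```python
import re

# A continuation line must not start with "#", "-", or an ASCII letter,
# mirroring `junk_line - "#" - "-" - [a-zA-Z]` in the EBNF.
ENTRY_START = re.compile(r"[#\-a-zA-Z]")

# junk_line ::= /[^\n]*/ ("\u000A" | EOF)
JUNK_LINE = re.compile(r"[^\n]*\n|[^\n]+")


def split_junk(source: str) -> list[str]:
    """Split a run of unparsed text into Junk contents (my reading of the spec)."""
    junks: list[str] = []
    for line in JUNK_LINE.findall(source):
        if junks and not ENTRY_START.match(line):
            # Blank lines and lines such as "," extend the current Junk.
            junks[-1] += line
        else:
            # The first line, or a line that might begin a new Entry.
            junks.append(line)
    return junks


unclosed = 'err03 = {\nFUNC(\narg\n,\nnamedArg: "Value"\n,\n'
for junk in split_junk(unclosed):
    print(repr(junk))
```

Under this reading the unclosed.ftl excerpt splits into four Junks, and the three `#error` lines from comments.ftl split into three.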

That's not what the JavaScript implementation is actually doing, though: it combines all the Junks up until the start of key04:

"content": "err03 = {\nFUNC(\narg\n,\nnamedArg: \"Value\"\n,\n"

Is the tooling parser off-spec here? Or am I missing something?

There doesn't seem to be a reference fixture that would trigger this. I mocked up a quick one and tested with the reference parser and the result matches my understanding of the spec.

I'm trying to figure out how to implement this in fluent-lua and am unsure whether to follow the letter of the law and stick to the spec, or take a cue from the other tooling implementations and follow suit.

If the latter, what is the actual rule? Merge all Junks excluding ones that start with #?
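
As a sketch of that guess only (I have not confirmed this is what the tooling parser does; `merge_junks` is a hypothetical helper, not an existing API), merging adjacent spec-level Junks unless the next one starts with "#" would look roughly like this in Python:

```python
def merge_junks(junks: list[str]) -> list[str]:
    """Merge adjacent Junk contents unless the next one starts with '#'.

    This encodes the guessed rule from the question above, not a
    confirmed behavior of any Fluent parser.
    """
    merged: list[str] = []
    for junk in junks:
        if merged and not junk.startswith("#"):
            # Fold this Junk into the previous one.
            merged[-1] += junk
        else:
            # The first Junk, or one starting with "#", stays separate.
            merged.append(junk)
    return merged


# The four Junks that a strict reading of the spec gives for unclosed.ftl.
spec_junks = ['err03 = {\n', 'FUNC(\n', 'arg\n,\n', 'namedArg: "Value"\n,\n']
print(merge_junks(spec_junks))

# The three Junks from comments.ftl would stay separate under this rule.
print(merge_junks(['#error\n', '##error\n', '###error\n']))
```

With this rule the unclosed.ftl excerpt collapses into a single Junk whose content matches the tooling output quoted above, while the comments.ftl Junks remain three separate nodes.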
