The EBNF defines Junk as:
/* Junk represents unparsed content.
*
* Junk is parsed line-by-line until a line is found which looks like it might
* be a beginning of a new message, term, or a comment. Any whitespace
* following a broken Entry is also considered part of Junk.
*/
Junk ::= junk_line (junk_line - "#" - "-" - [a-zA-Z])*
junk_line ::= /[^\n]*/ ("\u000A" | EOF)
This makes sense to me in the context of fixtures/comments.ftl:
# Errors
#error
##error
###error
This produces one Comment node and three Junk nodes, because each subsequent junk line starts with a `#`, which indicates a new Junk.
What I can't make sense of, or seem to match, is what the tooling parser does with sequential junk, as in structure/unclosed.ftl:
err03 = {
FUNC(
arg
,
namedArg: "Value"
,
key04 = Value 04
If I'm reading the spec right, this should produce 4 Junks, because the lines starting with "F", "a", and "n" are potentially new Entries and would break the grammar definition of a Junk. The two lines starting with "," would naturally be part of the preceding Junks.
That's not what the JavaScript implementation actually does, though. It combines all the Junks up until the start of key04:
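To make that reading concrete, here is a minimal sketch of the Junk rule as I understand it from the EBNF quoted above. It is not taken from any Fluent implementation: `split_junk` and `ENTRY_START` are hypothetical names, it only groups text that is already known to be junk (it does not parse Entries), and it assumes the lines start in column one exactly as shown here.

```python
import re

# Per the quoted EBNF, a junk_line starting with "#", "-", or a letter
# cannot continue the current Junk, so it opens a new one.
ENTRY_START = re.compile(r"[#\-a-zA-Z]")

def split_junk(text: str) -> list[str]:
    """Group consecutive junk lines into Junk chunks, reading the EBNF literally."""
    chunks: list[str] = []
    for line in text.splitlines(keepends=True):
        if chunks and not ENTRY_START.match(line):
            # Continuation line (e.g. one starting with ","): extend the current Junk.
            chunks[-1] += line
        else:
            # First line, or a line that might begin a new Entry: start a new Junk.
            chunks.append(line)
    return chunks

broken = 'err03 = {\nFUNC(\narg\n,\nnamedArg: "Value"\n,\n'
for chunk in split_junk(broken):
    print(repr(chunk))
```

Under those assumptions the junk text splits into four chunks, with each "," line attached to the chunk before it, and the three `#` error lines from comments.ftl split into three.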
"content": "err03 = {\nFUNC(\narg\n,\nnamedArg: \"Value\"\n,\n"
Is the tooling parser off-spec here? Or am I missing something?
There doesn't seem to be a reference fixture that would trigger this. I mocked up a quick one and tested with the reference parser and the result matches my understanding of the spec.
I'm trying to figure out how to implement this in fluent-lua and am unsure whether to follow the letter of the law and stick to the spec, or take a cue from the other tooling implementations and follow suit.
If the latter, what is the actual rule? Merge all Junks excluding ones that start with #?