Add comma to grmtools section syntax #542
// knows "case_insensitive" is a flag with no arguments.
let src = r#"
%grmtools {unicode, case_insensitive posix_escapes allow_comments}
%%
This comment indicates a difference between the `grmtools_section.y` parser and `lrlex/src/lib/parser.rs`: the `grmtools_section.y` parser returns a `case_insensitive posix_escapes` key/value pair, then fails on `allow_comments`.
I suppose we should consider using `key: value` here. To be honest, because of the closeness to Rust syntax I've found myself adding the ":" accidentally a few times, so maybe it is just right?
So it'd be something like:
%grmtools{
yacckind: Original(NoAction),
x,
!y,
}
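A minimal sketch of the ambiguity described in this comment, assuming a reader that only knows entries are comma separated and has no table of which names take arguments. `Entry` and `read_entries` are hypothetical names for illustration; they are not taken from `grmtools_section.y` or `parser.rs`.

```rust
/// One entry as a separator-only reader sees it.
#[derive(Debug, PartialEq)]
enum Entry {
    Flag(String),
    KeyValue(String, String),
}

/// Hypothetical reader: entries are split on `,`, and within an entry
/// one word is a flag while two adjacent words are taken as `key value`.
fn read_entries(body: &str) -> Result<Vec<Entry>, String> {
    let mut out = Vec::new();
    for item in body.split(',') {
        let words: Vec<&str> = item.split_whitespace().collect();
        match words.as_slice() {
            [] => return Err("empty entry".to_string()),
            [flag] => out.push(Entry::Flag(flag.to_string())),
            // Two flags with a forgotten comma are silently read as a pair.
            [key, value] => out.push(Entry::KeyValue(key.to_string(), value.to_string())),
            // Only a third word reveals that something is wrong.
            [key, value, rest, ..] => {
                return Err(format!(
                    "read `{key} {value}` as a key/value pair, then failed on `{rest}`"
                ))
            }
        }
    }
    Ok(out)
}
```

With this reader, `unicode, case_insensitive posix_escapes` is accepted with the two flags misread as one pair, and the mistake only surfaces once `allow_comments` is appended. Requiring a `:` between key and value removes that guesswork.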
I guess another option would be `%yacckind ...`
I don't think `%yacckind ...` plays as well with `%flag` and `%!flag`. I suppose it works, but I don't recall any case where `%foo` omits a subsequent value; it always seems to be `%foo xyz`.
I tried `key: value` in 10e6d74
lrlex/src/lib/parser.rs
Outdated
@@ -351,11 +351,19 @@ where
&mut grmtools_section_span_map,
&mut grmtools_section_lex_flags,
)?;
if i == self.src.len() {
return Err(self.mk_error(LexErrorKind::PrematureEnd, i));
eprintln!("foo '{:?}'", &self.src[i..]);
Should be removed.
Fixed in 57b94aa
// We don't have a second value to test.
//
// `RecoveryKind` seemed like an option for an additional value to allow,
// but that is part of `lrpar` which cfgrammar doesn't depend upon.
This highlights another interesting thing, and a potential place where we might want to parse the grmtools section multiple times, e.g. recognizing `yacckind` in `cfgrammar`, but `recoverer` in `lrpar`.
Presumably that means cfgrammar should ignore unknown values (rather than its current strict checking).
This is very similar to the "before LexerKind" vs "after LRNonStreamingLexer" thing with lexerkind, except across crates.
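A rough sketch of what "ignore unknown values" could look like if each crate made its own pass over the same section. `split_known` and its signature are hypothetical and only illustrate the idea; this is not how cfgrammar or lrpar are currently written.

```rust
use std::collections::HashMap;

/// Split already-parsed `key: value` entries into the ones this crate
/// recognises and the keys it leaves alone for some other crate's pass.
fn split_known<'a>(
    entries: &[(&'a str, &'a str)],
    known: &[&str],
) -> (HashMap<&'a str, &'a str>, Vec<&'a str>) {
    let mut mine = HashMap::new();
    let mut ignored = Vec::new();
    for &(key, value) in entries {
        if known.contains(&key) {
            mine.insert(key, value);
        } else {
            // Not an error here: another crate may understand this key.
            ignored.push(key);
        }
    }
    (mine, ignored)
}
```

Under this scheme one pass would be given `["yacckind"]` as its known keys and another `["recoverer"]`. The trade-off is that a misspelled key is no longer rejected by any single pass unless something collects the keys that every pass ignored.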
Apart from s/comma/colon/ I think this is ready to go. Please squash (and reword the commit) appropriately.
Well, it adds both comma and colon, so not just s/comma/colon/, but I'll reword it appropriately.
Aha, yes, now I understand the "," aspect!
lrlex/src/lib/parser.rs
Outdated
@@ -340,6 +340,7 @@ where
let mut grmtools_section_span_map = HashMap::new();
let mut grmtools_section_lex_flags = UNSPECIFIED_LEX_FLAGS;
if let Some(j) = self.lookahead_is("%grmtools", i) {
// lrlex currently doesn't have any `key: value' settings.
This is actually wrong; I totally forgot that `regex` defined some limit-related values.
Should be fixed in 54f907f
Force-pushed from 10e6d74 to 54f907f.
Will need another squash, I fixed
Good catch! Please squash.
Force-pushed from d2e6a83 to 9fcf7e1.
Squashed, and cleaned up the commit message regarding fields in lex.
Curious. I have a theory about what is wrong, though the build logs aren't informative about it: I believe it is probably wasmtime sandboxing filesystem access to the current directory, and that test tries to read lex/yacc files in
I'll just run it locally to confirm.
So, yes, that seems to be the issue. Unfortunately I'm not certain how to fix it (properly)... My attempt was to add a
Unfortunately it looks like the test runner may get executed without the environment variable
So if that is the case, this would seem to indicate a cargo problem with setting that environment variable on that target,
This is outside my comfort zone, but are we suggesting this is a cargo bug? Or a wasmtime limitation? Or ... ?
It's actually just me misunderstanding cargo's weirdness. Cargo sets the environment variable, but only when
So I think it is just normal behavior that my test runner needs to account for the fact that it may or may not be set.
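As a sketch of "account for the fact that it may or may not be set": the helper below resolves a base directory from an environment variable and falls back to the current working directory when the variable is absent. The comment does not name the variable, so the name is passed in as a parameter; `base_dir` is a hypothetical helper and not part of the actual test runner.

```rust
use std::env;
use std::path::PathBuf;

/// Resolve a base directory from an environment variable that may be
/// unset (or empty), falling back to the current working directory.
fn base_dir(var_name: &str) -> PathBuf {
    match env::var_os(var_name) {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Variable missing: use the process's working directory, or "."
        // if even that cannot be determined (e.g. inside a sandbox).
        _ => env::current_dir().unwrap_or_else(|_| PathBuf::from(".")),
    }
}
```

`env::var_os` is used rather than `env::var` so that a value which is not valid Unicode still yields a usable path.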
Can we fix things here?
I think so, but I'm still working on it; at worst we just disable the test on that particular target. E.g. there is no way I know of to just get the
I think it is possible we end up with something that fixes all issues of this kind in the workspace once and for all, with an unfortunate but perhaps tolerable hack. But I'm definitely not there yet.
For the day at least, I've declared bankruptcy on trying to get it working. Plus, it's also becoming of a size more reasonable for its own patch if we manage to get it working.
If you have a bright idea after a good night's sleep, go for it, but if not: please squash.
There is something I can try: all the paths I was setting manually were about setting up which directory was allowed inside the sandbox... That doesn't explain exactly why it didn't work when I tried to add
I did manage to get something working, and fixed up most of the hacks
So for the time being I've put it up on my GitHub rather than add it to the workspace.
Note that because .cargo/config.toml is committed to the repo, people will need it to run cargo.test
I don't have much of a preference, but I very rarely ever
https://github.com/ratmice/workspace_time_test_runner
Edit: It is probably worth renaming this to be more accurate; the runner flag is apparently also used for
In .buildbot.sh add:
then in
then removing the
finally once it is installed
Wow, I must admit, this feels like a lot of complexity for one (apparently weird!) platform. Do I understand correctly that this would require normal grmtools developers to install stuff before
It only requires people to install stuff before
But yeah, I agree that the question of additional complexity vs how worthwhile this is is a tough call for one test. One of the things I do like about that specific test though is that it is kind of a complete test of the stack:
Edit: I had originally hoped to avoid the need to
Putting the runner inside the project. Unfortunately the recursive invocation of
Anyhow, I do find the need to only cargo install when testing this one weird target to make it somewhat more tolerable.
Edit: I suppose that one thing we could do is attempt to get it added upstream in wasmtime so that it gets installed when we install the target. I'm not sure that they would accept it, but we could try (I chatted with them informally but they were pretty hesitant; from their perspective cargo is just one of many languages and toolchains, so it's a bit weird to add this special stuff for cargo).
Force-pushed from bb96395 to 06cef3e.
Anyhow, I added some minor comments + assertions + fixed whitespace, then squashed in case you wanted to just go ahead without the runner.
@@ -79,10 +79,10 @@ pub(crate) fn run_test_path<P: AsRef<Path>>(path: P) -> Result<(), Box<dyn std::
// Create grammar files
let base = path.file_stem().unwrap().to_str().unwrap();
let mut pg = PathBuf::from(&out_dir);
- pg.push(format!("{}.y.rs", base));
+ pg.push(format!("{}.test.y", base));
I had believed that this `.y.rs` thing was a mistake of sorts, because this is a `.y` file, not a `.rs` file.
But it occurred to me that emitting this file with a `.rs` extension may have been a way to get the same behavior as `cargo:rerun-if-changed=path`.
In which case I don't think it's necessary that we emit `rerun-if-changed` for the .test file these derive from.
But I do think that means we should also emit rerun-if-changed for `glob(src/*.rs)`, which I don't think we're doing.
Perhaps best to do that and an audit in a follow up.
Or should I revert this part of the change, and change the tests to glob for `*.y.rs` for the time being?
Then do the whole renaming/rerun audit together in a follow up.
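For the follow-up audit, emitting rerun-if-changed for every `.rs` file in a directory could look roughly like the sketch below. It uses only the standard library instead of a glob crate, and `rerun_lines` is a hypothetical helper, not existing code in the cttests build script.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Build one `cargo:rerun-if-changed=` line per `.rs` file directly
/// inside `dir` (non-recursive, like the glob `dir/*.rs`).
fn rerun_lines(dir: &Path) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().and_then(|e| e.to_str()) == Some("rs") {
            lines.push(format!("cargo:rerun-if-changed={}", path.display()));
        }
    }
    // read_dir order is platform dependent; sort for stable output.
    lines.sort();
    Ok(lines)
}
```

A build script would print each returned line to stdout, for example by looping over `rerun_lines(Path::new("src"))?` and calling `println!` on each line.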
> Perhaps best to do that and an audit in a follow up.
Good idea!
@ratmice OK, phew, since it's limited to that target, I'm fine with this. Please squash.
Alright. Just so I'm clear, we're okay with this additional
If so, please give me a day or so to figure out a better name for this runner. The leading naming candidate for the moment is just plain old
Yep!
This seems like a good idea to me. It might be useful to others too!
This reworks the syntax for the grmtools section to be unambiguous, adding a lexer and grammar file for it in `cttests/grmtools_section.test`. The grammar is run over the grmtools sections extracted from the .l and .y files in the repository.
Previous unreleased versions of grmtools including this syntax used a space separated sequence of values, where some values were flags and other values acted like key/value pairs.
before:
`%grmtools{!dot_matches_new_line case_insensitive size_limit 0}`
`%grmtools{yacckind Grmtools}`
This changes the grammar such that values must be separated by commas, and `key: value` pairs must be separated by colons, with an optional trailing comma.
`%grmtools{!dot_matches_new_line, case_insensitive, size_limit: 0}`
`%grmtools{yacckind: Grmtools}`
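To make the new syntax concrete, here is a small hand-written sketch that accepts the text between the braces as described in the commit message: comma separated entries, `key: value` pairs, `!` for a negated flag, and an optional trailing comma. It is not the lexer and grammar added in `cttests/grmtools_section.test`; `Setting` and `parse_section_body` are hypothetical names, and the naive split on `,` assumes values never contain commas.

```rust
#[derive(Debug, PartialEq)]
enum Setting {
    /// `name` or `!name`; `enabled` is false for the negated form.
    Flag { name: String, enabled: bool },
    /// `key: value`
    KeyValue { key: String, value: String },
}

/// Parse the text between the braces of a `%grmtools{...}` section.
fn parse_section_body(body: &str) -> Result<Vec<Setting>, String> {
    let mut items: Vec<&str> = body.split(',').map(str::trim).collect();
    // An optional trailing comma leaves one empty item at the end.
    if items.last() == Some(&"") {
        items.pop();
    }
    let mut settings = Vec::new();
    for item in items {
        if item.is_empty() {
            return Err("empty entry between commas".to_string());
        }
        if let Some((key, value)) = item.split_once(':') {
            let (key, value) = (key.trim(), value.trim());
            if key.is_empty() || value.is_empty() {
                return Err(format!("malformed `key: value` entry: `{item}`"));
            }
            settings.push(Setting::KeyValue { key: key.to_string(), value: value.to_string() });
        } else if item.split_whitespace().count() > 1 {
            // The old space separated form lands here, e.g. `yacckind Grmtools`.
            return Err(format!("missing `,` or `:` in `{item}`"));
        } else if let Some(name) = item.strip_prefix('!') {
            if name.is_empty() {
                return Err("`!` must be followed by a flag name".to_string());
            }
            settings.push(Setting::Flag { name: name.to_string(), enabled: false });
        } else {
            settings.push(Setting::Flag { name: item.to_string(), enabled: true });
        }
    }
    Ok(settings)
}
```

Given `!dot_matches_new_line, case_insensitive, size_limit: 0` this yields two flags and one key/value pair, while the old form `yacckind Grmtools` is rejected instead of being guessed at.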
Force-pushed from 06cef3e to 5b7a126.
Updated to use the test runner, I moved the
Anyhow, I think this should be ready now 🤞
This uses the modified grmtools section grammar from #490, which adds a comma to make it unambiguous.
This isn't what is currently parsed by the section parsing code in `parser.rs`.
Using a self hosted grammar isn't something we currently do, and this patch doesn't intend to add it. However, it occurred to me that we could still run it in `cttests/` over the `lrpar/example/*.{l,y}` and those generated by cttests.
This currently works for all the files in the repository; you need to have at least 3 entries in the grmtools section before it realizes something is awry, because it reads the first and second entries as key/value.
Marking this as draft for the following purposes
Edit:
The other thing this doesn't try to do is add a specialized parser for lex or one for yacc files, accepting just the keys for the respective files. I guess we could do a separate parser each?