Try to recover from unmatched delimiters in the parser #54029

Closed
Changes from 1 commit
2 changes: 1 addition & 1 deletion src/librustc_metadata/cstore_impl.rs
@@ -465,7 +465,7 @@ impl cstore::CStore {

let source_file = sess.parse_sess.source_map().new_source_file(source_name, def.body);
let local_span = Span::new(source_file.start_pos, source_file.end_pos, NO_EXPANSION);
- let body = source_file_to_stream(&sess.parse_sess, source_file, None);
+ let body = source_file_to_stream(&sess.parse_sess, source_file, None).0;

// Mark the attrs as used
let attrs = data.get_item_attrs(id.index, sess);
2 changes: 2 additions & 0 deletions src/libsyntax/parse/lexer/mod.rs
@@ -66,6 +66,7 @@ pub struct StringReader<'a> {
/// The raw source span which *does not* take `override_span` into account
span_src_raw: Span,
open_braces: Vec<(token::DelimToken, Span)>,
+ crate unmatched_braces: Vec<(token::DelimToken, Span)>,
/// The type and spans for all braces
///
/// Used only for error recovery when arriving to EOF with mismatched braces.
@@ -221,6 +222,7 @@ impl<'a> StringReader<'a> {
span_src_raw: syntax_pos::DUMMY_SP,
open_braces: Vec::new(),
matching_delim_spans: Vec::new(),
+ unmatched_braces: Vec::new(),
override_span,
last_unclosed_found_span: None,
}
5 changes: 3 additions & 2 deletions src/libsyntax/parse/lexer/tokentrees.rs
@@ -49,8 +49,9 @@ impl<'a> StringReader<'a> {
token::Eof => {
let msg = "this file contains an un-closed delimiter";
let mut err = self.sess.span_diagnostic.struct_span_err(self.span, msg);
- for &(_, sp) in &self.open_braces {
+ for &(tok, sp) in &self.open_braces {
err.span_label(sp, "un-closed delimiter");
+ self.unmatched_braces.push((tok, sp));
}

if let Some((delim, _)) = self.open_braces.last() {
@@ -134,7 +135,7 @@ impl<'a> StringReader<'a> {
}
err.emit();
}
- self.open_braces.pop().unwrap();
+ self.unmatched_braces.push(self.open_braces.pop().unwrap());

// If the incorrect delimiter matches an earlier opening
// delimiter, then don't consume it (it can be used to
42 changes: 25 additions & 17 deletions src/libsyntax/parse/mod.rs
@@ -168,10 +168,13 @@ crate fn parse_stmt_from_source_str(name: FileName, source: String, sess: &Parse
new_parser_from_source_str(sess, name, source).parse_stmt()
}

- pub fn parse_stream_from_source_str(name: FileName, source: String, sess: &ParseSess,
- override_span: Option<Span>)
- -> TokenStream {
- source_file_to_stream(sess, sess.source_map().new_source_file(name, source), override_span)
+ pub fn parse_stream_from_source_str(
+ name: FileName,
+ source: String,
+ sess: &ParseSess,
+ override_span: Option<Span>,
+ ) -> TokenStream {
+ source_file_to_stream(sess, sess.source_map().new_source_file(name, source), override_span).0

[Review comment — Contributor] does this mean that for rustc invocations that don't read the source from a file but from stdin we don't get these suggestions?

[Author] Correct. On the one hand, those will not have any worse output than before. On the other, this was a bit exploratory and I want to nail it down before trying to get it to work in all cases.

}

// Create a new parser from a source string
@@ -191,26 +194,28 @@ pub fn new_parser_from_file<'a>(sess: &'a ParseSess, path: &Path) -> Parser<'a>
/// Given a session, a crate config, a path, and a span, add
/// the file at the given path to the source_map, and return a parser.
/// On an error, use the given span as the source of the problem.
- crate fn new_sub_parser_from_file<'a>(sess: &'a ParseSess,
- path: &Path,
- directory_ownership: DirectoryOwnership,
- module_name: Option<String>,
- sp: Span) -> Parser<'a> {
+ crate fn new_sub_parser_from_file<'a>(
+ sess: &'a ParseSess,
+ path: &Path,
+ directory_ownership: DirectoryOwnership,
+ module_name: Option<String>,
+ sp: Span,
+ ) -> Parser<'a> {
let mut p = source_file_to_parser(sess, file_to_source_file(sess, path, Some(sp)));
p.directory.ownership = directory_ownership;
p.root_module_name = module_name;
p
}

/// Given a source_file and config, return a parser
- fn source_file_to_parser(sess: & ParseSess, source_file: Lrc<SourceFile>) -> Parser {
+ fn source_file_to_parser(sess: &ParseSess, source_file: Lrc<SourceFile>) -> Parser {
let end_pos = source_file.end_pos;
- let mut parser = stream_to_parser(sess, source_file_to_stream(sess, source_file, None));
+ let (tts, open_braces) = source_file_to_stream(sess, source_file, None);
+ let mut parser = stream_to_parser(sess, tts);
+ parser.open_braces = open_braces;
if parser.token == token::Eof && parser.span.is_dummy() {
parser.span = Span::new(end_pos, end_pos, parser.span.ctxt());
}

parser
}

@@ -240,12 +245,15 @@ fn file_to_source_file(sess: &ParseSess, path: &Path, spanopt: Option<Span>)
}

/// Given a source_file, produce a sequence of token-trees
- pub fn source_file_to_stream(sess: &ParseSess,
- source_file: Lrc<SourceFile>,
- override_span: Option<Span>) -> TokenStream {
+ pub fn source_file_to_stream(
+ sess: &ParseSess,
+ source_file: Lrc<SourceFile>,
+ override_span: Option<Span>,
+ ) -> (TokenStream, Vec<(token::DelimToken, Span)>) {
let mut srdr = lexer::StringReader::new(sess, source_file, override_span);
srdr.real_token();
- panictry!(srdr.parse_all_token_trees())
+ let tt = panictry!(srdr.parse_all_token_trees());
+ (tt, srdr.unmatched_braces)
}

/// Given stream and the `ParseSess`, produce a parser
96 changes: 80 additions & 16 deletions src/libsyntax/parse/parser.rs
@@ -250,6 +250,8 @@ pub struct Parser<'a> {
desugar_doc_comments: bool,
/// Whether we should configure out of line modules as we parse.
pub cfg_mods: bool,
+ /// Unmatched open delimiters, used for parse recovery when multiple tokens could be valid.
+ crate open_braces: Vec<(token::DelimToken, Span)>,
}


@@ -569,6 +571,7 @@ impl<'a> Parser<'a> {
},
desugar_doc_comments,
cfg_mods: true,
+ open_braces: Vec::new(),
};

let tok = parser.next_tok();
@@ -635,6 +638,55 @@ impl<'a> Parser<'a> {
}
}

fn recover_closing_delimiter(
&mut self,
tokens: &[token::Token],
mut err: DiagnosticBuilder<'a>,
) -> PResult<'a, ()> {
let mut pos = None;
let mut tokens: Vec<token::Token> = tokens.iter().map(|t| (*t).clone()).collect();

[Review comment — Contributor] isn't that just `tokens.to_vec()`?

tokens.extend(self.expected_tokens.iter().filter_map(|t| match t {
TokenType::Token(t) => Some((*t).clone()),
_ => None,
}));
for (i, (delim, span)) in self.open_braces.iter().enumerate() {
if tokens.contains(&token::CloseDelim(*delim)) && self.span > *span {
pos = Some(i);
// do not break, we want to use the last one that would apply
}
}
match pos {
Some(pos) => {
// Recover and assume that the detected unclosed delimiter was meant for
// this location. Emit the diagnostic and act as if the delimiter was
// present for the parser's sake.

// Don't attempt to recover from this unclosed delimiter more than once.
let (delim, open_sp) = self.open_braces.remove(pos);
let delim = TokenType::Token(token::CloseDelim(delim));

// We want to suggest the inclusion of the closing delimiter where it makes
// the most sense, which is immediately after the last token:
//
// {foo(bar {}}
//     -     -^ expected one of `)`, <...>
//     |     |
//     |     help: ...missing `)` might belong here
//     you might have meant to close this...

[Review comment — Contributor] is this really the right spot? I would have expected us to suggest adding the `)` after the `{}`, like so: `{foo(bar {})}`, not `{foo(bar) {}}`, which doesn't look to me like it would parse.

[Author] Now I'm intrigued as to whether the comment is correct at all... I'd like to create a bunch of real "broken" code samples to test against, which would give me more confidence that what I'm doing in this PR is beneficial.

[Review comment — Contributor] That would help me too, since I'm starting my reviews by just looking at the test cases (I haven't really looked at the code at all yet).


err.span_label(open_sp, "if you meant to close this...");
err.span_suggestion_short_with_applicability(
self.sess.source_map().next_point(self.prev_span),
&format!("...the missing {} may belong here", delim.to_string()),
delim.to_string(),
Applicability::MaybeIncorrect,
);
err.emit();
Ok(())
}
_ => Err(err),
}
}
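The core of `recover_closing_delimiter` above is a scan over the stack of unmatched open delimiters that remembers the *last* one whose closing token the parser would currently accept. A minimal standalone sketch of that heuristic follows; the simplified `Delim` enum, integer positions, and the name `pick_recovery_candidate` are illustrative only (the real code works on `token::DelimToken` and `Span` values and compares against `token::CloseDelim`):

```rust
// Sketch of the delimiter-recovery heuristic, under the assumptions above.

#[derive(Clone, Copy, PartialEq, Debug)]
enum Delim {
    Paren,
    Brace,
    Bracket,
}

/// Scan the unmatched open delimiters (innermost last) and return the index of
/// the one whose closer the parser currently expects, provided the parser has
/// already moved past its opening position. No `break`: later candidates win.
fn pick_recovery_candidate(
    open: &[(Delim, usize)], // (delimiter, position where it was opened)
    expected: &[Delim],      // closers the parser would accept here
    current_pos: usize,      // current parse position
) -> Option<usize> {
    let mut pos = None;
    for (i, (delim, open_pos)) in open.iter().enumerate() {
        if expected.contains(delim) && current_pos > *open_pos {
            pos = Some(i); // do not break: prefer the last one that applies
        }
    }
    pos
}

fn main() {
    // Model of `{ foo( bar {` with the parser at position 20 expecting `)`:
    // the unmatched `(` opened at position 5 is the recovery candidate.
    let open = [(Delim::Brace, 0), (Delim::Paren, 5), (Delim::Brace, 12)];
    assert_eq!(pick_recovery_candidate(&open, &[Delim::Paren], 20), Some(1));
    // If no expected token closes any open delimiter, there is no recovery.
    assert_eq!(pick_recovery_candidate(&open, &[Delim::Bracket], 20), None);
}
```

In the real function, a successful pick removes the delimiter from `open_braces` (so the same one is never suggested twice) and emits the "missing `…` may belong here" suggestion at the point just after the previous token.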

/// Expect and consume the token t. Signal an error if
/// the next token is not t.
pub fn expect(&mut self, t: &token::Token) -> PResult<'a, ()> {
@@ -668,7 +720,7 @@ impl<'a> Parser<'a> {
err.span_label(self.span, "unexpected token");
}
}
- Err(err)
+ self.recover_closing_delimiter(&[t.clone()], err)
}
} else {
self.expect_one_of(slice::from_ref(t), &[])
@@ -761,7 +813,7 @@ impl<'a> Parser<'a> {
err.span_label(self.span, "unexpected token");
}
}
- Err(err)
+ self.recover_closing_delimiter(edible, err)
}
}

@@ -1075,11 +1127,12 @@ impl<'a> Parser<'a> {
/// Parse a sequence, not including the closing delimiter. The function
/// f must consume tokens until reaching the next separator or
/// closing bracket.
- pub fn parse_seq_to_before_end<T, F>(&mut self,
- ket: &token::Token,
- sep: SeqSep,
- f: F)
- -> PResult<'a, Vec<T>>
+ pub fn parse_seq_to_before_end<T, F>(
+ &mut self,
+ ket: &token::Token,
+ sep: SeqSep,
+ f: F,
+ ) -> PResult<'a, Vec<T>>
where F: FnMut(&mut Parser<'a>) -> PResult<'a, T>
{
self.parse_seq_to_before_tokens(&[ket], sep, TokenExpectType::Expect, f)
@@ -1124,8 +1177,12 @@ impl<'a> Parser<'a> {
v.push(t);
continue;
},
Err(mut e) => {
e.cancel();
Err(mut err) => {
err.cancel();
let kets: Vec<token::Token> = kets.iter()
.map(|t| (*t).clone())
.collect();
let _ = self.recover_closing_delimiter(&kets[..], e);
break;
}
}
@@ -1141,7 +1198,13 @@
break;
}

- let t = f(self)?;
+ let t = match f(self) {
+ Ok(t) => t,
+ Err(e) => {
+ let kets: Vec<token::Token> = kets.iter().map(|t| (*t).clone()).collect();
+ return self.recover_closing_delimiter(&kets[..], e).map(|_| v);
+ }
+ };
v.push(t);
}

@@ -1151,12 +1214,13 @@
/// Parse a sequence, including the closing delimiter. The function
/// f must consume tokens until reaching the next separator or
/// closing bracket.
- fn parse_unspanned_seq<T, F>(&mut self,
- bra: &token::Token,
- ket: &token::Token,
- sep: SeqSep,
- f: F)
- -> PResult<'a, Vec<T>> where
+ fn parse_unspanned_seq<T, F>(
+ &mut self,
+ bra: &token::Token,
+ ket: &token::Token,
+ sep: SeqSep,
+ f: F,
+ ) -> PResult<'a, Vec<T>> where
F: FnMut(&mut Parser<'a>) -> PResult<'a, T>,
{
self.expect(bra)?;
24 changes: 24 additions & 0 deletions src/test/ui/parser-recovery-3.rs
@@ -0,0 +1,24 @@
// Copyright 2018 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

struct A;

impl A {
    fn banana(&mut self) {
        fn peach(this: &Self, foo: usize {
        //~^ ERROR expected one of
        //~| ERROR expected pattern
        }
    }
    //~^ ERROR expected one of
    //~| ERROR incorrect close delimiter
}

fn main() {}
34 changes: 34 additions & 0 deletions src/test/ui/parser-recovery-3.stderr
@@ -0,0 +1,34 @@
error: incorrect close delimiter: `}`
  --> $DIR/parser-recovery-3.rs:19:5
   |
LL |     fn banana(&mut self) {
   |                          - close delimiter possibly meant for this
LL |         fn peach(this: &Self, foo: usize {
   |                 - un-closed delimiter
...
LL |     }
   |     ^ incorrect close delimiter

error: expected one of `!`, `(`, `)`, `+`, `,`, `::`, or `<`, found `{`
[Review comment — Contributor] it's kind of surprising to me that this advice comes here, in a second error after the "main" error above... why is that?

[Author] The lexer first performs the basic matching-brace checks and emits the appropriate error. Then the parser takes the token stream and tries to parse things. So the first error is about the lexer finding unbalanced delimiters, while the second one is a parse error about not finding an expected close delimiter.

There are two options: keeping the un-emitted DiagnosticBuilders from the lexer and emitting them once we are parsing and find an appropriate place for the close delimiter suggestion, or completely stopping the balanced-delimiter checks in the lexer and relying entirely on the parser for parse errors. Both felt like too much work for little win. I'd prefer to iterate on this PR until we have the recovery nailed down before revisiting these possibilities.

  --> $DIR/parser-recovery-3.rs:15:42
   |
LL |         fn peach(this: &Self, foo: usize {
   |                 -                       -^ expected one of 7 possible tokens here
   |                 |                       |
   |                 |                       help: ...the missing `)` may belong here
   |                 if you meant to close this...

error: expected pattern, found `{`
  --> $DIR/parser-recovery-3.rs:15:42
   |
LL |         fn peach(this: &Self, foo: usize {
   |                                          ^ expected pattern

error: expected one of `->`, `where`, or `{`, found `}`
  --> $DIR/parser-recovery-3.rs:19:5
   |
LL |     }
   |     ^ expected one of `->`, `where`, or `{` here

error: aborting due to 4 previous errors

21 changes: 21 additions & 0 deletions src/test/ui/parser-recovery-4.rs
@@ -0,0 +1,21 @@
// Copyright 2018 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

fn foo(_: usize) {}
fn bar() {}

fn main() {
    let x = 1;
    foo(x
    bar()
    //~^ ERROR expected one of
    //~^^^ ERROR this function takes 1 parameter but 2 parameters were supplied
}
//~^ ERROR incorrect close delimiter