Commit 5f4a4ca

refactor: more explicit terminal sequence parsing
Introduces parse_new, which takes advantage of a generalized sequence parser and slice patterns to (hopefully) make the mapping between a byte pattern and the corresponding terminal behavior more explicit.

This initial implementation is running in "dark launch" mode: we run both parse_new and parse_classic (always taking the latter's output as canonical), and print a warning when the two don't match. This has resulted in a number of "bug-compatible" changes to the parser to match the old behavior precisely, both to ensure that we're faithfully reproducing the semantics of the old parser and to give us the opportunity to directly compare the relative merits of the old & new approaches.

In terms of completeness, this current implementation has a few limitations:

* Most notably, we don't handle the "redraw all" request given (currently) by the sequence "\u{1b}[VxD": per the standard, that parses as `[CSI, .., "V"]` followed by the unrelated bytes `xD`. We can either extend the parser to recognize this sequence, or, as I would prefer, change the sequence to "fit" within the standard.
* The sequence enumeration and handling feels pretty good to me, both in terms of how the existing sequences are handled and ease of adding new ones (including, as `set_text_mode` demonstrates, the flexibility to integrate external combinatorial parsers if necessary), especially with respect to the parameter parsing. However, the parser itself is a mess: the standard proved less helpful in recognizing the set of sequences we've encountered "in the wild" than I'd hoped, so I think we could definitely do better if we revisit it with fresh eyes.
* Not all of the error cases are exactly the same when given "weird" sequences like "\u{1b}[?m"; both parsers produce an Err::Failure, but they mark slightly different portions of the input. I believe this to be roughly acceptable (we'd still make progress towards parsing the entire input, just while printing out slightly different results for the unrecognized sequences).
* I based the current general parser on the ECMA standard rather than ANSI's, despite it living in `ansi.rs`. So that's potential for some comedy maybe.

And some work that remains entirely untouched is:

* Integrating a utf-8 parser so we can correctly handle multi-byte sequences.
* Collapsing the Text/Op hierarchy so the parser can more directly split incoming inputs rather than "leaking" those details into parse_str_tail.
* Critically evaluating the efficiency and throughput of the parser, especially with an eye towards reducing the number of times we scan over the whole input.
* Replacing the allocating branches (TextOp, DecPrivate*) with iterators.
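The "dark launch" arrangement described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this commit: `dark_launch`, `toy_classic`, and `toy_new` are hypothetical names, the two toy parsers only extract a would-be final byte, and mismatches are collected in a `Vec` rather than printed so the behavior is easy to check. It also uses `std` for brevity, whereas the project targets an embedded environment.

```rust
use std::fmt::Debug;

/// Runs both implementations on the same input, records any disagreement,
/// and always returns the classic (canonical) result.
fn dark_launch<T: PartialEq + Debug>(
    input: &str,
    classic: fn(&str) -> T,
    new: fn(&str) -> T,
    mismatches: &mut Vec<String>,
) -> T {
    let canonical = classic(input);
    let candidate = new(input);
    if canonical != candidate {
        // The commit prints a warning here; this sketch records it instead.
        mismatches.push(format!(
            "parser mismatch on {:?}: classic={:?} new={:?}",
            input, canonical, candidate
        ));
    }
    canonical
}

/// Hypothetical stand-in for the old parser: accepts any byte after
/// "ESC [" as the final byte.
fn toy_classic(input: &str) -> Option<u8> {
    match input.as_bytes() {
        [0x1b, b'[', f, ..] => Some(*f),
        _ => None,
    }
}

/// Hypothetical stand-in for the new parser: only accepts a byte in
/// 0x40..=0x7E directly after "ESC [" as the final byte.
fn toy_new(input: &str) -> Option<u8> {
    match input.as_bytes() {
        [0x1b, b'[', f @ 0x40..=0x7e, ..] => Some(*f),
        _ => None,
    }
}
```

Calling `dark_launch("\x1b[1m", toy_classic, toy_new, &mut log)` returns the classic answer and leaves one entry in `log`, which is the kind of signal that would prompt either a fix or a deliberate "bug-compatible" change.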
1 parent 5e933ed commit 5f4a4ca

6 files changed, +385 -83 lines changed

Cargo.toml

Lines changed: 3 additions & 2 deletions
@@ -65,8 +65,9 @@ unroll = "0.1.5"
 [dev-dependencies]

 [features]
-default = ["perf_log"]
-perf_log = []
+default = ["background"]
+perf_log = []
+background = []

 [patch.crates-io]
 # TODO: automate these updates

examples/escape_seq.rs

Lines changed: 17 additions & 19 deletions
@@ -11,7 +11,9 @@ use esp32c3_hal::{
     timer::TimerGroup,
     Rtc, IO,
 };
-use esp_println::{print, println};
+use esp_backtrace as _;
+use esp_println::println;
+use nom::Parser;
 use riscv::asm::wfi;
 use vgaterm::ansi;

@@ -33,22 +35,6 @@ fn init_heap() {
     }
 }

-#[panic_handler]
-fn panic(info: &core::panic::PanicInfo) -> ! {
-    print!("Aborting: ");
-    if let Some(p) = info.location() {
-        println!(
-            "line {}, file {}: {}",
-            p.line(),
-            p.file(),
-            info.message().unwrap()
-        );
-    } else {
-        println!("no information available.");
-    }
-    stop();
-}
-
 #[no_mangle]
 extern "C" fn stop() -> ! {
     loop {
@@ -96,8 +82,20 @@ fn main() -> ! {
         riscv::interrupt::enable();
     }

-    let r = ansi::parse_esc_str("abcd\u{1B}[XYZ\u{1B}[");
-    println!("{:?}", r);
+    // let r = ansi::parse_esc_str("abcd\u{1B}[XYZ\u{1B}[");
+    // println!("{:?}\n", r);
+    // let r = ansi::parse_esc_str("\u{1B}[");
+    // println!("{:?}\n", r);
+    // let r = ansi::parse_esc_str("\u{1B}8");
+    // println!("{:?}\n", r);
+
+    println!("{:?}", ansi::parse("\u{1B}[;"));
+    println!("{:?}", ansi::parse("\u{1B}[1;;"));
+    println!("{:?}", ansi::parse("\u{1B}[m"));
+    println!("{:?}", ansi::parse("\u{1B}[1;2m"));
+    println!("{:?}", ansi::parse("\u{1b}[?m"));
+
+    println!("{:?}", ansi::parse("\u{1B}[1;"));

     // match escape.push_str("abcd\u{1B}[5") {
     //     ParseRes::InSequence(s) => {
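The inputs exercised in this example are all variations on the ECMA-48 control sequence shape: `ESC [`, then parameter bytes (0x30 to 0x3F), then intermediate bytes (0x20 to 0x2F), then one final byte (0x40 to 0x7E). The following is a rough sketch of splitting a byte slice along those lines with slice patterns. It is not the `parse_new` from this commit: `Split` and `split_csi` are hypothetical names, and it uses plain slices rather than nom.

```rust
/// Result of trying to split one control sequence off the front of the input.
#[derive(Debug, PartialEq)]
enum Split<'a> {
    /// A complete sequence: parameter bytes, intermediate bytes, final byte.
    Csi {
        params: &'a [u8],
        intermediates: &'a [u8],
        final_byte: u8,
    },
    /// "ESC [" was seen, but the input ended before a final byte.
    Incomplete,
    /// "ESC [" was followed by a byte that cannot continue the sequence.
    Invalid,
    /// The input does not start with "ESC [".
    NotCsi,
}

/// Splits the input into a leading control sequence and the remaining bytes.
fn split_csi(input: &[u8]) -> (Split<'_>, &[u8]) {
    let body = match input {
        [0x1b, b'[', body @ ..] => body,
        _ => return (Split::NotCsi, input),
    };
    // Parameter bytes: digits, ';', ':', and the private markers "<=>?".
    let n_params = body
        .iter()
        .take_while(|b| (0x30..=0x3f).contains(*b))
        .count();
    let (params, after) = body.split_at(n_params);
    // Intermediate bytes: space through '/'.
    let n_inter = after
        .iter()
        .take_while(|b| (0x20..=0x2f).contains(*b))
        .count();
    let (intermediates, after) = after.split_at(n_inter);
    match after {
        [f @ 0x40..=0x7e, rest @ ..] => (
            Split::Csi { params, intermediates, final_byte: *f },
            rest,
        ),
        [] => (Split::Incomplete, input),
        _ => (Split::Invalid, after),
    }
}
```

Under these rules `split_csi(b"\x1b[VxD")` yields a sequence with final byte `V` and leaves `xD` as unrelated trailing bytes, which is the behavior the commit message attributes to the standard. An input such as `b"\x1b[1;"` comes back as `Incomplete` because no final byte has arrived yet.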
