Complete implementation of all commands, in-place editing, and optimizations #77
According to POSIX, labels consist of portable filename characters.
Previously, empty REs were saved and reused based on their occurrence in the compiled script. Per POSIX, if an RE is empty (that is, no pattern is specified), sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified. Now REs are saved and reused at runtime.
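The runtime reuse described above can be sketched as follows. This is a minimal, std-only illustration, not the code from this PR: `Pattern` is a hypothetical stand-in for a compiled regex, `Context` is a simplified stand-in for the processing context, and sharing the pattern through `Rc` is just one assumed way to save it without a deep copy.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a compiled regular expression.
#[derive(Debug, PartialEq)]
struct Pattern(String);

/// Simplified runtime context that remembers the most recently used RE.
struct Context {
    saved_regex: RefCell<Option<Rc<Pattern>>>,
}

/// Return the given RE and remember it as the last one used.
/// For an empty RE (`None`), fall back to the RE saved at runtime.
fn re_or_saved_re(regex: &Option<Rc<Pattern>>, context: &Context) -> Result<Rc<Pattern>, String> {
    match regex {
        Some(re) => {
            // Rc::clone only bumps a reference count; the pattern itself is not copied.
            *context.saved_regex.borrow_mut() = Some(Rc::clone(re));
            Ok(Rc::clone(re))
        }
        None => context
            .saved_regex
            .borrow()
            .clone()
            .ok_or_else(|| "no previous regular expression".to_string()),
    }
}
```

Because the saved slot is updated each time a command actually runs, an empty RE resolves to whatever was last applied during execution rather than to what appeared last in the script text.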
The previous order resulted in a duplicate label error.
regex::Regex EREs don't support back-references, so switch to fancy_regex. While at it, remove the test_compile_re_reuse_saved test case, which is no longer valid, as REs are now reused at runtime.
src/uu/sed/src/processor.rs
Outdated
fn re_or_saved_re(regex: &Option<Regex>, context: &mut ProcessingContext) -> UResult<Regex> {
    match regex {
        Some(re) => {
            *context.saved_regex.borrow_mut() = Some(re.clone());
Would it be possible to avoid the clone?
Indeed; fixed.
The change further reduced the time for the 50! calculation from 4.34 s to 3.46 s (against 10.28 s for GNU sed).
impressive!
src/uu/sed/src/processor.rs
Outdated
let caps = match caps {
    Ok(c) => c,
    Err(e) => {
        return Err(USimpleError::new(
            2,
            format!("regular expression capture retrieval error: {}", e),
        ));
    }
};
as you wish but you could write:
let caps = caps
    .map_err(|e| USimpleError::new(
        2,
        format!("regular expression capture retrieval error: {}", e),
    ))?;
Thanks, fixed!
Suggested by: sylvestre
/// Copy the specified file to the output.
pub fn copy_file(&mut self, path: &PathBuf) -> io::Result<()> {
    #[cfg(unix)]
do you know what will happen here on !unix ?
Yes, on non-Unix platforms there is no memory-mapped data, so there's nothing to flush. (The flush is needed when we switch from coalescing and writing out memory-mapped data (with zero copies) to BufWriter.)
This is code for the r command, which is still WIP.
This improves performance up to 1.5 times, with no downside.

no-op-short             rust-sed is similarly fast as rust-sed-f0
access-log-no-op        rust-sed is 1.5 times faster than rust-sed-f0
access-log-no-subst     rust-sed is similarly fast as rust-sed-f0
access-log-subst        rust-sed is similarly fast as rust-sed-f0
access-log-no-del       rust-sed is 1.1 times faster than rust-sed-f0
access-log-all-del      rust-sed is similarly fast as rust-sed-f0
access-log-translit     rust-sed is 1.2 times faster than rust-sed-f0
access-log-complex-sub  rust-sed is similarly fast as rust-sed-f0
remove-cr               rust-sed is similarly fast as rust-sed-f0
genome-subst            rust-sed is similarly fast as rust-sed-f0
number-fix              rust-sed is 1.1 times faster than rust-sed-f0
long-script             rust-sed is similarly fast as rust-sed-f0
hanoi                   rust-sed is similarly fast as rust-sed-f0
factorial               rust-sed is similarly fast as rust-sed-f0
This improves performance by 10% in the following benchmark case. Performance in all others remains the same.

access-log-subst        rust-sed is 1.1 times faster than rust-sed-eb64104
In theory, choosing among find, find_iter, captures, and captures_iter according to the values of sub.occurrence and sub.replacement.max_group_number could allow for better performance. In practice, specializing the easiest case to use find did not show any improvement.

no-op-short             rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-op        rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-subst     rust-sed is similarly fast as rust-sed-7868f4f
access-log-subst        rust-sed is similarly fast as rust-sed-7868f4f
access-log-no-del       rust-sed is similarly fast as rust-sed-7868f4f
access-log-all-del      rust-sed is similarly fast as rust-sed-7868f4f
access-log-translit     rust-sed is similarly fast as rust-sed-7868f4f
access-log-complex-sub  rust-sed is similarly fast as rust-sed-7868f4f
remove-cr               rust-sed is similarly fast as rust-sed-7868f4f
genome-subst            rust-sed is similarly fast as rust-sed-7868f4f
number-fix              rust-sed is similarly fast as rust-sed-7868f4f
long-script             rust-sed is similarly fast as rust-sed-7868f4f
hanoi                   rust-sed is similarly fast as rust-sed-7868f4f
factorial               rust-sed is similarly fast as rust-sed-7868f4f

Consequently, this commit just documents the change and its results, and will be reverted.
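For readers unfamiliar with the idea, the selection among the four matching routines could look roughly like the sketch below. It is an illustration only: `MatchStrategy` and `choose_strategy` are hypothetical names, and the interpretation of the two inputs (an occurrence of 1 means "replace only the first match", a maximum group number of 0 means "the replacement refers to no numbered group") is an assumption, not taken from the PR's code.

```rust
/// Which matching routine a substitution would need
/// (variant names mirror the regex methods mentioned above).
#[derive(Debug, PartialEq, Clone, Copy)]
enum MatchStrategy {
    Find,
    FindIter,
    Captures,
    CapturesIter,
}

/// Pick the cheapest routine that still provides what the substitution needs.
/// Assumed semantics (hypothetical): `occurrence == 1` replaces only the first
/// match; `max_group_number == 0` means no numbered groups are referenced.
fn choose_strategy(occurrence: usize, max_group_number: usize) -> MatchStrategy {
    let single_match = occurrence == 1;
    let needs_groups = max_group_number > 0;
    match (single_match, needs_groups) {
        (true, false) => MatchStrategy::Find,
        (false, false) => MatchStrategy::FindIter,
        (true, true) => MatchStrategy::Captures,
        (false, true) => MatchStrategy::CapturesIter,
    }
}
```

As the numbers above show, specializing only the simplest case brought no measurable gain in this experiment.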
This reverts commit f18f772. No performance improvement was seen.
This improves one benchmark case by 20%:

access-log-all-del      rust-sed is 1.2 times faster than rust-sed-f18f772

Unfortunately, all other cases remain unaffected.
The hope is that the now available regex::bytes will run find() faster than captures_iter(). Indeed, using find() increases the performance of several benchmark cases by 2-12%.

no-op-short             rust-sed is 1.02 times faster than rust-sed-5fad204
access-log-no-op        rust-sed is 1.01 times faster than rust-sed-5fad204
access-log-no-subst     rust-sed is 1.08 times faster than rust-sed-5fad204
access-log-subst        rust-sed is 1.12 times faster than rust-sed-5fad204
access-log-translit     rust-sed is 1.02 times faster than rust-sed-5fad204
access-log-complex-sub  rust-sed is 1.04 times faster than rust-sed-5fad204
remove-cr               rust-sed is 1.12 times faster than rust-sed-5fad204
genome-subst            rust-sed is 1.02 times faster than rust-sed-5fad204
hanoi                   rust-sed is 1.01 times faster than rust-sed-5fad204

All other cases remain the same.
Performance improves significantly in the following case:

number-fix              rust-sed is 1.07 times faster than rust-sed-90d60b8

Other changes below 5% are likely to be noise.
- Add RE to detect specifications that require an RE engine, rather than a literal string match.
- Change RE unit tests to test directly the RE rather than the constructor.
- Use arrays for similar tests to avoid repetition.
Regex uses an automaton even for matching literal strings. Moving through its state transitions is suboptimal. This commit introduces literal string matching with the memchr::memmem matcher, which may use SIMD and other specialized features to speed up the search, as well as other algorithms for special sizes. This substantially boosts the performance of several benchmark cases.

access-log-no-subst     rust-86 is 2.84 times faster than rust-sed-a6
access-log-subst        rust-86 is 1.87 times faster than rust-sed-a6
access-log-no-del       rust-86 is 3.00 times faster than rust-sed-a6
access-log-all-del      rust-86 is 2.80 times faster than rust-sed-a6
remove-cr               rust-86 is 4.35 times faster than rust-sed-a6
genome-subst            rust-86 is 3.65 times faster than rust-sed-a6

It makes the performance of Rust better than the GNU and FreeBSD implementations for most benchmark cases.

no-op-short             rust is 3.04 times faster than gnu
access-log-no-op        rust is 1.91 times faster than gnu
access-log-no-subst     rust is 1.59 times faster than gnu
access-log-subst        rust is 1.32 times faster than gnu
access-log-no-del       rust is 1.54 times faster than gnu
access-log-all-del      rust is 1.99 times faster than gnu
access-log-translit     rust is 12.67 times faster than gnu
access-log-complex-sub  gnu is 4.24 times faster than rust
remove-cr               rust is 1.68 times faster than gnu
genome-subst            rust is 1.15 times faster than gnu
number-fix              rust is 1.06 times faster than gnu
long-script             gnu is 2.19 times faster than rust
hanoi                   rust is 1.96 times faster than gnu
factorial               rust is 1.25 times faster than gnu

no-op-short             rust is 3.50 times faster than fbsd
access-log-no-op        rust is 2.45 times faster than fbsd
access-log-no-subst     rust is 1.38 times faster than fbsd
access-log-subst        rust is 4.02 times faster than fbsd
access-log-no-del       rust is 1.35 times faster than fbsd
access-log-all-del      rust is 4.74 times faster than fbsd
access-log-translit     fbsd is 1.14 times faster than rust
access-log-complex-sub  rust is 2.00 times faster than fbsd
remove-cr               rust is 2.48 times faster than fbsd
genome-subst            rust is 2.21 times faster than fbsd
number-fix              fbsd is 1.04 times faster than rust
long-script             fbsd is 15.46 times faster than rust
hanoi                   rust is 9.33 times faster than fbsd
factorial               rust is 115.72 times faster than fbsd
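The literal fast path can be illustrated with the standard library alone. The sketch below is not the PR's implementation: the PR detects RE-requiring patterns with a regular expression and searches with memchr::memmem, whereas here `needs_regex_engine` is a deliberately conservative character check and `find_literal` is a naive byte-window search. All names are hypothetical, and sed flags that would also force the RE engine (such as case-insensitive matching) are not modelled.

```rust
/// Conservative check: true if the pattern contains any character that is
/// special in POSIX basic or extended regular expressions.
fn needs_regex_engine(pattern: &str) -> bool {
    pattern.chars().any(|c| {
        matches!(
            c,
            '.' | '[' | ']' | '*' | '^' | '$' | '\\' | '+' | '?' | '(' | ')' | '{' | '}' | '|'
        )
    })
}

/// Naive literal search over bytes; returns the offset of the first match.
/// (A production matcher such as memchr::memmem would be much faster.)
fn find_literal(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0); // an empty needle matches at the start
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// How a pattern will be matched at runtime.
#[derive(Debug, PartialEq)]
enum Matcher {
    Literal(Vec<u8>),
    NeedsRegex(String),
}

/// Route plain strings to the literal matcher and everything else to the RE engine.
fn plan_matcher(pattern: &str) -> Matcher {
    if needs_regex_engine(pattern) {
        Matcher::NeedsRegex(pattern.to_string())
    } else {
        Matcher::Literal(pattern.as_bytes().to_vec())
    }
}
```

Erring on the side of the RE engine keeps the check safe: a pattern wrongly classified as needing a regex only loses the speed-up, while one wrongly classified as literal would change behavior.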
By using uucore's .map_err_context.
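The general pattern of attaching context to an error before propagating it with `?` can be sketched without uucore. Everything below is a std-only analogue: `ContextError`, `WithContext`, `with_context`, and `read_script` are hypothetical names chosen for illustration and do not reproduce uucore's actual trait or error types.

```rust
use std::fmt;
use std::io;

/// Hypothetical error type pairing a human-readable context with the I/O cause.
#[derive(Debug)]
struct ContextError {
    context: String,
    source: io::Error,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl std::error::Error for ContextError {}

/// Hypothetical extension trait: adds context lazily, only on the error path.
trait WithContext<T> {
    fn with_context(self, context: impl FnOnce() -> String) -> Result<T, ContextError>;
}

impl<T> WithContext<T> for Result<T, io::Error> {
    fn with_context(self, context: impl FnOnce() -> String) -> Result<T, ContextError> {
        self.map_err(|source| ContextError { context: context(), source })
    }
}

/// Example caller: one line replaces an explicit match on the io::Result.
fn read_script(path: &str) -> Result<String, ContextError> {
    let text = std::fs::read_to_string(path).with_context(|| format!("cannot read {}", path))?;
    Ok(text)
}
```

Passing the context as a closure means the message is only formatted when an error actually occurs.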
@sylvestre I integrated the commits for the remaining commands and in-place editing. The implementation is mostly done now.
Excellent, I will have a look :)
I will reply to your email too :)
At this stage, sed implements all POSIX commands and can correctly run two complex scripts: hanoi.sed (solves the Towers of Hanoi puzzle) and math.sed (implements an arbitrary-precision integer math calculator).
The performance of this Rust implementation is now better than the GNU and FreeBSD implementations for most benchmark cases.
GNU sed under Linux
FreeBSD sed
Remaining tasks