Complete implementation of all commands, in-place editing, and optimizations #77

dspinellis · 2025-05-19T16:53:19Z

At this stage sed implements all POSIX commands and can correctly run two complex scripts: hanoi.sed (solves the Towers of Hanoi puzzle) and math.sed (implements an arbitrary-precision integer math calculator).

The performance of this Rust implementation is now better than the GNU and FreeBSD implementations for most benchmark cases.

GNU sed under Linux

               no-op-short rust is  3.04 times faster than gnu
          access-log-no-op rust is  1.91 times faster than gnu
       access-log-no-subst rust is  1.59 times faster than gnu
          access-log-subst rust is  1.32 times faster than gnu
         access-log-no-del rust is  1.54 times faster than gnu
        access-log-all-del rust is  1.99 times faster than gnu
       access-log-translit rust is 12.67 times faster than gnu
    access-log-complex-sub gnu is  4.24 times faster than rust
                 remove-cr rust is  1.68 times faster than gnu
              genome-subst rust is  1.15 times faster than gnu
                number-fix rust is  1.06 times faster than gnu
               long-script gnu is  2.19 times faster than rust
                     hanoi rust is  1.96 times faster than gnu
                 factorial rust is  1.25 times faster than gnu

FreeBSD sed

               no-op-short rust is  3.50 times faster than fbsd
          access-log-no-op rust is  2.45 times faster than fbsd
       access-log-no-subst rust is  1.38 times faster than fbsd
          access-log-subst rust is  4.02 times faster than fbsd
         access-log-no-del rust is  1.35 times faster than fbsd
        access-log-all-del rust is  4.74 times faster than fbsd
       access-log-translit fbsd is  1.14 times faster than rust
    access-log-complex-sub rust is  2.00 times faster than fbsd
                 remove-cr rust is  2.48 times faster than fbsd
              genome-subst rust is  2.21 times faster than fbsd
                number-fix fbsd is  1.04 times faster than rust
               long-script fbsd is 15.46 times faster than rust
                     hanoi rust is  9.33 times faster than fbsd
                 factorial rust is 115.72 times faster than fbsd

Remaining tasks

Improve runtime error reporting by including script coordinates in each command.
Fix buffering on terminal output to match current implementations.
Profile lagging benchmarks to see whether there's room for further optimizations.
Implement more GNU extensions.

According to POSIX, labels consist of portable filename characters.

Previously empty RE were saved and reused based on their occurrence in the compiled script. Per POSIX if an RE is empty (that is, no pattern is specified) sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified. Now REs are saved and reused at runtime.
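The runtime behaviour described above can be sketched as follows. This is a minimal illustration, not the code in processor.rs: `Pattern`, `Context`, and `re_or_saved` are hypothetical names, `Pattern` merely stands in for a compiled regular expression, and sharing it through `Rc` is an assumption made here to keep the example free of deep copies.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a compiled regular expression.
#[derive(Debug, PartialEq)]
struct Pattern(String);

/// Hypothetical runtime state: the last RE actually used by a command.
#[derive(Default)]
struct Context {
    saved: Option<Rc<Pattern>>,
}

/// Return the RE to use for the current command.
/// A specified RE is remembered as "the last RE used";
/// an empty RE (None) falls back to the remembered one.
fn re_or_saved(re: Option<&Rc<Pattern>>, ctx: &mut Context) -> Result<Rc<Pattern>, String> {
    match re {
        Some(p) => {
            // Cloning an Rc only bumps a reference count.
            ctx.saved = Some(Rc::clone(p));
            Ok(Rc::clone(p))
        }
        None => ctx
            .saved
            .clone()
            .ok_or_else(|| "no previous regular expression".to_string()),
    }
}
```

Because the saved RE is updated when a command executes rather than when the script is compiled, an empty RE inside a branch or loop picks up whichever RE was applied most recently at that point in the run.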

The previous order resulted in a duplicate label error.

regex::Regex EREs don't support back-references, so switch to fancy_regex. While at it remove test_compile_re_reuse_saved test case, which is no longer valid, as REs are reused at runtime.

sylvestre · 2025-05-19T18:47:46Z

src/uu/sed/src/processor.rs

+fn re_or_saved_re(regex: &Option<Regex>, context: &mut ProcessingContext) -> UResult<Regex> {
+    match regex {
+        Some(re) => {
+            *context.saved_regex.borrow_mut() = Some(re.clone());


Would it be possible to avoid the clone?

Indeed; fixed.

The change further reduced the time for the 50! calculation from 4.34 s to 3.46 s (against 10.28 s for GNU sed).

impressive!

sylvestre · 2025-05-19T18:48:48Z

src/uu/sed/src/processor.rs

+        let caps = match caps {
+            Ok(c) => c,
+            Err(e) => {
+                return Err(USimpleError::new(
+                    2,
+                    format!("regular expression capture retrieval error: {}", e),
+                ));
+            }
+        };


as you wish but you could write:

Suggested change

-        let caps = match caps {
-            Ok(c) => c,
-            Err(e) => {
-                return Err(USimpleError::new(
-                    2,
-                    format!("regular expression capture retrieval error: {}", e),
-                ));
-            }
-        };
+        let caps = caps
+            .map_err(|e| USimpleError::new(
+                2,
+                format!("regular expression capture retrieval error: {}", e),
+            ))?;

Thanks, fixed!
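For readers unfamiliar with the idiom, here is a small self-contained sketch showing that the two forms behave identically. `SimpleError` is a hypothetical stand-in for uucore's `USimpleError`, and integer parsing stands in for capture retrieval; neither is taken from the PR.

```rust
/// Hypothetical stand-in for an error carrying an exit code and a message.
#[derive(Debug, PartialEq)]
struct SimpleError {
    code: i32,
    message: String,
}

/// Verbose form: match on the Result and return early on error.
fn parse_verbose(s: &str) -> Result<i64, SimpleError> {
    let n = match s.parse::<i64>() {
        Ok(n) => n,
        Err(e) => {
            return Err(SimpleError {
                code: 2,
                message: format!("number parse error: {}", e),
            });
        }
    };
    Ok(n)
}

/// Concise form: convert the error with map_err and propagate it with `?`.
fn parse_concise(s: &str) -> Result<i64, SimpleError> {
    let n = s.parse::<i64>().map_err(|e| SimpleError {
        code: 2,
        message: format!("number parse error: {}", e),
    })?;
    Ok(n)
}
```

Both functions return the same value on success and the same error on failure; the second simply lets `?` perform the early return.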

Suggested by: sylvestre

sylvestre · 2025-05-21T07:41:47Z

src/uu/sed/src/fast_io.rs

+
+    /// Copy the specified file to the output.
+    pub fn copy_file(&mut self, path: &PathBuf) -> io::Result<()> {
+        #[cfg(unix)]


do you know what will happen here on !unix ?

Yes: on non-Unix platforms there is no memory-mapped data, so there's nothing to flush. (The flush is needed when we switch from coalescing and writing out memory-mapped data, with zero copies, to BufWriter.)
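A rough sketch of why the flush matters, under stated assumptions: `CoalescingWriter` and its methods are hypothetical names, a plain byte slice stands in for the memory-mapped input, and any `Write` stands in for the real output. It is not the code in fast_io.rs.

```rust
use std::io::{self, Write};

/// Hypothetical writer that coalesces adjacent input ranges into one write.
struct CoalescingWriter<'a, W: Write> {
    input: &'a [u8],                 // stands in for the memory-mapped input
    pending: Option<(usize, usize)>, // start..end range not yet written
    out: W,
    writes: usize,                   // number of write calls issued so far
}

impl<'a, W: Write> CoalescingWriter<'a, W> {
    fn new(input: &'a [u8], out: W) -> Self {
        Self { input, pending: None, out, writes: 0 }
    }

    /// Queue an input range; adjacent ranges are merged instead of written.
    fn write_range(&mut self, start: usize, end: usize) -> io::Result<()> {
        match self.pending {
            Some((s, e)) if e == start => self.pending = Some((s, end)),
            _ => {
                self.flush_pending()?;
                self.pending = Some((start, end));
            }
        }
        Ok(())
    }

    /// Write out whatever input range is still queued.
    fn flush_pending(&mut self) -> io::Result<()> {
        if let Some((s, e)) = self.pending.take() {
            self.out.write_all(&self.input[s..e])?;
            self.writes += 1;
        }
        Ok(())
    }

    /// Write data that does not come from the input (e.g. a copied file).
    /// The queued range must go out first to preserve output order.
    fn write_owned(&mut self, data: &[u8]) -> io::Result<()> {
        self.flush_pending()?;
        self.out.write_all(data)?;
        self.writes += 1;
        Ok(())
    }
}
```

Without the `flush_pending` call in `write_owned`, the owned data would overtake input lines that were queued earlier. When nothing is ever queued, as on platforms without memory mapping, that flush is a no-op.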

This is code for the r command, which is still WIP.

This improves performance up to 1.5 times, with no downside.

               no-op-short rust-sed is similarly fast as rust-sed-f0
          access-log-no-op rust-sed is 1.5 times faster than rust-sed-f0
       access-log-no-subst rust-sed is similarly fast as rust-sed-f0
          access-log-subst rust-sed is similarly fast as rust-sed-f0
         access-log-no-del rust-sed is 1.1 times faster than rust-sed-f0
        access-log-all-del rust-sed is similarly fast as rust-sed-f0
       access-log-translit rust-sed is 1.2 times faster than rust-sed-f0
    access-log-complex-sub rust-sed is similarly fast as rust-sed-f0
                 remove-cr rust-sed is similarly fast as rust-sed-f0
              genome-subst rust-sed is similarly fast as rust-sed-f0
                number-fix rust-sed is 1.1 times faster than rust-sed-f0
               long-script rust-sed is similarly fast as rust-sed-f0
                     hanoi rust-sed is similarly fast as rust-sed-f0
                 factorial rust-sed is similarly fast as rust-sed-f0

This improves performance by 10% in the following benchmark case. Performance in all others remains the same. access-log-subst rust-sed is 1.1 times faster than rust-sed-eb64104

In theory, choosing among find, find_iter, captures, and captures_iter according to the values of sub.occurrence and sub.replacement.max_group_number could allow for better performance. In practice, specializing the easiest case to use find did not show any improvement.

               no-op-short rust-sed is similarly fast as rust-sed-7868f4f
          access-log-no-op rust-sed is similarly fast as rust-sed-7868f4f
       access-log-no-subst rust-sed is similarly fast as rust-sed-7868f4f
          access-log-subst rust-sed is similarly fast as rust-sed-7868f4f
         access-log-no-del rust-sed is similarly fast as rust-sed-7868f4f
        access-log-all-del rust-sed is similarly fast as rust-sed-7868f4f
       access-log-translit rust-sed is similarly fast as rust-sed-7868f4f
    access-log-complex-sub rust-sed is similarly fast as rust-sed-7868f4f
                 remove-cr rust-sed is similarly fast as rust-sed-7868f4f
              genome-subst rust-sed is similarly fast as rust-sed-7868f4f
                number-fix rust-sed is similarly fast as rust-sed-7868f4f
               long-script rust-sed is similarly fast as rust-sed-7868f4f
                     hanoi rust-sed is similarly fast as rust-sed-7868f4f
                 factorial rust-sed is similarly fast as rust-sed-7868f4f

Consequently, this commit just documents the change and its results, and will be reverted.

This reverts commit f18f772. No performance improvement was seen.

This improves one benchmark case by 20%: access-log-all-del rust-sed is 1.2 times faster than rust-sed-f18f772. Unfortunately, all other cases remain unaffected.

The hope is that the now available regex::bytes will run find() faster than captures_iter(). Indeed, using find() increases the performance of several benchmark cases by 2-12%.

               no-op-short rust-sed is 1.02 times faster than rust-sed-5fad204
          access-log-no-op rust-sed is 1.01 times faster than rust-sed-5fad204
       access-log-no-subst rust-sed is 1.08 times faster than rust-sed-5fad204
          access-log-subst rust-sed is 1.12 times faster than rust-sed-5fad204
       access-log-translit rust-sed is 1.02 times faster than rust-sed-5fad204
    access-log-complex-sub rust-sed is 1.04 times faster than rust-sed-5fad204
                 remove-cr rust-sed is 1.12 times faster than rust-sed-5fad204
              genome-subst rust-sed is 1.02 times faster than rust-sed-5fad204
                     hanoi rust-sed is 1.01 times faster than rust-sed-5fad204

All other cases remain the same.

Performance improves significantly in the following case: number-fix rust-sed is 1.07 times faster than rust-sed-90d60b8 Other changes <5% are likely to be noise.

- Add RE to detect specifications that require an RE engine, rather than a literal string match.
- Change RE unit tests to test directly the RE rather than the constructor.
- Use arrays for similar tests to avoid repetition.

Regex uses an automaton even for matching literal strings. Moving through its state transitions is suboptimal. This commit introduces literal string matching with the memchr::memmem matcher, which may use SIMD and other specialized features to speed up the search, as well as other algorithms for special sizes. This substantially boosts the performance of several benchmark cases.

       access-log-no-subst rust-86 is 2.84 times faster than rust-sed-a6
          access-log-subst rust-86 is 1.87 times faster than rust-sed-a6
         access-log-no-del rust-86 is 3.00 times faster than rust-sed-a6
        access-log-all-del rust-86 is 2.80 times faster than rust-sed-a6
                 remove-cr rust-86 is 4.35 times faster than rust-sed-a6
              genome-subst rust-86 is 3.65 times faster than rust-sed-a6

It makes the performance of Rust better than the GNU and FreeBSD implementations for most benchmark cases.

               no-op-short rust is  3.04 times faster than gnu
          access-log-no-op rust is  1.91 times faster than gnu
       access-log-no-subst rust is  1.59 times faster than gnu
          access-log-subst rust is  1.32 times faster than gnu
         access-log-no-del rust is  1.54 times faster than gnu
        access-log-all-del rust is  1.99 times faster than gnu
       access-log-translit rust is 12.67 times faster than gnu
    access-log-complex-sub gnu is  4.24 times faster than rust
                 remove-cr rust is  1.68 times faster than gnu
              genome-subst rust is  1.15 times faster than gnu
                number-fix rust is  1.06 times faster than gnu
               long-script gnu is  2.19 times faster than rust
                     hanoi rust is  1.96 times faster than gnu
                 factorial rust is  1.25 times faster than gnu

               no-op-short rust is  3.50 times faster than fbsd
          access-log-no-op rust is  2.45 times faster than fbsd
       access-log-no-subst rust is  1.38 times faster than fbsd
          access-log-subst rust is  4.02 times faster than fbsd
         access-log-no-del rust is  1.35 times faster than fbsd
        access-log-all-del rust is  4.74 times faster than fbsd
       access-log-translit fbsd is  1.14 times faster than rust
    access-log-complex-sub rust is  2.00 times faster than fbsd
                 remove-cr rust is  2.48 times faster than fbsd
              genome-subst rust is  2.21 times faster than fbsd
                number-fix fbsd is  1.04 times faster than rust
               long-script fbsd is 15.46 times faster than rust
                     hanoi rust is  9.33 times faster than fbsd
                 factorial rust is 115.72 times faster than fbsd
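The idea of bypassing the RE engine for literal patterns can be sketched with the standard library alone. This is only an illustration: `is_literal` and `find_literal` are hypothetical names, the metacharacter check is a deliberately conservative guess rather than the detection RE used in the PR, and `str::find` stands in for the memchr::memmem matcher.

```rust
/// Conservative check: true only if `pattern` contains none of the
/// characters that may be special in a regular expression.
/// (Assumption: a pattern rejected here is simply handed to the RE engine.)
fn is_literal(pattern: &str) -> bool {
    !pattern.chars().any(|c| {
        matches!(
            c,
            '\\' | '.' | '[' | ']' | '(' | ')' | '*' | '+' | '?' | '{' | '}' | '|' | '^' | '$'
        )
    })
}

/// Find the first occurrence of a literal needle and return its
/// byte range (start, end) in the haystack, as an RE match would.
fn find_literal(haystack: &str, needle: &str) -> Option<(usize, usize)> {
    haystack.find(needle).map(|start| (start, start + needle.len()))
}
```

A caller would test the pattern once at compile time and then pick either the literal search or the RE engine for every input line, so the check itself adds no per-line cost.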

By using uucore's .map_err_context.

dspinellis · 2025-05-30T17:18:46Z

@sylvestre I integrated the commits for the remaining commands and in-place editing. The implementation is mostly done now.

sylvestre · 2025-05-30T19:30:15Z

Excellent, I will have a look :)

sylvestre · 2025-05-30T19:30:32Z

I will reply to your email too :)

dspinellis added 15 commits May 19, 2025 19:28

Improve label name compliance

09d4693

According to POSIX, labels consist of portable filename characters.

Support the compilation of text (aci) commands

826ebf9

Implement the i command

858c3a6

Implement a c i commands

6afd775

Fix compiled data structure patching order

aa66386

The previous order resulted in a duplicate label error.

Escape dollar and caret in BREs

416c4a7

Fix handling of escaped delimiter

f844ddd

Add math torture test

1360174

Support regular expression back-references

fddd6c1

regex::Regex EREs don't support back-references, so switch to fancy_regex. While at it remove test_compile_re_reuse_saved test case, which is no longer valid, as REs are reused at runtime.

Fix ERE conversion of BRE back-references

a0da132

Add Towers of Hanoi test case

84ca66c

Add Pi calculation test case

1a6b078

Tidy up and document integration tests

bb9bfb6

Fix CI test failures

d74e3a7

dspinellis force-pushed the aci-commands branch from f6b99da to d74e3a7 Compare May 19, 2025 17:24

dspinellis changed the title ~~Implement a, c, i commands and fix faults to run example scripts~~ Implement a, c, i commands and fix faults to run sophisticated scripts May 19, 2025

sylvestre reviewed May 19, 2025


dspinellis added 3 commits May 19, 2025 23:42

Optimize away needless RE clone

7ffb4a4

Suggested by: sylvestre

Shorten error mapping code

fd90cc0

Suggested by: sylvestre

Utilize shadowing to avoid convoluted names

f0cceb7

sylvestre reviewed May 21, 2025


dspinellis added 7 commits May 23, 2025 18:30

Update status

a233619

Add benchmark and comparison scripts

f7365fa

Handle failed executions

a4cee51

Check at runtime for invalid s/// group references

25b84bd

Catch invalid group references at compile time

cbe314d

Support replacement group \0 as synonym for &

eb64104

dspinellis added 19 commits May 24, 2025 20:26

Early exit in substitution

7868f4f

This improves performance by 10% in the following benchmark case. Performance in all others remains the same. access-log-subst rust-sed is 1.1 times faster than rust-sed-eb64104

Revert "Try to specialize regex retrieval"

a6010ae

This reverts commit f18f772. No performance improvement was seen.

Use regex::bytes whenever possible

5fad204

This improves one benchmark case by 20%: access-log-all-del rust-sed is 1.2 times faster than rust-sed-f18f772. Unfortunately, all other cases remain unaffected.

Report performance changes up to 1%

28b5eb0

Specialize single replacement multiple groups

a67be74

Performance improves significantly in the following case: number-fix rust-sed is 1.07 times faster than rust-sed-90d60b8 Other changes <5% are likely to be noise.

Improve fancy_regex detection RE and tests

aa12a12

Fix Windows warning

ea3df65

Add RE detection and refactor unit tests

a069605

- Add RE to detect specifications that require an RE engine, rather than a literal string match.
- Change RE unit tests to test directly the RE rather than the constructor.
- Use arrays for similar tests to avoid repetition.

Merge branch 'optimize' into aci-commands

27bdec6

Implement the r, w, and = commands

428fabe

Remove unneeded text field

c73f88a

Add l and Q commands, extended with GNU number

18cf3fc

Improve error reporting

4b547f4

By using uucore's .map_err_context.

Remove and fix dead code

be0a6a1

Add in-place editing support

a2e1969

Merge remote-tracking branch 'upstream/main' into aci-commands

751621e

dspinellis changed the title ~~Implement a, c, i commands and fix faults to run sophisticated scripts~~ Complete implementation of all commands, in-place editing, and optimizations May 30, 2025

Merge remote-tracking branch 'upstream/main' into aci-commands

84add8a
