Posix compliant escaping #5

wolfmanx · 2025-08-07T03:59:31Z

As reported in issue #4, the tokenizer does not follow the POSIX shell quoting rules. Since a frequent use case for the tokenizer module is reusing commands from shell scripts, it is unfortunate when the tokenizer behaves differently from the shell.

I have added extensive comments for better comprehension, but they can of course be removed.

I have also modified the newline test case, so that npm run test does not fail.

There is a test case that constructs a string in which every character in the ASCII code range 1-127 is escaped with a backslash. That string is then supplied to tokenizeArgs as an unquoted, a double-quoted, and a single-quoted argument. The results are checked against the actual output from /bin/sh.

That test fails against the unmodified version of args-tokenizer and passes with the proposed modifications.
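As a rough illustration of how such an input can be built (a sketch, not the code in this PR; the helper name buildEscapedArgument is hypothetical), each character code in the range is turned into a character and prefixed with a single backslash:

```typescript
// Hypothetical helper, not part of args-tokenizer: prefix every character
// whose code lies in [from, to] with one backslash.
function buildEscapedArgument(from: number, to: number): string {
  let out = "";
  for (let code = from; code <= to; code++) {
    out += "\\" + String.fromCharCode(code);
  }
  return out;
}

// The three variants that the test description mentions.
const argument = buildEscapedArgument(1, 127);
const unquoted = argument;
const doubleQuoted = '"' + argument + '"';
const singleQuoted = "'" + argument + "'";
```

The range includes the quote characters themselves, so a real test probably has to treat those positions with some care; the sketch does not attempt that.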

wolfmanx · 2025-08-09T01:46:59Z

I have constructed an extensive test with all characters in the range 0-255:

let argument = `\\\0x00\\\0x01 ... \\\0xFF`;

The output is tested against expected results constructed according to the POSIX standard for:

unescaping of an unquoted argument: tokenizeArgs(argument),
unescaping of a double-quoted argument: tokenizeArgs('"' + argument + '"'),
unescaping of a single-quoted argument: tokenizeArgs("'" + argument + "'").
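For reference, the POSIX rules behind those three expectations can be sketched as follows. This is only an illustration under simplifying assumptions, not the implementation in this PR: the function name posixUnescape and the QuoteContext type are hypothetical, the input is the body of the argument without its surrounding quotes, and expansions and field splitting are ignored.

```typescript
type QuoteContext = "unquoted" | "double" | "single";

// Inside double quotes a backslash only escapes these characters;
// before anything else it is kept as a literal backslash.
const DOUBLE_QUOTE_ESCAPABLE = new Set(["$", "`", '"', "\\", "\n"]);

// Hypothetical reference function: removes backslash escapes from `body`
// according to the quoting context it appears in.
function posixUnescape(body: string, context: QuoteContext): string {
  // Inside single quotes every character, including backslash, is literal.
  if (context === "single") return body;

  let out = "";
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch !== "\\" || i + 1 >= body.length) {
      out += ch; // ordinary character (a trailing backslash is kept as-is here)
      continue;
    }
    const next = body.charAt(i + 1);
    if (next === "\n") {
      i++; // backslash-newline is a line continuation and is removed
    } else if (context === "unquoted" || DOUBLE_QUOTE_ESCAPABLE.has(next)) {
      out += next; // the backslash is dropped, the next character is literal
      i++;
    } else {
      out += ch; // double quotes: keep the backslash, handle `next` normally
    }
  }
  return out;
}
```

With the input \a\$\\ this yields a$\ when unquoted, \a$\ inside double quotes, and the unchanged \a\$\\ inside single quotes.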

The same test is performed with all characters in the range 1-127 against actual output from /bin/sh.

result = await x("/bin/sh", ["-c", "echo " + argument]);  // unquoted
result = await x("/bin/sh", ["-c", "echo " + '"' + argument + '"']); // double quoted
result = await x("/bin/sh", ["-c", "echo " + "'" + argument + "'"]); // single quoted

This should demonstrate the correctness of the modification convincingly enough.

wolfmanx · 2025-08-10T21:33:47Z

The result of the escape test against the main branch shows the problems with escaped characters in double and single quoted strings:

   ✓ all escaped characters outside quoting context (POSIX)
   × all escaped characters in double quoting context (POSIX)
   × all escaped characters in single quoting context (POSIX)
   ✓ all escaped characters outside quoting context (/bin/sh)
   × all escaped characters in double quoting context (/bin/sh)
   × all escaped characters in single quoting context (/bin/sh)

⎯⎯⎯⎯⎯⎯⎯ Failed Tests 4 ⎯⎯⎯⎯⎯⎯⎯

 FAIL  src/args-tokenizer.test.ts > all escaped characters in double quoting context (POSIX)
 FAIL  src/args-tokenizer.test.ts > all escaped characters in double quoting context (/bin/sh)

  --------------------------------------------------
- \^A 1
+ ^A 1
- \^B 2
+ ^B 2
- \^C 3
+ ^C 3
- \^D 4
+ ^D 4

[...]

- \{ 123
+ { 123
- \| 124
+ | 124
- \} 125
+ } 125
- \~ 126
+ ~ 126
- \\x7F 127
+ \x7F 127

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[1/4]⎯

 FAIL  src/args-tokenizer.test.ts > all escaped characters in single quoting context (POSIX)
 FAIL  src/args-tokenizer.test.ts > all escaped characters in single quoting context (/bin/sh)
Error: Unexpected end of string. Closing quote is missing.

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯[2/4]⎯

 Test Files  1 failed (1)
      Tests  4 failed | 18 passed (22)

This was referenced Aug 7, 2025

Quoted empty strings are not recognized as valid arguments #6

Open

Quoted empty strings are valid arguments. #7

Open

wolfmanx mentioned this pull request Aug 9, 2025

Backslashes in single quotes should not be treated as escape characters. #4

Open

Fix parser to follow POSIX shell escaping rules

b86e4c4

wolfmanx force-pushed the posix-compliant-escaping branch from d4e1458 to b86e4c4 Compare August 10, 2025 21:13
