You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Migrate the NemoClaw CLI from ~4,700 lines of shell + ~4,200 lines of untyped CJS to strict ESM TypeScript on oclif, delivered across 6 sequential PRs. This subsumes #909 (oclif migration) and addresses the same 24 issues and 17 superseded PRs listed there.
No user-facing behavioral changes. The CLI commands, flags, and output stay the same — the internals become typed, tested, and structurally sound.
Motivation
The current CLI has three categories of problem that compound each other:
Shell duplication. Three copies of the install logic (install.sh, scripts/install.sh, inline in onboard.js). A dead setup.sh that leaks NVIDIA_API_KEY on the command line. An uninstall path that exists entirely outside the JS test suite. ~4,700 lines of shell that mostly duplicate logic already in bin/lib/*.js.
No type safety. The 12 modules in bin/lib/ are untyped CJS. The CLI and plugin share domain concepts (sandbox entries, onboard config, credential shapes, endpoint types) but have no shared type definitions — the same shapes are implicit in JS and explicit in TS, with no compile-time guarantee they agree. Silent bugs like registerSandbox({ naem: "foo" }) store name: undefined with no complaint.
No CLI framework. The 515-line switch/case dispatcher in bin/nemoclaw.js has no shell completions, no per-command --help, no flag validation, no --json output, no structured error handling. Fixing these individually is less efficient than fixing the architecture.
Architecture
oclif commands live at the repo root. The OpenClaw plugin (nemoclaw/src/) is a separate concern (Commander-based, different runtime) and is unchanged.
Security fix.scripts/setup.sh passes NVIDIA_API_KEY on the openshell sandbox create command line. onboard.js handles this through the gateway's stored credential. The shell path must die.
shared/types.ts — shared domain types with at least one consumer from day one: SandboxEntry, SandboxRegistry, EndpointType, NemoClawOnboardConfig, PlatformInfo, CredentialKey
Delete scripts/setup.sh (242 lines)
Convert 4 core modules (CJS→ESM TS in one step, straight to src/lib/):
All subprocess calls move from string-concatenated bash -c commands to execa array args. A project-wide $$ helper in src/lib/runner.ts provides defaults.
See #909 for the full list of 24 issues addressed and 17 PRs superseded.
UX compatibility
The nemoclaw <sandbox-name> <action> syntax is preserved via a command_not_found hook — no breaking changes to existing workflows or scripts. The new nemoclaw sandbox <action> <name> syntax is also available.
Summary
Migrate the NemoClaw CLI from ~4,700 lines of shell + ~4,200 lines of untyped CJS to strict ESM TypeScript on oclif, delivered across 6 sequential PRs. This subsumes #909 (oclif migration) and addresses the same 24 issues and 17 superseded PRs listed there.
No user-facing behavioral changes. The CLI commands, flags, and output stay the same — the internals become typed, tested, and structurally sound.
Motivation
The current CLI has three categories of problem that compound each other:
Shell duplication. Three copies of the install logic (
install.sh,scripts/install.sh, inline inonboard.js). A deadsetup.shthat leaksNVIDIA_API_KEYon the command line. An uninstall path that exists entirely outside the JS test suite. ~4,700 lines of shell that mostly duplicate logic already inbin/lib/*.js.No type safety. The 12 modules in
bin/lib/are untyped CJS. The CLI and plugin share domain concepts (sandbox entries, onboard config, credential shapes, endpoint types) but have no shared type definitions — the same shapes are implicit in JS and explicit in TS, with no compile-time guarantee they agree. Silent bugs likeregisterSandbox({ naem: "foo" })storename: undefinedwith no complaint.No CLI framework. The 515-line
switch/casedispatcher inbin/nemoclaw.jshas no shell completions, no per-command--help, no flag validation, no--jsonoutput, no structured error handling. Fixing these individually is less efficient than fixing the architecture.Architecture
oclif commands live at the repo root. The OpenClaw plugin (
nemoclaw/src/) is a separate concern (Commander-based, different runtime) and is unchanged.Two TypeScript projects (
tsconfig.cli.jsonat root +nemoclaw/tsconfig.json) withshared/bridging them.Plan
6 sequential PRs. Each builds on the last.
flowchart TD PR0["**PR 0** — Foundation ✅ #913\ntsconfig.cli.json · root execa · TS coverage ratchet"] PR1["**PR 1** — Security fix + first TS batch\nKill setup.sh · shared/types.ts\nrunner · registry · credentials · platform → src/lib/"] PR2["**PR 2** — Port shell scripts to TS\nuninstall.sh · start-services.sh · debug.sh → src/lib/\n+ nim · policies · inference-config · local-inference"] PR3["**PR 3** — oclif scaffold + simple commands\nScaffold oclif · list · status · start · stop · debug · uninstall\nMerge installers · port fix-coredns · convert onboard + preflight"] PR4["**PR 4** — oclif complex commands\nSandbox topic commands · deploy · onboard wizard\nCentralized error handling · secret redaction"] PR5["**PR 5** — E2E migration + cleanup\nAll e2e → vitest TS · shell completions\nDelete old dispatcher · strict TS everywhere"] PR0 --> PR1 PR1 --> PR2 PR2 --> PR3 PR3 --> PR4 PR4 --> PR5 style PR0 fill:#2d6a2d,color:#fffPR 0: Foundation ✅
#913Done.
tsconfig.cli.json(strict,noEmit),execa ^9.6.1in root devDependencies,scripts/check-coverage-ratchet.tsreplacing the bash+python version,tsc-check-clipre-push hook, CI updated.PR 1: Kill
setup.sh+ shared types + first CJS→ESM TS batchSecurity fix.
scripts/setup.shpassesNVIDIA_API_KEYon theopenshell sandbox createcommand line.onboard.jshandles this through the gateway's stored credential. The shell path must die.shared/types.ts— shared domain types with at least one consumer from day one:SandboxEntry,SandboxRegistry,EndpointType,NemoClawOnboardConfig,PlatformInfo,CredentialKeyscripts/setup.sh(242 lines)src/lib/):bin/lib/runner.js→src/lib/runner.ts— rewrite on execa ($$project helper, deleteshellQuote())bin/lib/registry.js→src/lib/registry.tsbin/lib/credentials.js→src/lib/credentials.tsbin/lib/platform.js→src/lib/platform.ts~240 lines of shell deleted, 4 modules converted. Risk: Low.
PR 2: Port uninstall + start-services + debug to TS
All three are shell scripts that
nemoclaw.jsshells out to viaspawnSync('bash', ...). Each becomes a TS module using execa, called directly.uninstall.sh(555 lines) →src/lib/uninstall.tsscripts/start-services.sh(204 lines) →src/lib/services.tsscripts/debug.sh(328 lines) →src/lib/debug.tsnim,policies,inference-config,local-inference~1,090 lines of shell → ~620 lines of TS + ~400 lines of tests. Risk: Medium.
PR 3: oclif scaffold + simple commands
The biggest structural change. Scaffold oclif at repo root and migrate the simple commands.
@oclif/core,@oclif/plugin-help,@oclif/plugin-autocomplete,@oclif/plugin-updatetsconfig.cli.jsonfor compiled output (outDir,declaration, dropnoEmit)list,status,start,stop,debug,uninstall,setup-sparkdeploy()Shell Commands #575), auto-generated--help(feat(cli): add per-command --help for all top-level and sandbox-scoped commands #757),enableJsonFlag(feat(cli): add --json output for list and status commands #753),--verbose/--debugglobals (feat: add --verbose / --debug flag for CLI observability #666)command_not_foundhook for backward compat (nemoclaw <sandbox> <action>still works)fix-coredns.sh→src/lib/fix-coredns.tsinstall.sh+scripts/install.sh→ single installer (~200 lines)onboard.js,preflight.js,resolve-openshell.jsOld dispatcher (
bin/nemoclaw.js) stays as fallback during this PR. Risk: Medium-high.PR 4: oclif sandbox commands + onboard wizard + deploy
sandbox/topic:connect,status,logs,destroy,policy add/listdeploy(SSH + cloud provider logic)@inquirer/promptsprocess.exit()calls with thrown errors (~15-20 call sites)Risk: High for onboard wizard. Test interactive flow end-to-end.
PR 5: E2E test migration + completions + cleanup
bin/nemoclaw.js, 515 lines) andscripts/lib/runtime.sh(229 lines)@oclif/plugin-autocomplete(bash/zsh/PowerShell; fish needs custom work) (feat: add shell completion for nemoclaw CLI #155)oclif readme(ci: add CLI/docs drift test for documented commands and slash subcommands #756, docs(cli): update command reference to match implemented slash commands and host commands #758)waitForhelper replaces bashexpectscripts):e2e-test.sh,test-full-e2e.sh,test-double-onboard.sh,e2e-gateway-isolation.sh,e2e-cloud-experimental/tree (~2,800 lines),brev-e2e.test.jsstrict: trueeverywhere,allowJs: false, full coverage ratchetRisk: Medium. Run old and new e2e in parallel for one PR cycle.
End state
pie title Lines of code by language (after) "TypeScript (CLI commands)" : 1500 "TypeScript (CLI lib)" : 3000 "TypeScript (plugin)" : 4800 "TypeScript (shared)" : 80 "TypeScript (e2e tests)" : 700 "Shell (stays)" : 1300 "Python (1 script)" : 100waitForhelper--json, flag validation)shared/types.ts— one definition, two consumersbash -c+shellQuote()Shell that stays (and why)
install.shuninstall.shbrev-setup.shapt-get,sudo,systemd)nemoclaw-start.shscripts/setup-spark.shsudo -Echeck-spdx-headers.sh,backup-workspace.sh,walkthrough.sh, etc.Subprocess execution: execa replaces
child_processAll subprocess calls move from string-concatenated
bash -ccommands to execa array args. A project-wide$$helper insrc/lib/runner.tsprovides defaults.Conflict management
Must merge before PR 1 (large changes to files we're rewriting):
onboard.jsrunner.js,policies.js,credentials.js,registry.jsonboard.js+platform.jsShould merge if ready (smaller
onboard.jschanges):Superseded by oclif migration (close before PR 3):
--json), fix(cli): use openshell top-level logs command #350 (routing), fix(cli): validate deploy instance names before setup prompts #604 (validation), feat(cli): add self-update command #644 (self-update), fix(security): redact secret patterns from CLI log and error output #672, security: redact secret patterns from CLI log and error output #794 (secret redaction), fix(cli): respect defaultValue in promptOrDefault interactive mode (Fixes #360) #631, fix: use correct --type flag for brev create command #477, fix: improve activeModelEntries consistency and validation #853, fix(blueprint): add error handling for JSON.parse in state, config, and migration loaders #638, Add error handling for gateway startup #398 (error handling)See #909 for the full list of 24 issues addressed and 17 PRs superseded.
UX compatibility
The
nemoclaw <sandbox-name> <action>syntax is preserved via acommand_not_foundhook — no breaking changes to existing workflows or scripts. The newnemoclaw sandbox <action> <name>syntax is also available./cc @cv @HagegeR @prekshivyas