What is the right approach for a typed API #3792

@epage

This is a follow-up to #2683

State of Clap 3.0

Clap's builder API only natively supports two types, OsString and String, controlled via Arg::allow_invalid_utf8 and then accessed via ArgMatches::value_of{,_os}.

The user can provide extra validation via

  • Arg::forbid_empty_value
  • Arg::possible_values
  • Arg::validator
  • Arg::validator_regex

ArgMatches::value_of_t then adapts to the user type via FromStr.

clap_derive allows users to set a custom adapter to their type (instead of the fixed one of value_of_t). This adapter then automatically gets put in as an Arg::validator, causing it to be run twice (#3589).

This requires users to know how to use Paths correctly, see #3496
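The double invocation described in #3589 can be modeled without clap at all. The following is a minimal std-only sketch, not clap's implementation: `parse_port` and `validate_then_convert` are hypothetical names, and the counter exists only to make the second call visible.

```rust
use std::cell::Cell;

/// Hypothetical user adapter (not part of clap); counts its own invocations.
fn parse_port(raw: &str, calls: &Cell<u32>) -> Result<u16, String> {
    calls.set(calls.get() + 1);
    raw.parse::<u16>().map_err(|e| e.to_string())
}

/// Simplified model of the flow described above: the adapter first runs as a
/// validator (its value is discarded), then runs again to produce the value.
fn validate_then_convert(raw: &str, calls: &Cell<u32>) -> Result<u16, String> {
    // Step 1: validation pass, the parsed value is thrown away.
    parse_port(raw, calls)?;
    // Step 2: conversion pass, the same adapter parses the same input again.
    parse_port(raw, calls)
}

fn main() {
    let calls = Cell::new(0);
    let port = validate_then_convert("8080", &calls);
    println!("{:?} after {} adapter calls", port, calls.get());
}
```

In this model, a valid input costs two adapter calls, while an invalid input stops after the first.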

Basic Builder

#!/usr/bin/env -S rust-script --debug

//! ```cargo
//! [dependencies]
//! clap = { path = "../clap", features = ["derive"] }
//! ```

fn main() {
    let cmd =
        clap::Command::new("test").arg(clap::Arg::new("value").long("value").takes_value(true));
    let args = cmd.get_matches();
    dbg!(args.value_of_t_or_exit::<u64>("value"));
}

Output:

$ ./clap-from_str.rs --value foo
error: Invalid value "foo" for 'value': The argument 'foo' isn't a valid value for 'value': invalid digit found in string

(note: no coloring)

Basic Derive:

#!/usr/bin/env -S rust-script --debug

//! ```cargo
//! [dependencies]
//! clap = { path = "../clap", features = ["derive"] }
//! ```

use clap::Parser;

#[derive(Parser, Debug)]
pub struct Args {
    #[clap(long)]
    value: u64,
}

fn main() {
    let args = Args::parse();
    dbg!(args);
}

Output:

$ ./clap-derive_t.rs --value foo
error: Invalid value "foo" for '--value <VALUE>': invalid digit found in string

For more information try --help

State of 3.2

With #3732 and many follow-ups, we've made the API typed. This removes all of those extra validators and makes the builder API closer to the derive API.

Basic Builder

#!/usr/bin/env -S rust-script --debug

//! ```cargo
//! [dependencies]
//! clap = { path = "../clap", features = ["derive"] }
//! ```

fn main() {
    let cmd = clap::Command::new("test").arg(
        clap::Arg::new("value")
            .long("value")
            .takes_value(true)
            .value_parser(clap::value_parser!(u64)),
    );
    let args = cmd.get_matches();
    dbg!(args.get_one::<u64>("value").unwrap());
}

Output:

$ ./clap-from_str.rs --value foo
error: Invalid value "foo" for '--value <value>': invalid digit found in string

For more information try --help

Problems with this approach:

  • ArgAction::Count (the builder replacement for derive's parse(from_occurrences)) only works with u64, and we can only tell users this through docs / asserts
  • Users have to match the type from Arg::value_parser with ArgMatches::get_one::<T>
  • The user's type is forced to be std::any::Any + Clone + Send + Sync + 'static. Previously Clone wasn't required, which allowed more types to be used
    • Ideally we'd also track std::fmt::Debug...
  • Our storage is effectively (Vec<Box<T>>, Vec<OsString>), causing an extra allocation per value while still having to have the OsString for validation
    • any_vec might help in the future to take Vec<Box<T>> to Vec<T>
    • Maybe as we break compatibility, we can ooch away from tracking the Raw Value
  • In addition, we have to do a lot of type casting
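A few of these problems are easier to see in a toy model of type-erased storage. The sketch below uses only the standard library and is an assumption-laden stand-in, not clap's internals: `Matches`, `AnyValue`, and `insert` are hypothetical, and `Arc` is used here where the list above says `Box`. It shows the per-value allocation, the raw `OsString` kept alongside, and what happens when the type requested at lookup does not match the type stored at parse time.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::ffi::OsString;
use std::sync::Arc;

/// Hypothetical type-erased value; one heap allocation per stored value.
type AnyValue = Arc<dyn Any + Send + Sync>;

/// Hypothetical stand-in for typed match storage: every typed value is kept
/// next to the raw OsString it was parsed from.
#[derive(Default)]
struct Matches {
    values: HashMap<String, Vec<(AnyValue, OsString)>>,
}

impl Matches {
    /// Stores an already-parsed value; the bounds mirror the ones listed above.
    fn insert<T: Any + Clone + Send + Sync>(&mut self, id: &str, raw: &str, value: T) {
        self.values
            .entry(id.to_owned())
            .or_default()
            .push((Arc::new(value), OsString::from(raw)));
    }

    /// The caller must name the same T that was stored; a mismatch can only be
    /// detected at runtime through the downcast.
    fn get_one<T: Any + Clone + Send + Sync>(&self, id: &str) -> Option<&T> {
        let (value, _raw) = self.values.get(id)?.first()?;
        (**value).downcast_ref::<T>()
    }
}

fn main() {
    let mut matches = Matches::default();
    matches.insert("value", "42", 42u64);
    // Matching type: the downcast succeeds.
    println!("{:?}", matches.get_one::<u64>("value"));
    // Mismatched type: this sketch yields None; nothing catches it at compile time.
    println!("{:?}", matches.get_one::<u32>("value"));
}
```

Returning `None` on a mismatch is a simplification chosen for this sketch; the point is only that the compiler cannot connect the stored type to the requested one.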

Alternatives

Double-down on storing OsString

  • The Builder API would only work with OsString
  • get_one::<T>(id, typed_value_parser) would replace value_of_t_or_exit (the try_ variants would replace the non-exit version)
  • Derive would use try_get_one / try_get_many
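As a rough illustration of this alternative, the std-only sketch below stores nothing but `OsString` and applies the parser at access time. Everything in it is hypothetical (`RawMatches`, `push`, `parse_u64`, and the exact shape of `try_get_one`); it is one possible reading of the proposal, not a proposed signature.

```rust
use std::collections::HashMap;
use std::ffi::{OsStr, OsString};

/// Hypothetical storage that keeps only the raw values.
#[derive(Default)]
struct RawMatches {
    values: HashMap<String, Vec<OsString>>,
}

impl RawMatches {
    /// Records a raw value for an argument id.
    fn push(&mut self, id: &str, raw: &str) {
        self.values.entry(id.to_owned()).or_default().push(OsString::from(raw));
    }

    /// The caller supplies the parser at lookup time, so the stored data needs
    /// no trait bounds and no type-erased allocation.
    fn try_get_one<T, E>(
        &self,
        id: &str,
        parser: impl Fn(&OsStr) -> Result<T, E>,
    ) -> Result<Option<T>, E> {
        match self.values.get(id).and_then(|v| v.first()) {
            Some(raw) => parser(raw.as_os_str()).map(Some),
            None => Ok(None),
        }
    }
}

/// Hypothetical typed parser: requires UTF-8, then defers to FromStr.
fn parse_u64(raw: &OsStr) -> Result<u64, String> {
    let text = raw.to_str().ok_or_else(|| "invalid UTF-8".to_owned())?;
    text.parse::<u64>().map_err(|e| e.to_string())
}

fn main() {
    let mut matches = RawMatches::default();
    matches.push("value", "7");
    println!("{:?}", matches.try_get_one("value", parse_u64));
}
```

Note that the parser in this sketch sees only the raw value. It has no handle on the command or the argument definition, which is the source of the error-message regression listed in the comparison with 3.2.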

Compared to the 3.0 solution

  • ✅ Simpler
  • ✅ Builder users have greater control over the parser used
  • ✅ We could deprecate all validators but Arg::possible_values
  • Error messages would be on-par with the basic solution

Compared to the 3.2 solution

  • ✅ Simpler
  • ArgAction::Count would support any type, unlike the 3.2 solution
  • value_parser would likely not be as featureful
  • value_parser wouldn't have access to look up ignore_case but would need its own independent ignore_case
  • ❌ We'd still need Arg::possible_values
  • ❌ Error messages would regress without access to the Command (for color support and knowing what help to suggest) and to the Arg (for showing the flag name or checking if ignore case is set, which is also used by the default/required rules)
    • Derive error messages for Parser would be slightly better because we would call err.format(&mut cmd) implicitly. If the user uses CommandFactory / FromArgMatches directly, they are on their own
    • We could take the hit of rendering the flag name for all args that are parsed. That is one render / allocation performance hit per arg definition used by the user
  • ✅ Less overhead per arg value
  • ✅ Less restrictive on traits used

Metadata

Labels

  • A-validators (Area: ArgMatches validation logic)
  • C-enhancement (Category: Raise on the bar on expectations)
  • M-breaking-change (Meta: Implementing or merging this will introduce a breaking change.)
  • S-waiting-on-decision (Status: Waiting on a go/no-go before implementing)
