Improve API for configuring a high number of arguments #45

aaronccasanova · 2022-01-23T22:20:07Z

I wanted to play devils advocate on the design of the parseArgs options object. Are there any concerns that the options object will become unwieldy as the number of arguments grow? While I am still catching up on prior discussions and understanding the target audience of this tool, I imagine creating large scale CLIs (e.g. aws-cli, git, docker, etc.) will be challenging to configure as configuring a single argument requires setting multiple top level keys (e.g. withValues, multiples, short).

I wanted to gauge folks opinion on allowing users to set all the configurations for a single argument in one object.

(Current) Example configuring large scale CLI options

parseArgs(argv, {
  withValue: [
    'foo',
    // +500 more values
  ],
  multiples: [
    'foo',
    // +500 more multiples
  ],
  short: {
    f: 'foo',
    // +500 more shorts
  },
})

(Proposed) Example configuring large scale CLI options

Keeping the same terminology established thus far.

parseArgs(argv, {
  args: {
    foo: {
      short: 'f',
      withValue: true,
      multiples: true,
    },
    // +500 more args
  },
})

The text was updated successfully, but these errors were encountered:

ljharb · 2022-01-23T22:22:12Z

That seems like a useful semver-minor addition in the future, when usage patterns are well-understood.

shadowspawn · 2022-01-23T22:57:36Z

Fair question. I think the initial focus is on keeping it simple, and large scale CLI is not a concern for now. (And a large scale program could repackage the one-object configuration fairly easily.)

aaronccasanova · 2022-01-24T03:03:35Z

I think the initial focus is on keeping it simple

Totally understand the desire to keep things simple. However, I don't believe the proposed suggestion adds complexity to the implementation or configuration. For example, I modified a few lines in the current implementation and enabled individual configuration objects: main...aaronccasanova:feat/improve-options-api

const argv = ['--f', 'foo1', '--foo', 'foo2', '--bar', 'baz'];

const parsed = parseArgs(argv, {
  args: {
    foo: {
      short: 'f',
      withValue: true,
      multiples: true,
    },
  },
});

console.log('parsed:', parsed);

large scale CLI is not a concern for now

Being that this is intended for Node core, I would have thought otherwise.

Appreciate the quick response and feedback!

ljharb · 2022-01-24T04:35:45Z

My concern about including it now isn't about the code complexity, it's about the API surface, and the design space.

shadowspawn · 2022-01-25T05:10:16Z

For background, the Intended Audience from the "Initial Proposal". (May or may not reflect what eventuates!)

nodejs/node#35015 (comment)

Intended Audience

It is already possible to build great arg parsing modules on top of what Node.js provides; the prickly API is abstracted away by these modules. Thus, process.parseArgs() is not necessarily intended for library authors; it is intended for developers of simple CLI tools, ad-hoc scripts, deployed Node.js applications, and learning materials.

It is exceedingly difficult to provide an API which would both be friendly to these Node.js users while being extensible enough for libraries to build upon. We chose to prioritize these use cases because these are currently not well-served by Node.js' API.

aaronccasanova · 2022-01-26T17:15:43Z

That's a helpful bit of context! Thanks for the reference @shadowspawn

bcoe · 2022-02-06T00:12:38Z

@aaronccasanova, I'm with @ljharb on erring on the side of a smaller API for MVP. With the goal initially being to support simple throw away CLIs and code samples.

I'm hopeful we might be able to target Node 18, or Node 19 with the MVP, so the tighter we can keep the API the better, hence trying to keep behavior constrained.

I think this work could immediate be a great follow up, and a good place to contribute to the Node.s codebase.

bakkot · 2022-02-06T03:54:28Z

@bcoe What about making this the only way to configure options? It seems more natural, to me: it means you're describing each argument in one place, rather than splitting up the configuration for each option across several fields. And if it's the only way to do it, it doesn't increase the API surface.

const parsed = parseArgs(argv, {
  args: {
    foo: {
      withValue: true,
    },
  },
});

does not seem noticeably harder to use than

const parsed = parseArgs(argv, {
  withValue: ['foo'],
});

and has the advantage of being a lot more natural as soon as you get at all beyond that:

const parsed = parseArgs(argv, {
  args: {
    foo: {
      withValue: true,
      multiple: true,
    },
    bar: {
      withValue: true,
    },
  },
});

is much easier to understand than

const parsed = parseArgs(argv, {
  withValue: ['foo', 'bar'],
  multiple: ['foo'],
});

since the latter requires you to scan multiple lists to figure out how foo is going to behave. More importantly, it also matches how people are likely to be accustomed to doing configuration: the properties for each field are set within that field, rather than having lists of fields-with-this-property.

shadowspawn · 2022-02-06T06:27:43Z

does not seem noticeably harder to use than

I think using nested objects is noticeably harder. The simplest configuration is a triply nested object. I think that is more challenging for a beginner to scan, or indeed type.

Smaller comments:

Multiples are going to be relatively rare. But there will still often be two properties in place as I think short options are pretty routine. So your comments about how to slice still apply.
I personally slightly prefer the objects from a purist point of view, but I still prefer the arrays from a pragmatic point of view for the initial target audience.

bakkot · 2022-02-06T07:27:01Z

I don't think "nested objects" as a concept is terribly confusing for most beginners. In my experience that's not the kind of complexity which usually trips people up when starting out. "These two things are implicitly related despite having no syntactic connection", on the other hand, definitely is.

So I think the above approach is better even from a strictly pragmatic point of view when optimizing for the experience of beginners.

That said, you can quite reasonably drop a level of indentation by having the second argument just be args, and having remaining options be in an optional third parameter. Then you have only doubly-nested objects, which is the same level of nesting as in the current API.

That is:

const parsed = parseArgs(argv, {
  foo: {
    withValue: true,
    multiple: true,
  },
  bar: {
    withValue: true,
  },
}, /* any other options */);

bcoe · 2022-02-06T13:12:23Z

What about making this the only way to configure options?

@shadowspawn @darcyclarke @ljharb, if we went with the nested object approach for configuration, I feel like strict becomes more natural, since all your valid arguments are defined in one place ... For flags, we could allow any empty object for configuration, or flag: true?

ljharb · 2022-02-06T15:23:01Z

I’m ambivalent on either of the two ways to configure, but i feel strongly that non-strict is a very bad default, so if this argument form comes with default-strict, I’m super on board.

aaronccasanova · 2022-02-06T15:43:43Z

I think using nested objects is noticeably harder.

I used a nested object structure for three reasons:

Allows for more top level configurations
Minimizes the chances of introducing breaking changes in future iterations
Easier to static type check

Allows for more top level configurations

e.g.

parseArgs(argv, {
  strict: true,
  ...moreTopLevelConfigs,
  args: {...}
})

Easier to static type check

e.g.

type ParseArgsOptions = {
  strict?: boolean
  args?: {
    [arg: string]: { // 👈 Isolated from top level configs
       short?: string
       withValue?: boolean
       multiples?: boolean
    }
  }
}

That said, you can quite reasonably drop a level of indentation by having the second argument just be args

We may not necessarily care about this here, but dropping a level of indentation will result in flaky types if we want to allow more top level configurations:

e.g. How do you accurately type this example:
TSPlayground - with nested object structure
TSPlayground - without nested object structure

parseArgs(argv, {
  strict: true,
  foo: {
    short: 'f',
    withValue: true,
  },
  bar: false, // Should error, not a valid arg value or top level config
})

// This would pass, but is no longer valid or representative of the implementation:
type ParseArgsOptions = {
  strict?: boolean
  [arg: string]: boolean | {
     short?: string
     withValue?: boolean
     multiples?: boolean
  }
}

bakkot · 2022-02-06T16:04:07Z

We may not necessarily care about this here, but dropping a level of indentation will result in flaky types if we want to allow more top level configurations:

The suggestion was that options like strict, which are not intended as CLI arguments but rather configure the parser itself, would go in a new third parameter. I agree it would be bad to mix those with arguments in the same object.

aaronccasanova · 2022-02-06T16:18:58Z

Ah yes, looking more closely that is what you proposed 👍 I think my only nit with that API is when you want to set parser configs and no args configs:

e.g.

parseArgs(argv, null, { requireEqualSign: true })

vs.

parseArgs(argv, {
  requireEqualSign: true,
})

Otherwise, the drop level config you proposed looks great!

shadowspawn · 2022-02-06T21:19:40Z

parseArgs(argv, null, { requireEqualSign: true })

This touches on something I had been wondering about, so I'll throw it into the mix! We have a default for the args input parameter which means it often won't be specified. If we do go strict by default, it might be better to make the input arguments the second parameter, or even a property on the configuration parameter, to avoid a clumsy invocation for the normal usage.

e.g.

parseArgs(myConfig);
parseArgs(myConfig, myArgs); // reordered
// or all properties on single parameter, easier to leave things out and named
parseArgs({
   withValue: ['foo'], // (existing property as example)
   strict: false,
   args: process.argv
});

shadowspawn · 2022-02-06T21:46:57Z

A nit. args has being used as the parent property name for the options in the previous examples, which seems wrong to me. (See previous comment where I deliberately used args for the unprocessed args.) I think options is the better name, albeit overloaded in general usage. Trying that out with one of the previous examples:

const parsed = parseArgs(argv, {
  options: {
    foo: {
      withValue: true,
    },
  },
});

(I raised the same naming issue previously in the context of the returned properties: #12)

aaronccasanova · 2022-02-06T22:18:57Z

@shadowspawn I'm a fan of both suggestions:

Consolidating input arguments and the configurations object
Changing the proposed args configurations property with options
e.g.

parseArgs({
  strict: true,
  args: process.argv,
  options: { foo: {...} },
})

bcoe · 2022-02-10T19:30:17Z

I like the suggestion of:

const parsed = parseArgs(argv, {
  options: {
    foo: {
      withValue: true,
    },
  },
});

Personally, for whatever reason ergonomically I like passing first the arguments that you're parsing, then the description of how the parse shoudl take place.

aaronccasanova · 2022-02-10T21:17:22Z

Personally, for whatever reason ergonomically I like passing first the arguments that you're parsing, then the description of how the parse should take place.

Me too! Although, while I have no data to back this up, I imagine the majority of users will leverage the default argv = getMainArgs(). As a result, we may see a lot of this out in the wild: parseArgs(undefined, { options }). Thoughts?

ljharb · 2022-02-10T21:21:10Z

I agree with both viewpoints - passing the args first is better, but being able to omit them is the many-nines use case.

Short of providing two APIs - one that takes required args, and one that defaults them - i'm not sure how to reconcile those.

bakkot · 2022-02-10T21:50:57Z

Personally I think it makes more sense for the thing-to-be-parsed to come after the specification-of-how-to-parse. (Compare, for example, a tool like ajv, where you build a parser and then apply it.) I'm having trouble wrapping my head around the other viewpoint; can someone say more about why they prefer that?

That said, I note that if you use an options bag, as in #63, there's no issues with ordering.

ljharb · 2022-02-10T23:24:55Z

"parse args" is a verb, "args" is the noun i want to parse, and "options" is how i want it parsed.

bakkot · 2022-02-10T23:47:35Z

I agree with that but don't understand why it implies anything about the arguments order. If anything it's the reverse: "use this specification to parse these args".

ljharb · 2022-02-10T23:48:39Z

Certainly one can form a sentence in either direction - "parse these args this way" versus "parse, this way, these args".

bcoe · 2022-02-11T13:40:40Z

@bakkot another argument I'd make for arguments and then options bag, is that it's what the community is accustomed to, as it's the approach taken by commander, yargs, minimist, args (worth noting I did find one exception meow).

I'm pretty convinced that the combined popularlity of yargs, minimist, commander, and args, has enough momentum, that if we switch the order on people the community would constantly be screwing up the argument ordering.

bakkot · 2022-02-11T16:05:12Z

I'm pretty convinced that the combined popularlity of yargs, minimist, commander, and args, has enough momentum, that if we switch the order on people the community would constantly be screwing up the argument ordering.

Are people constantly screwing up the argument ordering with meow? I would be kinda surprised to learn that was a common problem.

I also note that the way I've usually seen commander and yargs used is as yargs().variousBuilderMethods().parse(argv) or yargs(argv).variousBuilderMethods().parse(), where the variousBuilderMethods are functioning like the options bag parameter here. That's the style used by both projects in their introductions. That does not seem directly analogous to what we're discussing.

args

command-line-args is much more popular than args and takes the options bag approach (its first parameter is an options bag with an optional argv key).

bakkot · 2022-02-23T20:38:47Z

(I should mention that, while it seems more natural to me to have the configuration parameter first, I'm not all that worried either way. I care more about the general shape of the configuration discussed in this issue than about the precise order of the arguments.)

shadowspawn · 2022-02-23T21:32:50Z

another argument I'd make for arguments and then options bag, is that it's what the community is accustomed to, as it's the approach taken by commander, yargs, minimist, args (worth noting I did find one exception meow).

Short version: Commander only superficially follows the pattern, and I don't think is prior art supporting "arguments and then option bag" expectations.

Long version

For many years in Commander, .parse() had a single required argument which was the array of arguments to parse. Then the argument was made optional with a default of process.argv, and support was added to allow specifying the source of the arguments to override automatic decision about which ones to ignore. The only configuration in the option bag is how to interpret the array of arguments, and the expectation is in practice the options-bag is only used when the first argument is supplied.

program.parse(process.argv); // classic

program.parse(); // modern
program.parse(['--demo'], { from: 'user' }); // common pattern in unit tests for example

bcoe · 2022-02-27T14:53:15Z

I'm fine with @aaronccasanova approach of providing argv as an option on the config object, rather than taking two arguments, as we see in #63. With argv defaulting to process.argv -- this seems to side step disagreement about argument order.

I'm not seeing much argument in this thread against the spirit of #63, so I propose we land the work soon, since it seems to get us close to our goal of an MVP that we could put in Node 18.

Per the discussion in #45 this PR restructures the current options API where each option is configured in three separate list and instead allows options to be configured in a single object. The goal being to make the API more intuitive for configuring options (e.g. short, withValue, and multiples) while creating a path forward for introducing more configuration options in the future (e.g. default). Co-authored-by: Benjamin E. Coe <bencoe@google.com> Co-authored-by: John Gee <john@ruru.gen.nz> Co-authored-by: Jordan Harband <ljharb@gmail.com>

shadowspawn · 2022-03-02T23:03:44Z

Closing as #63 has landed.

shadowspawn mentioned this issue Jan 28, 2022

make it possible to require long options with values use = #52

Closed

bcoe added the enhancement New feature or request label Feb 6, 2022

aaronccasanova mentioned this issue Feb 9, 2022

refactor!: restructure configuration to take options bag #63

Merged

1 task

shadowspawn closed this as completed Mar 2, 2022

aaronccasanova mentioned this issue Mar 16, 2022

Restructure result conventions (again) #70

Closed

shadowspawn mentioned this issue Oct 28, 2022

Second argument (first in process.argv) is always ignored. #146

Closed

shadowspawn mentioned this issue Apr 9, 2023

parseArgs Type Definition nodejs/node#47487

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve API for configuring a high number of arguments #45

Improve API for configuring a high number of arguments #45

aaronccasanova commented Jan 23, 2022

ljharb commented Jan 23, 2022

shadowspawn commented Jan 23, 2022

aaronccasanova commented Jan 24, 2022

ljharb commented Jan 24, 2022

shadowspawn commented Jan 25, 2022

Intended Audience

aaronccasanova commented Jan 26, 2022

bcoe commented Feb 6, 2022 •

edited

Loading

bakkot commented Feb 6, 2022

shadowspawn commented Feb 6, 2022 •

edited

Loading

bakkot commented Feb 6, 2022 •

edited

Loading

bcoe commented Feb 6, 2022

ljharb commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022 •

edited

Loading

bakkot commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022

shadowspawn commented Feb 6, 2022

shadowspawn commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022 •

edited

Loading

bcoe commented Feb 10, 2022

aaronccasanova commented Feb 10, 2022

ljharb commented Feb 10, 2022

bakkot commented Feb 10, 2022

ljharb commented Feb 10, 2022

bakkot commented Feb 10, 2022

ljharb commented Feb 10, 2022

bcoe commented Feb 11, 2022 •

edited

Loading

bakkot commented Feb 11, 2022 •

edited

Loading

bakkot commented Feb 23, 2022

shadowspawn commented Feb 23, 2022

bcoe commented Feb 27, 2022 •

edited

Loading

shadowspawn commented Mar 2, 2022

Improve API for configuring a high number of arguments #45

Improve API for configuring a high number of arguments #45

Comments

aaronccasanova commented Jan 23, 2022

(Current) Example configuring large scale CLI options

(Proposed) Example configuring large scale CLI options

ljharb commented Jan 23, 2022

shadowspawn commented Jan 23, 2022

aaronccasanova commented Jan 24, 2022

ljharb commented Jan 24, 2022

shadowspawn commented Jan 25, 2022

Intended Audience

aaronccasanova commented Jan 26, 2022

bcoe commented Feb 6, 2022 • edited Loading

bakkot commented Feb 6, 2022

shadowspawn commented Feb 6, 2022 • edited Loading

bakkot commented Feb 6, 2022 • edited Loading

bcoe commented Feb 6, 2022

ljharb commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022 • edited Loading

bakkot commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022

shadowspawn commented Feb 6, 2022

shadowspawn commented Feb 6, 2022

aaronccasanova commented Feb 6, 2022 • edited Loading

bcoe commented Feb 10, 2022

aaronccasanova commented Feb 10, 2022

ljharb commented Feb 10, 2022

bakkot commented Feb 10, 2022

ljharb commented Feb 10, 2022

bakkot commented Feb 10, 2022

ljharb commented Feb 10, 2022

bcoe commented Feb 11, 2022 • edited Loading

bakkot commented Feb 11, 2022 • edited Loading

bakkot commented Feb 23, 2022

shadowspawn commented Feb 23, 2022

bcoe commented Feb 27, 2022 • edited Loading

shadowspawn commented Mar 2, 2022

bcoe commented Feb 6, 2022 •

edited

Loading

shadowspawn commented Feb 6, 2022 •

edited

Loading

bakkot commented Feb 6, 2022 •

edited

Loading

aaronccasanova commented Feb 6, 2022 •

edited

Loading

aaronccasanova commented Feb 6, 2022 •

edited

Loading

bcoe commented Feb 11, 2022 •

edited

Loading

bakkot commented Feb 11, 2022 •

edited

Loading

bcoe commented Feb 27, 2022 •

edited

Loading