Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve API for configuring a high number of arguments #45

Closed
aaronccasanova opened this issue Jan 23, 2022 · 31 comments
Closed

Improve API for configuring a high number of arguments #45

aaronccasanova opened this issue Jan 23, 2022 · 31 comments
Labels
enhancement New feature or request

Comments

@aaronccasanova
Copy link
Collaborator

I wanted to play devils advocate on the design of the parseArgs options object. Are there any concerns that the options object will become unwieldy as the number of arguments grow? While I am still catching up on prior discussions and understanding the target audience of this tool, I imagine creating large scale CLIs (e.g. aws-cli, git, docker, etc.) will be challenging to configure as configuring a single argument requires setting multiple top level keys (e.g. withValues, multiples, short).

I wanted to gauge folks opinion on allowing users to set all the configurations for a single argument in one object.

(Current) Example configuring large scale CLI options

parseArgs(argv, {
  withValue: [
    'foo',
    // +500 more values
  ],
  multiples: [
    'foo',
    // +500 more multiples
  ],
  short: {
    f: 'foo',
    // +500 more shorts
  },
})

(Proposed) Example configuring large scale CLI options

Keeping the same terminology established thus far.

parseArgs(argv, {
  args: {
    foo: {
      short: 'f',
      withValue: true,
      multiples: true,
    },
    // +500 more args
  },
})
@ljharb
Copy link
Member

ljharb commented Jan 23, 2022

That seems like a useful semver-minor addition in the future, when usage patterns are well-understood.

@shadowspawn
Copy link
Collaborator

Fair question. I think the initial focus is on keeping it simple, and large scale CLI is not a concern for now. (And a large scale program could repackage the one-object configuration fairly easily.)

@aaronccasanova
Copy link
Collaborator Author

I think the initial focus is on keeping it simple

Totally understand the desire to keep things simple. However, I don't believe the proposed suggestion adds complexity to the implementation or configuration. For example, I modified a few lines in the current implementation and enabled individual configuration objects: main...aaronccasanova:feat/improve-options-api

const argv = ['--f', 'foo1', '--foo', 'foo2', '--bar', 'baz'];

const parsed = parseArgs(argv, {
  args: {
    foo: {
      short: 'f',
      withValue: true,
      multiples: true,
    },
  },
});

console.log('parsed:', parsed);

Screen Shot 2022-01-23 at 6 35 38 PM

large scale CLI is not a concern for now

Being that this is intended for Node core, I would have thought otherwise.

Appreciate the quick response and feedback!

@ljharb
Copy link
Member

ljharb commented Jan 24, 2022

My concern about including it now isn't about the code complexity, it's about the API surface, and the design space.

@shadowspawn
Copy link
Collaborator

For background, the Intended Audience from the "Initial Proposal". (May or may not reflect what eventuates!)

nodejs/node#35015 (comment)

Intended Audience

It is already possible to build great arg parsing modules on top of what Node.js provides; the prickly API is abstracted away by these modules. Thus, process.parseArgs() is not necessarily intended for library authors; it is intended for developers of simple CLI tools, ad-hoc scripts, deployed Node.js applications, and learning materials.

It is exceedingly difficult to provide an API which would both be friendly to these Node.js users while being extensible enough for libraries to build upon. We chose to prioritize these use cases because these are currently not well-served by Node.js' API.

@aaronccasanova
Copy link
Collaborator Author

That's a helpful bit of context! Thanks for the reference @shadowspawn

@bcoe
Copy link
Collaborator

bcoe commented Feb 6, 2022

@aaronccasanova, I'm with @ljharb on erring on the side of a smaller API for MVP. With the goal initially being to support simple throw away CLIs and code samples.

I'm hopeful we might be able to target Node 18, or Node 19 with the MVP, so the tighter we can keep the API the better, hence trying to keep behavior constrained.

I think this work could immediate be a great follow up, and a good place to contribute to the Node.s codebase.

@bakkot
Copy link
Collaborator

bakkot commented Feb 6, 2022

@bcoe What about making this the only way to configure options? It seems more natural, to me: it means you're describing each argument in one place, rather than splitting up the configuration for each option across several fields. And if it's the only way to do it, it doesn't increase the API surface.

const parsed = parseArgs(argv, {
  args: {
    foo: {
      withValue: true,
    },
  },
});

does not seem noticeably harder to use than

const parsed = parseArgs(argv, {
  withValue: ['foo'],
});

and has the advantage of being a lot more natural as soon as you get at all beyond that:

const parsed = parseArgs(argv, {
  args: {
    foo: {
      withValue: true,
      multiple: true,
    },
    bar: {
      withValue: true,
    },
  },
});

is much easier to understand than

const parsed = parseArgs(argv, {
  withValue: ['foo', 'bar'],
  multiple: ['foo'],
});

since the latter requires you to scan multiple lists to figure out how foo is going to behave. More importantly, it also matches how people are likely to be accustomed to doing configuration: the properties for each field are set within that field, rather than having lists of fields-with-this-property.

@shadowspawn
Copy link
Collaborator

shadowspawn commented Feb 6, 2022

does not seem noticeably harder to use than

I think using nested objects is noticeably harder. The simplest configuration is a triply nested object. I think that is more challenging for a beginner to scan, or indeed type.

Smaller comments:

  • Multiples are going to be relatively rare. But there will still often be two properties in place as I think short options are pretty routine. So your comments about how to slice still apply.
  • I personally slightly prefer the objects from a purist point of view, but I still prefer the arrays from a pragmatic point of view for the initial target audience.

@bakkot
Copy link
Collaborator

bakkot commented Feb 6, 2022

I don't think "nested objects" as a concept is terribly confusing for most beginners. In my experience that's not the kind of complexity which usually trips people up when starting out. "These two things are implicitly related despite having no syntactic connection", on the other hand, definitely is.

So I think the above approach is better even from a strictly pragmatic point of view when optimizing for the experience of beginners.

That said, you can quite reasonably drop a level of indentation by having the second argument just be args, and having remaining options be in an optional third parameter. Then you have only doubly-nested objects, which is the same level of nesting as in the current API.

That is:

const parsed = parseArgs(argv, {
  foo: {
    withValue: true,
    multiple: true,
  },
  bar: {
    withValue: true,
  },
}, /* any other options */);

@bcoe
Copy link
Collaborator

bcoe commented Feb 6, 2022

What about making this the only way to configure options?

@shadowspawn @darcyclarke @ljharb, if we went with the nested object approach for configuration, I feel like strict becomes more natural, since all your valid arguments are defined in one place ... For flags, we could allow any empty object for configuration, or flag: true?

@ljharb
Copy link
Member

ljharb commented Feb 6, 2022

I’m ambivalent on either of the two ways to configure, but i feel strongly that non-strict is a very bad default, so if this argument form comes with default-strict, I’m super on board.

@aaronccasanova
Copy link
Collaborator Author

aaronccasanova commented Feb 6, 2022

I think using nested objects is noticeably harder.

I used a nested object structure for three reasons:

  1. Allows for more top level configurations
  2. Minimizes the chances of introducing breaking changes in future iterations
  3. Easier to static type check
  1. Allows for more top level configurations

e.g.

parseArgs(argv, {
  strict: true,
  ...moreTopLevelConfigs,
  args: {...}
})
  1. Easier to static type check

e.g.

type ParseArgsOptions = {
  strict?: boolean
  args?: {
    [arg: string]: { // 👈 Isolated from top level configs
       short?: string
       withValue?: boolean
       multiples?: boolean
    }
  }
}

That said, you can quite reasonably drop a level of indentation by having the second argument just be args

We may not necessarily care about this here, but dropping a level of indentation will result in flaky types if we want to allow more top level configurations:

e.g. How do you accurately type this example:
TSPlayground - with nested object structure
TSPlayground - without nested object structure

parseArgs(argv, {
  strict: true,
  foo: {
    short: 'f',
    withValue: true,
  },
  bar: false, // Should error, not a valid arg value or top level config
})

// This would pass, but is no longer valid or representative of the implementation:
type ParseArgsOptions = {
  strict?: boolean
  [arg: string]: boolean | {
     short?: string
     withValue?: boolean
     multiples?: boolean
  }
}

@bakkot
Copy link
Collaborator

bakkot commented Feb 6, 2022

We may not necessarily care about this here, but dropping a level of indentation will result in flaky types if we want to allow more top level configurations:

The suggestion was that options like strict, which are not intended as CLI arguments but rather configure the parser itself, would go in a new third parameter. I agree it would be bad to mix those with arguments in the same object.

@aaronccasanova
Copy link
Collaborator Author

Ah yes, looking more closely that is what you proposed 👍 I think my only nit with that API is when you want to set parser configs and no args configs:

e.g.

parseArgs(argv, null, { requireEqualSign: true })

vs.

parseArgs(argv, {
  requireEqualSign: true,
})

Otherwise, the drop level config you proposed looks great!

@shadowspawn
Copy link
Collaborator

parseArgs(argv, null, { requireEqualSign: true })

This touches on something I had been wondering about, so I'll throw it into the mix! We have a default for the args input parameter which means it often won't be specified. If we do go strict by default, it might be better to make the input arguments the second parameter, or even a property on the configuration parameter, to avoid a clumsy invocation for the normal usage.

e.g.

parseArgs(myConfig);
parseArgs(myConfig, myArgs); // reordered
// or all properties on single parameter, easier to leave things out and named
parseArgs({
   withValue: ['foo'], // (existing property as example)
   strict: false,
   args: process.argv
});

@shadowspawn
Copy link
Collaborator

A nit. args has being used as the parent property name for the options in the previous examples, which seems wrong to me. (See previous comment where I deliberately used args for the unprocessed args.) I think options is the better name, albeit overloaded in general usage. Trying that out with one of the previous examples:

const parsed = parseArgs(argv, {
  options: {
    foo: {
      withValue: true,
    },
  },
});

(I raised the same naming issue previously in the context of the returned properties: #12)

@aaronccasanova
Copy link
Collaborator Author

aaronccasanova commented Feb 6, 2022

@shadowspawn I'm a fan of both suggestions:

  • Consolidating input arguments and the configurations object
  • Changing the proposed args configurations property with options
    e.g.
parseArgs({
  strict: true,
  args: process.argv,
  options: { foo: {...} },
})

@bcoe
Copy link
Collaborator

bcoe commented Feb 10, 2022

I like the suggestion of:

const parsed = parseArgs(argv, {
  options: {
    foo: {
      withValue: true,
    },
  },
});

Personally, for whatever reason ergonomically I like passing first the arguments that you're parsing, then the description of how the parse shoudl take place.

@aaronccasanova
Copy link
Collaborator Author

Personally, for whatever reason ergonomically I like passing first the arguments that you're parsing, then the description of how the parse should take place.

Me too! Although, while I have no data to back this up, I imagine the majority of users will leverage the default argv = getMainArgs(). As a result, we may see a lot of this out in the wild: parseArgs(undefined, { options }). Thoughts?

@ljharb
Copy link
Member

ljharb commented Feb 10, 2022

I agree with both viewpoints - passing the args first is better, but being able to omit them is the many-nines use case.

Short of providing two APIs - one that takes required args, and one that defaults them - i'm not sure how to reconcile those.

@bakkot
Copy link
Collaborator

bakkot commented Feb 10, 2022

Personally I think it makes more sense for the thing-to-be-parsed to come after the specification-of-how-to-parse. (Compare, for example, a tool like ajv, where you build a parser and then apply it.) I'm having trouble wrapping my head around the other viewpoint; can someone say more about why they prefer that?

That said, I note that if you use an options bag, as in #63, there's no issues with ordering.

@ljharb
Copy link
Member

ljharb commented Feb 10, 2022

"parse args" is a verb, "args" is the noun i want to parse, and "options" is how i want it parsed.

@bakkot
Copy link
Collaborator

bakkot commented Feb 10, 2022

I agree with that but don't understand why it implies anything about the arguments order. If anything it's the reverse: "use this specification to parse these args".

@ljharb
Copy link
Member

ljharb commented Feb 10, 2022

Certainly one can form a sentence in either direction - "parse these args this way" versus "parse, this way, these args".

@bcoe
Copy link
Collaborator

bcoe commented Feb 11, 2022

@bakkot another argument I'd make for arguments and then options bag, is that it's what the community is accustomed to, as it's the approach taken by commander, yargs, minimist, args (worth noting I did find one exception meow).

I'm pretty convinced that the combined popularlity of yargs, minimist, commander, and args, has enough momentum, that if we switch the order on people the community would constantly be screwing up the argument ordering.

@bakkot
Copy link
Collaborator

bakkot commented Feb 11, 2022

I'm pretty convinced that the combined popularlity of yargs, minimist, commander, and args, has enough momentum, that if we switch the order on people the community would constantly be screwing up the argument ordering.

Are people constantly screwing up the argument ordering with meow? I would be kinda surprised to learn that was a common problem.

I also note that the way I've usually seen commander and yargs used is as yargs().variousBuilderMethods().parse(argv) or yargs(argv).variousBuilderMethods().parse(), where the variousBuilderMethods are functioning like the options bag parameter here. That's the style used by both projects in their introductions. That does not seem directly analogous to what we're discussing.

args

command-line-args is much more popular than args and takes the options bag approach (its first parameter is an options bag with an optional argv key).

@bakkot
Copy link
Collaborator

bakkot commented Feb 23, 2022

(I should mention that, while it seems more natural to me to have the configuration parameter first, I'm not all that worried either way. I care more about the general shape of the configuration discussed in this issue than about the precise order of the arguments.)

@shadowspawn
Copy link
Collaborator

another argument I'd make for arguments and then options bag, is that it's what the community is accustomed to, as it's the approach taken by commander, yargs, minimist, args (worth noting I did find one exception meow).

Short version: Commander only superficially follows the pattern, and I don't think is prior art supporting "arguments and then option bag" expectations.

Long version

For many years in Commander, .parse() had a single required argument which was the array of arguments to parse. Then the argument was made optional with a default of process.argv, and support was added to allow specifying the source of the arguments to override automatic decision about which ones to ignore. The only configuration in the option bag is how to interpret the array of arguments, and the expectation is in practice the options-bag is only used when the first argument is supplied.

program.parse(process.argv); // classic

program.parse(); // modern
program.parse(['--demo'], { from: 'user' }); // common pattern in unit tests for example

@bcoe
Copy link
Collaborator

bcoe commented Feb 27, 2022

I'm fine with @aaronccasanova approach of providing argv as an option on the config object, rather than taking two arguments, as we see in #63. With argv defaulting to process.argv -- this seems to side step disagreement about argument order.


I'm not seeing much argument in this thread against the spirit of #63, so I propose we land the work soon, since it seems to get us close to our goal of an MVP that we could put in Node 18.

shadowspawn added a commit that referenced this issue Mar 2, 2022
Per the discussion in #45 this PR restructures the current options API where each option is configured in three separate list and instead allows options to be configured in a single object.

The goal being to make the API more intuitive for configuring options (e.g. short, withValue, and multiples) while creating a path forward for introducing more configuration options in the future (e.g. default).

Co-authored-by: Benjamin E. Coe <bencoe@google.com>
Co-authored-by: John Gee <john@ruru.gen.nz>
Co-authored-by: Jordan Harband <ljharb@gmail.com>
@shadowspawn
Copy link
Collaborator

Closing as #63 has landed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants