util: add tokens to parseArgs #43459

Merged
merged 13 commits into from
Jul 18, 2022
123 changes: 121 additions & 2 deletions doc/api/util.md
@@ -1028,6 +1028,11 @@ equality.

<!-- YAML
added: v18.3.0
changes:
- version: REPLACEME
pr-url: https://github.com/nodejs/node/pull/43459
description: add support for returning detailed parse information
using `tokens` in input `config` and returned properties.
-->

> Stability: 1 - Experimental
@@ -1044,18 +1049,24 @@ added: v18.3.0
times. If `true`, all values will be collected in an array. If
`false`, values for the option are last-wins. **Default:** `false`.
* `short` {string} A single character alias for the option.
* `strict`: {boolean} Should an error be thrown when unknown arguments
* `strict` {boolean} Should an error be thrown when unknown arguments
are encountered, or when arguments are passed that do not match the
`type` configured in `options`.
**Default:** `true`.
* `allowPositionals`: {boolean} Whether this command accepts positional
* `allowPositionals` {boolean} Whether this command accepts positional
arguments.
**Default:** `false` if `strict` is `true`, otherwise `true`.
* `tokens` {boolean} Return the parsed tokens. This is useful for extending
the built-in behavior, from adding additional checks through to reprocessing
the tokens in different ways.
**Default:** `false`.

* Returns: {Object} The parsed command line arguments:
* `values` {Object} A mapping of parsed option names with their {string}
or {boolean} values.
* `positionals` {string\[]} Positional arguments.
* `tokens` {Object\[] | undefined} See [parseArgs tokens](#parseargs-tokens)
section. Only returned if `config` includes `tokens: true`.
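
As a minimal sketch of the `tokens` setting described above, the following compares the return value with and without `tokens: true`. The `verbose` option and the explicit `args` array are hypothetical, chosen only for illustration.

```cjs
const { parseArgs } = require('node:util');

// Hypothetical option and arguments, used only for this sketch.
const options = { verbose: { type: 'boolean', short: 'v' } };
const args = ['-v'];

const withoutTokens = parseArgs({ args, options });
const withTokens = parseArgs({ args, options, tokens: true });

// The `tokens` property is only present when it was requested.
console.log(withoutTokens.tokens); // undefined
console.log(withTokens.tokens.length); // 1
```

Passing `args` explicitly keeps the sketch independent of `process.argv`.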

Provides a higher level API for command-line argument parsing than interacting
with `process.argv` directly. Takes a specification for the expected arguments
@@ -1104,6 +1115,114 @@ console.log(values, positionals);
`util.parseArgs` is experimental and behavior may change. Join the
conversation in [pkgjs/parseargs][] to contribute to the design.

### `parseArgs` `tokens`

Detailed parse information is available for adding custom behaviours by
specifying `tokens: true` in the configuration.
The returned tokens have properties describing:

* all tokens
Contributor

Just a thought, I would be tempted to make tokens a class rather than a POJO with a kind property:

class Token {}
class OptionTerminator extends Token {}
class PositionalToken extends Token {}

With the classes exposed on the parseArgs export:

const {Token, PositionalToken} = parseArgs

Then I think you could just document the tokens as classes, similar to Buffer.blob, nested under the top level parseArgs docs, similar to how the docs are laid out for the JavaScript embedder API.

Contributor

Even if these are made classes they should definitely expose a kind property - it's really awkward to work with tree nodes which lack such a property, which is why pretty much every parser in the world has them. (Similarly, the DOM has nodeName in addition to nodes being instances of distinct classes.)

Contributor

> Even if these are made classes they should definitely expose a kind property

Sounds good to me. I think it might be worth the class based approach, so that people could opt to check instanceof, if they're so inclined?

Should we consider making kind a Symbol, or is a string preferable?

Contributor

Strings are traditional - they're a lot easier to work with, since you don't have to import them.

Member Author

> Just a thought, I would be tempted to make tokens a class rather than a POJO

It did cross my mind as soon as I added the kind property. 😄

The examples of Blob and AsyncResource and AsyncLocalStorage are all objects that authors will create themselves, with methods and state, where I think a Class adds more value.

I'm not against classes as such, but not seeing benefits in this case.

Contributor

My 2 cents: we should go with the first one and make it less verbose; the "existing" version can be a bit confusing, I think. But feel free to disagree; no option is perfect anyway, and we can always edit the docs later if we find a better layout.

Member

I definitely find the existing one massively clearer; muddling things up with unnecessary TS/Java-ish terms and structure makes things more confusing.

Member Author

Mixed opinions straight away! Thanks both for feedback.

I will leave it as is, barring further feedback.

Contributor

I like that we could map the interface based approach directly to the TypeScript types we release on DefinitelyTyped.

But I don't feel super strongly, and would rather not block progress on the API (as @aduh95 says, we can edit the docs later).

Reapproving this PR 👍, @aduh95 are you happy as is?

Member Author

The compact POJO description is a bit subtle to read. How about an expanded version with all the properties listed?

Proposed expanded

A returned token has two properties which are always defined,
and some other properties which vary depending on the kind:

  • kind {string} One of 'option', 'positional', or 'option-terminator'.
  • index {number} Index of element in args containing token. So the
    source argument for a token is args[token.index].

An option token has additional parse details
for an option detected in the input args:

  • kind = 'option'
  • index {number} Index of element in args containing token.
  • name {string} Long name of option.
  • rawName {string} How option used in args, like -f of --foo.
  • value {string | undefined} Option value specified in args.
    Undefined for boolean options.
  • inlineValue {boolean | undefined} Whether option value specified inline,
    like --foo=bar.

A positional token has just one additional property with the positional value:

  • kind = 'positional'
  • index {number} Index of element in args containing token.
  • value {string} The value of the positional argument in args (i.e. args[index]).

An option-terminator token has only the base properties:

  • kind = 'option-terminator'
  • index {number} Index of element in args containing token.

Old compact

The returned tokens have properties describing:

  • all tokens
    • kind {string} One of 'option', 'positional', or 'option-terminator'.
    • index {number} Index of element in args containing token. So the
      source argument for a token is args[token.index].
  • option tokens
    • name {string} Long name of option.
    • rawName {string} How option used in args, like -f of --foo.
    • value {string | undefined} Option value specified in args.
      Undefined for boolean options.
    • inlineValue {boolean | undefined} Whether option value specified inline,
      like --foo=bar.
  • positional tokens
    • value {string} The value of the positional argument in args (i.e. args[index]).
  • option-terminator token

(Duplicate post of: pkgjs/parseargs#129 (comment))

  * `kind` {string} One of 'option', 'positional', or 'option-terminator'.
  * `index` {number} Index of element in `args` containing token. So the
    source argument for a token is `args[token.index]`.
* option tokens
  * `name` {string} Long name of option.
  * `rawName` {string} How option used in args, like `-f` or `--foo`.
  * `value` {string | undefined} Option value specified in args.
    Undefined for boolean options.
  * `inlineValue` {boolean | undefined} Whether option value specified inline,
    like `--foo=bar`.
* positional tokens
  * `value` {string} The value of the positional argument in args (i.e. `args[index]`).
* option-terminator token
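
The token properties listed above can be explored with a short sketch that branches on `kind`. The option names, the `args` array, and the `describeToken` helper are hypothetical and exist only for illustration; this is one possible way to consume tokens, not a prescribed pattern.

```cjs
const { parseArgs } = require('node:util');

// Hypothetical option names and arguments, chosen only for illustration.
const options = {
  output: { type: 'string', short: 'o' },
  force: { type: 'boolean', short: 'f' },
};
const args = ['-f', '--output=out.txt', 'input.txt', '--', '--not-an-option'];

const { tokens } = parseArgs({
  args,
  options,
  allowPositionals: true,
  tokens: true,
});

// Hypothetical helper: describe a token using only the properties
// documented for its kind.
function describeToken(token) {
  const source = args[token.index];
  switch (token.kind) {
    case 'option':
      return `option ${token.name} (written as ${token.rawName}) in ${source}`;
    case 'positional':
      return `positional ${token.value}`;
    case 'option-terminator':
      return `option terminator at index ${token.index}`;
    default:
      return `unrecognized kind ${token.kind}`;
  }
}

const descriptions = tokens.map(describeToken);
console.log(descriptions);
```

Because `kind` is a plain string, the `switch` needs no imports beyond `parseArgs` itself.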

The returned tokens are in the order encountered in the input args. Options
that appear more than once in args produce a token for each use. Short option
groups like `-xy` expand to a token for each option. So `-xxx` produces
three tokens.
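
A small sketch of the ordering and expansion rules just described, assuming two hypothetical boolean options (`extra` and `yes`) with short aliases `-x` and `-y`:

```cjs
const { parseArgs } = require('node:util');

// Hypothetical boolean options with short aliases, for illustration only.
const options = {
  extra: { type: 'boolean', short: 'x' },
  yes: { type: 'boolean', short: 'y' },
};
const args = ['-xy', '-xxx', '-y'];

const { tokens } = parseArgs({ args, options, tokens: true });

// One token per option use, in the order encountered in args.
const rawNames = tokens.map((token) => token.rawName);

// Count how many times each option was used.
const uses = {};
for (const token of tokens) {
  uses[token.name] = (uses[token.name] ?? 0) + 1;
}

console.log(rawNames);
console.log(uses);
```

Every token expanded from a short option group shares the `index` of the group it came from, so `args[token.index]` still recovers the original argument such as `-xxx`.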

For example, to add support for a negated option like `--no-color`, the
returned tokens can be reprocessed to change the value stored for the
negated option.

```mjs
import { parseArgs } from 'node:util';

const options = {
  'color': { type: 'boolean' },
  'no-color': { type: 'boolean' },
  'logfile': { type: 'string' },
  'no-logfile': { type: 'boolean' },
};
const { values, tokens } = parseArgs({ options, tokens: true });

// Reprocess the option tokens and overwrite the returned values.
tokens
  .filter((token) => token.kind === 'option')
  .forEach((token) => {
    if (token.name.startsWith('no-')) {
      // Store foo:false for --no-foo
      const positiveName = token.name.slice(3);
      values[positiveName] = false;
      delete values[token.name];
    } else {
      // Resave value so last one wins if both --foo and --no-foo.
      values[token.name] = token.value ?? true;
    }
  });

const color = values.color;
const logfile = values.logfile ?? 'default.log';

console.log({ logfile, color });
```

```cjs
const { parseArgs } = require('node:util');

const options = {
  'color': { type: 'boolean' },
  'no-color': { type: 'boolean' },
  'logfile': { type: 'string' },
  'no-logfile': { type: 'boolean' },
};
const { values, tokens } = parseArgs({ options, tokens: true });

// Reprocess the option tokens and overwrite the returned values.
tokens
  .filter((token) => token.kind === 'option')
  .forEach((token) => {
    if (token.name.startsWith('no-')) {
      // Store foo:false for --no-foo
      const positiveName = token.name.slice(3);
      values[positiveName] = false;
      delete values[token.name];
    } else {
      // Resave value so last one wins if both --foo and --no-foo.
      values[token.name] = token.value ?? true;
    }
  });

const color = values.color;
const logfile = values.logfile ?? 'default.log';

console.log({ logfile, color });
```

Example usage showing negated options; when an option is used in
multiple ways, the last one wins.

```console
$ node negate.js
{ logfile: 'default.log', color: undefined }
$ node negate.js --no-logfile --no-color
{ logfile: false, color: false }
$ node negate.js --logfile=test.log --color
{ logfile: 'test.log', color: true }
$ node negate.js --no-logfile --logfile=test.log --color --no-color
{ logfile: 'test.log', color: false }
```
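
Tokens can also back the "additional checks" use case mentioned for the `tokens` setting. The following is a minimal sketch, assuming a hypothetical `findRepeatedOptions` helper and a hypothetical policy of flagging options that are given more than once; neither is part of `util.parseArgs`.

```cjs
const { parseArgs } = require('node:util');

// Hypothetical helper (not part of node:util): returns the names of
// options that appear more than once in the parsed tokens.
function findRepeatedOptions(tokens) {
  const seen = new Set();
  const repeated = new Set();
  for (const token of tokens) {
    if (token.kind !== 'option') continue;
    if (seen.has(token.name)) repeated.add(token.name);
    seen.add(token.name);
  }
  return [...repeated];
}

const options = {
  logfile: { type: 'string' },
  color: { type: 'boolean' },
};
// Explicit args so the sketch does not depend on process.argv.
const args = ['--logfile=a.log', '--color', '--logfile', 'b.log'];

const { values, tokens } = parseArgs({ args, options, tokens: true });
const repeated = findRepeatedOptions(tokens);

// Without `multiple: true` the last value wins in `values`, while the
// tokens still record every use.
console.log(values.logfile);
console.log(repeated);
```

A caller could turn a non-empty `repeated` list into a warning or an error, depending on how strict the command line should be.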

## `util.promisify(original)`

<!-- YAML