Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge documentation of "command-line" and "configuration" options (Issue #469) #473

Merged
merged 47 commits into from
Jan 8, 2017

Conversation

eric-brechemier
Copy link
Contributor

I have generalized the SYNOPSIS and the description of OPTIONS to cover both "standard" options and "extended" configuration options in the same sections of the documentation.

I have modified the XSLT transformation which generates the man page. I have also added two temporary files, the generated man page (source) and the formatted text (for display in terminal) generated by groff.

The latter is best viewed in a terminal, e.g. using less tidy.1.text to view the file, or git log -p tidy.1.txt to review the effect of individual commits on the resulting document.

This is a work in progress submitted for preliminary review and discussion.

Before this branch is merged, I intend to simplify the description of each configuration option to avoid repeating information, make entries shorter and similar in format to the description of standard options. I will also delete the temporary files once the review has been completed.

Eric Bréchemier added 24 commits December 17, 2016 18:05
I am adding the file to the git repository to track and review
the changes to this generated file. I will then update the XSLT
transformation which produces this file to remove duplicate sections.
As a first step, I will stop outputting duplicate sections; I will
then merge them into existing sections. I will commit the changes
to the generated file at each step.

Related issue: htacg#469
The rendering to text was done with following command:

  /usr/bin/groff -Tascii -mandoc -c tidy.1

This format should make the review of differences more readable.

Related issue: htacg#469
The new SYNOPSIS expresses the fact that multiple files can
be provided as argument, and that options and files can be mixed
(options apply only to the files specified after, not the ones before).

It does not explain that there are actually two types of options; this
shall be detailed afterwards: simple options (aka standard options) start
with single dash while configuration options start with a double dash.
Only the latter can be defined in configuration files, using their name
without the double dash.

I have also reformatted the terms 'options' and 'file' to be underlined,
to follow conventions that I observed in other man pages (ls, grep, wget...)

Related issue: htacg#469
This is an intermediate step before adapting the text to its new
location. I will probably start the section with a paragraph to
introduce the two different kinds of options. Then describe the
"standard" options in more details. Then list the standard options.
Then describe the configuration options in more details. Then list
the configuration options, using a format similar to the one used
for standard options.

Related issue: htacg#469
The section now starts with a description of both types of options,
and explains that the first part of the section concerns with the
"standard" options while the second part of the section concerns with
the "expanded" options.

More details are provided about "standard" options, which are then
listed individually.

More details are then provided about "expanded" options and their
usage on the command line and in configuration files. The configuration
options are not listed yet. In order to avoid repeating a lot of
information with every separate configuration option, I will first
describe common values and formats; I will then describe each option
more succinctly, like "standard" options.

Related issue: htacg#469
The fact that the input file defaults to standard input
and the output file to standard output is already indicated
in the DESCRIPTION section. This was the only information
left in this section at this point.

Related issue: htacg#469
The line used to separate "standard" usage from "extended" usage.
Both forms are now integrated in the common description of OPTIONS.

Related issue: htacg#469
The detailed configuration options are now described together
with standard options in a common OPTIONS section.

Related issue: htacg#469
A single generalized SYNOPSIS now encompasses both kinds of options.

Related issue: htacg#469
The WARNING referred to a separate section for the description
of "standard" options. They are now described in the same OPTIONS
section as "extended" options.

Related issue: htacg#469
Just before listing all the configuration options, this is the
expected place to describe the "extended" options in more details.
The description was already worded as an introduction to the list
of configuration options. I will update this description after having
compacted entries which describe individual configuration options.

Related issue: htacg#469
This section has been merged into the generalized OPTIONS section.

Related issue: htacg#469
This list is very long, with lots of duplicate information
repeated for entries of the same type. The description of
configuration options should be compacted to match as closely
as possible the description of "standard" options.

Related issue: htacg#469
I contained the list of configuration options, which is now included
at the end of the generalized OPTIONS section.

Related issue: htacg#469
The template was now empty. Its contents have been merged
into the cmdline-section template.

Related issue: htacg#469
The sentence listed the five categories of configuration options.
This kind of made sense when the options were listed in the following
section. Now that they are listed just below, it has become redundant.

Related issue: htacg#469
The categories of "standard" options do not end with a colon;
no title does actually.

Related issue: htacg#469
For consistency with usage, sentences within paragraphs shall be
separated by a double space rather than a single space. This was
done in most places in the document, with only a few places missing.

Related issue: htacg#469
The comment refers to cmdline section at the start of the processing
of configuration options. The cmdline options are opposed to
config options in the context of this transformation. They are
provided through two separate XML input files.

Related issue: htacg#469
@geoffmcl
Copy link
Contributor

@eric-brechemier have reviewed this WIP, and it is looking great... some little points...

SYNOPSIS

I like you have reduced this to one(1) single simple line, as it should be... but maybe it should read like -

SYNOPSIS
   tidy [options] [file[ ...]] [[options] [file[ ...] ...]

But maybe the additional square braces are too much, too pedantic... what do you think?

And in fact maybe it could be reduced to just tidy [options] [file1[ file2[ ...]]], and the fact that this [options] [file(s)] can be repeated be only mentioned say in the description. Or not at all...

I have never really used, or tested, this multiple [options file], since as far as I can see, if tidy exits 1, you would not know, without actually reading the output, which file caused this warning... but assume it works... but what really is the use case?

In any batch, or scripted processing, I can understand having one set of options applied to a big list of files, although again, you do not know which actual file caused the exit value... but to then change the options for another file, or files, does not seem to make much sense...

But ok, maybe this is not the place or time to discuss this quite unique multiple options file(s) that tidy offers... but I would be ok with not suggesting, sort of promoting this...

As stated, I am happy with the single line, with, or without, the addition square braces, as it is...

OPTIONS

Again it is great that you have added a good description, clearly pointing out tidy has two types, -, and --. I suppose some might argue that in *nix systems two dashes, --, is more the 'regular', or 'standard', but we can wear that semantic difference... and your description makes the intention clear...

And I like that, for most single dash, -, you have added a double dash, --, form, which can then be put in a config file...

And maybe it would be too much if each double -- option mentioned a - option, where it exists... like say clean with a See also: -c... that is each is fully crossed referenced... but as I say, maybe too much?

Temporary Files

Yes, thanks for making this effort, to be able to read them in a non-nix system...

But in *nix I found I could just compile tidy in say build/cmake, and view the results with man -l tidy.1, without doing any install...

And I certainly hope others, espceially @jidanni, can do this, and offer his comments, since he opened #469. I do understand that he wants to use only released tidy in production, but as stated this great new man page change does not need to be installed - just in a few mintues -

$ cd some/tmp/work/dir
$ git clone git@github.com:htacg/tidy-html5.git
$ cd tidy-html5/build/cmake
$ cmake ../..
$ make
$ man -l tidy.1

So I think these 'temporary' files can be remove, but as stated, thanks for the thought...

Multiple, multiple commits

While I have no real problem with this as you were developing the change, but hopefully the final PR is essentially only one commit, since as far as I can see it only effects one file, man/tidy.xsl.in...

You would probably need copy your final modified tidy.xsl.in to say new-tidy.xsl.in... back up to master... make sure it is rebased to the current master... create a new branch, say man-final... overwrite tidy.xsl.in with your new... maybe commit with a multiline commit message... and push... otherwise I understand github just adds each new commit to this existing WIP PR...

But, as stated, not really a problem...

Other possible changes

  1. Yes, we are always looking for better, more concise, suscinct descriptions, but these each can be raised as separate issues...

  2. I wish we could remove internal only option, like gnu-emacs-file. Maybe these should not appear in any docs, even with a comment Used internally.. See TidyDoctypeMode option documentation missing #472 for some discussion on others like this... But again these could be dealt with as separate issues.

So, as stated, this is a sterling effort... look forward to the final... thanks...

@eric-brechemier
Copy link
Contributor Author

SYNOPSIS

But maybe the additional square braces are too much, too pedantic... what do you think?

Maybe not pedantic, but definitely confusing in a lispy way. I'll think about it.

And in fact maybe it could be reduced to just tidy [options] [file1[ file2[ ...]]], and the fact that this [options] [file(s)] can be repeated be only mentioned say in the description. Or not at all...

This is the option taken by man curl:

curl [options] [URL...]

but then they have a special parameter to separate options that are related to one URL from the next:

-:, --next
              Tells curl to use a separate operation for the following URL and
              associated  options.  This  allows  you  to  send  several   URL
              requests,  each  with  their  own specific options, for example,
              such as different user names or custom requests for each. (Added
              in 7.36.0)

while in the case of tidy, an option applies to all the following files.

In any batch, or scripted processing, I can understand having one set of options applied to a big list of files, although again, you do not know which actual file caused the exit value... but to then change the options for another file, or files, does not seem to make much sense...

It may be useful when using a wildcard from the shell: *.html is expanded to the list of files that matches the pattern. Also, I remember a particular time when we had different files with different encodings, and used to include a hint for the encoding of the file in the file name, such as .utf8.html or .latin1.html. We could apply tidy to all these files in a single run with -latin1 *.latin1.html -utf8 *.utf8.html.

@eric-brechemier
Copy link
Contributor Author

OPTIONS

And maybe it would be too much if each double -- option mentioned a - option, where it exists... like say clean with a See also: -c... that is each is fully crossed referenced... but as I say, maybe too much?

Ah, good point. Some options are available in both flavors. That deserves at least a mention when it is the case. I will look into it.

@eric-brechemier
Copy link
Contributor Author

eric-brechemier commented Dec 20, 2016

Temporary Files

Yes, thanks for making this effort, to be able to read them in a non-nix system...

But in *nix I found I could just compile tidy in say build/cmake, and view the results with man -l tidy.1, without doing any install...

I looked long and large for this option, which is simply not available in macOS's man :( I am glad to know that it exists though.

Multiple, multiple commits

While I have no real problem with this as you were developing the change, but hopefully the final PR is essentially only one commit, since as far as I can see it only effects one file, man/tidy.xsl.in...

Sure, I will squash the commits into one, leaving no trace of the temporary files. Apparently, it is also possible to do it directly from GitHub interface when you merge the branch.

Eric Bréchemier added 3 commits January 4, 2017 08:43
This puts it in the position expected on the command line.

Related issue: htacg#469
I tried different formats for the default value:

  --clean Boolean:no
  --clean Boolean[no]

and more formats after I realized that the 'default' value is
not applied when the value is omitted, but when the option is
not used at all:

  --clean Boolean (initially: no)
  --clean Boolean (unset: no)

I selected the less confusing format:

  --clean Boolean (no if unset)

which is self-explanatory.

Related issue: htacg#469
For example, using --clean without a value is not equivalent to
using -clean option:

  curl -s https://www.google.com | tidy --clean 2>&1 1>/dev/null | head -n 1

results in:

  Config: missing or malformed argument for option: clean

Related issue: htacg#469
@eric-brechemier
Copy link
Contributor Author

(...) I suppose some might argue that in *nix systems two dashes, --, is more the 'regular', or 'standard', but we can wear that semantic difference... and your description makes the intention clear...

I revisited this issue and enhanced the wording in 29215e3. The terms are now in line with the descriptions of -xml-help and -xml-config: (purely) command-line options (single dash) and configuration options (double dash).

@eric-brechemier
Copy link
Contributor Author

And maybe it would be too much if each double -- option mentioned a - option, where it exists... like say clean with a See also: -c... that is each is fully crossed referenced... but as I say, maybe too much?

I gave more thought to this. I noticed that the documentation of some command-line options includes an equivalent configuration option and value, in the <eqconfig> element.

For example:

 <option class="process-directives">
  <name>-clean</name>
  <name>-c</name>
  <description>replace FONT, NOBR and CENTER tags by CSS</description>
  <eqconfig>clean: yes</eqconfig>
 </option>

Therefore, a configuration option alone is not equivalent to using a command-line option: it is a pair option+value that must be used.

This could justify having references only in one direction from command-line options to configuration options, and no references back.

If we do want to include the references from configuration options back to command-line options, mixing both kinds of options in the 'See also' section, we should use the same mechanism currently used to refer only to the double dash options, using <seealso> elements. Each <seealso> element includes the name of a single option, without the dashes:

  <seealso>input-encoding</seealso>
  <seealso>output-encoding</seealso>

We could change the format of <seealso> elements in the output of tidy -xml-config, adding the dashes to the option names:

  <seealso>--input-encoding</seealso>
  <seealso>--output-encoding</seealso>

This would allow to mix references to both kinds of options:

 <option class="print">
  <name>indent</name>
  <type>AutoBool</type>
  <default>no</default>
  <example>auto, y/n, yes/no, t/f, true/false, 1/0</example>
  <description>This option specifies if Tidy should indent block-level tags. If set to "auto", this option causes Tidy to decide whether or not to indent the content of tags such as TITLE, H1-H6, LI, TD, TD, or P depending on whether or not the content includes a block-level element. You are advised to avoid setting indent to yes as this can expose layout bugs in some browsers. </description>
   <seealso>-indent</seealso>
   <seealso>--indent-spaces</seealso>
 </option>

Which would result in:

See also: -indent, --indent-spaces

Eric Bréchemier added 12 commits January 4, 2017 19:25
This is consistent with the format used at the top of the
description of configuration options.

Related issue: htacg#469
The 'seealso' comes last actually, after the description.
…cters)

This makes no change on the text generated by

  /usr/bin/groff -Tascii -mandoc -c tidy.1 > tidy.1.txt
Otherwise, the description starts with an empty line when
no Example section is present.

Related issue: htacg#469
Using a template match instead of a named template,
I will then add rules with higher priority to ignore
examples for certain types of values, which are very
redundant (identical for all options of the same type).

Related issue: htacg#469
Examples for Boolean and AutoBool are redundant because they are
described in the main text and identical for all options of that type.

Examples for Tag names are redundant because they are redundant
with the name of the Type, and identical for all options of that type.

Examples for Integer are redundant because they are identical for
all options of that type but one, where the value 0 is followed with
a comment, but even in this case the examples are redundant because
the comment for the value 0 is also included in the description.

Related issue: htacg#469
I also updated the description related to 'Examples' section
in the introduction paragraphs to the configuration options.

Related issue: htacg#469
Previously, a mix of

  * Type set in bold font
  * Type set in regular font
  * "types" (quoted)
  * types (unquoted)

was found. I replaced all instances by Type in regular font.

Related issue: htacg#469
Both parameter names and values are now in bold,
while keys and values for configuration files are in italics.

Related issue: htacg#469
The subsection is now flush left, in regular font, like
the 'Supported values' subsection.

The previous format was less adequate when the list wrapped
to the next line (--new-inline-tags): wrapping started on
the very first column, breaking the alignment of the rest of
the description.

Related issue: htacg#469
Parts of the file were indented with 2 spaces, others with 3 spaces.
Parts of the templates were separated with two empty lines, others
with a single one.
@eric-brechemier eric-brechemier changed the title WIP: merge documentation of "standard" and "extended" options (Issue #469) Merge documentation of "command-line" and "configuration" options (Issue #469) Jan 5, 2017
@eric-brechemier
Copy link
Contributor Author

@geoffmcl the PR is ready for review.

@eric-brechemier
Copy link
Contributor Author

Note: I still need to delete the temporary tidy.1 and tidy.1.txt at the root before the merging.

@geoffmcl
Copy link
Contributor

geoffmcl commented Jan 5, 2017

@eric-brechemier ok, have pulled your fork, issue-469 branch, built and reviewed this, in linux, and everything looks good, but really wish there were others mac/unix reviewers. Being a windows person, which does not have a $ man page command, do not feel totally comfortable with just my quick linux $ man -l tidy.1 review...

As far as I can see, so far this is only modification of one file, man/tidy1.xsl.in. And it seems to me you have completely removed the duplications, and improved the wording here and there... that effectively solved the #469 issue raised by @jidanni... As I say this looks good to me... and once you remove the two temporary files, and hopefully a positive review by at least one other, would be prepared to merge this... many thanks...

And if I read, and understand things correctly, as a further enhancement, you would now like to modify the xml output of tidy.c. That is the modify the xml fed to the transformation. If I am understanding this correctly then I would like that as a separate issue. I need to explore how that is done presently...

Like I see in the code - printf(" <seealso>%s</seealso>\n",tidyOptGetName(optLinked));, but presently do not immediately see how to know to add a single - or a double -- in front... The tidyOptGetName(optLinked) is a tidy API call, which probably can not be changed, but maybe another internal function could be added to do that... or something...

I certainly like the idea the we could have xml ouput of -

   ...
   <seealso>-i</seealso>
   <seealso>-indent</seealso>
   <seealso>--indent-spaces</seealso>
 </option>

which in turn would add See also: -i, -indent, --indent-spaces... to the tidy.1 page... That is fully cross-referencing the options... But as stated else where this is not essential that we have every cross-reference...

But is it convenient to separate this as a further, separate step?

Regards, geoff.

@eric-brechemier
Copy link
Contributor Author

@geoffmcl Agreed. It's a wrap!

I noticed a few other things that I would like to improve; let's open separate issues for further enhancements.

I have removed the temporary files from the branch.

@geoffmcl
Copy link
Contributor

geoffmcl commented Jan 7, 2017

@eric-brechemier yes, it's a wrap, even if we do not get another unix reviewer...

Just about out of time tonight, but tomorrow, or soonest, will merge this using the special Squash and merge option offered, which I think will pull in you some 47 commits as one... First time I have tried this, so hope it works out well...

Be back soonest... thanks

And look forward to your further enhancement as a separate issue... thanks...

@geoffmcl geoffmcl merged commit 7593d7b into htacg:master Jan 8, 2017
geoffmcl added a commit that referenced this pull request Jan 8, 2017
@geoffmcl
Copy link
Contributor

geoffmcl commented Jan 8, 2017

@eric-brechemier seems the SnM worked fine... thanks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants