Add configuration for addheader and provide --defaults for addheader #68

Open
carmenbianca opened this issue Aug 13, 2019 · 7 comments · May be fixed by #761

@carmenbianca
Member

See also #13.

git config and .reuse/config should provide default values for author/email and default license respectively.

It makes sense for me to implement this as a composite of Project. This might be a little difficult, though, because I'd have to forward the config object to a lot of places that do not currently have one.
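
A minimal sketch of the git-config half of this, assuming the defaults are read by shelling out to `git config --get`. The helper names `git_config_value` and `default_copyright_holder` are hypothetical and not part of reuse; how the result would be attached to Project is left open.

```python
import subprocess
from typing import Optional


def git_config_value(key: str, cwd: str = ".") -> Optional[str]:
    """Return the value of a git config key, or None if unset or git is unavailable."""
    try:
        result = subprocess.run(
            ["git", "config", "--get", key],
            cwd=cwd,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # git is not installed, or cwd does not exist.
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def default_copyright_holder(cwd: str = ".") -> Optional[str]:
    """Hypothetical helper: build 'Name <email>' from git config, if a name is set."""
    name = git_config_value("user.name", cwd)
    email = git_config_value("user.email", cwd)
    if name and email:
        return f"{name} <{email}>"
    return name
```

Returning None rather than raising keeps the caller free to fall back to a prompt or to a value from .reuse/config.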

@carmenbianca
Member Author

Additionally, make it possible to configure a default template for a given template.

@nicorikken
Member

nicorikken commented Jun 11, 2021

@carmenbianca @siiptuo @CharString @mxmehl

I would really like to see this feature become a reality, and at the moment there are already two good PRs that read copyright information from the project: #240 and #345. These two already conflict, because each adds its own single side-step around the explicit arguments, so I think we need to discuss how this should be implemented. Also consider that, following pyproject.toml as a source in #345, many more formats might come in the future. So let me share my thoughts, and hopefully we can refine them through a discussion here:

Use-case

Currently, with version 0.13, the init prompt is the following:

$ reuse init
Initializing project for REUSE.

What license is your project under? Provide the SPDX License Identifier.
To stop adding licenses, hit RETURN.
GPL-3.0-or-later

What other license is your project under? Provide the SPDX License Identifier.
To stop adding licenses, hit RETURN.


What is the name of the project?
myproject   

What is the internet address of the project?
https://....org

What is the name of the maintainer?
Best Maintainer

What is the e-mail address of the maintainer?
thebest@maintainer.org

All done! Initializing now.

Downloading GPL-3.0-or-later
LICENSES/GPL-3.0-or-later.txt already exists

Creating .reuse/dep5

Initialization complete.

I think we can make it something like:

$ reuse init
Initializing project for REUSE.

# A license was found in the pyproject.toml and translated to the SPDX identifier.
What license is your project under? GPL-3.0-or-later was found in pyproject.toml. Would you like to add it? [Y/n]
 Y

# As this is a multiple-input, the user is prompted for additions
What other license is your project under? Provide the SPDX License Identifier.
To stop adding licenses, hit RETURN.

# Again an existing option is found, this time only a single answer is desired.
What is the name of the project? The name 'default-template' was found in pyproject.toml. Would you like to use it? [Y/n]
 n

# As the suggestion is not used, the original prompt is shown
What is the name of the project?
 myproject   

# No internet address was found, so a regular prompt is shown
What is the internet address of the project?
 https://....org

# Several options were found in different locations; the user can select one
What is the name of the maintainer? Several names were found. Would you like to use any of those? Press the corresponding number and hit RETURN or type [n] to fill one in yourself.
[1] BestMaintainerGitHub
      from repository git settings
[2] laptop-user
      from global git settings
[3] Best Maintainer
      from pyproject.toml author
[4] Best Maintainer
      from pyproject.toml maintainers
[5] Best Maintainer friend
      from pyproject.toml maintainers

 4

# Example to auto-complete the email after the username was selected from a particular source. This might be too complicated with little benefit.
What is the e-mail address of the maintainer? Several addresses were found. The previously selected user came with the email address 'best.maintainer@work.com'. Would you like to use it? [Y/n]
 n

# Back to the prompt for multiple options.
What is the e-mail address of the maintainer? Several addresses were found. Would you like to use any of those? Press the corresponding number and hit RETURN or type [n] to fill one in yourself.

[1] bestmaintainer+github@maintainer.org
      from repository git settings
[2] laptop-user@localhost
      from global git settings
[3] info@maintainer.org
      from pyproject.toml author
[4] best.maintainer@work.com
      from pyproject.toml maintainers
[5] good.maintainer@friend.org
      from pyproject.toml maintainers

 n

# And finally back to the original prompt because no suggestion is used
What is the e-mail address of the maintainer?
 thebest@maintainer.org

All done! Initializing now.

Downloading GPL-3.0-or-later
LICENSES/GPL-3.0-or-later.txt already exists

Creating .reuse/dep5
Creating .reuse/config  #NOTE: also saving it for later

Initialization complete.

So the idea is to gather all available information and use it to help the user fill in the prompts. Perhaps my example is a bit verbose, but I did it to show some edge cases and get some discussion going.
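
The numbered-choice prompt from the example session could be sketched roughly as follows. This is only an illustration: the function name `choose` and the `(value, source)` tuple shape are assumptions, and the `ask` parameter exists so that the prompt can be driven without a terminal.

```python
from typing import Callable, List, Tuple

Suggestion = Tuple[str, str]  # (value, where it was found); hypothetical shape


def choose(question: str, suggestions: List[Suggestion],
           ask: Callable[[str], str] = input) -> str:
    """Ask a question, offering numbered suggestions before a free-form prompt."""
    if suggestions:
        lines = [question, "Press a number and hit RETURN, or type n to fill one in yourself."]
        for index, (value, source) in enumerate(suggestions, start=1):
            lines.append(f"[{index}] {value}\n      from {source}")
        answer = ask("\n".join(lines) + "\n ").strip()
        if answer.isdecimal() and 1 <= int(answer) <= len(suggestions):
            return suggestions[int(answer) - 1][0]
    # No suggestions, or the user declined them: fall back to the plain prompt.
    return ask(question + "\n ").strip()
```

Any answer that is not a valid number is treated as declining the suggestions, which matches the "back to the original prompt" steps in the session above.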

Implementation

A modular approach is needed to enable additional sources of information in the future. These sources might also be used in stages other than init. So for every source of truth a function or file can be created, say truthPyprojectToml(). All of these functions return the found information in a similar structure for further parsing (the interface). Perhaps additional information can be provided under specific keys (like with a pyproject_ prefix), to allow these functions to provide more information for possible future use.

A generic function, like findTruths(), could trigger these truth-finding functions and return a list of objects or dictionaries containing all the collected information. All these different functions can be run in parallel.
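
As a rough sketch of that interface, assuming each source is a plain function that takes the project root and returns a list of dictionaries: the names `truth_pyproject_toml` and `find_truths`, the dictionary keys, and the `pyproject_license_text` key are all made up for illustration. The TOML parsing relies on `tomllib`, which is only in the standard library from Python 3.11, so the source returns nothing on older versions.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List, Sequence

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    tomllib = None

Truth = Dict[str, str]
TruthSource = Callable[[Path], List[Truth]]


def truth_pyproject_toml(root: Path) -> List[Truth]:
    """Collect name and license from the [project] table of pyproject.toml."""
    path = root / "pyproject.toml"
    if tomllib is None or not path.is_file():
        return []
    with path.open("rb") as handle:
        project = tomllib.load(handle).get("project", {})
    truth: Truth = {"source": "pyproject.toml"}
    if isinstance(project.get("name"), str):
        truth["name"] = project["name"]
    license_field = project.get("license")
    if isinstance(license_field, str):
        truth["license"] = license_field
    elif isinstance(license_field, dict) and isinstance(license_field.get("text"), str):
        # Free-form text is not necessarily an SPDX identifier, so it goes
        # under a source-prefixed key instead of "license".
        truth["pyproject_license_text"] = license_field["text"]
    return [truth] if len(truth) > 1 else []


def find_truths(root: Path, sources: Sequence[TruthSource]) -> List[Truth]:
    """Run every source in parallel and flatten the results, in source order."""
    if not sources:
        return []
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        results = pool.map(lambda source: source(root), sources)
        return [truth for found in results for truth in found]
```

Because every source has the same signature, adding a new one only means appending another function to the `sources` list.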

When calling reuse init the information will be gathered and used to auto-fill the prompts.

Further ideas

  • Eventually there might be discrepancies between the reuse config and the project config (pyproject.toml in this example). Reuse could warn the user during init and even fail on it when linting. Or we could even have these kinds of truth-finding functions run in reverse and fill the pyproject.toml with the information from the .reuse/config.
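
Detecting such discrepancies could be as small as comparing the keys that both configurations define. A sketch, assuming both configs have already been loaded into flat dictionaries; `find_discrepancies` is a hypothetical name:

```python
from typing import Dict, List, Tuple


def find_discrepancies(reuse_config: Dict[str, str],
                       project_config: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (key, reuse_value, project_value) for keys set in both but unequal."""
    return [
        (key, reuse_config[key], project_config[key])
        for key in sorted(reuse_config.keys() & project_config.keys())
        if reuse_config[key] != project_config[key]
    ]
```

Keys that appear in only one of the two configs are deliberately not reported, since a missing value is not a contradiction.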

Next steps

This approach would mean that the current PRs would be moved to a separate location in the source so both can provide information to the init function. The rewrite of the prompt is work that needs to be done from scratch. We can start there and work on improving the init at a later stage.

What do you think?

Also I hope you can suggest a proper name for this feature in the source code. I was thinking something like "truth" or "config".

Edit: PR #349 to determine the copyright year could be integrated in a similar way.
Edit2: Issue #182 mentions that SPDX files could also be considered as a source.

@mxmehl
Member

mxmehl commented Jun 15, 2021

Thanks a lot for sharing your thoughts here!

Your modular approach and integration in init totally makes sense to me. Allowing incremental additions of truth sources is a good strategy.

Eventually there might be discrepancies between the reuse config and the project config (pyproject.toml in this example). Reuse could warn the user during init and even fail on it when linting. Or we could even have these kinds of truth-finding functions run in reverse and fill the pyproject.toml with the information from the .reuse/config.

To be frank, both don't really convince me:

  1. This could trigger a lot of warnings if there is legitimately reused code from third parties or just historically differently marked files.
  2. Reversely adding info from .reuse/config to truth sources will probably not be used that often, and the complexity of maintaining such functions and the problems that may arise from it feel a bit too high. Not sure whether it's actually worth it.

Edit2: Issue #182 mentions that SPDX files could also be considered as a source.

I imagined that to be a one-shot conversion. SPDX files are often not inside a repo permanently, but rather generated on shipping a package. The workflow I'd imagine for this is:

  1. Export an SPDX file from FOSSology etc.
  2. Run, for instance, reuse import --spdx billofmaterial.spdx
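
To make the one-shot conversion a bit more concrete, here is a sketch of the parsing step only. It assumes an SPDX tag-value file and reads just the FileName, LicenseConcluded and FileCopyrightText tags; multi-line `<text>` values are not handled. The function name `parse_spdx_files` is hypothetical, and the `reuse import` command itself is only the idea from the list above.

```python
from typing import Dict, Iterable, Optional


def parse_spdx_files(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map each FileName to the license and copyright tags that follow it."""
    wanted = {"LicenseConcluded": "license", "FileCopyrightText": "copyright"}
    files: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for line in lines:
        tag, separator, value = line.partition(":")
        if not separator:
            continue  # blank line or continuation text, ignored in this sketch
        tag, value = tag.strip(), value.strip()
        if tag == "FileName":
            current = files.setdefault(value, {})
        elif tag in wanted and current is not None:
            # Strip a single-line <text>...</text> wrapper if present.
            if value.startswith("<text>") and value.endswith("</text>"):
                value = value[len("<text>"):-len("</text>")]
            current[wanted[tag]] = value
    return files
```

The resulting mapping could then be fed into the same code path that adds headers to files.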

@CharString

@nicorikken This pretty much describes the workflow I expect from using a tool like reuse; it removes the tedium wherever it can, by discovering educated defaults that I can confirm or deviate from. In the end I, as the user, am still legally bound by what the tool does for me, so in a way it should function like a legal clerk... The rest is just implementation ;-)

@mxmehl
Member

mxmehl commented Jan 22, 2022

In a hackathon we are currently running, we decided to do the following:

  1. Provide a --defaults flag for reuse addheader that reads the default copyright and license from this config file. So using reuse addheader --defaults foobar.py would take the project-defined defaults.
  2. However, these are easily overridable by using the addheader command as before with --copyright "Jane" --license MIT. In this case, the defaults are ignored. We noticed that additional override sources, e.g. git/hg config (Read default copyright from git/hg configuration #240) or Python variables (In python files reuse tool should accept the info in __copyright__ and __license__ variables. #403), would cause a lot of edge-case issues.
  3. Down the road we might consider support for env variables, e.g. to shorten long names. However, we first focus on the basic functionality.
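
The precedence described in points 1 and 2 can be sketched with argparse. This is not the actual addheader implementation: `build_parser`, `resolve_header_values` and the shape of the `defaults` dictionary are assumptions, and loading the config file is left out.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the addheader argument parser."""
    parser = argparse.ArgumentParser(prog="addheader-sketch")
    parser.add_argument("--defaults", action="store_true")
    parser.add_argument("--copyright", action="append")
    parser.add_argument("--license", action="append")
    parser.add_argument("path", nargs="+")
    return parser


def resolve_header_values(args: argparse.Namespace,
                          defaults: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Explicit --copyright/--license win; --defaults is only used without them."""
    if args.copyright or args.license:
        # Explicit values were given, so the project defaults are ignored entirely.
        return {"copyright": args.copyright or [], "license": args.license or []}
    if args.defaults:
        return {
            "copyright": list(defaults.get("copyright", [])),
            "license": list(defaults.get("license", [])),
        }
    raise ValueError("pass --defaults, or --copyright and/or --license")
```

With project defaults of `{"copyright": ["Jane Doe"], "license": ["GPL-3.0-or-later"]}`, parsing `--defaults foobar.py` yields those defaults, while `--copyright "Jane" --license MIT foobar.py` yields only the explicit values.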

@mxmehl changed the title from "Add configuration for addheader" to "Add configuration for addheader and provide --defaults for addheader" on Jan 22, 2022
nicorikken added a commit that referenced this issue Feb 9, 2022
This example for adding headers based on git config handles one of the
use-cases as described in #68 and #240

Signed-off-by: Nico Rikken <nico.rikken@fsfe.org>
@nicorikken
Member

Now that scripts are part of the documentation I think most of this discussion can be ignored and mostly the idea of a reuse.config file remains.

@niccokunzmann

niccokunzmann commented Mar 17, 2023

Nice to see this! I had added #707 for the default settings flow that I would like to have when I thought about it.

This might be of interest for this issue:

  • use annotate without parameter
    • when default is set, it is used
    • when default is not set, there is a note on how to add a default value


5 participants