Feature: documentation on json and json-pp formats #2826

Open
1 of 5 tasks
lucasgonze opened this issue Jan 31, 2022 · 11 comments

Comments

@lucasgonze

Description

Scancode-toolkit emphasizes output formats labeled only as "json" or "json-pp." There is no information on the underlying schema.

Is it the same as any format used by other tools? How is it different from SPDX?

Link to Documentation Page

https://scancode-toolkit.readthedocs.io/en/latest/cli-reference/output-format.html

Among the ScanCode Output Formats, json is the most important one, and is recommended over others. Scancode Workbench and other applications that use Scancode Result data as input accept only the json format.


Select Category

  • Inconsistency
  • New Section Request
  • General Improvement
  • Typo/Mistakes
  • Other
@pombredanne
Contributor

@lucasgonze That's part of the upcoming work that's on deck... It could be a nice GSoC project... This is a lot of work but super important in any case :)

@Jeeppler

@pombredanne As far as I understand it, the json and json-pp outputs are not in the SPDX JSON format. Is this correct?

@mjherzog
Member

This is correct. SPDX is a data exchange format, not a scan output format. There are --spdx-tv and --spdx-rdf output options for creating an SPDX output file (currently SPDX v2.1).

@Jeeppler

I understand where my confusion is coming from. New in SPDX v2.2 is the JSON file format.

JSON, YAML, and a development version of XML have been added as supported file formats.

Source: https://spdx.github.io/spdx-spec/diffs-from-previous-editions/#i2-differences-from-v22-and-v21

However, spdx.json is not the same as json or json-pp. json and json-pp are ScanCode specific formats. json-pp is the same as json, except that json-pp is pretty printed. The -pp stands for pretty printed.
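To illustrate the json versus json-pp point, here is a minimal Python sketch. The dictionary is a made-up stand-in, not the real ScanCode schema, and the exact whitespace ScanCode emits is an assumption; the only claim is that compact and pretty-printed JSON decode to the same data.

```python
import json

# Hypothetical stand-in for scan results; NOT the real ScanCode schema.
data = {"files": [{"path": "src/a.c", "type": "file"}]}

# Compact serialization, comparable in spirit to a "json" style output.
compact = json.dumps(data, separators=(",", ":"))

# Pretty-printed serialization, comparable in spirit to a "json-pp" style output.
pretty = json.dumps(data, indent=2)

# The two strings differ only in whitespace, so they decode to equal objects.
same_content = json.loads(compact) == json.loads(pretty)
print(same_content)
```

Any consumer that parses the file with a JSON library should therefore be able to read either variant.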

@Jeeppler

Jeeppler commented Feb 22, 2022

@mjherzog It seems to me that ScanCode does support the latest SPDX 2.2 standard. When scanning with scancode-toolkit 30.1.0, I get SPDX-2.2 as the SPDX version:

# Document Information

SPDXVersion: SPDX-2.2
DataLicense: CC0-1.0
DocumentNamespace: http://spdx.org/spdxdocs/sourcecode-9006581e-263a-4c8f-b7bc-6bdb705f088c
DocumentName: SPDX Document created by ScanCode Toolkit
LicenseListVersion: 3.14
SPDXID: SPDXRef-DOCUMENT
DocumentComment: <text>Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. No content created from
ScanCode should be considered or used as legal advice. Consult an Attorney
for any legal advice.
ScanCode is a free software code scanning tool from nexB Inc. and others.
Visit https://github.com/nexB/scancode-toolkit/ for support and download.</text>


# Creation Info

Creator: Tool: scancode-toolkit 30.1.0
Created: 2022-02-21T16:04:58Z
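A quick way to check which version a tag-value file declares is to look for the SPDXVersion tag. The following is a rough Python sketch; the function name spdx_version is made up for illustration, and it is a plain line scan rather than a full SPDX tag-value parser.

```python
def spdx_version(tag_value_text):
    """Return the value of the first SPDXVersion tag, or None if absent."""
    for line in tag_value_text.splitlines():
        # Split "Tag: value" at the first colon; comment lines have no colon.
        tag, sep, value = line.partition(":")
        if sep and tag.strip() == "SPDXVersion":
            return value.strip()
    return None


# A short excerpt of the header shown above.
sample = """# Document Information

SPDXVersion: SPDX-2.2
DataLicense: CC0-1.0
"""

print(spdx_version(sample))
```

For real validation, a dedicated SPDX tool remains the safer choice, since this sketch ignores multi-line text values and everything else in the document.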

I used the Java SPDX-Tool to convert the tag-value output file from ScanCode to the JSON format (ScanCode --spdx-tv to spdx.json):

java -jar tools-java-1.0.4-jar-with-dependencies.jar Convert scancode.spdx tagtojson.json

Then I verified both the ScanCode tag-value file and the converted JSON file:

# Tag-Value output by ScanCode
java -jar tools-java-1.0.4-jar-with-dependencies.jar Verify scancode.spdx
This SPDX Document is valid.

# Converted to Json
java -jar tools-java-1.0.4-jar-with-dependencies.jar Verify tagtojson.json 
This SPDX Document is valid.
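The convert and verify steps above could be scripted. This is only a sketch in Python: the helper names spdx_tool_command and run_spdx_tool are made up, and it assumes java is on the PATH and the jar sits in the working directory. The jar name, the Convert and Verify actions, and the file names are copied from the commands above.

```python
import subprocess

# Jar name copied from the commands above; its location is an assumption.
JAR = "tools-java-1.0.4-jar-with-dependencies.jar"


def spdx_tool_command(action, *paths, jar=JAR):
    """Build the argument list for one invocation of the Java SPDX tool."""
    return ["java", "-jar", jar, action, *paths]


def run_spdx_tool(action, *paths):
    """Run the tool and return the completed process (requires java and the jar)."""
    return subprocess.run(
        spdx_tool_command(action, *paths),
        capture_output=True,
        text=True,
        check=False,
    )


# Build the same two commands used above, without executing them.
convert_cmd = spdx_tool_command("Convert", "scancode.spdx", "tagtojson.json")
verify_cmd = spdx_tool_command("Verify", "tagtojson.json")
print(" ".join(convert_cmd))
print(" ".join(verify_cmd))
```

Calling run_spdx_tool("Verify", "tagtojson.json") would actually execute the tool; the sketch only prints the commands so that it runs without Java installed.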

Did I overlook something? Does scancode-toolkit 30.1.0 support SPDX 2.2?

@pombredanne
Contributor

@Jeeppler This is correct. We support the SPDX 2.2 spec but not all the formats.

Note that I have been mentoring a student to add support for JSON and YAML to the upstream SPDX Python library, but unfortunately I have not yet had the time to review and merge this. This library needs some love.

Maybe the right way for now would be to use the Java tools to perform the conversion?
Do you know if there is a way to create a standalone exe from these?
@goneall would you know?

@mjherzog
Member

@Jeeppler I am happy to be corrected

@Jeeppler

Jeeppler commented Feb 22, 2022

@pombredanne thank you for clarifying it.

What Python library? Do you have a link?

@goneall

goneall commented Feb 22, 2022

@Jeeppler Here's the link to the Python library: https://github.com/spdx/tools-python

@pombredanne
Contributor

@Jeeppler FWIW I am technically also a maintainer of the Python library (https://github.com/spdx/tools-python), but I do not have enough time to handle it these days, and I am looking for sponsors who could help fund some work by community members on this.

@pombredanne
Contributor

I am looking for sponsors that could help fund some work of community members on this.

Or folks that could help maintain it.
