spdx-tool
License            Match  Count  Ratio
Apache-2.0         TMPL      62   60.1
None                         34   33.0
GPL-3.0-or-later   0.81       3    2.9
GPL-2.0-or-later   0.86       2    1.9
FSFUL              TMPL       1    0.9
NTP                0.74       1    0.9
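The Ratio column in this sample is consistent with each count expressed as a percentage of all scanned files (103 here), truncated to one decimal place. This reading is inferred from the numbers shown, not taken from the tool's documentation. A small Python sketch of that arithmetic, where the `ratio` helper is a hypothetical name:

```python
# Counts copied from the sample report above.
counts = {
    "Apache-2.0": 62,
    "None": 34,
    "GPL-3.0-or-later": 3,
    "GPL-2.0-or-later": 2,
    "FSFUL": 1,
    "NTP": 1,
}
total = sum(counts.values())  # number of scanned files in the sample


def ratio(count, total):
    """Percentage of files, truncated (not rounded) to one decimal place."""
    # Integer arithmetic in tenths of a percent avoids rounding up.
    return (count * 1000 // total) / 10


ratios = {name: ratio(n, total) for name, n in counts.items()}
for name, value in ratios.items():
    print(f"{name:<18}{value:>6.1f}")
```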
spdx-tool --only-licenses=FSFUL,NTP -f .
FSFUL 1
configure
NTP 1
install-sh
spdx-tool --only-licenses=Apache-2.0 --print-license --line-number src
spdx-tool --only-licenses=Apache-2.0 --update=spdx src
Likewise, but keep the first two lines of the existing license header:
spdx-tool --only-licenses=Apache-2.0 --update=1..2.spdx src
spdx-tool --output-xml=report.xml .
spdx-tool --output-json=report.json .
- Fix compilation on FreeBSD and Windows
spdx-tool scans source files to identify their licenses and can update them to use the SPDX license format. It can be used to:
- identify the license used in source files of a project,
- produce a JSON/XML report for the licenses found with the list of files,
- replace a license header with the equivalent SPDX license tag.
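For reference, an SPDX license tag is a single comment line of the form `SPDX-License-Identifier: <expression>`. The Python sketch below is not part of spdx-tool; the function name `find_spdx_tag`, the regular expression, and the 20-line limit are assumptions made for illustration. It looks for such a tag in the first lines of a file:

```python
import re

# Matches "SPDX-License-Identifier: <expression>" anywhere in a line.
# A trailing block-comment closer ("*/" or "-->") is not part of the expression.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/|-->)?\s*$")


def find_spdx_tag(text, max_lines=20):
    """Return the license expression of the first SPDX tag found, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


ada_source = "--  SPDX-License-Identifier: Apache-2.0\nwith Ada.Text_IO;\n"
print(find_spdx_tag(ada_source))
```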
The tool uses license templates to identify the license used in source files. The builtin repository contains around 600 license templates, and you can extend it with your own templates as long as they use the SPDX license description format described in The Software Package Data Exchange® (SPDX®) Specification Version 2.3.
spdx-tool scans the directories or files passed as parameters. Directories are scanned recursively, and the .gitignore file of each directory is read first so that files ignored by the project are skipped. For each file, spdx-tool tries to:
- identify the language of the source file,
- extract the license header text at the beginning of the source file,
- identify the license by using the following algorithms:
  - look for an SPDX-License-Identifier tag; when one is found, the match report indicates SPDX,
  - look for a template match among the license templates of the builtin repository or the templates configured for the tool; when this succeeds, the match report indicates TMPL,
  - last, guess the best matching license by using an inverted index of license tokens: the tool applies a classical term frequency-inverse document frequency (TF-IDF) algorithm to find the best matching license, and the report indicates the highest cosine similarity found.
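The last step can be pictured with a small Python sketch of TF-IDF scoring with cosine similarity over an inverted index. This is not the tool's Ada implementation: the tokenizer, the smoothed IDF formula, the function names, and the two heavily shortened license texts are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(templates):
    """Build an inverted index: token -> set of license names containing it."""
    index = {}
    for name, text in templates.items():
        for token in set(tokenize(text)):
            index.setdefault(token, set()).add(name)
    return index


def tfidf_vector(tokens, index, doc_count):
    """Weight each known token by term frequency times a smoothed IDF."""
    counts = Counter(t for t in tokens if t in index)
    return {
        t: n * (math.log((1 + doc_count) / (1 + len(index[t]))) + 1.0)
        for t, n in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def guess_license(header, templates):
    """Return (best license name, similarity) for a header text."""
    index = build_index(templates)
    n = len(templates)
    query = tfidf_vector(tokenize(header), index, n)
    scores = {
        name: cosine(query, tfidf_vector(tokenize(text), index, n))
        for name, text in templates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical, heavily shortened license texts (not the real templates).
TEMPLATES = {
    "MIT": "Permission is hereby granted, free of charge, to any person "
           "obtaining a copy of this software",
    "NTP": "Permission to use, copy, modify, and distribute this software "
           "and its documentation for any purpose with or without fee",
}

name, score = guess_license(
    "Permission to use, copy and distribute this software for any purpose "
    "without fee is granted",
    TEMPLATES,
)
print(name, round(score, 2))
```

A header that matches no template exactly still gets a score against every license; the highest score is what a report of this kind would show in the Match column in place of SPDX or TMPL.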
- Man page: spdx-tool (1)
You can install spdx-tool by using the Debian 12 and Ubuntu 24.04 packages. First, set up your system to accept the signed packages:
wget -O - https://apt.vacs.fr/apt.vacs.fr.gpg.asc | sudo tee /etc/apt/trusted.gpg.d/apt-vacs-fr.asc
and choose one of the echo commands according to your Linux distribution:
Ubuntu 24.04
echo "deb https://apt.vacs.fr/ubuntu-noble noble main" | sudo tee -a /etc/apt/sources.list.d/vacs.list
Debian 12
echo "deb https://apt.vacs.fr/debian-bullseye bullseye main" | sudo tee -a /etc/apt/sources.list.d/vacs.list
Then, launch the apt update command:
sudo apt-get update
and install the tool using:
sudo apt-get install -y spdx-tool
To build the spdx-tool you will need the GNAT Ada compiler as well as the Alire package manager.
make
New languages can easily be added by editing the tools/languages-addon.json file and declaring the language with its file extensions and the comment style that must be used to parse the header and extract the license. A typical configuration looks like:
"GNAT Project": {
  "type": "programming",
  "extensions": [
    ".gpr"
  ],
  "comment_style": "dash-style"
},
After updating the tools/languages-addon.json file, rebuild the generated Ada files by running:
make generate
When a language is recognized but the analyser does not know how to extract its comments, this can be fixed by adding a definition to the tools/languages-addon.json file:
"Dart": {
  "comment_style": "C-style"
},
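Recognized comment styles include the following. Each entry gives the style name, its start delimiter and, for block comments, its end delimiter; entries with empty delimiters combine the styles named in their last field: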
"dash-style", "--"
"C-line", "//"
"Shell", "#"
"Latex-style", "%"
"Forth", "\"
"C-block", "/*", "*/"
"XML", "<!--", "-->"
"OCaml-style", "(*", "*)"
"Erlang-style", "%%"
"Semicolon", ";"
"JSP-style", "<%--", "--%>"
"Smarty-style", "{*", "*}"
"Haskell-style", "{-", "-}"
"Smalltalk-style", """", """"
"PowerShell-block", "<#", "#>"
"CoffeeScript-block", "###", "###"
"PowerShell-style", "", "", "PowerShell-block,Shell"
"CoffeeScript-style", "", "", "Shell,CoffeeScript-block"
"C-style", "", "", "C-line,C-block"
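To show how such a table can drive header extraction, here is a Python sketch that resolves a composite style into its basic delimiters and collects the leading comment of a file. It is an illustration only, not the tool's Ada code: the `STYLES` dictionary covers just a subset of the entries above, and the names `resolve` and `extract_header` are hypothetical.

```python
# Subset of the styles listed above: name -> (start, end, combined styles).
# An end of None marks a line comment; a non-empty tuple marks a composite.
STYLES = {
    "dash-style": ("--", None, ()),
    "C-line": ("//", None, ()),
    "Shell": ("#", None, ()),
    "C-block": ("/*", "*/", ()),
    "C-style": ("", "", ("C-line", "C-block")),
}


def resolve(style):
    """Expand a composite style into its basic (start, end) delimiters."""
    start, end, parts = STYLES[style]
    if parts:
        return [pair for part in parts for pair in resolve(part)]
    return [(start, end)]


def extract_header(text, style):
    """Collect the comment text found at the beginning of a source file."""
    delimiters = resolve(style)
    header = []
    closing = None  # end delimiter of the block comment currently open
    for line in text.splitlines():
        stripped = line.strip()
        if closing is not None:
            # Inside a block comment: keep the text up to the end delimiter.
            body, found, _ = stripped.partition(closing)
            header.append(body.strip(" *"))
            if found:
                closing = None
            continue
        if not stripped:
            continue  # blank lines between comments are skipped
        for start, end in delimiters:
            if stripped.startswith(start):
                rest = stripped[len(start):]
                if end is None:
                    header.append(rest.strip())
                else:
                    body, found, _ = rest.partition(end)
                    header.append(body.strip(" *"))
                    closing = None if found else end
                break
        else:
            break  # the first non-comment line ends the header
    return "\n".join(item for item in header if item)


c_source = "/*\n * Licensed under MIT\n */\n// extra\nint x;"
print(extract_header(c_source, "C-style"))
```

Composite styles whose members share a prefix (for example "#" and "###") would need the longest start delimiter tried first; the subset used here has no such overlap.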