If you are beginning your journey with Senzing, please start with Senzing Quick Start guides.
You are in the Senzing Garage where projects are "tinkered" on. Although this GitHub repository may help you understand an approach to using Senzing, it's not considered to be "production ready" and is not considered to be part of the Senzing product. Heck, it may not even be appropriate for your application of Senzing!
validate is a command in the senzing-tools suite of tools.
This command validates that a JSONL file is properly formatted and that each line
contains sufficient key-value pairs for Senzing to accept it as a record. It is
highly recommended that this code be taken and extended to validate JSONL records
to meet your needs.
validate tests each line of a given JSONL file to ensure that it is valid
JSON and contains two necessary key-value pairs: RECORD_ID and DATA_SOURCE.
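The per-line check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the code that validate actually runs; the function name validate_line and the wording of the messages are made up for this example.

```python
import json

# The two keys this README says every record must contain.
REQUIRED_KEYS = ("RECORD_ID", "DATA_SOURCE")


def validate_line(line: str) -> list[str]:
    """Return the problems found in one JSONL line (an empty list means valid).

    Hypothetical helper for illustration; not part of senzing-tools.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg}"]
    # Assumption: a record must be a JSON object, since it needs key-value pairs.
    if not isinstance(record, dict):
        return ["line is not a JSON object"]
    return [f"missing key: {key}" for key in REQUIRED_KEYS if key not in record]
```

Returning a list of problems rather than a boolean makes it easy to report every issue on a line at once.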
The file is given to validate with the command-line parameter input-url or
as the environment variable SENZING_TOOLS_INPUT_URL. Note that this is a URL, so
local files need the file:// prefix and remote files need http:// or https://. If
the given file has the .gz extension, it is treated as a compressed
JSONL file. If the file has a .jsonl extension, it is treated as a JSONL file.
If the file has any other extension, it is rejected unless
input-file-type or SENZING_TOOLS_INPUT_FILE_TYPE is set to JSONL.
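As a rough illustration of the rules above, the following Python sketch decides how an input URL would be handled. It is an approximation written from this description, not the tool's own logic; the function name classify_input, the returned labels, and the case-insensitive comparisons are assumptions.

```python
import posixpath
from urllib.parse import urlparse

# Schemes named in this README; anything else is rejected in this sketch.
SUPPORTED_SCHEMES = {"file", "http", "https"}


def classify_input(url: str, file_type: str = "") -> str:
    """Return "gzip-jsonl" or "jsonl", or raise ValueError if the input is rejected.

    Hypothetical helper. file_type stands in for the value of
    input-file-type / SENZING_TOOLS_INPUT_FILE_TYPE.
    """
    parsed = urlparse(url)
    if parsed.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported or missing URL scheme: {url!r}")
    # Assumption: extensions and the JSONL value are compared case-insensitively.
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension == ".gz":
        return "gzip-jsonl"
    if extension == ".jsonl" or file_type.upper() == "JSONL":
        return "jsonl"
    raise ValueError(f"unrecognized file extension: {extension!r}")
```

For example, a bare path such as /tmp/data.jsonl is rejected in this sketch because it has no scheme, while file:///tmp/data.jsonl is accepted.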
validate is intended as a starting point for other validation needs. It
should be fairly straightforward to extend it to test other JSON objects or
to support other file types.
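One possible way to structure such an extension is to express each rule as a small function and run a list of them over every record. The Python sketch below shows the idea only; every name in it is hypothetical, and the non-empty DATA_SOURCE rule is an invented example of an extra check, not something validate enforces.

```python
import json
from typing import Callable, Iterable, Iterator, Optional

# A check takes a parsed record and returns a problem message, or None if it passes.
Check = Callable[[dict], Optional[str]]


def require_key(key: str) -> Check:
    """Build a check that fails when the key is absent."""
    def check(record: dict) -> Optional[str]:
        return None if key in record else f"missing key: {key}"
    return check


def non_empty_string(key: str) -> Check:
    """Build a check (invented for this example) that a present key holds text."""
    def check(record: dict) -> Optional[str]:
        value = record.get(key)
        if key in record and not (isinstance(value, str) and value.strip()):
            return f"{key} must be a non-empty string"
        return None
    return check


def validate_lines(lines: Iterable[str], checks: list[Check]) -> Iterator[tuple[int, str]]:
    """Yield (line_number, problem) for every failed check."""
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            yield number, "invalid JSON"
            continue
        if not isinstance(record, dict):
            yield number, "not a JSON object"
            continue
        for check in checks:
            problem = check(record)
            if problem:
                yield number, problem
```

Adding a new rule then means writing one more function and appending it to the list passed to validate_lines, without touching the parsing loop.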
- The validate command is installed with the senzing-tools suite of tools. See senzing-tools install.
senzing-tools validate [flags]
- For options and flags:
  - Runtime documentation:

        senzing-tools validate --help

- In addition to the following simple usage examples, there are additional Examples.
- ✏️ Specify file URL using command line option. Example:

      senzing-tools validate \
          --input-url https://public-read-access.s3.amazonaws.com/TestDataSets/SenzingTruthSet/truth-set-3.0.0.jsonl

- See Parameters for additional parameters.
- ✏️ Specify file URL using environment variable. Example:

      export SENZING_TOOLS_INPUT_URL=https://public-read-access.s3.amazonaws.com/TestDataSets/SenzingTruthSet/truth-set-3.0.0.jsonl
      senzing-tools validate

- See Parameters for additional parameters.
This usage shows how to validate a file with a Docker container.
- ✏️ Run senzing/senzing-tools. Example:

      docker run \
          --env SENZING_TOOLS_COMMAND=validate \
          --env SENZING_TOOLS_INPUT_URL=https://public-read-access.s3.amazonaws.com/TestDataSets/SenzingTruthSet/truth-set-3.0.0.jsonl \
          --rm \
          senzing/senzing-tools

- See Parameters for additional parameters.
- SENZING_TOOLS_INPUT_FILE_TYPE
- SENZING_TOOLS_INPUT_URL
- SENZING_TOOLS_JSON_OUTPUT
- SENZING_TOOLS_LOG_LEVEL
- SDK documentation
- Development
- Errors
- Examples
- Package reference
- Related artifacts: