
Generate pseudonymous identifiers for individuals using tabular data


Introduction

This is the core repository for the Common Identifier application, containing all configuration, file handling, and data processing components.

This repository is designed for programmatic use with file-based data; for use via a UI application, please refer to the Common Identifier Application repository.

Repo Structure

📦common-identifier-algorithm-shared
 ┣ 📂src
 ┃ ┣ 📂config     # functions related to the handling of configuration files
 ┃ ┣ 📂decoding   # reading and decoding files - CSV or XLSX
 ┃ ┣ 📂encoding   # encoding and writing files - CSV or XLSX
 ┃ ┣ 📂hashing    # base hashing logic and supporting utilities
 ┃ ┣ 📂processing # main API into the backend logic
 ┃ ┗ 📂validation # validation logic and wrappers
 ┗ 📂tests        # tests for all components

Usage

import { join } from 'node:path';
// NOTE: the import paths below are illustrative; adjust them to the package's actual entry points.
import { loadConfig } from './src/config';
import { preprocessFile, processFile } from './src/processing';
import { makeHasher } from './src/hashing';

const REGION = 'NWS';
const CONFIG_PATH = join(__dirname, './config.toml');

const INPUT_PATH = join(__dirname, 'files', 'input_data.csv');
const OUTPUT_PATH = join(__dirname, 'output', 'output.csv');
const VALIDATION_ERRORS_PATH = join(__dirname, 'output', 'validation_errors.csv');

// load configuration file
const configLoadResult = loadConfig(CONFIG_PATH, REGION);
if (!configLoadResult.success) throw new Error('unable to load configuration file.');

const config = configLoadResult.config;

// validate the input file against all configured validation rules.
const preprocessResult = await preprocessFile({
  config: config,
  inputFilePath: INPUT_PATH,
  errorFileOutputPath: VALIDATION_ERRORS_PATH,
  limit: 2,
});

if (!preprocessResult.isValid)
  throw new Error('Validation errors found in input file, review error file output.');

// process the input file and write the generated identifiers to the output file.
const processFileResult = await processFile({
  config: config,
  inputFilePath: INPUT_PATH,
  outputPath: OUTPUT_PATH,
  hasherFactory: makeHasher,
  limit: 2,
});
// print the result, save the result, etc.
console.dir(processFileResult, { depth: 3 });

🧪 Data Processing Pipeline (file-based data)

Configuration

  • The ConfigStore (src/config/ConfigStore) attempts to load the configuration from the application directory, falling back to the backup location (the app bundle) if the primary configuration fails to load. It also updates the user configuration file on config changes (a sketch of the resulting configuration shape is shown after this list).
  • The terms & conditions and the window placement / size are also handled by the ConfigStore, using the application config save/write process in src/config/appConfig.
  • See docs/configuration-files.md for more information.
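
The TOML sections referenced in this document ([source] via config.source, plus [destination], [destination_map], and [destination_errors]) suggest a parsed configuration shape roughly like the hypothetical TypeScript sketch below. The type and field names (ColumnConfig, columns, alias) are assumptions for illustration, not the library's actual types; see docs/configuration-files.md for the real schema.

// Hypothetical shape of a parsed configuration object.
interface ColumnConfig {
  name: string;   // canonical column name used throughout the pipeline
  alias?: string; // source-file header that is renamed to `name` during decoding
}

interface CommonIdentifierConfig {
  source: { columns: ColumnConfig[] };             // [source]: input file layout
  destination: { columns: ColumnConfig[] };        // [destination]: main output layout
  destination_map: { columns: ColumnConfig[] };    // [destination_map]: mapping-document output
  destination_errors: { columns: ColumnConfig[] }; // [destination_errors]: validation error output
}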

Pre-processing (validation)

  • The Decoders (src/decoding, CSV and XLSX) read the source file and convert it, using the config.source setup, to a Document containing the input data with column aliases renamed.
  • The pre-processing function (src/processing) determines whether the target is a mapping document, based on the current configuration and the data in the file, and sets up validation accordingly.
  • The Validators (src/validation) are set up based on the active configuration and run against the Document (a sketch of the validator contract is shown after this list).
  • If there are errors, the Encoders (src/encoding, CSV and XLSX) write the validation error output based on the [destination_errors] section of the active configuration.
  • The frontend shows the results and either allows processing or shows the errors.
  • See docs/validators.md for more information.
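
As a rough illustration of the validation step described above, the TypeScript sketch below runs a set of per-column rules against a decoded Document. The interface and result shapes are hypothetical assumptions for illustration; the actual validator API is documented in docs/validators.md.

// Hypothetical shapes: the real definitions live in src/validation.
interface ValidationError {
  row: number;     // row index within the decoded Document
  column: string;  // column the rule was applied to
  message: string; // explanation written to the validation error file
}

interface Validator {
  column: string;                           // column this rule inspects
  validate(value: unknown): string | null;  // error message, or null when valid
}

// Run every configured validator against every row of the Document.
function validateDocument(
  rows: Record<string, unknown>[],
  validators: Validator[],
): ValidationError[] {
  const errors: ValidationError[] = [];
  rows.forEach((row, i) => {
    for (const v of validators) {
      const message = v.validate(row[v.column]);
      if (message) errors.push({ row: i, column: v.column, message });
    }
  });
  return errors;
}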

Processing

  • The Decoders (src/decoding, CSV and XLSX) read the source file and convert it, using the config.source setup, to a Document containing the input data with column aliases renamed.
  • The processing function (src/processing) determines whether the target is a mapping document, based on the current configuration and the data in the file. Using the active configuration, it collects each row's data into static, to_translate, and reference buckets and passes them to the active algorithm for processing.
  • The active algorithm takes the per-row { static: [...], to_translate: [...], reference: [...] } data and returns a map of the columns it wants to add, e.g. { USCADI: "....", DOCUMENT_HASH: "..." } (a sketch of this contract is shown after this list).
  • The data returned by the algorithm is merged into the source rows so that the encoders can package multiple different outputs.
  • The Encoders (src/encoding, CSV and XLSX) write the output based on the relevant [destination] / [destination_map] section of the active configuration.
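
To make the algorithm contract concrete, here is a hedged TypeScript sketch. The bucket names and the output columns (USCADI, DOCUMENT_HASH) come from the description above; the type names and the SHA-256 hashing are illustrative assumptions, not the library's actual implementation (the real hashing logic lives in src/hashing and is supplied via a hasher factory such as makeHasher).

import { createHash } from 'node:crypto';

// Hypothetical types: the real definitions live in src/processing.
interface AlgorithmInput {
  static: string[];       // values passed through unchanged
  to_translate: string[]; // values combined into the pseudonymous identifier
  reference: string[];    // values hashed separately, e.g. for DOCUMENT_HASH
}

type AlgorithmOutput = Record<string, string>; // columns merged back into the row

function exampleAlgorithm(input: AlgorithmInput): AlgorithmOutput {
  // Illustrative only: a real algorithm is configuration-driven and uses
  // the hashing utilities from src/hashing.
  const hash = (parts: string[]) =>
    createHash('sha256').update(parts.join('|')).digest('hex');
  return {
    USCADI: hash(input.to_translate),
    DOCUMENT_HASH: hash(input.reference),
  };
}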
