povimd9 edited this page May 14, 2023 · 14 revisions

Usage

Getting Started

To get started with FileChampion4j, you will need to:

  1. Import the library.
  2. Create a configuration file.
  3. Create a FileValidator object.
  4. Validate a file.

Importing FileChampion4j

To import FileChampion4j, use a package manager that supports Maven Central, download a compiled JAR from a release branch, or compile from source. The latest version can be found under the 'release-*' branches, which contain working code as well as compiled slim and fat JAR files.

Import from Maven Central

FileChampion is hosted on Maven Central. To import it using Apache Maven, add the following section to pom.xml:

<dependency>
    <groupId>dev.filechampion</groupId>
    <artifactId>filechampion4j</artifactId>
    <version>0.9.8.3</version>
</dependency>

Build from source

To compile from source, clone the repo and run 'mvn clean install'.

Use as JAR

JAR files, in slim and fat variants, can be found under 'release-*/target/'.

JAR resources are signed, and their integrity should be validated.

Importing Library

Once the JAR is added to the project libraries, import FileValidator and ValidationResponse as follows:

import dev.filechampion.filechampion4j.FileValidator;
import dev.filechampion.filechampion4j.ValidationResponse;

Configuring

A JSON configuration must be defined and loaded before initializing the FileValidator class. For configuration details, see FileChampion Configurations.

The JSON object is loaded into memory once at initialization and used throughout the lifecycle of the FileValidator instance.

Creating a Validator Object

The FileValidator class expects the FileChampion configuration as an org.json.JSONObject. Initialize it as follows:

// Create a new FileValidator object
FileValidator validator = new FileValidator(jsonObject);

Validating Files

FileValidator.validateFile() performs validation actions on a target file. The method is overloaded, and the arguments passed determine the behavior of the validation flow.

FileValidator.validateFile() Options

validateFile (String fileCategory, byte[] originalFile, String fileName)

  • Validate file from byte array without storing file, returning only validation status.

validateFile (String fileCategory, byte[] originalFile, String fileName, Path outputDir)

  • Validate file from byte array and store file, returning validation status, and new file path.

validateFile (String fileCategory, byte[] originalFile, String fileName, String mimeString)

  • Validate file from byte array without storing file, providing MIME type string as part of validation, returning validation status.
  • Providing the MIME type as part of the method call is recommended for HTTP uploads, as it avoids the overhead of guessing the file's MIME type during the validation flow.

validateFile (String fileCategory, byte[] originalFile, String fileName, Path outputDir, String mimeString)

  • Validate file from byte array and store file, providing MIME type string as part of validation, returning validation status.
  • Providing the MIME type as part of the method call is recommended for HTTP uploads, as it avoids the overhead of guessing the file's MIME type during the validation flow.

validateFile (String fileCategory, Path filePath, String fileName)

  • Validate file from file path without storing file, returning only validation status.

validateFile (String fileCategory, Path filePath, String fileName, Path outputDir)

  • Validate file from file path and store file, returning validation status, and new file path.

validateFile (String fileCategory, Path filePath, String fileName, String mimeString)

  • Validate file from file path without storing file, providing MIME type string as part of validation, returning validation status.
  • Providing the MIME type as part of the method call is recommended for HTTP uploads, as it avoids the overhead of guessing the file's MIME type during the validation flow.

validateFile (String fileCategory, Path filePath, String fileName, Path outputDir, String mimeString)

  • Validate file from file path and store file, providing MIME type string as part of validation, returning validation status.
  • Providing the MIME type as part of the method call is recommended for HTTP uploads, as it avoids the overhead of guessing the file's MIME type during the validation flow.

The response of validateFile() is a ValidationResponse object, which contains:

  • @param isValid (Boolean) true if the file is valid, false otherwise
  • @param resultsInfo (String) a String containing the result summary of the validation
  • @param resultsDetails (String) a String containing the details of the validation
  • @param cleanFileName (String) the file name with all special characters replaced with underscores
  • @param fileBytes (byte[]) the file bytes
  • @param fileChecksums (Map<String, String>) hash map containing the file checksums as 'algorithm' => 'checksum'
  • @param validFilePath (String) optional path of the stored file, set when outputDir was passed to validateFile()
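
To illustrate what the cleanFileName field describes, the following is a minimal sketch of replacing special characters with underscores. It is not FileChampion's implementation: the class name FileNameSketch is hypothetical, and the choice of which characters count as "special" (here, anything other than ASCII letters, digits, '.', '-' and '_') is an assumption, since this page does not state the exact rule.

```java
import java.util.regex.Pattern;

final class FileNameSketch {
    // Assumption: a character is "special" unless it is an ASCII letter,
    // a digit, a dot, a hyphen, or an underscore.
    private static final Pattern SPECIAL = Pattern.compile("[^A-Za-z0-9._-]");

    // Replaces every special character with a single underscore.
    static String cleanFileName(String fileName) {
        return SPECIAL.matcher(fileName).replaceAll("_");
    }

    public static void main(String[] args) {
        // prints "my_report__final__2.pdf"
        System.out.println(cleanFileName("my report (final)#2.pdf"));
    }
}
```

When storing or logging validated files, prefer the cleanFileName value returned in the ValidationResponse over a locally computed name, since the library's own rule may differ from this sketch.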

Note!

The processed file might be the original file that passed validation, but it can also be a NEW FILE produced by configured plugin steps.

// Validate a file assumed to be in the "Documents" category of the configuration, and save it to 'outDir' if validation passes.
ValidationResponse fileValidationResults = validator.validateFile("Documents", fileInBytes, inFile.getName(), outDir);

// Validate a file assumed to be in the "Documents" category of the configuration with MIME type application/pdf, and save it to 'outDir' if validation passes.
ValidationResponse fileValidationResults = validator.validateFile("Documents", fileInBytes, inFile.getName(), outDir, "application/pdf");
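
When the MIME string comes from an HTTP upload, the Content-Type header value may carry parameters such as a charset. The sketch below reduces such a header value to a bare type/subtype string before it is passed as mimeString. This is a precaution rather than a documented requirement: the class MimeHeaderSketch and the method bareMimeType are hypothetical, and this page does not say whether FileChampion itself tolerates parameters or mixed case.

```java
import java.util.Locale;

final class MimeHeaderSketch {
    // Returns the lower-cased "type/subtype" part of a Content-Type header
    // value, or null when the header is missing or has no type part.
    static String bareMimeType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return null;
        }
        // Everything after the first ';' is a parameter (e.g. charset).
        int semicolon = contentTypeHeader.indexOf(';');
        String bare = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        bare = bare.trim().toLowerCase(Locale.ROOT);
        return bare.isEmpty() ? null : bare;
    }

    public static void main(String[] args) {
        // prints "application/pdf"
        System.out.println(bareMimeType("Application/PDF; charset=binary"));
    }
}
```

If the helper returns null, one option is to fall back to a validateFile() overload without mimeString.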

Performance Considerations

FileChampion offers several approaches to file validation, supporting a wide range of use cases. Some validations provide better accuracy at the cost of a slower validation process. Below are some accuracy/performance considerations when using FileChampion:

  • Providing a file path or output directory to the validation method requires disk I/O operations, impacting performance.
  • Guessing the MIME string is an expensive operation that requires writing to disk for accuracy, yet yields low-confidence results. This is because MIME is defined as metadata of an HTTP request, email content, etc. As such, it is not advised to define MIME validation for files that do not originate from objects defined by the MIME and S/MIME RFCs.
    • When validating objects relevant for MIME validation, the MIME string from the original request should be passed to FileChampion as part of validation.
  • While a file checksum is important for integrity, scanning, and tracking of processed files, calculating a hash of the file bytes is an expensive operation whose cost grows with file size.
    • FileChampion's checksum implementation supports concurrent processing of hashes for files larger than 2MB, chunking them by 3MB for concurrent processing.
    • "add_checksum": false can be defined for specific file extensions in the JSON options, so that checksum calculation is skipped for files meeting some criteria.
  • Plugin performance depends directly on the processes the plugin executes. As such, defining a relevant 'timeout' and testing plugins for performance impact is critical for reliable estimates.
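
To get a feel for how checksum cost grows with file size, the following sketch hashes byte arrays of different sizes using only the JDK's MessageDigest. It is not FileChampion's checksum implementation: the class name ChecksumSketch is hypothetical, the hashing here is sequential rather than concurrent, and the 3MB buffer merely echoes the chunk size mentioned above. This page does not confirm that the value it produces matches the checksum FileChampion reports for chunked files.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ChecksumSketch {
    // Illustrative chunk size only; it does not change the resulting hash.
    private static final int CHUNK_SIZE = 3 * 1024 * 1024;

    // Computes the SHA-256 of the data as lowercase hex, feeding the digest chunk by chunk.
    static String sha256Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException("SHA-256 not available", e);
        }
        for (int offset = 0; offset < data.length; offset += CHUNK_SIZE) {
            int length = Math.min(CHUNK_SIZE, data.length - offset);
            digest.update(data, offset, length);
        }
        StringBuilder hex = new StringBuilder(64);
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc".getBytes(StandardCharsets.UTF_8)));
        // Time the hash for growing inputs; absolute numbers depend on the machine.
        for (int megabytes : new int[] {1, 8, 64}) {
            byte[] data = new byte[megabytes * 1024 * 1024];
            long start = System.nanoTime();
            sha256Hex(data);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(megabytes + "MB hashed in " + micros + " microseconds");
        }
    }
}
```

Running a measurement like this on representative file sizes can help decide whether setting "add_checksum": false for some file extensions is worthwhile.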

Usage Example

The following example shows a simple use of FileChampion on a local PDF file, including saving the file if validation passes.

Steps:

  1. Define the example file locations (PDF file from 'samples/In', JSON configuration from 'config/config.json', output directory 'samples/Out')
  2. Initialize a FileValidator object named 'validator' with the JSONObject
  3. Read pdf file bytes and get name
  4. Validate file as a 'Documents' type
  5. Print valid/invalid results and exit

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.json.JSONObject;
import dev.filechampion.filechampion4j.FileValidator;
import dev.filechampion.filechampion4j.ValidationResponse;

public class Main {
    public static void main(String[] args) {
        // Path to the file to be validated in this simple example
        File pdfFile = new File("samples/In/test.pdf");

        // Path to the config.json file
        String configPath = "config/config.json";

        // Placeholders for the JSON object and the file in bytes
        JSONObject jsonObject = null;
        byte[] fileInBytes = null;
        FileValidator validator = null;

        // Path to the output directory
        Path outDir = Paths.get("samples", "Out");

        // Create a new FileValidator object with json config file
        try {
            // Read the JSON object from the config.json file
            String jsonConfigContent = new String(Files.readAllBytes(Paths.get(configPath)));
            jsonObject = new JSONObject(jsonConfigContent);

            // Create a new FileValidator object
            validator = new FileValidator(jsonObject);
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (IllegalArgumentException e) {
            System.out.println("Error creating FileValidator object");
            System.exit(1);
        } catch (Exception e) {
            System.out.println("Error reading config file");
            System.exit(1);
        }

        try {
            // Read the file to be validated into a byte array
            fileInBytes = Files.readAllBytes(pdfFile.toPath());

            // Validate the file
            ValidationResponse fileValidationResults = validator.validateFile("Documents", fileInBytes, pdfFile.getName(), outDir);

            // Check if the file is valid
            if (fileValidationResults.isValid()) {
                // Print the results if the file is valid
                String validMessage = String.format("%s is a valid document file.%n New file: %s, SHA-256 Checksum: %s",
                        fileValidationResults.resultsInfo(),
                        fileValidationResults.getValidFilePath().length == 0 ? "" : fileValidationResults.getValidFilePath()[0],
                        fileValidationResults.getFileChecksums().get("SHA-256"));
                System.out.println(validMessage);
                System.exit(0);
            } else {
                // Print the results if the file is invalid
                System.out.println(pdfFile.getName() + " is not a valid document file because " + fileValidationResults.resultsInfo());
                System.exit(0);
            }
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1);
        }
    }
}