From b95c8226351be913f98243980d424a70356247e4 Mon Sep 17 00:00:00 2001
From: Konstantin Baierer
Date: Fri, 6 Dec 2019 17:41:39 +0100
Subject: [PATCH 1/2] no multi-page TIFF, OCR-D/core#243

---
 CHANGELOG.md | 4 ++++
 mets.md      | 8 ++++++++
 2 files changed, 12 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 66518b0..1857239 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,10 @@ Versioned according to [Semantic Versioning](http://semver.org/).

 ## Unreleased

+Added:
+
+  * No multi-page TIFF, #132
+
 ## [3.4.0] - 2019-11-05

 Fixed
diff --git a/mets.md b/mets.md
index 77e3105..4e8d11c 100644
--- a/mets.md
+++ b/mets.md
@@ -34,6 +34,14 @@ However, since technical metadata about pixel density is so often lost in
 conversion or inaccurate, processors should assume **300 ppi** for images with
 missing or suspiciously low pixel density metadata.

+## No multi-page images
+
+Image formats like TIFF support encoding multiple images in a single file.
+
+Data providers SHOULD provide single-image TIFF.
+
+OCR-D processors MUST ignore any additional images beyond the first.
+
 ## Unique ID for the document processed

 METS provided to the MP must be uniquely addressable within the global library community.

From 0be31eb0c64f5b1a83e0aeecdd3eb51abd950d39 Mon Sep 17 00:00:00 2001
From: Konstantin Baierer
Date: Tue, 17 Dec 2019 16:19:20 +0100
Subject: [PATCH 2/2] multi-image tiff: completely reject multi-image TIFF

---
 mets.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mets.md b/mets.md
index 4e8d11c..00c0a3c 100644
--- a/mets.md
+++ b/mets.md
@@ -38,9 +38,9 @@ missing or suspiciously low pixel density metadata.

 Image formats like TIFF support encoding multiple images in a single file.

-Data providers SHOULD provide single-image TIFF.
+Data providers MUST provide single-image TIFF files.

-OCR-D processors MUST ignore any additional images beyond the first.
+OCR-D processors MUST raise an exception if they encounter multi-image TIFF files.

 ## Unique ID for the document processed
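
Note on the rule introduced in PATCH 2/2: one way a processor could detect a multi-image TIFF is to walk the chain of image file directories (IFDs) in the file and count them. The following is only a minimal sketch under stated assumptions, not how OCR-D/core implements the check. The names `count_tiff_images`, `assert_single_image_tiff`, and `MultiImageTiffError` are hypothetical, only classic TIFF (magic number 42) is handled, BigTIFF is rejected, and sub-IFDs are not followed.

```python
import struct


class MultiImageTiffError(ValueError):
    """Hypothetical exception type; the spec text only says "raise an exception"."""


def count_tiff_images(data: bytes) -> int:
    """Count the top-level IFDs (one per image) in a classic TIFF byte string."""
    if len(data) < 8:
        raise ValueError("too short to be a TIFF file")
    # Bytes 0-1 give the byte order: "II" is little-endian, "MM" is big-endian.
    endian = {b"II": "<", b"MM": ">"}.get(data[:2])
    if endian is None:
        raise ValueError("not a TIFF file")
    # Bytes 2-3 hold the magic number, bytes 4-7 the offset of the first IFD.
    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic number 42) is handled here")

    count = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain contains a loop")
        seen.add(offset)
        if offset + 2 > len(data):
            raise ValueError("truncated IFD")
        # Each IFD is: 2-byte entry count, 12 bytes per entry, 4-byte next offset.
        (num_entries,) = struct.unpack_from(endian + "H", data, offset)
        next_pos = offset + 2 + 12 * num_entries
        if next_pos + 4 > len(data):
            raise ValueError("truncated IFD")
        (offset,) = struct.unpack_from(endian + "I", data, next_pos)
        count += 1
    return count


def assert_single_image_tiff(data: bytes) -> None:
    """Raise MultiImageTiffError if the TIFF holds more than one image."""
    num_images = count_tiff_images(data)
    if num_images > 1:
        raise MultiImageTiffError(
            f"multi-image TIFF not allowed: found {num_images} images"
        )
```

A processor might call `assert_single_image_tiff(path.read_bytes())` before decoding an input file. Note that some TIFF writers store thumbnails or reduced-resolution copies as additional IFDs; this sketch treats those as additional images too, which may or may not match the intent of the rule.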