Zimit2: Allow deduplication of entries #199
Zimcheck reports quality issues in most (all?) Zimit2 files.
It already did so for Zimit1 files, but maybe it is time to address the problems.
The first obvious problem is that lots of content is duplicated inside the ZIM due to different URLs leading to the same content. I think this could be pretty easily addressed (even if it clearly means additional processing to deduplicate).
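To make the idea concrete, here is a minimal sketch of content-based deduplication, assuming the entries are available as `(url, payload)` pairs before the ZIM is written. The function name `deduplicate` and the `items`/`aliases` split are hypothetical and are not part of Zimit or libzim; how an alias would actually be stored in the ZIM (redirect, shared blob, ...) is left open.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def deduplicate(
    entries: Iterable[Tuple[str, bytes]],
) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """Split (url, payload) pairs into unique items and aliases.

    Hypothetical helper: the first URL seen for a given payload keeps the
    content, and every later URL with identical bytes becomes an alias
    pointing at that first URL.
    """
    canonical_by_digest: Dict[str, str] = {}  # digest -> first URL seen
    items: Dict[str, bytes] = {}              # URLs that store content
    aliases: Dict[str, str] = {}              # duplicate URL -> canonical URL

    for url, payload in entries:
        digest = hashlib.sha256(payload).hexdigest()
        first_url = canonical_by_digest.setdefault(digest, url)
        if first_url == url:
            items[url] = payload
        else:
            aliases[url] = first_url
    return items, aliases


if __name__ == "__main__":
    items, aliases = deduplicate(
        [
            ("en/style.css", b"body{}"),
            ("fr/style.css", b"body{}"),
            ("en/index.html", b"<p>hi</p>"),
        ]
    )
    print(sorted(items), aliases)
```

The additional processing is one hash per entry plus a dictionary lookup, so the cost grows linearly with the amount of content.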
For a website like solar.lowtechmagazine.com, which is available in multiple languages, it could even make a significant difference in final file size. I am not sure compression manages to cancel out duplicated content like this; at least some people say it cannot, e.g. https://superuser.com/a/479083.
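The compression question can be checked in isolation with a small experiment. This is only a sketch using Python's standard `zlib` and `lzma` modules on synthetic random data; it does not model how ZIM files group and compress content, so it says nothing definitive about real ZIM sizes. The helper names are made up for the example.

```python
import lzma
import random
import zlib


def duplicated_payload(size: int = 100_000, seed: int = 0) -> bytes:
    """Return an incompressible random block followed by an exact copy."""
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(size))
    return block + block


def compressed_sizes(data: bytes) -> dict:
    """Compress the same bytes with a small-window and a large-window codec."""
    return {
        "raw": len(data),
        # DEFLATE can only look back 32 KiB, so a copy that starts
        # 100 000 bytes later is out of reach.
        "zlib": len(zlib.compress(data, 9)),
        # The default xz preset uses a multi-megabyte dictionary, so the
        # second copy can be encoded as references to the first.
        "lzma": len(lzma.compress(data)),
    }


if __name__ == "__main__":
    for name, size in compressed_sizes(duplicated_payload()).items():
        print(f"{name}: {size} bytes")
```

Whether a compressor removes duplicates therefore depends on how far apart the copies are relative to its window, which is why explicit deduplication may still be worth the effort.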