A better page number-crop algorithm. (Manga) #709

Open: wants to merge 4 commits into base: master

neyney10

Summary

  1. I wrote a new method to crop pages, as the current one fails.
  2. Attached some cool gifs.

Motivation

E-reader screen real estate is an important resource.
More available screen size means more details can be better seen, especially text.
Text is one of the most important elements that must stay clearly readable on e-readers,
which are mostly small devices where having to zoom is undesirable.

Cropping the page number at the bottom of the page regains 2%-5% of the page height,
which allows the image to be upscaled further.

  • Most of the time the screen height is the limiting factor in upscaling, rather than its width.
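
To make the height argument concrete, here is a small worked example. It is only an illustration, not code from this PR: the screen and page dimensions are made-up numbers, and `fit_scale` is a hypothetical helper name.

```python
def fit_scale(page_w, page_h, screen_w, screen_h):
    """Largest uniform scale at which the page still fits on the screen."""
    return min(screen_w / page_w, screen_h / page_h)


# Hypothetical numbers: a 1072x1448 screen and a 1400x2000 page scan.
screen_w, screen_h = 1072, 1448
page_w, page_h = 1400, 2000

# Before cropping, the height is the limiting dimension.
before = fit_scale(page_w, page_h, screen_w, screen_h)

# Crop 4% of the height (the strip that holds the page number).
cropped_h = round(page_h * 0.96)
after = fit_scale(page_w, cropped_h, screen_w, screen_h)

gain = after / before - 1
print(f"scale before: {before:.3f}, after: {after:.3f}, gain: {gain:.1%}")
```

With these assumed numbers the page is height-limited both before and after the crop, so removing 4% of the height translates into a gain of roughly 4.2% in scale.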

The Current Algorithm Fails

While it is possible to adjust the 'power' of the cropping, I only get some success when the power is at least 4 (using the power user EXE), whereas the GUI itself only allows up to 2.

The current algorithm fails in all of the cases that I have at hand.

  • I also don't want to crop the page number if doing so would crop some manga content. Some high power levels cropped some of my pages too much.
  • But I do allow some cropping at higher power levels, to achieve a tighter bounding box around the main manga content.

A Better Algorithm

I wrote a Python function using PIL and NumPy. OpenCV might perform faster, but I fell back to PIL because it already exists as a dependency, and including OpenCV is heavy in terms of package size.

OUTPUT EXAMPLES OF THE NEW ALGORITHM

(I extended the power level limit to 3, from the previous limit of 2)

Page Number Cropping

[Images: ex1h001, ex5ga002]

Showcasing Its Ability To Retain Manga Content

It does not crop the page number if it detects other things that might be cropped along with the page number.

[Images: ex8n003, ex2h010, ex3b001, ex4bl001, ex6h006, ex7h009]
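
The conservative behavior described above can be sketched roughly as follows. This is not the PR's implementation, just a minimal pure-Python illustration of the idea, assuming the page is given as rows of grayscale values. The function name `find_crop_row` and all thresholds are hypothetical.

```python
def find_crop_row(page, dark=128, max_band_frac=0.05,
                  max_width_frac=0.15, min_gap=3):
    """Return the row at which to cut the page bottom, or len(page) for no cut.

    page: list of rows, each a list of grayscale values (0 = black, 255 = white).
    All thresholds are illustrative assumptions, not values from KCC.
    """
    height, width = len(page), len(page[0])

    def ink_columns(row):
        return [x for x, value in enumerate(row) if value < dark]

    # 1. Skip the blank margin at the very bottom.
    y = height - 1
    while y >= 0 and not ink_columns(page[y]):
        y -= 1
    if y < 0:
        return height  # blank page, nothing to crop
    band_bottom = y

    # 2. Walk up through the candidate band, tracking its horizontal extent.
    left, right = width, -1
    while y >= 0:
        cols = ink_columns(page[y])
        if not cols:
            break
        left, right = min(left, cols[0]), max(right, cols[-1])
        y -= 1
    band_top = y + 1

    # 3. A band that is too tall or too wide is probably artwork: do not crop.
    if band_bottom - band_top + 1 > max_band_frac * height:
        return height
    if right - left + 1 > max_width_frac * width:
        return height

    # 4. Require a clean gap between the band and the content above it.
    gap = 0
    while y >= 0 and not ink_columns(page[y]):
        gap += 1
        y -= 1
    if gap < min_gap:
        return height
    return band_top


# Synthetic 100x100 page: artwork in rows 10-79, a small "page number" in rows 90-92.
page = [[255] * 100 for _ in range(100)]
for y in range(10, 80):
    page[y][5:95] = [0] * 90
for y in range(90, 93):
    page[y][48:53] = [0] * 5

cut = find_crop_row(page)
cropped = page[:cut]
print(cut, len(cropped))
```

The key design choice in this sketch is that every failed check returns the full height, so when in doubt the page is left untouched rather than risking a cut through manga content.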

Issues

  • My new algorithm uses the 'power' parameter differently than the other algorithm that simply crops the margins without the page number. A 'power level' in "Crop margins" is different than a 'power level' in "Crop margins & page numbers". This might lead to inconsistent expectations from the user and confuse them.

@axu2
Collaborator

axu2 commented Jun 20, 2024

Wow this looks incredible, might take me a while to fully review it though.

For reference, I mostly use a 10" Kindle Scribe, which doesn't need zooming or cropping at all. But when I use smaller devices, like an Android Hisense A9 e-ink phone, cropping is essential.

Have you looked into any Android manga reading software? I bet they might like a feature like this, and maybe they have something similar.

And what platform did you develop on? I use macOS (arm64/Apple silicon).

And do you have a general idea of the performance difference and how much it increases the binary size? That is, the output of

'python setup.py build_binary'

@neyney10
Author

> Wow this looks incredible, might take me a while to fully review it though.
> For reference, I mostly use a 10" Kindle Scribe, which doesn't need zooming or cropping at all. But when I use smaller devices, like an Android Hisense A9 e-ink phone, cropping is essential.

I hope it actually works for others; I tested it on 30+ images from 5-6 different manga. I want to try it on some more manga before I continue.

> Have you looked into any Android manga reading software? I bet they might like a feature like this, and maybe they have something similar.

Very shallowly. I searched Google for existing manga or comic crop algorithms, and even for general manga reader apps, but couldn't find anything meaningful (aside from the standard cropping). I usually read on my PC or on my first-gen Kindle Paperwhite 6" that I used to have (so you can understand my constraint here).

Recently, my Kindle's screen shattered and I've ordered a new e-reader with a 7.8" screen. I've yet to receive it so in the meantime I thought I would at least try to optimize the Mangas that I plan to read/load when I get it. Hence, I tried to write this algorithm.

> And what platform did you develop on? I use macOS (arm64/Apple silicon).

I use an old Windows 10 laptop (Intel x86-64).

> And do you have a general idea of the performance difference and how much it increases the binary size? That is, the output of
> 'python setup.py build_binary'

TBH, I forgot to measure the binary, so I ran it just now to see.
It does increase the EXE file considerably: by about 20 MB, from 48.3 MB to 68 MB, so NumPy accounts for roughly 30% of the new size. The only package that I've added is NumPy. I'll see how I can optimize it. In the worst case, I can re-implement it in basic Python math, probably with some performance hit, although I didn't use anything complicated that necessitates NumPy.

Regarding performance, to my surprise, it didn't suffer much of a hit. I ran several kinds of benchmarks:

  • On my PC the algorithm alone takes roughly 0.15 sec per image (when running it outside of KCC).
  • Inside KCC, a run with the new algorithm takes about the same time as the existing "crop margins" mode (which is similar to the existing "crop margins & page numbers" mode). On 10 chapters of 'Blame!' (300 MB, ~440 images) it takes about 6 minutes on my machine from clicking "start" to getting the output file. Hence I assume that the processing time of the algorithm is negligible compared to everything else that is happening (opening the file, extracting, reading images, upscaling, rotating, converting, building the book...).
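
For anyone who wants to reproduce a per-image measurement like the first one, a minimal timing helper could look like the sketch below. It is an assumption about how such a benchmark might be set up, not the script actually used here. `time_per_item` is a hypothetical name, and the lambda stands in for the real cropping function.

```python
import time


def time_per_item(func, items, repeats=3):
    """Average wall-clock seconds per item, taking the best of several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for item in items:
            func(item)
        best = min(best, time.perf_counter() - start)
    return best / len(items)


# Placeholder workload; in practice `func` would be the crop function
# and `items` the loaded page images.
seconds = time_per_item(lambda n: sum(range(n)), [10_000] * 50)
print(f"{seconds:.6f} s per item")
```

Taking the best of several runs reduces noise from disk caching and other processes, which matters on an older laptop.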

By the way,
I found a small bug in one of my functions, I'll upload a fix soon.
I think I might want to postpone the pull request a bit until I try it a bit more on other types of Mangas.

2. Add dependency to the setup.py: Numpy
@axu2
Collaborator

axu2 commented Jul 30, 2024

I don't know about having numpy replacement functions...

@neyney10
Author

While I agree, I thought we wanted to optimize the output file size (NumPy accounts for roughly 30% of it).
Regardless, I only used trivial NumPy functions, and on first inspection the performance of the non-NumPy replacement methods is on par (and sometimes better, in my quite shallow tests).
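
To show what a replacement for a trivial NumPy call can look like, here is a small sketch. These are not the functions from this PR: `row_means` and `first_dark_row_from_bottom` are hypothetical names, and the threshold is an assumed value.

```python
def row_means(rows):
    """Mean grayscale value of each row.

    The NumPy equivalent would be np.asarray(rows).mean(axis=1).
    """
    return [sum(row) / len(row) for row in rows]


def first_dark_row_from_bottom(rows, threshold=250):
    """Index of the lowest row whose mean is below `threshold`, or None."""
    means = row_means(rows)
    for y in range(len(means) - 1, -1, -1):
        if means[y] < threshold:
            return y
    return None


# Tiny 3x4 "image": only the middle row contains a dark pixel.
rows = [
    [255, 255, 255, 255],
    [0, 255, 255, 255],
    [255, 255, 255, 255],
]
print(row_means(rows))
print(first_dark_row_from_bottom(rows))
```

For small per-row reductions like this, plain Python avoids the cost of converting the image into an array, which may be one reason such replacements can end up on par.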

Do you want me to add the NumPy version back?

Unrelated - I've detected that the algorithm sometimes fails on thin fonts and single-digit page numbers. Also, there is an issue with using higher power levels that causes the page number to become distorted and the algorithm fails to detect it (currently I'm optimizing power=1 to work the best in most cases).

@zhaohengkang

That's great! Excuse me, does "Cropping Power: 2" in the cropping mode of KCC's current version correspond to the power 3 you're demonstrating?

@neyney10
Author

neyney10 commented Aug 1, 2024

> That's great! Excuse me, does "Cropping Power: 2" in the cropping mode of KCC's current version correspond to the power 3 you're demonstrating?

Hi! No, the power levels [0...3] that I'm using are entirely different from the power levels [0...2] that currently exist in KCC v6.1.0.

I've noted that in the issues at the bottom of the OP.

> My new algorithm uses the 'power' parameter differently than the other algorithm that simply crops the margins without the page number. A 'power level' in "Crop margins" is different than a 'power level' in "Crop margins & page numbers". This might lead to inconsistent expectations from the user and confuse them.

@zhaohengkang

> > That's great! Excuse me, does "Cropping Power: 2" in the cropping mode of KCC's current version correspond to the power 3 you're demonstrating?
>
> Hi! No, the power levels [0...3] that I'm using are entirely different from the power levels [0...2] that currently exist in KCC v6.1.0.
>
> I've noted that in the issues at the bottom of the OP.
>
> > My new algorithm uses the 'power' parameter differently than the other algorithm that simply crops the margins without the page number. A 'power level' in "Crop margins" is different than a 'power level' in "Crop margins & page numbers". This might lead to inconsistent expectations from the user and confuse them.

OK, so how should I use the algorithm you provided in KCC? I think it is fantastic, just what I needed. Or could you provide a KCC EXE file that supports this algorithm?

@axu2
Collaborator

axu2 commented Aug 1, 2024

@neyney10 Enable GitHub Actions on your fork. Then create a release on your fork, which triggers GitHub Actions to create binaries for all supported platforms, like here: https://github.com/axu2/kcc/releases

Also, feel free to build a numpy version so we can more easily see the size difference across platforms.

Edit: built myself:

[image]

@zhaohengkang

I have used the better optimization algorithm you provided and it is really great! Thank you very much!

@zhaohengkang

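> Enable GitHub Actions on your fork. Then create a release on your fork, which triggers GitHub Actions to create binaries for all supported platforms, like here: https://github.com/axu2/kcc/releases
>
> Also, feel free to build a numpy version so we can more easily see the size difference across platforms.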

I did it. Thank you

@axu2
Collaborator

axu2 commented Aug 2, 2024

@zhaohengkang feel free to upload some comparison photos.

@zhaohengkang

> Feel free to upload some comparison photos.

The following are from MOBI files generated by importing PDF files into KCC. You can see that the original algorithm only extracts the pictures without cropping them, while the better algorithm also crops them.

[Comparison images 1-6]

@zhaohengkang

> Feel free to upload some comparison photos.

Here are the results for MOBI files generated by importing a folder; these are the comparison images.

[Comparison images 7-11]

@neyney10 neyney10 force-pushed the master branch 2 times, most recently from 320d056 to 38ff5e3 Compare August 6, 2024 13:46
@axu2
Collaborator

axu2 commented Aug 6, 2024

Made some test builds and size comparison:

https://github.com/axu2/kcc/releases

[image]

@neyney10
Author

neyney10 commented Aug 6, 2024

> Made some test builds and size comparison:
>
> https://github.com/axu2/kcc/releases
>
> [image]

Yes, I reached the same numbers.

How do you want to continue?

@axu2
Collaborator

axu2 commented Aug 6, 2024

I think the size difference of the final distributable files is negligible, so I'd rather keep NumPy. I've done lots of image processing work with NumPy in the past and know how useful it is; it may come in handy in the future.

I also tested startup times and didn't see any significant difference. (maybe 1 second?)

@neyney10
Author

neyney10 commented Aug 7, 2024

> I think the size difference of the final distributable files is negligible, so I'd rather keep NumPy. I've done lots of image processing work with NumPy in the past and know how useful it is; it may come in handy in the future.
>
> I also tested startup times and didn't see any significant difference. (maybe 1 second?)

I see. Then NumPy version it is.

Regarding the algorithm: there are still issues with page numbers in thin fonts and with single-digit page numbers, which the algorithm fails to detect at higher power levels (this is less of a problem at lower power levels).

As Zhao demonstrated, the algorithm works fine in most cases, and I think it is ready for general use, even if more tweaks and improvements are needed for some edge cases or for scenarios that I haven't tested yet.

What about the inconsistency between the power levels of the current "crop margins" and the new "crop margins + page number"?
Should we modify the existing "crop margins" algorithm to make it consistent?

@zhaohengkang Very nice comparison images you compiled there! It seems you worked hard on sharing these results!
I was wondering if you tried running the algorithm with power=1, as it seems you only shared comparisons with higher power levels such as 2 and 3. I assume that lower power levels are too weak to crop the watermark?

@zhaohengkang

@neyney10 Sorry, when I turned it on the default was power=2, and that setting met my requirements exactly, so I did not test with a lower power level.

@axu2
Collaborator

axu2 commented Aug 18, 2024

> What about the inconsistency between the power levels of the current "crop margins" and the new "crop margins + page number"?
> Should we modify the existing "crop margins" algorithm to make it consistent?

Sorry, totally forgot you asked a question here.

Is it easy to modify the existing "crop margins" algorithm? And is it unlikely to cause issues for users? If so, do it.

Otherwise, just put a note in the tooltip that the power is used differently.

If you really want to offer the old behavior too, you can add a LEGACY checkbox.

3 participants