Transparent text selection #149
Ok, I can reproduce this using poppler 22.08 with the |
OK, well, I found a workaround (below), but I agree that this seems to be a poppler issue of not handling By following the instructions here up to step 2, where instead replacing |
This is a very useful discovery and should probably be communicated somewhere else too! To make the first step slightly easier, one could use my qpdf transient wrapper, qpdf.el. In fact, the whole procedure can be automated with the following command:

    (defun my-fix-pdf-selection ()
      "Replace pdf with one where selection shows transparently."
      (interactive)
      (unless (equal (file-name-extension (buffer-file-name)) "pdf")
        (error "Buffer should visit a pdf file"))
      (unless (equal major-mode 'pdf-view-mode)
        (pdf-view-mode))
      ;; save file in QDF-mode
      (qpdf-run (list
                 (concat "--infile="
                         (buffer-file-name))
                 "--qdf --object-streams=disable"
                 "--replace-input"))
      ;; do replacements
      (text-mode)
      (read-only-mode -1)
      ;; search from the top of the buffer and match case exactly
      (goto-char (point-min))
      (let ((case-fold-search nil))
        (while (re-search-forward "3 Tr" nil t)
          (replace-match "7 Tr" t t)))
      (save-buffer)
      (pdf-view-mode))

This still depends on qpdf.el, but with a few extra lines of code one could avoid that. Note that this overwrites the PDF file visited in the buffer from which it is run! To avoid this replace the first |
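
For anyone who wants to try the substitution step without touching a file, here is a minimal sketch that applies it to a string. The name `my-pdf-swap-render-mode` is hypothetical (it is not part of pdf-tools or qpdf.el), and the input is assumed to be uncompressed content-stream text such as qpdf's QDF mode produces. In the PDF text model, `Tr` sets the text rendering mode: mode 3 paints nothing (invisible text), while mode 7 adds the text to the clipping path. The word boundaries are an extra precaution, not something the function above does.

```elisp
;; Hypothetical helper, for illustration only; not part of pdf-tools
;; or qpdf.el.
(defun my-pdf-swap-render-mode (content)
  "Return CONTENT with each `3 Tr' operator rewritten as `7 Tr'.
CONTENT is assumed to be uncompressed PDF content-stream text."
  (let ((case-fold-search nil))         ; PDF operators are case-sensitive
    ;; \\b keeps operands such as `13 Tr' from matching.  The last two
    ;; arguments request a fixed-case, literal replacement.
    (replace-regexp-in-string "\\b3 Tr\\b" "7 Tr" content t t)))
```

Evaluating `(my-pdf-swap-render-mode "BT 3 Tr (hello) Tj ET")` should give back the same string with `7 Tr` in place of `3 Tr`. Since nothing is written to disk, this is a safe way to check the pattern before running the real command on a PDF.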
Great, thanks for sharing your package and function. Reopening as it could be helpful if your function was added to the wiki, or at the very least if this could be added to known issues. One small issue with your function above: when tested it only worked on file names that do not contain spaces. Not a major issue for me as I use zotfile/zotero to rename my pdfs. And FWIW I have not noticed the same issue with text selection lag between "fixed" and unfixed pdfs, both are quite slow for me (compared to other PDF readers) as noted in #87 (comment) |
Thanks for testing. I fixed the issue with spaces in file names, plus added the problem and workaround to known problems in the README. If I understand correctly, once the pull request is merged, pdftools.wiki (which just mirrors the README) will rebuild automatically to reflect the changes. I tried to work around the qpdf.el dependency, but to make it behave correctly in all cases, basically the whole |
Thanks again for your work on this! FYI, the function recently stopped working on my setup; it appears to be inserting a trailing Minibuffer contents after running
When I copy this into a terminal window and remove the trailing |
@workcomplete : ^ This might be a bug introduced in the latest commit on I am going ahead and merging the PR which explains this as a known problem (which will also update the wiki). Thank you! |
closed via aec8ecd |
This change fixes the bug pointed out in vedang/pdf-tools#149. When `--replace-input` is used and no `outfile` is provided, an empty quote (`''`) is inserted into the call to `qpdf`. This change guards against that.
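
The kind of guard described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `my-qpdf-build-args` is a hypothetical helper, and the actual change in qpdf.el may look quite different. The idea is simply to drop nil or empty entries before the argument list reaches `qpdf`, so that a missing `outfile` never turns into `''`.

```elisp
(require 'seq)

;; Hypothetical helper, for illustration only; qpdf.el's real fix may
;; be implemented differently.
(defun my-qpdf-build-args (infile options &optional outfile)
  "Return an argument list for qpdf from INFILE, OPTIONS and OUTFILE.
Nil or empty entries are removed, so an omitted OUTFILE does not
produce an empty argument."
  (seq-remove (lambda (arg) (or (null arg) (equal arg "")))
              (append (list infile) options (list outfile))))
```

For example, `(my-qpdf-build-args "my file.pdf" '("--qdf" "--object-streams=disable" "--replace-input") "")` returns a four-element list with no trailing empty string. A list like this could be handed to `(apply #'call-process "qpdf" nil nil nil args)`, which passes each element as a separate argument without going through a shell, so file names with spaces need no quoting.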
I merged vedang's fix. Thanks both of you! |
I'll preface this by saying I am not sure if this is an issue with pdf-tools specifically, vs a more general issue with tesseract OCR/poppler. I originally posted this in discussions, but I don't think many people are using this feature atm.
When I select text (i.e. click and drag the mouse over a region of text) in pdf-tools, the text selection is not transparent. For PDFs that are created digitally this is fine, but for PDFs that are scanned and OCR'd with tesseract, the selected text becomes hidden behind the selection (see images below). Is there a way to change this behavior without re-OCRing the PDF with something like Adobe Acrobat (i.e. is it possible to make the text selection transparent in pdf-tools)?
What selected text looks like in a pdf generated from a website:
What selected text looks like in a scanned book that has been OCR'd with tesseract:
OCR text can still be copied and pasted and looks fine when markup is applied.
Originally posted by @workcomplete in #147
If I understand this post on stack exchange, poppler has already implemented transparent text selection?
https://tex.stackexchange.com/questions/565909/invisible-text-even-when-selected-in-evince
This same issue with pdf-tools (under the previous maintainer) is mentioned here:
https://gitlab.freedesktop.org/poppler/poppler/-/issues/157