Hyperlink in markdown cell to pdf document stopped working #3652

kdeleeuw11 · 2018-05-31T12:06:04Z

I have a large number of Jupyter Notebooks, and many of them contain hyperlinks to locally stored PDF documents. Today the links stopped working on my iMac. Clicking a link opens a new tab with the proper address, but the page is just black. On my MacBook, exactly the same notebook works fine, and up to yesterday I had no problems. I have tried a number of things to resolve this. Among others, I switched from Google Chrome (my usual browser) to Safari and had the same problem. Opening the PDF in either Chrome or Safari from Finder works fine, so it looks like a Jupyter Notebook issue. When I click the hyperlink in the notebook, I get the following entry in the log file:
[I 21:56:01.222 NotebookApp] 302 GET /notebooks/Cookbooks/Git%20%26%20GitHub/books/Pro_Git.pdf (::1) 1.01ms

I get the same entry on the MacBook, where it works fine.

A screenshot of the page after trying to load the PDF is attached.

takluyver · 2018-05-31T19:12:00Z

Any messages in the browser's Javascript console?

kdeleeuw11 · 2018-05-31T22:54:32Z

I found this in the Javascript console:
Failed to load 'http://localhost:8888/files/Cookbooks/Git%20%26%20GitHub/books/Pro_Git.pdf' as a plugin, because the frame into which the plugin is loading is sandboxed.

This must be the cause of the problem. I have no idea how to address this. Can you help?
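
Comparing the log entry with the console message suggests that the 302 redirects a `/notebooks/<path>` request to `/files/<path>`, and that the path is percent-encoded. A minimal sketch of that mapping, inferred only from the two messages above; `files_url_for` is a hypothetical helper, not part of the notebook server:

```python
from urllib.parse import unquote, urlsplit


def files_url_for(notebooks_path, base="http://localhost:8888"):
    """Hypothetical helper: map a /notebooks/<path> request to its /files/<path> URL."""
    prefix = "/notebooks/"
    if not notebooks_path.startswith(prefix):
        raise ValueError("expected a path under /notebooks/")
    return base + "/files/" + notebooks_path[len(prefix):]


# Path copied from the log entry above.
logged_path = "/notebooks/Cookbooks/Git%20%26%20GitHub/books/Pro_Git.pdf"
url = files_url_for(logged_path)
print(url)
# Decoding the percent-escapes shows the real folder name, "Git & GitHub".
print(unquote(urlsplit(url).path))
```

The first printed URL matches the one in the console error, so the file request itself appears to reach the server; the failure happens when the browser tries to render the response.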

bryango · 2018-08-08T12:15:57Z

Same issue! No idea what's happening... I tried launching another simple HTTP server, and PDF links worked just fine there, so it shouldn't be a browser issue. The PDF.js extension (Firefox) works fine, though.

jupyter-troubleshoot attached:
jupyter-troubleshoot.log

bryango · 2018-08-14T16:38:46Z

@takluyver I have zero experience in web development, but after some googling, I believe it's some kind of cross-origin request issue... This PR: #3341 seems to be related?
@kdeleeuw11 Have you found any solution to this? PDF documents really matter to me too.

kdeleeuw11 · 2018-08-14T22:29:49Z

I got it to work in Google Chrome by installing the PDF Viewer extension. I am not very technical and I have no idea why it initially stopped working in Google Chrome and Safari. But at least I have it working again. Google Chrome is my default browser.

bryango · 2018-10-06T11:47:32Z

@takluyver Now I'm confident that this issue is indeed caused by #3341. After manually removing the lines added in #3341 from my conda installation ([...]/anaconda3/lib/python3.7/site-packages/notebook), my PDF links work perfectly again.

FYI, these are the lines I removed:

Subject: [PATCH] UN-patch #3341

---
 base/handlers.py  | 7 -------
 files/handlers.py | 7 -------
 2 files changed, 14 deletions(-)

diff --git a/base/handlers.py b/base/handlers.py
index e3fbddc..72677c9 100644
--- a/base/handlers.py
+++ b/base/handlers.py
@@ -640,13 +640,6 @@ class Template404(IPythonHandler):
 class AuthenticatedFileHandler(IPythonHandler, web.StaticFileHandler):
     """static files should only be accessible when logged in"""
 
-    @property
-    def content_security_policy(self):
-        # In case we're serving HTML/SVG, confine any Javascript to a unique
-        # origin so it can't interact with the notebook server.
-        return super(AuthenticatedFileHandler, self).content_security_policy + \
-                "; sandbox allow-scripts"
-
     @web.authenticated
     def get(self, path):
         if os.path.splitext(path)[1] == '.ipynb' or self.get_argument("download", False):
diff --git a/files/handlers.py b/files/handlers.py
index 7973fd6..b942149 100644
--- a/files/handlers.py
+++ b/files/handlers.py
@@ -26,13 +26,6 @@ class FilesHandler(IPythonHandler):
     a subclass of StaticFileHandler.
     """
 
-    @property
-    def content_security_policy(self):
-        # In case we're serving HTML/SVG, confine any Javascript to a unique
-        # origin so it can't interact with the notebook server.
-        return super(FilesHandler, self).content_security_policy + \
-               "; sandbox allow-scripts"
-
     @web.authenticated
     def head(self, path):
         self.get(path, include_body=False)
-- 
2.18.0
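
The effect of the removed property can be sketched in isolation. This is a minimal stand-in, not the notebook code: `BaseHandler` replaces `IPythonHandler`, and its base policy string is an assumption chosen for illustration. Only the `"; sandbox allow-scripts"` suffix comes from the patch above.

```python
class BaseHandler:
    """Stand-in for IPythonHandler; the base policy value is an assumption."""

    @property
    def content_security_policy(self):
        return "frame-ancestors 'self'"


class SandboxedFileHandler(BaseHandler):
    """Mirrors the property that the patch above removes."""

    @property
    def content_security_policy(self):
        # Append the sandbox directive to whatever the parent class returns.
        return super().content_security_policy + "; sandbox allow-scripts"


class UnpatchedFileHandler(BaseHandler):
    """Without the override, the parent's policy is served unchanged."""


print(SandboxedFileHandler().content_security_policy)
print(UnpatchedFileHandler().content_security_policy)
```

With the override in place, every file response carries a `sandbox` directive; removing it, as the patch does, drops the sandbox for all served files, including HTML and SVG.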

takluyver · 2018-10-22T14:09:31Z

This works correctly for me in Firefox, but fails in Chromium with the error Failed to load 'http://localhost:8889/(...).pdf' as a plugin, because the frame into which the plugin is loading is sandboxed.

It is sandboxed, and quite deliberately so. And you're right that #3341 is where the sandboxing was introduced. This is a security measure, so we can't just disable it again. If you're interested, I'd suggest someone research what relaxations of the sandbox would be needed to let Chrome display a PDF.

CSP sandboxing docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox
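
As a starting point for that research, the sandbox relaxations a response currently grants can be read out of its Content-Security-Policy header. A minimal sketch; `sandbox_tokens` is a hypothetical helper, and the sample header is illustrative (only the `sandbox allow-scripts` part is taken from #3341):

```python
def sandbox_tokens(csp_header):
    """Return the tokens of the sandbox directive, or None if there is none.

    An empty list means "sandbox" with no relaxations at all.
    """
    for directive in csp_header.split(";"):
        parts = directive.split()
        if parts and parts[0].lower() == "sandbox":
            return parts[1:]
    return None


# Illustrative header value; the first directive is an assumption.
header = "frame-ancestors 'self'; sandbox allow-scripts"
print(sandbox_tokens(header))                    # ['allow-scripts']
print(sandbox_tokens("frame-ancestors 'self'"))  # None
```

The actual header value for a given file can be copied from the response headers in the browser's network tab.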

matanox · 2018-12-24T07:30:03Z

I think this is also the case when trying to display a PDF inline in a notebook a la

from IPython.display import IFrame
IFrame("foo.pdf", width=900, height=800)

Could be nice if this worked again even in Chrome.

bryango · 2018-12-24T08:32:50Z

> This works correctly for me in Firefox, but fails in Chromium with the error Failed to load 'http://localhost:8889/(...).pdf' as a plugin, because the frame into which the plugin is loading is sandboxed.
>
> It is sandboxed, and quite deliberately so. And you're right that #3341 is where the sandboxing was introduced. This is a security measure, so we can't just disable it again. If you're interested, I'd suggest someone research what relaxations of the sandbox would be needed to let Chrome display a PDF.
>
> CSP sandboxing docs: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox

@takluyver I suppose that as a security measure this is somehow meaningful, but since we already allow the kernel (e.g. python) to do anything to the filesystem, isn't it sort of pointless to have this kind of sandboxing? 😜

I do hope this bug can be resolved soon. Sometimes the PDF.js extension feels too clumsy for me... Unfortunately I don't have the necessary expertise to contribute, but I was able to (kind of?) circumvent this by reading the PDF as binary from the Python kernel, then embedding it with a server-side PDF.js engine. That is even clumsier, but at least I don't have to ask every one of my collaborators to install a PDF.js extension. 😉

@matanster If you really want PDF in your ipynb, you can try something like this. 😂
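
A simpler variant of the read-the-PDF-from-the-kernel idea is to embed the bytes as a base64 `data:` URI, so the browser never requests the file from the sandboxed `/files/` handler. This is a different technique from the PDF.js setup described above and is only a sketch: `pdf_iframe_html` is a hypothetical helper, and whether a browser renders a `data:` PDF inside an iframe is an assumption that has not been checked across browsers.

```python
import base64
import html


def pdf_iframe_html(pdf_bytes, width=900, height=800):
    """Build an <iframe> tag whose source is the PDF itself, base64-encoded."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    src = "data:application/pdf;base64," + encoded
    return '<iframe src="{}" width="{}" height="{}"></iframe>'.format(
        html.escape(src, quote=True), int(width), int(height)
    )


# Placeholder bytes stand in for a real file read with open(path, "rb").read().
snippet = pdf_iframe_html(b"%PDF-1.4 placeholder")
print(snippet[:60])
```

In a notebook the resulting string could be passed to `IPython.display.HTML`. Note that the whole PDF ends up inside the notebook output, so large files will bloat the `.ipynb`.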

kav2k · 2019-02-11T09:09:32Z

(previous post was wrong and was deleted)

Relevant Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=413851
Note that it's currently WontFix.

It boils down to "there's nothing in the standard to allow plugins to operate in sandbox; there's no allow-plugins rule".

It seems like Chrome and Firefox take different approaches to handling this. Chrome just straight up disallows it.
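
The "no allow-plugins rule" point can be checked against the list of sandbox tokens. A minimal sketch: the set below is transcribed from the MDN sandbox documentation as best understood at the time of writing and may lag behind the spec, and `unknown_tokens` is a hypothetical helper.

```python
# Sandbox tokens as listed in the MDN documentation (assumed current; may be incomplete).
STANDARD_SANDBOX_TOKENS = frozenset({
    "allow-downloads",
    "allow-forms",
    "allow-modals",
    "allow-orientation-lock",
    "allow-pointer-lock",
    "allow-popups",
    "allow-popups-to-escape-sandbox",
    "allow-presentation",
    "allow-same-origin",
    "allow-scripts",
    "allow-top-navigation",
    "allow-top-navigation-by-user-activation",
    "allow-top-navigation-to-custom-protocols",
})


def unknown_tokens(tokens):
    """Return the tokens that are not in the standard list."""
    return [t for t in tokens if t.lower() not in STANDARD_SANDBOX_TOKENS]


print(unknown_tokens(["allow-scripts", "allow-plugins"]))  # ['allow-plugins']
```

Because `allow-plugins` is not a recognised token, adding it to the header would not be expected to change browser behaviour.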

takluyver · 2019-02-11T09:45:40Z

> since we already allow the kernel (e.g. python) to do anything to the filesystem, isn't it sort of pointless to have this kind of sandboxing? 😜

The model we've got is that code you deliberately run can do anything (within the context of where the kernel runs), but opening a file should never be able to execute arbitrary code on your system. People don't expect that opening a document (whether that's a notebook, an HTML page, or a PDF) can start running code outside a sandbox. See also: word macro viruses.

The technical implication of this is that any pages served by the notebook server where we don't entirely control the content must either be sandboxed (so they can't talk to kernels) or sanitised (so they can't run Javascript).

We sanitise untrusted notebooks, because the notebook page has to be able to talk to the kernel. But sanitisation is tricky, edge cases can be missed (we had a CVE because of an interaction between our sanitisation engine and jQuery), and it breaks a lot of rich content. So we sandbox when serving (non-notebook) files - they can run Javascript, but the browser's cross-origin security mechanisms stop them talking to kernels.
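
The cross-origin effect of the sandbox can be illustrated with a simplified model. This is not how a browser is implemented; it only encodes the rule that a sandboxed document without `allow-same-origin` gets an opaque origin (serialised as `null`), which never matches the server's origin. The function names and the example file name are hypothetical.

```python
from urllib.parse import urlsplit


def effective_origin(url, sandbox=None):
    """Model the origin a page runs in.

    sandbox is None when there is no sandbox directive, otherwise the
    list of sandbox tokens (possibly empty).
    """
    if sandbox is not None and "allow-same-origin" not in sandbox:
        return "null"  # opaque origin
    parts = urlsplit(url)
    return "{}://{}".format(parts.scheme, parts.netloc)


def same_origin(a, b):
    """An opaque origin is never the same as any other origin."""
    return a != "null" and a == b


server = effective_origin("http://localhost:8888/tree")
sandboxed_file = effective_origin(
    "http://localhost:8888/files/report.html", sandbox=["allow-scripts"]
)
plain_file = effective_origin("http://localhost:8888/files/report.html")

print(same_origin(sandboxed_file, server))  # False: scripts run, but cross-origin
print(same_origin(plain_file, server))      # True: scripts could talk to the server
```

Under this model, `sandbox allow-scripts` lets Javascript in a served file run while keeping it in a different origin from the notebook server, which is the behaviour described above.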

zhyiyu · 2021-01-29T16:59:11Z

> I got it to work in Google Chrome by installing the PDF Viewer extension. I am not very technical and I have no idea why it initially stopped working in Google Chrome and Safari. But at least I have it working again. Google Chrome is my default browser.

The error message I got (for Google Chrome) is ERR_BLOCKED_BY_CLIENT.

I installed an extension called PDF Viewer and it now works (though not perfectly).

Hope this issue can be fixed soon.

meeseeksmachine · 2021-03-14T20:14:52Z

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/what-is-the-best-method-of-importing-pdf-files-into-a-notebook/8324/1

jliu1999 · 2021-03-19T01:07:16Z

> I got it to work in Google Chrome by installing the PDF Viewer extension. I am not very technical and I have no idea why it initially stopped working in Google Chrome and Safari. But at least I have it working again. Google Chrome is my default browser.

> The error message in my case (Google Chrome) is ERR_BLOCKED_BY_CLIENT, I installed an extension called PDF Viewer and it now works. The only discomfort I have is that you can not refresh on the PDF viewing page.
>
> Hope this issue can be fixed soon.

Thanks, it's working now.

gdbassett · 2021-08-31T18:22:30Z

I have a similar problem.

I generate HTML files using an Airflow (work automation) workflow. I access those HTML files through Jupyter, with the goal of triggering additional workflows by API. Unfortunately, the sandbox prevents this.

Would it be possible to get a 'trust' button on HTML files as well to remove the sandbox?

gdbassett · 2021-09-09T17:09:00Z

This also creates problems with linking back to other files on the Jupyter server, because the request can't carry the auth tokens through and the auth isn't allowed in the iframe. A 'trust' button to remove the iframe and sandbox would be very helpful.

bryango mentioned this issue Aug 12, 2018

PDF display not working in newer versions of jupyter bryango/PKUComputationalPhysics#1

Closed

ryanlovett mentioned this issue Jan 28, 2019

Hyperlink to PDF in /tree is non-functional #4371

Open

bryango mentioned this issue Jun 7, 2019

Jupyter Notebook shows pdfs from others sites but not local pdfs #4554

Open

jacobtolar mentioned this issue Apr 7, 2020

Allow links in rendered html content #5344

Open

stevengj mentioned this issue May 3, 2022

Fix PDF render JuliaLang/IJulia.jl#1038

Closed
