Identify notebook file being run #1000
Comments
A custom KernelManager could add an environment variable when a kernel is started, though the KernelManager doesn't have access to the notebook path. A SessionManager could pass that down, though it wouldn't be updated when the notebook is renamed, so a filename is probably not the best key to use.
You can put a GUID in the notebook-level metadata. I think you can do it without JS, at the web server level, on new or existing notebook load.
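As a rough illustration of the GUID suggestion, here is a minimal sketch that operates on a notebook already loaded as a dict. The metadata key `guid` and the helper name `ensure_notebook_guid` are assumptions made for this example, not an existing Jupyter convention; where exactly the server would call such a helper is left open.

```python
import uuid


def ensure_notebook_guid(notebook, key="guid"):
    """Return the notebook's GUID, creating one in its metadata if absent.

    `key` is a hypothetical metadata field name chosen for this sketch.
    """
    metadata = notebook.setdefault("metadata", {})
    if key not in metadata:
        # Only generate an identifier the first time; later loads reuse it.
        metadata[key] = str(uuid.uuid4())
    return metadata[key]


# A minimal notebook structure, as it would look after json.load().
nb = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
first = ensure_notebook_guid(nb)
second = ensure_notebook_guid(nb)  # same value: the GUID is stable
```

Because the identifier lives in the notebook file rather than in its name, it would survive a rename, which is the drawback raised for filename-based keys.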
--- oh, this is issue #1000 ! 🍰 🎉
:-P
Wouldn't a custom It is highly unlikely that the notebook would be renamed during the swap of VMs. There might need to be some extra logic for clean startup/exit/restart, but that should be able to resume connections.
So, I've picked up this work where @aggFTW left off. I think this is how we're thinking about doing this:
Item (4) will certainly be an internal extension to Jupyter for us, but we were wondering whether items (1) and (2) would have any chance of being accepted upstream. I understand that the kernel not knowing what's talking to it is part of the design, but it seems like it would be generally useful (not just for this scenario) if kernels could be made aware of the name of their notebook, either through an environment variable, a command-line argument, or a 0mq message. Do you suppose there would be any interest in that PR?
I think it is generally useful, and we should probably do it. An environment variable is the way to go, I think. The only disadvantage of that is that you cannot update the file location on rename after the kernel has started, but a zmq message updating the file doesn't seem like the right thing to do, to me.
Was this ever resolved? I'm making output and figure folders based off of the name of the notebooks and this code works in the notebooks, but when I

    from IPython.core.display import Javascript
    from IPython.display import display

    def get_notebook_name():
        """Returns the name of the current notebook as a string

        From https://mail.scipy.org/pipermail/ipython-dev/2014-June/014096.html
        """
        display(Javascript('IPython.notebook.kernel.execute("theNotebook = " + \
        "\'"+IPython.notebook.notebook_name+"\'");'))
        return theNotebook

But when I move it into a Is this because the
Handwaving: the display JavaScript will take some time to reach the browser, and it will take some time to execute the JS and get back to the kernel. During this time IPython has to continue executing code, so it tries to "return theNotebook", which is still undefined, and so it raises. Even if you could "wait for the JS to execute", you could not set the name of the notebook before the function returns. Does that make sense?
The JS sets the name in the main user namespace. When the function is moved into a module, it's looking in the module namespace, so it never sees that name. But that function is a hack, and I wouldn't rely on it in any case.
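The namespace point can be reproduced without a notebook at all. The sketch below uses a throwaway module object (the name `nbutils` is made up) and a plain dict standing in for the interactive user namespace; it is only an illustration of Python's global lookup rules, not of how IPython wires things internally.

```python
import types

# Stand-in for a helper module on disk (hypothetical name "nbutils").
helper = types.ModuleType("nbutils")
exec(
    "def get_notebook_name():\n"
    "    return theNotebook  # looked up in this module's globals\n",
    helper.__dict__,
)

# Stand-in for the interactive namespace, where the JS callback assigns the name.
user_ns = {"theNotebook": "analysis.ipynb"}

try:
    helper.get_notebook_name()
    lookup_failed = False
except NameError:
    # The module never sees names defined in another namespace.
    lookup_failed = True


def get_notebook_name_from(namespace):
    """Read the name from an explicitly supplied namespace instead."""
    return namespace.get("theNotebook")


result = get_notebook_name_from(user_ns)
```

Passing the namespace in explicitly sidesteps the lookup problem, but it does nothing about the timing issue described above, so the JavaScript approach remains fragile either way.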
OK, maybe this sounds silly, but would it be enough to add the ipynb filename to the metadata section of the notebook data structure when it's read? The field should not be stored in the file, only updated once the notebook is read into memory: a sort of ephemeral metadata.
I see. It looks like the kernel is completely agnostic to the concept of a file and just processes cell data. I would say that the only options are indeed env variables or passing the filename during the creation of the kernel, if a filename is available at that point.
I may be late to the party, but if we could somehow determine just the port of the notebook server, then getting the notebook path is easy by using the REST api. The example below hardwires port 8080:
But I couldn't find any way to automatically determine the port, without using the not-so-safe/useful Javascript hacks. So, can we get the port?
This seems to work:

    import json
    import os.path
    import re
    import ipykernel
    import requests

    #try:  # Python 3
    #    from urllib.parse import urljoin
    #except ImportError:  # Python 2
    #    from urlparse import urljoin

    # Alternative that works for both Python 2 and 3:
    from requests.compat import urljoin

    try:  # Python 3 (see Edit2 below for why this may not work in Python 2)
        from notebook.notebookapp import list_running_servers
    except ImportError:  # Python 2
        import warnings
        from IPython.utils.shimmodule import ShimWarning
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", category=ShimWarning)
            from IPython.html.notebookapp import list_running_servers


    def get_notebook_name():
        """
        Return the full path of the jupyter notebook.
        """
        kernel_id = re.search('kernel-(.*).json',
                              ipykernel.connect.get_connection_file()).group(1)
        servers = list_running_servers()
        for ss in servers:
            response = requests.get(urljoin(ss['url'], 'api/sessions'),
                                    params={'token': ss.get('token', '')})
            for nn in json.loads(response.text):
                if nn['kernel']['id'] == kernel_id:
                    relative_path = nn['notebook']['path']
                    return os.path.join(ss['notebook_dir'], relative_path)

You can put it inside a module, and import it in the jupyter notebook.

Edit: Thanks to @thesneaker, I changed the way to get the token.

References:
Thanks @gcbeltramini for this pure python solution! I'm running Jupyter 4.1.0 and had to take care of the missing I wouldn't mind if this functionality would find its way into the
Not quite sure why, but the response was not always JSON for me. I fixed it by adding a
Also another useful method:
This code...

    try:  # Python 3
        from urllib.parse import urljoin
    except ImportError:  # Python 2
        from urlparse import urljoin

    try:  # Python 3
        from notebook.notebookapp import list_running_servers
    except ImportError:  # Python 2
        import warnings
        from IPython.utils.shimmodule import ShimWarning
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", category=ShimWarning)
            from IPython.html.notebookapp import list_running_servers

...can be replaced with this code and still work on Python 2/3.

    from requests.compat import urljoin
    from notebook.notebookapp import list_running_servers
The code doesn't work for me in JupyterHub.
Note that if you do not have the right token to query the server on the REST call, json.loads(response.text) may return if nn['kernel']['id'] == kernel_id: raising
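One way to make the lookup tolerant of an unexpected response is to validate the decoded payload before indexing into it. The sketch below is a pure function over already-decoded JSON, so it makes no network calls. The helper name `find_notebook_path` is made up for this example, the field names (`kernel`, `id`, `notebook`, `path`) follow the snippet posted earlier in the thread, and the shape of the error payload is an assumption.

```python
def find_notebook_path(sessions, kernel_id):
    """Return the notebook path for `kernel_id`, or None if it cannot be found.

    `sessions` is the decoded body of the sessions REST call. Anything that
    is not a list of session dicts is treated as "no match".
    """
    if not isinstance(sessions, list):
        # e.g. an error object returned when the token is missing or wrong
        return None
    for session in sessions:
        if not isinstance(session, dict):
            continue
        kernel = session.get("kernel")
        if isinstance(kernel, dict) and kernel.get("id") == kernel_id:
            notebook = session.get("notebook")
            if isinstance(notebook, dict):
                return notebook.get("path")
    return None


# Illustrative payloads; the error shape is assumed, not taken from a real server.
ok_payload = [{"kernel": {"id": "abc-123"}, "notebook": {"path": "work/analysis.ipynb"}}]
error_payload = {"message": "Forbidden"}

found = find_notebook_path(ok_payload, "abc-123")
not_found = find_notebook_path(error_payload, "abc-123")
```

Returning None instead of raising lets the caller move on to the next running server, which matters when several servers are listed and only one of them accepts the token.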
Note that the solution above won't work when executing a nb via
How to achieve this with the latest versions?
Could the
It seems to be unreliable @gershwinlabs, sometimes
@elgalu I can't seem to reproduce the problem. Can you tell me more about your environment and notebook? I don't think it's possible to get away from the reliance on Javascript given the deliberate separation between the front and back ends.
Similar question asked here:
Thanks @thorade. I posted an answer with ipyparams.
Maybe issues with
Does anyone know if there is a command line argument under If it doesn't exist, it's not planned or the question is out of the scope of this issue, I could open a new one and describe in detail with examples/ideas. Let me know :)
This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

Knowing the current name is also sometimes critical for linters and tab completion.

Of course the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, it could change during execution, and there are many other gotchas.

This suggests adding a JPY_ASSOCIATED_FILE env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager.

On jupyter_server this one-line change makes the notebook name available using typical local installs:

    --- a/jupyter_server/services/sessions/sessionmanager.py
    +++ b/jupyter_server/services/sessions/sessionmanager.py
    @@ -96,7 +96,12 @@ class SessionManager(LoggingConfigurable):
             """Start a new kernel for a given session."""
             # allow contents manager to specify kernels cwd
             kernel_path = self.contents_manager.get_kernel_path(path=path)
    -        kernel_id = await self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
    +
    +        kernel_id = await self.kernel_manager.start_kernel(
    +            path=kernel_path, kernel_name=kernel_name, associated_file=name
    +        )
             return kernel_id

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.

This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

Knowing the current name is also sometimes critical for linters and tab completion.

Of course the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, it could change during execution, and there are many other gotchas.

This suggests adding a JPY_KERNEL_SESSION_NAME env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager.

On jupyter_server this one-line change makes the notebook name available using typical local installs:

    --- a/jupyter_server/services/sessions/sessionmanager.py
    +++ b/jupyter_server/services/sessions/sessionmanager.py
    @@ -96,7 +96,12 @@ class SessionManager(LoggingConfigurable):
             """Start a new kernel for a given session."""
             # allow contents manager to specify kernels cwd
             kernel_path = self.contents_manager.get_kernel_path(path=path)
    -        kernel_id = await self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
    +
    +        kernel_id = await self.kernel_manager.start_kernel(
    +            path=kernel_path, kernel_name=kernel_name, session_name=name
    +        )
             return kernel_id

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.

Commit amended: originally the name was associated_file, and JPY_ASSOCIATED_FILE, but was changed.
This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

Knowing the current name is also sometimes critical for linters and tab completion.

Of course the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, it could change during execution, and there are many other gotchas.

This suggests adding a JPY_KERNEL_SESSION_NAME env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager.

On jupyter_server this one-line change makes the notebook name available using typical local installs:

```diff
diff --git a/notebook/services/sessions/sessionmanager.py b/notebook/services/sessions/sessionmanager.py
index 92b2a7345..f7b4011ce 100644
--- a/notebook/services/sessions/sessionmanager.py
+++ b/notebook/services/sessions/sessionmanager.py
@@ -108,7 +108,9 @@ class SessionManager(LoggingConfigurable):
         # allow contents manager to specify kernels cwd
         kernel_path = self.contents_manager.get_kernel_path(path=path)
         kernel_id = yield maybe_future(
-            self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
+            self.kernel_manager.start_kernel(
+                path=kernel_path, kernel_name=kernel_name, session_name=path
+            )
         )
         # py2-compat
         raise gen.Return(kernel_id)
```

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.
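From the kernel side, consuming such a variable could look roughly like the sketch below. It uses the JPY_KERNEL_SESSION_NAME name from the commit message above; whether that exact variable is set depends on the server and launcher versions in use, and the helper name `notebook_name_from_env` is made up for this example.

```python
import os


def notebook_name_from_env(environ=None, var="JPY_KERNEL_SESSION_NAME"):
    """Return the session name if it looks like a notebook file, else None.

    The variable name comes from the proposal above; it is an assumption
    that the launcher actually sets it in a given deployment.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(var)
    if value and value.endswith(".ipynb"):
        return value
    return None


# Simulated environments, so the sketch runs without a Jupyter server.
from_notebook = notebook_name_from_env({"JPY_KERNEL_SESSION_NAME": "work/analysis.ipynb"})
from_console = notebook_name_from_env({"JPY_KERNEL_SESSION_NAME": "console-1"})
unset = notebook_name_from_env({})
```

As the commit message notes, the value is fixed at kernel start, so it would go stale after a rename or if the kernel is attached to a different notebook.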
This looks hackish to me:
Is there any simpler way to get the id? I was trying to look into the code, and couldn't find where the id is in Kernel. connection_file created as
Or probably I'm looking in the wrong place. Any suggestions?
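For reference, the id extraction used earlier in the thread can be isolated into a small helper that works on the file name only. This is a sketch that assumes the connection file follows the `kernel-<id>.json` naming pattern seen in that snippet; the helper name and the example path are illustrative.

```python
import os
import re

# Matches "kernel-<id>.json"; the dot is escaped so only a real ".json" suffix matches.
_CONNECTION_FILE = re.compile(r"^kernel-(?P<kernel_id>.+)\.json$")


def kernel_id_from_connection_file(path):
    """Extract the kernel id from a connection file path, or return None."""
    match = _CONNECTION_FILE.match(os.path.basename(path))
    return match.group("kernel_id") if match else None


# Example path; in a live kernel the path would come from
# ipykernel.connect.get_connection_file(), as in the snippet earlier in the thread.
example = "/run/user/1000/jupyter/kernel-4f2c9a1e-7d3b-4c55-9a10-2b6f8e1d0c3a.json"
kernel_id = kernel_id_from_connection_file(example)
```

Returning None for a non-matching name avoids the AttributeError that `re.search(...).group(1)` raises when the file is named differently.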
Hi,
I've seen this type of question a lot:
http://stackoverflow.com/questions/20050927/how-to-get-the-ipython-notebook-title-associated-with-the-currently-running-ipyt?rq=1
It makes sense to me that the kernel should not know what it's talking to from a design perspective.
However, I'm currently in the process of working through a Jupyter High Availability scenario. Our goal is to have two Jupyter instances running in two different VMs and switch between them if one of those two VMs goes down for some reason, without losing the kernel state.
We have control over the kernels we are running (see https://github.com/jupyter-incubator/sparkmagic/blob/master/remotespark/wrapperkernel/sparkkernelbase.py), and we'd like to be able to tie some state (a session number) to a particular kernel instance.
It seems to me like I'd need some things to achieve this, but maybe you have better ideas:
__init__ method in my kernel or some other piece of code that is triggered every time a kernel gets started (some Javascript code in the notebook maybe? I know this wouldn't apply for other clients but it's a start).
I thought of a concrete implementation and I'd like to hear some feedback on it if possible:
There is a Notebook extension that reads some ID in the notebook's page DOM (I need help knowing what ID this would be: e.g. notebook name with relative paths from root folder included or a GUID in some hidden cell in the notebook file), which would then issue a request to the kernel with this ID to restore its state. The kernel would then take this ID and get the session ID from cloud storage. If the ID is embedded in Javascript, both Jupyter servers would need to trust the notebook from the get go.
Thanks for any help or pointers you may have!
(cc. @msftristew, @MohamedElKamhawy, @ellisonbg)