Identify notebook file being run #1000

aggFTW · 2016-01-26T22:37:50Z

Hi,

I've seen this type of question a lot:
http://stackoverflow.com/questions/20050927/how-to-get-the-ipython-notebook-title-associated-with-the-currently-running-ipyt?rq=1

It makes sense to me that the kernel should not know what it's talking to from a design perspective.

However, I'm currently in the process of working through a Jupyter High Availability scenario. Our goal is to have two Jupyter instances running in two different VMs and switch between them if one of those two VMs goes down for some reason, without losing the kernel state.

We have control over the kernels we are running (see https://github.com/jupyter-incubator/sparkmagic/blob/master/remotespark/wrapperkernel/sparkkernelbase.py), and we'd like to be able to tie some state (a session number) to a particular kernel instance.

It seems to me like I'd need some things to achieve this, but maybe you have better ideas:

1. Fire some piece of code automatically every time a notebook starts: this could be the __init__ method in my kernel or some other piece of code that is triggered every time a kernel gets started (some Javascript code in the notebook maybe? I know this wouldn't apply for other clients but it's a start).
2. This previous bit of code that gets fired would need to always be run with the same ID to be able to identify the state it needs to reconstruct (i.e. it would need to know that for this particular kernel we had X particular state).
3. Some persistent storage that both Jupyter instances could have access to.

I thought of a concrete implementation and I'd like to hear some feedback on it if possible:
There is a Notebook extension that reads some ID in the notebook's page DOM (I need help knowing what ID this would be: e.g. notebook name with relative paths from root folder included or a GUID in some hidden cell in the notebook file), which would then issue a request to the kernel with this ID to restore its state. The kernel would then take this ID and get the session ID from cloud storage. If the ID is embedded in Javascript, both Jupyter servers would need to trust the notebook from the get go.

Thanks for any help or pointers you may have!
(cc. @msftristew, @MohamedElKamhawy, @ellisonbg)

aggFTW · 2016-01-29T02:57:37Z

cc @Carreau and @jdfreder

minrk · 2016-01-29T08:08:04Z

A custom KernelManager could add an environment variable when a kernel is started, though the KernelManager doesn't have access to the notebook path. A SessionManager could pass that down, though it wouldn't be updated when the notebook is renamed, so a filename is probably not the best key to use.

jdfreder · 2016-01-29T17:34:54Z

You can put a GUID in the notebook-level metadata. I think you can do it without JS, at the web server level, on new or existing notebook load.
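A minimal sketch of the GUID-in-metadata idea, using only the standard library. The metadata key `ha_guid` and the function name `ensure_notebook_guid` are hypothetical, chosen for illustration; the only thing assumed about the notebook format is that an .ipynb file is a JSON document with a top-level `metadata` object.

```python
import json
import uuid


def ensure_notebook_guid(nb, key="ha_guid"):
    """Return the GUID stored in notebook-level metadata, creating it if absent.

    `nb` is the notebook as a dict (the parsed JSON of an .ipynb file).
    `key` is a hypothetical metadata field name chosen for this sketch.
    """
    metadata = nb.setdefault("metadata", {})
    if key not in metadata:
        metadata[key] = str(uuid.uuid4())
    return metadata[key]


# Example: a bare-bones notebook document.
notebook = json.loads(
    '{"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}'
)
guid = ensure_notebook_guid(notebook)

# A second call returns the stored value instead of generating a new one.
assert ensure_notebook_guid(notebook) == guid
```

Because the identifier lives inside the document, it stays the same when the file is renamed, which is the weakness of using the filename as the key.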

jdfreder · 2016-01-29T17:49:58Z

--- oh, this is issue #1000 ! 🍰 🎉

Carreau · 2016-01-29T17:54:20Z

:-P

Carreau · 2016-01-29T17:59:22Z

Wouldn't a custom MappingKernelManager that stores the various kernel models in a shared DB be enough? (Or am I missing something about the notebook name?)

It is highly unlikely that the notebook would be renamed during the swap of VMs.

There might need to be some extra logic for clean startup/exit/restart, but that should be able to resume connections.

msftristew · 2016-02-24T20:28:29Z

So, I've picked up this work where @aggFTW left off. I think this is how we're thinking about doing this:

1. Use a custom SessionManager that passes down the notebook name as an argument to the MappingKernelManager.
2. Use a custom KernelManager that communicates the notebook name to the new kernel process on startup (through an environment variable or some other method).
3. Our custom kernels will take the notebook name as a key and will update their metadata as appropriate in the way that @aggFTW described above.
4. Use a custom ContentsManager to update the metadata necessary for resuming stale sessions when a notebook is renamed.

Item (4) will certainly be an internal extension to Jupyter for us, but we were wondering whether items (1) and (2) would have any chance of being accepted upstream. I understand that the kernel not knowing what's talking to it is part of the design, but it seems like it would be generally useful (not just for this scenario) if kernels could be made aware what the name of their notebook is either through an environment variable, a command-line argument, or a 0mq message. Do you suppose there would be any interest in that PR?

minrk · 2016-02-25T10:57:41Z

I think it is generally useful, and we should probably do it. An environment variable is the way to go, I think. The only disadvantage of that is that you cannot update the file location on rename after the kernel has started, but a zmq message updating the file doesn't seem like the right thing to do, to me.
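A rough sketch of the environment-variable approach from the launcher's side, using only the standard library. The variable name `EXAMPLE_NOTEBOOK_PATH` and the helper `run_with_notebook_path` are hypothetical; a plain child Python process stands in for the kernel, and this is not how any particular KernelManager is implemented.

```python
import os
import subprocess
import sys

# Hypothetical variable name, chosen only for this sketch.
ENV_VAR = "EXAMPLE_NOTEBOOK_PATH"


def run_with_notebook_path(code, notebook_path):
    """Run `code` in a child Python process that can see the notebook path."""
    env = os.environ.copy()  # copy so the parent's environment is untouched
    env[ENV_VAR] = notebook_path
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# The child stands in for a kernel: it reads the value from its environment.
child_code = "import os; print(os.environ.get('EXAMPLE_NOTEBOOK_PATH', ''))"
print(run_with_notebook_path(child_code, "analysis/report.ipynb"))
```

The child receives a copy of the environment at launch, so a later rename of the notebook is not reflected in the running process. That is the disadvantage described above.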

olgabot · 2017-01-17T21:53:03Z

Was this ever resolved? I'm making output and figure folders based on the names of the notebooks, and this code works in the notebooks:

from IPython.core.display import Javascript
from IPython.display import display


def get_notebook_name():
    """Returns the name of the current notebook as a string
    
    From https://mail.scipy.org/pipermail/ipython-dev/2014-June/014096.html
    """
    display(Javascript('IPython.notebook.kernel.execute("theNotebook = " + \
    "\'"+IPython.notebook.notebook_name+"\'");'))
    return theNotebook

But when I move it into a common.py file so it can be accessed across all notebooks, I get a NameError.

Is this because the .py file has no notebook? Is there a way to get the .py file to recognize the notebook it is being called from?

Carreau · 2017-01-17T23:52:15Z

display(Javascript('IPython.notebook.kernel.execute("theNotebook = " + \
"\'"+IPython.notebook.notebook_name+"\'");'))
## Here are dragons. 
return theNotebook

Handwaving:

The displayed JavaScript will take some time to reach the browser, and it will take some time to execute the JS and get back to the kernel.

During this time IPython has to continue executing code, so it tries to "return theNotebook", which is undefined, and so it raises. Even if you could "wait for the JS to execute", you could not set the name of the notebook before the function returns.

Does that make sense?

takluyver · 2017-01-18T10:41:28Z

The JS sets the name in the main user namespace. When the function is moved into a module, it's looking in the module namespace, so it never sees that name. But that function is a hack, and I wouldn't rely on it in any case.
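The namespace point can be reproduced without a browser. The sketch below builds a stand-in for common.py in memory with `types.ModuleType` (the module name `common` and the value "Untitled.ipynb" are illustrative) and shows that a name assigned in the calling namespace is invisible to a function defined in another module.

```python
import types

# Build a stand-in for common.py without touching the filesystem.
common = types.ModuleType("common")
exec(
    "def get_notebook_name():\n"
    "    return theNotebook\n",
    common.__dict__,
)

# This mimics what the JS callback does: it assigns in the *user* namespace.
theNotebook = "Untitled.ipynb"

try:
    common.get_notebook_name()
except NameError as err:
    # The function looks up globals in the module where it was defined.
    print("NameError:", err)

# Only an assignment in the module's own namespace is visible to the function.
common.theNotebook = theNotebook
print(common.get_notebook_name())
```

A function resolves global names in the module where it was defined, not in the namespace of whoever calls it, so the assignment made by the JS callback never reaches the function in common.py.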

natbusa · 2017-04-27T15:58:03Z

ok, maybe this would sound silly, but would it be enough to add the ipynb filename in the metadata section of the notebook data structure when it's read? the field should not be stored in file but only updated once read in memory. - a sort of ephemeral metadata info

natbusa · 2017-04-27T22:23:49Z

I see it looks like the kernel is completely agnostic to the concept of file and it just processes cells data. I would say that the only options are indeed env variables or passing the filename during the creation of the kernel if any filename is available at that point.

jordansamuels · 2017-05-22T14:51:17Z

I may be late to the party, but if we could somehow determine just the port of the notebook server, then getting the notebook path is easy by using the REST api. The example below hardwires port 8080:

import json
import re
import ipykernel
import requests

def get_notebook_path():
    # The kernel id is embedded in the connection file name: kernel-<id>.json
    kernel_id = re.search('kernel-(.*).json', ipykernel.connect.get_connection_file()).group(1)
    response = requests.get('http://127.0.0.1:{port}/api/sessions'.format(port=8080))
    matching = [s for s in json.loads(response.text) if s['kernel']['id'] == kernel_id]
    if matching:
        return matching[0]['notebook']['path']

But I couldn't find any way to automatically determine the port, without using the not-so-safe/useful Javascript hacks.

So, can we get the port?

gcbeltramini · 2018-01-23T17:55:51Z

This seems to work:

import json
import os.path
import re
import ipykernel
import requests

#try:  # Python 3
#    from urllib.parse import urljoin
#except ImportError:  # Python 2
#    from urlparse import urljoin

# Alternative that works for both Python 2 and 3:
from requests.compat import urljoin

try:  # Python 3 (see Edit2 below for why this may not work in Python 2)
    from notebook.notebookapp import list_running_servers
except ImportError:  # Python 2
    import warnings
    from IPython.utils.shimmodule import ShimWarning
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=ShimWarning)
        from IPython.html.notebookapp import list_running_servers


def get_notebook_name():
    """
    Return the full path of the jupyter notebook.
    """
    kernel_id = re.search('kernel-(.*).json',
                          ipykernel.connect.get_connection_file()).group(1)
    servers = list_running_servers()
    for ss in servers:
        response = requests.get(urljoin(ss['url'], 'api/sessions'),
                                params={'token': ss.get('token', '')})
        for nn in json.loads(response.text):
            if nn['kernel']['id'] == kernel_id:
                relative_path = nn['notebook']['path']
                return os.path.join(ss['notebook_dir'], relative_path)

You can put it inside a module, and import it in the jupyter notebook.

Edit: Thanks to @thesneaker, I changed the way to get the token.
Edit2: I tested in Python 2, but the Jupyter notebook couldn't execute "from notebook.notebookapp import list_running_servers" when the code was inside a module.
Edit3: Added an alternative and an observation thanks to this comment.

thesneaker · 2018-01-25T15:20:14Z

Thanks @gcbeltramini for this pure python solution! I'm running Jupyter 4.1.0 and had to take care of the missing token key. Other than that it's the best solution I've come across so far!

I wouldn't mind if this functionality found its way into the notebookapp class and became the way recommended by the jupyter devs. Having easy access to the notebook name (and preferably the path) is essential to do reproducible measurements with jupyter notebooks.

vpillac · 2018-02-07T00:18:49Z

Not quite sure why but the response was not always json for me, I fixed it by adding a try statement:

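        try:
            for nn in json.loads(response.text):
                if nn['kernel']['id'] == kernel_id:
                    relative_path = nn['notebook']['path']
                    return os.path.join(ss['notebook_dir'], relative_path)
        except (ValueError, TypeError, KeyError):
            # Not JSON, or not a list of sessions (e.g. an error payload): try the next server.
            pass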

vpillac · 2018-02-07T00:20:34Z

Also another useful method:

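import subprocess

def save_notebook_to_html():
    nb_name = get_notebook_name()
    # Pass the arguments as a list so that paths containing spaces are handled correctly.
    status = subprocess.call(['jupyter', 'nbconvert', '--to', 'html', nb_name])
    return status == 0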

jakirkham · 2018-06-28T18:00:49Z

This code...

try:  # Python 3
    from urllib.parse import urljoin
except ImportError:  # Python 2
    from urlparse import urljoin

try:  # Python 3
    from notebook.notebookapp import list_running_servers
except ImportError:  # Python 2
    import warnings
    from IPython.utils.shimmodule import ShimWarning
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=ShimWarning)
        from IPython.html.notebookapp import list_running_servers

...can be replaced with this code and still work on Python 2/3.

from requests.compat import urljoin

from notebook.notebookapp import list_running_servers

dclong · 2018-08-05T08:30:03Z

The code doesn't work for me in JupyterHub.

convoliution · 2019-06-17T23:37:18Z

Note that if you do not have the right token to query the server on the REST call,

json.loads(response.text)

may return {"message": "Forbidden", "reason": null} instead of a list of sessions, resulting in

if nn['kernel']['id'] == kernel_id:

raising TypeError: string indices must be integers
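One way to guard against that failure is to validate the decoded payload before indexing into it. The sketch below is a hedged example: the function name `find_notebook_path` is hypothetical, and the session layout (`kernel.id`, `notebook.path`) is assumed from the snippets earlier in this thread rather than from a specification.

```python
def find_notebook_path(payload, kernel_id):
    """Return the notebook path for `kernel_id`, or None if it cannot be found.

    `payload` is the decoded JSON body of a GET api/sessions response.
    """
    if not isinstance(payload, list):
        # e.g. {"message": "Forbidden", "reason": None} when the token is wrong
        return None
    for session in payload:
        if not isinstance(session, dict):
            continue
        kernel = session.get("kernel") or {}
        notebook = session.get("notebook") or {}
        if kernel.get("id") == kernel_id:
            return notebook.get("path")
    return None


sessions = [{"kernel": {"id": "abc"}, "notebook": {"path": "work/demo.ipynb"}}]
forbidden = {"message": "Forbidden", "reason": None}

print(find_notebook_path(sessions, "abc"))
print(find_notebook_path(forbidden, "abc"))
```

Returning None for an error payload lets the caller move on to the next running server instead of failing with a TypeError.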

DBCerigo · 2019-06-24T15:58:31Z

Note that the solution above won't work when executing a nb via jupyter nbconvert --to notebook --execute mynotebook.ipynb or via from nbconvert.preprocessors import ExecutePreprocessor from within a python script, as (of course?!) there's no server running to query.

elgalu · 2019-07-26T14:09:24Z

How to achieve this with the latest versions?

billallen256 · 2019-12-09T01:49:57Z

Could the ipyparams package work for this? It can return the notebook file name as well as any query string parameters passed in the URL.

elgalu · 2019-12-18T10:13:00Z

It seems to be unreliable, @gershwinlabs: sometimes ipyparams.raw_url comes back as an empty string. This seems to be related to the reliance on JavaScript, some sort of race condition.

billallen256 · 2019-12-18T21:57:45Z

@elgalu I can't seem to reproduce the problem. Can you tell me more about your environment and notebook? I don't think it's possible to get away from the reliance on Javascript given the deliberate separation between the front and back ends.

thorade · 2020-03-19T09:14:18Z

Similar question asked here:
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name

billallen256 · 2020-03-19T15:52:46Z

Thanks @thorade. I posted an answer with ipyparams.

jakirkham · 2020-03-22T22:14:07Z

Maybe issues with ipyparams can be raised against that repo? 😉

Ismar11 · 2020-04-16T13:18:58Z

Does anyone know if there is a command line argument under jupyter notebook list or a similar feature to get notebook names running in each server from console directly?

If it doesn't exist, isn't planned, or the question is out of scope for this issue, I could open a new one and describe it in detail with examples/ideas. Let me know :)

This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

It is also sometimes critical for linters and tab completion to know the current name.

Of course, the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, it could change during execution, and there are many other gotchas.

This suggests adding a JPY_KERNEL_SESSION_NAME env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager.

On jupyter_server, this one-line change makes the notebook name available using typical local installs:

    --- a/jupyter_server/services/sessions/sessionmanager.py
    +++ b/jupyter_server/services/sessions/sessionmanager.py
    @@ -96,7 +96,12 @@ class SessionManager(LoggingConfigurable):
             """Start a new kernel for a given session."""
             # allow contents manager to specify kernels cwd
             kernel_path = self.contents_manager.get_kernel_path(path=path)
    -        kernel_id = await self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
    +
    +        kernel_id = await self.kernel_manager.start_kernel(
    +            path=kernel_path, kernel_name=kernel_name, session_name=name
    +        )
             return kernel_id

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.

Commit amended: originally the names were associated_file and JPY_ASSOCIATED_FILE, but they were changed.

This has been a controversial topic for some time:

jupyter/notebook#1000
https://forums.databricks.com/questions/21390/is-there-any-way-to-get-the-current-notebook-name.html
https://stackoverflow.com/questions/12544056/how-do-i-get-the-current-ipython-jupyter-notebook-name
https://ask.sagemath.org/question/36873/access-notebook-filename-from-jupyter-with-sagemath-kernel/

It is also sometimes critical for linters and tab completion to know the current name.

Of course, the current answer is that the question is ill-defined: there might not be a file associated with the current kernel, there might be multiple files, files might not be on the same system, it could change during execution, and there are many other gotchas.

This suggests adding a JPY_KERNEL_SESSION_NAME env variable, which is not too visible but gives an escape hatch that should mostly be correct unless the notebook is renamed or the kernel is attached to a new one.

To do so, this handles the new associated_file parameter in a few functions of the kernel manager.

On jupyter_server, this one-line change makes the notebook name available using typical local installs:

```diff
diff --git a/notebook/services/sessions/sessionmanager.py b/notebook/services/sessions/sessionmanager.py
index 92b2a7345..f7b4011ce 100644
--- a/notebook/services/sessions/sessionmanager.py
+++ b/notebook/services/sessions/sessionmanager.py
@@ -108,7 +108,9 @@ class SessionManager(LoggingConfigurable):
         # allow contents manager to specify kernels cwd
         kernel_path = self.contents_manager.get_kernel_path(path=path)
         kernel_id = yield maybe_future(
-            self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
+            self.kernel_manager.start_kernel(
+                path=kernel_path, kernel_name=kernel_name, session_name=path
+            )
         )
         # py2-compat
         raise gen.Return(kernel_id)
```

Of course, only launchers that pass this value forward will allow the env variable to be set.

I'm thinking that various kernels may use this and expose it in different ways, like __notebook_name__ if it ends with `.ipynb` in ipykernel.
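From inside a kernel, consuming such a variable could look like the sketch below. The variable name JPY_KERNEL_SESSION_NAME is taken from the commit message above; whether a given server and launcher actually set it is an assumption, and the helper `notebook_name_from_env` is hypothetical.

```python
import os


def notebook_name_from_env(environ=None, var="JPY_KERNEL_SESSION_NAME"):
    """Return the session name if it looks like a notebook file, else None.

    `environ` defaults to os.environ; passing a plain dict makes this testable.
    """
    if environ is None:
        environ = os.environ
    name = environ.get(var)
    if name and name.endswith(".ipynb"):
        return name
    return None


# The variable may be missing (older launchers) or name something that is not a notebook.
print(notebook_name_from_env({"JPY_KERNEL_SESSION_NAME": "reports/q3.ipynb"}))
print(notebook_name_from_env({}))
```

Treating the value as optional matches the caveats in the commit message: there may be no associated file, and the value is not updated if the notebook is renamed after the kernel starts.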

cono · 2022-02-06T14:32:35Z

This looks hackish to me:

    kernel_id = re.search('kernel-(.*).json',
                          ipykernel.connect.get_connection_file()).group(1)

is there any simpler way to get id?

I was trying to look into the code and couldn't find where the id is stored in Kernel. The connection_file name is created from os.getpid():

    def init_connection_file(self):
        if not self.connection_file:
            self.connection_file = "kernel-%s.json"%os.getpid()
        try:
            self.connection_file = filefind(self.connection_file, ['.', self.connection_dir])
        except OSError:

Or probably I'm looking into the wrong place. Any suggestions?
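A slightly more defensive version of the filename parsing is sketched below. It still relies on the `kernel-<id>.json` naming seen in this thread, and the helper name `kernel_id_from_connection_file` is hypothetical. It also reports whether the extracted part parses as a UUID. The assumption here is that a pid-based name, as in init_connection_file above, will not match any session on a notebook server.

```python
import os
import re
import uuid

# Anchored pattern with an escaped dot, unlike 'kernel-(.*).json'.
_CONNECTION_FILE_RE = re.compile(r"^kernel-(?P<id>.+)\.json$")


def kernel_id_from_connection_file(path):
    """Extract the id part of a 'kernel-<id>.json' connection file path.

    Returns (kernel_id, is_uuid), or (None, False) if the name does not match.
    """
    match = _CONNECTION_FILE_RE.match(os.path.basename(path))
    if match is None:
        return None, False
    kernel_id = match.group("id")
    try:
        uuid.UUID(kernel_id)
        is_uuid = True
    except ValueError:
        is_uuid = False
    return kernel_id, is_uuid


print(kernel_id_from_connection_file(
    "/tmp/runtime/kernel-5f2b3c1e-8d0a-4c57-9a55-0d1a2b3c4d5e.json"))
print(kernel_id_from_connection_file("kernel-12345.json"))
```

In a running kernel the input would come from ipykernel.connect.get_connection_file(), as in the snippets earlier in the thread.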

Carreau added this to the wishlist milestone Jun 27, 2016

jhconning mentioned this issue Jan 25, 2017

information on current notebook filename? rasbt/watermark#25

Open

damianavila mentioned this issue Jun 29, 2017

using export_png or save without filename from within jupyter notebook saves png file to lib/python bokeh/bokeh#6560

Closed

AbdealiLoKo mentioned this issue Dec 23, 2017

Getting IP of client in and filename in Notebook #3156

Open

stevengj mentioned this issue Feb 17, 2018

access attachments JuliaLang/IJulia.jl#625

Closed

joshtemple mentioned this issue Mar 12, 2018

Get path to current notebook via Python jupyterhub/jupyterhub#1718

Closed

AtsushiHashimoto mentioned this issue Oct 24, 2018

get notebook name automatically under password authentification AtsushiHashimoto/lab-note#1

Open

elgalu mentioned this issue Nov 4, 2019

Identify notebook file being run jupyterhub/jupyterhub#2805

Closed

jfischer mentioned this issue Nov 16, 2019

get_notebook_name() fails in JupyterHub data-workspaces/data-workspaces-core#44

Closed

thangleiter mentioned this issue Jun 23, 2020

TypeError in util at import qutech/filter_functions#26

Closed

ianfore mentioned this issue Mar 1, 2021

Review FaspScript naming and documentation ga4gh/fasp-scripts#8

Closed

Carreau mentioned this issue Jun 7, 2021

Lay foundation to pass notebook names to kernel at startup. jupyter/jupyter_client#656

Closed

marcinwrochna mentioned this issue Sep 23, 2021

Not passing token to NotebookApp spoils notebook discovery. jupyterhub/jupyterhub#3605

Open
