WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") #886

mpacer · 2018-09-11T20:05:22Z

This fixes #878 and spatialaudio/nbsphinx#202.

We had an API before that allowed passing an empty string as your kernel name, which it would then fall back on getting the kernel name from the notebook metadata.

This is now the actual default behaviour, and before this PR setting kernel_name="" would overwrite this default. This reintroduces the fall-back as well as a warning that this behaviour shouldn't be used (you can pass no value to kernel_name and you will get the same behaviour).

To ensure the same behaviour in both cases, it introduces the find_kernel_name API, which is then used in both the default and fallback mechanisms.

This also introduces 2 tests and a new notebook file to be tested. In order to ensure that the functionality works I'm specifying a python3 kernel, which cannot be guaranteed to exist, so we have an expected fail in the case where there is no python3 kernel.

This is being introduced because we previously allowed users to pass "" to indicate that the kernel name should be exclusively inferred from the notebook itself. Doing so now overrides the default behaviour (which now defaults to infering the kernel name from the notebook itself if nothing is set). This is not a great pattern since it involves setting a parameter to a magic value to keep the default behaviour. This also issues a warning if you use the old way to do this.

In order to not use the default "python" behvaiour, I set the kernel to be "python3". However, that means this can only work if we have a python3 kernel available.

mpacer · 2018-09-11T20:45:53Z

Note: spatialaudio/nbsphinx#208 updates nbsphinx to use the preferred means of getting a kernel from notebook metadata.

mpacer · 2018-09-11T20:47:05Z

@mgeier @akhmerov @Juanlu001 This should address your concerns & test to avoid further regressions.

MSeal · 2018-09-11T21:04:31Z

nbconvert/preprocessors/execute.py

+        # Because of an old API, an empty kernel string should be interpreted as indicating
+        # that you want to use the notebook's metadata specified kernel.
+        # We use find_kernel_name to do this.
+        if self.kernel_name == u"":


Would it be better to do if not self.kernel_name:? As a python developer I would expect that if "" is treated as falsy for a condition that None would be too. Or does the kernel_manager_class already handle None separately?

if not self.kernel_name: self.kernel_name = find_kernel_name(self.nb) if isinstance(self.kernel_name, string_types): warnings.warn(...)

After looking at it, I'm pretty sure None ~~would~~ wouldn't have worked previously… ~~but I still don't want to encourage passing people to do this regardless. So we'll want to warn no matter what type the value was.~~

The reason is that it's specified as a Unicode traitlet so passing in None would cause a validation error.

So would checking for false kernel_name and declaring it deprecated in all cases make more sense from a function perspective then? I'm ok with merging if you feel strongly to just check for unicode string, it just had some code smell for something that might fail under similar situations -- like python version differences

The python version difference was why I made it specifically a unicode string (since it's a Unicode() traitlet). I pushed a more general check and added a test to make sure that None raises a TraitError.

mgeier · 2018-09-12T17:36:10Z

This is all really confusing!

I don't see the necessity for dynamically changing the default value for kernel_name.

Why don't you just completely get rid of _kernel_name_default()?
The helper function find_kernel_name() isn't (currently) necessary either.

Then, in start_new_kernel() you could simply do something like this (untested!):

        if self.kernel_name:
            kernel_name = self.kernel_name
        else:
            kernel_name = self.nb.metadata.get('kernelspec', {}).get('name', 'python')

MSeal · 2018-09-12T18:21:04Z

@mgeier I think I follow the suggestion. I'm still learning the nbconvert config system, so I don't know if there's side-effects that are important in assigning the default in the config and raising there for the AttributeError. But if there's not a secondary reason for this pattern, and nothing else would reuse this config then it would probably be simpler to move all the checks into start_new_kernel.

minrk · 2018-09-14T14:36:24Z

I would recommend one of two approaches:

remove dynamic default for kernel_name, so that the if not self.kernel_name logic in this PR is the behavior for both the default
go "full traitlets" and add, in addition to the dynamic default, a @validate('kernel_name') that coerces kernel_name = '' into find_kernel_name(). That way self.kernel_name is always defined as the correct kernel name, whether config specified it as empty==auto or didn't specify it at all.

I don't have a strong preference, but the advantage of the first is that it relies less on traitlets magic, while the advantage of the second uses traitlets to ensure that self.kernel_name is always available as the kernel name being used. An argument for case 1 is if it makes sense to re-use a Preprocessor instance across multiple exports, since the dynamic default will be constructed only on the first notebook it sees, not on each call. I don't recall if this use case makes sense, though.

Because we cannot guarantee that self.nb exists when kernel_name is set, this requires some… finagling to handle the empty string correctly. It would be better if we could lazily query the value rather than need to specify at the time of setting.

mpacer · 2018-09-15T06:21:40Z

Thanks for weighing in!

I went with the second suggestion and ran into an issue. Because we set the kernel_name at the time of creating the Preprocessor, we run into an issue where we cannot set it to the correct value until the notebook is present to be able to query.

If we could lazily load the value, that would work. I think we can have a blend of the two approaches, where the empty string coercion is done at the time of evaluation and we still include validation to check that something is either an empty string, 'python' or in the list of available kernel specs (the python comes from the standard KernelManager behaviour).

That's going to be my next attempt at doing this. In talking with @MSeal about this we arrived at the idea that it could be helpful to have a method that explicitly defines these conventions so that they could be easily overwritten by subclasses without needing to change other parts of the machinery.

In any case I'll get back to this tomorrow.

mgeier · 2018-09-15T16:18:29Z

Using @validate('kernel_name') sounds slightly more reasonable than using @default('kernel_name'). But still, it doesn't really make sense. The notebook is passed to the ExecutePreprocessor in the method preprocess(self, nb, resources, km=None). Validating the kernel_name before this method is called doesn't make sense, because at that point the notebook isn't known yet.

It makes much more sense to keep kernel_name='' and then validate it within the preprocess() method.

Or am I missing something?

MSeal · 2018-09-15T17:32:35Z

FYI telling someone their design doesn't make sense comes off a bit aggressive, even when the suggested changes are valid suggestions.

I would argue passing an empty string as a signal/default coercion behavior is not usually a good practice. None or absence from kwargs entirely would be more Pythonic, but for compatibility to capture u"" falsey is a reasonable fallback. Having a validate that checks that you passed something we can cooerce into a kernel, or is a valid kernel name as @mpacer described sounds reasonable to me. Having a common function to convert placeholders (or empty strings) to actual kernel names JIT or as a function wrapper also seems the most robust and reusable since there is more than one place that would likely want this behavior.

Given this is just a backwards compatibility item, I wouldn't stress too much on exact details as we I believe we want to drop explicit u"" matching in the future anyway.

mgeier · 2018-09-17T10:27:15Z

I would argue passing an empty string as a signal/default coercion behavior is not usually a good practice.

There are certainly cases where it is not a good practice, but this case is not one of them.

There is no reasonable interpretation of different intent whether a user passes an empty string or nothing at all for kernel_name. Those two cases are for all intents and purposes equal.

Sure, the original author could have chosen None as default value, that would have been fine, too.

Having a common function to convert placeholders (or empty strings) to actual kernel names JIT or as a function wrapper also seems the most robust and reusable since there is more than one place that would likely want this behavior.

That would be totally fine and probably useful, as long as this function takes the notebook in question as an argument or if this function is a method of an object that represents a notebook.

I wouldn't stress too much on exact details as we I believe we want to drop explicit u"" matching in the future anyway.

Please don't.

FYI telling someone their design doesn't make sense comes off a bit aggressive, even when the suggested changes are valid suggestions.

I'm sorry if I sounded aggressive, I didn't intend to.

I'm not a native speaker, I didn't know of another way to say it and I didn't know that somebody could interpret this as aggressive.
I just wanted to state the fact that the discussed change would be logically inconsistent and simply bad software engineering, I thought colloquially using "doesn't make sense" would transport this sentiment quite succinctly.

I've looked a bit more into this and found that the logical error was committed much earlier than this pull request or the commit it tries to fix.
The real problem is that the notebook should not be assigned to self.nb.
The ExecutePreprocessor should only hold state based on its initial configuration. It shouldn't hold state about the currently processed notebook nor about the currently running kernel.
The current notebook should only be known (and the kernel should only be running) during the runtime of the preprocess() method. If some runtime information is needed in other methods, it should be passed there via function arguments.

It is also quite confusing that there is a method named preprocess_cell() and a very similarly named method run_cell() as well.

The method preprocess_cell() is normally (in the other preprocessors) run repeatedly from within the method preprocess(). So here it should either be removed (or deactivated somehow since it would also be inherited from the base class) or it should be combined with run_cell().

[Update: it would probably be easier to understand if run_cell() would be incorporated into preprocess_cell().]

preprocess_cell() has a resources argument, so the obvious place to put the runtime state (running kernel and reference to the notebook if needed) would be this resources argument.
~~But it's probably simpler to disable preprocess_cell() and pass all the necessary information to run_cell() as arguments.~~

Once this problem is resolved, our discussion here will become moot. But in the meantime, can we please keep kernel_name='' as a default that can also be passed by a user?

mgeier · 2018-09-17T10:33:44Z

Update: I just realized that preprocess_cell() is still called by the base class, as it is supposed to be.

In this case, all necessary state about the current notebook and the currently running kernel should be transported in the resources argument.

akhmerov · 2018-09-17T16:38:45Z

Once this problem is resolved, our discussion here will become moot. But in the meantime, can we please keep kernel_name='' as a default that can also be passed by a user?

What is your motivation for this preference? None is a much more common none-value. While I imagine that there wouldn't be kernels that have an empty string for a name, having None for a default is a more standard interface.

mgeier · 2018-09-17T17:01:43Z

@akhmerov

What is your motivation for this preference? None is a much more common none-value.

Sorry I wasn't clear enough.

In general I don't have a strong preference. As I said, I think both '' and None would be fine as default values in this case.

The reason why I mentioned kernel_name='' is for backwards compatibility. The default was an empty string before and I don't see a reason to change this.

If the default were changed to None (and the empty string were disallowed), this would break my nbsphinx module (again).

akhmerov · 2018-09-17T17:22:24Z

@mgeier doesn't kernel_name=None already work with all versions of nbconvert and has the same effect as kernel_name=''? If that is correct, then you could consider changing it right away in nbsphinx. This wouldn't break any compatibility and ensure nbsphinx is future-proof and also has a more conventional interface.

mgeier · 2018-09-18T07:57:51Z

@akhmerov No, using kernel_name=None doesn't currently work, as @mpacer said in #886 (comment). Trying to pass None leads to

traitlets.traitlets.TraitError: The 'kernel_name' trait of an ExecutePreprocessor instance must be a unicode string, but a value of None <class 'NoneType'> was specified.

MSeal · 2018-09-18T18:10:01Z

I do agree with @akhmerov that None is the better pattern here and that from a python developer perspective "" is not expected to have special behavior independent of None. But given history and all the discussions here, treating anything pythonic falsey solves the issue either way and is a reasonable interface choice.

Let's get M's change finalized to respect the pattern (even if it's not perfect for everyone) and redirect some of this energy into improving Preprocessor design overall. I think that's a more valuable use of time.

mgeier · 2018-11-13T10:19:05Z

It looks like #809 would also solve the issue that this PR is solving. However, #809 has milestone 6.0, and it would be great if we could solve this issue (namely #878) in release 5.5!

Can someone please tag this with milestone 5.5?

mgeier · 2018-12-04T12:06:39Z

Since there doesn't seem to be any progress here, I've created an alternative (simpler) PR here: #924.

I'm fine with either solution, as long as the problem is fixed before the next nbconvert release.

MSeal · 2018-12-14T19:32:15Z

Closing as request has been satisfied by related PRs

This causes nbsphinx failures, for more details see: spatialaudio/nbsphinx#202 jupyter/nbconvert#878 jupyter/nbconvert#886 Should be fixed by the next nbconvert release (5.5).

mpacer added 3 commits September 11, 2018 13:21

add test for find_kernel_name and notebook with kernel name metadata

07109c4

add test for old behaviour where kernel_name="" uses file specified

fc3dea1

In order to not use the default "python" behvaiour, I set the kernel to be "python3". However, that means this can only work if we have a python3 kernel available.

mpacer force-pushed the fix_empty_kernelspec branch from 719a887 to fc3dea1 Compare September 11, 2018 20:21

mpacer mentioned this pull request Sep 11, 2018

Don't pass an empty kernel name by default to ExecutePreprocessor spatialaudio/nbsphinx#208

Closed

mpacer requested review from MSeal and takluyver September 11, 2018 20:49

MSeal reviewed Sep 11, 2018

View reviewed changes

Check for any falsy value, test kernel_name=None causes TraitError

6e725b9

MSeal approved these changes Sep 12, 2018

View reviewed changes

mpacer added 2 commits September 14, 2018 23:13

remove find_kernel_name and tests for it

0d09e9d

mpacer force-pushed the fix_empty_kernelspec branch from 74c36bb to 0d09e9d Compare September 15, 2018 06:13

mgeier mentioned this pull request Sep 18, 2018

'NoneType' object has no attribute 'argv' with nbconvert 5.4 and nbsphinx #878

Closed

mpacer changed the title ~~Recreate old behaviour of ExecutePreprocessor(kernel_name="")~~ WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") Sep 18, 2018

aplamada mentioned this pull request Oct 1, 2018

AttributeError: 'NoneType' object has no attribute 'argv' hz-inova/run_jnb#3

Open

mgeier mentioned this pull request Nov 7, 2018

ExecutePreprocessor using jupyter_kernel_mgmt APIs #809

Open

mgeier mentioned this pull request Dec 4, 2018

ExecutePreprocessor: Get kernel name when executing, and not during initialization #924

Merged

MSeal mentioned this pull request Dec 7, 2018

Added empty kernel name test #927

Merged

MSeal closed this Dec 14, 2018

mgeier mentioned this pull request Jan 2, 2019

Question: thread safety of execute preprocessor #936

Closed

andportnoy mentioned this pull request Feb 4, 2019

Restrict nbconvert version to avoid version 5.4 kutaslab/fitgrid#127

Merged

mgeier mentioned this pull request Jan 21, 2020

AttributeError in .start_kernel() #1171

Closed

mgeier mentioned this pull request Feb 10, 2020

Combined run_cell and execute_cell jupyter/nbclient#12

Merged

MSeal added this to the no action milestone Sep 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") #886

WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") #886

mpacer commented Sep 11, 2018

mpacer commented Sep 11, 2018

mpacer commented Sep 11, 2018

MSeal Sep 11, 2018

mpacer Sep 12, 2018 •

edited

Loading

MSeal Sep 12, 2018

mpacer Sep 12, 2018 •

edited

Loading

mgeier commented Sep 12, 2018

MSeal commented Sep 12, 2018

minrk commented Sep 14, 2018

mpacer commented Sep 15, 2018

mgeier commented Sep 15, 2018

MSeal commented Sep 15, 2018

mgeier commented Sep 17, 2018 •

edited

Loading

mgeier commented Sep 17, 2018

akhmerov commented Sep 17, 2018

mgeier commented Sep 17, 2018

akhmerov commented Sep 17, 2018

mgeier commented Sep 18, 2018

MSeal commented Sep 18, 2018

mgeier commented Nov 13, 2018

mgeier commented Dec 4, 2018

MSeal commented Dec 14, 2018

WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") #886

WIP Recreate old behaviour of ExecutePreprocessor(kernel_name="") #886

Conversation

mpacer commented Sep 11, 2018

mpacer commented Sep 11, 2018

mpacer commented Sep 11, 2018

MSeal Sep 11, 2018

Choose a reason for hiding this comment

mpacer Sep 12, 2018 • edited Loading

Choose a reason for hiding this comment

MSeal Sep 12, 2018

Choose a reason for hiding this comment

mpacer Sep 12, 2018 • edited Loading

Choose a reason for hiding this comment

mgeier commented Sep 12, 2018

MSeal commented Sep 12, 2018

minrk commented Sep 14, 2018

mpacer commented Sep 15, 2018

mgeier commented Sep 15, 2018

MSeal commented Sep 15, 2018

mgeier commented Sep 17, 2018 • edited Loading

mgeier commented Sep 17, 2018

akhmerov commented Sep 17, 2018

mgeier commented Sep 17, 2018

akhmerov commented Sep 17, 2018

mgeier commented Sep 18, 2018

MSeal commented Sep 18, 2018

mgeier commented Nov 13, 2018

mgeier commented Dec 4, 2018

MSeal commented Dec 14, 2018

mpacer Sep 12, 2018 •

edited

Loading

mpacer Sep 12, 2018 •

edited

Loading

mgeier commented Sep 17, 2018 •

edited

Loading