Pass config to kernel from ExecutePreprocessor for configurable kernels #816

mpacer · 2018-05-17T00:53:26Z

This became a much larger PR due to my efforts to write tests for the original PR.

Most notably, this does the refactor that I proposed in @MSeal's #811 to make it easier to overwrite the preprocess method without allowing people to override the reliance on the superclass preprocess method on its own. This way all the overriding can occur without relying on the implementer recreating all the complex internal state logic (as was the case before).

It creates 2 new public methods, and adds a dynamic default to an existing traitlet:

new dynamic default for kernel_name:
Now if no kernel_name has been supplied as a traitlet value, it internally checks for whether self.nb has been set, and raises an AttributeError if self.nb has not been set. The logic is that if you don't set it directly, then there is no meaningful way to describe what an ExecutePreprocessor's kernel should be other than if a notebook is already being processed. Previously this logic was inside the preprocess method. It gives the same priority to the traitlet defined value.
start_new_kernel: previously a callback inside preprocess; it has been reworked to rely on the dynamic default of kernel_name to decide which kernel to create. It also passes through any configuration from ExecutePreprocessor.
setup_preprocessor: a context manager that sets up the self.nb, self.km, and self.kc values and ensures that they are cleaned up after the execution is done. It also resets the self._display_id_map value which was previously being done in preprocess.

This also changes the value of the default kernel_manager_class function to match the older convention where its name should be _default_kernel_manager_class.

Smaller original PR:

Pass config to kernel from ExecutePreprocessor for configurable kernels

Right now, the way that we're instantiating the kernels means any configuration that can be passed to those kernels stops at the point where we actually create the Kernel.

If you want any custom Kernels to be configurable (e.g., to use a different kernel_spec_manager) then you need to be able to pass this through at instantiation.

@rgbkrk @takluyver @Carreau

setup_preprocessor is a contextmanager that handles setting and cleaning up attributes needed by the rest of the execution logic (specifically self.nb, self.km, self.kc). start_new_kernel was previously a callback function and is now a separate method used inside setup_preprocessor. This also introduces a dynamic default for the `kernel_name` traitlet that checks if self.nb is defined to see if it has a kernel name, and otherwise will raise an AttributeError (since that means it is being relied upon without any way to know which kernel is to be used.

moved _display_id_map help inside the declaration instead of comment unified named imports from traitlets

mpacer · 2018-05-19T02:23:14Z

I still need to add tests to make sure that configuration is passed through more directly. This is partially why I wanted the start_new_kernel to be an isolated function so that that can be tested without needing to do the entirety of executing a notebook.

mpacer · 2018-05-19T02:31:29Z

Note: I think the correct pattern might be to pass through parent=self instead of config=self.config but I'm not sure where I can verify which pattern is preferred.

takluyver · 2018-05-21T17:50:24Z

This looks sensible for now, but if we do adopt my WIP new kernel machinery (see #809), the ability to pick a kernel manager class will go away, because the kernel manager is something that the kernel provider gets to decide. I.e. you ask it to start a kernel of type spec/python3, and it gives you back a subprocess kernel manager.

The good news is that there should be less reason for the frontend to set the kernel manager class, because it affects less. In particular, it sounds like you want to configure it to affect finding and starting a kernel, which is all moved out of kernel manager in the new machinery.

For passing parent/config: it looks like there's some long forgotten magic in traitlets so if we use parent, it's possible to use config names like ExecutePreprocessor.KernelManager.autorestart=False. I could see that being useful in some situations, but... it's yet another level of complexity in our config system, and it further exposes the internals of our code to config. I'd be in favour of quietly getting rid of that option entirely and hoping no-one has noticed it's possible yet. ;-)

rgbkrk · 2018-05-21T18:13:22Z

@takluyver I'd love to make a tradeoff here where we allow for the current machinery to work by adopting this while also moving ahead with your kernel provider work as the long term solution.

takluyver · 2018-05-21T18:21:58Z

I'm fine with that, I just want to make sure that the motivation for adding this configurability is not something that we're going to irrevocably break if/when we do adopt the new machinery. If the motivation is to replace kernel_spec_manager, that's probably OK - the customisation will need to be redone for the new machinery, but it should still be possible to control how a kernel is found.

MSeal · 2018-05-21T18:40:22Z

nbconvert/preprocessors/execute.py

        try:
-            nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
+            yield


Should we yield the context variables nb, km, kc for more easier understanding and variable consumption downstream?

We can yield the context variables… but the problem is that we still need all three (notebook (nb), kernel client (kc) and kernel mangager (km)) to be accessible via self later on:

https://github.com/mpacer/nbconvert/blob/0de8f30404a99e63d5cc22e31cb4a984cc201beb/nbconvert/preprocessors/execute.py#L374-L375

for cell_idx, output_indices in self._display_id_map[display_id].items(): cell = self.nb['cells'][cell_idx]

https://github.com/mpacer/nbconvert/blob/0de8f30404a99e63d5cc22e31cb4a984cc201beb/nbconvert/preprocessors/execute.py#L392

msg = self.kc.shell_channel.get_msg(timeout=timeout)

https://github.com/mpacer/nbconvert/blob/0de8f30404a99e63d5cc22e31cb4a984cc201beb/nbconvert/preprocessors/execute.py#L413-L427

def run_cell(self, cell, cell_index=0): msg_id = self.kc.execute(cell.source) self.log.debug("Executing cell:\n%s", cell.source) exec_reply = self._wait_for_reply(msg_id, cell) outs = cell.outputs = [] while True: try: # We've already waited for execute_reply, so all output # should already be waiting. However, on slow networks, like # in certain CI systems, waiting < 1 second might miss messages. # So long as the kernel sends a status:idle message when it # finishes, we won't actually have to wait this long, anyway. msg = self.kc.iopub_channel.get_msg(timeout=self.iopub_timeout)

https://github.com/mpacer/nbconvert/blob/0de8f30404a99e63d5cc22e31cb4a984cc201beb/nbconvert/preprocessors/execute.py#L393-L399

except Empty: self.log.error( "Timeout waiting for execute reply (%is)." % self.timeout) if self.interrupt_on_timeout: self.log.error("Interrupting kernel") self.km.interrupt_kernel() break

So I don't know how helpful it will be to yield them only to then ignore them.

I'm up for making a larger refactor to remove this dependence on instance attributes, but that should be in another PR.

In the meantime do you still want me to yield these for better downstream consumption and easier understanding even if they will immediately be assigned to self? (also, because of the way that the dynamic default works for kernel_name we need to make sure that self.nb is assigned before we start up the kernel (or we need to manually extract the kernel name and pass it through).

MSeal · 2018-05-21T18:42:04Z

nbconvert/preprocessors/execute.py


-        The input argument `nb` is modified in-place.
+        This creates The input argument `nb` is modified in-place.


and is?
Also is it creating nb? Seems like it's assigning nb to self not creating it.

MSeal · 2018-05-21T18:43:29Z

nbconvert/preprocessors/execute.py

        finally:
            self.kc.stop_channels()
            self.km.shutdown_kernel(now=self.shutdown_kernel == 'immediate')
+
+        for attr in ['nb', 'km', 'kc']:


Should this be in the finally block?

MSeal · 2018-05-21T18:44:13Z

nbconvert/preprocessors/execute.py

@@ -242,45 +290,46 @@ def preprocess(self, nb, resources):
        path = resources.get('metadata', {}).get('path', '')


Can remove the if statement by adding or None to the end of this line

sounds good!

mpacer · 2018-06-05T18:13:41Z

Note: I'd still like to test this better, but unfortunately, the way we've approached testing the execute preprocessor has made it much more difficult than I at first thought.

mpacer · 2018-06-05T18:15:33Z

@MSeal thanks greatly for the review — I tried to address all of your concerns let me know if I misinterpreted any of them.

MSeal

Looks good :)

Pass config to kernel from ExecutePreprocessor for configurable kernels

1726d1f

rgbkrk approved these changes May 17, 2018

View reviewed changes

mpacer added 2 commits May 18, 2018 19:17

cleans up traitlet stuff

0de8f30

moved _display_id_map help inside the declaration instead of comment unified named imports from traitlets

mpacer mentioned this pull request May 19, 2018

Moved super call into wrapper method for easier subclassing of ExecutePreprocessor #811

Closed

rgbkrk approved these changes May 20, 2018

View reviewed changes

MSeal requested changes May 21, 2018

View reviewed changes

put attr deletion in finally block, clearer docstring, shorter path def

d793902

surface variables from yield as requested in review

9611e95

MSeal approved these changes Jun 5, 2018

View reviewed changes

MSeal merged commit 3a1eca8 into jupyter:master Aug 2, 2018

mpacer added this to the 5.4 milestone Sep 4, 2018

mgeier mentioned this pull request Sep 9, 2018

'NoneType' object has no attribute 'argv' with nbconvert 5.4 and nbsphinx #878

Closed

mgeier mentioned this pull request Dec 4, 2018

ExecutePreprocessor: Get kernel name when executing, and not during initialization #924

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pass config to kernel from ExecutePreprocessor for configurable kernels #816

Pass config to kernel from ExecutePreprocessor for configurable kernels #816

mpacer commented May 17, 2018 •

edited

Loading

mpacer commented May 19, 2018

mpacer commented May 19, 2018

takluyver commented May 21, 2018

rgbkrk commented May 21, 2018

takluyver commented May 21, 2018

MSeal May 21, 2018

mpacer Jun 5, 2018

mpacer Jun 5, 2018

MSeal May 21, 2018

MSeal May 21, 2018

mpacer Jun 5, 2018

MSeal May 21, 2018

mpacer Jun 5, 2018

mpacer commented Jun 5, 2018

mpacer commented Jun 5, 2018

MSeal left a comment


		The input argument `nb` is modified in-place.
		This creates The input argument `nb` is modified in-place.

		@@ -242,45 +290,46 @@ def preprocess(self, nb, resources):
		path = resources.get('metadata', {}).get('path', '')

Pass config to kernel from ExecutePreprocessor for configurable kernels #816

Pass config to kernel from ExecutePreprocessor for configurable kernels #816

Conversation

mpacer commented May 17, 2018 • edited Loading

Pass config to kernel from ExecutePreprocessor for configurable kernels

mpacer commented May 19, 2018

mpacer commented May 19, 2018

takluyver commented May 21, 2018

rgbkrk commented May 21, 2018

takluyver commented May 21, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mpacer commented Jun 5, 2018

mpacer commented Jun 5, 2018

MSeal left a comment

Choose a reason for hiding this comment

mpacer commented May 17, 2018 •

edited

Loading