notebooks with many images in the output take an extremely long time to show a diff #396
Comments
Thanks again for the report. For better debugging, could you share the two versions of the linked notebook that are used for diffing when the computation blows up? E.g. add a modified version of the one linked to a gist? Then I can know I'm reproducing your issue reliably, without having to install all the dependencies of the notebook code. Separately, it might be reasonable for nbdime to do its diffing in a thread to try to prevent it blocking the entire server.
No need to install anything, or to modify or run the notebook. Just check it out, clear the outputs and run the diff. As for using a separate thread for nbdime: yes, absolutely. I did notice that everything else was stuck while the never-ending diff was running. P.S. Does nbdime run against the locally saved version of the notebook, or against the currently loaded notebook, which may have unsaved changes? (If the former, you probably want to save the notebook after clearing the outputs.)
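For anyone who wants to script the "clear outputs" step rather than do it in the Jupyter UI, here is a minimal sketch using only the standard library. It assumes the notebook is in nbformat 4 layout (a top-level `cells` list); the function names `clear_outputs` and `clear_outputs_file` are made up for this illustration and are not part of nbdime or Jupyter.

```python
import json


def clear_outputs(nb):
    """Empty the outputs of every code cell in a notebook dict (nbformat 4 layout assumed)."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb


def clear_outputs_file(src, dst):
    """Read the notebook at src, clear its outputs, and write the result to dst."""
    with open(src, encoding="utf-8") as f:
        nb = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(clear_outputs(nb), f, indent=1)
        f.write("\n")
```

Writing the cleared copy to a separate file gives the two on-disk versions needed for a command-line diff, which also sidesteps the saved-versus-unsaved question.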
@stas00 I did the following steps here:
While it did take a few seconds to print all the output to console, the actual diffing was < 1 second. For consistency, I also checked with:
and it too completed in less than a few seconds. Also, when comparing outputs to an empty list of outputs, the diff should always be trivial. It makes no sense for such a diff to take a long time. To try to narrow the troubleshooting, I hope you could try/answer the following:
Hmm, I thought it'd be simpler to reproduce. You're correct though, giving you the files explicitly would be the easiest. Here you go, both the original and the modified one: https://www.dropbox.com/sh/vi4nrt0ph5fm5yg/AACpsLqVtsbEhe4jqDR5XBHua?dl=0
% time nbdiff lesson2-rf_interpretation-orig.ipynb lesson2-rf_interpretation.ipynb
real 11m52.799s
Python 3.6.5
That did the trick - the output is almost instant. Thank you, Vidar. I guess the only remaining todo item from this issue is to move nbdime's diffing to its own thread - definitely not a show stopper, but a nice-to-have.
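To illustrate the "own thread" idea, here is a rough sketch of handing a blocking diff to a worker thread so the event loop can keep serving other requests. It assumes an asyncio-style event loop; `slow_diff` is a hypothetical stand-in for a long-running diff and is not nbdime's actual API, and the heartbeat task exists only to show that the loop stays responsive.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def slow_diff(a, b):
    """Hypothetical stand-in for a long-running, blocking diff."""
    time.sleep(0.2)  # simulate the expensive part
    return [x for x in b if x not in a]


async def diff_without_blocking(a, b, executor):
    """Run the blocking diff in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, slow_diff, a, b)


async def main():
    ticks = 0

    async def heartbeat():
        # Counts how often the event loop gets to run while the diff is in progress.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    with ThreadPoolExecutor(max_workers=1) as executor:
        hb = asyncio.create_task(heartbeat())
        result = await diff_without_blocking([1, 2], [2, 3], executor)
        hb.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

One caveat: a diff written in pure Python is CPU-bound, so a worker thread still competes with the server for the GIL. The server would stay reachable but could be sluggish; a process pool might be the better fit if that turns out to matter.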
Another example: I'm experiencing a similar issue with the notebook https://github.com/claresloggett/mbs-dataviz-2018/blob/master/Plotly-and-Altair-demos-exercises.ipynb . An nbdiff of this notebook against a very slightly altered version of it either hangs or, more likely, takes a very long time to run (I can't say which, as I haven't got it to finish in 3 hours or so). This notebook contains several Plotly plots and Altair plots, both of which tend to put the data into the notebook. It also may contain injected javascript from either library at the setup stage - I am pretty sure that even though it says
Running nbdiff on this notebook with the outputs stripped works fine.
@claresloggett Is this fixed in master?
With #400 merged, this should be fixed.
e.g. consider this notebook:
https://github.com/fastai/fastai/blob/master/courses/ml1/lesson2-rf_interpretation.ipynb
w/ or w/o any custom configuration it takes 10-15min (!) with CPU at 100% to get the diff output for:
http://localhost:8888/nbdime/git-difftool?base=fastai%2Fcourses%2Fml1%2Flesson2-rf_interpretation.ipynb
I think it has to do with multiple images appearing in the output of the notebook.
if I do the same with another notebook of about the same length, but with only a few images in the output:
https://github.com/fastai/fastai/blob/master/courses/ml1/lesson1-rf.ipynb
it takes some 10-30 secs to complete. My guesstimate is that each image in the output adds some 30 secs to the completion of the diff, but perhaps it's something else that causes that.
Thank you.